Basic Math Concepts – Logistic Regression

1. Sigmoid Function

The sigmoid function takes any number (positive or negative) and squashes it between 0 and 1.

Formula:

σ(z) = 1 / (1 + e^(-z))

Where:

  • z is the input (a raw score, a weighted sum of features, etc.)
  • e is Euler’s number (~2.718)

Why Is It Used?

Logistic Regression needs to output probabilities, not just raw numbers.

| Raw Score z | Sigmoid Output σ(z) | Interpretation        |
|-------------|---------------------|-----------------------|
| -10         | ~0.00005            | Very close to 0 (No)  |
| 0           | 0.5                 | 50/50, uncertain      |
| +10         | ~0.99995            | Very close to 1 (Yes) |

It gives us a smooth curve that gently transitions from 0 to 1. That way, we don’t make harsh decisions — we assign confidence.
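A minimal Python sketch of the function, reproducing the values in the table above:

```python
import math

def sigmoid(z):
    # Squash any real number into the open interval (0, 1)
    return 1 / (1 + math.exp(-z))

for z in (-10, 0, 10):
    print(z, round(sigmoid(z), 5))   # -10 -> 5e-05, 0 -> 0.5, 10 -> 0.99995
```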

Visual: How Sigmoid Looks

If we plot it, we get a beautiful S-shaped curve:

[Plot: an S-shaped curve that starts near 0 on the left, passes through 0.5 at z = 0, and levels off near 1 on the right; the x-axis is the input score]
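If you'd like to draw it yourself, here's a minimal sketch (assuming numpy and matplotlib are installed):

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-10, 10, 200)      # input scores
sigma = 1 / (1 + np.exp(-z))       # sigmoid of each score

plt.plot(z, sigma)
plt.axhline(0.5, linestyle="--")   # the 50/50 decision line
plt.xlabel("input score z")
plt.ylabel("sigmoid(z)")
plt.title("The S-shaped sigmoid curve")
plt.show()
```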

Real-Life Analogy

Imagine a dimmer switch for a light.

  • A low input keeps the light off (like 0).
  • A high input turns the light fully on (like 1).
  • In between, the light gradually increases — just like sigmoid, it doesn’t jump, it smoothly increases.

Logistic Regression Needs This Because:

  • We calculate a score: z = w⋅x + b
  • Then we use sigmoid to turn the score into a probability
  • Finally, if probability > 0.5 → predict 1 (Yes), else 0 (No)
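Putting those three steps together, a minimal sketch (the weight, bias, and input values below are invented for illustration):

```python
import math

def predict(x, w, b):
    z = w * x + b                   # raw score
    p = 1 / (1 + math.exp(-z))      # sigmoid turns the score into a probability
    return (1 if p > 0.5 else 0), p # threshold at 0.5

# Hypothetical example: w = 0.8, b = -4, x = hours studied
label, prob = predict(x=6, w=0.8, b=-4)
print(label, round(prob, 3))        # 1 0.69, since 6*0.8 - 4 = 0.8 and sigmoid(0.8) is about 0.69
```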

2. Basic Algebra

Understanding how to work with equations like:

z = w⋅x + b

Where:

  • x: input value (e.g. hours studied)
  • w: weight (importance of that input)
  • b: bias (base score or offset)

Why needed: Logistic Regression models this kind of equation to get a “score” before applying the sigmoid.
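In practice, x is often several features at once, and w⋅x becomes a dot product. A small sketch with invented feature names, weights, and values:

```python
# Two invented features: hours studied and hours slept
x = [6.0, 7.5]          # feature values
w = [0.9, 0.3]          # one weight per feature
b = -5.0                # bias / offset

z = sum(wi * xi for wi, xi in zip(w, x)) + b   # the dot product w.x plus b
print(z)                # 0.9*6 + 0.3*7.5 - 5 = 2.65
```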

3. Exponentials

Know what e^x and e^(-x) mean.
(Euler’s number e ≈ 2.718, kind of like π but for growth/decay.)

Why needed: The sigmoid function is built using exponentials.
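A quick way to get a feel for them in Python:

```python
import math

print(math.e)           # 2.718281828... (Euler's number)
print(math.exp(2))      # e^2 is about 7.389 (growth)
print(math.exp(-2))     # e^-2 is about 0.135 (decay)
print(math.exp(-10))    # about 0.0000454, which is why sigmoid(-10) is nearly 0
```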

4. Probability Basics

We should know what values between 0.0 and 1.0 mean in terms of likelihood.

  • 0 = Impossible
  • 1 = Certain
  • 0.5 = 50% chance

Why needed: Logistic Regression gives us probabilities — not direct answers.

5. Logarithms (light touch)

Used during training when we calculate log loss:

Loss = −[y⋅log(p) + (1−y)⋅log(1−p)]

Where:

  • y: the true label (0 or 1)
  • p: the predicted probability

Why needed: Optional at beginner level, but important for understanding how errors are penalized during training.
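A tiny sketch of how this penalty behaves (the probabilities below are chosen just for illustration):

```python
import math

def log_loss(y, p):
    # Penalizes confident wrong answers heavily, confident right answers lightly
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(round(log_loss(1, 0.9), 3))   # 0.105, confident and correct: small loss
print(round(log_loss(1, 0.1), 3))   # 2.303, confident and wrong: large loss
```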

6. Derivatives (light touch of calculus)

Needed only if we’re implementing Logistic Regression from scratch (like we did earlier). We’ll use:

  • How to calculate the gradient (slope)
  • Update rules in gradient descent

Why needed: To optimize weights during training using gradient descent. But don’t worry — many just learn this part through logic first, then math later.
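A minimal single-example sketch of gradient descent (the data and learning rate are invented; for log loss, the gradient with respect to w works out to (p − y)⋅x):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Invented training example: x = hours studied, y = passed (1) or not (0)
x, y = 6.0, 1
w, b = 0.0, 0.0       # start with zero weight and bias
lr = 0.1              # learning rate: how big each step is

for step in range(100):
    p = sigmoid(w * x + b)    # current predicted probability
    grad_w = (p - y) * x      # slope of the log loss with respect to w
    grad_b = (p - y)          # slope with respect to b
    w -= lr * grad_w          # step downhill
    b -= lr * grad_b

# w and b grow until the predicted probability approaches the true label
print(round(w, 3), round(b, 3), round(sigmoid(w * x + b), 3))
```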

Logistic Regression – Dataset Suitability Checklist