Connecting the Dots – Supervised Learning
We can connect this to vectors and matrices. Let’s dig into it and see how everything fits together.
Statement:
“If we only use y = m * x, our line always goes through (0,0)”
That is true in both:
- Basic algebra (2D lines)
- Linear algebra (vectors/matrices)
And yes — this behavior is deeply rooted in how matrix multiplication works.
In Simple Algebra:
A line like:
y = m * x
means:
When x = 0, y = 0 — always.
So the line must go through the origin (0, 0).
We can’t move the line up or down without adding a bias (intercept) like this:
y = m * x + c
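A quick numeric check makes this concrete (a minimal sketch; the slope and intercept values here are just illustrative):

```python
# Minimal check: a pure linear line vs. a shifted one at x = 0
m, c = 2, 1          # illustrative slope and intercept

x = 0
print(m * x)         # 0 -> y = m * x always hits the origin
print(m * x + c)     # 1 -> adding c lifts the line off the origin
```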
In Linear Algebra (Vectors and Matrices):
When using matrix multiplication for linear models:
y = X @ W
- X is our input matrix (shape: [n_samples, n_features])
- W is our weights vector (shape: [n_features, 1])
- y is the output (predicted values)
This is a pure linear transformation.
And here’s the rule:
Linear transformations always map the origin (0) to the origin.
So if our model is:
y = X @ W
Then:
If X = 0, then y = 0
That’s the same as forcing the line to pass through (0,0).
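One way to see why: linearity means T(a * v) = a * T(v) for any scalar a, so plugging in a = 0 forces T(0) = 0. Here is a small NumPy sketch of this rule (the shapes and values below are made up for illustration):

```python
import numpy as np

# Illustrative shapes: 3 samples, 2 features
X = np.zeros((3, 2))              # an all-zero input batch
W = np.array([[1.5], [-0.5]])     # weights, shape (2, 1)

y = X @ W                         # pure linear transformation
print(y)                          # [[0.], [0.], [0.]] -- zero in, zero out
```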
To Shift the Line (or Plane), an Affine Transformation Is Needed
That’s where we add bias (b):
y = X @ W + b
This is now an affine transformation — linear + shift.
It lets our model move the line (or decision boundary) up/down/around to better match the data.
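Continuing the same illustrative example, the bias is exactly what the model outputs when every input is zero:

```python
import numpy as np

X = np.zeros((3, 2))              # same all-zero input batch
W = np.array([[1.5], [-0.5]])     # weights, shape (2, 1)
b = np.array([0.7])               # bias

y = X @ W + b                     # affine transformation
print(y)                          # [[0.7], [0.7], [0.7]] -- shifted away from the origin
```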
Summary Table:
| Equation | Type | Goes through origin? | Can shift? |
|---|---|---|---|
| y = m * x | Linear | Yes | No |
| y = m * x + c | Affine (shifted line) | Only if c = 0 | Yes |
| y = X @ W | Linear (matrix) | Yes | No |
| y = X @ W + b | Affine (matrix) | Only if b = 0 | Yes |
Final Answer:
The reason y = m * x (or y = X @ W) always passes through the origin is a fundamental property of linear transformations in vector/matrix algebra.
Adding + c or + b makes it affine, not purely linear — and allows the model to learn any offset, not just the slope.
Let’s build a NumPy demo that shows the difference between:
- A pure linear transformation (y = W * x)
- An affine transformation (y = W * x + b)
We’ll visualize both so we can see how the bias shifts the line.
Python Code with Plot (Linear vs. Affine)
```python
import numpy as np
import matplotlib.pyplot as plt

# Inputs (x values)
X = np.array([[0], [1], [2], [3], [4], [5]])

# Target equation: y = 2x + 1
W = np.array([2])   # weight/slope
b = np.array([1])   # bias/intercept

# Pure linear transformation (no bias): y = W * x
y_linear = X @ W          # same as np.dot(X, W), no intercept

# Affine transformation (with bias): y = W * x + b
y_affine = X @ W + b

# Plot both
plt.figure(figsize=(8, 5))
plt.plot(X, y_linear, label="Linear: y = 2x", linestyle='--', color='blue')
plt.plot(X, y_affine, label="Affine: y = 2x + 1", linestyle='-', color='green')
plt.scatter(X, y_affine, color='green')
plt.axhline(0, color='gray', linewidth=0.5)
plt.axvline(0, color='gray', linewidth=0.5)

# Labels and legend
plt.title("Linear vs Affine Transformation")
plt.xlabel("Input x")
plt.ylabel("Output y")
plt.legend()
plt.grid(True)
plt.show()
```
What we’ll See:
- Dashed Blue Line (y = 2x): always passes through (0,0)
- Solid Green Line (y = 2x + 1): same slope, but shifted up by 1
Summary:
y = W * x is a pure linear transformation → can only rotate/scale, no shift.
y = W * x + b is an affine transformation → can rotate/scale and shift.
This is why machine learning models almost always include a bias term — it gives them the flexibility to fit more real-world data.
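To see that flexibility in a fitting context, here is a small least-squares sketch (the data and values are invented for illustration): a model without a bias column cannot recover the offset in the data, while one with a bias column can.

```python
import numpy as np

# Invented data from y = 2x + 5 (offset of 5)
x = np.array([0., 1., 2., 3., 4.])
y = 2 * x + 5

# Without bias: fit y ≈ w * x  (design matrix is just x)
w_no_bias, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

# With bias: fit y ≈ w * x + b  (append a column of ones)
X_aug = np.column_stack([x, np.ones_like(x)])
w_bias, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

print(w_no_bias)   # roughly [3.67] -- slope distorted to compensate for the missing offset
print(w_bias)      # [2., 5.] -- recovers both slope and intercept
```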
Supervised Learning – Visual Roadmap