Summary – Supervised Learning

1. Definition & Basic Concept

  • Learn from labeled examples (input → correct output)
  • The model searches these examples for a pattern so it can predict outputs for new, unseen inputs

2. Training Data

  • Data must have both:
    • Inputs (features) → what the model sees
    • Outputs (labels) → the correct answers
  • Example: x = [1, 2, 3] and y = [3, 5, 7] (here each y is 2x + 1)
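
A minimal sketch of this toy dataset in Python (NumPy assumed; the variable names are illustrative):

import numpy as np

x = np.array([1.0, 2.0, 3.0])   # inputs (features): what the model sees
y = np.array([3.0, 5.0, 7.0])   # outputs (labels); here each y equals 2x + 1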

3. The Model

  • A function that maps input to output: y = f(x)
  • Can be as simple as a linear equation or as complex as a neural network

4. Linear Regression (First Example Model)

  • Model: y = m * x + c (slope and intercept)
  • m = weight, c = bias/intercept
  • Real models often start with: y = X @ W + b
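
A minimal sketch of that vectorized prediction, assuming NumPy and a single feature (values chosen to match the example data above):

import numpy as np

X = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1): 3 samples, 1 feature
W = np.array([2.0])                  # weight vector: one entry per feature
b = 1.0                              # bias / intercept
y_pred = X @ W + b                   # [3., 5., 7.]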

5. Why Add a Bias (Intercept)?

  • Without bias: Line always passes through (0,0)
  • Bias allows the model to shift the line up/down to better fit real data
  • Mathematically:
    • Without bias → linear transformation
    • With bias → affine transformation
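
A tiny illustration of the difference (toy numbers, chosen only to match the example above):

m, c = 2.0, 1.0
print(m * 0.0)       # 0.0 -> linear map: the prediction at x = 0 is pinned to 0
print(m * 0.0 + c)   # 1.0 -> affine map: the bias shifts the line up by c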

6. How the Model Learns (Training Process)

  • The model starts with initial guesses for the weight (m) and bias (c)
  • Makes predictions using the current values
  • Measures the error (difference between predicted and actual outputs)
  • Adjusts the weight and bias to reduce the error → Gradient Descent (sketched in the code below)
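
A minimal end-to-end sketch of this loop, assuming NumPy and the toy data from section 2 (the learning rate and step count are illustrative):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])

m, c = 0.0, 0.0   # initial guesses for weight and bias
lr = 0.1          # learning rate (step size)

for _ in range(1000):
    y_pred = m * x + c                # 1. predict with the current values
    error = y_pred - y                # 2. measure the error
    grad_m = (2 * error * x).mean()   # 3. gradient of the MSE loss w.r.t. m
    grad_c = (2 * error).mean()       #    ... and w.r.t. c
    m -= lr * grad_m                  # 4. step against the gradient
    c -= lr * grad_c

print(m, c)  # converges toward m ≈ 2, c ≈ 1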

7. Loss Function (How Error is Measured)

A common choice is Mean Squared Error (MSE):

loss = mean((y_pred - y_true)^2)

  • Lower loss = better model
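
The same formula as a small Python function (NumPy assumed; the sample values are illustrative):

import numpy as np

def mse(y_pred, y_true):
    # average of the squared differences
    return ((y_pred - y_true) ** 2).mean()

print(mse(np.array([3.1, 5.0, 6.8]), np.array([3.0, 5.0, 7.0])))  # ≈ 0.0167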

8. Gradient Descent (How Model Improves)

  • Calculates the gradients (slopes) of the loss function with respect to the weights and bias
  • Adjusts weights/bias in the opposite direction of the gradient
  • Learning rate controls the step size
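
A toy illustration of how the learning rate controls the step size, using the one-dimensional loss L(w) = w^2 (gradient 2w, minimum at w = 0); the rates are illustrative:

for lr in (0.01, 0.5, 1.1):
    w = 1.0
    for _ in range(20):
        w -= lr * 2 * w   # step against the gradient
    print(lr, w)
# lr = 0.01 -> slow but steady progress toward 0
# lr = 0.5  -> reaches the minimum immediately
# lr = 1.1  -> each step overshoots; w diverges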

9. Matrix Representation (Vectorized Form)

  • Inputs = matrix X
  • Weights = vector W
  • Prediction: y = X @ W + b
  • Without +b, it’s a pure linear transformation (passes through origin)
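
A minimal multi-feature sketch of the vectorized form (NumPy assumed; the numbers are illustrative):

import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0]])   # 3 samples, 2 features
W = np.array([2.0, -1.0])    # one weight per feature
b = 0.5

print(X @ W + b)             # [0.5, 4.5, 5.5]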

10. Practical Tools (Optional for Real Projects)

  • scikit-learn for linear regression and other models
  • matplotlib to visualize the data, predictions, and the fitted line
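
A minimal sketch with those tools, fitting the same toy data (scikit-learn's LinearRegression finds the weights and intercept internally):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0]])   # scikit-learn expects 2-D features
y = np.array([3.0, 5.0, 7.0])

model = LinearRegression()
model.fit(X, y)
print(model.coef_, model.intercept_)  # ≈ [2.0] and 1.0

plt.scatter(X, y)                     # the labeled examples
plt.plot(X, model.predict(X))         # the fitted line
plt.show()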

Supervised Learning – Visual Roadmap