Basic Math Concepts – Prediction Error in a Neural Network

The prediction error (the difference between the actual value and the predicted value) is the basis for most loss functions.

1. Absolute Error:

Error = |y − ŷ|

2. Mean Squared Error (MSE):

MSE = (1/n) ∑(yᵢ − ŷᵢ)²

Commonly used in regression problems. Squaring ensures the error is always positive and penalizes large errors more heavily.
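
As a quick Python sketch of both error measures (the target value and predictions below are made-up numbers, not output from a real network):

    def absolute_error(y, y_hat):
        # |y − ŷ|: magnitude of a single prediction error
        return abs(y - y_hat)

    def mse(ys, y_hats):
        # Mean squared error over several predictions
        return sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats)) / len(ys)

    print(absolute_error(10, 7.0))       # 3.0
    print(mse([10, 10], [7.0, 9.0]))     # (9 + 1) / 2 = 5.0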

Visualization Over Epochs

Epoch | Predicted (ŷ) | Actual (y) | Error (y − ŷ)
------|---------------|------------|---------------
1     | 5.0           | 10         | 5.0
2     | 7.0           | 10         | 3.0
3     | 9.0           | 10         | 1.0
4     | 9.8           | 10         | 0.2

The error keeps decreasing as training progresses, as the toy sketch below illustrates.
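
A minimal sketch of this behaviour, assuming a toy update rule that simply moves the prediction a fixed fraction of the remaining error each epoch (so the numbers will not match the table exactly):

    y = 10.0        # actual value
    y_hat = 5.0     # initial prediction
    step = 0.5      # fraction of the remaining error corrected per epoch (arbitrary)

    for epoch in range(1, 5):
        error = y - y_hat
        print(f"Epoch {epoch}: predicted={y_hat:.1f}, actual={y}, error={error:.1f}")
        y_hat += step * error   # nudge the prediction toward the target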

Math Topics Needed for Neural Networks

1. Algebra

  • Understanding linear equations like:

    y = wx + b

  • Variables, constants, coefficients
  • Substitution and solving equations (see the sketch after this list)
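
A small sketch of this algebra, with arbitrary values for w, b, and x:

    w, b = 2.0, 1.0            # coefficient (weight) and constant (bias)

    # Substitution: evaluate y = wx + b for a given x
    x = 3.0
    y = w * x + b
    print(y)                   # 7.0

    # Solving: given y, rearrange y = wx + b into x = (y − b) / w
    y_target = 11.0
    x_solved = (y_target - b) / w
    print(x_solved)            # 5.0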

2. Functions

  • Concept of input → function → output
  • Example:

    f(x) = wx + b

  • Understanding how inputs are transformed through layers (see the sketch after this list)
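
A brief sketch of the input → function → output idea, with two made-up "layers" applied one after the other:

    def layer1(x):
        # First layer: f1(x) = 2x + 1
        return 2 * x + 1

    def layer2(x):
        # Second layer: f2(x) = 3x − 4
        return 3 * x - 4

    x = 5.0
    hidden = layer1(x)         # 11.0
    output = layer2(hidden)    # 29.0
    print(hidden, output)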

3. Error Measurement (Distance Metrics)

  • Difference between actual and predicted values:

    Error = y − ŷ

  • Squared error to penalize large errors (see the sketch after this list):

    Loss = (y − ŷ)²
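
A quick sketch comparing the plain error with the squared error, to show how squaring penalizes large errors disproportionately (the predictions are arbitrary):

    y = 10.0
    for y_hat in (9.0, 5.0):               # small error vs. large error
        error = y - y_hat
        loss = (y - y_hat) ** 2
        print(f"error={error:.1f}, squared loss={loss:.1f}")
    # A 5× larger error produces a 25× larger squared loss.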

4. Calculus – Derivatives (Gradient)

This is the most important topic for understanding how neural networks learn!

  • How a function changes with respect to a variable:

    d/dx (x²) = 2x

  • Derivative tells us the slope or direction to minimize error
  • In neural nets:
    • The derivative of the loss with respect to a weight (∂L/∂w) tells us how to update that weight
    • Weights are updated in the direction that reduces the loss (see the sketch after this list)
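
A minimal sketch that checks d/dx (x²) = 2x numerically, and does the same for ∂L/∂w in a one-weight model ŷ = w·x (all the numbers are illustrative):

    def numeric_derivative(f, x, h=1e-6):
        # Finite-difference approximation of df/dx
        return (f(x + h) - f(x - h)) / (2 * h)

    # d/dx (x²) at x = 3 should be 2·3 = 6
    print(numeric_derivative(lambda x: x ** 2, 3.0))   # ≈ 6.0

    # ∂L/∂w for L = (y − w·x)², a single-weight model with no bias
    x, y, w = 2.0, 10.0, 1.0

    def loss(w):
        # Squared error for this single-weight model
        return (y - w * x) ** 2

    print(numeric_derivative(loss, w))                 # ≈ −2·x·(y − w·x) = −32.0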

5. Gradient Descent

  • Iterative method to minimize a function:

    w = w − η ⋅ ∂L/∂w

    where η is the learning rate

  • It moves the weights in the negative gradient direction to reduce the loss (see the sketch after this list)
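
A minimal gradient-descent sketch for a single-weight model ŷ = w·x (bias omitted for brevity); the training example, starting weight, and learning rate η are made-up:

    x, y = 2.0, 10.0     # one training example
    w = 0.0              # initial weight
    eta = 0.05           # learning rate η

    for step in range(10):
        y_hat = w * x
        loss = (y - y_hat) ** 2
        grad = -2 * x * (y - y_hat)    # ∂L/∂w for L = (y − w·x)²
        w = w - eta * grad             # move in the negative gradient direction
        print(f"step {step}: w={w:.3f}, loss={loss:.3f}")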

6. Chain Rule of Calculus (for Multi-layer Networks)

  • When functions are composed:

    z = f(g(x))  →  dz/dx = (df/dg) ⋅ (dg/dx)

  • Essential for backpropagation through multiple layers (see the sketch after this list)
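
A short sketch that verifies the chain rule numerically for a made-up composition with g(x) = 3x + 1 and f(u) = u²:

    def g(x):
        # Inner function, dg/dx = 3
        return 3 * x + 1

    def f(u):
        # Outer function, df/du = 2u
        return u ** 2

    x = 2.0
    # Chain rule: dz/dx = (df/dg) ⋅ (dg/dx) = 2·g(x) · 3
    analytic = 2 * g(x) * 3
    # Finite-difference check
    h = 1e-6
    numeric = (f(g(x + h)) - f(g(x - h))) / (2 * h)
    print(analytic, round(numeric, 3))   # 42.0 and ≈ 42.0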

7. Mean and Averaging

  • Loss is often averaged across samples (see the sketch after this list):

    MSE = (1/n) ∑(yᵢ − ŷᵢ)²
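
A short sketch of averaging squared errors across a small, made-up batch of samples:

    ys     = [10.0, 4.0, 7.0]    # actual values
    y_hats = [ 9.0, 5.0, 7.5]    # predicted values

    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats)]
    mse = sum(squared_errors) / len(ys)
    print(mse)                   # (1 + 1 + 0.25) / 3 = 0.75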

Summary Table

Math Topic        | Why It's Needed in Neural Networks
------------------|-------------------------------------------------------
Algebra           | To define models (e.g., y = wx + b)
Functions         | To understand how input maps to output
Error Measurement | To evaluate prediction accuracy
Derivatives       | To calculate how the error changes w.r.t. parameters
Gradient Descent  | To reduce error by updating weights
Chain Rule        | To apply backpropagation across layers
Averages          | To generalize loss over datasets

Prediction Error in a Neural Network – Summary