Basic Math Concepts: Minimizing the Objective in a Neural Network
1. The Model Function
We assume a simple linear model:
ŷ = w ⋅ x
Where:
- x is the input,
- w is the weight (what we want to learn),
- ŷ is the predicted output.
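As a concrete illustration, here is a minimal sketch of this model in Python (the function name `predict` is our own choice, not part of any particular library):

```python
def predict(w: float, x: float) -> float:
    """Linear model: the prediction is simply weight times input."""
    return w * x

# Example: with w = 2.0 and x = 3.0 the model predicts 6.0.
print(predict(2.0, 3.0))  # 6.0
```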
2. Loss Function: Mean Squared Error (MSE)
To measure how “wrong” a prediction is, we use the squared error for a single training example (x, y):
Loss = (ŷ - y)^2 = (w ⋅ x - y)^2
This is a non-negative number representing the error; the smaller, the better. The MSE is just the average of this squared error over all training examples.
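A minimal sketch of this loss in Python (the name `squared_error` is our own, hypothetical choice):

```python
def squared_error(w: float, x: float, y: float) -> float:
    """Squared error between the prediction w * x and the target y."""
    y_hat = w * x
    return (y_hat - y) ** 2

# Example: w = 2.0, x = 3.0, y = 5.0 -> prediction 6.0, loss (6.0 - 5.0)^2 = 1.0
print(squared_error(2.0, 3.0, 5.0))  # 1.0
```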
3. Minimization via Gradient Descent
We want to adjust w to minimize the loss.
So we compute the gradient, i.e., the slope of the loss function with respect to w:
d/dw [(w ⋅ x - y)^2] = 2(w ⋅ x - y) ⋅ x
This derivative tells us how fast the loss increases or decreases if we change w a little.
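The same gradient as a small code sketch (again with our own hypothetical function name):

```python
def gradient(w: float, x: float, y: float) -> float:
    """Derivative of (w * x - y)^2 with respect to w: 2 * (w * x - y) * x."""
    return 2 * (w * x - y) * x

# Example: w = 2.0, x = 3.0, y = 5.0 gives 2 * (6.0 - 5.0) * 3.0 = 6.0.
# The positive sign means increasing w would increase the loss here,
# so gradient descent will decrease w.
print(gradient(2.0, 3.0, 5.0))  # 6.0
```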
4. Weight Update Rule
We adjust the weight in the opposite direction of the gradient:
w = w - η ⋅ dLoss/dw
Where:
- η (eta) is the learning rate, a small positive number that controls the step size (a full update loop is sketched below).
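Putting the pieces together, here is a minimal gradient descent loop for a single (x, y) pair; the function name `train` and the default values for the learning rate and number of steps are assumptions chosen for illustration:

```python
def train(x: float, y: float, w: float = 0.0, lr: float = 0.01, steps: int = 100) -> float:
    """Repeatedly apply w <- w - lr * dLoss/dw for a single training example."""
    for _ in range(steps):
        grad = 2 * (w * x - y) * x   # dLoss/dw
        w = w - lr * grad            # step in the opposite direction of the gradient
    return w

# Example: fit the target y = 5.0 for input x = 3.0.
# The learned weight should approach y / x = 5 / 3 ≈ 1.667.
print(train(x=3.0, y=5.0))
```

Each pass through the loop moves the prediction w ⋅ x a little closer to y; with a learning rate that is too large, the updates can overshoot and diverge instead.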
Summary of Math Concepts Needed:
| Concept | Description |
|---|---|
| Linear Equation | ŷ = w ⋅ x |
| Squared Error Loss | (ŷ - y)^2 |
| Derivative (Gradient) | To measure how the weight affects the loss |
| Gradient Descent Update | w = w - η ⋅ gradient |
Optional, but Helpful:
- Understanding functions and slopes (from calculus)
- Chain rule (if we go deeper into neural networks with multiple layers)
- Intuition about convex functions: why the squared loss gives us a single minimum (a short worked expansion follows below)
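To make that last point concrete, here is a short worked expansion (our own addition) showing that the single-example squared loss is a parabola in w:

Loss(w) = (w ⋅ x - y)^2 = x^2 ⋅ w^2 - 2xy ⋅ w + y^2

For x ≠ 0 this is an upward-opening parabola in w, so it has exactly one minimum, at w = y/x, where the prediction matches the target exactly.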