Basic Math Concepts – L2 Regularization in Neural Networks
L2 regularization adds a penalty term to the loss function:
Loss Without Regularization:

$$\text{Loss} = \text{Error}(y, \hat{y})$$

Loss With L2 Regularization:

$$\text{Loss} = \text{Error}(y, \hat{y}) + \lambda \sum_{j} w_j^2$$
Where:
- Error(y, ŷ) is the prediction error (for example, mean squared error)
- λ is the regularization strength
- $w_j$ are the model’s weights (a short code sketch of this computation follows)
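As a concrete illustration, here is a minimal NumPy sketch of the regularized loss. The MSE error term, the weight vectors, and the λ value below are illustrative assumptions, not something specified by the formulas above:

```python
import numpy as np

def l2_regularized_loss(y_true, y_pred, weights, lam=0.01):
    """Prediction error (MSE here, as an example) plus the L2 penalty."""
    error = np.mean((y_true - y_pred) ** 2)   # loss without regularization
    l2_penalty = lam * np.sum(weights ** 2)   # lambda * sum_j w_j^2
    return error + l2_penalty

# Same predictions, but larger weights incur a larger total loss
y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.1, 0.8])
small_w = np.array([0.1, -0.2, 0.05])
large_w = np.array([3.0, -4.0, 2.5])
print(l2_regularized_loss(y_true, y_pred, small_w))  # error + small penalty
print(l2_regularized_loss(y_true, y_pred, large_w))  # same error, much larger penalty
```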
Why It Helps
- Minimizing this loss not only makes predictions accurate but also keeps the weights small.
- Small weights mean simpler models, which reduces overfitting (the update-rule sketch below shows why the penalty shrinks weights).
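One way to see this: the gradient of $\lambda \sum_j w_j^2$ is $2\lambda w_j$, so every gradient step nudges each weight toward zero in addition to reducing the error (often called "weight decay"). A minimal sketch, assuming plain gradient descent with an arbitrary learning rate and λ:

```python
import numpy as np

def sgd_step(w, grad_of_error, lam=0.1, lr=0.1):
    # Gradient of the regularized loss = error gradient + 2 * lam * w,
    # so each update shrinks every weight toward zero.
    return w - lr * (grad_of_error + 2 * lam * w)

w = np.array([2.0, -3.0])
zero_error_grad = np.zeros_like(w)   # even when the error gradient is zero...
for _ in range(100):
    w = sgd_step(w, zero_error_grad)
print(w)  # ...the L2 term alone has steadily shrunk the weights
```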
Summary
| Aspect | Without L2 | With L2 |
|---|---|---|
| Loss | Just prediction error | Error + weight penalty |
| Weights | Can become large | Encouraged to stay small |
| Generalization | May overfit | Better on unseen data |
Here is the chart showing how the weight values evolve over epochs with and without L2 regularization (a small simulation that reproduces this comparison appears after the list):
- Without L2: The weights can grow larger as the model tries to minimize the error aggressively.
- With L2: The weight values are more controlled and converge smoothly, thanks to the regularization penalty.
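For intuition, here is a small, self-contained simulation of that comparison. The linear model, synthetic data, λ, learning rate, and epoch count are all assumptions chosen for illustration; it tracks the weight norm per epoch with and without the L2 penalty:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, -1.5]) + rng.normal(scale=0.1, size=100)

def train(lam, epochs=200, lr=0.05):
    """Gradient descent on MSE + lam * ||w||^2, recording the weight norm each epoch."""
    w = np.zeros(5)
    norms = []
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * lam * w  # MSE gradient + L2 gradient
        w -= lr * grad
        norms.append(np.linalg.norm(w))
    return norms

norms_no_l2 = train(lam=0.0)     # weights grow freely to fit the data
norms_with_l2 = train(lam=0.5)   # penalty keeps the weight norm smaller
print(norms_no_l2[-1], norms_with_l2[-1])
```

Plotting the two norm curves over epochs reproduces the qualitative behaviour described above: the regularized run converges to a noticeably smaller weight norm.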
Next – L1 vs L2 Regularization: choosing between them for different use cases in Neural Networks