Basic Math Concepts – L1 Regularization vs L2 Regularization: Selecting Between Them for Different Use Cases in Neural Networks

1. L1 Regularization:

Loss Function (e.g., for regression):

Loss = MSE + λ ∑ᵢ |wᵢ|

  • Drives some weights wᵢ exactly to zero, producing sparse models (implicit feature selection)
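The L1-penalized loss above can be sketched directly in NumPy. The function name and test values below are illustrative, not from the original notes:

```python
import numpy as np

def l1_regularized_mse(y_true, y_pred, weights, lam):
    """MSE loss plus an L1 penalty: MSE + lam * sum(|w_i|)."""
    mse = np.mean((y_true - y_pred) ** 2)
    l1_penalty = lam * np.sum(np.abs(weights))
    return mse + l1_penalty

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
w = np.array([0.5, -0.3, 0.0])
print(l1_regularized_mse(y_true, y_pred, w, lam=0.1))  # MSE 0.02 + penalty 0.08 = 0.10
```

Note that the zero weight contributes nothing to the penalty; this is what makes L1 push small weights all the way to zero during optimization.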

2. L2 Regularization:

Loss = MSE + λ ∑ᵢ wᵢ²

  • Shrinks all wᵢ toward zero but rarely makes them exactly zero, so every feature keeps some influence
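The L2 variant only changes the penalty term from |wᵢ| to wᵢ². A minimal sketch, using the same illustrative values as assumptions:

```python
import numpy as np

def l2_regularized_mse(y_true, y_pred, weights, lam):
    """MSE loss plus an L2 penalty: MSE + lam * sum(w_i^2)."""
    mse = np.mean((y_true - y_pred) ** 2)
    l2_penalty = lam * np.sum(weights ** 2)
    return mse + l2_penalty

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
w = np.array([0.5, -0.3, 0.0])
print(l2_regularized_mse(y_true, y_pred, w, lam=0.1))  # MSE 0.02 + penalty 0.034 = 0.054
```

Because the penalty is quadratic, its gradient 2λwᵢ vanishes as wᵢ approaches zero, so L2 shrinks large weights aggressively but leaves small weights small and non-zero.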

3. Rule of Thumb Summary

Question                                             Go with
Do I want to remove unnecessary features?            L1
Do I want smooth predictions with all features?      L2
Do I have many correlated inputs?                    L2
Do I care about feature importance and simplicity?   L1
Do I want the best of both worlds?                   Elastic Net
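Elastic Net simply mixes the two penalties. The mixing-parameter convention below (alpha weighting L1 vs L2) is an illustrative assumption; libraries differ in naming (e.g. scikit-learn uses `l1_ratio`):

```python
import numpy as np

def elastic_net_penalty(weights, lam, alpha):
    """Combined penalty: lam * (alpha * sum|w_i| + (1 - alpha) * sum(w_i^2)).

    alpha=1.0 recovers pure L1; alpha=0.0 recovers pure L2.
    """
    l1 = np.sum(np.abs(weights))
    l2 = np.sum(weights ** 2)
    return lam * (alpha * l1 + (1 - alpha) * l2)

w = np.array([0.5, -0.3, 0.0])
print(elastic_net_penalty(w, lam=0.1, alpha=0.5))  # 0.1 * (0.5*0.8 + 0.5*0.34) = 0.057
```

This gives some sparsity from the L1 part while the L2 part keeps the penalty well-behaved when inputs are correlated.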

Next – Prediction Error in Neural Networks