Basic Math Concepts – Choosing Between L1 and L2 Regularization for Different Use Cases in Neural Networks
1. L1 Regularization (Lasso):
Loss function (e.g., for regression):
Loss = MSE + λ ∑ |wi|
- Drives some weights wi exactly to zero, producing a sparse model that effectively performs feature selection
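A minimal sketch of this sparsity effect, using plain NumPy with proximal gradient descent (ISTA) on a small synthetic regression problem; the data, λ, learning rate, and iteration count are all illustrative choices, not prescribed values:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([2.0, 0.0, 0.0, -3.0, 0.0])  # only features 0 and 3 matter
y = X @ true_w + 0.1 * rng.normal(size=100)

lam = 0.5   # regularization strength λ
lr = 0.01   # step size
w = np.zeros(5)

for _ in range(2000):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE term
    w = w - lr * grad
    # proximal step for the λ ∑ |wi| term: soft-thresholding
    w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)

print(np.round(w, 3))  # the three uninformative weights land at exactly 0
```

The soft-threshold step is what makes L1 different from plain gradient descent: any weight whose magnitude falls below the threshold is set to exactly zero rather than merely shrunk.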
2. L2 Regularization (Ridge):
Loss = MSE + λ ∑ wi²
- Shrinks all weights wi toward zero but rarely makes any of them exactly zero; spreads weight across correlated features
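A quick contrast with the L1 case, using the closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy on the same kind of synthetic data; λ = 5.0 is an arbitrary illustrative value:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([2.0, 0.0, 0.0, -3.0, 0.0]) + 0.1 * rng.normal(size=100)

lam = 5.0
d = X.shape[1]
# closed-form ridge solution: w = (XᵀX + λI)⁻¹ Xᵀ y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
# ordinary least squares for comparison (no regularization)
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

print(np.round(w_ols, 3))
print(np.round(w_ridge, 3))
# ridge weights have a smaller norm than OLS, but none are exactly zero
```

Unlike the L1 example, every coefficient stays nonzero: the quadratic penalty has no threshold effect, it only scales weights down.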
3. Rule of Thumb Summary

| Question | Go with |
|---|---|
| Do I want to remove unnecessary features? | L1 |
| Do I want smooth predictions using all features? | L2 |
| Do I have many correlated inputs? | L2 |
| Do I care about feature importance and simplicity? | L1 |
| Do I want the best of both worlds? | Elastic Net |
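For the last row of the table, a short sketch of Elastic Net, which mixes both penalties. This assumes scikit-learn is available; `alpha` and `l1_ratio` (1.0 = pure L1, 0.0 = pure L2) are illustrative values, not recommendations:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([2.0, 0.0, 0.0, -3.0, 0.0]) + 0.1 * rng.normal(size=100)

# l1_ratio controls the L1/L2 mix: sparsity from L1, stability from L2
model = ElasticNet(alpha=0.5, l1_ratio=0.5)
model.fit(X, y)
print(np.round(model.coef_, 3))  # some weights zeroed out, the rest shrunk
```

In practice `alpha` and `l1_ratio` are usually tuned by cross-validation (e.g. with scikit-learn's `ElasticNetCV`) rather than set by hand.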