Basic Math Concepts – LeCun Initialization Applicability in Neural Networks
The main goal of LeCun initialization is to preserve the variance of activations across layers:
Formula:

$$W \sim \mathcal{N}\!\left(0, \frac{1}{n_{\text{in}}}\right)$$

So that:

$$\operatorname{Var}(y) \approx \operatorname{Var}(x)$$

where $n_{\text{in}}$ is the number of inputs to the layer, $x$ is the layer input, and $y = Wx$ is the pre-activation output.
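
As a quick check of the formula above, here is a minimal NumPy sketch (the sizes and variable names are illustrative, not from the original text) comparing the pre-activation variance under LeCun initialization with a naive unit-variance initialization:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, batch = 256, 256, 10_000

# Inputs with unit variance
x = rng.standard_normal((batch, n_in))

# LeCun initialization: weights drawn from N(0, 1/n_in)
W_lecun = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_in, n_out))

# Naive initialization: unit-variance weights, for comparison
W_naive = rng.standard_normal((n_in, n_out))

print(np.var(x @ W_lecun))  # ~1.0   -> variance preserved
print(np.var(x @ W_naive))  # ~256   -> variance explodes with fan-in
```

With the LeCun scaling, the output variance stays close to the input variance, so repeated layers neither blow up nor shrink the signal, which is exactly what keeps tanh/sigmoid units out of their saturated regions.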
Summary
| Aspect | Explanation |
|---|---|
| Why LeCun? | Helps avoid vanishing gradients with tanh/sigmoid activations |
| When to use? | Shallow networks or activations like tanh/sigmoid |
| Stock Use Case Fit | Works well in shallow stock predictors with sliding windows (time series) |
| Math Core | Initialize weights with N(0, 1/n_in) |
| Visual Difference | Loss decreases faster and more smoothly with LeCun init |
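
Putting the summary into practice, below is a minimal sketch (assuming PyTorch; the window size, layer widths, and the `lecun_normal_` helper are illustrative assumptions, not part of the original text) of applying LeCun initialization to a shallow sliding-window stock predictor with tanh activations:

```python
import math
import torch.nn as nn

def lecun_normal_(layer: nn.Linear) -> None:
    """Initialize a linear layer with weights ~ N(0, 1/fan_in) and zero bias."""
    fan_in = layer.in_features
    nn.init.normal_(layer.weight, mean=0.0, std=math.sqrt(1.0 / fan_in))
    nn.init.zeros_(layer.bias)

# Hypothetical shallow predictor: the last 30 closing prices (sliding window)
# as input, one hidden tanh layer, a single regression output.
window = 30
model = nn.Sequential(
    nn.Linear(window, 16),
    nn.Tanh(),
    nn.Linear(16, 1),
)

# Apply LeCun initialization to every linear layer in the model
for module in model.modules():
    if isinstance(module, nn.Linear):
        lecun_normal_(module)
```

Writing the helper by hand keeps the scaling explicit (std = 1/sqrt(fan_in)), which matches the N(0, 1/n_in) rule in the table above.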

