Basic Math Concepts – LeCun Initialization Applicability in Neural Networks

The main goal of LeCun initialization is to preserve the variance of activations across layers.

Formula:

$$W \sim \mathcal{N}\!\left(0, \frac{1}{n_{\text{in}}}\right)$$

where $n_{\text{in}}$ is the number of inputs to the layer. So that, for a pre-activation $y = \sum_{i=1}^{n_{\text{in}}} w_i x_i$ with independent, zero-mean weights and unit-variance inputs:

$$\mathrm{Var}(y) = n_{\text{in}} \cdot \mathrm{Var}(w_i) \cdot \mathrm{Var}(x_i) = n_{\text{in}} \cdot \frac{1}{n_{\text{in}}} \cdot \mathrm{Var}(x_i) = \mathrm{Var}(x_i)$$
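To make the variance-preservation argument concrete, here is a minimal NumPy sketch. The layer sizes, batch size, and standard-normal inputs are illustrative assumptions, not values from these notes; it samples weights from N(0, 1/n_in) and checks that the pre-activation variance stays close to the input variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for the check.
n_in, n_out, batch = 256, 256, 10_000

# Unit-variance inputs.
x = rng.normal(0.0, 1.0, size=(batch, n_in))

# LeCun initialization: W ~ N(0, 1/n_in), i.e. std = sqrt(1/n_in).
W = rng.normal(0.0, np.sqrt(1.0 / n_in), size=(n_in, n_out))

pre_act = x @ W          # pre-activations y = Wx
act = np.tanh(pre_act)   # tanh keeps values in its near-linear region

print("Var(x):        ", x.var())        # ~1.0
print("Var(Wx):       ", pre_act.var())  # ~1.0, thanks to the 1/n_in scaling
print("Var(tanh(Wx)): ", act.var())      # slightly below 1.0 (tanh compresses)
```

Scaling the weight variance by 1/n_in exactly cancels the factor of n_in that summing over the inputs would otherwise add, which is why the printed variances stay close to each other.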
Summary

| Aspect | Explanation |
| --- | --- |
| Why LeCun? | Avoids vanishing gradients with tanh/sigmoid activations |
| When to use? | Shallow networks or saturating activations such as tanh/sigmoid |
| Stock use-case fit | Works well in shallow stock predictors with sliding windows (time series) |
| Math core | Initialize weights with N(0, 1/n_in) |
| Visual difference | Loss decreases faster and more smoothly with LeCun initialization |
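As a usage sketch for the shallow time-series setting in the table above, the snippet below builds a small tanh regressor on sliding-window features using Keras' built-in `lecun_normal` initializer. The window length, layer widths, epoch count, and the synthetic data are placeholder assumptions, not a prescribed stock-prediction setup.

```python
import numpy as np
import tensorflow as tf

window = 20  # hypothetical sliding-window length (e.g., past closing prices)

# Placeholder data standing in for real windowed features and next-step targets.
X = np.random.randn(1000, window).astype("float32")
y = np.random.randn(1000, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window,)),
    # LeCun normal init (W ~ N(0, 1/n_in)) paired with a tanh activation.
    tf.keras.layers.Dense(32, activation="tanh",
                          kernel_initializer="lecun_normal"),
    tf.keras.layers.Dense(1, kernel_initializer="lecun_normal"),
])

model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print("MSE on placeholder data:", model.evaluate(X, y, verbose=0))
```

With real data, the expectation sketched here is the table's last row: compared to a naive standard-normal initialization, the loss curve should drop faster and with fewer oscillations.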

Next – Regularization in Neural Networks