Basic Math Concepts – Xavier Initialization Applicability in Neural Networks
| Concept | Explanation |
|---|---|
| Uniform Distribution | Weights are drawn uniformly at random from the interval [-limit, +limit] |
| Variance Matching | Keeps the output variance approximately equal to the input variance |
| ReLU Activation | Mitigates vanishing gradients; note that Xavier assumes symmetric activations (tanh/sigmoid), while the He variant is usually preferred for ReLU |
| Square Root Scaling | The limit is derived by propagating variance through the layers |
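As a minimal sketch of the uniform variant described in the table (the function name, layer sizes, and use of NumPy are illustrative assumptions, not part of the original text):

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=None):
    """Sample a (fan_in, fan_out) weight matrix from U[-limit, +limit],
    where limit = sqrt(6 / (fan_in + fan_out))."""
    if rng is None:
        rng = np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Example: weights for a hypothetical 256 -> 128 fully connected layer
W = xavier_uniform(256, 128)
print(W.shape, W.min(), W.max())  # all values lie inside [-limit, +limit]
```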
Xavier Formula Again:
W ~ U[-sqrt(6 / (n_in + n_out)), +sqrt(6 / (n_in + n_out))]
This is based on preserving variance:
- Variance of inputs ≈ Variance of outputs
- Helps keep gradient descent stable
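To make the variance-preservation point concrete, here is a small numerical check (the layer sizes and batch size are assumptions chosen for illustration): pushing unit-variance inputs through a Xavier-initialized linear layer keeps the pre-activation variance close to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out, batch = 512, 512, 10_000

# Xavier-uniform weights: limit = sqrt(6 / (fan_in + fan_out))
limit = np.sqrt(6.0 / (fan_in + fan_out))
W = rng.uniform(-limit, limit, size=(fan_in, fan_out))

x = rng.standard_normal((batch, fan_in))   # unit-variance inputs
y = x @ W                                  # pre-activations of the layer

print(f"input variance  ~ {x.var():.3f}")   # ~ 1.0
print(f"output variance ~ {y.var():.3f}")   # stays ~ 1.0 instead of exploding or shrinking
```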
Next – Sparse Initialization Applicability in Neural Networks

