First and Second Derivatives in Neural Networks – Example with Simple Python
1. Python Program: First and Second Derivatives
# Sample loss function: L(w) = (w - 3)^2
def loss(w):
    return (w - 3) ** 2

# First-order derivative (gradient): dL/dw = 2(w - 3)
def gradient(w):
    return 2 * (w - 3)

# Second-order derivative (Hessian): constant curvature for a quadratic loss
def hessian(w):
    return 2

# Newton's update: w_new = w - L'(w) / L''(w)
def newtons_method_step(w):
    grad = gradient(w)
    hess = hessian(w)
    return w - grad / hess

# Simulation: start at w = 0 and apply five Newton steps
w = 0.0
for i in range(5):
    print(f"Step {i+1}: w = {w:.4f}, Loss = {loss(w):.4f}")
    w = newtons_method_step(w)
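Because this loss is an exact quadratic, Newton's method lands on the minimum in a single step: the first iteration prints w = 0.0000 with loss 9.0000, and every subsequent iteration prints w = 3.0000 with loss 0.0000.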
2. How They Relate to Neural Network Components
| Neural Network Component | First Order (Gradient) | Second Order (Curvature) |
|---|---|---|
| Forward Propagation | Not involved directly | Not involved |
| Loss Function | Gradient used to minimize loss | Hessian is used for precise step adjustment |
| Backpropagation | Computes gradients layer-by-layer | Can be extended for second-order backprop |
| First-Order Optimizers (SGD, Adam) | Use the gradient only | Not used; Adam's second moments track gradient statistics, not curvature |
| Advanced Optimizers (Newton, L-BFGS) | Still rely on the gradient | Require or approximate the Hessian; crucial for curvature-aware steps (see the sketch below) |
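To make the optimizer rows concrete, here is a minimal sketch (not part of the program above) that contrasts a gradient-only update with a Newton update on a two-parameter quadratic loss. The matrix A, vector b, learning rate, and the names loss2d/gradient2d/hessian2d are illustrative choices for this example, not a standard API.

```python
import numpy as np

# Illustrative two-parameter quadratic loss: L(w) = 0.5 * w^T A w - b^T w
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])  # symmetric positive-definite curvature matrix
b = np.array([1.0, 2.0])

def loss2d(w):
    return 0.5 * w @ A @ w - b @ w

def gradient2d(w):
    # First-order information: one entry per parameter (a vector)
    return A @ w - b

def hessian2d(w):
    # Second-order information: an n x n matrix (constant for a quadratic loss)
    return A

w_gd = np.zeros(2)      # parameters updated with the gradient only
w_newton = np.zeros(2)  # parameters updated with gradient + curvature
lr = 0.1                # learning rate for the gradient-only update

for _ in range(10):
    w_gd = w_gd - lr * gradient2d(w_gd)                              # SGD-style step
    newton_step = np.linalg.solve(hessian2d(w_newton), gradient2d(w_newton))
    w_newton = w_newton - newton_step                                # Newton step

print("Gradient descent :", w_gd, "loss =", loss2d(w_gd))
print("Newton's method  :", w_newton, "loss =", loss2d(w_newton))
```

Because the loss is quadratic, the Newton update reaches the minimum in one step, while the gradient-only update only approaches it gradually; that trade-off is what the summary table below captures.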
3. Summary of Differences
| Feature | First Order | Second Order |
|---|---|---|
| Output | Gradient (Vector) | Hessian (Matrix) |
| Use Case | Weight update | Refined optimization |
| Cost | Computationally cheaper (one entry per parameter) | Expensive to store and invert (see the size sketch below) |
| Role in Training | Always used (backprop) | Optional (advanced methods) |
| Analogy | Slope of the loss surface | Curvature (how the slope bends) |
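As a rough illustration of the cost row: for a model with n parameters the gradient has n entries while the dense Hessian has n × n. The snippet below is a back-of-the-envelope sketch assuming 4-byte floats and dense storage; the parameter counts and the helper human_size are illustrative, not benchmarks.

```python
def human_size(num_bytes):
    # Tiny helper to print byte counts in a readable unit
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} PB"

# Dense storage, 4 bytes per value: gradient has n entries, Hessian has n * n
for n in [1_000, 1_000_000, 100_000_000]:  # illustrative parameter counts
    print(f"n = {n:>11,}: gradient {human_size(4 * n):>9}, "
          f"Hessian {human_size(4 * n * n):>9}")
```

Even for a million-parameter model the full Hessian would occupy terabytes, which is why the table lists second-order information as optional and why practical methods (L-BFGS, Hessian-vector products) only approximate it.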
