Regularization Example with Simple Python

We’ll simulate a mini neural network (a single neuron) and train it with and without L2 regularization.

import math

# Sigmoid activation
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Derivative of sigmoid
def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1 - sx)

# Loss with L2 regularization (squared-weights penalty)
def compute_loss(y_true, y_pred, weights, lambda_reg=0.0):
    mse = (y_true - y_pred) ** 2
    reg_term = lambda_reg * sum(w ** 2 for w in weights)
    return mse + reg_term

# Mini training loop
inputs = [0.5, 1.5]
target = 1
weights = [0.2, -0.4]  # Initial weights
bias = 0.1
learning_rate = 0.1
lambda_reg = 0.1  # Regularization strength

for epoch in range(10):
    # Weighted sum
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    # Activation
    pred = sigmoid(z)
    # Loss
    loss = compute_loss(target, pred, weights, lambda_reg)
    
    # Gradients via the chain rule: dL/dw_i = dL/dpred * dpred/dz * dz/dw_i
    dloss_dpred = -2 * (target - pred)  # derivative of the squared error
    dpred_dz = sigmoid_derivative(z)

    dz_dw = inputs[:]  # dz/dw_i is simply the input x_i
    dz_db = 1          # dz/db is 1

    # Update weights with L2 regularization.
    # Note: the exact gradient of the lambda * w^2 penalty is 2 * lambda * w;
    # here the factor of 2 is absorbed into lambda_reg (a common
    # weight-decay convention).
    for i in range(len(weights)):
        grad = dloss_dpred * dpred_dz * dz_dw[i] + lambda_reg * weights[i]
        weights[i] -= learning_rate * grad

    # Update bias (no regularization applied to bias here)
    bias -= learning_rate * (dloss_dpred * dpred_dz * dz_db)

    print(f"Epoch {epoch+1}: Loss={loss:.4f}, Pred={pred:.4f}, Weights={weights}")

Output:

Epoch 1: Loss=0.3784, Pred=0.4013, Weights=[0.2123841143684869, -0.35284765689453934]
Epoch 2: Loss=0.3453, Pred=0.4269, Weights=[0.22428074663693348, -0.3072577600891995]
Epoch 3: Loss=0.3146, Pred=0.4522, Weights=[0.2356085081369916, -0.2634734755890252]
Epoch 4: Loss=0.2864, Pred=0.4767, Weights=[0.24630756956134686, -0.2216733013159594]
Epoch 5: Loss=0.2608, Pred=0.5002, Weights=[0.25634008957002646, -0.18196978118992052]
Epoch 6: Loss=0.2378, Pred=0.5226, Weights=[0.26568862005527466, -0.144414289235176]
Epoch 7: Loss=0.2174, Pred=0.5437, Weights=[0.27435315629222085, -0.10900587903032746]
Epoch 8: Loss=0.1993, Pred=0.5635, Weights=[0.2823475455038854, -0.07570205791626385]
Epoch 9: Loss=0.1833, Pred=0.5819, Weights=[0.2896958437647122, -0.04442971618950425]
Epoch 10: Loss=0.1693, Pred=0.5991, Weights=[0.29642901457388354, -0.015095031287153796]
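
A small inconsistency worth flagging: compute_loss adds a penalty of lambda_reg * sum(w ** 2), whose exact derivative is 2 * lambda_reg * w, while the weight update adds only lambda_reg * w. The numbers above come from the code exactly as written; to make the loss and the update strictly consistent, the penalty can be halved so that it differentiates to the term used in the loop. A minimal sketch (compute_loss_consistent is a name introduced here for illustration):

# Halving the penalty makes its derivative exactly lambda_reg * w,
# matching the weight update used in the training loop above.
def compute_loss_consistent(y_true, y_pred, weights, lambda_reg=0.0):
    mse = (y_true - y_pred) ** 2
    reg_term = 0.5 * lambda_reg * sum(w ** 2 for w in weights)
    return mse + reg_term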

What to Observe?

  • Without lambda_reg, the weights are free to grow large, which encourages overfitting.
  • With lambda_reg, the weights are pulled toward zero, and generalization tends to improve (compare the two runs in the sketch below).
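
To see the contrast directly, you can wrap the loop above in a function and run it twice, once with lambda_reg = 0.0 and once with lambda_reg = 0.1, then compare the final weights. The sketch below reuses sigmoid and sigmoid_derivative from above; train is a helper name introduced here for illustration:

def train(lambda_reg, epochs=10):
    # Same toy setup as above, reset for each run
    inputs = [0.5, 1.5]
    target = 1
    weights = [0.2, -0.4]
    bias = 0.1
    learning_rate = 0.1
    for _ in range(epochs):
        z = sum(x * w for x, w in zip(inputs, weights)) + bias
        pred = sigmoid(z)
        dloss_dpred = -2 * (target - pred)
        dpred_dz = sigmoid_derivative(z)
        for j in range(len(weights)):
            grad = dloss_dpred * dpred_dz * inputs[j] + lambda_reg * weights[j]
            weights[j] -= learning_rate * grad
        bias -= learning_rate * dloss_dpred * dpred_dz
    return weights

print("lambda_reg = 0.0:", train(lambda_reg=0.0))
print("lambda_reg = 0.1:", train(lambda_reg=0.1))

Without the penalty, each update is driven purely by the data-fitting term; with it, every step also shrinks each weight toward zero by learning_rate * lambda_reg * w.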

Regularization in Neural Networks – Basic Math Concepts