Validation Set with Simple Python

We’ll simulate training a mini neural network (really a single linear neuron, y = w·x + b) with a validation check, using no external libraries.

import random

# Step 1: Generate toy data
def generate_data(num_samples):
    data = []
    for _ in range(num_samples):
        x = random.uniform(0, 10)
        y = 2 * x + 3  # True function: y = 2x + 3
        data.append((x, y))
    return data

# Step 2: Split into training (80%) and validation (20%) sets
data = generate_data(100)
train_data = data[:80]
val_data = data[80:]  # held out: never used for weight updates

# Step 3: Initialize weights
w = random.uniform(0, 1)
b = random.uniform(0, 1)
lr = 0.001  # learning rate

# Step 4: Train the model
def train_epoch(data, w, b, lr):
    for x, y in data:
        y_pred = w * x + b
        error = y_pred - y
        # SGD update: gradient of the squared error (constant factor 2 folded into lr)
        w -= lr * error * x
        b -= lr * error
    return w, b

# Step 5: Loss function (Mean Squared Error)
def evaluate(data, w, b):
    total_error = 0
    for x, y in data:
        y_pred = w * x + b
        total_error += (y - y_pred) ** 2
    return total_error / len(data)

# Step 6: Training loop with validation check
for epoch in range(1, 101):
    w, b = train_epoch(train_data, w, b, lr)
    train_loss = evaluate(train_data, w, b)
    val_loss = evaluate(val_data, w, b)
    if epoch % 10 == 0:
        print(f"Epoch {epoch}: Train Loss = {train_loss:.4f}, Validation Loss = {val_loss:.4f}")

Basic Observations

  1. Training Loss Decreases Quickly:
    • Because the model adjusts weights to fit the training data well.
  2. Validation Loss Should Also Decrease (Initially):
    • This means the model is learning generalized patterns.
  3. If Validation Loss Starts Increasing:
    • It’s a sign of overfitting: the model is memorizing the training data instead of learning patterns that generalize to new data.
  4. Stopping Training Early:
    • When validation loss stops improving, we should stop training (a technique called early stopping).
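The early-stopping idea in point 4 can be sketched on top of the same toy model by tracking the best validation loss seen so far and a patience counter. This is a minimal illustration, not a library API: the `patience` threshold, the `bad_epochs` counter, and the idea of keeping the best weights in `best_w`/`best_b` are all illustrative choices.

```python
import random

random.seed(0)  # reproducible toy data

# Same toy setup as above: samples from the true function y = 2x + 3
def generate_data(n):
    return [(x, 2 * x + 3) for x in (random.uniform(0, 10) for _ in range(n))]

data = generate_data(100)
train_data, val_data = data[:80], data[80:]

w, b, lr = random.uniform(0, 1), random.uniform(0, 1), 0.001

def train_epoch(data, w, b, lr):
    for x, y in data:
        error = (w * x + b) - y
        w -= lr * error * x  # SGD step on the squared error
        b -= lr * error
    return w, b

def evaluate(data, w, b):
    return sum((y - (w * x + b)) ** 2 for x, y in data) / len(data)

best_val = float("inf")
best_w, best_b = w, b
patience, bad_epochs = 5, 0  # stop after 5 epochs with no improvement

for epoch in range(1, 1001):
    w, b = train_epoch(train_data, w, b, lr)
    val_loss = evaluate(val_data, w, b)
    if val_loss < best_val:
        best_val = val_loss
        best_w, best_b = w, b  # remember the best checkpoint
        bad_epochs = 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            print(f"Early stopping at epoch {epoch}")
            break

w, b = best_w, best_b  # restore the weights with the lowest validation loss
```

Because the toy data is noise-free, validation loss here decreases until convergence; on real, noisy data the patience counter is what keeps a brief plateau from stopping training too soon.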

Validation Sets in Neural Networks – Summary