Overfitting vs Underfitting Impact Example with Simple Python
1. We’ll simulate this by comparing a too-simple linear model, a too-complex polynomial-style model, and the true underlying function.
import random

# Step 1: Generate Data (y = 2x + 1 + noise)
def generate_data(n=20):
    data = []
    for _ in range(n):
        x = random.uniform(0, 10)
        noise = random.uniform(-1, 1)
        y = 2 * x + 1 + noise  # True underlying function
        data.append((x, y))
    return data

train_data = generate_data(20)
test_data = generate_data(10)

# Step 2: Predict with simple linear model (Underfitting)
def predict_linear(x, w, b):
    return w * x + b

# Step 3: Predict with complex model (Overfitting mimic with high-degree poly)
def predict_complex(x, weights):
    # Simulate high-degree polynomial: y = w0 + w1*x + w2*x^2 + ... + w5*x^5
    return sum(weights[i] * (x ** i) for i in range(len(weights)))

# Step 4: Evaluate Mean Squared Error
def mse(data, predict_fn):
    error = 0
    for x, y in data:
        y_pred = predict_fn(x)
        error += (y - y_pred) ** 2
    return error / len(data)

# Underfitting: Model too simple
w_simple = 0.5
b_simple = 0.5
underfit_predict = lambda x: predict_linear(x, w_simple, b_simple)

# Overfitting: Model too complex
w_complex = [1, 0.5, -0.2, 0.1, -0.05, 0.02]  # over-tuned weights
overfit_predict = lambda x: predict_complex(x, w_complex)

# Ground truth model (close to reality)
true_predict = lambda x: 2 * x + 1

print("Training MSE:")
print("Underfitting Model:", mse(train_data, underfit_predict))
print("Overfitting Model :", mse(train_data, overfit_predict))
print("True Model :", mse(train_data, true_predict))

print("\nTesting MSE:")
print("Underfitting Model:", mse(test_data, underfit_predict))
print("Overfitting Model :", mse(test_data, overfit_predict))
print("True Model :", mse(test_data, true_predict))
Sample Output You May See:
Training MSE:
Underfitting Model: 59.55756432821365
Overfitting Model : 78237.63722256606
True Model : 0.2031184902109579

Testing MSE:
Underfitting Model: 77.43513174946756
Overfitting Model : 60178.331390988395
True Model : 0.36663658392975257
Here the underfitting model fails on both sets, while the true model generalizes well. Note that the complex model also fails on both sets, because its weights are hand-picked rather than fitted to the training data; a genuinely overfitted model would instead score well on training and poorly on test.
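To see that low-training/high-test pattern, the high-capacity model has to be actually trained on the small training set. Below is a minimal sketch, assuming numpy is available and reusing train_data, test_data, and mse from the code above; the degree-9 polynomial is an arbitrary choice for illustration.

import numpy as np

train_x = np.array([x for x, _ in train_data])
train_y = np.array([y for _, y in train_data])

# High-capacity model fitted to only 20 noisy points (may emit a RankWarning).
overfit_coeffs = np.polyfit(train_x, train_y, 9)
fitted_overfit = lambda x: float(np.polyval(overfit_coeffs, x))

# For contrast, a fitted straight line (degree 1) matches the true relationship.
linear_coeffs = np.polyfit(train_x, train_y, 1)
fitted_linear = lambda x: float(np.polyval(linear_coeffs, x))

print("Fitted degree-9 poly -> train MSE:", mse(train_data, fitted_overfit),
      " test MSE:", mse(test_data, fitted_overfit))
print("Fitted straight line -> train MSE:", mse(train_data, fitted_linear),
      " test MSE:", mse(test_data, fitted_linear))

Exact numbers depend on the random draw, but the fitted degree-9 polynomial should show a noticeably lower training MSE than test MSE, which is the characteristic overfitting signature.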
2. Statement:
A model that appears to be overfitted on a small dataset might perform just fine (i.e., not overfitted) on a much larger dataset of the same nature.
Why This Happens:
1. Overfitting is about memorizing noise
- On a small dataset, a complex model might memorize specific patterns or noise, leading to poor generalization.
- But on a larger dataset, the model has more diverse examples to learn from, making it harder to memorize and easier to learn general patterns.
2. Larger dataset gives better generalization
- A bigger dataset tends to have:
- Less chance of coincidental correlations.
- Better representation of the real-world variance.
- A smoother, more stable empirical picture of the underlying function.
3. Model capacity stays the same, but data gets richer
- If the model capacity (number of layers, neurons, etc.) is fixed, increasing the dataset reduces the overfitting risk, because the model is no longer too large relative to the data (see the sketch after this list).
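A small experiment can illustrate this: keep the model capacity fixed (a degree-9 polynomial, an arbitrary choice here) and vary only the amount of training data. This sketch assumes numpy and reuses generate_data and mse from the earlier code; exact numbers vary run to run, but the gap between training and test MSE should shrink as n grows.

import numpy as np

def fit_poly_and_evaluate(n_train, degree=9, n_test=200):
    # Same model capacity every time; only the training set size changes.
    train = generate_data(n_train)
    test = generate_data(n_test)
    xs = np.array([x for x, _ in train])
    ys = np.array([y for _, y in train])
    coeffs = np.polyfit(xs, ys, degree)
    predict = lambda x: float(np.polyval(coeffs, x))
    return mse(train, predict), mse(test, predict)

for n in (15, 100, 1000):
    train_mse, test_mse = fit_poly_and_evaluate(n)
    print(f"n={n:5d}  train MSE={train_mse:10.4f}  test MSE={test_mse:10.4f}")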
Example Analogy:
Imagine training a student with:
- 5 sample questions → the student memorizes (overfits).
- 500 sample questions of the same pattern → the student starts learning the actual concept, not just the answers.
Important:
Increasing data alone doesn’t guarantee that overfitting is solved, but it helps greatly, especially when:
- The data is clean and diverse.
- The model is not excessively large even relative to the bigger dataset.
- Proper validation techniques (like cross-validation, sketched below) are used.
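As one illustration of that last point, here is a minimal k-fold cross-validation sketch in the same plain-Python style as the earlier example. The helper fit_line (closed-form least squares for a straight line) is introduced here purely for illustration; k_fold_mse reuses generate_data and mse from the code above.

import random

def fit_line(train):
    # Ordinary least squares for y = w*x + b, closed form.
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in train)
    var = sum((x - mean_x) ** 2 for x, _ in train)
    w = cov / var
    b = mean_y - w * mean_x
    return lambda x: w * x + b

def k_fold_mse(data, fit_fn, k=5):
    # Average validation MSE over k held-out folds.
    data = data[:]  # copy so the caller's list is not reordered
    random.shuffle(data)
    fold_size = len(data) // k
    scores = []
    for i in range(k):
        val = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        predictor = fit_fn(train)
        scores.append(mse(val, predictor))
    return sum(scores) / len(scores)

print("5-fold CV MSE (fitted line):", k_fold_mse(generate_data(100), fit_line, k=5))

A cross-validated score like this is a more honest estimate of generalization error than a single train/test split, because every point is held out exactly once.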
Empirical Rule of Thumb:
- Complex models + Small Data → High risk of overfitting
- Complex models + Large Data → Better generalization
- Simple models + Large Data → Often the safest and most robust
Overfitting vs Underfitting Impact in Neural Networks – Basic Math Concepts