Overfitting vs Underfitting Impact Example with Simple Python

1. We’ll simulate the impact of underfitting and overfitting using a simple polynomial-like model.

import random

# Step 1: Generate Data (y = 2x + 1 + noise)
def generate_data(n=20):
    data = []
    for _ in range(n):
        x = random.uniform(0, 10)
        noise = random.uniform(-1, 1)
        y = 2 * x + 1 + noise  # True underlying function
        data.append((x, y))
    return data

train_data = generate_data(20)
test_data = generate_data(10)

# Step 2: Predict with simple linear model (Underfitting)
def predict_linear(x, w, b):
    return w * x + b

# Step 3: Predict with complex model (Overfitting mimic with high-degree poly)
def predict_complex(x, weights):
    # Simulate high degree polynomial: y = w0 + w1*x + w2*x^2 + ... + w5*x^5
    return sum([weights[i] * (x ** i) for i in range(len(weights))])

# Step 4: Evaluate Mean Squared Error
def mse(data, predict_fn):
    error = 0
    for x, y in data:
        y_pred = predict_fn(x)
        error += (y - y_pred) ** 2
    return error / len(data)

# Underfitting: Model too simple
w_simple = 0.5
b_simple = 0.5
underfit_predict = lambda x: predict_linear(x, w_simple, b_simple)

# Overfitting: Model too complex
w_complex = [1, 0.5, -0.2, 0.1, -0.05, 0.02]  # hand-picked high-degree weights (not fitted to the data)
overfit_predict = lambda x: predict_complex(x, w_complex)

# Ground truth model (close to reality)
true_predict = lambda x: 2 * x + 1

print("Training MSE:")
print("Underfitting Model:", mse(train_data, underfit_predict))
print("Overfitting Model :", mse(train_data, overfit_predict))
print("True Model        :", mse(train_data, true_predict))

print("\nTesting MSE:")
print("Underfitting Model:", mse(test_data, underfit_predict))
print("Overfitting Model :", mse(test_data, overfit_predict))
print("True Model        :", mse(test_data, true_predict))

Sample Output You May See:

Training MSE:
Underfitting Model: 59.55756432821365
Overfitting Model : 78237.63722256606
True Model        : 0.2031184902109579

Testing MSE:
Underfitting Model: 77.43513174946756
Overfitting Model : 60178.331390988395
True Model        : 0.36663658392975257

Here, the underfitting model performs poorly on both sets, while the true model generalizes well. Note that the mimic "overfitting" model uses hand-picked weights rather than weights fitted to the training data, so its MSE is huge on both sets; in genuine overfitting, a complex model that is actually fitted to the training data shows a very low training MSE but a much higher test MSE.
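To reproduce that classic overfitting signature, the complex model has to be actually fitted to the training data. The sketch below does that, assuming NumPy is available (it is not used in the original example); exact numbers vary from run to run.

import numpy as np

# Reuse the train_data and test_data generated above
x_train = np.array([x for x, _ in train_data])
y_train = np.array([y for _, y in train_data])
x_test = np.array([x for x, _ in test_data])
y_test = np.array([y for _, y in test_data])

def poly_mse(coeffs, x, y):
    # Mean squared error of a fitted polynomial on (x, y)
    return float(np.mean((y - np.polyval(coeffs, x)) ** 2))

# Degree 1 matches the form of the true model; degree 9 is far more
# flexible than the data justifies and starts chasing the noise.
fit_linear = np.polyfit(x_train, y_train, 1)
fit_complex = np.polyfit(x_train, y_train, 9)

print("Fitted linear   - train:", poly_mse(fit_linear, x_train, y_train),
      " test:", poly_mse(fit_linear, x_test, y_test))
print("Fitted degree-9 - train:", poly_mse(fit_complex, x_train, y_train),
      " test:", poly_mse(fit_complex, x_test, y_test))

On most runs the degree-9 fit reports a lower training MSE than the straight-line fit but a noticeably higher test MSE, which is the pattern described above.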

2. Statement:

A model that appears to be overfitted on a small dataset might perform just fine (i.e., not overfitted) on a much larger dataset of the same nature.

Why This Happens:

1. Overfitting is about memorizing noise

  • On a small dataset, a complex model might memorize specific patterns or noise, leading to poor generalization.
  • But on a larger dataset, the model has more diverse examples to learn from, making it harder to memorize and easier to learn general patterns.

2. Larger dataset gives better generalization

  • A bigger dataset tends to have:
    • Less chance of coincidental correlations.
    • Better representation of the real-world variance.
    • A smoother estimate of the underlying function to learn.

3. Model capacity stays the same, but data gets richer

  • If the model capacity (number of layers, neurons, etc.) is fixed, increasing the dataset reduces the risk of overfitting, because the model is no longer too large relative to the data. A sketch of this effect follows below.
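A minimal sketch of that effect keeps the capacity fixed at a degree-5 polynomial (the same capacity as the mimic model in part 1) while growing the training set; it assumes NumPy is available, reuses the y = 2x + 1 + noise setup, and the exact numbers vary per run:

import random
import numpy as np

def make_xy(n):
    # Same data-generating process as generate_data() above
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 1 + random.uniform(-1, 1) for x in xs]
    return np.array(xs), np.array(ys)

x_test, y_test = make_xy(200)   # held-out data from the same distribution

for n_train in (6, 60, 600):
    x_tr, y_tr = make_xy(n_train)
    coeffs = np.polyfit(x_tr, y_tr, 5)   # model capacity fixed at degree 5
    test_mse = float(np.mean((y_test - np.polyval(coeffs, x_test)) ** 2))
    print(f"training size {n_train:4d} -> test MSE {test_mse:.3f}")

With only 6 points the degree-5 polynomial interpolates the noise exactly and its test MSE is typically large; with a few hundred points the same model tracks the true line and the test MSE drops toward the noise level.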

Example Analogy:

Imagine training a student with:

  • 5 sample questions → the student memorizes (overfits).
  • 500 sample questions of the same pattern → the student starts learning the actual concept, not just the answers.

Important:

Just increasing the data doesn’t guarantee a fix for overfitting, but it helps greatly, especially when:

  • The data is clean and diverse.
  • The model is not excessively large even for the bigger dataset.
  • Proper validation techniques (like the cross-validation sketched below) are used.
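As a rough illustration of the last point, here is a minimal k-fold cross-validation sketch in plain Python, building on the mse() and train_data defined in part 1 (the fold count and the fit_line helper are illustrative choices, not part of the original example):

def k_fold_mse(data, k, fit_fn):
    # Split data into k folds; fit on k-1 folds, evaluate on the held-out fold.
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        held_out = folds[i]
        train = [pt for j, fold in enumerate(folds) if j != i for pt in fold]
        predict_fn = fit_fn(train)   # fit_fn returns a prediction function
        scores.append(mse(held_out, predict_fn))
    return sum(scores) / k

def fit_line(train):
    # Ordinary least-squares fit of y = w*x + b in plain Python
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    w = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
    b = mean_y - w * mean_x
    return lambda x: w * x + b

print("5-fold CV MSE (fitted line):", k_fold_mse(train_data, 5, fit_line))

The averaged held-out MSE gives a more honest estimate of generalization error than the training MSE alone.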

Empirical Rule of Thumb:

  • Complex model + small data → high risk of overfitting
  • Complex model + large data → better generalization
  • Simple model + large data → often the safest and most robust

Overfitting vs Underfitting Impact in Neural Networks – Basic Math Concepts