Gradient Boosting Regression example with Simple Python

1. Predict House Prices Based on Size (sq.ft.)

We’ll:

  • Start with a simple prediction
  • Create tiny “trees” to learn from the errors (simplified here to a straight-line fit standing in for a decision stump)
  • Update our prediction iteratively

Sample Data (House Size and Price)

House Size (sqft)    Price ($1000s)
500                  150
1000                 300
1500                 450
2000                 600
2500                 750

Clearly, price is proportional to size (price = 0.3 × size, with price in $1000s).

Simple Python Code

# Sample training data
X = [500, 1000, 1500, 2000, 2500]          # size in sqft
y = [150, 300, 450, 600, 750]              # price in $1000s

# Step 1: Initial prediction - just use average price
initial_prediction = sum(y) / len(y)
predictions = [initial_prediction] * len(y)

print("Initial guess for all:", predictions)

# Number of boosting rounds
n_rounds = 3
learning_rate = 0.1

# Helper: make a tiny tree (really just a least-squares line fit)
def simple_tree_fit(X, residuals):
    # We’ll fit a basic line: correction = intercept + slope * size
    # (the intercept lets small houses be corrected downward while
    #  large houses are corrected upward)
    n = len(X)
    mean_x = sum(X) / n
    mean_r = sum(residuals) / n
    num = sum((X[i] - mean_x) * (residuals[i] - mean_r) for i in range(n))
    den = sum((X[i] - mean_x) ** 2 for i in range(n))
    slope = num / den
    intercept = mean_r - slope * mean_x
    return intercept, slope

# Boosting iterations
for r in range(n_rounds):
    # Step 2: calculate residuals (errors)
    residuals = [y[i] - predictions[i] for i in range(len(y))]

    # Step 3: fit a weak learner (line: residual ≈ intercept + slope * X)
    intercept, slope = simple_tree_fit(X, residuals)

    # Step 4: update predictions using the learning rate
    for i in range(len(X)):
        predictions[i] += learning_rate * (intercept + slope * X[i])

    print(f"After round {r+1}: Predictions: {[round(p, 2) for p in predictions]}")

# Final result
print("\nFinal Predictions:", [round(p, 2) for p in predictions])

What’s Happening?

  • We start with a naive average guess.
  • Then, in each round:
    1. We compute the residuals (how wrong we are).
    2. Fit a tiny “tree” (here, just a straight line fitted to the residuals).
    3. Update the predictions a little in the right direction.
  • Slowly, our predictions get closer and closer to actual prices.

Output (sample)

Initial guess for all: [450.0, 450.0, 450.0, 450.0, 450.0]
After round 1: Predictions: [420.0, 435.0, 450.0, 465.0, 480.0]
After round 2: Predictions: [393.0, 421.5, 450.0, 478.5, 507.0]
After round 3: Predictions: [368.7, 409.35, 450.0, 490.65, 531.3]

Final Predictions: [368.7, 409.35, 450.0, 490.65, 531.3]

As you can see, each round pulls the small-house predictions down and the large-house predictions up, gradually moving everything closer to the real values.
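For comparison, here is the same workflow with a real library. This is a minimal sketch assuming scikit-learn is available; max_depth=1 makes every weak learner an actual decision stump rather than the simplified line fit used above.

from sklearn.ensemble import GradientBoostingRegressor

X = [[500], [1000], [1500], [2000], [2500]]    # features must be 2D: size in sqft
y = [150, 300, 450, 600, 750]                  # price in $1000s

model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=1)
model.fit(X, y)

print(model.predict([[1200], [2200]]))         # predicted prices for unseen sizes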

2. The Story of Improving Predictions

Imagine we’re trying to predict the price of a house based on its size in square feet. Initially, we have no clue, but we have some data points (like in the table we used earlier):

House Size (sqft)    Price ($1000s)
500                  150
1000                 300
1500                 450
2000                 600
2500                 750

Step 1: A Simple First Guess

We start with a basic guess: What if the price is just the average?

So, we calculate the average price from all houses:

(150 + 300 + 450 + 600 + 750) / 5 = 450

This gives us a starting point. Now, for every house, we predict the price as 450.

  • But this guess is pretty bad. For example, for the 500 sqft house, our prediction is too high (450 vs. 150), and for the 2500 sqft house, it’s too low (450 vs. 750).

Step 2: Learning from Mistakes

Now, we decide to learn from our mistakes (errors).
Instead of blindly sticking with the same guess, we can say: Let’s focus on where we’re wrong and adjust.

We measure the error for each house (actual price minus our prediction):

  • For the 500 sqft house: 150 - 450 = -300 (our guess is 300 too high).
  • For the 2500 sqft house: 750 - 450 = 300 (our guess is 300 too low).
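In code, these errors (the residuals) from the flat 450 guess are simply (a small illustrative snippet, same data as above):

prices = [150, 300, 450, 600, 750]

baseline = sum(prices) / len(prices)            # 450.0, the first guess
residuals = [p - baseline for p in prices]      # actual - predicted

print(residuals)   # [-300.0, -150.0, 0.0, 150.0, 300.0]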

Step 3: Correcting the Mistakes

Now, imagine we have a helper (a small tree). This helper is quite simple — it looks at the size of the house and tries to predict the correction for each house.

For example, it notices:

  • Smaller houses are overvalued by the flat guess: the prediction is too high.
  • Larger houses have the opposite problem: they are undervalued, so the prediction is too low.

So, this helper suggests a correction:

  • If the house is smaller than average, lower the price estimate.
  • If it’s larger, raise it.
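In the code above the helper is a straight line, but it could just as well be a true one-split decision stump: pick a size threshold, then suggest one correction for houses below it and another for houses above it. A minimal sketch of such a stump fit (illustrative, not part of the code above):

def fit_stump(sizes, residuals):
    # Try each observed size as a split point and keep the one with the
    # lowest squared error; each side predicts its mean residual.
    best = None
    for threshold in sizes:
        left  = [r for s, r in zip(sizes, residuals) if s <= threshold]
        right = [r for s, r in zip(sizes, residuals) if s > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]   # (threshold, correction below, correction above)

# Residuals from the flat 450 guess:
print(fit_stump([500, 1000, 1500, 2000, 2500], [-300, -150, 0, 150, 300]))
# -> (1000, -225.0, 150.0): cut at 1000 sqft, lower small houses, raise large ones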

Step 4: Adding the Correction

We then add the correction to the initial guess. This adjustment reduces the error:

  • For the 500 sqft house, the prediction drops below 450, getting closer to the true price (150).
  • For the 2500 sqft house, it rises above 450, moving toward 750.

Now, our predictions are better than before, but there are still errors — not perfect yet.
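In code terms, this is exactly the update used earlier: nudge each prediction by a fraction (the learning rate) of the suggested correction. A tiny worked example for the 500 sqft house, matching round 1 of the code above:

learning_rate = 0.1
old_prediction = 450       # the flat first guess
correction = -300          # what the helper suggests for the 500 sqft house

new_prediction = old_prediction + learning_rate * correction
print(new_prediction)      # 420.0 -> a small step toward the true price of 150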

Step 5: Repeat the Process (Focus on the Residuals)

We can repeat the process by looking at where the model made the largest mistakes (the residuals).
For each iteration, the new helper (a new “tree”) focuses only on these errors and corrects them further.

Each new model doesn’t try to change everything, just the remaining errors (residuals), making each step smaller and more targeted.

Step 6: Stop When We Reach a Point of Diminishing Returns

After several rounds of correction and improvement, the predictions get very close to the true values. But here’s the catch: at some point, adding more corrections no longer makes much of a difference.

This is the point where we should stop. But how do we know when to stop?

Logical Thresholds for Stopping:

1. Error reduction becomes minimal: After several rounds, the errors between predicted and actual prices get smaller and smaller.
If adding another model only slightly improves the prediction (or doesn’t improve it at all), then we stop.

2. Overfitting risk: If we continue adding too many models, we might start overfitting to the noise (small fluctuations) in the data. The model might perform well on the training data but poorly on new, unseen data. So, we stop before this happens.

3. Performance validation: After each round, we can test the model on a validation set (a separate set of data not seen by the model during training). If predictions on this set also get better with each step, then we keep going. But once improvements plateau, we stop.
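The third criterion is usually automated as early stopping against a held-out validation set. A minimal sketch of the idea with scikit-learn (the extra training points are made up purely so a validation split has something to hold out):

from sklearn.ensemble import GradientBoostingRegressor

# Synthetic training data following the same price = 0.3 * size pattern,
# with more points so a validation split makes sense
X_train = [[s] for s in range(500, 5001, 100)]
y_train = [0.3 * x[0] for x in X_train]

# validation_fraction holds out 20% of the data; n_iter_no_change stops
# training once the validation score has not improved for 10 rounds
model = GradientBoostingRegressor(
    n_estimators=1000,            # upper bound on boosting rounds
    learning_rate=0.1,
    max_depth=1,
    validation_fraction=0.2,
    n_iter_no_change=10,
)
model.fit(X_train, y_train)

print(model.n_estimators_)        # rounds actually used before stopping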

Step 7: Final Prediction

Now, after multiple rounds of making corrections and improving the predictions step-by-step, we have a final prediction that’s much closer to reality than the original average guess. For example:

House Size (sqft)    Actual Price ($1000s)    Final Prediction ($1000s)
500                  150                      155
1000                 300                      305
1500                 450                      450
2000                 600                      595
2500                 750                      745

Our model is now accurate and well-calibrated. We could keep fine-tuning, but at this point, additional corrections might not significantly improve the model.

Gradient Boosting Regression – Summary