Supervised Learning with Simple Python

1. We want to predict a fruit from its color, using simple lookup logic in a very short Python program.

Sample Supervised Learning Program

# Step 1: Training data (supervised part — examples with answers)
training_data = [
    {"color": "red", "fruit": "apple"},
    {"color": "yellow", "fruit": "banana"},
    {"color": "green", "fruit": "grape"},
    {"color": "orange", "fruit": "orange"},
]

# Step 2: Learning - store the known data in memory (like learning)
def learn(data):
    memory = {}
    for item in data:
        color = item["color"]
        fruit = item["fruit"]
        memory[color] = fruit  # Remember: red → apple
    return memory

# Step 3: Predict - use what we learned to guess new answers
def predict(memory, new_color):
    if new_color in memory:
        return memory[new_color]
    else:
        return "I don't know that fruit!"

# Train (learn)
learned_memory = learn(training_data)

# Test with new colors
print(predict(learned_memory, "red"))     # apple
print(predict(learned_memory, "yellow"))  # banana
print(predict(learned_memory, "blue"))    #  I don't know that fruit!

What this does:

Training data: Gives examples of colors and their correct fruits.
Learning function: Stores what fruit matches each color.
Prediction function: Guesses the fruit when given a color.

Think of it like this:

“The computer is shown that red = apple and yellow = banana. Later, when the computer is asked ‘What fruit is red?’, it remembers and says: apple!”
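
To make the “supervised” part concrete, we can also check the learned memory against a few labeled examples it was not trained on, like a tiny train/test split. This is only a minimal sketch: it reuses predict and learned_memory from the program above, and the test_data examples here are made up for illustration.

# Hypothetical held-out examples (not part of training_data above)
test_data = [
    {"color": "red", "fruit": "apple"},      # seen color -> should be correct
    {"color": "green", "fruit": "grape"},    # seen color -> should be correct
    {"color": "purple", "fruit": "plum"},    # unseen color -> the model can't answer
]

correct = 0
for example in test_data:
    guess = predict(learned_memory, example["color"])
    if guess == example["fruit"]:
        correct += 1
    print(example["color"], "->", guess, "| expected:", example["fruit"])

print("Accuracy:", correct, "out of", len(test_data))  # 2 out of 3 here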

2. To understand how a pattern is learned in supervised learning, teach the computer with simple input-output pairs that follow a rule:

output = 2 × input + 1

Simple Supervised Pattern Program

# Step 1: Create training data (inputs with correct outputs)
training_data = [
    {"input": 1, "output": 3},  # because 2*1 + 1 = 3
    {"input": 2, "output": 5},  # because 2*2 + 1 = 5
    {"input": 3, "output": 7},  # because 2*3 + 1 = 7
    {"input": 4, "output": 9},  # and so on...
]

# Step 2: Learn the pattern
# We will try to figure out the 'slope' and 'intercept' of a line: y = mx + c

def learn_pattern(data):
    # We’ll calculate average slope (m) from data
    slopes = []
    for item in data:
        x = item["input"]
        y = item["output"]
        slope = (y - 1) / x  # since y = mx + 1, m = (y - 1) / x
        slopes.append(slope)

    # Average slope (just for simplicity)
    m = sum(slopes) / len(slopes)
    c = 1  # we know from training pattern it's +1
    return m, c

# Step 3: Predict new outputs using the learned pattern
def predict(x, m, c):
    return m * x + c

# Train
m, c = learn_pattern(training_data)
print("Learned pattern: y =", round(m, 2), "* x +", c)

# Test predictions
test_inputs = [5, 6, 10]
for x in test_inputs:
    print(f"When input is {x}, predicted output is {predict(x, m, c)}")

Output:

Learned pattern: y = 2.0 * x + 1
When input is 5, predicted output is 11.0
When input is 6, predicted output is 13.0
When input is 10, predicted output is 21.0

Simple Explanation:

  • The computer is shown some number puzzles (input → output).
  • It notices: “Hmm… every time, output = 2 × input + 1”
  • Now, when a new input is given, it applies the pattern to predict the answer.
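
A small extension of the sketch above: instead of hard-coding c = 1, we can solve for both the slope and the intercept from any two training pairs. This assumes the data follows an exact straight line (which it does here) and reuses training_data from the program above.

# Take two training pairs and solve y = m*x + c for m and c
x1, y1 = training_data[0]["input"], training_data[0]["output"]  # (1, 3)
x2, y2 = training_data[1]["input"], training_data[1]["output"]  # (2, 5)

m = (y2 - y1) / (x2 - x1)  # slope between the two points -> 2.0
c = y1 - m * x1            # intercept -> 1.0
print("Recovered pattern: y =", m, "* x +", c)  # y = 2.0 * x + 1.0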

3. In real supervised learning, the machine is not given the pattern. Instead, the computer is given a large number of examples:

Input (x)    Output (y)
    1            3
    2            5
    3            7

Then if one says:

“Hey computer, you figure out the pattern.”

And the computer, using machine learning algorithms, will automatically find:

Ah! The best pattern that fits is y = 2x + 1

The Main Job in Supervised Learning:

  1. Prepare good training data (input + correct output).
  2. Pick a learning model (like linear regression, decision tree, etc.).
  3. Let the model learn the pattern.
  4. Use it to predict new outputs.

In supervised learning, one doesn’t have to write the pattern by hand.

The algorithm (like linear regression) uses math and optimization to find the best pattern (weights) from the examples.

One just feeds it the examples (input → output), and it figures out how to connect them.
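
For instance, here is a minimal sketch of that workflow using scikit-learn's LinearRegression (assuming scikit-learn is installed). We only feed it the input-output pairs from the small table above; the algorithm finds the slope and intercept on its own.

from sklearn.linear_model import LinearRegression

# Step 1: training data (inputs must be 2-D: one row per example)
X = [[1], [2], [3]]
y = [3, 5, 7]

# Step 2: pick a model and let it learn the pattern
model = LinearRegression()
model.fit(X, y)

# Step 3: inspect what it learned and predict new outputs
print("slope (m):", round(model.coef_[0], 2))        # about 2.0
print("intercept (c):", round(model.intercept_, 2))  # about 1.0
print("prediction for x = 10:", round(model.predict([[10]])[0], 2))  # about 21.0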

4. How does the model “understand” that the best equation is y = 2x + 1?

It doesn’t know the equation in the beginning.

Instead, it tries to guess the best line by:

  1. Making an initial guess for weight (slope) and bias (intercept)
  2. Measuring how wrong the guess is using error (loss function)
  3. Adjusting the weight and bias to make better guesses
  4. Repeating this over and over (training!)

Let’s see it like a game:

Try    Weight (m)    Bias (c)    Guessed Equation    Error
 1        1.0           0.0      y = 1x + 0          Big
 2        1.5           0.5      y = 1.5x + 0.5      Smaller
 3        2.0           1.0      y = 2x + 1          Very small

The model finds that:

“Hey! When I use y = 2x + 1, my predictions are really close to the real answers.”

That’s how it learns the best pattern: the one that gives the least error across all training examples.

What helps it do this?

A few key things:

  • A loss function (like mean squared error) to measure how wrong it is (a quick sketch follows this list).
  • Gradient descent to move weight and bias step by step toward better values.
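
Here is that quick sketch: it scores each of the three guesses from the game table with mean squared error, on the same training points used later in this section. Lower error means a better-fitting line.

# Training points generated by y = 2x + 1
data_x = [1, 2, 3, 4, 5]
data_y = [3, 5, 7, 9, 11]

def mean_squared_error(m, c):
    # Average of squared differences between predictions and true outputs
    errors = [(m * x + c - y) ** 2 for x, y in zip(data_x, data_y)]
    return sum(errors) / len(errors)

# The three "tries" from the table above
for m, c in [(1.0, 0.0), (1.5, 0.5), (2.0, 1.0)]:
    print(f"y = {m}x + {c}  ->  MSE = {mean_squared_error(m, c)}")  # 18.0, 4.5, 0.0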

Tiny Visual Analogy:

Let’s say someone is blindfolded and trying to find the lowest point in a valley. He can’t see the bottom, but he can feel the slope under his feet and walk downhill slowly.

That’s what the model does:

  • It feels how much error it’s making (slope).
  • It takes little steps (learning rate).
  • Eventually, it lands at the lowest point — which is the best-fitting equation (like y = 2x + 1).

Here is a super simple Python program that hand-codes Linear Regression with Gradient Descent, so we can see how it learns y = 2x + 1 step by step!

What we’ll see:

  • The model starts with random weight (m) and bias (c)
  • It adjusts them using gradient descent
  • It slowly learns the correct pattern: y = 2x + 1

Python Code: Learn y = 2x + 1 from Scratch


# Training Data: y = 2x + 1
data_x = [1, 2, 3, 4, 5]
data_y = [3, 5, 7, 9, 11]

# Step 1: Start with random guess
m = 0.0  # slope
c = 0.0  # intercept

# Step 2: Learning rate and training steps
learning_rate = 0.01
epochs = 5000  # training steps (enough for m and c to settle very close to 2 and 1)

# Step 3: Training loop (Gradient Descent)
for epoch in range(epochs):
    total_error_m = 0
    total_error_c = 0
    n = len(data_x)

    for i in range(n):
        x = data_x[i]
        y_true = data_y[i]

        y_pred = m * x + c  # our current guess
        error = y_pred - y_true

        # Compute gradients
        total_error_m += error * x
        total_error_c += error

    # Average and apply gradient
    m -= (learning_rate * total_error_m) / n
    c -= (learning_rate * total_error_c) / n

    # Print progress every 100 steps
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: m = {round(m, 4)}, c = {round(c, 4)}")

# Final learned pattern
print("\n Learned Equation: y =", round(m, 2), "* x +", round(c, 2))

# Test prediction
test_x = 6
predicted_y = m * test_x + c
print(f"Prediction: When x = {test_x}, y = {round(predicted_y, 2)}")

Sample Output (later progress lines abridged):

Epoch 0: m = 0.25, c = 0.07
Epoch 100: m ≈ 2.1, c ≈ 0.65
...

Learned Equation: y = 2.0 * x + 1.0
Prediction: When x = 6, y = 13.0

Boom! It learned y = 2x + 1 from scratch, just by adjusting the weight and bias using the training data.
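
As a sanity check, the same line can be recovered in closed form with ordinary least squares; NumPy's polyfit does exactly that (assuming NumPy is installed).

import numpy as np

data_x = [1, 2, 3, 4, 5]
data_y = [3, 5, 7, 9, 11]

# Fit a degree-1 polynomial (a straight line) by least squares
m, c = np.polyfit(data_x, data_y, 1)
print("Closed-form fit: y =", round(m, 2), "* x +", round(c, 2))  # y = 2.0 * x + 1.0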

Supervised Learning – Brainstorming Session