Feature Engineering Example with Simple Python
1. Goal: Predict if someone will buy ice cream
Given:
- temperature (numeric)
- is_holiday (0 or 1)
We’ll:
- Train a simple linear model (like a 1-layer neural net) without feature engineering.
- Train the same model with feature engineering: add an interaction feature (temperature × is_holiday) and normalize the inputs.
Python Program (Pure Python, Simple Logic)
# Sample dataset
data = [
    {'temp': 35, 'holiday': 0, 'buy': 1},
    {'temp': 30, 'holiday': 1, 'buy': 1},
    {'temp': 25, 'holiday': 0, 'buy': 0},
    {'temp': 40, 'holiday': 1, 'buy': 1},
    {'temp': 20, 'holiday': 0, 'buy': 0},
    {'temp': 38, 'holiday': 0, 'buy': 1},
    {'temp': 18, 'holiday': 1, 'buy': 0},
    {'temp': 22, 'holiday': 0, 'buy': 0},
    {'temp': 45, 'holiday': 1, 'buy': 1},
    {'temp': 27, 'holiday': 0, 'buy': 0}
]
# --------- Step 1: Raw data learning (simple weights) ---------
def train_simple_model(data):
    w_temp = 0.0
    w_holiday = 0.0
    bias = 0.0
    lr = 0.01  # learning rate

    for epoch in range(1000):
        for row in data:
            x1 = row['temp']
            x2 = row['holiday']
            y = row['buy']

            z = w_temp * x1 + w_holiday * x2 + bias
            pred = 1 if z > 30 else 0  # simple threshold
            error = y - pred

            # Update weights
            w_temp += lr * error * x1
            w_holiday += lr * error * x2
            bias += lr * error

    return w_temp, w_holiday, bias
def test_model(data, w_temp, w_holiday, bias):
    correct = 0
    for row in data:
        x1 = row['temp']
        x2 = row['holiday']
        y = row['buy']

        z = w_temp * x1 + w_holiday * x2 + bias
        pred = 1 if z > 30 else 0
        if pred == y:
            correct += 1

    accuracy = correct / len(data)
    return accuracy
# --------- Step 2: Add Feature Engineering ---------
def normalize(val, min_val, max_val):
    return (val - min_val) / (max_val - min_val)
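# Quick sanity check (not in the original program): with the sample data the
# temperature range is 18 to 45, so normalize(18, 18, 45) -> 0.0,
# normalize(45, 18, 45) -> 1.0, and normalize(40, 18, 45) -> ~0.81,
# i.e. a hot day lands near the top of the 0-to-1 range.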
def train_engineered_model(data):
    w_temp = 0.0
    w_holiday = 0.0
    w_interact = 0.0
    bias = 0.0
    lr = 0.01

    temps = [row['temp'] for row in data]
    min_temp, max_temp = min(temps), max(temps)

    for epoch in range(1000):
        for row in data:
            # Normalized features + interaction
            x1 = normalize(row['temp'], min_temp, max_temp)
            x2 = row['holiday']
            x3 = x1 * x2  # interaction
            y = row['buy']

            z = w_temp * x1 + w_holiday * x2 + w_interact * x3 + bias
            pred = 1 if z > 0.5 else 0
            error = y - pred

            # Update weights
            w_temp += lr * error * x1
            w_holiday += lr * error * x2
            w_interact += lr * error * x3
            bias += lr * error

    return w_temp, w_holiday, w_interact, bias, min_temp, max_temp
def test_engineered(data, w_temp, w_holiday, w_interact, bias, min_temp, max_temp):
    correct = 0
    for row in data:
        x1 = normalize(row['temp'], min_temp, max_temp)
        x2 = row['holiday']
        x3 = x1 * x2
        y = row['buy']

        z = w_temp * x1 + w_holiday * x2 + w_interact * x3 + bias
        pred = 1 if z > 0.5 else 0
        if pred == y:
            correct += 1

    return correct / len(data)
# --------- Run both models ---------
print("Training without feature engineering...")
w1, w2, b = train_simple_model(data)
acc_raw = test_model(data, w1, w2, b)
print("Accuracy (Raw):", acc_raw)
print("\nTraining with feature engineering...")
w1, w2, w3, b2, min_t, max_t = train_engineered_model(data)
acc_eng = test_engineered(data, w1, w2, w3, b2, min_t, max_t)
print("Accuracy (Engineered):", acc_eng)
Expected Output:
Training without feature engineering...
Accuracy (Raw): 0.6

Training with feature engineering...
Accuracy (Engineered): 1.0
Summary
- The raw model struggles because it can't learn combined patterns like "holiday + hot day → higher sales" from the raw, unscaled inputs.
- The engineered model learns faster and more accurately because we gave it a meaningful combined feature and normalized inputs. A short sketch below shows how such a trained model can then be applied to a new example.
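The following is a minimal sketch, not part of the original program, of how the trained engineered model could score a new day. predict_engineered is a hypothetical helper that simply reuses the feature construction and the 0.5 threshold from test_engineered; it assumes the program above has already run, so normalize and the variables w1, w2, w3, b2, min_t, max_t are in scope.
def predict_engineered(temp, holiday, w_temp, w_holiday, w_interact, bias, min_temp, max_temp):
    # Build the same engineered features used during training
    x1 = normalize(temp, min_temp, max_temp)
    x2 = holiday
    x3 = x1 * x2  # interaction: "hot AND holiday"
    z = w_temp * x1 + w_holiday * x2 + w_interact * x3 + bias
    return 1 if z > 0.5 else 0

# Example: a 33°C holiday (values chosen purely for illustration)
print(predict_engineered(33, 1, w1, w2, w3, b2, min_t, max_t))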
2. Why does adding engineered features improve prediction?
Story Recap:
Imagine we’re trying to guess if people will buy ice cream. We have only:
- temperature (e.g., 35°C)
- holiday (yes or no → 1 or 0)
Now let’s observe two scenarios:
Scenario 1: Raw Inputs Only
- Model sees:
- temp = 30
- holiday = 1
- Model tries:
score = w_temp * 30 + w_holiday * 1 + bias
But the model is linear: it can only add up the effects of its inputs. It can't easily learn "sales are high when the temperature is high AND it's a holiday."
This is a non-linear interaction. With only additive terms, the contribution of temperature is the same whether or not it's a holiday, so a linear model can't combine two inputs multiplicatively unless we help it, as the small sketch below illustrates.
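A tiny sketch of this limitation, using made-up weights (0.1, 2.0 and -1.0 are illustrative values, not the trained ones): with a purely additive score, the holiday boost is the same constant at every temperature.
# Illustrative (made-up) weights for the additive, raw-input model
w_temp, w_holiday, bias = 0.1, 2.0, -1.0

def raw_score(temp, holiday):
    return w_temp * temp + w_holiday * holiday + bias

# The holiday effect is identical on a hot day and a cold day:
print(raw_score(40, 1) - raw_score(40, 0))  # 2.0
print(raw_score(20, 1) - raw_score(20, 0))  # 2.0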
Scenario 2: Engineered Feature Added → Interaction
We created:
x3 = x1 * x2 → normalized_temp × holiday
This feature only activates when both conditions are true.
So for:
- Hot day (temp = 40 → normalized = ~0.8)
- Holiday (1)
Interaction feature becomes:
x3 = 0.8 × 1 = 0.8
Whereas:
- Cold day + holiday → 0.2 × 1 = 0.2
- Hot day + no holiday → 0.8 × 0 = 0
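A short sketch reproduces the three cases above with the normalize helper from the program (18 and 45 are the min and max temperatures in the sample dataset; the 23°C "cold day" is an illustrative value):
def normalize(val, min_val, max_val):
    return (val - min_val) / (max_val - min_val)

min_temp, max_temp = 18, 45  # from the sample dataset

print(normalize(40, min_temp, max_temp) * 1)  # hot day + holiday   -> ~0.81 (rounded to 0.8 above)
print(normalize(23, min_temp, max_temp) * 1)  # cold day + holiday  -> ~0.19 (roughly 0.2)
print(normalize(40, min_temp, max_temp) * 0)  # hot day + no holiday -> 0.0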
New model:
score = w_temp * temp + w_holiday * holiday + w_interact * (temp × holiday) + bias
Now the model can assign special importance to combinations like: "High temperature AND holiday = people buy ice cream".
That logic couldn’t be learned by a simple sum of weights.
Simple Analogy:
Without interaction: You’re saying “Hot days” are good, “Holidays” are good — independently.
With interaction: You’re saying “Hot Holidays” are especially good!
Technically Speaking:
A simple model without interactions:
z = w1 * temp + w2 * holiday + bias
is a linear separator, i.e., its decision boundary is a straight line in the (temp, holiday) plane.
Adding:
z = w1 * temp + w2 * holiday + w3 * (temp × holiday) + bias
lets it fit curved or conditional boundaries, i.e., it can:
- Adjust the temperature slope depending on whether it's a holiday (see the rearranged formula below).
- Learn contextual influence (e.g., holiday matters mainly when it's hot).
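To see why, group the temperature terms in the interaction formula above:
z = (w1 + w3 * holiday) * temp + w2 * holiday + bias
When holiday = 0 the temperature slope is w1; when holiday = 1 it becomes w1 + w3. The interaction weight w3 therefore lets the model treat temperature differently on holidays, which a purely additive model cannot do.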
This is feature engineering — making the model’s job easier by expressing logic explicitly in features.
In our code
Before:
- The model has no way to learn the temperature × holiday effect.
- It reaches only mediocre accuracy (around 60%).
After:
- The model sees that when temp is high and holiday is 1, customers buy.
- It fits faster and more precisely → 100% accuracy.
Conclusion
Adding the temp × holiday interaction tells the model: “Hey, don’t treat temp and holiday separately. Sometimes they work together to drive behavior.”
This unlocks learning of deeper patterns.
Next – Encoding in Neural Networks
