Feature Engineering Example with Simple Python
1. Goal: Predict if someone will buy ice cream
Given:
- temperature (numeric)
- is_holiday (0 or 1)
We’ll:
- Train a simple linear model (like a 1-layer neural net) without feature engineering.
- Train the same model with feature engineering: add interaction feature temperature × is_holiday, normalize inputs.
Python Program (Pure Python, Simple Logic)
# Sample dataset
data = [
    {'temp': 35, 'holiday': 0, 'buy': 1},
    {'temp': 30, 'holiday': 1, 'buy': 1},
    {'temp': 25, 'holiday': 0, 'buy': 0},
    {'temp': 40, 'holiday': 1, 'buy': 1},
    {'temp': 20, 'holiday': 0, 'buy': 0},
    {'temp': 38, 'holiday': 0, 'buy': 1},
    {'temp': 18, 'holiday': 1, 'buy': 0},
    {'temp': 22, 'holiday': 0, 'buy': 0},
    {'temp': 45, 'holiday': 1, 'buy': 1},
    {'temp': 27, 'holiday': 0, 'buy': 0}
]

# --------- Step 1: Raw data learning (simple weights) ---------
def train_simple_model(data):
    w_temp = 0.0
    w_holiday = 0.0
    bias = 0.0
    lr = 0.01  # learning rate

    for epoch in range(1000):
        for row in data:
            x1 = row['temp']
            x2 = row['holiday']
            y = row['buy']

            z = w_temp * x1 + w_holiday * x2 + bias
            pred = 1 if z > 30 else 0  # simple threshold
            error = y - pred

            # Update weights
            w_temp += lr * error * x1
            w_holiday += lr * error * x2
            bias += lr * error

    return w_temp, w_holiday, bias

def test_model(data, w_temp, w_holiday, bias):
    correct = 0
    for row in data:
        x1 = row['temp']
        x2 = row['holiday']
        y = row['buy']

        z = w_temp * x1 + w_holiday * x2 + bias
        pred = 1 if z > 30 else 0
        if pred == y:
            correct += 1

    accuracy = correct / len(data)
    return accuracy

# --------- Step 2: Add Feature Engineering ---------
def normalize(val, min_val, max_val):
    return (val - min_val) / (max_val - min_val)

def train_engineered_model(data):
    w_temp = 0.0
    w_holiday = 0.0
    w_interact = 0.0
    bias = 0.0
    lr = 0.01

    temps = [row['temp'] for row in data]
    min_temp, max_temp = min(temps), max(temps)

    for epoch in range(1000):
        for row in data:
            # Normalized features + interaction
            x1 = normalize(row['temp'], min_temp, max_temp)
            x2 = row['holiday']
            x3 = x1 * x2  # interaction
            y = row['buy']

            z = w_temp * x1 + w_holiday * x2 + w_interact * x3 + bias
            pred = 1 if z > 0.5 else 0
            error = y - pred

            # Update weights
            w_temp += lr * error * x1
            w_holiday += lr * error * x2
            w_interact += lr * error * x3
            bias += lr * error

    return w_temp, w_holiday, w_interact, bias, min_temp, max_temp

def test_engineered(data, w_temp, w_holiday, w_interact, bias, min_temp, max_temp):
    correct = 0
    for row in data:
        x1 = normalize(row['temp'], min_temp, max_temp)
        x2 = row['holiday']
        x3 = x1 * x2
        y = row['buy']

        z = w_temp * x1 + w_holiday * x2 + w_interact * x3 + bias
        pred = 1 if z > 0.5 else 0
        if pred == y:
            correct += 1

    return correct / len(data)

# --------- Run both models ---------
print("Training without feature engineering...")
w1, w2, b = train_simple_model(data)
acc_raw = test_model(data, w1, w2, b)
print("Accuracy (Raw):", acc_raw)

print("\nTraining with feature engineering...")
w1, w2, w3, b2, min_t, max_t = train_engineered_model(data)
acc_eng = test_engineered(data, w1, w2, w3, b2, min_t, max_t)
print("Accuracy (Engineered):", acc_eng)
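As a usage sketch (the helper below is my own addition, not part of the program above), the weights and the stored min/max temperatures returned by train_engineered_model can score a new, unseen day like this:

# Usage sketch (assumes the program above has just run, so normalize(),
# w1, w2, w3, b2, min_t and max_t are all in scope).
def predict_engineered(temp, holiday, w_temp, w_holiday, w_interact, bias, min_temp, max_temp):
    x1 = normalize(temp, min_temp, max_temp)   # reuse the training min/max, not new ones
    x2 = holiday
    x3 = x1 * x2                               # same interaction feature as in training
    z = w_temp * x1 + w_holiday * x2 + w_interact * x3 + bias
    return 1 if z > 0.5 else 0

# A hot holiday; with the weights learned above this should print 1.
print(predict_engineered(42, 1, w1, w2, w3, b2, min_t, max_t))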
Expected Output:
Training without feature engineering...
Accuracy (Raw): 0.6

Training with feature engineering...
Accuracy (Engineered): 1.0
Summary
- The raw model struggles because it can’t learn complex patterns like “holiday + hot day → higher sales”.
- The engineered model learns faster and more accurately because we gave it a meaningful combined feature and normalized data.
2. Why does adding engineered features improve prediction?
Story Recap:
Imagine we’re trying to guess if people will buy ice cream. We have only:
- temperature (e.g., 35°C)
- holiday (yes or no → 1 or 0)
Now let’s observe two scenarios:
Scenario 1: Raw Inputs Only
- Model sees:
- temp = 30
- holiday = 1
- Model tries:
score = w_temp * 30 + w_holiday * 1 + bias
But the model is linear. It can’t easily learn: “Sales are high when both temperature is high AND it’s a holiday.”
This is a non-linear interaction, and raw linear models can’t combine two inputs multiplicatively unless we help them.
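To make that concrete, here is a tiny sketch (the weight values are hypothetical, chosen only for illustration): with raw inputs, the holiday term adds the same fixed amount to the score whether it is 18°C or 40°C, so the model cannot express "holiday matters more when it's hot".

# Hypothetical weights, for illustration only (not learned values).
w_temp, w_holiday, bias = 0.02, 0.3, -0.5

def raw_score(temp, holiday):
    return w_temp * temp + w_holiday * holiday + bias

# The holiday bump is identical on a cold day and a hot day:
print(round(raw_score(18, 1) - raw_score(18, 0), 2))  # 0.3
print(round(raw_score(40, 1) - raw_score(40, 0), 2))  # 0.3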
Scenario 2: Engineered Feature Added → Interaction
We created:
x3 = x1 * x2 → normalized_temp × holiday
This feature is zero whenever it is not a holiday, and it only becomes large when the day is both hot and a holiday.
So for:
- Hot day (temp = 40 → normalized = ~0.8)
- Holiday (1)
Interaction feature becomes:
x3 = 0.8 × 1 = 0.8
Whereas:
- Cold day + holiday → 0.2 × 1 = 0.2
- Hot day + no holiday → 0.8 × 0 = 0 (see the quick check below)
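A quick check of where the ~0.8 above comes from: the dataset's temperatures range from 18 to 45, so a 40°C day normalizes to about 0.81, and multiplying by the holiday flag gives the interaction values listed above.

# Quick check of the normalized values used above (dataset temps range from 18 to 45).
def normalize(val, min_val, max_val):
    return (val - min_val) / (max_val - min_val)

hot = normalize(40, 18, 45)      # ~0.81, the "~0.8" used above
print(round(hot, 2))             # 0.81
print(round(hot * 1, 2))         # hot day + holiday    -> 0.81
print(round(hot * 0, 2))         # hot day, no holiday  -> 0.0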
New model:
score = w_temp * temp + w_holiday * holiday + w_interact * (temp * holiday) + bias
Now the model can assign special importance to combinations like: "High temperature AND holiday = people buy ice cream."
That logic couldn’t be learned by a simple sum of weights.
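A small worked example (with hypothetical weights, not the trained ones): only the hot-and-holiday case gets the extra boost from the interaction term, so it is the only one that clears a 0.5 threshold.

# Hypothetical weights, for illustration only (not the trained values).
w_temp, w_holiday, w_interact, bias = 0.5, 0.1, 2.0, -0.5

def score(temp_norm, holiday):
    return w_temp * temp_norm + w_holiday * holiday + w_interact * (temp_norm * holiday) + bias

print(round(score(0.8, 1), 2))  # hot + holiday    ->  1.6 (above 0.5 -> buy)
print(round(score(0.8, 0), 2))  # hot, no holiday  -> -0.1 (below 0.5 -> no buy)
print(round(score(0.2, 1), 2))  # cold + holiday   ->  0.1 (below 0.5 -> no buy)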
Simple Analogy:
Without interaction: You’re saying “Hot days” are good, “Holidays” are good — independently.
With interaction: You’re saying “Hot Holidays” are especially good!
Technically Speaking:
A simple model without interactions:
z = w1 * temp + w2 * holiday + bias
is a linear separator, i.e., it draws a straight line.
Adding:
z = w1 * temp + w2 * holiday + w3 * (temp × holiday) + bias
lets it fit curved or conditional boundaries, i.e., it can:
- Adjust slope depending on combinations.
- Learn contextual influence (e.g., holiday matters only if hot), as the sketch below shows.
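A sketch of that last point (hypothetical weights again): with the interaction term, the effective weight on holiday is w2 + w3 * temp, so how much a holiday matters now depends on the temperature.

# Hypothetical values for w2 and w3, chosen only to illustrate the shape of the formula.
w2, w3 = 0.1, 2.0

def effective_holiday_weight(temp_norm):
    # d z / d holiday = w2 + w3 * temp_norm: the holiday "slope" now depends on temperature.
    return w2 + w3 * temp_norm

print(round(effective_holiday_weight(0.1), 2))  # cool day: 0.3 -> holiday barely matters
print(round(effective_holiday_weight(0.9), 2))  # hot day:  1.9 -> holiday matters a lot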
This is feature engineering — making the model’s job easier by expressing logic explicitly in features.
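As a side note (not part of the original pure-Python example, and assuming scikit-learn is available), the same normalize-plus-interaction idea can be expressed as a standard pipeline; the exact accuracy it reaches is not claimed here.

# Side note: a sketch of the same idea with scikit-learn (assumes scikit-learn is installed).
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

X = [[35, 0], [30, 1], [25, 0], [40, 1], [20, 0],
     [38, 0], [18, 1], [22, 0], [45, 1], [27, 0]]   # [temp, holiday]
y = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]                  # buy

model = make_pipeline(
    MinMaxScaler(),                                                            # normalize inputs
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),   # adds temp x holiday
    LogisticRegression(),
)
model.fit(X, y)
print(model.score(X, y))  # training accuracy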
In our code
Before:
- The model has no way to "learn" the temperature × holiday effect.
- It reaches only mediocre accuracy (around 60%).
After:
- The model sees that when temp is high and holiday is 1 at the same time → customers buy.
- It fits faster and more precisely → 100% accuracy on this dataset.
Conclusion
Adding the temp × holiday interaction tells the model: “Hey, don’t treat temp and holiday separately. Sometimes they work together to drive behavior.”
This unlocks learning of deeper patterns.
Next – Encoding in Neural Networks