Lasso Regression
1. What is Lasso Regression (L1 Regularization)
Imagine we’re packing a travel bag. We want to carry only the most essential items. We could pack everything, but that would be heavy and inefficient.
Lasso Regression does the same with variables in a machine learning model — it keeps only the most important ones, and removes or reduces the rest to zero.
Lasso stands for Least Absolute Shrinkage and Selection Operator.
It is a linear regression technique that adds a penalty (called L1 penalty) to the loss function:
Loss = MSE (Mean Squared Error) + λ · Σ |coefficients|
The goal is to:
- Minimize prediction error (like regular regression)
- Shrink the coefficients of less useful features, some all the way to exactly zero (effectively dropping them)
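The loss above can be sketched directly in NumPy. The data, candidate coefficients, and λ below are illustrative values, not from any real dataset:

```python
import numpy as np

# Hypothetical data: 5 samples, 3 features (illustrative values only)
X = np.array([[1.0, 2.0, 0.5],
              [2.0, 1.0, 0.3],
              [3.0, 4.0, 0.8],
              [4.0, 3.0, 0.1],
              [5.0, 5.0, 0.9]])
y = np.array([3.0, 3.5, 7.0, 7.5, 10.0])
w = np.array([1.0, 1.0, 0.0])   # candidate coefficients
lam = 0.1                        # regularization strength (lambda)

def lasso_loss(X, y, w, lam):
    """MSE plus the L1 penalty: mean((y - Xw)^2) + lam * sum(|w|)."""
    residuals = y - X @ w
    mse = np.mean(residuals ** 2)
    l1_penalty = lam * np.sum(np.abs(w))
    return mse + l1_penalty

print(lasso_loss(X, y, w, lam))  # prediction error plus penalty
```

Training a lasso model amounts to searching for the `w` that minimizes this quantity; larger `lam` makes nonzero coefficients more expensive.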
2. Why Is This Useful?
- Helps with feature selection: removes unnecessary variables.
- Avoids overfitting: simplifies the model, especially useful when there are many predictors.
- Works well when only a few variables are actually useful, and others are just noise.
Simple Analogy:
We’re a chef choosing ingredients for a signature dish. We have 50 ingredients, but only 5 actually add flavor. Lasso helps us identify and keep only the tasty ones, removing the rest.
3. Real-Life Use Cases of Lasso Regression
1. Predicting House Prices (Real Estate)
- Problem: We have many variables — size, location, number of rooms, age of property, distance from the city, etc.
- Issue: Not all variables actually impact the price.
- Lasso helps by automatically ignoring the variables that don’t contribute much (e.g., distance to the nearest bus stop) and only keeping the important ones (e.g., size, location).
2. Healthcare: Predicting Disease Risk
- Scenario: Predicting risk of diabetes based on 100+ blood markers and lifestyle metrics.
- Issue: Only a few markers (like glucose, BMI) are truly relevant.
- Lasso will assign exactly zero weight to irrelevant or weak predictors, simplifying the model and often improving how well it generalizes.
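The zero-weight behavior is easy to see on synthetic data (this is not real medical data; the feature counts and `alpha` value are chosen purely for illustration). A sketch using scikit-learn's `Lasso`:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic example: 200 samples, 10 features,
# but only the first two actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# In scikit-learn, `alpha` plays the role of lambda.
model = Lasso(alpha=0.1)
model.fit(X, y)

# The 8 noise features should get coefficients of exactly zero.
print(np.round(model.coef_, 2))
```

Note that the surviving coefficients come out slightly smaller than the true values 3 and −2: lasso shrinks everything, not just the noise.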
4. What Does “Central to L1 Penalty” Mean?
It means that a particular math concept — in this case, absolute value — is at the core or heart of how the L1 penalty works in Lasso Regression.
5. What is L1 Penalty?
L1 penalty is the sum of absolute values of weights:
L1 penalty = λ · (|w₁| + |w₂| + ⋯ + |wₙ|)
This is added to the loss function during training to discourage the model from using too many features.
6. Why Absolute Value?
- If we just added w₁ + w₂ + w₃, positive and negative values might cancel out.
- But using |w₁| + |w₂| + |w₃| ensures every weight contributes positively to the penalty — whether it’s -50 or +50, both cost the same: 50.
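The cancellation argument above can be checked with a few lines of plain Python (the weight values are arbitrary examples):

```python
# Raw sum lets positive and negative weights cancel;
# the absolute-value (L1) sum does not.
w = [-50, 50, 10]

raw_sum = sum(w)                 # -50 and +50 cancel, leaving 10
l1_sum = sum(abs(x) for x in w)  # every weight contributes: 110

print(raw_sum, l1_sum)
```

A penalty built from `raw_sum` could be driven to zero by balancing large positive and negative weights, which is exactly the loophole the absolute value closes.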
7. Why Is It “Central”?
- The absolute value is what makes L1 different from L2 (Ridge Regression).
- The absolute value has a sharp corner at zero, which pushes weak coefficients to exactly zero (dropping those features).
- Without it, Lasso would not do feature selection.
So, absolute value is not optional or side detail — it’s what defines L1 penalty.
Lasso Regression Example with Simple Python
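Here is a minimal end-to-end sketch with scikit-learn. The data is synthetic (generated with `make_regression` rather than real house prices), and `alpha=1.0` is an arbitrary choice; in practice it would be tuned, e.g. with `LassoCV`:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Synthetic "house price"-style data: 20 features, only 4 informative.
X, y = make_regression(n_samples=300, n_features=20, n_informative=4,
                       noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

lasso = Lasso(alpha=1.0)
lasso.fit(X_train, y_train)

# Count how many features lasso actually kept (nonzero coefficients).
kept = int(np.sum(lasso.coef_ != 0))
print(f"Features kept: {kept} of {X.shape[1]}")
print(f"Test R^2: {lasso.score(X_test, y_test):.3f}")
```

Raising `alpha` keeps fewer features; lowering it toward zero recovers ordinary linear regression.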