Lasso Regression

1. What is Lasso Regression (L1 Regularization)

Imagine we’re packing a travel bag. We want to carry only the most essential items. We could pack everything, but that would be heavy and inefficient.
Lasso Regression does the same with the variables in a machine learning model — it keeps only the most important ones and shrinks the rest, often all the way to zero.

Lasso stands for Least Absolute Shrinkage and Selection Operator.

It is a linear regression technique that adds a penalty (called L1 penalty) to the loss function:

Loss = MSE (Mean Squared Error) + λ · Σ|wᵢ|

The goal is to:

  • Minimize prediction error (like regular regression)
  • Shrink the coefficients of less useful features, driving some exactly to zero (dropping those features)
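
The loss above can be sketched directly in code. This is a minimal illustration, not a training routine — the data `X`, `y`, weights `w`, and `lam` are made-up values chosen so the arithmetic is easy to follow:

```python
import numpy as np

# A minimal sketch of the Lasso loss: MSE plus the L1 penalty.
# X, y, w, and lam are illustrative assumptions, not real data.
def lasso_loss(X, y, w, lam):
    """Mean squared error plus lambda times the sum of absolute weights."""
    residuals = y - X @ w
    mse = np.mean(residuals ** 2)
    l1_penalty = lam * np.sum(np.abs(w))
    return mse + l1_penalty

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, 0.0])
print(lasso_loss(X, y, w, lam=0.1))  # MSE 0.25 + penalty 0.05 = 0.3
```

Minimizing this combined quantity is what forces a trade-off: a feature’s coefficient stays nonzero only if it reduces the MSE by more than it costs in penalty.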

2. Why Is This Useful?

  • Helps with feature selection: removes unnecessary variables.
  • Avoids overfitting: simplifies the model, especially useful when there are many predictors.
  • Works well when only a few variables are actually useful, and others are just noise.

Simple Analogy:

We’re a chef choosing ingredients for a signature dish. We have 50 ingredients, but only 5 actually add flavor. Lasso helps us identify and keep only the tasty ones — removing the rest.
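
The chef analogy can be reproduced with scikit-learn. Here is a sketch using synthetic data (the sizes, `alpha`, and random seed are assumptions for illustration): 50 features, only 5 of which carry signal, and Lasso zeroes out most of the rest:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Hypothetical "50 ingredients" setup: 50 features, only 5 informative.
X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       noise=1.0, random_state=0)

model = Lasso(alpha=1.0)
model.fit(X, y)

# Count how many features survive with a nonzero coefficient.
kept = int(np.sum(model.coef_ != 0))
print(f"Features kept: {kept} of 50")
```

Larger values of `alpha` (the λ in the formula above) zero out more features; smaller values keep more.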

3. Real-Life Use Cases of Lasso Regression

1. Predicting House Prices (Real Estate)

  • Problem: We have many variables — size, location, number of rooms, age of property, distance from the city, etc.
  • Issue: Not all variables actually impact the price.
  • Lasso helps by automatically ignoring the variables that don’t contribute much (e.g., distance to the nearest bus stop) and only keeping the important ones (e.g., size, location).

2. Healthcare: Predicting Disease Risk

  • Scenario: Predicting risk of diabetes based on 100+ blood markers and lifestyle metrics.
  • Issue: Only a few markers (like glucose, BMI) are truly relevant.
  • Lasso assigns zero weight to irrelevant or weak predictors, yielding a simpler, more interpretable model.

4. What Does “Central to L1 Penalty” Mean?

It means that a particular math concept — in this case, absolute value — is at the core or heart of how the L1 penalty works in Lasso Regression.

5. What is L1 Penalty?

L1 penalty is the sum of absolute values of weights:

L1 penalty = λ · (|w₁| + |w₂| + ⋯ + |wₙ|)

This is added to the loss function during training to discourage the model from using too many features.
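
Computed by hand, the penalty is just one line of arithmetic. The weight vector and λ below are made-up values for demonstration:

```python
import numpy as np

# L1 penalty for an illustrative weight vector; w and lam are assumptions.
w = np.array([0.8, -0.3, 0.0, 1.5])
lam = 0.1
penalty = lam * np.sum(np.abs(w))
print(penalty)  # 0.1 * (0.8 + 0.3 + 0.0 + 1.5) = 0.26
```

Note that the weight that is already 0.0 contributes nothing — once a feature is dropped, it costs nothing to keep it dropped.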

6. Why Absolute Value?

  • If we just added w₁ + w₂ + w₃, positive and negative values might cancel out.

  • But using |w₁| + |w₂| + |w₃| ensures every weight contributes positively to the penalty — whether it’s -50 or +50, both cost the same: 50.
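
The cancellation problem is easy to see numerically. With illustrative weights of +50 and -50, a raw sum hides them entirely, while the absolute-value sum charges each one its full size:

```python
import numpy as np

# Illustrative weights: a raw sum lets large weights cancel,
# while the absolute-value sum charges every weight its full size.
w = np.array([50.0, -50.0, 3.0])
raw_sum = np.sum(w)           # 50 - 50 + 3 = 3.0
l1_sum = np.sum(np.abs(w))    # 50 + 50 + 3 = 103.0
print(raw_sum, l1_sum)
```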

7. Why Is It “Central”?

  • The absolute value is what makes L1 different from L2 (Ridge Regression).
  • It gives the penalty a sharp corner at zero, which drives some weights to exactly zero (dropping those features).
  • Without it, Lasso would not do feature selection.

So, absolute value is not optional or side detail — it’s what defines L1 penalty.
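
The contrast with L2 can be verified directly. In this sketch (synthetic data and `alpha` values are assumptions), Lasso produces exact zeros while Ridge merely shrinks coefficients without eliminating any:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 20 features, only 3 informative — the rest is noise.
X, y = make_regression(n_samples=200, n_features=20, n_informative=3,
                       noise=1.0, random_state=42)

# Count exactly-zero coefficients under each penalty.
lasso_zeros = int(np.sum(Lasso(alpha=1.0).fit(X, y).coef_ == 0))
ridge_zeros = int(np.sum(Ridge(alpha=1.0).fit(X, y).coef_ == 0))
print("Lasso zero coefficients:", lasso_zeros)
print("Ridge zero coefficients:", ridge_zeros)
```

Ridge’s squared penalty is smooth at zero, so shrunken weights stay tiny but nonzero; only the absolute value’s corner snaps them to exactly zero.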

8. Lasso Regression Example with Simple Python
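
Here is a simple end-to-end sketch tying the ideas together. The data is synthetic, in the spirit of the house-price use case above (the feature count, true coefficients, noise level, and `alpha` are all assumptions): 10 features, only the first 3 of which actually influence the target.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic "house price"-style data: 10 features, 3 with real effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
true_coefs = np.array([40.0, 25.0, 10.0, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_coefs + rng.normal(scale=2.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Lasso(alpha=0.5)
model.fit(X_train, y_train)

print("Learned coefficients:", np.round(model.coef_, 2))
print("Zeroed-out features:", int(np.sum(model.coef_ == 0)))
print("Test R^2:", round(r2_score(y_test, model.predict(X_test)), 3))
```

The learned coefficients should track the three true effects while the noise features are driven to exactly zero — the feature selection behavior this whole article has been describing.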