What is Lasso Regression? An Introduction to L1 Regularisation
Last Updated: Jun 03, 2026 | 4 Min Read
Machine learning models often fit the training data too closely and then fail to generalise to unseen data. This is known as overfitting. To address it, machine learning engineers use regularisation techniques that control model complexity and improve generalisation.
Lasso Regression is among the most common regularisation techniques in machine learning. It reduces overfitting and also performs automatic feature selection, making models simpler and more efficient.
In this article, you will learn what Lasso Regression is, how L1 regularisation works, why feature selection matters, how the lambda parameter affects the model performance and how to implement Lasso Regression using Scikit Learn.
Table of contents
- TLDR;
- The Importance of Regularisation
- L1 Regularisation Explained
- Sparse Model is defined as:
- How Lasso Regression Does Feature Selection
- Meaning of the lambda parameter
- Ridge Regression vs Lasso Regression
- Lasso Regression:
- Ridge Regression:
- What is Elastic Net?
- Implement Lasso Regression in Scikit-Learn
- Step 1: Install Required Libraries
- Step 2: Import Libraries
- Step 3: Load Dataset
- Step 4: Split the data into training data and testing data
- Step 5: Build and Train the Model
- Step 6: Make predictions
- Step 7: Evaluate the Model
- Benefits of Lasso Regression
- Drawbacks of Lasso Regression
- Lasso Regression in Practical Scenarios
- Conclusions
- FAQs
- What is the main objective of Lasso Regression?
- What is L1 regularisation?
- What is the meaning of the lambda parameter in Lasso Regression?
- What is the difference between Lasso and Ridge Regression?
- When to use Lasso Regression?
TLDR;
- Lasso Regression is a regularised regression technique, which uses L1 regularisation to reduce overfitting.
- It shrinks less important feature coefficients towards zero.
- Some coefficients become exactly zero, which enables automatic feature selection.
- The lambda parameter regulates the shrinkage level of the model.
- Lasso regression is commonly used when data sets contain many irrelevant features.
- Scikit-Learn provides a simple implementation through its Lasso class.
What is Lasso Regression?
Lasso Regression is a type of linear regression that adds a penalty term to the loss function to reduce model complexity and prevent overfitting. LASSO stands for Least Absolute Shrinkage and Selection Operator, and it works by penalizing the absolute values of regression coefficients. As the penalty increases, the model shrinks coefficient values, and some coefficients may become exactly zero. This property makes Lasso Regression especially useful for feature selection, as it can automatically remove less important features from the model.
The Importance of Regularisation
Machine learning models can sometimes memorise training data rather than learn real patterns. This leads to high variance and poor real-world performance.
Regularisation is a way to control this problem by adding penalties to large coefficients.
The main advantages of regularisation are as follows:
- Mitigating overfitting.
- Improving model generalisability.
- Stabilising the model.
- Improving interpretability.
Lasso Regression does this with L1 regularisation.
L1 Regularisation Explained
L1 regularisation adds the sum of the absolute values of coefficients to the loss function.
The mathematical objective function of Lasso Regression is as follows:
Loss = RSS + λ ∑ |β|
Where:
- RSS = Residual Sum of Squares.
- λ = the lambda parameter, which sets the strength of the penalty.
- β = the coefficients of the model.
The amount of the penalty is controlled by the parameter lambda.
The coefficients shrink more aggressively as lambda increases.
Another unique property of L1 regularisation is that it can zero out coefficients. Some of the feature weights become exactly zero, removing those variables from the model.
This leads to a sparse model.
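To make the objective concrete, here is a minimal sketch that evaluates RSS + λ ∑ |β| directly with NumPy. The function name `lasso_loss` and the tiny data set are illustrative assumptions, not part of any library, and the sketch ignores the intercept term for simplicity.

```python
import numpy as np

def lasso_loss(X, y, beta, lam):
    """Return RSS + lam * sum(|beta|) for a linear model without intercept."""
    residuals = y - X @ beta
    rss = np.sum(residuals ** 2)          # Residual Sum of Squares
    l1_penalty = np.sum(np.abs(beta))     # sum of absolute coefficient values
    return rss + lam * l1_penalty

# Tiny hand-made example (values are illustrative only)
X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([5.0, 11.0])
beta = np.array([1.0, 2.0])  # fits y exactly, so RSS is 0

loss_no_penalty = lasso_loss(X, y, beta, lam=0.0)
loss_with_penalty = lasso_loss(X, y, beta, lam=0.5)
print(loss_no_penalty, loss_with_penalty)
```

Because these coefficients fit the data perfectly, any loss that remains comes purely from the penalty term, which grows in proportion to lambda.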
Sparse Model is defined as:
In a sparse model, many of the feature coefficients are zero.
That is, the model only keeps the variables that are relevant and discards the irrelevant ones.
Sparse models are useful because they:
- Reduce model complexity.
- Are more interpretable.
- Lower the cost of computation.
- Help to avoid noisy features.
- Make predictions more efficient.
This is one of the biggest advantages of Lasso Regression over linear regression.
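As a rough illustration of sparsity, the sketch below fits ordinary linear regression and Lasso to the same synthetic data and counts how many coefficients remain non-zero. The data set (50 features, 5 informative) and the value `alpha=5.0` are arbitrary assumptions chosen for the demonstration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: 50 features, only 5 of which actually influence the target
X, y = make_regression(
    n_samples=200, n_features=50, n_informative=5, noise=10.0, random_state=0
)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)

# Count the coefficients that each model keeps
n_ols = int(np.count_nonzero(ols.coef_))
n_lasso = int(np.count_nonzero(lasso.coef_))
print("Non-zero coefficients, linear regression:", n_ols)
print("Non-zero coefficients, Lasso:", n_lasso)
```

Linear regression assigns some weight to every feature, while Lasso should keep only a small subset here.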
How Lasso Regression Does Feature Selection
Feature selection is the identification of the most important variables in a dataset.
Traditional linear regression uses all features, even though some features have very little contribution.
Lasso Regression, by contrast, automatically drops the weak features by setting their coefficients to 0.
For example, a model to predict the price of a house might have features such as:
- Territory.
- Rooms.
- Property age.
- Wall Colour.
- Nickname of the owner.
Some features, like wall colour or owner nickname, might not be very predictive.
Lasso Regression can set their coefficients to zero and remove them from the model.
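The house-price example can be sketched with made-up data. Everything here is hypothetical: the feature names, the coefficients used to generate prices, and the assumption that all features are already standardised. The last two features are pure noise by construction, so a reasonable penalty should drop them.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
feature_names = ["area", "rooms", "property_age", "wall_colour", "owner_nickname"]

# Hypothetical standardised features; the last two never enter the price formula
X = rng.normal(size=(500, 5))
price = 50 * X[:, 0] + 20 * X[:, 1] - 10 * X[:, 2] + rng.normal(scale=5.0, size=500)

model = Lasso(alpha=2.0).fit(X, price)

# Split the features by whether Lasso kept a non-zero coefficient for them
kept = [name for name, coef in zip(feature_names, model.coef_) if coef != 0]
dropped = [name for name, coef in zip(feature_names, model.coef_) if coef == 0]
print("Kept:", kept)
print("Dropped:", dropped)
```

In real data the split between useful and useless features is rarely this clean, so the kept set will depend on the penalty strength.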
Meaning of the lambda parameter
One of the most important parts of Lasso Regression is the lambda parameter.
It controls the degree of regularisation applied.
Different values of lambda give different results:
- Lambda = 0 is equivalent to ordinary linear regression.
- A small lambda results in mild shrinkage.
- A large lambda shrinks coefficients aggressively and may cause the model to underfit.
Choosing the right lambda is crucial to balancing bias and variance.
This is called the bias-variance tradeoff.
If lambda is too small, bias is low but variance is high, and the model may overfit.
If lambda is too large, bias is high and the model may underfit.
A well-tuned lambda finds a balance between the two.
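The effect of lambda can be observed by fitting the same data with several penalty strengths. This is a small sketch using Scikit-Learn's Lasso class, which exposes the penalty strength as `alpha`; the four values tried are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)

alphas = [0.001, 0.1, 1.0, 10.0]
nonzero_counts = []
for alpha in alphas:
    # A higher max_iter gives the solver room to converge for small penalties
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X, y)
    nonzero_counts.append(int(np.count_nonzero(model.coef_)))
    print(f"alpha={alpha}: {nonzero_counts[-1]} non-zero coefficients")
```

With the largest penalty every coefficient is pushed to zero and the model predicts only the mean, which is the underfitting end of the tradeoff.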
Ridge Regression vs Lasso Regression
Lasso and Ridge Regression are both regularised regression techniques, but they behave differently.
Lasso Regression:
- Uses L1 norm regularisation.
- Coefficients can be forced to be exactly zero.
- Performs feature selection.
- Creates sparse models.
Ridge Regression:
- Uses L2 regularisation.
- Shrinks coefficients but rarely to exactly zero.
- Keeps all features in the model.
- Works well when all variables contribute a little.
If you have a lot of irrelevant features in your dataset, then Lasso Regression is often a better choice.
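The difference between the two penalties can be seen by counting exact zeros. The following is only a sketch on synthetic data; the penalty strengths are arbitrary assumptions and are not directly comparable between the two models.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 20 features, only 4 of which influence the target
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=4, noise=10.0, random_state=1
)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)

# Count coefficients that are exactly zero in each model
ridge_zeros = int(np.sum(ridge.coef_ == 0))
lasso_zeros = int(np.sum(lasso.coef_ == 0))
print("Coefficients set to zero by Ridge:", ridge_zeros)
print("Coefficients set to zero by Lasso:", lasso_zeros)
```

Ridge shrinks every coefficient but leaves all of them in the model, whereas Lasso removes the uninformative ones outright.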
What is Elastic Net?
Elastic Net is a hybrid of L1 and L2 regularisation.
It combines the strengths of both Lasso and Ridge Regression.
Elastic Net is helpful when:
- Features are highly correlated.
- There are many variables in the data set.
- Lasso on its own becomes unstable.
- We want a better trade-off between shrinkage and feature selection.
Elastic Net is therefore a common choice when stability matters more than maximum sparsity.
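A small sketch of the correlated-features case, using Scikit-Learn's ElasticNet class. The data is synthetic and the settings `alpha=0.1` and `l1_ratio=0.5` (an even mix of L1 and L2) are assumptions picked for illustration. The two features are near-duplicates of each other, which is the situation where Lasso's choice between them can be arbitrary.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
z = rng.normal(size=200)
# Two nearly identical features built from the same underlying signal
X = np.column_stack([z, z + rng.normal(scale=0.01, size=200)])
y = 3 * z + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print("Lasso coefficients:      ", lasso.coef_)
print("Elastic Net coefficients:", enet.coef_)
```

The L2 part of the Elastic Net penalty encourages the weight to be shared between the two correlated features rather than concentrated on one of them.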
Implement Lasso Regression in Scikit-Learn
Lasso Regression is straightforward to implement with Scikit-Learn.
Step 1: Install Required Libraries
pip install pandas numpy scikit-learn
Step 2: Import Libraries
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.datasets import load_diabetes
Step 3: Load Dataset
data = load_diabetes()
X = data.data
y = data.target
Step 4: Split the data into training data and testing data
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42
)
Step 5: Build and Train the Model
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)
Step 6: Make predictions
predictions = model.predict(X_test)
Step 7: Evaluate the Model
mse = mean_squared_error(y_test, predictions)
print("Mean Squared Error:", mse)
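The alpha parameter in Scikit-Learn plays the role of the lambda parameter. Scikit-Learn divides the squared-error term by twice the number of samples, so alpha values are not numerically identical to lambda in the formula given earlier.
The larger the alpha is, the stronger the regularisation.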
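After training, it can be useful to check which features the model actually kept. The sketch below repeats the steps above in one self-contained script and then pairs each feature name with its coefficient. The "kept" and "dropped" labels are just illustrative wording.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

data = load_diabetes()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

# Pair each feature name with its learned coefficient
for name, coef in zip(data.feature_names, model.coef_):
    status = "dropped" if coef == 0 else "kept"
    print(f"{name:>4}: {coef:10.3f} ({status})")
```

Any feature marked as dropped has a coefficient of exactly zero and plays no part in the predictions.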
Benefits of Lasso Regression
Lasso Regression has several advantages in machine learning.
- Reduces overfitting effectively.
- Selects features automatically.
- Leads to simpler and more interpretable models.
- Reduces noise from less important features.
- It works well for high-dimensional data sets.
- Helps increase computational efficiency.
These benefits make Lasso Regression a very useful technique for practical machine learning applications.
Drawbacks of Lasso Regression
However, Lasso Regression also has its shortcomings.
- May remove useful correlated variables.
- Performance may be unstable for highly correlated features.
- The lambda value must be chosen through tuning.
- Underfitting is possible at high regularisation.
In these cases, Elastic Net may sometimes perform better.
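The tuning burden mentioned in the list above can be reduced with cross-validation. As one possible approach, Scikit-Learn's LassoCV class tries a range of alpha values and keeps the one with the best cross-validated error. This is a minimal sketch; the choice of 5 folds is an assumption.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV

X, y = load_diabetes(return_X_y=True)

# Try a grid of alpha values with 5-fold cross-validation and keep the best one
model = LassoCV(cv=5, max_iter=10_000).fit(X, y)

best_alpha = model.alpha_
print("Alpha chosen by cross-validation:", best_alpha)
print("Non-zero coefficients:", int((model.coef_ != 0).sum()))
```

The fitted object can then be used like a regular Lasso model, for example by calling `model.predict` on new data.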
Lasso Regression in Practical Scenarios
Lasso Regression has applications across many industries.
Some common uses are:
- Medical diagnostic systems.
- Financial risk forecasting.
- Customer attrition prediction.
- Marketing analytics.
- Gene selection in bioinformatics.
- Fraud detection systems.
These domains often involve a large number of variables, so feature selection is very valuable.
Conclusions
Lasso Regression is one of the most popular regularisation techniques in machine learning. It helps in reducing overfitting, improves model simplicity and does automatic feature selection.
It’s very effective in dealing with datasets with irrelevant features, as it can reduce coefficients to zero.
You can build more accurate and interpretable machine learning systems by understanding L1 regularisation, sparse models, and feature selection, and by tuning lambda carefully.
If you are dealing with high-dimensional data or noisy data sets, Lasso Regression can be a powerful addition to your machine learning toolbox.
FAQs
1. What is the main objective of Lasso Regression?
Lasso Regression uses L1 regularisation to reduce overfitting and perform automatic feature selection.
2. What is L1 regularisation?
L1 regularisation adds the sum of the absolute values of the coefficients to the loss function, which shrinks the coefficients of less important features.
3. What is the meaning of the lambda parameter in Lasso Regression?
The amount of regularisation applied to the model is controlled by the lambda parameter.
4. What is the difference between Lasso and Ridge Regression?
Lasso can set coefficients to zero and remove features. Ridge just shrinks coefficients.
5. When to use Lasso Regression?
When datasets contain many irrelevant or superfluous features, Lasso Regression is useful.


