ElasticNet Regression

ElasticNet blends the L1 and L2 penalties into a single model, inheriting Lasso's sparsity and Ridge's stability when features are correlated.

The ElasticNet Loss

ElasticNet minimises: SSR + α[ρΣ|βᵢ| + (1-ρ)Σβᵢ²], where ρ (l1_ratio) controls the mix. Setting ρ=1 gives pure Lasso; ρ=0 gives pure Ridge.
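To make the roles of α and ρ concrete, here is a minimal NumPy sketch of the loss exactly as written above. The function names `elastic_net_penalty` and `elastic_net_loss` are illustrative, not part of any library. Note that scikit-learn's own objective scales the terms differently (it divides the squared error by 2n and halves the L2 term), so values from this sketch will not match scikit-learn's internal objective number for number.

```python
import numpy as np


def elastic_net_penalty(beta, alpha, rho):
    """Penalty term: alpha * [rho * sum|beta_i| + (1 - rho) * sum beta_i^2]."""
    beta = np.asarray(beta, dtype=float)
    l1 = np.sum(np.abs(beta))   # Lasso part
    l2 = np.sum(beta ** 2)      # Ridge part
    return alpha * (rho * l1 + (1.0 - rho) * l2)


def elastic_net_loss(X, y, beta, alpha, rho, intercept=0.0):
    """Sum of squared residuals plus the ElasticNet penalty (unscaled form)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    residuals = y - (X @ beta + intercept)
    ssr = np.sum(residuals ** 2)
    return ssr + elastic_net_penalty(beta, alpha, rho)


beta = np.array([2.0, -1.0, 0.0])

lasso_pen = elastic_net_penalty(beta, alpha=0.5, rho=1.0)  # 0.5 * (2 + 1 + 0) = 1.5
ridge_pen = elastic_net_penalty(beta, alpha=0.5, rho=0.0)  # 0.5 * (4 + 1 + 0) = 2.5
mixed_pen = elastic_net_penalty(beta, alpha=0.5, rho=0.5)  # 0.5 * (1.5 + 2.5) = 2.0

print(lasso_pen, ridge_pen, mixed_pen)
```

With ρ=0.5 the penalty sits halfway between the pure Lasso and pure Ridge values, which is the "mix" that l1_ratio controls.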

Fitting ElasticNet with Cross-Validation

<pre><code class="language-python">from sklearn.linear_model import ElasticNetCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Load the data and hold out 20% for testing
data = fetch_california_housing()
X_tr, X_te, y_tr, y_te = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Scale features first: the penalty is sensitive to feature scale.
# ElasticNetCV searches over alpha for each candidate l1_ratio using 5-fold CV.
en = make_pipeline(
    StandardScaler(),
    ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9, 1.0], cv=5, random_state=0),
)
en.fit(X_tr, y_tr)

# make_pipeline names each step after its lowercased class name
en_model = en.named_steps["elasticnetcv"]
print(f"Best alpha: {en_model.alpha_:.4f}")
print(f"Best l1_ratio: {en_model.l1_ratio_:.2f}")
print(f"Test R²: {r2_score(y_te, en.predict(X_te)):.3f}")</code></pre>

When ElasticNet Shines

ElasticNet is a strong first choice in high-dimensional settings where only a subset of features matters and many of the features are correlated with one another, a combination that is common in genomics and text modelling.

Grouping Effect

Unlike Lasso, ElasticNet tends to include or exclude correlated features together (the grouping effect). If you have a cluster of genes that collectively predict a disease outcome, ElasticNet is more likely to retain all of them, whereas Lasso tends to keep one member of the cluster, chosen more or less arbitrarily, and zero out the rest.
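The grouping effect can be illustrated with a small synthetic experiment. The sketch below is deliberately extreme and rests on several assumptions: the data are simulated, the "correlated cluster" is three identical copies of one signal, and the alpha, l1_ratio, and tolerance settings are arbitrary choices for illustration. Which copy Lasso keeps depends on the solver and on feature order, so treat the Lasso result as one possible outcome rather than a fixed rule.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n_samples = 200

# One underlying signal, copied three times to form a perfectly correlated group
z = rng.normal(size=n_samples)
group = np.column_stack([z, z, z])

# Five unrelated noise features
noise = rng.normal(size=(n_samples, 5))
X = np.hstack([group, noise])

# The target depends only on the shared signal
y = 3.0 * z + 0.5 * rng.normal(size=n_samples)

# Tight tolerance so both solvers run close to convergence
lasso = Lasso(alpha=0.5, tol=1e-8, max_iter=10_000).fit(X, y)
enet = ElasticNet(alpha=0.5, l1_ratio=0.5, tol=1e-8, max_iter=10_000).fit(X, y)

# Coefficients of the three duplicated features
lasso_group = lasso.coef_[:3]
enet_group = enet.coef_[:3]

print("Lasso group coefficients:     ", np.round(lasso_group, 3))
print("ElasticNet group coefficients:", np.round(enet_group, 3))
```

Because the L2 part of the penalty is strictly convex, ElasticNet's solution assigns identical copies the same coefficient, while Lasso's solution is not unique in this setting and the coordinate-descent solver typically loads the weight onto a single copy. With real, imperfectly correlated features the contrast is softer, but the tendency is the same.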