AdaBoost (Adaptive Boosting) Algorithm

AdaBoost builds a strong classifier from a sequence of weak learners, re-weighting the training samples after each round so that subsequent learners focus on previously misclassified examples.

The AdaBoost Algorithm

AdaBoost maintains a weight distribution over training samples, initially uniform. After each weak learner is trained, correctly classified samples get lower weight and misclassified ones get higher weight, forcing the next learner to focus on harder examples.
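
As a small worked example of one reweighting round, the sketch below uses five samples and assumes the weak learner misclassifies only the last one. The five-sample setup and the variable names are illustrative assumptions; the error and learner-weight formulas are the ones listed under Algorithm Steps.

```python
import math

# Five samples with uniform weights; suppose the weak learner
# gets only the last sample wrong (an assumption for illustration).
weights = [0.2] * 5
correct = [True, True, True, True, False]

# Weighted error: total weight of the misclassified samples.
eps = sum(w for w, ok in zip(weights, correct) if not ok)

# Learner weight: alpha = 0.5 * ln((1 - eps) / eps).
alpha = 0.5 * math.log((1 - eps) / eps)

# Shrink the weights of correct samples, grow the misclassified one,
# then normalize so the weights sum to 1 again.
unnormalized = [w * math.exp(-alpha if ok else alpha)
                for w, ok in zip(weights, correct)]
total = sum(unnormalized)
new_weights = [w / total for w in unnormalized]

print(f"eps={eps:.2f}, alpha={alpha:.4f}")   # eps=0.20, alpha=0.6931
print([round(w, 3) for w in new_weights])    # [0.125, 0.125, 0.125, 0.125, 0.5]
```

After the update the single misclassified sample carries half of the total weight, so the next learner pays a high price for getting it wrong again.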

Algorithm Steps

Initialize sample weights: w_i = 1/N for all i.
Train a weak learner h_t on weighted data.
Compute weighted error: ε_t = Σ_i w_i · 1[h_t(x_i) ≠ y_i], where 1[·] is the indicator function.
Compute learner weight: α_t = 0.5 · ln((1 - ε_t) / ε_t).
Update sample weights: w_i ← w_i · exp(-α_t · y_i · h_t(x_i)) for labels y_i ∈ {-1, +1}, then normalize so the weights sum to 1.
Final prediction: H(x) = sign(Σ_t α_t · h_t(x)).
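
The steps above can be sketched from scratch as follows. This is a minimal illustration under stated assumptions, not the scikit-learn implementation: it assumes labels in {-1, +1}, 1-D inputs, and an exhaustive threshold stump as the weak learner. The helper names `stump_predict`, `fit_stump`, `adaboost_fit`, and `adaboost_predict` are hypothetical.

```python
import math

def stump_predict(x, thr, polarity):
    """Threshold stump: predict `polarity` if x > thr, else the opposite label."""
    return polarity if x > thr else -polarity

def fit_stump(xs, ys, w):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    # Candidates: one threshold below the minimum, plus midpoints between neighbours.
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for thr in thresholds:
        for polarity in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if stump_predict(xi, thr, polarity) != yi)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost_fit(xs, ys, n_rounds=10):
    n = len(xs)
    w = [1.0 / n] * n                        # step 1: uniform weights
    ensemble = []
    for _ in range(n_rounds):
        err, thr, pol = fit_stump(xs, ys, w)  # steps 2-3: train and score
        if err >= 0.5:                       # no better than chance: stop
            break
        err = max(err, 1e-10)                # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)   # step 4: learner weight
        ensemble.append((alpha, thr, pol))
        # Step 5: up-weight mistakes, down-weight correct samples, normalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, thr, pol))
             for xi, yi, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    # Step 6: sign of the alpha-weighted vote.
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single threshold separates.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(xs, ys, n_rounds=3)
preds = [adaboost_predict(ensemble, x) for x in xs]
print([round(alpha, 4) for alpha, _, _ in ensemble])
print(preds == ys)
```

On this toy data the best single stump still misclassifies three of the ten points, while the weighted vote of three stumps classifies all of them correctly.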

AdaBoost in scikit-learn

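scikit-learn's AdaBoostClassifier implements the SAMME algorithm, with decision stumps (DecisionTreeClassifier(max_depth=1)) as the default base learner. Older releases also offered SAMME.R, a variant that uses predicted class probabilities instead of hard labels; it was deprecated in version 1.4 and removed in 1.6, so check the documentation for the version you have installed.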
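
One way to check the default base learner is to fit a small ensemble without passing an estimator and inspect the fitted trees. The sketch below uses a synthetic dataset from `make_classification` and a tiny ensemble, both chosen only for illustration. The printed learner weights and errors will vary with the scikit-learn version and algorithm in use.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# No estimator argument is passed, so the default base learner is used.
clf = AdaBoostClassifier(n_estimators=5, random_state=0).fit(X, y)

# Each fitted base learner should be a depth-1 tree (a decision stump).
depths = [tree.get_depth() for tree in clf.estimators_]
print(depths)

# Per-learner weights and per-round weighted errors recorded during fitting.
print(clf.estimator_weights_)
print(clf.estimator_errors_)
```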

Training AdaBoost

<pre><code class="language-python">from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision stumps
    n_estimators=200,
    learning_rate=0.5,
    random_state=42,
)
ada.fit(X_train, y_train)
print(f"Test Accuracy: {ada.score(X_test, y_test):.3f}")</code></pre>

Staged Predictions

<pre><code class="language-python">import matplotlib.pyplot as plt

train_errs = [1 - acc for acc in ada.staged_score(X_train, y_train)]
test_errs = [1 - acc for acc in ada.staged_score(X_test, y_test)]

plt.plot(train_errs, label='Train Error')
plt.plot(test_errs, label='Test Error')
plt.xlabel('Boosting Round')
plt.ylabel('Error Rate')
plt.legend()
plt.title('AdaBoost Learning Curves')
plt.show()</code></pre>

Strengths and Limitations

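AdaBoost is fast to train with shallow base learners, fairly interpretable when built from decision stumps, and often resistant to overfitting on clean data. It is, however, sensitive to noisy labels and outliers: a sample that keeps being misclassified has its weight multiplied up in every round, so a few such points can come to dominate training.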

When to Use AdaBoost

AdaBoost tends to work well on relatively clean, balanced datasets. For noisy or imbalanced data, Gradient Boosting or XGBoost (which has built-in regularization) are often preferred. Where possible, inspect the distribution of sample weights to check whether outliers are dominating the boosting process.
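
As a rough sketch of such an inspection, the hypothetical helper below summarizes how concentrated a weight vector is. It reports the share held by the heaviest samples and the effective sample size 1 / Σ w_i². Both are just one possible pair of diagnostics, and the example weight vectors are made up. AdaBoostClassifier does not appear to expose per-sample weights as a fitted attribute, so the weights would have to come from a manual boosting loop.

```python
def weight_concentration(weights, top_k=1):
    """Return (share of the top_k heaviest samples, effective sample size)."""
    total = sum(weights)
    w = [wi / total for wi in weights]            # re-normalize defensively
    top_share = sum(sorted(w, reverse=True)[:top_k])
    effective_n = 1.0 / sum(wi * wi for wi in w)  # equals len(w) when uniform
    return top_share, effective_n

# Made-up weight vectors: uniform, and one dominated by a single sample.
uniform = [0.2] * 5
skewed = [0.125, 0.125, 0.125, 0.125, 0.5]

uniform_stats = weight_concentration(uniform)
skewed_stats = weight_concentration(skewed)
print(uniform_stats)  # roughly (0.2, 5.0)
print(skewed_stats)   # roughly (0.5, 3.2)
```

An effective sample size far below the number of training samples, or one sample holding a large share of the weight, suggests that a few points are steering the later rounds.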