AdaBoost (Adaptive Boosting) Algorithm
AdaBoost builds a strong classifier from a sequence of weak learners. After each round it re-weights the training samples so that the next learner focuses on the examples misclassified so far.
The AdaBoost Algorithm
AdaBoost maintains a weight distribution over training samples, initially uniform. After each weak learner is trained, correctly classified samples get lower weight and misclassified ones get higher weight, forcing the next learner to focus on harder examples.
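To make the re-weighting concrete, here is a minimal sketch of a single boosting round on five toy samples. The labels and predictions are made-up values, and the update rule used (multiply each weight by exp(-α · y · h), then renormalize) is the standard discrete AdaBoost update for labels in {-1, +1}, which is assumed here.

```python
import numpy as np

# Toy setup (hypothetical values): 5 samples, labels in {-1, +1}.
y = np.array([1, 1, -1, -1, 1])
# Predictions of an imagined weak learner; only the last sample is wrong.
h = np.array([1, 1, -1, -1, -1])

w = np.full(len(y), 1.0 / len(y))          # uniform initial weights (0.2 each)
eps = np.sum(w * (h != y))                 # weighted error: 0.2
alpha = 0.5 * np.log((1.0 - eps) / eps)    # learner weight: 0.5 * ln(4)

# Correct samples (y * h = +1) shrink, misclassified ones (y * h = -1) grow.
w_new = w * np.exp(-alpha * y * h)
w_new /= w_new.sum()                       # renormalize to a distribution

print(np.round(w_new, 3))
```

After the update, the four correctly classified samples drop from 0.2 to 0.125 each, while the single misclassified sample rises to 0.5. In other words, the points the learner got wrong now carry half of the total weight.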
Algorithm Steps
- Initialize sample weights: w_i = 1/N for all i.
- Train a weak learner h_t on weighted data.
- Compute weighted error: ε_t = Σ_i w_i · 𝟙[h_t(x_i) ≠ y_i].
- Compute learner weight: α_t = 0.5 · ln((1 - ε_t) / ε_t).
- Update sample weights and normalize.
- Final prediction: H(x) = sign(Σ_t α_t h_t(x)).
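The steps above can be sketched from scratch in a few lines. This is an illustrative implementation, not a reference one: it assumes binary labels encoded as -1/+1, uses scikit-learn decision stumps as the weak learners, and uses the standard exponential update w_i ← w_i · exp(-α_t y_i h_t(x_i)) for the "update sample weights" step. The function names `adaboost_fit` and `adaboost_predict` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def adaboost_fit(X, y, n_rounds=30):
    """Discrete AdaBoost with decision stumps; y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # step 1: uniform weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1, random_state=0)
        stump.fit(X, y, sample_weight=w)         # step 2: weighted fit
        pred = stump.predict(X)
        eps = np.sum(w * (pred != y))            # step 3: weighted error
        eps = np.clip(eps, 1e-10, 1 - 1e-10)     # guard against log(0)
        if eps >= 0.5:                           # no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - eps) / eps)    # step 4: learner weight
        w = w * np.exp(-alpha * y * pred)        # step 5: re-weight ...
        w /= w.sum()                             # ... and normalize
        learners.append(stump)
        alphas.append(alpha)
    return learners, np.array(alphas)


def adaboost_predict(X, learners, alphas):
    """Step 6: sign of the alpha-weighted vote (ties go to +1)."""
    votes = sum(a * h.predict(X) for a, h in zip(alphas, learners))
    return np.where(votes >= 0, 1, -1)


# Synthetic data as a stand-in for a real dataset.
X, y01 = make_classification(n_samples=500, n_features=10, random_state=0)
y = 2 * y01 - 1                                  # map {0, 1} -> {-1, +1}

learners, alphas = adaboost_fit(X, y)
acc = np.mean(adaboost_predict(X, learners, alphas) == y)
print(f"rounds used: {len(learners)}, training accuracy: {acc:.3f}")
```

The early stop at ε_t ≥ 0.5 is a design choice of this sketch: a learner that is no better than chance would receive a zero or negative α_t, so boosting has nothing left to gain from it.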
AdaBoost in scikit-learn
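scikit-learn's AdaBoostClassifier implements the multiclass SAMME algorithm. Older releases also offered SAMME.R, a variant that uses real-valued class-probability estimates; it was deprecated in version 1.4 and removed in 1.6. The default base learner is a decision stump (a decision tree of depth 1).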
Training AdaBoost
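A minimal training sketch follows. The synthetic dataset and the hyperparameter values (`n_estimators=100`, `learning_rate=0.5`) are illustrative assumptions, not recommendations. No base estimator is passed, so the default stump is used; this also sidesteps the fact that the name of that parameter differs between scikit-learn versions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset (an assumption of this sketch).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Default base learner: a depth-1 decision tree (stump).
clf = AdaBoostClassifier(n_estimators=100, learning_rate=0.5, random_state=42)
clf.fit(X_train, y_train)

print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
print(f"learners actually fitted: {len(clf.estimators_)}")
```

`learning_rate` shrinks the contribution of each learner, so lower values usually need more estimators to reach the same fit. The two are best tuned together.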
Staged Predictions
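`staged_predict` yields the ensemble's predictions after each boosting round, which makes it possible to see how accuracy evolves without refitting. The sketch below uses a synthetic dataset (an assumption) and picks the round with the best held-out accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# One accuracy value per boosting round: after 1, 2, ..., T learners.
staged_acc = np.array(
    [np.mean(y_pred == y_test) for y_pred in clf.staged_predict(X_test)]
)
best_round = int(np.argmax(staged_acc)) + 1

print(f"rounds evaluated: {len(staged_acc)}")
print(f"best held-out accuracy {staged_acc.max():.3f} at round {best_round}")
```

If the best round is well below `n_estimators`, that is a hint the ensemble could be truncated. In practice the choice should be made on a validation split rather than the final test set.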
Strengths and Limitations
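AdaBoost is fast to train, fairly interpretable when the base learners are stumps, and often resistant to overfitting on clean data. It is, however, sensitive to noisy labels and outliers: a sample that keeps being misclassified has its weight multiplied up in every round, so a few such points can end up with extremely high weights.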
When to Use AdaBoost
AdaBoost excels on relatively clean, balanced datasets. For noisy or imbalanced data, Gradient Boosting or XGBoost (with built-in regularization) are often preferred. Inspect the distribution of sample weights to check whether outliers are dominating the boosting process.
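One way to inspect the weights is sketched below. As far as its documented fitted attributes go, AdaBoostClassifier does not appear to expose the final per-sample weights, so this sketch re-runs the discrete AdaBoost update by hand with stumps. The 5% label-flip rate, the 50 rounds, and the synthetic dataset are all hypothetical choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y01 = make_classification(
    n_samples=400, n_features=10, flip_y=0.0, random_state=0
)
y = 2 * y01 - 1                                   # labels in {-1, +1}

# Flip 5% of the labels to simulate noise (hypothetical noise level).
noisy = rng.choice(len(y), size=20, replace=False)
y_noisy = y.copy()
y_noisy[noisy] *= -1

w = np.full(len(y), 1.0 / len(y))                 # uniform start
for _ in range(50):
    stump = DecisionTreeClassifier(max_depth=1, random_state=0)
    stump.fit(X, y_noisy, sample_weight=w)
    pred = stump.predict(X)
    eps = np.clip(np.sum(w * (pred != y_noisy)), 1e-10, 1 - 1e-10)
    if eps >= 0.5:                                # weak learner no better than chance
        break
    alpha = 0.5 * np.log((1 - eps) / eps)
    w = w * np.exp(-alpha * y_noisy * pred)       # up-weight mistakes
    w /= w.sum()

noisy_share = w[noisy].sum()
print(f"flipped labels are {len(noisy) / len(y):.0%} of the data "
      f"and hold {noisy_share:.0%} of the final weight")
print(f"largest single weight is {w.max() * len(y):.1f}x the uniform weight")
```

If a small group of samples holds a share of the weight far above its share of the data, those samples are worth checking by hand for label errors or outlying feature values.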