Model Stacking and Voting Classifiers
Stacking and voting are meta-ensemble techniques that combine the predictions of several diverse models, often achieving better performance than any single constituent model.
Voting Classifiers
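A voting classifier aggregates the predictions of multiple models, either by majority vote over the predicted class labels (hard voting) or by averaging the predicted class probabilities and choosing the most probable class (soft voting). Soft voting requires every model to expose probability estimates, and it generally outperforms hard voting when those probabilities are well calibrated.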
Hard vs. Soft Voting
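The difference between the two modes can be illustrated with scikit-learn's VotingClassifier. The following is a minimal sketch, not a benchmark: the synthetic dataset, the three base models, and the names hard_acc and soft_acc are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic data (an assumption for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three deliberately different model families; VotingClassifier clones them,
# so the same list can be reused for both ensembles.
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("nb", GaussianNB()),
]

# Hard voting: each model casts one vote for its predicted label.
hard = VotingClassifier(estimators=estimators, voting="hard").fit(X_train, y_train)

# Soft voting: class probabilities are averaged, then the argmax is taken.
soft = VotingClassifier(estimators=estimators, voting="soft").fit(X_train, y_train)

hard_acc = hard.score(X_test, y_test)
soft_acc = soft.score(X_test, y_test)
print(f"hard voting accuracy: {hard_acc:.3f}")
print(f"soft voting accuracy: {soft_acc:.3f}")
```

Which mode scores higher depends on the data and on how well calibrated the base models are, so both are worth evaluating rather than assuming soft voting wins.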
Model Stacking
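Stacking trains a meta-learner on the out-of-fold predictions of diverse level-0 (base) models, learning how best to combine them. Using out-of-fold predictions avoids the leakage that would occur if the meta-learner were trained on predictions the base models made for their own training samples: those predictions are overly optimistic, so the meta-learner would learn to over-trust whichever base models overfit the most.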
StackingClassifier in sklearn
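A minimal sketch of a stacked ensemble with scikit-learn's StackingClassifier might look like the following. The synthetic data, the choice of a random forest and a scaled SVC as base models, and logistic regression as the meta-learner are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Level-0 (base) models from different families.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svc", make_pipeline(StandardScaler(), SVC(random_state=0))),
]

# The meta-learner (final_estimator) is trained on cross-validated
# predictions of the base models; cv=5 requests 5-fold cross-validation.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

stack_acc = stack.score(X_test, y_test)
print(f"stacking accuracy: {stack_acc:.3f}")

# transform() returns the meta-features the final estimator sees:
# one column per base model in this binary problem.
meta_features = stack.transform(X_test)
print(meta_features.shape)
```

A simple, regularised meta-learner such as logistic regression is a common choice here because the meta-feature space is small and a flexible meta-learner is more likely to overfit.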
Out-of-Fold Predictions
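sklearn's StackingClassifier uses cross-validation internally (5-fold by default, configurable through the cv parameter) to generate the out-of-fold predictions that train the meta-learner, which prevents leakage. The base models used at prediction time are refit on the full training set. Setting passthrough=True additionally passes the original features to the meta-learner alongside the base model predictions.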
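To make the mechanism concrete, the out-of-fold step can be reproduced by hand with cross_val_predict. This is a rough sketch of the idea rather than a copy of the library's internals; the dataset, the two base models, and names such as meta_X and manual_acc are assumptions for illustration. The last few lines check how passthrough=True widens the meta-learner's input.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic data (an assumption for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Out-of-fold predictions: every training row is predicted by a model
# that did not see that row during fitting.
oof_columns = [
    cross_val_predict(model, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for _, model in base_models
]
meta_X = np.column_stack(oof_columns)

# The meta-learner is trained only on out-of-fold predictions.
meta_learner = LogisticRegression().fit(meta_X, y_train)

# For inference, the base models are refit on the full training set.
fitted = [clone(model).fit(X_train, y_train) for _, model in base_models]
test_meta_X = np.column_stack([m.predict_proba(X_test)[:, 1] for m in fitted])
manual_acc = meta_learner.score(test_meta_X, y_test)
print(f"manual stacking accuracy: {manual_acc:.3f}")

# With passthrough=True the meta-learner also receives the raw features.
stack_pt = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
    passthrough=True,
).fit(X_train, y_train)
passthrough_width = stack_pt.transform(X_test).shape[1]
print(passthrough_width)  # base-model columns plus original feature columns
```

Passing the original features through gives the meta-learner more information, but it also enlarges its input and can make overfitting more likely, so it is worth comparing both settings.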
Practical Tips
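Stacking adds significant computational cost: with k-fold cross-validation, each base model is fitted k times to produce out-of-fold predictions and once more on the full training set. Make sure the base models are diverse enough (for example, different algorithm families whose errors are only weakly correlated) to justify the complexity. Voting is simpler and often nearly as effective.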
When to Stack
Stacking is most beneficial in competition settings or when squeezing out the last percentage point of accuracy. In production, voting or a single well-tuned model may be preferable for speed and maintainability. Always evaluate the whole ensemble with cross-validation, because stacking can overfit if the meta-learner sees leaked predictions or is too flexible for the small meta-feature space.
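One way to decide whether stacking is worth it is to cross-validate it against simpler alternatives on the same folds. The sketch below assumes a synthetic dataset and an arbitrary set of base models; the helper make_base and the results dictionary are hypothetical names introduced only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic data (an assumption for illustration only).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)


def make_base():
    """Return a fresh list of (name, estimator) base models."""
    return [
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("nb", GaussianNB()),
    ]


candidates = {
    "single_rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "soft_voting": VotingClassifier(estimators=make_base(), voting="soft"),
    "stacking": StackingClassifier(
        estimators=make_base(),
        final_estimator=LogisticRegression(),
        cv=5,  # inner folds used to build the meta-learner's training data
    ),
}

# Outer 5-fold cross-validation scores each candidate as a whole, so the
# stacking estimate includes the meta-learner's own generalisation error.
results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

If the stacked model's mean score does not exceed the simpler candidates by more than the fold-to-fold spread, the extra training cost and maintenance burden are hard to justify.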