Model Stacking and Voting Classifiers

Stacking and voting are meta-ensemble techniques that combine predictions from multiple diverse models, often achieving better performance than any single constituent model.

Voting Classifiers

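A voting classifier aggregates the predictions of several models, either by majority vote over the predicted class labels (hard voting) or by averaging the predicted class probabilities and choosing the most probable class (soft voting). Soft voting generally outperforms hard voting when the models are well calibrated, because it takes each model's confidence into account.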

Hard vs. Soft Voting

<pre><code class="language-python">from sklearn.ensemble import VotingClassifier, RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Three different model families; all of them expose predict_proba,
# which soft voting requires.
estimators = [
    ('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
    ('gbm', GradientBoostingClassifier(n_estimators=100, random_state=42)),
    ('lr', LogisticRegression(max_iter=1000))
]

# Compare both voting schemes with 5-fold cross-validation.
for voting in ['hard', 'soft']:
    vc = VotingClassifier(estimators=estimators, voting=voting)
    scores = cross_val_score(vc, X, y, cv=5)
    print(f"{voting.title()} Voting CV: {scores.mean():.3f}")</code></pre>
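To make the difference between the two schemes concrete, the following minimal sketch applies both rules by hand to a small table of class-1 probabilities. The numbers are hypothetical and chosen only for illustration, and the 0.5 threshold is a binary-classification simplification of taking the most probable class.

```python
import numpy as np

# Hypothetical class-1 probabilities from three classifiers (columns)
# for four samples (rows). Illustrative values, not from a real model.
proba_class1 = np.array([
    [0.90, 0.45, 0.40],  # one confident "yes", two mild "no"
    [0.55, 0.60, 0.10],  # two mild "yes", one confident "no"
    [0.80, 0.70, 0.60],  # all models agree on class 1
    [0.20, 0.30, 0.40],  # all models agree on class 0
])

# Hard voting: each model casts a 0/1 vote, and the majority wins.
votes = (proba_class1 >= 0.5).astype(int)
hard_pred = (votes.sum(axis=1) > votes.shape[1] / 2).astype(int)

# Soft voting: average the probabilities first, then threshold.
mean_proba = proba_class1.mean(axis=1)
soft_pred = (mean_proba >= 0.5).astype(int)

print("hard:", hard_pred.tolist())  # hard: [0, 1, 1, 0]
print("soft:", soft_pred.tolist())  # soft: [1, 0, 1, 0]
```

The two rules disagree on the first two samples, where one confident model is outvoted by two uncertain ones. This is the case in which calibration matters: soft voting only helps if that confidence is trustworthy.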

Model Stacking

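Stacking trains a meta-learner on the out-of-fold predictions of diverse level-0 (base) models, so that it learns how best to combine them. Using out-of-fold predictions avoids the information leakage that would occur if the meta-learner were trained on predictions the base models made for their own training samples, since those predictions look more accurate than they would be on unseen data.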

StackingClassifier in sklearn

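<pre><code class="language-python">from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Level-0 (base) models
level0 = [
    ('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
    ('gbm', GradientBoostingClassifier(n_estimators=100, random_state=42)),
    ('svm', SVC(probability=True, kernel='rbf'))
]
level1 = LogisticRegression()  # meta-learner

# cv=5 controls the internal folds used to build the meta-learner's training data.
stack = StackingClassifier(estimators=level0, final_estimator=level1,
                           cv=5, passthrough=False, n_jobs=-1)

# The outer 5-fold cross-validation evaluates the whole stack.
scores = cross_val_score(stack, X, y, cv=5)
print(f"Stacking CV: {scores.mean():.3f}")</code></pre>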

Out-of-Fold Predictions

sklearn's StackingClassifier automatically uses cross-validation (controlled by the cv parameter) to generate the out-of-fold predictions on which the meta-learner is trained, preventing leakage. Setting passthrough=True additionally passes the original features to the meta-learner alongside the base-model predictions.
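The out-of-fold idea can also be written out by hand, which may help when debugging a stack. The sketch below is a simplified illustration built on cross_val_predict, not a reproduction of StackingClassifier's internals. The choice of base models, the scaling pipeline, the train/test split and the variable names are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative base models (an assumption for this sketch).
base_models = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "lr": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold class-1 probabilities: every training row is predicted by a
# model that did not see that row during fitting.
oof = np.column_stack([
    cross_val_predict(model, X_train, y_train, cv=cv, method="predict_proba")[:, 1]
    for model in base_models.values()
])

# The meta-learner is trained only on the out-of-fold predictions.
meta = LogisticRegression().fit(oof, y_train)

# At prediction time the base models are refit on all training data.
test_meta_features = np.column_stack([
    model.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for model in base_models.values()
])
acc = meta.score(test_meta_features, y_test)
print(f"Manual stacking test accuracy: {acc:.3f}")
```

Each column of oof holds one base model's predictions for rows it was not trained on, which is the property that keeps the meta-learner from rewarding a base model for memorizing its training data.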

Practical Tips

Stacking adds significant computational cost; ensure base models are diverse enough to justify the complexity. Voting is simpler and often nearly as effective.
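One rough way to check diversity before committing to a stack is to compare the base models' out-of-fold predictions. The sketch below is an assumed heuristic rather than a standard sklearn utility: it computes pairwise correlations of the predicted probabilities and the fraction of samples on which the predicted labels disagree. The model choices and the 0.5 threshold are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Illustrative candidate base models (an assumption for this sketch).
models = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "gbm": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "lr": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}
names = list(models)

# Out-of-fold class-1 probabilities, one column per model.
oof = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in models.values()
])

# Pairwise Pearson correlation between the models' probability outputs.
corr = np.corrcoef(oof, rowvar=False)

# Fraction of samples on which two models predict different labels.
labels = (oof >= 0.5).astype(int)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        disagreement = np.mean(labels[:, i] != labels[:, j])
        print(f"{names[i]} vs {names[j]}: "
              f"corr={corr[i, j]:.3f}, disagreement={disagreement:.3f}")
```

Correlations close to 1 with almost no disagreement suggest the models make the same mistakes, in which case a meta-learner has little to exploit and simple voting or a single model is likely sufficient.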

When to Stack

Stacking is most beneficial in competition settings or when squeezing out the last percentage point of accuracy. In production, voting or a single well-tuned model may be preferable for speed and maintainability. Always evaluate the full ensemble with cross-validation, since stacking can overfit if not done carefully.
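The trade-off can be measured directly. The following sketch uses cross_validate to record both accuracy and mean fit time for a single model, a soft-voting ensemble and a stack. The candidates, their hyperparameters and the helper make_base are assumptions for illustration, and timings will vary by machine.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)


def make_base():
    """Hypothetical helper: returns a fresh list of base estimators."""
    return [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ]


candidates = {
    "single_rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "soft_voting": VotingClassifier(estimators=make_base(), voting="soft"),
    "stacking": StackingClassifier(
        estimators=make_base(), final_estimator=LogisticRegression(), cv=3
    ),
}

# cross_validate reports per-fold test scores and fit times.
results = {}
for name, model in candidates.items():
    cv_out = cross_validate(model, X, y, cv=5)
    results[name] = (cv_out["test_score"].mean(), cv_out["fit_time"].mean())
    print(f"{name:12s} accuracy={results[name][0]:.3f} "
          f"mean fit time={results[name][1]:.2f}s")
```

If the stack's accuracy gain over voting or the single model is within the fold-to-fold variation, the extra fit time and maintenance burden are hard to justify.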