XGBoost: Advanced Gradient Boosting

XGBoost (eXtreme Gradient Boosting) extends gradient boosting with L1/L2 regularization of leaf weights, a second-order Taylor approximation of the loss for evaluating splits, parallelized split finding within each tree, and built-in handling of missing values.

XGBoost Innovations

Unlike vanilla GBM, which fits each tree to the gradient alone, XGBoost uses both the first derivative (gradient) and the second derivative (Hessian) of the loss. Together they give closed-form optimal leaf weights and a split gain that accounts for the curvature of the loss.
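To make the role of the two derivatives concrete, here is a minimal pure-Python sketch for logistic loss. It assumes the standard formulas from the XGBoost paper with L2 regularization only: for a leaf with gradient sum G and Hessian sum H, the optimal weight is w* = -G / (H + λ), and a split is scored as ½[G_L²/(H_L + λ) + G_R²/(H_R + λ) - (G_L + G_R)²/(H_L + H_R + λ)] - γ. The helper names (logistic_grad_hess, leaf_weight, split_gain) are hypothetical and are not part of the XGBoost API.

```python
import math


def logistic_grad_hess(y_true, raw_score):
    """Gradient and Hessian of log loss with respect to the raw (pre-sigmoid) score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - y_true, p * (1.0 - p)


def leaf_weight(G, H, reg_lambda=1.0):
    """Closed-form optimal leaf weight: w* = -G / (H + lambda)."""
    return -G / (H + reg_lambda)


def leaf_score(G, H, reg_lambda=1.0):
    """Structure score of one leaf: G^2 / (H + lambda)."""
    return G * G / (H + reg_lambda)


def split_gain(G_left, H_left, G_right, H_right, reg_lambda=1.0, gamma=0.0):
    """Loss reduction from splitting a node into a left and a right child."""
    parent = leaf_score(G_left + G_right, H_left + H_right, reg_lambda)
    children = leaf_score(G_left, H_left, reg_lambda) + leaf_score(G_right, H_right, reg_lambda)
    return 0.5 * (children - parent) - gamma


# Toy data: two positives go left, two negatives go right.
# Every raw score starts at 0.0, so p = 0.5 for all four samples.
labels = [1, 1, 0, 0]
stats = [logistic_grad_hess(y, 0.0) for y in labels]

G_left = sum(g for g, _ in stats[:2])
H_left = sum(h for _, h in stats[:2])
G_right = sum(g for g, _ in stats[2:])
H_right = sum(h for _, h in stats[2:])

w_left = leaf_weight(G_left, H_left)
w_right = leaf_weight(G_right, H_right)
gain = split_gain(G_left, H_left, G_right, H_right)

print(f"left weight={w_left:.4f}, right weight={w_right:.4f}, gain={gain:.4f}")
```

The left leaf gets a positive weight and the right leaf a negative one, which pushes the predictions toward the respective labels. Increasing reg_lambda shrinks both weights toward zero.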

Regularized Objective

XGBoost minimizes L(Φ) = Σ_i l(y_i, ŷ_i) + Σ_k Ω(f_k), where Ω(f) = γT + ½λ‖w‖² + α‖w‖₁. Here T is the number of leaves in tree f, w is the vector of its leaf weights, γ penalizes each additional leaf, λ is the L2 regularization strength on the leaf weights, and α is the L1 regularization strength.
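As a small worked example of the penalty term, the sketch below evaluates Ω(f) for one tree given its leaf weights. The function name tree_penalty and the example weights are made up for illustration. The keyword names mirror the XGBoost parameters gamma, reg_lambda, and reg_alpha, which correspond to γ, λ, and α.

```python
def tree_penalty(leaf_weights, gamma=0.0, reg_lambda=1.0, reg_alpha=0.0):
    """Omega(f) = gamma * T + 0.5 * lambda * ||w||^2 + alpha * ||w||_1 for one tree."""
    T = len(leaf_weights)                          # number of leaves
    l2 = sum(w * w for w in leaf_weights)          # squared L2 norm of the weights
    l1 = sum(abs(w) for w in leaf_weights)         # L1 norm of the weights
    return gamma * T + 0.5 * reg_lambda * l2 + reg_alpha * l1


# A hypothetical tree with four leaves.
weights = [0.5, -0.25, 0.0, 1.0]
penalty = tree_penalty(weights, gamma=1.0, reg_lambda=1.0, reg_alpha=0.1)

# gamma * T = 4.0, 0.5 * lambda * ||w||^2 = 0.65625, alpha * ||w||_1 = 0.175
print(f"Omega(f) = {penalty:.5f}")
```

With these values the leaf-count term dominates, so a large γ mainly discourages deep or bushy trees, while λ and α act on the size of the individual weights.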

Training XGBoost with scikit-learn API

XGBoost provides a scikit-learn-compatible API through XGBClassifier and XGBRegressor. These estimators support early stopping and work with scikit-learn's cross-validation utilities.

Basic XGBoost Training

<pre><code class="language-python">from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Hold out part of the training data for early stopping so the test set stays unseen.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)

xgb = XGBClassifier(
    n_estimators=500,          # upper bound; early stopping decides where to stop
    learning_rate=0.05,
    max_depth=4,
    subsample=0.8,             # fraction of rows sampled per tree
    colsample_bytree=0.8,      # fraction of columns sampled per tree
    reg_alpha=0.1,             # L1
    reg_lambda=1.0,            # L2
    eval_metric='logloss',
    early_stopping_rounds=20,  # constructor argument in XGBoost 1.6+
    random_state=42,
)
xgb.fit(X_fit, y_fit, eval_set=[(X_val, y_val)], verbose=False)

print(f"Test Accuracy: {xgb.score(X_test, y_test):.3f}")</code></pre>

Early Stopping

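Early stopping monitors a metric on a validation set and stops training when it has not improved for early_stopping_rounds consecutive rounds. This limits overfitting without manually tuning n_estimators: set n_estimators to a generous upper bound and let the validation metric determine where training stops.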
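The stopping rule itself can be sketched without any library. The function below is a simplified illustration of the patience logic for a metric where lower is better, not XGBoost's actual callback implementation. The name early_stopping_round and the loss values are made up for the example.

```python
def early_stopping_round(val_losses, patience):
    """Return (stop_round, best_round) for per-round validation losses.

    Training stops at the first round where the loss has failed to improve
    on the best value for `patience` consecutive rounds.
    """
    best_loss = float("inf")
    best_round = -1
    for rnd, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_round = loss, rnd
        elif rnd - best_round >= patience:
            return rnd, best_round
    # Patience was never exhausted: training ran through every round.
    return len(val_losses) - 1, best_round


# Hypothetical validation log loss per boosting round (0-indexed).
losses = [0.60, 0.50, 0.45, 0.46, 0.47, 0.44, 0.45, 0.46, 0.47, 0.48]
stop_round, best_round = early_stopping_round(losses, patience=3)

print(f"stopped at round {stop_round}, best round was {best_round}")
```

Here the loss reaches its minimum at round 5 and then fails to improve in rounds 6, 7, and 8, so training stops at round 8. The temporary rise in rounds 3 and 4 does not trigger a stop because the patience window is not yet exhausted.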

Performance and Scalability

XGBoost stores features in pre-sorted column blocks, supports out-of-core computation for datasets that do not fit in memory, and parallelizes training across CPU cores and GPUs, making it one of the fastest gradient boosting implementations.
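The benefit of a pre-sorted column is that the best threshold for one feature can be found in a single linear pass that accumulates gradient and Hessian sums. The sketch below illustrates that scan in plain Python under simplifying assumptions: one feature, no missing values, L2 regularization only, and no γ penalty. The function best_split is hypothetical and much simpler than XGBoost's block-based implementation, which runs such scans over many columns in parallel.

```python
def score(G, H, reg_lambda):
    """Structure score of a node: G^2 / (H + lambda)."""
    return G * G / (H + reg_lambda)


def best_split(sorted_values, grads, hess, reg_lambda=1.0):
    """Scan one pre-sorted feature column; return (gain, threshold) of the best split.

    `grads` and `hess` must be ordered the same way as `sorted_values`.
    """
    G_total, H_total = sum(grads), sum(hess)
    parent = score(G_total, H_total, reg_lambda)

    G_left = H_left = 0.0
    best_gain, best_threshold = 0.0, None
    for i in range(len(sorted_values) - 1):
        # Move sample i from the right child to the left child.
        G_left += grads[i]
        H_left += hess[i]
        if sorted_values[i] == sorted_values[i + 1]:
            continue  # no threshold can separate equal values
        gain = 0.5 * (
            score(G_left, H_left, reg_lambda)
            + score(G_total - G_left, H_total - H_left, reg_lambda)
            - parent
        )
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (sorted_values[i] + sorted_values[i + 1])
    return best_gain, best_threshold


# Toy column: the two smallest values have negative gradients, the rest positive.
values = [1.0, 2.0, 3.0, 4.0]
grads = [-0.5, -0.5, 0.5, 0.5]
hess = [0.25, 0.25, 0.25, 0.25]

gain, threshold = best_split(values, grads, hess)
print(f"best threshold={threshold}, gain={gain:.4f}")
```

Since each column can be scanned independently, this step is a natural place to parallelize across features.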

GPU Acceleration

<pre><code class="language-python">from xgboost import XGBClassifier

# Enable GPU training (requires a CUDA-enabled XGBoost build)
xgb_gpu = XGBClassifier(
    n_estimators=500,
    learning_rate=0.05,
    tree_method='hist',  # histogram algorithm; before 2.0, use tree_method='gpu_hist' instead
    device='cuda',       # XGBoost >= 2.0
    random_state=42,
)
# xgb_gpu.fit(X_train, y_train)</code></pre>