Tracking Data Drift in Production

Data drift occurs when the statistical properties of production inputs diverge from the training distribution. The model keeps returning predictions without raising errors, so accuracy can degrade silently until downstream metrics or users expose the problem.


Types of Drift

There are several distinct flavours of drift, each requiring different detection strategies.

Drift Taxonomy

  • Data/Feature Drift (covariate shift): p(X) changes but p(y|X) stays the same; the inputs shift while the relationship between features and labels holds
  • Label Drift (prior probability shift): p(y) changes while p(X|y) stays the same; the target class balance shifts
  • Concept Drift: p(y|X) changes; the fundamental relationship between features and labels changes
  • Prediction Drift: p(ŷ) changes; the distribution of model outputs shifts, which serves as an early-warning proxy for concept drift when labels are delayed (with a fixed model, it can only move when the inputs move)
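
The difference between covariate shift and concept drift is easier to see on synthetic data. The sketch below is a toy illustration, not a detection method: the one-dimensional feature, the threshold labelling rule, and the names `true_label` and `model` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def true_label(x, threshold):
    """Hypothetical ground-truth rule: y = 1 when x exceeds the threshold."""
    return (x > threshold).astype(int)

def model(x):
    """A stand-in 'model' that learned the reference rule (threshold 0) exactly."""
    return true_label(x, 0.0)

# Reference data: x ~ N(0, 1), labels follow the threshold-0 rule
x_ref = rng.normal(0.0, 1.0, n)
y_ref = true_label(x_ref, 0.0)

# Covariate shift: p(X) moves to N(1, 1); p(y|X) is unchanged
x_cov = rng.normal(1.0, 1.0, n)
y_cov = true_label(x_cov, 0.0)

# Concept drift: p(X) is unchanged; the labelling rule moves to threshold 1
x_con = rng.normal(0.0, 1.0, n)
y_con = true_label(x_con, 1.0)

acc_cov = (model(x_cov) == y_cov).mean()
acc_con = (model(x_con) == y_con).mean()

print(f"Accuracy under covariate shift: {acc_cov:.3f}")
print(f"Accuracy under concept drift:   {acc_con:.3f}")
print(f"Positive prediction rate (reference):     {model(x_ref).mean():.3f}")
print(f"Positive prediction rate (concept drift): {model(x_con).mean():.3f}")
```

In this toy setup the model matches the labelling rule exactly, so covariate shift alone costs no accuracy; a real model only approximates p(y|X) and can still degrade in regions it rarely saw during training. Under concept drift the positive prediction rate stays close to the reference value even though accuracy falls, which is why prediction drift is best treated as one signal among several.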

Statistical Tests for Drift Detection

Common drift detection methods compare reference (training) and production (current) distributions using statistical tests.

Kolmogorov-Smirnov and Chi-Squared Tests

<pre><code class="language-python">import numpy as np
from scipy import stats
from scipy.stats import chi2_contingency

# Simulate reference and production samples
ref = np.random.normal(0, 1, 1000)      # training distribution
prod = np.random.normal(0.5, 1.2, 500)  # shifted production data

# K-S test for continuous features
ks_stat, p_value = stats.ks_2samp(ref, prod)
print(f"KS Statistic: {ks_stat:.4f}, p-value: {p_value:.4f}")
if p_value < 0.05:
    print("Drift detected!")
else:
    print("No significant drift.")

# Chi-Squared test for categorical features
# (one row per sample, one column per category count)
ref_counts = np.array([600, 300, 100])
prod_counts = np.array([400, 350, 250])
chi2, p, dof, expected = chi2_contingency([ref_counts, prod_counts])
print(f"Chi2 p-value: {p:.4f}")</code></pre>
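
In practice these tests are usually run per feature across a whole table. The following is a minimal sketch of that idea, assuming pandas DataFrames with matching columns; the helper name `detect_drift`, the column names, and the Bonferroni correction (dividing the significance level by the number of features tested) are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd
from scipy import stats

def detect_drift(reference: pd.DataFrame, current: pd.DataFrame,
                 alpha: float = 0.05) -> pd.DataFrame:
    """Run a K-S test on numeric columns and a chi-squared test on the rest.

    Hypothetical helper: uses a Bonferroni-corrected threshold (alpha / number
    of columns) to limit false alarms when many features are tested at once.
    """
    threshold = alpha / len(reference.columns)
    rows = []
    for col in reference.columns:
        ref_col = reference[col].dropna()
        cur_col = current[col].dropna()
        if pd.api.types.is_numeric_dtype(ref_col):
            test = "ks"
            _, p = stats.ks_2samp(ref_col, cur_col)
        else:
            test = "chi2"
            # Align both samples on the union of observed categories
            categories = sorted(set(ref_col) | set(cur_col))
            table = np.array([
                ref_col.value_counts().reindex(categories, fill_value=0).to_numpy(),
                cur_col.value_counts().reindex(categories, fill_value=0).to_numpy(),
            ])
            _, p, _, _ = stats.chi2_contingency(table)
        rows.append({"feature": col, "test": test,
                     "p_value": float(p), "drift": bool(p < threshold)})
    return pd.DataFrame(rows)

# Usage on synthetic data (column names are made up for the example)
rng = np.random.default_rng(42)
reference = pd.DataFrame({
    "amount": rng.normal(0, 1, 2000),
    "channel": rng.choice(["web", "app", "store"], 2000, p=[0.6, 0.3, 0.1]),
})
current = pd.DataFrame({
    "amount": rng.normal(0.5, 1.2, 1000),
    "channel": rng.choice(["web", "app", "store"], 1000, p=[0.4, 0.35, 0.25]),
})
result = detect_drift(reference, current)
print(result)
```

With large samples, even tiny and practically irrelevant shifts produce small p-values, so it is worth reading the test statistic alongside the p-value before acting on a flag.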

Monitoring with Evidently AI

<pre><code class="language-python">import pandas as pd

# These import paths follow Evidently's older Report API; newer releases
# may have reorganised the package, so check the docs for your installed version.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# reference_df: your training data
# production_df: a recent window of production data
reference_df = pd.read_csv("train_data.csv")
production_df = pd.read_csv("production_window.csv")

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_df, current_data=production_df)
report.save_html("drift_report.html")
# Open drift_report.html for a full interactive drift analysis</code></pre>

Responding to Drift

Detecting drift is only half the solution — you also need a playbook for what happens next.

Drift Response Strategies

  • Alert: Send a notification to the model owner for investigation
  • Retrain: Trigger a new training run on recent data if performance has degraded
  • Fallback: Route traffic to a simpler, more robust model
  • Feature engineering review: Investigate whether upstream data pipelines have changed
  • Window-based retraining: Automatically retrain on a rolling window of recent data on a schedule
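
One way to turn the strategies above into a playbook is a small decision function that maps monitoring signals to an action. The sketch below is only an illustration: the `DriftStatus` fields, the `Action` names, and every threshold are hypothetical and would need tuning for a real system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Action(Enum):
    NONE = "none"
    ALERT = "alert"
    RETRAIN = "retrain"
    FALLBACK = "fallback"

@dataclass
class DriftStatus:
    drifted_share: float            # fraction of monitored features flagged as drifted
    accuracy_drop: Optional[float]  # drop versus baseline; None while labels are delayed

def choose_action(status: DriftStatus,
                  alert_share: float = 0.2,
                  retrain_drop: float = 0.05,
                  fallback_drop: float = 0.15) -> Action:
    """Pick a response; all thresholds are illustrative assumptions."""
    if status.accuracy_drop is not None:
        # Measured performance loss takes priority over input drift alone
        if status.accuracy_drop >= fallback_drop:
            return Action.FALLBACK
        if status.accuracy_drop >= retrain_drop:
            return Action.RETRAIN
    if status.drifted_share >= alert_share:
        return Action.ALERT
    return Action.NONE

# Inputs drifted but labels have not arrived yet: notify the model owner
print(choose_action(DriftStatus(drifted_share=0.5, accuracy_drop=None)))
# Labels show a moderate accuracy loss: trigger retraining
print(choose_action(DriftStatus(drifted_share=0.1, accuracy_drop=0.06)))
```

Keeping the decision logic in one pure function makes the playbook easy to unit test and to review when thresholds change.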