Phase 1: Data, Features, and Model Training
The first phase covers everything from raw data to a cross-validated, tuned, and serialized model artifact ready for serving.
EDA, Pipeline, and Tuning
<pre><code class="language-python"># capstone/train.py
import os

import pandas as pd
import numpy as np
import joblib
import mlflow
import mlflow.sklearn
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, RandomizedSearchCV, cross_validate
from sklearn.metrics import classification_report
from scipy.stats import randint, uniform

# -- Load Data --
df = pd.read_csv("data/train.csv")
X = df.drop("target", axis=1)
y = df["target"]

# -- Preprocessing Pipeline --
num_cols = X.select_dtypes(include=["number"]).columns.tolist()
cat_cols = X.select_dtypes(include=["object", "category"]).columns.tolist()
preprocessor = ColumnTransformer([
    ("num", Pipeline([("imp", SimpleImputer(strategy="median")),
                      ("scl", StandardScaler())]), num_cols),
    # sparse_output requires scikit-learn >= 1.2
    ("cat", Pipeline([("imp", SimpleImputer(strategy="most_frequent")),
                      ("ohe", OneHotEncoder(handle_unknown="ignore", sparse_output=False))]), cat_cols)
])
pipe = Pipeline([
    ("pre", preprocessor),
    ("clf", GradientBoostingClassifier(random_state=42))
])

# -- Hyperparameter Tuning --
param_dist = {
    "clf__n_estimators": randint(50, 300),
    "clf__learning_rate": uniform(0.01, 0.29),  # uniform(loc, scale): samples from [0.01, 0.30]
    "clf__max_depth": randint(2, 7),
}
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = RandomizedSearchCV(pipe, param_dist, n_iter=50, cv=skf,
                            scoring="roc_auc", n_jobs=-1, random_state=42)

# -- MLflow Tracking --
mlflow.sklearn.autolog()
with mlflow.start_run(run_name="capstone-gbt"):
    search.fit(X, y)
    print("Best params:", search.best_params_)
    print("Best ROC-AUC:", search.best_score_)

# Persist the refit best pipeline; create the output directory if it is missing.
os.makedirs("artifacts", exist_ok=True)
joblib.dump(search.best_estimator_, "artifacts/model.joblib")</pre>
Full Evaluation Report
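<pre><code class="language-python"># Hold-out evaluation
from sklearn.base import clone
from sklearn.model_selection import train_test_split

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)
# Refit a fresh copy so search.best_estimator_ (trained on all rows) is left untouched.
# Caveat: the search already saw these hold-out rows while tuning, so this
# report is optimistic; split before tuning for an unbiased estimate.
best_pipe = clone(search.best_estimator_)
best_pipe.fit(X_tr, y_tr)
print(classification_report(y_te, best_pipe.predict(X_te)))</pre>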
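The evaluation above scores rows that the hyperparameter search already used, so the report can look better than it would on truly unseen data. A minimal sketch of the alternative ordering follows: split first, tune on the training portion only, then score once on the hold-out. The synthetic data from `make_classification`, the `demo_` names, and the deliberately tiny search are assumptions made to keep the example fast and runnable; they are not part of the capstone code.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for data/train.csv (assumption, for a runnable example).
X_demo, y_demo = make_classification(n_samples=400, n_features=8, random_state=42)

# 1. Carve out the hold-out set before any tuning happens.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# 2. Tune on the training portion only (small search space to keep this quick).
demo_search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=42),
    {
        "n_estimators": randint(20, 60),
        "learning_rate": uniform(0.01, 0.29),
        "max_depth": randint(2, 4),
    },
    n_iter=3,
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=42),
    scoring="roc_auc",
    random_state=42,
)
demo_search.fit(X_tr, y_tr)

# 3. Score once on rows the search never saw.
holdout_auc = roc_auc_score(y_te, demo_search.predict_proba(X_te)[:, 1])
print(f"Hold-out ROC-AUC: {holdout_auc:.3f}")
```

With this ordering, the cross-validated score guides tuning and the hold-out score is reported once at the end.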
Phase 2: API Serving and Containerisation
The second phase wraps the trained model in a FastAPI application and packages it as a Docker container ready for deployment to any cloud or on-premise environment.
FastAPI Serving App
<pre><code class="language-python"># capstone/app.py
from contextlib import asynccontextmanager

import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

class InferenceRequest(BaseModel):
    data: dict  # column -> value mapping matching training schema

model_store = {}

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load the pipeline once at startup and release it at shutdown.
    model_store["pipeline"] = joblib.load("artifacts/model.joblib")
    yield
    model_store.clear()

app = FastAPI(title="Capstone ML API", version="1.0.0", lifespan=lifespan)

@app.get("/health")
def health():
    return {"status": "ok", "model_loaded": "pipeline" in model_store}

@app.post("/predict")
def predict(req: InferenceRequest):
    pipe = model_store["pipeline"]
    X = pd.DataFrame([req.data])  # single-row frame keyed by column name
    pred = int(pipe.predict(X)[0])
    proba = pipe.predict_proba(X)[0].tolist()
    # "probability" is the probability of the predicted class.
    return {"prediction": pred, "probability": round(max(proba), 4)}</pre>
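Because `data` is typed as a plain `dict`, the request model accepts any keys, and a payload that does not match the training schema would most likely only fail deep inside the pipeline. One possible guard is sketched below. The helper `validate_payload`, the `EXPECTED` list, and the column names are hypothetical and used only for illustration.

```python
def validate_payload(data: dict, expected_columns: list[str]) -> dict:
    """Return the payload in training column order, or raise ValueError."""
    missing = [c for c in expected_columns if c not in data]
    extra = [c for c in data if c not in expected_columns]
    if missing or extra:
        raise ValueError(f"missing columns: {missing}; unexpected columns: {extra}")
    return {c: data[c] for c in expected_columns}


# Hypothetical schema; in practice it would come from the training data.
EXPECTED = ["age", "income", "gender"]

row = validate_payload({"gender": "M", "age": 35, "income": 65000}, EXPECTED)
print(row)  # {'age': 35, 'income': 65000, 'gender': 'M'}
```

Inside the endpoint, the `ValueError` could be translated into an HTTP 422 response so that clients see which columns are wrong.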
Dockerfile and Deployment
<pre><code class="language-dockerfile"># Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY artifacts/ artifacts/
COPY app.py .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
# Build and run:
# docker build -t capstone-ml:v1 .
# docker run -p 8000:8000 capstone-ml:v1
# Test:
# curl -X POST http://localhost:8000/predict \
# -H "Content-Type: application/json" \
# -d '{"data": {"age": 35, "income": 65000, "gender": "M"}}'</pre>
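The curl test above can also be scripted, which is convenient for smoke tests after a deployment. The sketch below uses only the Python standard library. The helper names `build_predict_request` and `call_predict` are hypothetical; the URL and payload mirror the curl command.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /predict request with a JSON body."""
    body = json.dumps({"data": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_predict(base_url: str, features: dict, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON reply; needs the container running."""
    req = build_predict_request(base_url, features)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_predict_request(
    "http://localhost:8000", {"age": 35, "income": 65000, "gender": "M"}
)
print(req.get_method(), req.full_url)
```

Calling `call_predict("http://localhost:8000", {...})` against the running container should return a dictionary with the `prediction` and `probability` keys produced by the `/predict` endpoint.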
Phase 3: Testing, Monitoring, and Next Steps
The final phase adds the operational layer: automated tests, performance monitoring, and a clear path for iterating on the model in production.
CI/CD and Testing Checklist
- ✅ Unit tests: transformer correctness, output shapes, data schema contracts
- ✅ Integration test: end-to-end POST to /predict returns expected format
- ✅ Performance gate: CI fails if CV ROC-AUC < 0.85
- ✅ Docker image build on every merge to main
- ✅ Shadow deployment before full traffic cutover
- ✅ Drift monitoring: weekly Evidently report against training distribution
- ✅ Latency SLO: P99 < 200ms tracked in Prometheus / Grafana
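The two numeric thresholds in the checklist (CV ROC-AUC of at least 0.85 and P99 latency under 200 ms) can be expressed as a small gate check. This is a sketch under stated assumptions: the function names are hypothetical, and the nearest-rank percentile over raw samples is a simplification, since Prometheus typically estimates quantiles from histogram buckets instead.

```python
AUC_GATE = 0.85      # checklist: CI fails if CV ROC-AUC < 0.85
P99_SLO_MS = 200.0   # checklist: P99 latency must stay below 200 ms


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def gate_failures(cv_auc: float, latencies_ms: list[float]) -> list[str]:
    """Return a list of human-readable gate violations (empty means pass)."""
    failures = []
    if cv_auc < AUC_GATE:
        failures.append(f"CV ROC-AUC {cv_auc:.3f} is below {AUC_GATE}")
    p99 = percentile_nearest_rank(latencies_ms, 99)
    if p99 >= P99_SLO_MS:
        failures.append(f"P99 latency {p99:.1f} ms is not below {P99_SLO_MS} ms")
    return failures


# Illustrative inputs only; real values would come from the CV run and a load test.
print(gate_failures(0.83, [20.0] * 99 + [450.0]))
```

A CI job could call `gate_failures` and exit with a non-zero status whenever the returned list is not empty.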
Continuous Improvement Loop
Production is not the end; it is the beginning of the feedback loop:
- Monitor prediction drift and accuracy on labelled production data
- Retrain when drift is detected or on a fixed schedule (weekly or monthly)
- Log all experiments with MLflow, and promote only models that pass the CI gate
- Use shadow deployment, then A/B testing, for every new candidate
- Automate the entire cycle with an ML pipeline (Airflow, Prefect, or Vertex Pipelines) for Level 2 MLOps maturity
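The retrain and promote rules in the list above reduce to two small decisions that an orchestrated pipeline could evaluate on every run. The sketch below is one way to write them down. The `Candidate` class, both function names, the seven-day default, and the extra requirement that a candidate must beat the production score are assumptions for illustration; the list itself only requires passing the CI gate.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Candidate:
    name: str
    cv_auc: float


def should_retrain(drift_detected: bool, last_trained: date, today: date,
                   max_age_days: int = 7) -> bool:
    """Retrain on detected drift, or when the model is older than the schedule allows."""
    return drift_detected or (today - last_trained) >= timedelta(days=max_age_days)


def should_promote(candidate: Candidate, production_auc: float,
                   auc_gate: float = 0.85) -> bool:
    """Promote only if the candidate passes the gate and (assumption) beats production."""
    return candidate.cv_auc >= auc_gate and candidate.cv_auc > production_auc


# Illustrative values only.
print(should_retrain(False, date(2024, 1, 1), date(2024, 1, 8)))  # True
print(should_promote(Candidate("gbt-v2", 0.84), production_auc=0.80))  # False
```

In an Airflow or Prefect pipeline, these checks would typically sit in a branching task that decides whether the training and deployment steps run.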