Setting Decision Thresholds
Most classifiers output probabilities, and the threshold that converts those probabilities into a class label is a critical design choice that should be tuned to your application's needs.
The Default Threshold and Why to Change It
By default, most classifiers use a threshold of 0.5: predict class 1 if the predicted probability P of class 1 exceeds 0.5. That default is often suboptimal, especially with imbalanced classes or when false positives and false negatives have very different costs.
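As a minimal sketch of what changing the threshold means in practice, the snippet below turns positive-class probabilities into labels with an adjustable cutoff. The function name `predict_with_threshold` and the probability values are hypothetical, chosen only for illustration.

```python
def predict_with_threshold(probabilities, threshold=0.5):
    """Convert positive-class probabilities into 0/1 labels.

    Predicts class 1 when the probability is strictly greater than the threshold.
    """
    return [1 if p > threshold else 0 for p in probabilities]


# Hypothetical positive-class probabilities from some classifier.
probs = [0.05, 0.20, 0.35, 0.50, 0.65, 0.90]

default_labels = predict_with_threshold(probs)       # default cutoff of 0.5
lenient_labels = predict_with_threshold(probs, 0.3)  # lower cutoff flags more positives

print(default_labels)
print(lenient_labels)
```

With the lower cutoff, the examples scored 0.35 and 0.50 move from class 0 to class 1; the model itself is unchanged.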
Precision-Recall Tradeoff
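Lowering the threshold increases recall (more true positives are caught) but usually decreases precision (more false positives are let through). Raising it does the opposite. In medical screening you often prefer high recall, because a missed case costs more than a false alarm; in spam filtering you often prefer high precision, because a legitimate message sent to the spam folder costs more than a spam message in the inbox. The right balance depends on the relative cost of each error type.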
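The tradeoff can be seen on a toy example. The sketch below computes precision and recall at a low and a high threshold; the labels and scores are made up, and returning a precision of 1.0 when nothing is flagged is a convention assumed here, not something the text prescribes.

```python
def precision_recall(y_true, probs, threshold):
    """Return (precision, recall) for the positive class at a given threshold."""
    preds = [1 if p > threshold else 0 for p in probs]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    # Assumed convention: precision is 1.0 when no example is flagged positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels and scores, for illustration only.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.7, 0.9]

low = precision_recall(y_true, probs, 0.25)   # lenient cutoff
high = precision_recall(y_true, probs, 0.65)  # strict cutoff

print("threshold 0.25:", low)
print("threshold 0.65:", high)
```

On this data the lenient cutoff catches every positive at the price of three false positives, while the strict cutoff produces no false positives but misses one positive.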
Finding an Optimal Threshold
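One common way to choose a threshold is to sweep candidate values on a held-out validation set and keep the one that scores best under a criterion you care about. The sketch below assumes the criterion is total misclassification cost; the cost values, the 0.01 grid, and the function names are all hypothetical. Other criteria, such as F1 or the highest precision subject to a minimum recall, could be swapped in.

```python
def total_cost(y_true, probs, threshold, cost_fp, cost_fn):
    """Sum the cost of all false positives and false negatives at a threshold."""
    cost = 0.0
    for y, p in zip(y_true, probs):
        pred = 1 if p > threshold else 0
        if pred == 1 and y == 0:
            cost += cost_fp
        elif pred == 0 and y == 1:
            cost += cost_fn
    return cost


def best_threshold(y_true, probs, cost_fp=1.0, cost_fn=5.0):
    """Sweep thresholds 0.01..0.99 and return the one with the lowest total cost.

    Ties go to the lowest threshold on the grid.
    """
    candidates = [i / 100 for i in range(1, 100)]
    return min(candidates, key=lambda t: total_cost(y_true, probs, t, cost_fp, cost_fn))


# Made-up validation labels and scores, for illustration only.
y_val = [0, 0, 0, 0, 1, 0, 1, 1]
p_val = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.7, 0.9]

# Missing a positive is assumed to cost 5x a false alarm, then the reverse.
t_recall_heavy = best_threshold(y_val, p_val, cost_fp=1.0, cost_fn=5.0)
t_precision_heavy = best_threshold(y_val, p_val, cost_fp=5.0, cost_fn=1.0)

print(t_recall_heavy, t_precision_heavy)
```

When false negatives are the expensive error the sweep settles on a lower threshold than when false positives are. The threshold should be selected on validation data rather than the test set, so the final evaluation stays unbiased.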
Class-Weighted Approaches
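An alternative to threshold tuning is to train with class_weight='balanced', which weights each class's contribution to the loss inversely to its frequency. Errors on the minority class then cost more during training, so the model's predicted probabilities for that class rise and the default 0.5 cutoff behaves like a lower threshold on an unweighted model.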
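Assuming class_weight='balanced' refers to the scikit-learn option of that name, the weights follow the documented heuristic n_samples / (n_classes * count of each class). The sketch below reproduces that formula in plain Python so the effect on an imbalanced label set is visible; the helper name `balanced_class_weights` and the 90/10 split are hypothetical.

```python
from collections import Counter


def balanced_class_weights(y):
    """Compute per-class weights as n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical training labels: 90 negatives, 10 positives.
y_train = [0] * 90 + [1] * 10
weights = balanced_class_weights(y_train)

print(weights)
```

Here each minority-class example carries nine times the weight of a majority-class example, which is the inverse of the 9:1 class ratio.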
When to Use Each Approach
Threshold tuning is post-hoc and flexible: you can adapt to business requirements without retraining. Class weighting is baked into training and produces a model whose scores already reflect the weighting, although those scores are no longer calibrated estimates of the true class probability. For the most control, combine both: train with class weights, then tune the threshold on a validation set.