Receiver Operating Characteristic (ROC) and AUC
The ROC curve evaluates a classifier's ability to discriminate between classes across all possible thresholds at once, and the AUC summarises that performance in a single threshold-independent number.
The ROC Curve
The ROC curve plots the True Positive Rate (Recall) on the y-axis against the False Positive Rate on the x-axis as the decision threshold sweeps from 1 to 0. A perfect classifier's curve hugs the top-left corner; a random classifier follows the diagonal.
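The threshold sweep described above can be sketched in plain Python. This is an illustrative sketch, not a library implementation: the helper name `roc_points` and the four-sample toy data are assumptions made for the example.

```python
def roc_points(y_true, y_score):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    # Each distinct score is a candidate threshold; lower it step by step.
    for thr in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical toy data: labels (1 = positive) and predicted scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, y_score)
print(points)
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Each step to the right is a false positive admitted by lowering the threshold; each step up is a true positive recovered.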
Plotting the ROC Curve
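One way to plot the curve is with scikit-learn's `roc_curve` and matplotlib. The sketch below is a minimal example under stated assumptions: the synthetic dataset, the logistic regression model, and the output file name `roc_curve.png` are placeholders, and only the variable names `y_te` and `y_scores` follow the rest of this section.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Assumed setup: a synthetic binary problem and a simple baseline model.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# ROC analysis needs continuous scores, not hard 0/1 predictions.
y_scores = clf.predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, y_scores)
auc = roc_auc_score(y_te, y_scores)

fig, ax = plt.subplots()
ax.plot(fpr, tpr, label=f"model (AUC = {auc:.3f})")
ax.plot([0, 1], [0, 1], linestyle="--", label="random classifier")
ax.set_xlabel("False Positive Rate")
ax.set_ylabel("True Positive Rate (Recall)")
ax.legend()
fig.savefig("roc_curve.png")
```

Passing class probabilities (or `decision_function` output) rather than `clf.predict(X_te)` matters here: hard predictions collapse the curve to a single operating point.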
Area Under the Curve (AUC)
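AUC ranges from 0 to 1.0: 0.5 corresponds to random ranking and 1.0 to a perfect classifier, while values below 0.5 mean the model ranks worse than chance (inverting its scores would give 1 - AUC). It is threshold-independent and equals the probability that the model ranks a randomly chosen positive higher than a randomly chosen negative.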
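The ranking interpretation can be checked directly by counting positive-negative pairs. The following is a small sketch on assumed toy data; `pairwise_auc` is a hypothetical helper written for illustration, and it counts tied scores as half a win, which is the usual convention.

```python
def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two negatives and two positives.
auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

Three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. The quadratic loop is only meant to make the definition concrete; library implementations use a sort-based computation instead.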
Interpreting AUC
As a rough rule of thumb: AUC > 0.9: excellent discrimination. AUC 0.8-0.9: good. AUC 0.7-0.8: fair. AUC 0.5-0.7: poor, barely better than guessing. These bands are conventions rather than hard limits, and what counts as acceptable depends on the application. AUC is particularly valuable when comparing classifiers on the same data or when the deployment threshold is not yet known.
ROC vs. Precision-Recall Curve
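With highly imbalanced datasets the ROC curve can look optimistic because FPR = FP / (FP + TN) has the true negatives in its denominator: when negatives vastly outnumber positives, even a large number of false positives barely moves the FPR. In those cases, the Precision-Recall (PR) curve is more informative. Precision = TP / (TP + FP) ignores true negatives, so the PR curve focuses on the positive (usually minority) class and does not benefit from a large number of true negatives.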
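<pre><code class="language-python">from sklearn.metrics import PrecisionRecallDisplay

# y_te: true test labels; y_scores: predicted scores for the positive class.
# from_predictions draws the curve itself, so no extra .plot() call is needed.
PrecisionRecallDisplay.from_predictions(y_te, y_scores)
</code></pre>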
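To make the imbalance argument concrete, the arithmetic below compares the ROC coordinates with precision at a single threshold. The confusion-matrix counts are invented for illustration (100 positives, 10,000 negatives) and do not come from any real model.

```python
# Hypothetical confusion-matrix counts at one threshold.
tp, fn = 80, 20        # 100 actual positives
fp, tn = 500, 9500     # 10,000 actual negatives

tpr = tp / (tp + fn)         # recall: 0.80
fpr = fp / (fp + tn)         # 0.05, looks excellent on a ROC plot
precision = tp / (tp + fp)   # about 0.138, most alarms are false

print(f"TPR={tpr:.2f}  FPR={fpr:.2f}  precision={precision:.3f}")
# TPR=0.80  FPR=0.05  precision=0.138
```

The point (FPR 0.05, TPR 0.80) sits close to the top-left corner of the ROC plot, yet roughly six out of seven positive predictions are wrong. The PR curve exposes this because precision is computed without the 9,500 true negatives.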