One-Class SVMs for Outlier Detection

One-Class SVMs learn a tight boundary around normal training data in a high-dimensional feature space, classifying new points outside this boundary as outliers.


How One-Class SVM Works

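One-Class SVM (Schölkopf et al., 2001) maps training data into a high-dimensional kernel feature space and finds the hyperplane that separates the data from the origin with maximum margin. A closely related formulation, support vector data description, instead finds the smallest hypersphere that encloses the data; the two are equivalent for the RBF kernel. In both cases the nu parameter controls how many training points may fall outside the boundary.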
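For reference, the hyperplane formulation can be written as the optimization problem below. This is a sketch in notation chosen here rather than taken from the text above: n is the number of training points, Φ the kernel feature map, ξ_i the slack variables, and ρ the offset of the hyperplane from the origin.

```latex
\min_{w,\,\xi,\,\rho}\;\; \frac{1}{2}\lVert w \rVert^{2}
  + \frac{1}{\nu n}\sum_{i=1}^{n} \xi_i - \rho
\quad \text{subject to} \quad
\langle w, \Phi(x_i) \rangle \ge \rho - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n

f(x) = \operatorname{sgn}\bigl(\langle w, \Phi(x) \rangle - \rho\bigr)
```

A point is labeled an inlier when f(x) = +1. The 1/(νn) weight on the slack terms is how nu trades margin size against the number of training points left outside the boundary.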

The nu Parameter

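nu, in (0, 1], is an upper bound on the fraction of training errors (training points treated as outliers) and a lower bound on the fraction of support vectors. A smaller nu allows fewer training points to fall outside the boundary, so the learned region is looser and covers more of the data; a larger nu lets more training points be rejected, which tightens the region around the densest part of the data. Typical values: 0.01 to 0.1. scikit-learn's default is 0.5, so nu usually needs to be set explicitly.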
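The two bounds can be checked empirically. The sketch below is illustrative only: the synthetic data, the helper name nu_fractions, and the chosen nu values are assumptions, not part of the original tutorial. It fits the model for several values of nu and reports the fraction of training points flagged and the fraction of support vectors.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X = rng.randn(500, 2)  # synthetic "normal" data, for illustration only


def nu_fractions(nu):
    """Fit on X; return (fraction of X flagged as outlier, fraction of support vectors)."""
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X)
    frac_outliers = float(np.mean(model.predict(X) == -1))
    frac_support = len(model.support_) / len(X)
    return frac_outliers, frac_support


# Hypothetical nu values chosen to span the "typical" range and beyond
results = {nu: nu_fractions(nu) for nu in (0.01, 0.05, 0.2)}
for nu, (frac_outliers, frac_support) in results.items():
    print(f"nu={nu:<5} flagged in training: {frac_outliers:.3f}  "
          f"support vectors: {frac_support:.3f}")
```

The flagged fraction should track nu from below and the support-vector fraction from above. In practice the flagged fraction can exceed nu by a small amount, because the solver stops at a finite tolerance and points lying on the boundary may land on either side.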

One-Class SVM in scikit-learn

sklearn's OneClassSVM supports linear, polynomial, RBF, and sigmoid kernels (RBF is the default) and is meant to be fit on normal (inlier) data only. Its predict method returns -1 for outliers and 1 for inliers.
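Besides the hard labels, the estimator exposes continuous scores. The following minimal sketch, on made-up data with two hand-picked query points, shows how predict, decision_function, score_samples, and the fitted offset_ attribute relate to each other.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(1)
X_train = rng.randn(300, 2)                 # illustrative inlier data
X_new = np.array([[0.0, 0.0], [6.0, 6.0]])  # one central point, one far away

model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_train)

labels = model.predict(X_new)               # +1 for inliers, -1 for outliers
decision = model.decision_function(X_new)   # signed score; positive means inlier
raw = model.score_samples(X_new)            # unshifted score

# decision_function is score_samples shifted by the learned offset
print("labels:  ", labels)
print("decision:", decision)
print("raw - offset_:", raw - model.offset_)
```

The decision_function values are useful when a custom threshold is needed, for example to trade false alarms against missed outliers instead of using the default cut at zero.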

Training and Detecting Outliers

<pre><code class="language-python">import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

rng = np.random.RandomState(42)
X_train = 0.3 * rng.randn(200, 2)               # normal data for training
X_test_normal = 0.3 * rng.randn(50, 2)          # normal test points
X_test_outliers = rng.uniform(-4, 4, (20, 2))   # outliers

# Fit the scaler on training data only, then apply it to the test sets
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_norm_s = scaler.transform(X_test_normal)
X_out_s = scaler.transform(X_test_outliers)

ocsvm = OneClassSVM(kernel='rbf', gamma='auto', nu=0.05)
ocsvm.fit(X_train_s)

pred_norm = ocsvm.predict(X_norm_s)   # should be mostly 1
pred_out = ocsvm.predict(X_out_s)     # should be mostly -1

print(f"Normal points flagged as outlier: {(pred_norm == -1).sum()}/{len(pred_norm)}")
print(f"Outliers correctly detected: {(pred_out == -1).sum()}/{len(pred_out)}")</code></pre>

Visualizing the Decision Boundary

<pre><code class="language-python"># Evaluate the decision function on a grid in the scaled feature space
xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
Z = ocsvm.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Shade the outlier region (decision function below 0) and draw the boundary at 0
plt.contourf(xx, yy, Z, levels=[Z.min(), 0], colors='lightcoral', alpha=0.5)
plt.contour(xx, yy, Z, levels=[0], colors='red')

plt.scatter(X_train_s[:, 0], X_train_s[:, 1], c='blue', s=10, alpha=0.5, label='Train')
plt.scatter(X_out_s[:, 0], X_out_s[:, 1], c='red', s=60, label='Outliers')
plt.legend()
plt.title('One-Class SVM Decision Boundary')
plt.show()</code></pre>

One-Class SVM vs. Isolation Forest

One-Class SVM suits novelty detection, where the training data is clean and the task is to decide whether new observations fall outside the training distribution. Isolation Forest is generally faster and more scalable for large-scale anomaly detection.

Choosing the Right Method

  • One-Class SVM: Best with clean training data; powerful with RBF kernel for complex boundaries; slow for >10K samples.
  • Isolation Forest: Faster, scales to millions of points, robust to high dimensions, better when training data contains some outliers.
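  • SGD One-Class SVM: Use sklearn.linear_model.SGDOneClassSVM for a linear One-Class SVM trained by stochastic gradient descent at large scale; it can be combined with a kernel approximation such as Nystroem to approximate a kernelized model.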