One-Class SVMs for Outlier Detection
One-Class SVMs learn a tight boundary around normal training data in a high-dimensional feature space, classifying new points outside this boundary as outliers.
How One-Class SVM Works
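One-Class SVM (Schölkopf et al., 2001) implicitly maps the training data into a high-dimensional feature space via a kernel and finds the hyperplane that separates the data from the origin with maximum margin. For kernels where k(x, x) is constant, such as the RBF kernel, this is equivalent to finding the smallest hypersphere that encloses the data. The nu parameter controls how many training points may fall on the wrong side of the boundary.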
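For reference, the hyperplane formulation can be written as the optimization problem below. The notation is introduced here for illustration and does not come from the text above: Φ is the kernel feature map, w and ρ define the hyperplane, ξ_i are slack variables, and n is the number of training points.

```latex
\[
\min_{w,\,\xi,\,\rho} \;\; \frac{1}{2}\lVert w \rVert^{2}
  + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho
\quad \text{subject to} \quad
\langle w, \Phi(x_i) \rangle \ge \rho - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1,\dots,n
\]
\[
f(x) = \operatorname{sgn}\bigl(\langle w, \Phi(x) \rangle - \rho\bigr)
\]
```

A point x is treated as an inlier when f(x) = +1 and as an outlier when f(x) = -1.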
The nu Parameter
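nu (in (0, 1]) is an upper bound on the fraction of training errors (training points treated as outliers) and a lower bound on the fraction of support vectors. A smaller nu allows fewer training points outside the boundary, so the boundary stretches to enclose nearly all of the training data; a larger nu lets more training points fall outside, which pulls the boundary in around the densest region. Typical values: 0.01 to 0.1.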
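The effect of nu can be checked empirically. The sketch below is a minimal illustration on synthetic Gaussian data. The data, the RBF kernel with gamma="scale", and the three nu values are assumptions chosen for the demonstration. It reports the fraction of training points flagged as outliers and the fraction retained as support vectors for each nu.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic "normal" data; purely illustrative.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 2))

nu_results = {}
for nu in (0.01, 0.05, 0.2):
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X_train)

    # Fraction of training points the fitted model labels as outliers (-1).
    frac_outliers = float(np.mean(model.predict(X_train) == -1))
    # Fraction of training points kept as support vectors.
    frac_sv = len(model.support_) / len(X_train)

    nu_results[nu] = (frac_outliers, frac_sv)
    print(f"nu={nu:.2f}  flagged={frac_outliers:.3f}  support vectors={frac_sv:.3f}")
```

The flagged fraction should track nu approximately, while the support-vector fraction should sit at or above it. Small deviations in the flagged fraction are expected because points lying on the boundary can land on either side numerically.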
One-Class SVM in scikit-learn
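scikit-learn's OneClassSVM supports linear, polynomial, RBF, sigmoid, and precomputed kernels and is typically fit on normal (inlier) data only. Its predict method returns -1 for outliers and 1 for inliers; decision_function returns a signed score that is negative for outliers.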
Training and Detecting Outliers
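A minimal sketch of the fit-then-predict workflow follows. The synthetic data, the StandardScaler step, and the hyperparameters (RBF kernel, gamma="scale", nu=0.05) are assumptions made for illustration, not recommended settings.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)

# Training set: inliers only (synthetic, illustrative).
X_train = rng.normal(loc=0.0, scale=1.0, size=(400, 2))

# Test set: 50 points from the same distribution plus 10 far-away points.
X_test_inliers = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
X_test_outliers = rng.uniform(low=6.0, high=9.0, size=(10, 2))
X_test = np.vstack([X_test_inliers, X_test_outliers])

# Scaling matters for RBF kernels because they depend on distances.
detector = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
detector.fit(X_train)

labels = detector.predict(X_test)            # 1 = inlier, -1 = outlier
scores = detector.decision_function(X_test)  # negative = outlier side

print("flagged among test inliers: ", int(np.sum(labels[:50] == -1)))
print("flagged among test outliers:", int(np.sum(labels[50:] == -1)))
```

The decision_function scores can be thresholded at a value other than 0 when a different trade-off between missed outliers and false alarms is needed.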
Visualizing the Decision Boundary
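One way to visualize the boundary in two dimensions is to evaluate decision_function on a grid and draw its zero-level contour. The sketch below assumes matplotlib is available. The two-cluster data, gamma=0.5, nu=0.05, and the output file name ocsvm_boundary.png are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Two synthetic clusters of "normal" points.
X_train = np.vstack([
    rng.normal(loc=-2.0, scale=0.5, size=(150, 2)),
    rng.normal(loc=2.0, scale=0.5, size=(150, 2)),
])

model = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_train)

# Evaluate the signed score on a regular grid covering the data.
xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
grid = np.c_[xx.ravel(), yy.ravel()]
Z = model.decision_function(grid).reshape(xx.shape)

fig, ax = plt.subplots(figsize=(6, 5))
ax.contourf(xx, yy, Z, levels=20, cmap="PuBu")                  # score landscape
ax.contour(xx, yy, Z, levels=[0], colors="red", linewidths=2)   # decision boundary
ax.scatter(X_train[:, 0], X_train[:, 1], s=10, c="white", edgecolors="k")
ax.set_title("One-Class SVM decision boundary (score = 0)")
fig.savefig("ocsvm_boundary.png", dpi=120)
```

Points inside the red contour have a positive score and are treated as inliers; everything outside it is flagged as an outlier.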
One-Class SVM vs. Isolation Forest
One-Class SVM is better suited to novelty detection, where the model is fit on clean training data and then asked whether new observations come from a different distribution. Isolation Forest is generally faster and more scalable for large-scale anomaly detection.
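The two detectors share the same fit/predict interface in scikit-learn, so they can be compared side by side. The sketch below uses synthetic data and illustrative settings (nu=0.05, contamination=0.05, 100 trees), all of which are assumptions. Timings on a dataset this small say little about behavior at scale.

```python
import time

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)

# Clean training data and a labeled evaluation set (synthetic, illustrative).
X_train = rng.normal(size=(2000, 5))
X_new = np.vstack([
    rng.normal(size=(200, 5)),          # inliers
    rng.uniform(5, 8, size=(20, 5)),    # obvious outliers
])
y_true = np.r_[np.ones(200, dtype=int), -np.ones(20, dtype=int)]

models = {
    "OneClassSVM": OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
    "IsolationForest": IsolationForest(
        n_estimators=100, contamination=0.05, random_state=0
    ),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train)
    fit_seconds = time.perf_counter() - start

    pred = model.predict(X_new)  # both return 1 (inlier) or -1 (outlier)
    recall = float(np.mean(pred[y_true == -1] == -1))      # outliers caught
    false_alarm = float(np.mean(pred[y_true == 1] == -1))  # inliers flagged

    results[name] = {"fit_s": fit_seconds, "recall": recall, "false_alarm": false_alarm}
    print(f"{name:16s} fit={fit_seconds:.3f}s  recall={recall:.2f}  false alarms={false_alarm:.2f}")
```

Repeating the run with a larger X_train is a simple way to observe how the training time of each method grows.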
Choosing the Right Method
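- One-Class SVM: Best with clean training data; powerful with an RBF kernel for complex boundaries; training slows down markedly beyond roughly 10K samples, since the cost of kernel SVM training grows faster than linearly with sample count.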
- Isolation Forest: Faster, scales to millions of points, robust to high dimensions, better when training data contains some outliers.
- SGD One-Class SVM: Use sklearn.linear_model.SGDOneClassSVM for a linear approximation at large scale.