SVM Margins and Hyperplanes
The geometry of SVMs — hyperplanes, margins, and support vectors — is what gives the algorithm its theoretical elegance and practical power.
Hyperplanes and Margins
A hyperplane in d dimensions is a flat affine subspace of d-1 dimensions (a line in 2D, a plane in 3D). The SVM places the separating hyperplane wᵀx + b = 0 midway between the two margin planes, wᵀx + b = +1 and wᵀx + b = -1.
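As a minimal sketch of the geometry above, the snippet below evaluates wᵀx + b for a few points and reports where each one falls relative to the separating hyperplane and the two margin planes. The weight vector, bias, and sample points are hypothetical values chosen for illustration, and the helper names `decision_value` and `region` are not from any library.

```python
def decision_value(w, b, x):
    """Return w^T x + b for a single point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def region(w, b, x):
    """Describe where x lies relative to the hyperplane and margin planes."""
    f = decision_value(w, b, x)
    if f >= 1:
        return "on or beyond the +1 margin plane"
    if f <= -1:
        return "on or beyond the -1 margin plane"
    if f == 0:
        return "on the separating hyperplane"
    return "inside the margin"


# Hypothetical 2D hyperplane: x1 + x2 - 3 = 0
w, b = (1.0, 1.0), -3.0
points = [(3.0, 1.0), (1.0, 1.0), (2.0, 1.0), (2.0, 1.5)]

for x in points:
    print(x, decision_value(w, b, x), region(w, b, x))
```

The sign of the decision value gives the predicted class, while its magnitude relative to 1 shows whether the point sits inside or outside the margin.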
Margin Width Formula
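The width of the margin is 2 / ‖w‖, the distance between the two margin planes measured along w. Maximising the margin is therefore equivalent to minimising ‖w‖² subject to every training point lying on or outside the margin plane of its own class: yᵢ(wᵀxᵢ + b) ≥ 1 for all i, where yᵢ is the label (+1 or -1) of point xᵢ. For linearly separable data this is a convex quadratic programming problem with a unique global solution.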
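The width formula can be checked numerically. The sketch below computes 2 / ‖w‖ for a hypothetical weight vector and compares it with the point-to-plane distance from a point on the +1 margin plane to the -1 margin plane. The values of w and b and the helper names are assumptions made for the example.

```python
import math


def norm(w):
    """Euclidean norm of w."""
    return math.sqrt(sum(wi * wi for wi in w))


def margin_width(w):
    """Margin width 2 / ||w||."""
    return 2.0 / norm(w)


def distance_to_plane(w, b, x, offset):
    """Euclidean distance from x to the plane w^T x + b = offset."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(f - offset) / norm(w)


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5
w, b = (3.0, 4.0), -5.0

# (2, 0) lies on the +1 margin plane: 3*2 + 4*0 - 5 = 1
x_plus = (2.0, 0.0)

print(margin_width(w))                         # width from the formula
print(distance_to_plane(w, b, x_plus, -1.0))   # same width, measured directly
```

Both values agree because scaling w up shrinks the distance between the planes wᵀx + b = +1 and wᵀx + b = -1 in the same proportion.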
The Soft-Margin Parameter C
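Real data is rarely linearly separable. The soft-margin SVM introduces slack variables ξᵢ ≥ 0 that allow some points to violate the margin. The objective becomes: minimise ‖w‖²/2 + C Σ ξᵢ subject to yᵢ(wᵀxᵢ + b) ≥ 1 - ξᵢ for every training point, where yᵢ is the label (+1 or -1) of xᵢ. A point with ξᵢ = 0 respects the margin, 0 < ξᵢ ≤ 1 means it lies inside the margin but on the correct side of the hyperplane, and ξᵢ > 1 means it is misclassified.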
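To make the slack variables concrete, the sketch below evaluates the soft-margin objective for a fixed, hand-picked hyperplane. For a given (w, b), the smallest feasible slack for each point is max(0, 1 - yᵢ(wᵀxᵢ + b)), which is what `slack` computes. The data, the hyperplane, and the function names are hypothetical, and nothing here solves the optimisation problem itself.

```python
def slack(w, b, x, y):
    """Smallest slack that satisfies y * (w^T x + b) >= 1 - slack."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * f)


def soft_margin_objective(w, b, X, Y, C):
    """Evaluate ||w||^2 / 2 + C * sum of slacks for a fixed (w, b)."""
    regulariser = 0.5 * sum(wi * wi for wi in w)
    total_slack = sum(slack(w, b, x, y) for x, y in zip(X, Y))
    return regulariser + C * total_slack


# Hypothetical hyperplane and labelled points
w, b = (1.0, 1.0), -3.0
X = [(3.0, 2.0), (2.0, 1.5), (2.5, 1.0), (1.0, 0.5)]
Y = [+1, +1, -1, -1]

slacks = [slack(w, b, x, y) for x, y in zip(X, Y)]
print(slacks)  # zero for points that respect the margin, positive otherwise
print(soft_margin_objective(w, b, X, Y, C=1.0))
```

In this data the second point is correctly classified but sits inside the margin (slack between 0 and 1), and the third point is on the wrong side of the hyperplane (slack above 1).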
C: Inverse Regularisation Strength
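A large C penalises margin violations heavily: the model tries hard to classify every training point correctly, which gives a narrower margin (low bias, high variance) and a higher risk of overfitting. A small C tolerates more violations in exchange for a wider margin (higher bias, lower variance), which often generalises better on noisy data but can underfit if C is too small. In this sense C acts as an inverse regularisation strength. C is the primary hyperparameter to tune in an SVM, usually by cross-validation.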
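The trade-off can be illustrated without a solver by comparing two hand-picked candidate hyperplanes on a tiny 1D dataset: a wide-margin one that leaves one point violating the margin, and a narrow-margin one that separates every point. This is only a sketch under stated assumptions: the data, both candidates, and the helper names are hypothetical, and the candidates are not claimed to be the true optima for either value of C.

```python
def objective(w, b, xs, ys, C):
    """Soft-margin objective w^2 / 2 + C * sum of slacks for 1D data."""
    total_slack = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in zip(xs, ys))
    return 0.5 * w * w + C * total_slack


# Hypothetical 1D data: the positive point at 1.5 sits close to the negatives
xs = [0.0, 1.0, 1.5, 5.0, 6.0]
ys = [-1, -1, +1, +1, +1]

# Candidate hyperplanes as (w, b) pairs
candidates = {
    "wide": (0.5, -1.5),     # margin planes at x = 1 and x = 5, width 4
    "narrow": (4.0, -5.0),   # margin planes at x = 1 and x = 1.5, width 0.5
}


def preferred(C):
    """Name of the candidate with the lower objective for this C."""
    return min(candidates, key=lambda name: objective(*candidates[name], xs, ys, C))


for C in (0.1, 100.0):
    values = {name: objective(w, b, xs, ys, C) for name, (w, b) in candidates.items()}
    print(C, values, "->", preferred(C))
```

With the small C the wide candidate has the lower objective despite its violation, while with the large C the cost of that violation dominates and the narrow, violation-free candidate wins.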