SVM Margins and Hyperplanes

The geometry of SVMs — hyperplanes, margins, and support vectors — is what gives the algorithm its theoretical elegance and practical power.

Hyperplanes and Margins

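A hyperplane in d dimensions is a flat (affine) subspace of d-1 dimensions (a line in 2D, a plane in 3D). It is the set of points x satisfying wᵀx + b = 0, where w is the normal vector and b is the offset. SVM places the separating hyperplane wᵀx + b = 0 midway between two parallel margin planes, wᵀx + b = +1 and wᵀx + b = -1. The training points that lie on the margin planes are the support vectors.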
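As a minimal sketch of these three planes, assume a made-up 2D weight vector w and offset b (hypothetical values, not taken from a fitted model). The value of wᵀx + b tells which plane a point lies on and which side of the hyperplane it falls on:

```python
import numpy as np

# Hypothetical hyperplane parameters in 2D, chosen by hand for illustration.
w = np.array([2.0, 1.0])
b = -4.0

points = np.array([
    [2.0, 0.0],   # w.x + b =  0: on the separating hyperplane
    [2.5, 0.0],   # w.x + b = +1: on the positive margin plane
    [1.0, 1.0],   # w.x + b = -1: on the negative margin plane
    [3.0, 2.0],   # w.x + b = +4: well inside the positive side
])

scores = points @ w + b                  # decision values w.x + b
labels = np.sign(scores)                 # predicted side (0 means "on the hyperplane")
distances = scores / np.linalg.norm(w)   # signed Euclidean distance to the hyperplane

for p, s, d in zip(points, scores, distances):
    print(f"x={p}, w.x+b={s:+.1f}, signed distance={d:+.3f}")
```

Dividing the decision value by ‖w‖ converts it into a geometric distance, which is why the norm of w appears in the margin width.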

Margin Width Formula

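The width of the margin is 2 / ‖w‖: the planes wᵀx + b = +1 and wᵀx + b = -1 differ by 2 in decision value, and dividing by ‖w‖ converts that difference into a Euclidean distance. Maximising the margin is therefore equivalent to minimising ‖w‖² subject to every training point lying on or outside the margin plane of its own class, that is yᵢ(wᵀxᵢ + b) ≥ 1 for all i, with labels yᵢ ∈ {-1, +1}. This is a convex quadratic programming problem; when the data is linearly separable it has a unique global solution.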
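The formula can be checked on a toy problem. The sketch below assumes four made-up points whose classes are separated by a gap of exactly 2, and uses scikit-learn's SVC with a very large C as a stand-in for the hard-margin SVM (the value 1e6 is an assumption, not a canonical choice). The computed width should come out close to 2, up to solver tolerance:

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration):
# class -1 sits on the line x1 = 0, class +1 on the line x1 = 2.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# A very large C approximates the hard-margin problem.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset
margin_width = 2.0 / np.linalg.norm(w)

print(f"w = {w}, b = {b:.3f}")
print(f"margin width = {margin_width:.3f}")
```

The attribute coef_ is only available for the linear kernel, so this check does not carry over directly to kernelised models.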

The Soft-Margin Parameter C

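Real data is rarely linearly separable. The soft-margin SVM introduces slack variables ξᵢ ≥ 0 that allow some points to violate the margin. The objective becomes: minimise ‖w‖²/2 + C Σᵢ ξᵢ subject to yᵢ(wᵀxᵢ + b) ≥ 1 - ξᵢ, where yᵢ is the ±1 label of point xᵢ. A point with ξᵢ = 0 respects the margin, a point with 0 < ξᵢ ≤ 1 lies inside the margin but is still correctly classified, and a point with ξᵢ > 1 is misclassified.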
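For a fixed hyperplane, the smallest slack that satisfies the constraint is ξᵢ = max(0, 1 - yᵢ(wᵀxᵢ + b)). The sketch below evaluates that expression and the soft-margin objective for a hand-picked w, b, C, and five made-up points (all hypothetical, and w is not claimed to be optimal for this data):

```python
import numpy as np

# Hypothetical hyperplane, penalty, and data, chosen by hand for illustration.
w = np.array([1.0, 0.0])
b = -1.0
C = 1.0

X = np.array([
    [0.0, 0.0],   # class -1, exactly on its margin plane
    [2.0, 0.0],   # class +1, exactly on its margin plane
    [0.5, 1.0],   # class -1, inside the margin but on the correct side
    [1.5, 1.0],   # class +1, inside the margin but on the correct side
    [1.2, 0.0],   # class -1, on the wrong side of the hyperplane
])
y = np.array([-1, 1, -1, 1, -1])

functional_margin = y * (X @ w + b)
slack = np.maximum(0.0, 1.0 - functional_margin)   # xi_i for each point

# Soft-margin objective: ||w||^2 / 2 + C * sum of slacks.
objective = 0.5 * np.dot(w, w) + C * slack.sum()

print("slack:", slack)
print("objective:", objective)
```

Only the last three points contribute to the penalty term, and the misclassified point contributes the most.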

C: Inverse Regularisation Strength

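C acts as an inverse regularisation strength. A large C penalises margin violations heavily: the model tries hard to classify every training point correctly, which gives a narrower margin (lower bias, higher variance). A small C tolerates more violations, which gives a wider margin and a more heavily regularised model (higher bias, lower variance). A wider margin often generalises better, but a C that is too small underfits. C is the primary hyperparameter to tune in SVM.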
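The effect of C on the margin can be observed directly. This is a rough sketch on synthetic, overlapping data generated with make_blobs (the dataset, its parameters, and the three C values are assumptions made for illustration). For a linear kernel the margin width 2 / ‖w‖ should not grow as C increases:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data with overlapping clusters (illustrative only).
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

widths = {}
for C in [0.01, 1, 100]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    widths[C] = 2.0 / np.linalg.norm(clf.coef_[0])   # margin width for this C
    n_sv = int(clf.n_support_.sum())                 # support vectors across both classes
    print(f"C={C:<6} margin width={widths[C]:.3f}  support vectors={n_sv}")
```

Points inside or on the margin become support vectors, so a wide margin on overlapping data typically comes with many of them.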

Tuning C with Grid Search

<pre><code class="language-python">from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)

# Scaling lives inside the pipeline, so each CV fold is standardised
# using statistics from its own training split only.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# make_pipeline names the SVC step "svc", hence the "svc__C" key.
param_grid = {"svc__C": [0.01, 0.1, 1, 10, 100]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(f"Best C: {search.best_params_['svc__C']}")
print(f"CV Accuracy: {search.best_score_:.4f}")</code></pre>