A Support Vector Machine (SVM) is a supervised machine learning algorithm for binary classification. It finds the optimal linear decision boundary (hyperplane) that separates two classes with the maximum margin.
Given labelled data: \(\{(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)\}\) where \(y_i \in \{-1, +1\}\)
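The SVM finds a hyperplane \(\mathbf{w} \cdot \mathbf{x} + b = 0\) separating the two classes, where \(\mathbf{w}\) is the normal (weight) vector and \(b\) is the bias.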
Classification rule: \(\hat{y} = \text{sign}(\mathbf{w} \cdot \mathbf{x} + b)\)
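The classification rule can be sketched in a few lines of Python. This is only an illustration: the function name `predict`, the example values of \(\mathbf{w}\) and \(b\), and the choice to label a point exactly on the hyperplane as \(+1\) are assumptions, not part of the definition above.

```python
def predict(w, x, b):
    """Classify x as +1 or -1 using the sign of w . x + b."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumption: a point exactly on the hyperplane (score == 0) is labelled +1.
    return 1 if score >= 0 else -1

# Hypothetical 2D classifier: w = (1, 1), b = -3
w, b = (1.0, 1.0), -3.0
label_a = predict(w, (3.0, 2.0), b)  # score = 3 + 2 - 3 = 2, positive side
label_b = predict(w, (1.0, 1.0), b)  # score = 1 + 1 - 3 = -1, negative side
```

Only the sign of the score is used for the label; its size indicates how far the point lies from the boundary, scaled by \(\|\mathbf{w}\|\).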
The margin is the perpendicular width of the gap between the two classes: twice the distance from the decision boundary to the nearest training point of either class.
A wider margin generally leads to better generalisation.
Support vectors are the training points lying exactly on the margin boundaries:
\[y_i(\mathbf{w} \cdot \mathbf{x}_i + b) = 1\]
These are the only training points that determine the decision boundary. All other points can be removed without changing the classifier.
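A 1D example makes this concrete. For separable 1D data, the maximum-margin boundary is the midpoint between the closest pair of opposite-class points, so those two points are the support vectors. The sketch below assumes every point labelled \(-1\) lies to the left of every point labelled \(+1\); the function name and the data are made up for illustration.

```python
def max_margin_boundary_1d(xs, ys):
    """Return (threshold, margin_width, support_vectors) for separable 1D data.

    Assumes every point labelled -1 lies to the left of every point labelled +1.
    """
    neg = [x for x, y in zip(xs, ys) if y == -1]
    pos = [x for x, y in zip(xs, ys) if y == +1]
    left, right = max(neg), min(pos)  # nearest point from each class
    if left >= right:
        raise ValueError("data is not separable in the assumed orientation")
    threshold = (left + right) / 2    # boundary sits midway between them
    return threshold, right - left, (left, right)

xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
ys = [-1, -1, -1, +1, +1, +1]
full = max_margin_boundary_1d(xs, ys)

# Keep only the two support vectors (x = 3 and x = 7) and refit.
reduced = max_margin_boundary_1d([3.0, 7.0], [-1, +1])

print(full)     # (5.0, 4.0, (3.0, 7.0))
print(reduced)  # (5.0, 4.0, (3.0, 7.0))
```

Both fits give the same boundary at \(x = 5\) with a margin of width 4, even though four of the six training points were discarded.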
SVM solves:
\[\text{minimise} \;\frac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to} \quad y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 \;\; \forall i\]
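The margin boundaries \(\mathbf{w} \cdot \mathbf{x} + b = +1\) and \(\mathbf{w} \cdot \mathbf{x} + b = -1\) are parallel hyperplanes a perpendicular distance \(\frac{2}{\|\mathbf{w}\|}\) apart, so minimising \(\|\mathbf{w}\|^2\) maximises the margin \(\frac{2}{\|\mathbf{w}\|}\).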
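To see the constraints and the margin formula at work, the sketch below checks a candidate \((\mathbf{w}, b)\) against a small 2D data set. The data, the candidate values and the helper names are all hypothetical. The code only verifies that a given solution is feasible and reads off its margin; it does not solve the optimisation problem.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def constraint_values(w, b, points, labels):
    """Return y_i * (w . x_i + b) for every training point."""
    return [y * (dot(w, x) + b) for x, y in zip(points, labels)]

# Hypothetical 2D training set and candidate solution
points = [(2.0, 2.0), (3.0, 3.0), (1.0, 1.0), (0.0, 0.0)]
labels = [+1, +1, -1, -1]
w, b = (1.0, 1.0), -3.0

values = constraint_values(w, b, points, labels)  # [1.0, 3.0, 1.0, 3.0]
feasible = all(v >= 1 for v in values)            # every constraint holds
objective = 0.5 * dot(w, w)                       # (1/2) * ||w||^2 = 1.0
margin = 2 / math.sqrt(dot(w, w))                 # 2 / ||w|| = sqrt(2)

# Support vectors are the points where the constraint holds with equality.
support = [x for x, v in zip(points, values) if math.isclose(v, 1.0)]
```

Here the points \((2, 2)\) and \((1, 1)\) satisfy the constraint with equality, and the distance between them, \(\sqrt{2}\), matches the margin \(\frac{2}{\|\mathbf{w}\|}\).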
KEY TAKEAWAY: The SVM finds the hyperplane that maximises the margin between the two classes. The support vectors are the only training points that define the boundary — they are the most informative data points.
Not all data is linearly separable. Solutions:
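- Soft-margin SVM: Allow some points to violate the margin or be misclassified. The hyperparameter \(C\) sets the penalty for violations: a larger \(C\) tolerates fewer of them.
- Kernel trick: Implicitly map the data to a higher-dimensional space where linear separation is possible.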
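The idea behind mapping to higher dimensions can be illustrated with a small 1D data set that no single threshold can separate. The sketch below uses an explicit feature map \(x \mapsto (x, x^2)\), chosen here purely for illustration. A kernel achieves the same effect without building the mapped features explicitly. The data, the helper names and the hyperplane in the mapped space are all assumptions.

```python
def separable_by_threshold(xs, ys):
    """True if some threshold t and orientation s classify every 1D point correctly."""
    pts = sorted(xs)
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [pts[0] - 1] + [(p + q) / 2 for p, q in zip(pts, pts[1:])] + [pts[-1] + 1]
    for t in candidates:
        for s in (+1, -1):
            if all((1 if s * (x - t) > 0 else -1) == y for x, y in zip(xs, ys)):
                return True
    return False

def feature_map(x):
    """Illustrative map from 1D to 2D: x -> (x, x^2)."""
    return (x, x * x)

xs = [-2.0, -1.0, 1.0, 2.0]
ys = [+1, -1, -1, +1]          # outer points positive, inner points negative

in_1d = separable_by_threshold(xs, ys)  # False: no threshold works

# In the mapped space, the hyperplane x^2 = 2.5 (w = (0, 1), b = -2.5) separates the classes.
w, b = (0.0, 1.0), -2.5
preds = [1 if w[0] * f[0] + w[1] * f[1] + b >= 0 else -1 for f in map(feature_map, xs)]
```

After the mapping, the predictions match the labels exactly, so a linear boundary in the 2D feature space corresponds to a non-linear rule (\(x^2 > 2.5\)) in the original 1D space.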
| Concept | Description |
|---|---|
| Hyperplane | Decision boundary |
| Margin | Width of the gap between classes |
| Support vectors | Points on the margin boundary |
| Maximise margin | SVM’s training objective |
| \(\|\mathbf{w}\|^2\) | Minimised to maximise margin |
EXAM TIP: For VCAA, understand the geometric concepts: margin, support vectors, decision boundary. Know the optimisation goal (maximise margin = minimise \(\|\mathbf{w}\|^2\)). You do not need to solve the full quadratic programming problem.
COMMON MISTAKE: Assuming that every training point shapes the boundary. The SVM is defined by the support vectors only, not all training points. If you removed all non-support-vector training data, the decision boundary would not change.
VCAA FOCUS: Know the definition of margin, the role of support vectors, and why maximising the margin leads to good generalisation. Apply SVM concepts geometrically to 1D and 2D data.