Mean Shift Clustering
Mean Shift is a non-parametric clustering algorithm that iteratively shifts each point toward the densest region in its neighborhood, converging to cluster modes without requiring K to be specified.
The Mean Shift Procedure
Each point is shifted toward the mean of its neighbors within a window of radius (bandwidth). This process repeats until convergence, with points converging to the same mode forming a cluster.
Algorithm Steps
- For each point
x_i, define a kernel window of radius h. - Compute the weighted mean of all points within the window.
- Shift
x_ito this mean. - Repeat until convergence (change < threshold).
- Points with the same convergence mode are assigned to the same cluster.
Mean Shift in sklearn
scikit-learn provides MeanShift with automatic bandwidth estimation via estimate_bandwidth.
Training and Predicting
Bandwidth Selection
estimate_bandwidth uses a quantile of pairwise distances to estimate the appropriate window size. Smaller quantile → smaller bandwidth → more clusters. Tune the quantile (0.1 to 0.5) as the main hyperparameter.
Strengths and Limitations
Mean Shift automatically determines cluster count, handles arbitrary cluster shapes, and is robust to outliers. Its main drawbacks are O(N\u00b2) complexity and sensitivity to bandwidth choice.
When to Use Mean Shift
Mean Shift is ideal for image segmentation and applications where the data density naturally defines clusters. For large datasets (>10K points), it becomes slow unless approximations like bin_seeding=True are used. DBSCAN or HDBSCAN are faster alternatives for large-scale density-based clustering.