Hierarchical Clustering Fundamentals
Hierarchical clustering builds a multi-level tree of clusters (dendrogram) that reveals data structure at multiple granularities without requiring a predetermined number of clusters.
How Hierarchical Clustering Works
At each step, the two most similar clusters (or points) are merged (agglomerative) or the least-cohesive cluster is split (divisive), building a full hierarchy from N individual points to one cluster.
Linkage Criteria
- Single linkage: Distance between nearest points in two clusters. Sensitive to outliers.
- Complete linkage: Distance between farthest points. Produces compact clusters.
- Average linkage: Average pairwise distance. Balanced trade-off.
- Ward linkage: Minimizes total within-cluster variance increase. Usually best for compact, equal-size clusters.
AgglomerativeClustering in sklearn
scikit-learn implements bottom-up agglomerative clustering efficiently, supporting different linkage criteria and affinity metrics.
Fitting and Predicting
Choosing Without Specifying n_clusters
Set n_clusters=None and use distance_threshold to cut the dendrogram at a specific height. This is useful when you want to explore the hierarchy without committing to a fixed number of clusters.
Time and Space Complexity
Naive agglomerative clustering has O(N\u00b3) time complexity and O(N\u00b2) space due to the distance matrix. Ward linkage with scikit-learn's connectivity constraints can scale to larger datasets.