Hierarchical Clustering and Interaction
- Author(s): Yenugula, Madhavi
- Advisor(s): Dasgupta, Sanjoy
- et al.
Hierarchical clustering is a clustering method that defines clusters on the data at various granularities, ranging from a single cluster containing all input data points down to clusters containing individual points. Any desired number of clusters can be obtained by cutting the hierarchy at some level and merging the nodes of each pruned branch into a cluster.
The clusters at any given level of the hierarchy depend on the clusters formed at the previous level, and hierarchical clustering approaches operate greedily, without backtracking. The final hierarchy is often not what the user expects, but it can be improved by providing feedback. This work studies various ways of interacting with the hierarchy: providing feedback to the hierarchy and incorporating that feedback into it. We discuss metrics that quantify the quality of a hierarchy. We apply the designed feedback mechanism to datasets with different attribute types, and we report the results of applying these methods and the resulting improvements in the hierarchies as measured by the defined metrics.
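To illustrate how a hierarchy yields any desired number of clusters, the following minimal sketch builds a hierarchy greedily and then cuts it at a chosen level. It is not the method studied in this work: the bottom-up single-linkage merge rule, the Euclidean distance, the toy points, and the names `dist`, `build_hierarchy`, and `cut` are all assumptions made for illustration.

```python
from itertools import combinations


def dist(p, q):
    """Euclidean distance between two points (an assumed choice of metric)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def build_hierarchy(points):
    """Greedy bottom-up clustering with single linkage (assumed merge rule).

    Returns the merge history as a list of (cluster_a, cluster_b) pairs,
    where each cluster is a frozenset of point indices. Merges are never
    revisited, so each level depends on the levels below it.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters whose closest members are nearest.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                dist(points[i], points[j]) for i in pair[0] for j in pair[1]
            ),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((a, b))
    return merges


def cut(points, merges, k):
    """Cut the hierarchy so that exactly k clusters remain.

    Replaying the first n - k merges from n singletons leaves k clusters.
    """
    clusters = {frozenset([i]) for i in range(len(points))}
    for a, b in merges[: len(points) - k]:
        clusters -= {a, b}
        clusters.add(a | b)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: two tight pairs and one distant point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
merges = build_hierarchy(points)
three_clusters = cut(points, merges, 3)
two_clusters = cut(points, merges, 2)
```

Cutting the same merge history at different values of `k` gives coarser or finer clusterings without rebuilding the hierarchy. Because earlier merges are fixed, a poor early merge propagates to every coarser level.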