An Algorithm for Hierarchical Clustering - Experiments and Analysis
We present and analyze a new algorithm idea to hierarchically cluster data, and show that it performs well compared to other established hierarchical clustering methods when evaluated based on Dasgupta’s cost function. We prove that in one iteration, our algorithm finds the optimal hierarchical clustering tree for the given order of the points, when limited to hierarchical clustering trees the leaves of which have the same ordering as said points.
Additionally, we show that the cost of the found tree can only improve or remain the same with each iteration. Thus, the presented algorithm can be used for hierarchical clustering tasks, or on the results of other hierarchical clustering algorithms to improve those, and leaves room for theoretical improvements and experimentation.