Solution Path Clustering with Robust Loss and Concave Penalty
- Author(s): Schuberg, Edward;
- Advisor(s): Yao, Weixin;
- Ma, Shujie
- et al.
The main purpose of this dissertation is to demonstrate that using a robust loss function (instead of the usual least squares loss) improves the clustering quality in the solution path clustering scheme. Cluster analysis simultaneously attempts to determine the number of clusters and estimate cluster location and membership. Convex clustering, distinguishing itself from other popular clustering methods, casts the clustering objective as a convex optimization problem and thus admits a global solution. It is a useful exploratory technique which outputs a solution path, evoking the name, ``solution path clustering." The solution path is a tree-like structure with cluster results ranging from n clusters down to a single cluster. Now, the benefits of convex clustering come at a cost since the use of a convex penalty can seriously bias the results and ruin the search for good cluster results. To lessen the bias, Ma and Huang (2017) proposed concave penalties to form the cluster centers. While the clustering objective is no longer convex, the quality of the solutions is improved.
We extend the solution path clustering scheme by implementing robust loss functions instead of the usual least squares loss. Following Ma and Huang (2017), we also use a concave penalty to form clusters. The robust loss and concave penalty work together to mitigate the influence of outliers and minimize bias in the estimation of cluster locations, especially when the true distance between clusters is large. We introduce the IRLS-ADMM algorithm to minimize our proposed objective function and prove its convergence to a local minimum. Any loss function that admits an IRLS formulation or a majorizing surrogate can be used. We also study asymptotic and oracle properties of the estimator. Finally, we demonstrate the performance of our proposed method through simulation experiments and on real data sets, as well as provide some preliminary results on choosing the number of clusters via the modified BIC (Wang, Li, and Leng, 2009).