Modern high-dimensional data poses serious difficulties for many learning tasks. However, most high-dimensional problems exhibit manifold structure, i.e., only a few degrees of freedom matter for the task at hand. Exploiting this intrinsic structure is key to designing accurate and efficient learning algorithms. In this thesis, we demonstrate the use of mean-shift, a popular mode-finding and clustering algorithm, for learning problems that involve manifold structure. In particular, we propose several new algorithms based on the mean-shift update for the tasks of manifold denoising, matrix completion, and centroid-based clustering.
The first algorithm, manifold blurring mean-shift (MBMS), is of the predictor-corrector type. It alternates a predictor step (a blurring mean-shift step that acts as an isotropic low-pass filter) with a corrector step (a projection that removes the shrinkage the predictor causes along the manifold), where the manifold structure is estimated by local PCA. The algorithm achieves anisotropic denoising, can serve as a preprocessing step for dimensionality reduction and classification, and improves significantly over previous manifold denoising algorithms. Furthermore, we apply MBMS to matrix completion (missing-value) problems by iteratively denoising the whole dataset and restoring the observed entries. In contrast to the popular low-rank approaches, which rest on a globally linear assumption, our algorithm preserves locally linear structure and therefore remains valid when the data is globally nonlinear. This simple approach offers a fresh view of the matrix completion problem and improves greatly over several previous approaches.
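The predictor-corrector iteration described above can be sketched as follows. This is a minimal illustration under assumed choices (a Gaussian kernel, k-nearest-neighbor local PCA, and the function name and default parameters are hypothetical), not the thesis's exact implementation:

```python
import numpy as np

def mbms_step(X, sigma=0.5, k=20, d=1):
    """One sketched MBMS iteration: a blurring mean-shift predictor followed
    by a local-PCA corrector that keeps only the motion orthogonal to the
    estimated d-dimensional tangent space."""
    X_new = np.empty_like(X)
    for i in range(len(X)):
        diff = X - X[i]
        dist2 = (diff ** 2).sum(axis=1)
        # predictor: Gaussian blurring mean-shift motion (isotropic smoothing)
        w = np.exp(-dist2 / (2.0 * sigma ** 2))
        shift = (w[:, None] * X).sum(axis=0) / w.sum() - X[i]
        # estimate the local tangent space by PCA on the k nearest neighbors
        nbrs = X[np.argsort(dist2)[:k]]
        centered = nbrs - nbrs.mean(axis=0)
        U = np.linalg.svd(centered, full_matrices=False)[2][:d].T  # D x d basis
        # corrector: remove the tangential component of the motion so the
        # data is denoised orthogonally to the manifold, without shrinkage
        X_new[i] = X[i] + shift - U @ (U.T @ shift)
    return X_new

# toy demo: a noisy 1-D manifold (a line) embedded in 2-D
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
X = np.column_stack([x, 0.1 * rng.standard_normal(200)])
X_denoised = mbms_step(X, sigma=0.5, k=20, d=1)
```

On this toy line, the corrector leaves the spread along the manifold (the x-axis) essentially untouched while the orthogonal noise in y shrinks, which is the anisotropic behavior the paragraph describes.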
We also propose two new, mode-based clustering algorithms. The first, which we call the K-modes algorithm, partitions a dataset into a pre-specified number of clusters and provides a representative centroid for each. Each centroid is a mode of the kernel density estimate defined by its cluster and is therefore located in a high-density area. The algorithm is computationally inexpensive and more robust than K-means and mean-shift. We then relax the hard partition rule of K-modes into a continuous one and impose a Laplacian smoothing penalty so that similar input samples receive similar assignments. The resulting algorithm, which we call the Laplacian K-modes algorithm, handles non-convex, complex-shaped clusters, admits an efficient optimization procedure, and shares attractive properties with several well-known clustering algorithms.
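The K-modes idea of alternating hard assignments with mode-seeking centroid updates can be sketched as below. This is an illustrative toy version under assumed choices (a Gaussian kernel, one mean-shift update per centroid per iteration, and hypothetical function names and defaults), not the thesis's exact algorithm:

```python
import numpy as np

def k_modes(X, K, sigma=1.0, n_iter=20, init_idx=None):
    """Sketched K-modes: alternate hard assignments with mean-shift updates
    of each centroid, so every centroid climbs toward a mode of its own
    cluster's kernel density estimate."""
    if init_idx is None:
        init_idx = np.random.default_rng(0).choice(len(X), K, replace=False)
    C = X[np.asarray(init_idx)].astype(float).copy()
    for _ in range(n_iter):
        # assignment step: with a Gaussian kernel, the largest kernel value
        # corresponds to the nearest centroid
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        z = d2.argmin(axis=1)
        # centroid step: one mean-shift update per cluster's density estimate
        for k in range(K):
            pts = X[z == k]
            if len(pts) == 0:
                continue
            w = np.exp(-((pts - C[k]) ** 2).sum(axis=1) / (2.0 * sigma ** 2))
            C[k] = (w[:, None] * pts).sum(axis=0) / w.sum()
    return C, z

# toy demo: two well-separated Gaussian blobs, one initial centroid in each
rng = np.random.default_rng(1)
A = rng.normal([0.0, 0.0], 0.3, size=(100, 2))
B = rng.normal([5.0, 5.0], 0.3, size=(100, 2))
X = np.vstack([A, B])
C, z = k_modes(X, K=2, sigma=1.0, n_iter=20, init_idx=[0, 100])
```

Because each centroid is updated only with kernel-weighted points from its own cluster, it settles in a high-density region of that cluster rather than at a possibly low-density arithmetic mean, which is the robustness property claimed above.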
The proposed mean-shift algorithms are simple and easy to implement, yet they outperform previous approaches on the tasks under consideration. We demonstrate them on a variety of high-dimensional datasets from different domains.