Analysis and Application of Graph-Based Semi-Supervised Learning Methods
- Author(s): LUO, XIYANG
- Advisor(s): Bertozzi, Andrea L
- et al.
In recent years, the need for pattern recognition and data analysis has grown exponentially in various fields of scientific research. My research is centered around graph Laplacian based techniques for image processing and machine learning. Three papers pertaining to this theme are presented in this thesis.

The first work is an application of graph Laplacian regularization to the problem of convolutional sparse coding. The additional regularization improves the robustness of the sparse representation with respect to noise, and has been shown empirically to improve denoising performance on several well-known images. Efficient algorithms for computing the eigen-decomposition of the graph Laplacian were also incorporated into the solver for fast implementations of the method.

The second piece of work studies the convergence of the graph Allen-Cahn scheme. A technique inspired by the maximum principle for the heat equation is used to show stability of the convex-splitting numerical scheme. This, coupled with techniques from convex optimization, allows for a proof of convergence under an a posteriori condition. The analysis is then generalized to handle spectral truncation, a common method to save computational cost, and also to the case of multi-class classification. In particular, the results for spectral truncation differ drastically from those for the original scheme in the worst case, although this worst-case behavior does not arise in practical applications.

The third piece of work combines two fields of research: uncertainty quantification and semi-supervised learning on graphs. The work presents a unified Bayesian framework that incorporates most previous methods for graph-based semi-supervised learning. A Bayesian framework allows for the computation of uncertainty for certain quantities under the posterior distribution. We show via solid numerical evidence that, for a few carefully designed quantities, the expectations computed under the posterior yield meaningful notions of uncertainty for the classification problem. Efficient numerical methods were also devised to make possible the evaluation of these quantities for large-scale graphs.
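
As an illustration of the convex-splitting scheme for the graph Allen-Cahn equation studied in the second work, the following is a minimal Python sketch and not the implementation used in the thesis. It assumes a Gaussian-weighted unnormalized graph Laplacian, the double-well potential W(u) = (u^2 - 1)^2 / 4, a quadratic fidelity term on the labeled nodes, and a splitting constant c = 2/eps + mu. The function names and all parameter values (`sigma`, `eps`, `dt`, `mu`, `n_steps`, `n_eig`) are hypothetical choices made for this example. The implicit part of each step is solved in the eigenbasis of the Laplacian, and the optional `n_eig` argument keeps only the lowest eigenpairs, which corresponds to spectral truncation.

```python
import numpy as np


def graph_laplacian(points, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W with Gaussian edge weights."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return np.diag(weights.sum(axis=1)) - weights


def allen_cahn_classify(L, labels, labeled_mask, eps=1.0, dt=0.1, mu=1.0,
                        n_steps=200, n_eig=None):
    """Binary classification by convex-splitting graph Allen-Cahn iterations.

    Each step solves
        (1 + dt * (eps * L + c)) u_new = u - dt * (W'(u) / eps - c * u + fidelity)
    in the eigenbasis of L, where W'(u) = u**3 - u.
    """
    lam, V = np.linalg.eigh(L)            # eigenvalues in ascending order
    if n_eig is not None:                 # optional spectral truncation
        lam, V = lam[:n_eig], V[:, :n_eig]

    c = 2.0 / eps + mu                    # splitting constant (assumed choice)
    y = np.where(labeled_mask, labels, 0.0)
    u = np.zeros(L.shape[0])
    denom = 1.0 + dt * (eps * lam + c)

    for _ in range(n_steps):
        # Terms treated explicitly: double-well force, splitting shift, fidelity.
        explicit = (u ** 3 - u) / eps - c * u + mu * labeled_mask * (u - y)
        coeffs = V.T @ (u - dt * explicit)
        u = V @ (coeffs / denom)          # implicit diffusion step
    return u


# Two well-separated clusters with one labeled node each (synthetic data).
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0.0, 0.3, (10, 2)),
                    rng.normal(5.0, 0.3, (10, 2))])
labels = np.concatenate([np.ones(10), -np.ones(10)])
labeled_mask = np.zeros(20, dtype=bool)
labeled_mask[[0, 10]] = True

u = allen_cahn_classify(graph_laplacian(points), labels, labeled_mask)
predicted = np.sign(u)                    # class assignment by thresholding at 0
```

With the full spectrum, each step applies the inverse of a strictly diagonally dominant matrix with non-positive off-diagonal entries, so the iterates in this sketch stay in [-1, 1] when started there. That monotonicity is lost once `n_eig` truncates the eigenbasis, which is one way to see why the truncated scheme needs a separate analysis.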