Recent improvements in machine learning methods have significantly advanced many fields, including computer vision and natural language processing. As models have become increasingly complex, they have also become harder to interpret and to trust. This thesis aims to improve or solve machine learning problems with clear guidance from statistics, and to design algorithms that can be supported or explained by statistics. I show that statistics can be applied from various angles and can play different roles.
In the first project, I use a heuristic search method to design a simple but effective neural network compression method, together with a theoretical understanding of how the search works. In the second project, optimization based on Kullback-Leibler divergence is applied to fine-tune the output distribution of an image segmentation module and improve segmentation quality. In the third project, I show how hypothesis testing can help measure the similarity between datasets, a question that is crucial to a better understanding of machine learning.
Beyond these applications to machine learning algorithms, real-life projects also pose engineering challenges. In the fourth project, I summarize my contributions to the Healthy Davis Together project, including full details of the systems I built, the logic behind those systems, and a mobility variable I designed.