Building Trustworthy Machine Learning Models
- Author(s): Liu, Xuanqing
- Advisor(s): Hsieh, Cho-Jui
- et al.
How and when can we depend on machine learning systems to make decisions for humans? This is a question everybody may (and should) ask before deploying machine learning models in their own fields. Failing to ask it can lead to unexpected consequences: a text recognition system in a mail distribution center may send packages to the wrong addresses; a self-driving car may recognize a stop sign as a speed limit sign; or, even worse, an AI-based medical imaging system may mislead doctors into wrong diagnoses. We characterize a trustworthy machine learning model by three properties: robustness, interpretation, and precise uncertainty estimation. Robustness concerns how the model withstands unexpected inputs, also called out-of-distribution (OOD) data. Depending on whether the data has been manipulated on purpose, OOD data consists of either adversarial or non-adversarial examples. Interpretation refers to a set of algorithms that uncover the inference process of a black-box model, helping humans understand why the model does or does not generate the desired results. Finally, we seek uncertainty estimation tools that indicate where the ground-truth value lies relative to the estimated value. Uncertainty estimation also protects model users by holding machine predictions for human inspection once the uncertainty rises above some threshold.
In this thesis, I will walk through robustness, interpretation, and uncertainty estimation in three parts. In the first part, I will introduce the background of robust machine learning models with an example in graph-based semi-supervised learning, followed by a series of methods to train robust neural networks. In the next part, I will move to model interpretation tools and relate them to the previous part by discussing our work called Greedy-AS. In the final part, I will discuss my work on robust uncertainty estimation and confidence calibration. This part covers the algorithms, software packages, and a demo of how uncertainty estimation helps biological scientists perform quality control of stem cells more efficiently.