Machine Learning with Provable Robustness Guarantees
- Author(s): Zhang, Huan
- Advisor(s): Hsieh, Cho-Jui
- et al.
Although machine learning has achieved great success in numerous complicated tasks, many machine learning models lack robustness in the presence of adversaries and can be misled by imperceptible adversarial noise. In this dissertation, we first study the robustness verification problem for machine learning models, which gives provable guarantees on worst-case performance under arbitrarily strong adversaries. We study two popular classes of machine learning models, deep neural networks (DNNs) and tree ensembles, and design efficient and effective algorithms to provably verify their robustness. For neural networks, we develop a linear-relaxation-based framework, CROWN, in which we relax the non-linear units in DNNs using linear bounds and propagate these linear bounds through the network. We generalize CROWN into a linear-relaxation-based perturbation analysis (LiRPA) algorithm that applies to arbitrary computational graphs and general network architectures, so that it can handle the irregular neural networks used in practice, and we release an open-source software package, auto_LiRPA, to facilitate the use of LiRPA by researchers in other fields. For tree ensembles, we reduce the robustness verification problem to a max-clique finding problem on a specially constructed graph, which is very efficient compared to existing approaches and can produce high-quality lower or upper bounds on the output of a tree-ensemble-based classifier. After developing our robustness verification algorithms, we use them to create a certified adversarial defense for neural networks, in which we explicitly optimize the bounds obtained from verification to greatly improve network robustness in a provable manner. Our LiRPA-based training method is very efficient: it scales to large datasets such as downscaled ImageNet and to modern computer vision models such as DenseNet. Lastly, we study the robustness of reinforcement learning (RL), which is more challenging than the corresponding problem in supervised learning settings.
We focus on the robustness of an RL agent to perturbations of its state observations, and develop the state-adversarial Markov decision process (SA-MDP) to characterize the behavior of an RL agent under adversarially perturbed observations. Based on SA-MDP, we develop two orthogonal approaches to improve the robustness of RL: a state-adversarial regularization that improves the robustness of function approximators, and alternating training with learned adversaries (ATLA), which mitigates the intrinsic weakness of a policy. Both approaches are evaluated in various simulated environments, and they significantly improve the robustness of RL agents under strong adversarial attacks, including several novel adversarial attacks that we propose.
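
The idea of relaxing a non-linear unit with linear bounds, mentioned above for CROWN, can be illustrated on a single ReLU. The following is a minimal sketch under stated assumptions, not the CROWN or auto_LiRPA implementation: the function name `relu_linear_bounds` is hypothetical, the pre-activation interval [l, u] is assumed to be given, and the rule for choosing the lower-bound slope is one common heuristic (any slope in [0, 1] is sound when l < 0 < u).

```python
def relu(x):
    """Standard ReLU activation."""
    return max(0.0, x)


def relu_linear_bounds(l, u):
    """Linear lower and upper bounds for ReLU on the interval [l, u].

    Returns ((a_lo, b_lo), (a_up, b_up)) such that, for every x in [l, u],
        a_lo * x + b_lo <= relu(x) <= a_up * x + b_up.
    Hypothetical helper for illustration only.
    """
    if l > u:
        raise ValueError("expected l <= u")
    if l >= 0:
        # ReLU is the identity on the whole interval: bounds are exact.
        return (1.0, 0.0), (1.0, 0.0)
    if u <= 0:
        # ReLU is constantly zero on the whole interval: bounds are exact.
        return (0.0, 0.0), (0.0, 0.0)
    # Unstable case (l < 0 < u).
    # Upper bound: the chord through (l, 0) and (u, u).
    a_up = u / (u - l)
    b_up = -a_up * l
    # Lower bound: a line through the origin with slope in [0, 1].
    # Assumed heuristic: slope 1 if the positive side is at least as wide
    # as the negative side, otherwise slope 0.
    a_lo = 1.0 if u >= -l else 0.0
    return (a_lo, 0.0), (a_up, b_up)


if __name__ == "__main__":
    lower, upper = relu_linear_bounds(-1.0, 3.0)
    print("lower bound (slope, intercept):", lower)
    print("upper bound (slope, intercept):", upper)
```

Because both bounds are linear in x, they can be composed with the affine layers before and after the activation, which is what makes it possible to propagate bounds through a whole network.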