ADMIT: An Adversarial Defense Methodology for Neural Networks based on Randomization and Reconstruction

Creative Commons Attribution (CC BY) 4.0 license
Abstract

From simple time-series forecasting to computer security and autonomous systems, machine learning (ML) is employed in a wide range of applications. Although ML algorithms are generally robust to random noise, intentionally crafted perturbations to the input data, known as adversarial samples, can significantly degrade ML performance. Existing countermeasures that mitigate or minimize the impact of adversarial samples, such as adversarial training and randomization, are limited to specific categories of adversaries, are computationally costly, and/or reduce performance even when no adversary is present. To address these shortcomings, we propose ADMIT, a two-stage adversarial defense technique. To thwart exploitation of the deep neural network by an attacker, we first introduce a random nullification (RNF) layer. The RNF layer randomly nullifies (removes) a subset of the input features, which lessens the influence of adversarial noise and limits the attacker's ability to extract the model parameters. Eliminating input features with RNF, however, also degrades ML performance. As an antidote, we equip the network with a Reconstructor: an autoencoder that rebuilds the nullified input according to the distribution of normal (clean) samples, thereby restoring performance while remaining robust to adversarial noise. We evaluate the proposed multi-stage ADMIT on the MNIST digits and Fashion-MNIST datasets against a variety of adversarial attacks, including FGSM, JSMA, BIM, DeepFool, and CW. Our results show improvements of up to 80% in performance compared to existing defenses such as adversarial training and randomization-based defense.
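
To make the two-stage idea concrete, the sketch below shows one plausible way the pipeline could be wired up in PyTorch: a random nullification layer that zeroes a fraction of input features, followed by an autoencoder-style Reconstructor, placed in front of the target classifier. The module names, nullification rate, and layer sizes are illustrative assumptions for 28x28 MNIST-style inputs, not the dissertation's actual implementation.

    import torch
    import torch.nn as nn

    class RandomNullification(nn.Module):
        """Stage 1: randomly zeroes (nullifies) a fraction of the input features."""
        def __init__(self, nullify_rate=0.3):  # rate is an assumed hyperparameter
            super().__init__()
            self.nullify_rate = nullify_rate

        def forward(self, x):
            # Keep each feature with probability (1 - nullify_rate); zero out the rest.
            mask = (torch.rand_like(x) > self.nullify_rate).float()
            return x * mask

    class Reconstructor(nn.Module):
        """Stage 2: autoencoder trained on clean samples to restore nullified inputs."""
        def __init__(self, in_dim=784, hidden_dim=128):  # sizes chosen for 28x28 images
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
            self.decoder = nn.Sequential(nn.Linear(hidden_dim, in_dim), nn.Sigmoid())

        def forward(self, x):
            return self.decoder(self.encoder(x))

    class ADMITDefense(nn.Module):
        """Wraps a target classifier with the RNF layer and the Reconstructor."""
        def __init__(self, classifier, nullify_rate=0.3):
            super().__init__()
            self.rnf = RandomNullification(nullify_rate)
            self.reconstructor = Reconstructor()
            self.classifier = classifier

        def forward(self, x):
            x = x.flatten(1)               # (batch, 784) for MNIST-style images
            x = self.rnf(x)                # random nullification of input features
            x = self.reconstructor(x)      # reconstruction from the clean-data distribution
            return self.classifier(x)

In a setup like this, the randomness of the RNF mask changes on every forward pass, which is what makes it difficult for an attacker to craft a single perturbation that survives the defense, while the Reconstructor compensates for the accuracy lost to feature removal.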
