Human Decision on Targeted and Non-Targeted Adversarial Samples
eScholarship
Open Access Publications from the University of California

Abstract

In a world that relies increasingly on large amounts of data and on powerful Machine Learning (ML) models, the veracity of decisions made by these systems is essential. Adversarial samples are inputs that have been perturbed to mislead an ML model's interpretation, and they represent a dangerous vulnerability. Our research takes a first step into what can be an important innovation in cognitive science: we analyzed humans' judgments and decisions when confronted with targeted adversarial samples (inputs constructed to make an ML model purposely misclassify an input as something else) and non-targeted adversarial samples (noisy perturbed inputs that try to trick the ML model). Our findings suggest that although ML models that produce non-targeted adversarial samples can be more efficient than those that produce targeted samples, non-targeted samples result in more incorrect human classifications than targeted samples. In other words, non-targeted samples interfered more with human perception and categorization decisions than targeted samples did.
