Learning to Attack, Protect, and Enhance Deep Networks

Creative Commons 'BY' version 4.0 license
Abstract

Artificial intelligence (AI) systems have demonstrated remarkable capabilities, yet concerns about their security and safe deployment persist. With the rapid adoption of AI across critical domains, ensuring the robustness and reliability of these models is imperative. This research addresses this challenge by exposing vulnerabilities in AI systems and enhancing their trustworthiness. By systematically uncovering flaws, it aims to raise awareness of the precautions necessary for utilizing AI in high-stakes scenarios. The methodology involves identifying vulnerabilities, quantifying worst-case performance via attacks, and generalizing the insights to practical deployment settings. Additionally, it investigates techniques to strengthen model trustworthiness in real-world scenarios, contributing to rigorous AI safety research that promotes responsible and beneficial system development. Specifically, this research reveals vulnerabilities in neural networks by developing efficient black-box attacks on various deep learning models across different tasks. It also focuses on improving AI trustworthiness by detecting adversarial examples using language models and by enhancing user privacy through innovative facial de-identification methods.

For highly effective black-box attacks, ensemble-based and context-aware approaches were developed. These methods optimize over ensemble model weight spaces to craft adversarial examples with extreme efficiency, significantly outperforming existing input-space attacks. Multimodal testing demonstrated that these attacks could fool systems on diverse tasks, highlighting the need to evaluate deployment robustness against such methods. Additionally, by weaponizing context to manipulate the statistical relationships that models rely on, context-aware attacks were shown to profoundly mislead systems, revealing reasoning vulnerabilities. To protect user privacy, an algorithm was developed for seamlessly de-identifying facial images while retaining utility for downstream tasks. This approach, grounded in differential privacy and ensemble learning, maximizes obfuscation and non-invertibility to prevent re-identification. By disentangling identity attributes from utility attributes such as expressions, the method significantly enhances de-identification rates while preserving utility. To enhance the robustness and efficiency of computational imaging pipelines, including Fourier phase retrieval and coded diffraction imaging, a framework was developed that learns reference signals or illumination patterns from a small number of training images, employing an unrolled network as the solver. Once learned, the reference signals or illumination patterns serve as priors, significantly improving the efficiency of signal reconstruction. Overall, this research contributes to a more secure and reliable deployment of AI systems, ensuring their safe and beneficial use across critical domains.
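
To illustrate the ensemble weight-space idea, the following is a minimal sketch, not the dissertation's actual algorithm: an input perturbation and the mixing weights over a set of surrogate models are optimized jointly, with an occasional hard-label query to the black-box victim. The surrogate architectures, loss, perturbation budget, and victim interface are all placeholder assumptions.

```python
# Hypothetical sketch of an ensemble weight-space black-box attack.
# All models, shapes, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_surrogate():
    # Tiny stand-in classifier; a real attack would use pretrained networks.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

surrogates = [make_surrogate() for _ in range(4)]
victim = make_surrogate()  # stands in for the black-box model we can only query

x = torch.rand(1, 3, 32, 32)                                # clean image
y = torch.tensor([3])                                        # true label
delta = torch.zeros_like(x, requires_grad=True)              # adversarial perturbation
w_logits = torch.zeros(len(surrogates), requires_grad=True)  # ensemble mixing weights
opt = torch.optim.Adam([delta, w_logits], lr=0.05)
eps = 8 / 255                                                # L_inf budget

for step in range(100):
    weights = torch.softmax(w_logits, dim=0)
    x_adv = (x + delta).clamp(0, 1)
    # Maximize the weighted surrogate loss w.r.t. both the perturbation
    # and the ensemble weights (implemented as minimizing its negative).
    loss = -sum(wi * F.cross_entropy(m(x_adv), y)
                for wi, m in zip(weights, surrogates))
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)  # project back into the perturbation budget
        # One hard-label query to the victim checks whether the attack succeeded.
        if victim((x + delta).clamp(0, 1)).argmax(1) != y:
            break
```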
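
Similarly, the learned-reference idea for Fourier phase retrieval can be sketched as an unrolled gradient-descent solver through which a reference signal is trained end-to-end on a few images. The image size, iteration counts, measurement model, and losses below are illustrative assumptions rather than the dissertation's exact configuration.

```python
# Hypothetical sketch: learning a reference signal for Fourier phase retrieval by
# backpropagating through an unrolled gradient-descent solver.
import torch

n = 32                      # image side length
steps, lr_inner = 10, 0.1   # unrolled solver iterations and step size

def measure(x, ref):
    # Magnitude-only Fourier measurements of the image plus the reference.
    return torch.fft.fft2(x + ref).abs()

def unrolled_solver(meas, ref):
    # A fixed number of gradient steps on the measurement-consistency loss;
    # create_graph=True lets gradients flow back to the learnable reference.
    x_hat = torch.zeros(n, n, requires_grad=True)
    for _ in range(steps):
        loss = ((measure(x_hat, ref) - meas) ** 2).mean()
        (grad,) = torch.autograd.grad(loss, x_hat, create_graph=True)
        x_hat = x_hat - lr_inner * grad
    return x_hat

# Learn the reference from a handful of training images (random stand-ins here).
ref = torch.zeros(n, n, requires_grad=True)
outer_opt = torch.optim.Adam([ref], lr=0.01)
train_images = [torch.rand(n, n) for _ in range(4)]

for epoch in range(20):
    for x in train_images:
        meas = measure(x, ref).detach()          # measurements act as fixed data
        x_hat = unrolled_solver(meas, ref)
        recon_loss = ((x_hat - x) ** 2).mean()   # reconstruction quality drives ref
        outer_opt.zero_grad()
        recon_loss.backward()
        outer_opt.step()
```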
