Search

Scholarly Works (3 results)

Sort By:

Thesis
Peer Reviewed

Robust and Efficient Deep Learning for Multimedia Generation and Recognition

Hussain, Shehzeen Samarah
Advisor(s): Koushanfar, Farinaz

UC San Diego Electronic Theses and Dissertations (2023)

Deep Neural Networks (DNNs) have transformed the field of multimedia generation and recognition by replacing traditional hand-engineered systems in domains like vision, speech and text. This is because DNNs can operate end-to-end and model complex dependencies yielding state-of-the-art results on several generation and recognition benchmarks. However, there are three key challenges that need to be addressed for the practical, secure and reliable deployment of DNN-based media processing systems: 1) Robustness: DNNs are vulnerable to adversarial attacks, 2) Data-Requirement: DNNs often require large amounts of labelled data, 3) Compute-Efficiency: DNNs require extensive compute and resources.

My research focuses on addressing the above three challenges of DNN based multimedia generation and recognition systems. On the robustness side, I first analyze practical vulnerabilities of DNN-based recognition systems and then propose a robust defense framework that can reliably identify adversarial inputs using perceptually informed input transformations. To address the challenge of data-requirement, I develop training frameworks that can effectively adapt foundation models trained using self-supervised learning for recognition and synthesis tasks in a data-efficient manner. Finally, to address the challenge of compute-efficiency, I propose acceleration methods using hardware-software codesign that significantly reduce the latency and resource-requirement while preserving the synthesis quality of DNN generators.

Cover page: Robust and Efficient Deep Learning for Multimedia Generation and Recognition

Article
Peer Reviewed

Adversarial Reprogramming of Text Classification Neural Networks

UC San Diego Previously Published Works (2019)

Adversarial Reprogramming has demonstrated success in utilizing pre-trained neural network classifiers for alternative classification tasks without modification to the original network. An adversary in such an attack scenario trains an additive contribution to the inputs to repurpose the neural network for the new classification task. While this reprogramming approach works for neural networks with a continuous input space such as that of images, it is not directly applicable to neural networks trained for tasks such as text classification, where the input space is discrete. Repurposing such classification networks would require the attacker to learn an adversarial program that maps inputs from one discrete space to the other. In this work, we introduce a context-based vocabulary remapping model to reprogram neural networks trained on a specific sequence classification task, for a new sequence classification task desired by the adversary. We propose training procedures for this adversarial program in both white-box and black-box settings. We demonstrate the application of our model by adversarially repurposing various text-classification models including LSTM, bi-directional LSTM and CNN for alternate classification tasks.

Cover page: Adversarial Reprogramming of Text Classification Neural Networks

Article
Peer Reviewed

Universal Adversarial Perturbations for Speech Recognition Systems

UC San Diego Previously Published Works (2019)

In this work, we demonstrate the existence of universal adversarial audio perturbations that cause mis-transcription of audio signals by automatic speech recognition (ASR) systems. We propose an algorithm to find a single quasi-imperceptible perturbation, which when added to any arbitrary speech signal, will most likely fool the victim speech recognition model. Our experiments demonstrate the application of our proposed technique by crafting audio-agnostic universal perturbations for the state-of-the-art ASR system - Mozilla DeepSpeech. Additionally, we show that such perturbations generalize to a significant extent across models that are not available during training, by performing a transferability test on a WaveNet based ASR system.

Cover page: Universal Adversarial Perturbations for Speech Recognition Systems