Deep Neural Networks (DNNs) yield state-of-the-art performance in an increasing array of applications. Despite the pervasive impact of DNNs, there remain significant concerns regarding their (lack of) stability and robustness. In this thesis, we explore several complementary approaches for guiding DNNs to learn robust and stable features, including domain expertise, domain-specific measures, and neuro-inspired modifications. We present novel augmentation techniques, cost functions, and data rejection methods that supplement conventional DNN training for reliable feature extraction.
We first study robustness in the presence of strong confounding factors for radio-frequency (RF) fingerprinting, where the aim is to distinguish devices via subtle hardware imperfections that vary from device to device. However, confounding features such as carrier frequency offset and the wireless channel mislead DNNs: unless proactively discouraged from doing so, DNNs latch onto these strong confounding features rather than the nonlinear device-specific characteristics of interest. We investigate and evaluate augmentation- and estimation-based strategies that promote generalization across realizations of these confounding factors, using WiFi data.
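To make the augmentation idea concrete, here is a minimal sketch (not the thesis's exact pipeline; the function name, parameter ranges, and channel model are illustrative assumptions) that randomizes the confounding factors directly on complex baseband IQ samples, so that a network trained on such data cannot rely on them:

```python
import numpy as np

def augment_iq(iq, fs, max_cfo_hz=50e3, max_taps=4, rng=None):
    """Randomize confounders (CFO, multipath channel) in one IQ burst.
    iq: 1-D complex baseband samples; fs: sample rate in Hz.
    All parameter names and ranges here are illustrative assumptions."""
    rng = rng or np.random.default_rng()
    # Random carrier frequency offset: rotate samples by exp(j*2*pi*f*n/fs)
    cfo = rng.uniform(-max_cfo_hz, max_cfo_hz)
    n = np.arange(iq.size)
    out = iq * np.exp(2j * np.pi * cfo * n / fs)
    # Random multipath channel: convolve with a short random complex FIR
    num_taps = int(rng.integers(1, max_taps + 1))
    taps = rng.standard_normal(num_taps) + 1j * rng.standard_normal(num_taps)
    taps /= np.linalg.norm(taps)  # keep average signal power roughly constant
    return np.convolve(out, taps, mode="same")
```

Applying a fresh random offset and channel to each training example decouples the label (device identity) from these nuisance factors, which is the generalization mechanism the augmentation strategies target.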
In our second study, we present robustness measures in the context of self-supervised contrastive learning. We investigate how to pretrain speaker recognition models by leveraging dialogues between customers and smart-speaker devices. The supervisory information in such dialogues is inherently noisy, however, since multiple speakers may address the device within the same dialogue. To address this issue, we propose an effective rejection mechanism that selectively learns from dialogues based on their acoustic homogeneity. We also present a novel cost function specifically designed for corrupted datasets in the contrastive learning setting.
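A minimal sketch of such a rejection mechanism follows (the function names, the use of mean pairwise cosine similarity as the homogeneity score, and the threshold are assumptions for illustration, not the thesis's exact formulation): embed each utterance in a dialogue, score the dialogue's acoustic homogeneity, and keep only dialogues that score highly.

```python
import numpy as np

def dialogue_homogeneity(embeddings):
    """Mean pairwise cosine similarity among the utterance embeddings
    (rows of a 2-D array) in one dialogue. Assumes a pretrained
    encoder has already produced the embeddings."""
    if len(embeddings) < 2:
        return 1.0  # a single utterance is trivially homogeneous
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = x @ x.T
    iu = np.triu_indices(len(x), k=1)  # distinct pairs only
    return sim[iu].mean()

def select_dialogues(dialogue_embeddings, threshold=0.7):
    """Keep dialogues whose utterances appear acoustically homogeneous,
    i.e. likely spoken by one speaker; the threshold is an assumption."""
    return [d for d in dialogue_embeddings
            if dialogue_homogeneity(d) >= threshold]
```

Dialogues with low homogeneity scores, which plausibly contain multiple speakers, are rejected rather than allowed to corrupt the contrastive supervision.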
Lastly, we introduce a promising neuro-inspired DNN architecture and cost function for learning robust and interpretable features. We develop a software framework in which end-to-end costs can be supplemented with costs that depend on layer-wise activations, permitting finer-grained control over learned features. We apply this framework to incorporate Hebbian/anti-Hebbian (HaH) learning in a discriminative setting, demonstrating promising gains in robustness on CIFAR-10 image classification.
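To illustrate the structure of such a framework, here is a hedged sketch assuming a PyTorch backbone; the wrapper and function names are invented for this example, and the L1 sparsity term merely stands in for the HaH costs, which it does not reproduce:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerwiseCostWrapper(nn.Module):
    """Record selected layers' activations via forward hooks so that
    auxiliary, layer-wise cost terms can supplement the end-to-end cost.
    Names and the example cost below are illustrative assumptions."""
    def __init__(self, backbone, layer_names):
        super().__init__()
        self.backbone = backbone
        self.activations = {}
        for name, module in backbone.named_modules():
            if name in layer_names:
                module.register_forward_hook(self._save(name))

    def _save(self, name):
        def hook(module, inputs, output):
            self.activations[name] = output
        return hook

    def forward(self, x):
        self.activations.clear()
        return self.backbone(x)

def total_loss(wrapper, logits, targets, lam=0.1):
    # End-to-end discriminative cost
    loss = F.cross_entropy(logits, targets)
    # Example layer-wise term: an L1 sparsity penalty on activations,
    # standing in for the HaH costs developed in the thesis
    for act in wrapper.activations.values():
        loss = loss + lam * act.abs().mean()
    return loss
```

Forward hooks expose the intermediate activations without modifying the backbone, so arbitrary layer-wise terms can be mixed into the training objective alongside the usual end-to-end cost.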