Modernizing Deep Unsupervised Learning with Human Experience
Deep unsupervised learning has emerged as a promising alternative to supervised approaches, which require a tremendous amount of information in the form of annotations for specific pre-defined tasks. In contrast, human learning requires far fewer annotations and is flexible. This has motivated recent research to explore deep unsupervised learning algorithms that leverage massive unlabeled data for applications beyond the supervised learning setting. While recent work in deep unsupervised learning has shown success in representation learning, clustering, and anomaly detection, many challenges remain unsolved. For example, how can we improve the quality of learned representations used for downstream applications (the quality of learned representations challenge)? How can we interpret and understand the predictions of deep unsupervised learning models (the explainability challenge)? What risks of bias arise in deep unsupervised learning applications (the bias and fairness challenge)?
To address these challenges, we propose a broad range of novel techniques, each of which injects human-level knowledge into deep unsupervised learning. Specifically, this dissertation presents five approaches: the first two address the quality of learned representations challenge, the third the explainability challenge, and the last two the bias and fairness challenge. Our first formulation introduces a deep constrained clustering framework that improves clustering performance via various types of constraints. Our second formulation is a self-supervised representation learning framework that automatically discovers and differentiates categories. Our third formulation simultaneously learns representations for clustering and describes the resulting clusters with semantic tags associated with the clustered instances. Our fourth formulation proposes a novel deep fair anomaly detection architecture that uses adversarial learning to inject human fairness rules. Finally, our fifth formulation enforces disparate impact rules in deep clustering models via minimal modification learning. Together, these methods modernize deep unsupervised learning with different types of human guidance.