Despite much success, the effectiveness of deep learning models largely relies on the availability of large amounts of labeled data. In many applications of interest, however, labeled data is costly to acquire, which hinders the applicability of these models, especially in resource-poor settings. On the other hand, with the growth of the internet, an enormous amount of user-generated data has accumulated that is readily available and free. Although this data may not be annotated with the structured outputs required by target downstream tasks, it can provide relevant information and background knowledge that can be formed into auxiliary learning signals to enhance the target application. Hence, computational approaches that leverage open-source data and exploit resource-rich corpora in low-resource applications can enable us to build models for a broad spectrum of languages, domains, and modalities regardless of the size of their training data.

This dissertation discusses the fundamental challenges of multi-modal low-resource NLP and proposes several approaches that (1) construct auxiliary training data from labeled and unlabeled (open-source) resources and (2) learn from the auxiliary data to enhance the downstream application. The proposed approaches are effectively applied across a wide range of NLP applications, including sequence tagging, text classification, natural language inference, text-to-code generation, question answering, and more.