Effective natural agents excel at learning representations of our world and generalizing efficiently to make decisions. Critically, they develop such advanced reasoning capabilities even from a limited number of information-rich samples. In stark contrast, the major successes of deep learning-based artificial agents rely primarily on training with massive datasets. This dissertation focuses on curvature-informed learning and generative modeling methods that boost efficiency and close the gap between natural and artificial agents, thus enabling computationally efficient and improved reasoning.
This dissertation comprises two parts. First, we formally lay the foundations for learning. The goal is to establish optimization techniques, understand datasets, develop probabilistic generative models, and provide natural learning objectives even in settings with limited supervision. We discuss various first- and second-order optimization methods, show the importance of modeling distributions in Variational Autoencoders (VAEs),
and discuss which data points are essential for generalization in supervised learning.
Building on these insights, we develop new algorithms that boost the performance of state-of-the-art models, select data subsets to improve data quality and speed up training, mitigate model biases, and generate new augmentations for large labeled and partially labeled datasets. These contributions enable ML systems to better model and generalize to unseen and potentially out-of-distribution samples while drastically reducing training time and computational cost.