In this dissertation, we present several forays into the complexity that characterizes modern machine learning, focusing on the interplay between learning processes, incentives, and high-dimensional models. We aim to uncover new principles that address the challenges arising at the frontiers of this rapidly advancing field. This work is structured into two parts, each exploring a different facet of this complexity.
In Part I, we examine complexity arising from strategic and adversarial environments. We present two studies. The first explores learning and decision-making in a matching market, where a platform hopes to learn a market equilibrium amidst uncertainty about user preferences. The second investigates the robustness of safety-trained large language models to adversarial "jailbreak" attacks. We identify and exploit failure modes of safety training and discuss the implications of these findings for language model safety going forward.
In Part II, we study complexity arising from the high-dimensional models that are by now ubiquitous in machine learning. We start by investigating which mathematical foundations lead to an accurate predictive theory of high-dimensional generalization, and we identify models based on random matrix theory as promising candidates. We then delve further into the theoretical underpinnings of random matrix theory for high-dimensional linear regression to shed light on phenomena such as double descent, benign overfitting, and scaling laws.
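To give a concrete sense of one of these phenomena, the following minimal NumPy sketch (an illustration of our own, not code from the chapters that follow; the sample size, feature counts, and noise level are arbitrary choices) traces the double descent curve: the test error of minimum-norm least squares should spike near the interpolation threshold, where the number of features p equals the sample size n, and then decrease again as p grows.

```python
# Minimal double descent illustration (assumed setup, not the dissertation's code):
# fit minimum-norm least squares using the first p of d_max Gaussian features
# and watch test error peak near the interpolation threshold p = n.
import numpy as np

rng = np.random.default_rng(0)
n, d_max, sigma = 50, 200, 0.5                 # samples, max features, noise level
w = rng.normal(size=d_max) / np.sqrt(d_max)    # ground-truth coefficients

X_train = rng.normal(size=(n, d_max))
X_test = rng.normal(size=(1000, d_max))
y_train = X_train @ w + sigma * rng.normal(size=n)
y_test = X_test @ w + sigma * rng.normal(size=1000)

for p in [10, 25, 45, 50, 55, 75, 150, 200]:
    # pinv yields the minimum-norm solution, which interpolates
    # the training data in the overparameterized regime p > n.
    w_hat = np.linalg.pinv(X_train[:, :p]) @ y_train
    test_mse = np.mean((X_test[:, :p] @ w_hat - y_test) ** 2)
    print(f"p={p:3d}  test MSE={test_mse:.3f}")
```

Under this setup, the test error should rise as p approaches n = 50 and fall again in the overparameterized regime p > n, producing the signature shape that the random matrix analyses in Part II aim to explain.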