This dissertation builds new defenses to thwart digital attacks on enterprises. Specifically, we develop a set of data-driven insights and methods that enable organizations to uncover and stymie three prominent enterprise attacks: spearphishing, lateral phishing, and lateral movement. For these threats, we present new conceptual models that deconstruct each attack into a set of fundamental actions that an attacker must perform in order to succeed, enabling organizations to search more precisely for signs of such malicious activity. The successful detection systems we construct based on these models highlight the value of decomposing and pursuing attacks along two facets: preventing attackers from gaining entry into an enterprise’s network and hunting for attacker activity within an organization’s internal environment.
Even with a clear specification of what to look for, uncovering sophisticated attacks has long eluded enterprises because such attacks give rise to a detection problem with two challenging constraints: an extreme class imbalance and a lack of ground truth. In particular, targeted enterprise attacks occur at a low rate, reflect the work of stealthy attackers (and thus frequently remain unknown and unlabeled), and transpire amidst a sea of anomalous-but-benign activity that inherently occurs within modern enterprise networks. This setting poses fundamental challenges to traditional machine learning methods, causing them to detect an insufficient number of attacks or produce an intractable volume of false positives. To overcome these challenges, we present a new approach to anomaly detection for security settings, specification-based anomaly detection, which we use to construct new detection algorithms for identifying rare attacks in large, unlabeled datasets.
Combining these algorithms with the attack models we develop, we design and implement a set of detection systems that collectively form a defense-in-depth approach to unearthing and mitigating enterprise attacks. Through collaborations with three large organizations, we validate the efficacy and practicality of our approach. Given the ability of our systems to detect a wide range of attacks, the low volume of false positives they generate, and the real-world adoption of many of our ideas, this dissertation illustrates the utility and promise of a data-empowered approach to thwarting enterprise attacks.