Towards Robust and Scalable Large Language Models
- Author: Jain, Paras Jagdish
- Advisors: Stoica, Ion; Gonzalez, Joseph E.
Abstract
This dissertation addresses two significant challenges facing large language models (LLMs): robustness and scalability. First, we improve robustness through the lens of learning code representations, highlighting our work on ContraCode, which learns representations of code that are robust to label-preserving edits. Second, we tackle scalability from a systems perspective. We present Checkmate, a system that supports training models beyond GPU memory capacity limits through optimal rematerialization, and Skyplane, a system that optimizes bulk data transfers between cloud object stores, enabling training on larger pre-training datasets in the cloud. Together, these contributions offer a roadmap for enhancing the robustness and scalability of large language models.