Memory-intensive applications suffer significant performance degradation when their working sets exceed available memory capacity, forcing pages to be swapped to slow disks. Far memory, where memory accesses are directed to other connected nodes, has gained popularity in recent years as a way to expand memory capacity and avoid memory stranding. Prior far-memory systems have taken two approaches: 1) building a swap system that uses far memory as a backing device and transparently exposes far-memory regions to unmodified applications, and 2) introducing a new programming model or data structure that interfaces with a far-memory runtime. The former requires no program changes but incurs a significant performance penalty, while the latter demands considerable developer effort to adopt and tune new APIs despite its potential performance gains.
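To make the second trade-off concrete, the sketch below mimics the explicit programming-model style in plain C++. All names here (FarRuntime, RemPtr, rmalloc, deref) are hypothetical and the "far" memory is simulated locally; the point is only that every allocation and access site must be rewritten to go through runtime calls, which is the developer effort noted above. This is not Mira's API nor that of any specific prior system.

```cpp
// Illustrative sketch of the explicit programming-model approach:
// objects live behind remote handles and every access goes through the runtime.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

class FarRuntime {
public:
    // Allocate `size` bytes of "far" memory and return an opaque handle.
    std::uint64_t allocate(std::size_t size) {
        std::uint64_t id = next_id_++;
        store_[id] = std::vector<std::byte>(size);
        return id;
    }
    // Resolve a handle to a local buffer; a real runtime would fetch the
    // object over the network and cache it locally here.
    std::byte* resolve(std::uint64_t id) { return store_[id].data(); }

private:
    std::uint64_t next_id_ = 0;
    std::unordered_map<std::uint64_t, std::vector<std::byte>> store_;  // stands in for remote nodes
};

template <typename T>
struct RemPtr {          // handle to a far-memory object, not a raw pointer
    std::uint64_t id;
};

template <typename T>
RemPtr<T> rmalloc(FarRuntime& rt) { return RemPtr<T>{rt.allocate(sizeof(T))}; }

template <typename T>
T& deref(FarRuntime& rt, RemPtr<T> p) {   // every access site must call this explicitly
    return *reinterpret_cast<T*>(rt.resolve(p.id));
}

int main() {
    FarRuntime rt;
    RemPtr<int> p = rmalloc<int>(rt);   // allocated "remotely"
    deref(rt, p) = 42;                  // explicit fetch-then-access
    std::cout << deref(rt, p) << "\n";
}
```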
Our key insight is that by capturing both statically known and dynamically monitored program behaviors, we can optimize both the program itself and the underlying runtime, yielding a notable performance boost. Furthermore, we propose automating this process in the compiler to retain a degree of transparency for the programmer. In this thesis, we introduce Mira, a far-memory system that transforms unmodified C/C++ programs to use remote memory accesses and optimizes them by tailoring runtime support to their specific behaviors. Our evaluation shows that Mira substantially improves workload execution time, outperforming previous swap-based systems and far-memory programming models by up to 18×.
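As a rough illustration of the kind of source-level rewrite such a compiler could perform, the sketch below contrasts an ordinary loop with a version whose accesses are grouped into bulk fetches sized to a statically known access pattern. The Runtime type and its fetch() call are assumptions made for this sketch and stand in for a far-memory runtime; they are not Mira's actual interface or generated code.

```cpp
// Conceptual before/after of a compiler-directed access rewrite,
// using a local stub runtime in place of real remote fetches.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>

// Stand-in for a far-memory runtime: fetch() would normally pull a region
// from a remote node into a local cache sized by compile-time analysis.
struct Runtime {
    std::vector<double> backing;                       // simulates remote data
    double* fetch(std::size_t off, std::size_t len) {  // len chosen statically
        (void)len;                                     // no-op in this local stub
        return backing.data() + off;
    }
};

// What the programmer writes: an ordinary local-memory loop.
double sum_local(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}

// What a transformed version could look like: accesses are grouped into
// bulk fetches whose granularity matches the loop's sequential pattern.
double sum_transformed(Runtime& rt, std::size_t n, std::size_t chunk) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; i += chunk) {
        std::size_t len = std::min(chunk, n - i);
        const double* local = rt.fetch(i, len);        // one bulk fetch per chunk
        for (std::size_t j = 0; j < len; ++j) total += local[j];
    }
    return total;
}

int main() {
    Runtime rt{std::vector<double>(1 << 20, 1.0)};
    std::cout << sum_local(rt.backing) << " "
              << sum_transformed(rt, rt.backing.size(), 4096) << "\n";
}
```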