Hardware resource disaggregation is a solution that decomposes general-purpose monolithic servers into segregated, network-attached resource pools, each of which can be built, managed, and scaled independently. Despite its management, cost, and fault-tolerance benefits, hardware resource disaggregation is a drastic departure from the traditional computing paradigm, and it calls for a top-down redesign of system software, hardware, and data center networks.
This dissertation shows that it is possible to overcome the challenges of building and deploying hardware resource disaggregation solutions in real data centers, delivering on its promises of better manageability, better scalability, and lower cost.
We first explored logical resource disaggregation for emerging persistent memory technologies. Logical resource disaggregation breaks the server boundary logically: it builds an indirection layer on top of monolithic servers so that they collectively expose a logical resource pool abstraction. However, this approach fails to overcome the inherent problems of monolithic servers. We then explored hardware resource disaggregation, which overcomes these limitations by physically separating hardware resources into network-attached pools. We emulated disaggregated devices using monolithic servers and built the first operating system designed for managing disaggregated resources. It provides backward-compatible interfaces while delivering good performance. However, emulation incurs non-trivial overhead and offers limited parallelism when serving highly concurrent requests. To avoid such overhead, we then built the first publicly known hardware-based disaggregated memory device, which co-designs networking transport, virtual memory, and hardware. We soon realized that while an increasing amount of effort goes into disaggregating compute, memory, and storage, the network has been completely left out. The final piece of this dissertation proposes the concept of network disaggregation, which decouples network functionalities from endpoints and consolidates them into a centralized network resource pool. We built a new hardware-based networking device, along with a distributed runtime system, to realize such a network resource pool. Together, these four pieces outline a practical path to enabling hardware resource disaggregation solutions in real data centers, and in particular show how one can navigate the complex trade-offs among performance, cost, and manageability.