A Network Control Platform for Performance Isolation and Modular Composition
- Author(s): Webb, Kevin Christopher
- et al.
Large data centers host thousands of tenants and services, and today's data center networks are no longer simple end-to-end transfer fabrics. Tenants desire more functionality like strong performance isolation, latency control, load balancing, middlebox placement, and other services. As tenant performance requirements become more demanding, data center providers wish to increase the set of services they offer in an attempt to meet the demand and attract new customers. Critically, while the promise of cloud computing makes it straightforward to rent compute resources, there is no standard model for tenants to pick and choose the network resources and services they require. Traditionally, a single procedure governs the allocation of resources to tenant services. For efficiency, data center providers multiplex tenants across the physical infrastructure, leveraging end host virtualization to carve out well-specified units of isolated CPU, memory, and storage. In contrast, tenants typically receive loose, qualitative descriptions of network performance with a limited set of available features. This disparity is problematic for today's data centers, which need to satisfy services with a diverse and non-uniform set of networking needs. This dissertation proposes to resolve this contention by provisioning a custom virtual network for each of the data center's clients. It presents task switching, a framework that redefines how tenants interact with and obtain resources from the network by servicing their individual requests for virtual network tasks. It then describes a model with two components, Blender and Omakase, that defines interfaces for the network provider to implement differing features and resource allocation procedures that tenants may choose to adopt for their tasks.
Blender focuses on performance isolation, eliminating the either/or balancing act between resource efficiency and performance predictability by allowing data center operators to run multiple performance isolation models simultaneously. Omakase augments the Blender model to incorporate general-purpose features for tasks, including customized in-network middlebox processing, failure resolution, and flow scheduling. We evaluate the architecture with a prototype implementation that addresses its correctness, scalability, and performance. We show that our SDN-based prototype can efficiently express many recently proposed performance isolation models and network features.