DieCast: Testing Distributed Systems with an Accurate Scale Model
eScholarship
Open Access Publications from the University of California

Abstract

Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems, configuration errors, and fault tolerance remains difficult. Existing testing techniques, such as simulation or running smaller instances of a service, have limitations in predicting overall service behavior. Although technically and economically infeasible at this time, testing should ideally be performed at the same scale and with the same configuration as the deployed service. We present DieCast, an approach to scaling network services in which we multiplex all of the nodes in a given service configuration as virtual machines (VMs) spread across a much smaller number of physical machines in a test harness. CPU, network, and disk are then accurately scaled to provide the illusion that each VM matches a machine from the original service in terms of both available computing resources and communication behavior to remote service nodes. We present the architecture and evaluation of a system to support such experimentation and discuss its limitations. We show that, across a variety of services (including a high-performance, cluster-based file system) and resource utilization levels, DieCast matches the behavior of the original service while using a fraction of the physical resources.

Pre-2018 CSE ID: CS2007-0910
