Algorithms for measuring and enhancing distributed systems
- Author(s): Uyeda, Frank C.; et al.
The industry-wide movement toward large data centers and cloud computing has brought many economic advantages, but also numerous technical challenges. In this work we describe three software-based contributions toward improving the communication performance of these large-scale distributed systems. While advances to commodity Ethernet in the data center have increased throughput, efficient application design can produce even more significant speedups.

We first describe a new, efficient data structure, called "Difference Digests", that can serve as a building block in distributed-application protocols. Difference Digests identify the elements that differ between two sets using communication and computation overheads proportional to the size of the difference. This functionality has many promising applications, including recovery from network partitions, data synchronization, and data de-duplication.

Beyond more efficient protocols, tuning applications can further improve performance by reducing network congestion. As link speeds increase, measuring bandwidth at fine time scales produces valuable debugging and tuning information, but is complicated by the immense number of packets and the short per-packet processing time. Further, it is unclear in advance which time scales will prove insightful, yet storing all measurements incurs significant storage overhead. We therefore propose two algorithms that summarize streams of fine-grained bandwidth samples and identify bursty traffic behavior. Our implementation records bandwidth information at speeds up to 10 Gbps while decreasing storage costs by orders of magnitude. After measurement, bandwidth statistics can be generated for arbitrary time scales down to microseconds, allowing users to identify short-lived phenomena affecting their application's performance.

While network capacity has increased, maintaining low latency for time-sensitive flows remains challenging.
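Returning to the first contribution: set-reconciliation digests of this kind are commonly realized with invertible-Bloom-filter-style sketches. The toy Python version below is our own illustration of that idea, not the thesis's implementation; the class name, cell layout, and hash choices are assumptions. Each party summarizes its set into a small table of cells; subtracting the two tables cancels the common elements, and the few differing elements are then "peeled" out of cells that hold exactly one survivor.

```python
import hashlib

def _h(x: int, salt) -> int:
    # 64-bit hash of an integer element under a salt (illustrative choice)
    return int.from_bytes(
        hashlib.sha256(f"{salt}:{x}".encode()).digest()[:8], "big")

class IBF:
    """Toy invertible Bloom filter. Each cell keeps a signed count,
    an XOR of element values, and an XOR of element checksums."""

    def __init__(self, m: int, k: int = 3):
        self.m, self.k = m, k
        self.count = [0] * m
        self.id_sum = [0] * m
        self.chk_sum = [0] * m

    def _cells(self, x: int):
        # k distinct cell indices for element x
        idxs, salt = [], 0
        while len(idxs) < self.k:
            j = _h(x, salt) % self.m
            if j not in idxs:
                idxs.append(j)
            salt += 1
        return idxs

    def add(self, x: int):
        c = _h(x, "chk")
        for i in self._cells(x):
            self.count[i] += 1
            self.id_sum[i] ^= x
            self.chk_sum[i] ^= c

    def subtract(self, other: "IBF"):
        # cell-wise difference; elements present in both sets cancel out
        for i in range(self.m):
            self.count[i] -= other.count[i]
            self.id_sum[i] ^= other.id_sum[i]
            self.chk_sum[i] ^= other.chk_sum[i]

    def decode(self):
        """Peel 'pure' cells (exactly one surviving element) until none
        remain. After A.subtract(B), returns (only-in-A, only-in-B)."""
        only_a, only_b = set(), set()
        progress = True
        while progress:
            progress = False
            for i in range(self.m):
                c, x = self.count[i], self.id_sum[i]
                if c in (1, -1) and self.chk_sum[i] == _h(x, "chk"):
                    (only_a if c == 1 else only_b).add(x)
                    chk = _h(x, "chk")
                    for j in self._cells(x):
                        self.count[j] -= c
                        self.id_sum[j] ^= x
                        self.chk_sum[j] ^= chk
                    progress = True
        return only_a, only_b
```

Sizing the table a small constant factor larger than the difference is what makes the cost proportional to the size of the difference rather than to the set sizes; a complete protocol also needs a way to estimate that difference size up front.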
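For the bandwidth-measurement contribution, one simple way to answer after-the-fact queries at many time scales is dyadic aggregation: bin byte counts at a fine base interval and fold pairs of bins to form each coarser scale. The sketch below is our own naive stand-in, not the two algorithms the thesis proposes; all names and parameters are illustrative.

```python
class DyadicBandwidth:
    """Toy multi-scale bandwidth summary (illustrative stand-in, not the
    thesis's algorithms). Byte counts are binned at a fine base interval;
    scale s aggregates 2**s consecutive base bins."""

    def __init__(self, base_us: float, num_scales: int):
        self.base_us = base_us              # base bin width in microseconds
        self.bins = [[] for _ in range(num_scales)]

    def record(self, counts):
        # counts: bytes observed in each base interval
        cur = list(counts)
        for s in range(len(self.bins)):
            self.bins[s] = cur
            # pair up bins to form the next (2x coarser) time scale
            cur = [sum(cur[i:i + 2]) for i in range(0, len(cur), 2)]

    def peak_bandwidth(self, scale: int) -> float:
        # peak bytes/sec over windows of base_us * 2**scale microseconds
        width_s = self.base_us * (2 ** scale) / 1e6
        return max(self.bins[scale]) / width_s
```

Note that this naive version stores roughly twice the raw samples; the point of the thesis's algorithms is to retain multi-scale queryability while cutting that storage by orders of magnitude.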
One approach is to minimize in-network buffering by controlling flows with software-based pacing at the end host. However, these mechanisms are unproven at the Gigabit speeds found in data centers. Hence, we develop measurement tools to evaluate the bandwidth and "burstiness" of one such mechanism, the Linux Token Bucket Filter (TBF). We find that TBF struggles to scale to multi-Gigabit speeds due to Linux's inability to reliably service timers with microsecond accuracy. To address this limitation, we describe two enhancements, which improve software-based pacing at speeds up to 10 Gbps while minimizing bursts.
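The token-bucket mechanism itself can be sketched in a few lines. This is a generic user-space illustration with made-up parameter names, not Linux's tbf qdisc, which enforces the rate in the kernel with timer-driven dequeueing rather than caller polling:

```python
import time

class TokenBucket:
    """Toy token-bucket pacer. Tokens (bytes) accrue at `rate` bytes/sec
    up to a depth of `burst` bytes; a packet may depart only when enough
    tokens are available. Generic illustration, not Linux's tbf qdisc."""

    def __init__(self, rate: float, burst: float, now: float = None):
        self.rate = rate        # token fill rate, bytes per second
        self.burst = burst      # bucket depth: largest permitted burst, bytes
        self.tokens = burst
        self.last = time.monotonic() if now is None else now

    def try_send(self, nbytes: int, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        # replenish tokens for the elapsed interval, capped at the depth
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False
```

The depth parameter bounds burst size, and the replenish step is where timer accuracy bites: at 10 Gbps a 1500-byte packet corresponds to roughly 1.2 µs of tokens, so a pacer whose timer fires tens of microseconds late must either release a burst of many packets at once or fall below its target rate.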