Sequentialization and Synchronization for Distributed Programs
- Author(s): Bakst, Alexander Goldberg
- Advisor(s): Jhala, Ranjit
- et al.
Distributed systems are essential for building services that can handle the ever increasing number of people and devices connected to the internet as well as the associated growth in data accumulation. However, building distributed programs is hard, and building confidence in the correctness of an algorithm or implementation is harder still. One fundamental reason is the highly asynchronous nature of distributed execution. Timing differences caused by network delays and variation in compute power can trigger behaviors that were unanticipated by the programmer.
Unfortunately, techniques for building confidence are all up against the same problem: the combinatorial explosion in the number of behaviors of a distributed system. Testing and model checking techniques can not hope to weed out all behaviors when the state space is infinite. At the other end of the spectrum, constructing proofs by hand is a daunting task, as inductive invariants must simultaneously account not only for the algorithm-specific properties, but also the asynchronous nature of distributed computation.
In this dissertation, we take the view that many distributed programs are actually designed in such a way that the developer can safely reason about a small number of representative, synchronous executions. To support this claim, we first identify a property of message passing programs called symmetric non-determinism. Intuitively, the observable behavior of a program with symmetric non-determinism is insensitive to underlying network and processor non-determinism. Second, we develop an algorithm for transforming programs with symmetric non-determinism into an equivalent sequential program, called its canonical sequentialization. Our experiments show that canonical sequentializations can be computed quickly enough for eventual use in the design-implement-check cycle. Finally, we generalize the previous approach and present an algorithm for proving the correctness of distributed programs by automatically transforming them into an equivalent synchronized program. In this setting, rather than reasoning about message buffers, the developer writes loop invariants as synchronous assertions. We have implemented both of these algorithms as tools that demonstrate a reduction in the manual annotation burden for distributed systems.