Corfu is a platform for building systems which are extremely scalable, strongly consistent and robust. Unlike other systems which weaken guarantees to provide better performance, we have built Corfu with a resilient fabric tuned and engineered for scalability and strong consistency at its core: the Corfu shared log. On top of the Corfu log, we have built a layer of advanced data services which leverage the properties of the Corfu log. Today, Corfu is already replacing data platforms in commercial products, and the thriving Corfu open source code base enjoys regular contributions from a number of industrial and academic institutions.
One of the key properties of Corfu is consistency, a highly desirable property which simplifies programming complex, asynchronous distributed systems by increasing the number of assumptions a programmer can make about how a system will behave. For years, system designers focused on providing the strongest possible guarantees on top of unreliable and even malicious systems. The rise of the Internet and cloud-scale computing, however, shifted the focus of system designers towards scalability. In a rush to meet the needs of cloud-scale workloads, system designers realized that if they weaken the consistency guarantees they provided, they would greatly increase the scalability of their systems. As a result, designers simplified the guarantees provided by their systems and weaker consistency models such as eventual consistency emerged, greatly increasing the burden on developers leading to error-prone applications. Programmers in the cloud era are forced to choose between consistency and scalability.
Corfu, the topic of this dissertation, is a platform for scalable consistency. Corfu answers the question: ``If we were to build a distributed system from scratch, taking into consideration both the desire for consistency and the need for scalability, what would it look like?''. The answer lies in the Corfu distributed log.
We begin by introducing the Corfu distributed log. Corfu achieves strong consistency by presenting the abstraction of a log --- clients may read from anywhere in the log but they may only append to the end of the log. The ordering of updates on the log is decided by a high throughput sequencer, which we show can handle nearly a million requests per second. The log is scalable as every update to the log is replicated independently, and every client appending to the log merely needs to acquire a token before beginning replication. This means that we can scale the log by merely adding replicas, and our only limit is the rate of requests the sequencer can handle.
While building a single distributed log already provides strong consistency and scalability, multiple applications may wish to share the same log. By sharing the same log, updates across multiple applications can be ordered with respect to one another, which form the basic building block for advanced operations such as transactions. This dissertation details two designs for virtualizing the log: \emph{streaming}, which divides the log into streams built using log entries which point to one another, and \emph{\branching}, which virtualizes the log by radically changing how data is replicated in the shared log. \xmakefirstuc{\branching} greatly improves the performance of random reads, and allows applications to exploit locality by placing virtualized logs on a single replica.
Efficiently virtualizing the log turns out to be important for implementing distributed objects in Corfu, a convenient and powerful abstraction for interacting with the Corfu distributed log introduced earlier. Rather than reading and appending entries to a log, distributed objects enable programmers to interact with in-memory objects which resemble traditional data structures such as maps, trees and linked lists. Under the covers, the \emph{Corfu runtime}, a library which client applications link to, translates accesses and modifications of in-memory objects into operations on the Corfu distributed log.
The Corfu runtime provides rich support for objects. An automated translation process converts plain old Java objects (POJOs) directly into Corfu objects through both runtime and compile-time transformation of code. This allows programmers to quickly adapt existing code to run on top of Corfu. The Corfu runtime also provides strong support for transactions, which enables multiple applications to read and modify objects without relaxing consistency guarantees. We show that with stream materialization, Corfu can support storing large amounts of state while supporting strong consistency and transactions.
Finally, we describe our experience in both writing new applications and adapting existing applications to Corfu. We start by building an adapter for Apache Zookeeper clients to run on top of Corfu and then describe the implementation of Silver, a new distributed file system which leverages the power of the Corfu log. We then conclude by describing our efforts to retrofit a large and complex commerical application: a software defined network (SDN) switch controller, and detail how the strong transaction model and rich object interface greatly reduce the burden on distributed system programmers.
Overall, Corfu promises to change how data platforms are built. Instead of forcing developers to compromise, Corfu offers programmers the best of both worlds. This dissertation traces the development and evolution of Corfu over five years and through eight publications. As a testament to the robustness of the original Corfu log design, the core log API remains mostly unchanged from the early Corfu papers, even though Corfu is now a public open source project with contributors from multiple academic and industrial institutions.