Enhancing the performance, fault tolerance, and security of distributed data management systems
- Maiyya, Sujaya Anantha
- Advisor(s): Agrawal, Divyakant; El Abbadi, Amr
Abstract
Individuals and enterprises produce over 2.5 exabytes (1 exabyte = 10^18 bytes) of data every day. Much of this data, including sensitive and private information, is stored with and managed by third parties such as Amazon Web Services or Google Cloud. These companies can lose millions to billions of dollars in sales if their data access latencies increase by only a few hundred milliseconds. Achieving data fault tolerance, a necessary primitive of database systems, while maintaining low access latency is particularly challenging. Hence, reducing data access latency to improve performance and guaranteeing data fault tolerance have received the highest priority in the design of cloud data management systems. But the ever-growing number and sophistication of cyber attacks on the cloud, coupled with increasing legal requirements for data privacy and security (e.g., GDPR or HIPAA), have forced cloud providers to re-evaluate their priorities. However, there exists a fundamental trade-off between security and efficiency in data management systems.
This dissertation discusses designing and evaluating data management protocols that strike a balance between efficiency, fault tolerance, and security in both trusted and untrusted environments. Before solving security challenges in database systems, we first delve into traditional cloud settings, which assume trust, to understand existing system designs. Existing cloud databases replicate their data to provide fault tolerance, and they shard (or partition) the data and store the shards on multiple servers to provide scalability. In trusted environments, we propose two solutions: G-PAC, an atomic commitment protocol that commits transactions accessing data that is both sharded and replicated, and Samya, a data system that maintains aggregate data and supports high-contention, write-intensive workloads. As the next step towards building secure data systems, and to better understand the interplay between multiple security guarantees and performance, we study various blockchain systems, an ideal example in which untrusted geo-distributed entities manage critical data.
Equipped with blockchain techniques that protect data, we build three solutions that focus on data Confidentiality, Integrity, and Availability, more popularly known as the CIA triad, which forms the pillars of secure systems. For confidentiality, this dissertation proposes ORTOA: a protocol that allows users to read or write data on an untrusted external server in a single round of communication without revealing the type of operation, whereas all existing solutions that hide the type of operation require two rounds. For integrity, this dissertation presents Fides: a transactional database system that guarantees data integrity and provides verifiable ACID guarantees. In this work, we also propose TFCommit, the first distributed transaction commitment protocol that tolerates up to n − 1 maliciously failing servers (out of n servers) without using expensive data replication. And for availability, we propose QuORAM: the first fully fault-tolerant Oblivious RAM datastore that guarantees data privacy by hiding the access patterns of users along with the contents of data.