Large-Scale Data Management using Permissioned Blockchains
- Author(s): Amiri, Mohammad Javad
- Advisor(s): Agrawal, Divyakant
- El Abbadi, Amr
- et al.
The unique features of blockchain, such as transparency, provenance, and authenticity, are used by many large-scale data management systems to deploy a wide range of distributed applications, including supply chain management, healthcare, and crowdsourcing, in a permissioned setting. Unlike permissionless settings, e.g., Bitcoin, where the network is public and anyone can participate without a specific identity, a permissioned blockchain consists of a set of known, identified nodes that might not fully trust each other. While the characteristics of permissioned blockchains are appealing to a wide range of large-scale data management systems, these systems have to deal with five important challenges: confidentiality, verifiability, performance, scalability, and fault tolerance. Confidentiality of data is required in many collaborative large-scale data management applications where collaboration between enterprises, e.g., cross-enterprise transactions, should be visible to all enterprises; however, the internal data of each enterprise, e.g., internal transactions, might be confidential. Besides confidentiality, in many multi-enterprise systems, e.g., crowdworking environments, participants need to verify transactions that are initiated by other enterprises to ensure that some predefined global constraints on the entire system are satisfied. Thus, the system needs to support verifiability while preserving the confidentiality of transactions. Verifiability will gain in importance as crowdworking applications increase in popularity and the need for regulation arises. Large-scale data management applications also require high performance in terms of throughput and latency. Scalability is one of the main obstacles to business adoption of blockchain systems. To support a large-scale data management application, a blockchain system should be able to scale efficiently by adding more resources to the system. Finally, large-scale data management systems must provide fault tolerance.
Fault-tolerant protocols are the main building block of large-scale data management systems. However, in spite of years of intensive research, existing fault-tolerant protocols do not adequately address hybrid environments consisting of trusted and untrusted servers, which are widely used by enterprises. In this dissertation, we propose several techniques and develop different systems to address all five main challenges of large-scale data management using permissioned blockchains. We have developed systems called CAPER, SEPAR, ParBlockchain, SharPer, and SeeMoRe to deal with the confidentiality, verifiability, performance, scalability, and fault tolerance requirements of large-scale data management, respectively.