Coding for Efficient and Robust Storage
- Author(s): Zorgui, Marwen
- Advisor(s): Wang, Zhiying
- et al.
Driven by the growth of data-centric applications, efficient data storage and retrieval has become crucial for modern service providers. Storage systems are becoming increasingly complex, in response to the various challenges rising from the different layers of storage systems. This dissertation aims at designing novel data coding techniques improving device-level reliability and reducing storage and communication costs at the system level.
We consider the repair problem in distributed storage systems (DSSs), which refers to the process of regenerating the content of inaccessible or failed nodes. Repairing failures is necessary to maintain a desired level of fault-tolerance in DSSs. Specifically, we consider the repair problem of multiple erasures in a centralized manner. Using information-theoretic tools, we derive fundamental limits for the problem. Our analysis exhibits a fundamental tradeoff between the storage overhead required for reliability and the amount of data being transferred during repair. Accordingly, we propose several data coding techniques operating at various points of the tradeoff, with provable optimality for certain configurations.
A standard assumption in coding theory for distributed storage is that messages are static, and write and read operations are completed instantaneously for every non-failed storage node. In this work, we study the fundamental problem of storing evolving information in distributed networks from a coding-theoretic perspective.
Specifically, we study the design of storage-efficient algorithms for emulating atomic shared memory over an asynchronous, distributed message-passing system, a fundamental question in the field of distributed algorithms, as well as a ubiquitous problem in distributed computing applications. We propose new erasure-code based algorithms and compare their merits to existing ones. In particular, our algorithms feature a parameter v that allows a system designer to trade off between the storage size and liveness of read operations, while guaranteeing the strictest consistency requirement, atomicity.
Finally, we consider the device level and study the reliability of resistive memories (RRAMs). RRAMs are a promising non-volatile memory technology with high storage densities. However, the readout reliability of RRAMs is impaired due to the sneak path problem. In crossbar arrays, sneak paths cause different reliability levels across array cells. Motivated by this problem, we study the framework of polar coding over channels with different reliability levels, and we propose a coding technique improving its bit error rate (BER) performance. We then apply our framework to the crossbar array and demonstrate a significant reduction in BER.