Coding for Future Large-Scale Data Systems
This dissertation develops mathematical techniques, influenced by information theory and coding theory, to address the difficulties of storing, transmitting, and analyzing massive amounts of data. By necessity, memory devices are being built more compactly, which leads to higher error rates. Each of the three parts of this dissertation addresses the unique challenges and potential errors associated with next-generation storage technologies. These advanced error-correcting techniques can be used at the system level for a variety of purposes, e.g., reducing energy consumption, increasing storage density, or decreasing the risk of a catastrophic system failure.
The first part of the dissertation introduces Software-Defined Error-Correcting Codes: a framework for exploiting side information to heuristically recover from detected (but uncorrectable) errors. This part covers the underlying theory, experimental results, and an extension to error-localizing codes. The second part of the dissertation focuses on coding for unequal message protection, in which special messages are granted extra error-correcting guarantees. A broad class of unequal message protection codes is constructed, maintaining the same amount of redundancy overhead as the baseline alternative. The final part of the dissertation presents code constructions to correct burst deletion errors in DNA storage, a very promising technology that will likely be commonplace in the near future and that comes with its own set of features and challenges.
The coding-theoretic techniques presented here, along with tools inspired by this dissertation, will play a significant role in mitigating errors in future large-scale data systems.