Skip to main content
eScholarship
Open Access Publications from the University of California

Algorithms and Coding Techniques for Reliable Data Management and Storage

  • Author(s): Sala, Frederic
  • Advisor(s): Dolecek, Lara
  • et al.
Abstract

This dissertation studies problems of data management under unreliable conditions: how can data be efficiently reconstructed, synchronized, transmitted, and stored in the presence of uncertainty or noise? The common underlying thread running through the approaches to these problems is the discipline of coding theory.

Problems related to data editing and modification are considered in the first part of the dissertation. For the combinatorial data reconstruction problem, a new result regarding the minimum number of traces needed for exact reconstruction is introduced, resolving an open problem. Several applications and examples are discussed. An efficient and practical protocol relying on building blocks from coding theory is proposed for synchronizing data with general, non-uniform edits; this protocol outperforms existing tools in many scenarios. In addition, synchronization of data represented by complete or partial ranked lists is studied and novel bounds and code constructions are presented.

The second part of the dissertation is concerned with problems of efficient and robust data storage. Inspired by memories operating in high-radiation environments, an exploration of non-uniform noisy decoding for the popular low-density parity-check (LDPC) class of error-correcting codes is performed. A novel model and density evolution analysis are introduced. The problem of comparing, representing, and classifying practical error-control strategies for caches and other on-chip memories is tackled by introducing a powerful explanatory theoretical framework. Finally, a novel coding problem for data used in learning algorithms is considered.

The insights from this work, combining tools from a variety of disciplines, including algorithms, coding and information theory, and combinatorics, contribute to a unified approach to general problems of data. Given the ongoing data revolution, solutions to these problems are of paramount importance.

Main Content
Current View