Li, Weiqi

Flexible Coding for Distributed Systems

2021

Abstract

In this dissertation, the constructions and schemes for flexible coding in distributed systems are investigated. Depending on the system parameters, the proposed methods allow adaptive choices of code constructions or data reconstruction schemes that provide desirable cost functions, such as network traffic, complexity, and latency. First, the repair of a node failure for Reed-Solomon (RS) codes, one of the most popular codes used in practical storage systems, is considered. Code constructions and repair schemes that provide a tradeoff between the repair communication cost and the coding complexity are presented. Second, facing the fact that the failures are unpredictable in a distributed system, a framework for flexible codes to achieve the optimal latency of accessing information is proposed. Instead of accessing a fixed number of nodes with a conventional code, a flexible code allows one to recover the entire information from a flexible number of storage nodes, and use all the available nodes efficiently. Constructions for different storage scenarios are proposed. Third, flexible coding in distributed matrix multiplication for failure tolerance is presented to reduce the computing latency. The goal is to allow a master node to efficiently obtain the computation results from a flexible number of available servers. The number of failures is unknown a priori and we provide code constructions that can efficiently make use of the computation results from all available servers. Given the storage capacity of the servers, the computation load optimization problem is analyzed.

Main Content

For improved accessibility of PDF content, download the file to your device.

UC Irvine

Flexible Coding for Distributed Systems