UC San Diego
Distributed storage with communication costs
- Author(s): Armstrong, Craig Kenneth
- et al.
Distributed storage systems provide reliable storage of data by dispersing redundancy across multiple nodes. Because individual nodes are unreliable, this redundancy protects the integrity of the data against node failures. To maintain this reliability, new nodes must be introduced into the system whenever nodes are lost, so that the redundancy is restored. This process, in which a new node downloads information from the remaining nodes, is known as the repair problem. In this thesis, we consider networks with a communication cost associated with each link and explore means of minimizing the cost of performing these repairs. We do this by considering a generalized method of repair in which the amount of information downloaded by a new node may vary among the other nodes in the network. We find that when nodes store the minimum amount of data, the minimum cost can be achieved by quasi-uniform repair, in which the same amount of data is downloaded from each node communicated with. We also consider systems with the additional freedom that the amount of storage is allowed to vary from node to node, and we examine repair cost minimization there as well.
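As a rough illustration of quasi-uniform repair at the minimum-storage point, the sketch below searches over the number of helper nodes d and downloads the same amount from each of the d cheapest links. It rests on assumptions that are not stated in the abstract: that the cost of a link is linear in the amount of data sent over it, that the per-helper download is the standard minimum-storage regenerating value M / (k (d - k + 1)) from the regenerating-codes literature (M is the file size, k the number of nodes needed to recover the file), and that the best helpers for a given d are the d cheapest ones. The function and parameter names are hypothetical and are not taken from the thesis.

```python
from fractions import Fraction


def quasi_uniform_repair_cost(link_costs, file_size, k):
    """Cheapest quasi-uniform repair under the assumptions in the lead-in.

    link_costs[i] is the assumed cost per unit of data downloaded from
    surviving node i. Returns (cost, d, helpers), where helpers lists the
    indices of the d nodes contacted.
    """
    n = len(link_costs)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= number of surviving nodes")

    # Rank surviving nodes from cheapest to most expensive link.
    ranked = sorted(range(n), key=lambda i: link_costs[i])

    best = None
    cost_sum = 0  # running sum of the d cheapest per-unit link costs
    for d in range(1, n + 1):
        cost_sum += link_costs[ranked[d - 1]]
        if d < k:
            continue  # at least k helpers are required
        # Assumed minimum-storage download per helper: M / (k (d - k + 1)).
        beta = Fraction(file_size, k * (d - k + 1))
        cost = beta * cost_sum
        if best is None or cost < best[0]:
            best = (cost, d, ranked[:d])
    return best


if __name__ == "__main__":
    # Three cheap links and one expensive link (made-up numbers).
    print(quasi_uniform_repair_cost([1, 1, 1, 10], file_size=6, k=2))
```

With these made-up numbers, contacting the three cheap nodes is preferred: adding the expensive fourth link lowers the per-helper download but raises the total cost, while using only two helpers requires a larger download from each.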