Percival: A Reliable, Long-Term, Distributed Storage System Free of Fixed-Key Encryption
- Author(s): Frank, Joel Cameron
- Advisor(s): Long, Darrell
- et al.
Secret splitting has been shown to improve reliability, reduce the risk of insider threat, and remove the issues surrounding key management in distributed long-term datastores. However, to date there has been little or no adoption of this technique in production environments. Where it has been implemented, the implementation has relied on fixed-key encryption for various parts of the system, e.g., during ingestion to maintain user privacy, or during pre-indexing to facilitate searching, since the inherent security of such a datastore normally precludes searching it directly without reassembling the data. Unfortunately, fixed-key encryption is not well suited to long-term applications, because it introduces a single point of compromise and failure as well as key management issues. Furthermore, even if the data remains intact after a long period of time, standard reconstruction methodologies rely on external knowledge to perform the reconstruction, and so they will eventually fail. When they do, information loss is almost certain in applications large enough to make reconstruction combinatorially prohibitive. The most recent method to mitigate this risk has a high runtime and limits the inherent security of the secret-split datastore.
To address the need for a reliable, long-term, distributed storage system free of fixed-key encryption, we propose Percival: a novel system that enables searching a secret-split datastore, maintains information privacy, and does not rely on external information to ensure that reconstruction remains feasible. It is built upon the knowledge gained from an in-depth comparison of file migration activity on the mass storage system (MSS) at the National Center for Atmospheric Research (NCAR) during two periods, one in the early 1990s and another nearly twenty years later. To accommodate real-world user access patterns, Percival allows one to search the secret-split data while keeping the bulk of the work on each client and keeping the data custodians blinded to both the contents of a query and its results. Furthermore, to ensure that reconstruction is feasible for even very large secret-split datastores, we present two novel disaster recovery methods that greatly reduce the number of attempts required during reconstruction; this enables recovery of the original data in cases where it would previously have been lost.