eScholarship
Open Access Publications from the University of California

UC Santa Cruz

UC Santa Cruz Electronic Theses and Dissertations

The Promise Of Data Grouping In Large Scale Storage Systems

Abstract

"Big Data" is one of the most prominent buzzwords of our age, and for good reason. From genetics to supercomputers to Facebook, big data is everywhere, and managing this data is going to be essential in the years to come.

One of the most basic interactions between a user and a storage system is retrieving data quickly and reliably. My thesis centers on using statistical properties of I/O traces to identify sets of data that are accessed together, in order to store big data reliably and efficiently. We begin by showing that data can be assigned to predictive groups scalably, with minimal domain knowledge and system impact. We then show how these groupings lead to substantial improvements in areas including power management, performance, reliability, and resource allocation.
