eScholarship
Open Access Publications from the University of California

Usage Pattern-Driven Dynamic Data Layout Reorganization

Author(s):
  • Tang, Houjun
  • Byna, Suren
  • Harenberg, Steven
  • Zou, Xiaocheng
  • Zhang, Wenzhao
  • Wu, Kesheng
  • Dong, Bin
  • Ruebel, Oliver
  • Bouchard, Kristofer
  • Klasky, Scott
  • Samatova, Nagiza
  • et al.
Abstract

As scientific simulations and experiments move toward extremely large scales and generate massive amounts of data, the data access performance of analytic applications becomes crucial. The pattern in which data is written often differs from the pattern in which it is later read, which typically results in poor read performance. Data layout reorganization has been used to improve the locality of data accesses. However, current reorganization approaches are static: they generate a single optimized layout, or a set of layouts, and rely on prior knowledge of the exact future access patterns. We propose a framework that dynamically recognizes data usage patterns, replicates the data of interest in multiple reorganized layouts that benefit common read patterns, and decides at runtime which layout is favorable for a given read pattern. The framework supports reading both individual elements and chunks of a multi-dimensional array of variables. Our pattern-driven layout selection strategy achieves multi-fold speedups compared to reading from the original dataset.
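
The idea of recognizing read patterns, replicating data in another layout, and picking a layout at runtime can be illustrated with a small sketch. This is not the authors' framework: the cost model (counting contiguous segments for a 2D chunk), the `LayoutSelector` class, and the `replicate_after` threshold are all hypothetical names and assumptions introduced here for illustration only.

```python
from collections import Counter


def contiguous_runs(shape, count, order):
    """Number of contiguous segments needed to read a 2D chunk.

    shape: (rows, cols) of the full array.
    count: (n_rows, n_cols) of the requested chunk.
    order: "C" for row-major storage, "F" for column-major storage.
    Assumption: fewer contiguous segments means a cheaper read.
    """
    rows, cols = shape
    n_rows, n_cols = count
    if order == "C":
        # Full-width chunks are one segment; otherwise one segment per row.
        return 1 if n_cols == cols else n_rows
    if order == "F":
        # Full-height chunks are one segment; otherwise one segment per column.
        return 1 if n_rows == rows else n_cols
    raise ValueError(f"unknown order: {order}")


class LayoutSelector:
    """Hypothetical selector: tracks reads, replicates, then chooses a layout."""

    def __init__(self, shape, replicate_after=3):
        self.shape = shape
        self.replicate_after = replicate_after
        self.available = {"C"}    # only the layout the data was written in
        self.history = Counter()  # how often each layout would have been best

    def read(self, count):
        # Recognize the pattern: which layout would serve this read best?
        best = min(("C", "F"),
                   key=lambda o: contiguous_runs(self.shape, count, o))
        self.history[best] += 1
        # Replicate once a missing layout has been wanted often enough.
        # (A real system would write a reorganized copy of the data here.)
        if (best not in self.available
                and self.history[best] >= self.replicate_after):
            self.available.add(best)
        # Runtime decision: cheapest layout among those that actually exist.
        return min(sorted(self.available),
                   key=lambda o: contiguous_runs(self.shape, count, o))


if __name__ == "__main__":
    selector = LayoutSelector((1000, 1000))
    for _ in range(3):
        print(selector.read((1000, 1)))  # repeated single-column reads
    print(selector.read((1, 1000)))      # a single-row read
```

In this sketch, repeated column reads against row-major data eventually trigger a column-major replica, after which column reads are served from the replica while row reads still go to the original layout.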
