Open Access Publications from the University of California

Resource and Data Management in Accelerator-Rich Architectures

  • Author(s): Huang, Muhuan
  • Advisor(s): Cong, Jingsheng Jason
  • et al.

In many domains, accelerators such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) provide significantly higher performance than general-purpose processors at much lower power. Accelerator-rich architectures are thus much more energy-efficient and are becoming mainstream.

This dissertation investigates two keys to the performance and power efficiency of accelerator-rich architectures: resource management and data management. Three broad classes of accelerator-rich architectures are considered: chip-level architectures such as systems-on-chip (SoCs), node-level architectures, and cluster-level architectures.

We first study SoC resource management for a broad class of streaming applications. On accelerator-rich SoCs, where multiple computation kernels space-share a single chip, we explore the tradeoffs between on-chip resource usage and system performance, and find the best combination of accelerator implementations and data communication channel implementations to realize the application functionality.

We then turn to node-level accelerator-rich architectures, where we consider orchestrating two kinds of computation resources, CPUs and accelerators, on a PCIe-integrated CPU-accelerator platform, and explore a CPU-FPGA collaboration approach to improve application performance.

Next, we study the resource allocation problem on accelerator-rich clusters, where accelerators are time-shared among multiple tenants. Unlike traditional cluster resource management, we propose treating accelerators as first-class citizens in the cluster resource pool, and we develop an accelerator-centric resource scheduling policy that enables fine-grained accelerator sharing among multiple tenants.

Finally, we investigate data shuffling on accelerator-rich clusters and evaluate the possibility of using accelerators during data shuffling. We find that although data shuffling involves a large amount of computation, using accelerators does not necessarily improve system performance, because of the data serialization and deserialization overhead that accelerators introduce.
