eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

Improving GPU Efficiency With Fine-Grained Spatial Partitioning

Creative Commons 'BY-NC-SA' version 4.0 license
Abstract

GPU architectures have enabled an era of high-performance and scientific computing and underpin the capabilities of modern machine learning. Although GPUs are still designed for the most computationally intensive workloads, there are emerging situations in which a single workload does not efficiently utilize all of the GPU's resources, leaving room to execute concurrent workloads. This dissertation aims to improve GPU efficiency through partitioning and resource scaling. The first part studies the limitations of current spatial partitioning mechanisms through the use of execution task graphs. The second part motivates and proposes fast, fine-grained spatial partitions to improve system throughput in GPU inference servers, and explores how a kernel's partition can be optimized to reduce its footprint while maintaining overall inference performance. The third part uses spatial partitions as a resource scaling mechanism and coordinates them with frequency scaling to reduce energy usage in dynamic load environments. Lastly, a methodology is proposed for generating a detailed floorplan to enable research on improving thermal efficiency in GPUs.
