Scholarly Works (2 results)

Article
Peer Reviewed

Performance Analysis of GPU Programming Models Using the Roofline Scaling Trajectories

LBL Publications (2020)

Performance analysis is a daunting job, especially for rapidly evolving accelerator technologies. The Roofline Scaling Trajectories technique aims to diagnose various performance bottlenecks for GPU programming models through visually intuitive Roofline plots. In this work, we introduce the use of Roofline Scaling Trajectories to capture major performance bottlenecks on NVIDIA Volta GPU architectures, such as warp efficiency, occupancy, and locality. Using this analysis technique, we explain the performance characteristics of the NAS Parallel Benchmarks (NPB) written with two programming models, CUDA and OpenACC. We present the influence of the programming model on the performance and scaling characteristics. We also leverage the insights of the Roofline Scaling Trajectory analysis to tune some of the NAS Parallel Benchmarks, achieving up to 2× speedup.

DesignSafe: New Cyberinfrastructure for Natural Hazards Engineering