High-end computer systems increasingly rely on heterogeneous computing resources. For instance, a datacenter server might include multiple CPUs, high-end GPUs, PCIe SSDs, and high-speed networking interface cards. All of these components provide computing resources and operate at a high bandwidth. Coordinating the movement of data and scheduling computation across these resources is a complex task, as current programming models require system developers to explicitly schedule data transfers. Moving data is also inefficient in terms of both performance and energy costs: some applications running on GPU-equipped systems spend over 55% of their execution time and 53% of energy moving data between the storage device and the GPU. This paper proposes Gullfoss, a system that provides a simplified programming model for these heterogeneous computing systems. Gullfoss provides a high-level interface for specifying an application’s data movement requirements, and dynamically schedules data transfers while accounting for current system load and program requirements. Our initial implementation of Gullfoss focuses on data transfers between an SSD and a GPU, eliminating wasteful transfers to and from main memory as data moves between the two. This saves memory energy and bandwidth, leaving the CPU free to do useful work or operate at a lower frequency to improve energy efficiency. We implement and evaluate Gullfoss using commercially available hardware components. Gullfoss achieves 1.46× speedup, reduces energy consumption by 28%, and improves energy-delay product by 41%, comparing with systems without Gullfoss. For multi-program workloads, Gullfoss shows 1.5× speedup. Gullfoss also improves the performance of a GPU-based MapReduce framework by 10%.
Pre-2018 CSE ID: CS2015-1015