Systems and Algorithm Support for Efficient Heterogeneous Computing with GPUs
Heterogeneous computing has risen to prominence in the age of big data. In particular, heterogeneous computing systems with GPUs deliver exceptional performance with better energy efficiency, equipping us to deal with an enormous and fast-growing volume of data.
To exploit the power of such heterogeneous systems with GPUs, we have to address problems in three closely related areas. First, we need to design and implement algorithms best suited to the CPU and the GPU, and schedule each workload on the processor best suited to it. Second, we have to maximize the utilization of system resources for better performance and higher power efficiency. Third, we should provide efficient I/O management and data transfer mechanisms to accommodate the increasing amount of data.
This dissertation explores systems and algorithm support for efficient heterogeneous computing with GPUs in response to the three problems above. We first present a system called Hippogriff, which enables efficient direct data transfers between the SSD and the GPU. Hippogriff dynamically chooses the best data transfer route for the GPU to improve overall system performance and efficiency.
We then focus on improving resource utilization in the context of MapReduce, and present SPMario, a system that scales up MapReduce with the GPU via optimized I/O handling and task scheduling. SPMario introduces I/O Oriented Scheduling, which coordinates concurrent task execution in a way that minimizes the idle time of system resources while avoiding I/O contention.
Last, we present a heterogeneous search engine called Griffin, which explores new parallel algorithms and task scheduling to meet the stringent query latency requirements of Web search. Griffin uses a dynamic intra-query scheduling algorithm to break a query into sub-operations, and adaptively schedules them across a state-of-the-art CPU search implementation and our new GPU-based search kernels, based on the ever-changing runtime characteristics of the queries.
Overall, we demonstrate that our systems address the three problems above and improve the performance and efficiency of heterogeneous computing systems with GPUs.