Lessons Learned from Exploring the Backtracking Paradigm on the GPU
- Author(s): Jenkins, John;
- Arkatkar, Isha;
- Owens, John D.;
- Choudhary, Alok;
- Samatova, Nagiza F.
- et al.
Published Web Locationhttps://doi.org/10.1007/978-3-642-23397-5_42
Abstract. We explore the backtracking paradigm with properties seen as sub-optimal for GPU architectures, using as a case study the maximal clique enumeration problem, and find that the presence of these properties limit GPU performance to approximately 1.4--2.25 times a single CPU core. The GPU performance ''lessons'' we find critical to providing this performance include a coarse-and-fine-grain parallelization of the search space, a low-overhead load-balanced distribution of work, global memory latency hiding through coalescence, saturation, and shared memory utilization, and the use of GPU output buffering as a solution to irregular workloads and a large solution domain. We also find a strong reliance on an efficient global problem structure representation that bounds any efficiencies gained from these lessons, and discuss the meanings of these results to backtracking problems in general.