Lawrence Berkeley National Laboratory
Compiler-Directed Transformation for Higher-Order Stencils
- Author(s): Basu, P
- Hall, M
- Williams, S
- Straalen, BV
- Oliker, L
- Colella, P
- et al.
Published Web Locationhttps://doi.org/10.1109/IPDPS.2015.103
© 2015 IEEE. As the cost of data movement increasingly dominates performance, developers of finite-volume and finite-difference solutions for partial differential equations (PDEs) are exploring novel higher-order stencils that increase numerical accuracy and computational intensity. This paper describes a new compiler reordering transformation applied to stencil operators that performs partial sums in buffers, and reuses the partial sums in computing multiple results. This optimization has multiple effect son improving stencil performance that are particularly important to higher-order stencils: exploits data reuse, reduces floating-point operations, and exposes efficient SIMD parallelism to backend compilers. We study the benefit of this optimization in the context of Geometric Multigrid (GMG), a widely used method to solvePDEs, using four different Jacobi smoothers built from 7-, 13-, 27-and 125-point stencils. We quantify performance, speedup, andnumerical accuracy, and use the Roofline model to qualify our results. Ultimately, we obtain over 4× speedup on the smoothers themselves and up to a 3× speedup on the multigrid solver. Finally, we demonstrate that high-order multigrid solvers have the potential of reducing total data movement and energy by several orders of magnitude.