Lawrence Berkeley National Laboratory
An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling
- Author(s): Ghysels, P
- Li, XS
- Rouet, FH
- Williams, S
- Napov, A
- et al.
Published Web Locationhttps://doi.org/10.1137/15M1010117
© 2016 U.S. Government. We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to sevenfold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed memory component for dense rank-structured matrices.