Skip to main content
eScholarship
Open Access Publications from the University of California

Distributed memory, GPU accelerated Fock construction for hybrid, Gaussian basis density functional theory

Published Web Location

https://doi.org/10.1063/5.0151070Creative Commons 'BY' version 4.0 license
Abstract

With the growing reliance of modern supercomputers on accelerator-based architecture such a graphics processing units (GPUs), the development and optimization of electronic structure methods to exploit these massively parallel resources has become a recent priority. While significant strides have been made in the development GPU accelerated, distributed memory algorithms for many modern electronic structure methods, the primary focus of GPU development for Gaussian basis atomic orbital methods has been for shared memory systems with only a handful of examples pursing massive parallelism. In the present work, we present a set of distributed memory algorithms for the evaluation of the Coulomb and exact exchange matrices for hybrid Kohn-Sham DFT with Gaussian basis sets via direct density-fitted (DF-J-Engine) and seminumerical (sn-K) methods, respectively. The absolute performance and strong scalability of the developed methods are demonstrated on systems ranging from a few hundred to over one thousand atoms using up to 128 NVIDIA A100 GPUs on the Perlmutter supercomputer.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View