Fusion PIC Code Performance Analysis on the Cori KNL System
- Author(s): Koskela, T; Deslippe, J; Friesen, B; Karthik, R; et al.
Published Web Location: https://cug.org/proceedings/cug2017_proceedings/includes/files/pap152s2-file1.pdf
We study the attainable performance of Particle-In-Cell (PIC) codes on the Cori KNL system by analyzing a miniature particle push application based on the fusion PIC code XGC1. We start from the most basic building blocks of a PIC code and build up the complexity to identify the kernels that cost the most in performance, and we focus optimization efforts there. Particle push kernels operate at high arithmetic intensity (AI) and are not likely to be memory bandwidth bound, or even cache bandwidth bound, on KNL. Therefore, we see only minor benefits from the high bandwidth memory available on KNL. Achieving good vectorization is the most beneficial optimization path: it can theoretically yield up to 8x speedup on KNL, but in practice the data layout limits it to 4x.
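The reasoning in the abstract can be illustrated with a minimal roofline-style sketch. It is not taken from the paper: the function names are hypothetical, and the peak and bandwidth figures are round placeholder values chosen only to show the shape of the argument, not measured Cori KNL numbers. The sketch also assumes that the theoretical 8x vectorization speedup corresponds to eight double-precision lanes in a 512-bit vector register.

```python
# Roofline-style estimate. All hardware numbers below are hypothetical
# placeholders, not measured Cori KNL figures.

def attainable_gflops(ai, peak_gflops, bandwidth_gbs):
    """Roofline bound: the lower of the compute peak and AI * bandwidth.

    ai is arithmetic intensity in FLOP/byte, bandwidth_gbs is in GB/s.
    """
    return min(peak_gflops, ai * bandwidth_gbs)


def vector_lanes(vector_bits, element_bits):
    """Number of elements processed by one vector instruction."""
    return vector_bits // element_bits


PEAK = 1000.0     # placeholder compute peak, GFLOP/s
BW_FAST = 400.0   # placeholder high bandwidth memory, GB/s
BW_SLOW = 100.0   # placeholder conventional memory, GB/s

for ai in (1.0, 20.0):
    fast = attainable_gflops(ai, PEAK, BW_FAST)
    slow = attainable_gflops(ai, PEAK, BW_SLOW)
    print(f"AI={ai}: fast={fast} GFLOP/s, slow={slow} GFLOP/s, "
          f"gain={fast / slow:.1f}x")

# Assumption: 512-bit vectors holding 64-bit doubles.
print("ideal vector speedup:", vector_lanes(512, 64))
```

With these placeholder values, the low-AI kernel gains from the faster memory, while the high-AI kernel reaches the compute ceiling with either memory. That is the situation the abstract describes for particle push kernels, where vectorization rather than bandwidth determines the attainable speedup.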