X-ray scattering is a valuable tool for measuring the structural properties of
materials used in the design and fabrication of energy-relevant nanodevices
(e.g., photovoltaic, energy storage, battery, fuel, and carbon capture and
sequestration devices) that are key to the reduction of carbon emissions.
Although today's ultra-fast X-ray scattering detectors can provide a wealth of
information on the structural properties of materials, a primary challenge
remains in the analyses of the resulting data. We are developing novel
high-performance computing algorithms, codes, and software tools for the
analyses of X-ray scattering data. In this paper we describe two such HPC
algorithm advances. Firstly, we have implemented a flexible and highly efficient
Grazing Incidence Small Angle Scattering (GISAXS) simulation code based on the
Distorted Wave Born Approximation (DWBA) theory with C++/CUDA/MPI on a cluster
of GPUs. Our code can compute the scattered light intensity from any given
sample in all directions of space, thus allowing full construction of the GISAXS
pattern. Preliminary tests on a single GPU show speedups over 125x compared to
the sequential code, and almost linear speedup when executing across a GPU
cluster with 42 nodes, resulting in an additional 40x speedup compared to using
one GPU node. Secondly, for the structural fitting problems in inverse modeling,
we have implemented a Reverse Monte Carlo simulation algorithm with C++/CUDA
using one GPU. Because the X-ray scattering simulation model has a large number
of fitting parameters, the earlier single-CPU code required weeks of runtime.
Deploying the AccelerEyes Jacket/Matlab wrapper to use the GPU gave a roughly
100x speedup over the pure CPU code. Our further C++/CUDA optimization delivered
an additional 9x speedup.
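
To illustrate the kind of fitting loop that Reverse Monte Carlo refers to, the
following is a minimal CPU-only sketch in Python. It is not the C++/CUDA code
described above. It assumes a toy one-dimensional model in which the fitting
parameters are particle positions and the intensity is
I(q) = |sum_j exp(i q x_j)|^2 / N. All names (`intensity`, `chi_squared`,
`reverse_monte_carlo`, `max_move`, `sigma`) are hypothetical and chosen only
for this example.

```python
import cmath
import math
import random


def intensity(positions, q_values):
    """Toy 1D scattering intensity I(q) = |sum_j exp(i q x_j)|^2 / N (assumed model)."""
    n = len(positions)
    return [abs(sum(cmath.exp(1j * q * x) for x in positions)) ** 2 / n
            for q in q_values]


def chi_squared(model, target, sigma):
    """Sum of squared residuals between model and target, scaled by sigma^2."""
    return sum((m - t) ** 2 for m, t in zip(model, target)) / sigma ** 2


def reverse_monte_carlo(target, q_values, n_particles, box, steps,
                        max_move=0.5, sigma=0.1, seed=0):
    """Fit particle positions to a target intensity by random single-particle moves."""
    rng = random.Random(seed)
    positions = [rng.uniform(0.0, box) for _ in range(n_particles)]
    chi2 = chi_squared(intensity(positions, q_values), target, sigma)
    initial_chi2 = chi2
    best_chi2, best_positions = chi2, list(positions)

    for _ in range(steps):
        # Propose: displace one randomly chosen particle (periodic box).
        j = rng.randrange(n_particles)
        old_x = positions[j]
        positions[j] = (old_x + rng.uniform(-max_move, max_move)) % box

        new_chi2 = chi_squared(intensity(positions, q_values), target, sigma)
        delta = new_chi2 - chi2

        # Accept improvements always; accept worse fits with probability
        # exp(-delta / 2), which lets the search escape local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / 2.0):
            chi2 = new_chi2
            if chi2 < best_chi2:
                best_chi2, best_positions = chi2, list(positions)
        else:
            positions[j] = old_x  # reject: restore the previous position

    return best_positions, initial_chi2, best_chi2


# Usage: build a synthetic target from a known configuration, then fit to it.
q_values = [0.5 * k for k in range(1, 21)]
target = intensity([1.0, 2.5, 4.0, 7.0], q_values)
fitted, chi2_start, chi2_best = reverse_monte_carlo(
    target, q_values, n_particles=4, box=10.0, steps=2000)
print("chi^2 start:", chi2_start, "best:", chi2_best)
```

In this sketch the intensity is re-evaluated after every proposed move, which
dominates the cost of each step. That is the kind of repeated evaluation that
plausibly benefits from GPU execution.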