Performance Analysis and Optimization for Scientific Data Workloads
Abstract
Scientific data generated at experimental and observational facilities are increasingly being processed on large-scale compute systems. Most experimental data analysis workflows are not designed or implemented to run in large-scale environments or to take full advantage of HPC compute and storage resources. These applications differ from traditional tightly coupled scientific applications and therefore face significant performance and scalability challenges as the volume of data increases exponentially. In this paper, we conduct a performance and scalability analysis of experimental data analysis applications and workflows operating on data from light sources. Our analysis detects and quantifies I/O performance, scalability, and runtime bottlenecks for three data analysis applications that run on NERSC resources. Based on this analysis, we propose and implement a set of optimizations that reduce the time spent on I/O operations by almost 90%.