Architecture and Performance of Perlmutter’s 35 PB ClusterStor E1000 All-Flash File System
NERSC's newest system, Perlmutter, features a 35 PB all-flash Lustre file system built on HPE Cray ClusterStor E1000. We present its architecture, early performance figures, and performance considerations unique to this architecture. We demonstrate the performance of E1000 OSSes through low-level Lustre tests that achieve over 90% of the theoretical bandwidth of the SSDs at the OST and LNet levels. We also show end-to-end performance for both traditional dimensions of I/O performance (peak bulk-synchronous bandwidth) and non-optimal workloads endemic to production computing (small, incoherent I/Os at random offsets). We compare these results to NERSC's previous system, Cori, to illustrate that Perlmutter achieves the performance of a burst buffer and the resilience of a scratch file system. Finally, we discuss the performance considerations specific to all-flash Lustre in more depth and present ways in which users and HPC facilities can adjust their I/O patterns and operations to make optimal use of such architectures.