CosmoFlow: Using Deep Learning to Learn the Universe at Scale
eScholarship
Open Access Publications from the University of California

Author(s):
  • Mathuriya, Amrita
  • Bard, Deborah
  • Mendygral, Peter
  • Meadows, Lawrence
  • Arnemann, James
  • Shao, Lei
  • He, Siyu
  • Karna, Tuomas
  • Moise, Diana
  • Pennycook, Simon J
  • Maschhoff, Kristyn
  • Sewall, Jason
  • Kumar, Nalini
  • Ho, Shirley
  • Ringenburg, Michael F
  • Prabhat, Prabhat
  • Lee, Victor
  • et al.
Abstract

Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel® Xeon Phi™ processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters Ω_M, σ_8, and n_s with unprecedented accuracy.
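
For readers unfamiliar with the term, the following is a minimal sketch of what "fully synchronous data-parallel training" means in general. It is not CosmoFlow's code: the one-parameter least-squares model, the simulated workers, the toy data, and the helper names (local_gradient, allreduce_mean, synchronous_step) are all assumptions made for illustration, and the in-process averaging only stands in for the collective communication that a plugin such as the one named above would perform across nodes.

```python
# Illustrative sketch only: a toy 1-parameter model trained by simulated workers.
# All names and data here are hypothetical; none come from CosmoFlow itself.

def local_gradient(w, shard):
    """Mean gradient of 0.5 * (w*x - y)**2 over one worker's data shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for a collective allreduce: every worker receives the same mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def synchronous_step(weights, shards, lr):
    """One synchronous step: all workers compute a gradient, wait for the
    averaged result, and then apply the identical update to their replica."""
    grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
    averaged = allreduce_mean(grads)
    return [w - lr * g for w, g in zip(weights, averaged)]

# Hypothetical data following y = 3x, split evenly across 4 simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
num_workers = 4
shards = [data[i::num_workers] for i in range(num_workers)]

# Every replica starts from the same weight, as synchronous training requires.
weights = [0.0] * num_workers
for _ in range(200):
    weights = synchronous_step(weights, shards, lr=0.01)
```

Because every replica applies the same averaged gradient at every step, the replicas never diverge, and with equal-sized shards each step matches a single-process step over the combined batch. The cost is that every step waits for the slowest worker and for the collective to finish, which is why scaling this scheme to thousands of nodes depends on efficient communication.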
