Skip to main content
eScholarship
Open Access Publications from the University of California

Cataloging the visible universe through Bayesian inference in Julia at petascale

  • Author(s): Regier, J
  • Fischer, K
  • Pamnany, K
  • Noack, A
  • Revels, J
  • Lam, M
  • Howard, S
  • Giordano, R
  • Schlegel, D
  • McAuliffe, J
  • Thomas, R
  • Prabhat
  • et al.
Abstract

A key task in astronomy is to locate astronomical objects in images and to characterize them according to physical parameters such as brightness, color, and morphology. This task, known as cataloging, is challenging for several reasons: many astronomical objects are much dimmer than the sky background, labeled data is generally unavailable, overlapping astronomical objects must be resolved collectively, and the datasets are enormous – terabytes now, petabytes soon. In this work, we infer an astronomical catalog from 55 TB of imaging data using Celeste, a Bayesian variational inference code written entirely in the high-productivity programming language Julia. Using over 1.3 million threads on 650,000 Intel Xeon Phi cores of the Cori Phase II supercomputer, Celeste achieves a peak rate of 1.54 DP PFLOP/s. Celeste is able to jointly optimize parameters for 188 M stars and galaxies, loading and processing 178 TB across 8192 nodes in 14.6 min. To achieve this, Celeste exploits parallelism at multiple levels (cluster, node, and thread) and accelerates I/O through Cori's burst buffer. Julia's native performance enables Celeste to employ high-level constructs without resorting to hand-written or generated low-level code (C/C++/Fortran) while still achieving petascale performance.

Main Content
Current View