- Latif, Alaa;
- Mullen, Julia;
- Alkuzweny, Manar;
- Hufbauer, Emory;
- Tsueng, Ginger;
- Haag, Emily;
- Zeller, Mark;
- Aceves, Christine;
- Zaiets, Karina;
- Cano, Marco;
- Zhou, Xinghua;
- Qian, Zhongchao;
- Sattler, Rachel;
- Matteson, Nathaniel;
- Levy, Joshua;
- Lee, Raphael;
- Freitas, Lucas;
- Maurer-Stroh, Sebastian;
- Wu, Chunlei;
- Su, Andrew;
- Andersen, Kristian;
- Hughes, Laura;
- Suchard, Marc;
- Gangavarapu, Karthik
In response to the emergence of SARS-CoV-2 variants of concern, the global scientific community has, in an unprecedented effort, sequenced and shared over 11 million genomes through GISAID as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of Pango lineages and individual mutations across more than 7,000 locations, providing insights for researchers, public health officials and the general public. We describe the interpretable visualizations available in our web application, the pipelines that enable scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data, and the server infrastructure that enables widespread data dissemination via a high-performance API, which can also be accessed using an R package. We show how outbreak.info can be used for genomic surveillance and as a hypothesis-generation tool to understand the ongoing pandemic at varying geographic and temporal scales.