The study of human disease relies heavily on the ability to reference thorough, reproducible, qualitative and quantitative observations of normal human biology. Over the last decade, many groups have striven to create compendia of transcriptomic and epigenomic data at cell-type resolution, both to provide a decentralized road map of developing and homeostatic biology and to characterize how existing cell types veer off course during disease to generate novel pathological cell states and types. In recent years, the study of the lung has trailed behind that of other organs, such as the brain, heart, and pancreas, in the creation of such atlases. Here we build a first-of-its-kind, single-cell-resolution transcriptomic and epigenomic atlas of the human lung to accelerate its study. We collected human lung biopsy samples from 30-week development equivalent, 3-year-old, and 30-year-old patients (30wk, 3yr, 30yr) and processed them into 46,500 snRNA-seq and 91,000 snATAC-seq nuclear profiles. We then annotated cell types and identified marker genes; these data eventually contributed to a publication by the LungMAP2 community focused on standardizing cell type annotations through consensus on validated histological and molecular features.
We then demonstrate the utility of such a dataset through a publication focused on a rapid response to the COVID-19 pandemic. In alveolar type 2 cells, we identified distal candidate cis-regulatory elements (cCREs) with age-increased activity linked to the SARS-CoV-2 host entry gene TMPRSS2; these cCREs had immune regulatory signatures and harbored variants associated with respiratory traits. At a COVID-19 risk locus, a candidate SNP overlapped a cCRE linked to SLC6A20, a gene expressed in alveolar cells and with a known functional association with the SARS-CoV-2 receptor ACE2. These findings provide insight into the regulatory logic underlying genes implicated in COVID-19 in individual lung cell types across age groups. More broadly, these analyses, which were shared on bioRxiv within one month of the March 2020 lockdown, emphasize the value of human biological atlases in the rapid response to emerging diseases.
Next, we explore the novel features of the dataset, which is the first of its kind to span the maturation of the human lung from neonate to adult and the first to provide chromatin accessibility data for the in vivo human lung. We identified temporally dynamic gene expression in each cell type and performed MERFISH for structural visualization and expression validation. This uncovered previously missed transient ontologies, developmental enrichment of disease-associated gene expression, a previously underappreciated lung structure (the intralobular septa), and a full view of cellular communication. Furthermore, we use the chromatin accessibility data to identify putative regulatory elements and the transcription factors likely driving cell-type-specific biology across neonate-to-adult maturation, which is particularly informative for understudied and difficult-to-isolate cell types.
Additionally, we demonstrate the utility of atlases of normal human biology in beginning to dissect the mechanisms of related diseases. Human genetic variants associated with various lung traits and diseases were fine-mapped and overlaid with accessibility data to refine hypotheses about causal mechanisms. We found that SNPs associated with lung function traits were greatly enriched in the cCREs of mesenchymal cells. Interestingly, we found a unique and strong enrichment for COPD-associated variation in myofibroblast cCREs, in one example linked to FGF18. Mouse data from our own lab and from collaborators validate the role of FGF18 and its receptors in establishing normal lung architecture, lending further weight to these analyses.
Finally, we demonstrate the value of animal modeling downstream of atlases of normal human biology. Using the sequence of the reference human genome, one of the greatest and most fundamental such biological references, a collaborator identified mutations in a tRNA synthetase likely to be causal for a multi-organ syndrome centered on a lung fibrosis phenotype. We then modeled this human mutation in mice and characterized the phenotype, discovering neonatal activation of the ISR pathway, complete yet specific fucosylation of the alveolar surface, and spontaneous formation of immune foci in adult lungs. These data highlight the need to further dissect this mutation and demonstrate the value of mouse models of human disease for generating hypotheses worthy of further study.
Altogether, our work provides a clear example of the utility of single-cell-resolution molecular atlases for uncovering novel cell type biology in an unbiased manner, guiding the rapid response to emerging human health concerns, and dissecting the mechanisms of existing diseases.