Illuminating the intra-species diversity of bacterial populations from shotgun metagenomes
- Author(s): Nayfach, Stephen
- Advisor(s): Pollard, Katherine S
- et al.
Deep metagenomic sequencing has the potential to illuminate the intra-species genomic variation of abundant microbial species. In this thesis, I develop a new tool, MIDAS (Metagenomic Intra-species Diversity Analysis System), for rapidly and automatically quantifying species abundance, single nucleotide polymorphisms (SNPs), and gene copy number variants (CNVs) from metagenomes. To illustrate the utility of this approach, I re-analyze three public datasets with MIDAS. First, I re-analyze stool metagenomes from 98 mother-infant pairs and use rare SNPs to track strain transmission. I find that early colonizers are likely transmitted from the mother, whereas late colonizers are likely transmitted from the environment. Second, I re-analyze more than 300 stool metagenomes from healthy adults and use SNPs to identify examples of both strain co-existence and strain co-exclusion. Third, I re-analyze 198 globally distributed marine metagenomes and use gene copy number variants to show that many species have population structure that correlates with geographic location. Strain-level genetic variants clearly reveal extensive structure and dynamics that are obscured when metagenomes are analyzed at coarser taxonomic resolution.