Microbes live in complex communities that have shaped the planet for billions of years. However, much remains unknown about their diversity and metabolic potential because methods that require cultivation or PCR amplification are biased. Metagenomics circumvents these biases and can be used to obtain genome sequences for microbial community members. Approximately 800 metagenome-derived complete and draft-quality genomes were reconstructed for groundwater-associated bacteria from a radiation of previously unrecognized and little-known phyla with essentially no isolated representatives. Unlike most other bacteria, these organisms consistently have small genomes, lack highly conserved ribosomal proteins, frequently have rRNA gene introns, and have significant metabolic limitations indicative of an obligate symbiotic lifestyle. Combined phylogenetic and genomic analyses enabled recognition of this group as the Candidate Phyla Radiation (CPR), a major feature of domain Bacteria that was subsequently determined to comprise >50% of all bacterial diversity. Using a newly developed method called iRep, it was determined that CPR organisms typically replicate slowly, although they replicated rapidly under some conditions. These in situ measurements were possible because iRep uses draft-quality genomes and metagenome sequencing to determine replication rates based on changes in genome copy number that occur during genome replication.
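The copy-number principle behind iRep can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not the published iRep implementation: the function name `replication_index` and the input format (a list of read-coverage values, one per genome window) are hypothetical, and the filtering, GC correction, and quality checks a real analysis would need are omitted. The idea is that in a replicating population, regions near the origin of replication are present in more copies than regions near the terminus, so their sequencing coverage is higher. Because fragment order is unknown in a draft genome, the sketch sorts the window coverages, fits a line to the log-transformed values, and reports the ratio between the two ends of the fitted line.

```python
import math


def replication_index(window_coverage):
    """Estimate an origin-to-terminus coverage ratio from per-window coverage.

    Hypothetical helper for illustration only; not the iRep software.
    """
    # Sorting stands in for genome position, which is unknown for draft genomes.
    cov = sorted(c for c in window_coverage if c > 0)
    n = len(cov)
    if n < 2:
        raise ValueError("need at least two windows with non-zero coverage")

    # Least-squares fit of log2(coverage) against sorted rank.
    ys = [math.log2(c) for c in cov]
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx

    # Ratio of fitted coverage at the high end (origin-like) to the
    # low end (terminus-like) of the line.
    return 2 ** (slope * (n - 1))
```

Under these assumptions, a value near 1 would indicate uniform coverage and therefore little ongoing replication, while larger values would indicate that a greater share of the population is copying its genome.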
In contrast to groundwater ecosystems, the human microbiome typically contains microorganisms from only a few phyla. Application of metagenomics enabled strain-level resolution of the human microbiome, measurement of iRep replication rates, and proteomic analyses of activity. Microbiome samples were collected from premature infants during the first months of life, and both metagenomics and metaproteomics were used to detect shifts in the gastrointestinal tract microbiome. Results showed that genetically similar bacteria behave differently depending on community context, leading to substantial changes in overall proteome composition. The metagenomic approach enabled identification of considerable genomic novelty. Analysis of the first genome sequence for a member of the genus Varibaculum uncovered a diverse repertoire of sugar utilization pathways and anaerobic respiration capacity. iRep analysis documented highly variable replication rates during initial colonization, and significantly higher rates following antibiotic administration. This work has added large and small branches to the tree of life with corresponding genomic and metabolic information, linked microbial responses and metabolism to changing environmental conditions, and provided previously unobtainable information on in situ replication rates.