Genomes reconstructed from metagenomes have contributed enormously to our understanding of the phylogenetic diversity of bacteria and archaea from lineages lacking isolated representatives. This is especially true for members of the Candidate Phyla Radiation (CPR), a highly phylogenetically diverse clade of bacteria with ultrasmall cell sizes and predicted epi-symbiotic or parasitic lifestyles. However, much remains to be learned about the environmental distribution of CPR bacteria, their relationships with their microbial hosts, and the ecological and evolutionary factors shaping their key metabolic capacities. Here, genome-resolved metagenomics and comparative genomics approaches were used to shed light on the processes governing gene content in CPR bacteria, looking both across and within natural ecosystems including groundwater, freshwater lakes, soil, and animal microbiomes.
The first chapter of this dissertation resolves a robust internal phylogeny for the CPR bacteria and uses this phylogenetic framework to quantify the distribution of core metabolic capacities (e.g., carbon metabolism) across the radiation. Analysis of gene patchiness indicated that genetic components of glycolysis have been shaped by vertical transmission, loss, and lateral transfer to differing extents. Intriguingly, similarity in core metabolic platform was decoupled from phylogenetic relatedness, suggesting that gain and loss of similar genetic components have likely been commonplace across the CPR bacteria. Extensive gene gain and loss were also evident for NiFe hydrogenases, which may be involved in both energy conservation and sulfur metabolism in diverse CPR bacteria.
The second chapter focuses on RuBisCO, which was found to be patchily distributed across the CPR. CPR bacteria encode a wide diversity of archaeal form III RuBisCO, including a novel clade (‘form III-c’) of sequences that likely function in an archaeal pathway for nucleotide assimilation that incorporates carbon dioxide. Evolutionary analysis suggested that RuBisCO in the CPR has likely undergone extensive lateral gene transfer, including episodes of interdomain exchange that impacted the distribution of RuBisCO forms across the tree of life.
The third chapter examines how overall gene content (i.e., entire proteomes) in CPR bacteria varies across broad environment types. A subset of lineages within the CPR was examined for linkages between phylogeny, habitat of origin, and gene content to reconstruct the path of transition from the environment to human and animal microbiomes. The results suggest that CPR bacteria from animal microbiomes have, on average, smaller proteomes than their environmental counterparts but are simultaneously enriched for a number of functions that may enable use of habitat-specific resources or tolerance of stressors. However, acquisition of these capacities likely did not enable habitat transitions; instead, we infer that transitions were driven by the suitability of available hosts and subsequently reinforced by gene gain.
Similar themes are explored in the final chapter, which concerns CPR bacteria and their surrounding microbial communities in a permanently stratified lake. Gene content in CPR bacteria is not highly differentiated across the lake’s compartments, which experience a gradient of oxygen conditions from relatively saturated to entirely anoxic. This stands in contrast to non-CPR organisms, where metabolic capacities covary with depth and in some cases may be impacted by the availability of light and oxygen. CPR bacteria throughout the water column can significantly contribute to overall metabolic potential for carbon fixation through RuBisCO.
Overall, the results presented in this dissertation shed light on the ecological and evolutionary processes that have acted on CPR gene content over time and contributed to their reduced genomes, variable metabolic platforms, and lifestyles dependent on other bacteria. These observations are of interest because CPR bacteria likely represent a distinctive evolutionary trajectory within the domain Bacteria, and thus broaden our overall understanding of ‘the rules of life.’