Assembly of Deeply Sequenced Metagenomes Yields Insight into Viral and Microbial Ecology in Two Natural Systems
- Author(s): Emerson, Joanne Bell
- Advisor(s): Banfield, Jillian F., et al.
Virus-host dynamics are poorly understood in natural systems, despite known viral influences on host mortality and community structure, food web dynamics, and biogeochemical cycles. Without a universal marker gene for viruses, assessments of viral diversity and dynamics have previously relied upon models and/or low-resolution data, such as viral counts or gel electrophoretic estimates of genome size. Here, de novo assemblies were created from 6.4 Gb of metagenomic sequence from eight community viral concentrate samples, collected from 12 hours to three years apart from hypersaline Lake Tyrrell (LT), Victoria, Australia. Seven complete and 133 partial, novel viral genomes were reconstructed, and new methods for assessing the diversity and dynamics of full viral assemblages were developed. This dissertation provides the first constraints on the timescales over which viral populations and assemblages tend to be stable (days) or dynamic (months-to-years). Comparisons to and reanalyses of previously reported haloviral metagenomes confirm that similar dynamics exist in other hypersaline systems and suggest that most haloviral populations have a limited or temporally variable global distribution. To place the LT viral metagenomes (viromes) in context, metagenomic data were assembled from 63 previously reported viromes from diverse environments, and genetic composition was compared across viromes. Despite spatial and temporal variation within LT, LT viral assemblages were most similar to each other and grouped with other hypersaline viromes. The 71 viromes (including eight from LT) generally clustered by ecosystem, and salinity is inferred to be a major determinant of the genetic composition of viral assemblages.
To further investigate LT microbial ecology, virus-host dynamics were assessed across 17 LT samples, including the eight summer samples from which viral concentrates were sequenced, five additional summer samples, and four winter samples. Contrary to previous reports of microbial population stability in hypersaline systems, dynamics were observed in host populations on spatial and temporal scales similar to those observed in viral populations. An analysis of clustered regularly interspaced short palindromic repeat (CRISPR) regions, which confer host immunity to viruses, indicates that both rare and highly abundant LT viruses were targeted, primarily by lower-abundance host organisms. Although very few CRISPR spacers had hits to the NCBI nr database or to the 140 complete and near-complete LT viral genomes, 21% had hits to unassembled LT viral concentrate reads, indicating adaptation to the LT system and successful CRISPR maintenance of viral populations at low enough abundance to preclude metagenomic assembly.
In addition to using model systems like Lake Tyrrell to understand fundamental aspects of microbial ecology, it is important to characterize understudied ecosystems, including subsurface systems, which are likely to harbor much of the carbon on Earth and have the potential to play important roles in solutions to human-induced climate change. To characterize the community structure and metabolic potential of a high-CO2 subsurface ecosystem, water was collected from iron-rich, CO2-driven Crystal Geyser (Green River, Utah), an established natural analog for geologic carbon sequestration (i.e., the proposed storage of anthropogenic CO2 in subsurface aquifers to mitigate climate change). Metagenomic sequences (~1.3 Gb) were generated from the 0.2-3.0 µm size fraction, and metagenomic assembly and binning resulted in the reconstruction of near-complete genomes of neutrophilic, iron-oxidizing Mariprofundus sp. and sulfur-oxidizing Thiomicrospira crunogena. Significant assembly was also achieved for a number of other organisms, including novel bacterial phyla and both aerobic and anaerobic respirers predicted to oxidize iron, sulfur, and/or complex carbon. At least one-third of the microorganisms sampled were likely chemoautotrophs, demonstrating that microbial carbon fixation is an important component of the carbon cycle in this system. These results suggest that the biogeochemistry and carbon storage potential of subsurface carbon sequestration reservoirs could be significantly influenced by microorganisms of diverse phylogeny and metabolic potential.
This dissertation demonstrates that next-generation sequencing, de novo metagenomic assembly, and novel analytical techniques can provide answers to important and fundamental questions in the field of microbial ecology. Specifically, this dissertation demonstrates the application of these techniques to: 1) assess viral diversity and dynamics at the highest resolution to date, 2) place those dynamics in the context of a natural ecosystem, and 3) characterize the community structure and metabolic potential of an understudied subsurface ecosystem with relevance to anthropogenic climate change.