Nematodes are among the most abundant and diverse organisms on Earth, occurring nearly everywhere life can exist. With over a million estimated species, their diversity remains underexplored, largely due to the challenges of accurate identification. Traditional taxonomy requires specialized expertise and is time-consuming, especially for large environmental samples. DNA metabarcoding offers a faster, more scalable approach through the sequencing of bulk DNA, though it faces challenges in accurately linking sequences to species, particularly for multicellular organisms. High-throughput sequencing (HTS) technologies have transformed biodiversity studies by increasing throughput and sensitivity, but issues such as taxonomic misassignments, sequencing artifacts, and unreliable abundance estimates complicate their use. This dissertation focuses on characterizing nematode communities using multiple sequencing technologies, with the aim of enabling future ecological studies in desert soils and other environments.
Chapter 1 is a biodiversity inventory of the Boyd Deep Canyon Reserve (BDCR) in Palm Desert, California, using a DNA barcoding strategy based on the D1-D2 region of 28S ribosomal RNA. A reverse taxonomy approach was used to document the morphological and molecular characteristics of 160 nematode specimens isolated from 9 soil samples. A total of 26 distinct lineages were identified, including 14 putative new species, while 9 lineages await further morphological study. A custom reference sequence database was then constructed, incorporating 28S sequences generated from the reverse taxonomy study, sequences downloaded from NCBI, and additional sequences produced by our lab in previous studies. The reference database was tested using a 96-well plate format with nematodes isolated from sub-fractions of the 9 soil samples. Classification of 2480 Sanger sequences at 70% confidence yielded 79 OTUs in 44 taxonomic categories, with 32 classified to species level, including 15 putative new species. Nematode communities were primarily composed of microbivores, with smaller proportions of omnivore-predators, fungivores, and plant parasites. Beta diversity analyses revealed that nematode community composition was more strongly associated with plant species than collection site. PERMANOVA results showed that plant species accounted for 40.1% of the variation in nematode communities (R² = 0.401, F = 2.01, p = 0.061) based on Bray-Curtis distances, and 35.0% of the variation (R² = 0.350, F = 1.61, p = 0.071) using Jaccard distances. Neither metric showed a significant effect of collection site on community composition.
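The two distance measures named above differ in what they use: Bray-Curtis is computed from abundances, while Jaccard uses only presence or absence. The following is a minimal sketch of that difference, not the analysis pipeline used in this chapter; the two count vectors are hypothetical and are not data from the study.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum(a) + sum(b))."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def jaccard(a, b):
    """Jaccard distance on presence/absence: 1 - |shared taxa| / |all taxa|."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both samples are empty")
    return 1 - len(present_a & present_b) / len(union)


# Hypothetical counts for four taxa in two samples (illustration only).
sample_a = [50, 30, 20, 0]
sample_b = [5, 30, 20, 45]

bc = bray_curtis(sample_a, sample_b)
jc = jaccard(sample_a, sample_b)
```

In this toy case the two samples share three of four taxa, so the Jaccard distance is modest, whereas the large shifts in abundance of the first and last taxa push the Bray-Curtis value higher. This is one reason the two metrics can attribute different proportions of variation to the same factor.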
Chapter 2 compares three sequencing technologies (Illumina MiSeq, PacBio Sequel II, and Sanger sequencing) in their ability to detect nematode diversity from the same soil samples. All reads were trimmed to 270 base pairs to enable direct comparisons across technologies. Illumina detected 70 species-level taxa, PacBio detected 40, and Sanger detected 28. However, 37 of the taxa identified by Illumina were in extremely low abundance, accounting for less than 0.127% of total reads, and are likely to be artifactual sequence variants (RSVs). PacBio identified only two taxa that were not detected by the other two sequencing technologies. The inflated richness observed in the Illumina dataset is likely due to RSVs, which had a larger effect on the Shannon alpha diversity metric because it is more sensitive to rare taxa. In contrast, the Simpson alpha diversity metric, which gives more weight to abundant taxa, was less affected by RSVs, making it a more stable measure when these artifacts are present. While Illumina consistently overestimated the number of observed taxa compared to PacBio, both technologies showed strong agreement in relative abundance estimates for the most common species. Despite the higher abundance of RSVs observed in Illumina data, Illumina's sequencing depth and lower cost per sample make it a practical option, particularly when downstream analyses are less sensitive to richness estimates.
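The contrast between the Shannon and Simpson metrics can be illustrated with a small calculation. The sketch below assumes the standard definitions (Shannon H' = -Σ p_i ln p_i, Gini-Simpson 1 - Σ p_i²) and uses hypothetical read counts, not the data from this chapter: five abundant taxa, with and without 37 spurious taxa of two reads each.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def relative_change(before, after):
    """Fractional change of a metric after adding the rare taxa."""
    return (after - before) / before


# Hypothetical read counts for five abundant taxa (illustration only).
core = [4000, 3000, 1500, 1000, 500]
# The same community plus 37 hypothetical low-abundance variants.
with_rare = core + [2] * 37

shannon_shift = relative_change(shannon(core), shannon(with_rare))
simpson_shift = relative_change(simpson(core), simpson(with_rare))
```

Under these assumed counts both indices increase when the rare variants are added, but the relative increase in the Shannon index is several times larger than that in the Simpson index, because squaring the proportions makes the contribution of very rare taxa negligible.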
In Chapter 3, nematode communities were characterized from the rhizospheres of creosote bushes across 11 locations in the Sonoran and Mojave Deserts, using PacBio-based metabarcoding of the 28S rDNA D1-D2 region. Soil samples were analyzed for physicochemical properties, including sand, organic carbon, water retention, and micronutrient concentrations. We detected 2319 amplicon sequence variants (ASVs), classified into 92 taxonomic categories, 83% of which were successfully classified to species level, and identified 62 putative species. Over half of the ASVs (52.5%) were restricted to single locations, while others showed broader distributions across multiple sites. Microbivores dominated the trophic structure across all locations, and geographic patterns in nematode community composition were observed, particularly in Death Valley, where Panagrolaimus species were unusually dominant. Phylogenetic analysis revealed that the constituent ASVs of some putative species formed monophyletic groups that were sympatric at multiple locations, indicating potential overclassification, where ASVs classified as one species are actually multiple distinct species. Principal component analysis (PCA) identified sand content as a major contributor to variation in soil properties, followed by organic carbon, water-holding capacity, and micronutrient concentrations. Beta regression generalized linear mixed models (GLMMs) tested the influence of soil physicochemical properties and geographic location on nematode beta diversity, revealing that soil properties, particularly those related to soil texture, had a stronger effect on nematode community composition than geographic location.
This dissertation contributes new information on nematode diversity in Southern California desert ecosystems and provides insights into the effect of environmental factors on nematode community composition. It also evaluates the relative benefits of different sequencing technologies for biodiversity assessment. Together, these findings offer a basis for future research focused on improving reference databases, addressing data quality concerns, and advancing our understanding of nematode ecology.