eScholarship
Open Access Publications from the University of California

Open Access Policy Deposits

This series is automatically populated with publications deposited by UC San Diego Department of Computer Science & Engineering researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.

Associations of plant-based foods, red and processed meat, and dairy with gut microbiome in Finnish adults.

(2024)

PURPOSE: Population-based studies on the associations of plant-based foods, red meat or dairy with gut microbiome are scarce. We examined whether the consumption of plant-based foods (vegetables, potatoes, fruits, cereals), red and processed meat (RPM) or dairy (fermented milk, cheese, other dairy products) is related to gut microbiome in Finnish adults. METHODS: We utilized data from the National FINRISK/FINDIET 2002 Study (n = 1273, aged 25-64 years, 55% women). Diet was assessed with 48-hour dietary recalls. Gut microbiome was analyzed using shallow shotgun sequencing. We applied multivariate analyses with linear models and permutational ANOVAs adjusted for relevant confounders. RESULTS: Fruit consumption was positively associated (beta = 0.03, SE = 0.01, P = 0.04), while a dairy subgroup including milk, cream and ice-creams was inversely associated (beta = -0.03, SE = 0.01, P = 0.02), with intra-individual gut microbiome diversity (alpha-diversity). Plant-based foods (R2 = 0.001, P = 0.03) and dairy (R2 = 0.002, P = 0.01) but not RPM (R2 = 0.001, P = 0.38) contributed to the compositional differences in gut microbiome (beta-diversity). Plant-based foods were associated with several butyrate producers/cellulolytic species including Roseburia hominis. RPM associations included an inverse association with R. hominis. Dairy was positively associated with several lactic acid-producing/probiotic species including Lactobacillus delbrueckii and potentially opportunistic pathogens including Citrobacter freundii. Dairy, fermented milk, vegetables, and cereals were associated with specific microbial functions. CONCLUSION: Our results suggest a potential association of plant-based foods and dairy, or their subgroups, with microbial diversity measures. Furthermore, our findings indicated that all the food groups were associated with distinct overall microbial community compositions. Plant-based food consumption particularly was associated with a larger number of putative beneficial species.

Impact of diet change on the gut microbiome of common marmosets (Callithrix jacchus).

(2024)

Gastrointestinal diseases are the most frequently reported clinical problems in captive common marmosets (Callithrix jacchus), often affecting the health and welfare of the animal and ultimately their use as a research subject. The microbiome has been shown to be intimately connected to diet and gastrointestinal health. Here, we use shotgun metagenomics and untargeted metabolomics in fecal samples of common marmosets collected before, during, and after a dietary transition from a biscuit to a gel diet. The overall health of marmosets, measured as weight recovery and reproductive outcome, improved after the diet transition. Moreover, each marmoset pair had significant shifts in the microbiome and metabolome after the diet transition. In general, we saw a decrease in Escherichia coli and Prevotella species and an increase in Bifidobacterium species. Untargeted metabolic profiles indicated that polyamine levels, specifically cadaverine and putrescine, were high after diet transition, suggesting either an increase in excretion or a decrease in reabsorption at the intestinal level. In conclusion, our data suggest that Bifidobacterium species could potentially be useful as probiotic supplements to the laboratory marmoset diet. Future studies with a larger sample size will be beneficial to show that this is consistent with the diet change. IMPORTANCE: Appropriate diet and health of the common marmoset in captivity are essential both for the welfare of the animal and to improve experimental outcomes. Our study shows that a gel diet compared to a biscuit diet improves the health of a marmoset colony, is linked to increases in Bifidobacterium species, and increases the removal of molecules associated with disease. The diet transition had an influence on the molecular changes at both the pair and time point group levels, but only at the pair level for the microbial changes. Which genes and functions changed appears to be more important than which specific microbes were present. Further studies are needed to identify specific components that should be considered when choosing an appropriate diet and additional supplementary foods, as well as to validate the benefits of providing probiotics. Probiotics containing Bifidobacterium species appear to be useful as supplements to the laboratory marmoset diet, but additional work is needed to validate these findings.

Associations between gut microbiota and incident fractures in the FINRISK cohort.

(2024)

The gut microbiota (GM) can regulate bone mass, but its association with incident fractures is unknown. We used Cox regression models to determine whether the GM composition is associated with incident fractures in the large FINRISK 2002 cohort (n = 7043, 1092 incident fracture cases, median follow-up time 18 years) with information on GM composition and functionality from shotgun metagenome sequencing. Higher alpha diversity was associated with decreased fracture risk (hazard ratio [HR] 0.92 per standard deviation increase in Shannon index, 95% confidence interval 0.87-0.96). For beta diversity, the first principal component was associated with fracture risk (Aitchison distance, HR 0.90, 0.85-0.96). In predefined phyla analyses, we observed that the relative abundance of Proteobacteria was associated with increased fracture risk (HR 1.14, 1.07-1.20), while the relative abundance of Tenericutes was associated with decreased fracture risk (HR 0.90, 0.85-0.96). Explorative sub-analyses within the Proteobacteria phylum showed that higher relative abundance of Gammaproteobacteria was associated with increased fracture risk. Functionality analyses showed that pathways related to amino acid metabolism and lipopolysaccharide biosynthesis associated with fracture risk. The relative abundance of Proteobacteria correlated with pathways for amino acid metabolism, while the relative abundance of Tenericutes correlated with pathways for butyrate synthesis. In conclusion, the overall GM composition was associated with incident fractures. The relative abundance of Proteobacteria, especially Gammaproteobacteria, was associated with increased fracture risk, while the relative abundance of Tenericutes was associated with decreased fracture risk. Functionality analyses demonstrated that pathways known to regulate bone health may underlie these associations.
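
The beta-diversity metric named in this abstract, the Aitchison distance, is the Euclidean distance between centered log-ratio (CLR) transformed compositions. The following is a minimal sketch of that definition, not the study's pipeline; the pseudocount used to handle zero counts and the function names are assumptions made for illustration.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an illustrative assumption that avoids log(0)
    for taxa that are absent from the sample.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(sample_a, sample_b, pseudocount=0.5):
    """Euclidean distance between the CLR vectors of two samples."""
    a = clr(sample_a, pseudocount)
    b = clr(sample_b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Because the CLR transform subtracts the mean log abundance, the distance depends only on the ratios between taxa, so two samples with the same proportions but different sequencing depth are at distance zero when no pseudocount is applied.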

Behavioral Intervention for Adults With Autism on Distribution of Attention in Triadic Conversations: A/B-Tested Pre-Post Study.

(2024)

BACKGROUND: Cross-neurotype differences in social communication patterns contribute to high unemployment rates among adults with autism. Adults with autism can be unsuccessful in job searches or terminated from employment due to mismatches between their social attention behaviors and society's expectations on workplace communication. OBJECTIVE: We propose a behavioral intervention concerning distribution of attention in triadic (three-way) conversations. Specifically, the objective is to determine whether providing personalized feedback to each individual with autism based on an analysis of their attention distribution behavior during an initial conversation session would cause them to modify their orientation behavior in a subsequent conversation session. METHODS: Our system uses an unobtrusive head orientation estimation model to track the focus of attention of each individual. Head orientation sequences from a conversation session are analyzed based on five statistical domains (eg, maximum exclusion duration and average contact duration) representing different types of attention distribution behavior. An intervention is provided to a participant if they exceeded the nonautistic average for that behavior by at least 2 SDs. The intervention uses data analysis and video modeling along with a constructive discussion about the targeted behaviors. Twenty-four individuals with autism and no intellectual disabilities participated in the study. The participants were divided into test and control groups of 12 participants each. RESULTS: Based on their attention distribution behavior in the initial conversation session, 11 of the 12 participants in the test group received an intervention in at least one domain. Of the 11 participants who received the intervention, 10 showed improvement in at least one domain on which they received feedback. Independent t tests for larger test groups (df>15) confirmed that the group improvements are statistically significant compared with the corresponding controls (P<.05). Crawford-Howell t tests confirmed that 78% of the interventions resulted in significant improvements when compared individually against corresponding controls (P<.05). Additional t tests comparing the first conversation sessions of the test and control groups and comparing the first and second conversation sessions of the control group resulted in nonsignificant differences, pointing to the intervention being the main effect behind the behavioral changes displayed by the test group, as opposed to confounding effects or group differences. CONCLUSIONS: Our proposed behavioral intervention offers a useful framework for practicing social attention behavior in multiparty conversations that are common in social and professional settings.
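
The two statistical rules mentioned in this abstract, the 2 SD intervention threshold and the Crawford-Howell single-case t test, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the use of the sample standard deviation is an assumption, and the abstract does not describe how the authors implemented either step.

```python
import math
import statistics


def exceeds_threshold(value, reference_values, n_sd=2.0):
    """True if value is at least n_sd sample SDs above the reference mean."""
    mean = statistics.mean(reference_values)
    sd = statistics.stdev(reference_values)
    return value >= mean + n_sd * sd


def crawford_howell_t(case_value, control_values):
    """Crawford-Howell t statistic for one case versus a small control sample.

    Returns (t, degrees_of_freedom), where the degrees of freedom are n - 1.
    """
    n = len(control_values)
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)
    t = (case_value - mean) / (sd * math.sqrt((n + 1) / n))
    return t, n - 1
```

The `sqrt((n + 1) / n)` factor widens the denominator to account for the control mean and SD being estimated from a small sample, which is what distinguishes this test from a plain z score.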

Genome-wide detection of somatic mosaicism at short tandem repeats.

(2024)

MOTIVATION: Somatic mosaicism has been implicated in several developmental disorders, cancers, and other diseases. Short tandem repeats (STRs) consist of repeated sequences of 1-6 bp and comprise >1 million loci in the human genome. Somatic mosaicism at STRs is known to play a key role in the pathogenicity of loci implicated in repeat expansion disorders and is highly prevalent in cancers exhibiting microsatellite instability. While a variety of tools have been developed to genotype germline variation at STRs, a method for systematically identifying mosaic STRs is lacking. RESULTS: We introduce prancSTR, a novel method for detecting mosaic STRs from individual high-throughput sequencing datasets. prancSTR is designed to detect loci characterized by a single high-frequency mosaic allele, but can also detect loci with multiple mosaic alleles. Unlike many existing mosaicism detection methods for other variant types, prancSTR does not require a matched control sample as input. We show that prancSTR accurately identifies mosaic STRs in simulated data, demonstrate its feasibility by identifying candidate mosaic STRs in Illumina whole genome sequencing data derived from lymphoblastoid cell lines for individuals sequenced by the 1000 Genomes Project, and evaluate the use of prancSTR on Element and PacBio data. In addition to prancSTR, we present simTR, a novel simulation framework which simulates raw sequencing reads with realistic error profiles at STRs. AVAILABILITY AND IMPLEMENTATION: prancSTR and simTR are freely available at https://github.com/gymrek-lab/trtools. Detailed documentation is available at https://trtools.readthedocs.io/.

Pangenome comparison of Bacteroides fragilis genomospecies unveils genetic diversity and ecological insights

(2024)

Bacteroides fragilis is a Gram-negative commensal bacterium commonly found in the human colon, which differentiates into two genomospecies termed divisions I and II. Through a comprehensive collection of 694 B. fragilis whole genome sequences, we identify novel features distinguishing these divisions. Our study reveals a distinct geographic distribution with division I strains predominantly found in North America and division II strains in Asia. Additionally, division II strains are more frequently associated with bloodstream infections, suggesting a distinct pathogenic potential. We report differences between the two divisions in gene abundance related to metabolism, virulence, stress response, and colonization strategies. Notably, division II strains harbor more antimicrobial resistance (AMR) genes than division I strains. These findings offer new insights into the functional roles of division I and II strains, indicating specialized niches within the intestine and potential pathogenic roles in extraintestinal sites.

Importance

Understanding the distinct functions of microbial species in the gut microbiome is crucial for deciphering their impact on human health. Classifying division II strains as Bacteroides fragilis can lead to erroneous associations, as researchers may mistakenly attribute characteristics observed in division II strains to the more extensively studied division I B. fragilis. Our findings underscore the necessity of recognizing these divisions as separate species with distinct functions. We unveil new findings of differential gene prevalence between division I and II strains in genes associated with intestinal colonization and survival strategies, potentially influencing their role as gut commensals and their pathogenicity in extraintestinal sites. Despite the significant niche overlap and colonization patterns between these groups, our study highlights the complex dynamics that govern strain distribution and behavior, emphasizing the need for a nuanced understanding of these microorganisms.

House dust metagenome and pulmonary function in a US farming population.

(2024)

BACKGROUND: Chronic exposure to microorganisms inside homes can impact respiratory health. Few studies have used advanced sequencing methods to examine adult respiratory outcomes, especially continuous measures. We aimed to identify metagenomic profiles in house dust related to the quantitative traits of pulmonary function and airway inflammation in adults. Microbial communities, 1264 species (389 genera), in vacuumed bedroom dust from 779 homes in a US cohort were characterized by whole metagenome shotgun sequencing. We examined two overall microbial diversity measures: richness (the number of individual microbial species) and Shannon index (reflecting both richness and relative abundance). To identify specific differentially abundant genera, we applied the Lasso estimator with high-dimensional inference methods, a novel framework for analyzing microbiome data in relation to continuous traits after accounting for all taxa examined together. RESULTS: Pulmonary function measures (forced expiratory volume in one second (FEV1), forced vital capacity (FVC), and FEV1/FVC ratio) were not associated with overall dust microbial diversity. However, many individual microbial genera were differentially abundant (p-value < 0.05 controlling for all other microbial taxa examined) in relation to FEV1, FVC, or FEV1/FVC. Similarly, fractional exhaled nitric oxide (FeNO), a marker of airway inflammation, was unrelated to overall microbial diversity but associated with differential abundance for many individual genera. Several genera, including Limosilactobacillus, were associated with a pulmonary function measure and FeNO, while others, including Moraxella (with FEV1/FVC) and Stenotrophomonas (with FeNO), were associated with a single trait. CONCLUSIONS: Using state-of-the-art metagenomic sequencing, we identified specific microorganisms in indoor dust related to pulmonary function and airway inflammation. Some were previously associated with respiratory conditions; others were novel, suggesting specific environmental microbial components contribute to various respiratory outcomes. The methods used are applicable to studying microbiome in relation to other continuous outcomes.
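
The two diversity measures defined in this abstract can be computed from a vector of per-species counts. The sketch below is a minimal illustration; the function names are hypothetical, and the natural logarithm is an assumption because the abstract does not state which log base was used.

```python
import math


def richness(counts):
    """Number of species observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over species with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h
```

A sample with four equally abundant species has richness 4 and a Shannon index of ln(4); if one species dominates, richness is unchanged while the Shannon index falls, which is why the two measures are reported separately.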

HyperGen: Compact and Efficient Genome Sketching using Hyperdimensional Vectors.

(2024)

MOTIVATION: Genomic distance estimation is a critical workload since exact computation for whole-genome similarity metrics such as Average Nucleotide Identity (ANI) incurs prohibitive runtime overhead. Genome sketching is a fast and memory-efficient solution to estimate ANI similarity by distilling representative k-mers from the original sequences. In this work, we present HyperGen that improves accuracy, runtime performance, and memory efficiency for large-scale ANI estimation. Unlike existing genome sketching algorithms that convert large genome files into discrete k-mer hashes, HyperGen leverages the emerging hyperdimensional computing (HDC) paradigm to encode genomes into quasi-orthogonal vectors (Hypervector, HV) in high-dimensional space. HV is compact and can preserve more information, allowing for accurate ANI estimation while reducing required sketch sizes. In particular, the HV sketch representation in HyperGen allows efficient ANI estimation using vector multiplication, which naturally benefits from highly optimized general matrix multiply (GEMM) routines. As a result, HyperGen enables efficient sketching and ANI estimation for massive genome collections. RESULTS: We evaluate HyperGen's sketching and database search performance using several genome datasets at various scales. HyperGen is able to achieve comparable or superior ANI estimation error and linearity compared to other sketch-based counterparts. The measurement results show that HyperGen is one of the fastest tools for both genome sketching and database search. Meanwhile, HyperGen produces memory-efficient sketch files while ensuring high ANI estimation accuracy. AVAILABILITY: A Rust implementation of HyperGen is freely available under the MIT license as an open-source software project at https://github.com/wh-xu/Hyper-Gen. The scripts to reproduce the experimental results can be accessed at https://github.com/wh-xu/experiment-hyper-gen.
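
The idea of replacing discrete k-mer hashes with one aggregated hypervector can be illustrated with a toy example. The sketch below is not HyperGen's implementation (which is written in Rust and uses its own encoding and sampling scheme); the dimension, k-mer length, SHA-256 seeding, and all function names are assumptions chosen for illustration. Each distinct k-mer is mapped to a pseudo-random vector of +1/-1 entries, the vectors are summed into a single sketch, and dot products between sketches approximate set sizes and intersections. A Jaccard estimate and the Mash-style conversion to ANI follow from those quantities.

```python
import hashlib
import math
import random

DIM = 2048  # hypervector dimension (illustrative assumption)
K = 5       # k-mer length (illustrative assumption; real tools use larger k)


def kmers(seq, k=K):
    """Set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_hv(kmer, dim=DIM):
    """Deterministic pseudo-random +1/-1 vector for one k-mer."""
    seed = int.from_bytes(hashlib.sha256(kmer.encode()).digest()[:8], "big")
    bits = random.Random(seed).getrandbits(dim)
    return [1 if (bits >> i) & 1 else -1 for i in range(dim)]


def sketch(seq, k=K, dim=DIM):
    """Sum the hypervectors of all distinct k-mers into one sketch vector."""
    hv = [0] * dim
    for km in kmers(seq, k):
        for i, v in enumerate(kmer_hv(km, dim)):
            hv[i] += v
    return hv


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def estimate_jaccard(a, b):
    """Approximate Jaccard similarity of two k-mer sets from their sketches."""
    dim = len(a)
    inter = dot(a, b) / dim   # approximates |A intersect B|
    size_a = dot(a, a) / dim  # approximates |A|
    size_b = dot(b, b) / dim  # approximates |B|
    union = size_a + size_b - inter
    j = inter / union if union > 0 else 0.0
    return min(1.0, max(0.0, j))


def ani_from_jaccard(j, k=K):
    """Mash-style conversion: ANI ~ 1 + ln(2j / (1 + j)) / k."""
    if j <= 0:
        return 0.0
    return 1.0 + math.log(2 * j / (1 + j)) / k
```

Because random +1/-1 vectors are nearly orthogonal in high dimensions, the cross terms in each dot product average out and the estimates tighten as the dimension grows. Stacking many sketches as rows of a matrix would turn a database search into a single matrix product, which is the connection to GEMM routines drawn in the abstract.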

LongTR: genome-wide profiling of genetic variation at tandem repeats from long reads.

(2024)

Tandem repeats are frequent across the human genome, and variation in repeat length has been linked to a variety of traits. Recent improvements in long read sequencing technologies have the potential to greatly improve tandem repeat analysis, especially for long or complex repeats. Here, we introduce LongTR, which accurately genotypes tandem repeats from high-fidelity long reads available from both PacBio and Oxford Nanopore Technologies. LongTR is freely available at https://github.com/gymrek-lab/longtr and https://zenodo.org/doi/10.5281/zenodo.11403979 .

Integrating clinical research into electronic health record workflows to support a learning health system.

(2024)

OBJECTIVE: Integrating clinical research into routine clinical care workflows within electronic health record systems (EHRs) can be challenging, expensive, and labor-intensive. This case study presents a large-scale clinical research project conducted entirely within a commercial EHR during the COVID-19 pandemic. CASE REPORT: The UCSD and UCSDH COVID-19 NeutraliZing Antibody Project (ZAP) aimed to evaluate antibody levels to SARS-CoV-2 virus in a large population at an academic medical center and examine the association between antibody levels and subsequent infection diagnosis. RESULTS: The project rapidly and successfully enrolled and consented over 2000 participants, integrating the research trial with standing COVID-19 testing operations, staff, lab, and mobile applications. EHR integration increased enrollment, ease of scheduling, survey distribution, and return of research results at a low cost by utilizing existing resources. CONCLUSION: The case study highlights the potential benefits of EHR-integrated clinical research, including expanding its reach across multiple health systems and facilitating rapid learning during a global health crisis.