eScholarship
Open Access Publications from the University of California

UC San Francisco Electronic Theses and Dissertations

Combined host and microbial metagenomic next-generation sequencing: Applying integrated analysis approaches for a comprehensive evaluation of infectious disease response to inform diagnosis, surveillance, and treatment

Abstract

Infectious diseases are a leading cause of morbidity and mortality worldwide. Despite significant advances in our understanding of infectious disease biology, existing microbiologic diagnostic tests often fail to identify etiologic pathogens in cases of suspected infection. Metagenomic next-generation sequencing (mNGS) offers the potential for a universal pathogen detection method, but analysis and interpretation of findings are challenging. This is especially true for lower respiratory tract infections (LRTIs), where mNGS data interpretation is complicated by the existence of a respiratory microbiome composed of pathobionts present in both health and disease.

To address the need for improved LRTI diagnostics, we first compared two fluid types commonly used for diagnosis of LRTI, showing that despite moderate microbiome differences, both mini-bronchoalveolar lavage (mBAL) and tracheal aspirate (TA) samples are suitable for identifying pathogens in the context of an infection. We then evaluated the utility of mNGS as an LRTI diagnostic in a cohort of 92 TA samples from adults with acute respiratory failure. We developed methods for distinguishing putative pathogens from commensal microbiota, along with pathogen, microbiome diversity, and host gene expression metrics, to identify LRTI-positive patients and differentiate them from critically ill controls with noninfectious acute respiratory illnesses. Finally, we applied the models developed for evaluating LRTI status to several other cohorts and disease contexts to show their broad applicability.

The low sensitivity of existing clinical diagnostics results in an imperfect gold standard, complicating the development of mNGS-based biomarkers. We explored the impact of label noise on host gene expression classifiers and methods for circumventing the issue. First, we tested whether label-noise-robust logistic regression approaches could improve classifier performance by enabling the use of a larger training set. Then, we tested whether variational autoencoders, an unsupervised dimensionality reduction approach, could generate novel insight from combined host and microbial mNGS data. Altogether, this work suggests that a single streamlined protocol offering an integrated genomic portrait of pathogen, microbiome, and host transcriptome may hold promise as a tool for diagnosing infections and contextualizing patient response.
