eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Statistical Methods for Inferring Correlation and Causation Between Genotypes and Phenotypes

Abstract

Genome-Wide Association Studies (GWAS) have identified many genetic variants that are associated with a variety of complex phenotypes, including anthropometric and lifestyle traits as well as complex diseases. It is unclear, however, which of these variants actually play causal roles in these phenotypes, as opposed to simply being correlated with the causal variants. It is also unclear through which intermediate mechanisms causal variants impact complex phenotypes, such as effects on gene expression, metabolites, the microbiome, or other related phenotypes. In this dissertation, I present computational and statistical methods for addressing these issues. These methods infer causal variants for complex phenotypes, link variants to intermediate gene expression phenotypes, and use genetic variants to determine the causal effect of intermediate phenotypes on downstream phenotypes.

Further exploring one such set of intermediate phenotypes, I present several methods for the analysis of metagenomic sequencing data. Metagenomics, the study of microbial genomes sequenced directly from their host environment, has revolutionized the study of microorganisms and illuminated their key roles in environmental function and dysfunction, including in human health and disease. However, it is challenging to determine which microbes are present and their relative abundances from sequencing data, due to incomplete genomic reference databases as well as errors in the sequencing reads themselves. I introduce several methods addressing this challenge, providing means to correct errors in sequencing reads and then to estimate the relative abundances of microbial taxa in the sequenced sample. I then explore several machine learning approaches for predicting human diseases based on inferred microbe abundance information.
