eScholarship
Open Access Publications from the University of California

UC San Francisco Electronic Theses and Dissertations

Methods for Mining Genome-Wide Phenotype Associations in a Functional Context

Abstract

Genome-wide association studies (GWAS) have linked various complex diseases to many dozens, sometimes hundreds, of individual genomic loci. Since these loci are generally of small effect and may lack both functional annotations and an obvious relation to other disease-associated regions, they are difficult to place in a functional context that advances our understanding of the disease. When considered only in isolation, disparate collections of trait associations provide little mechanistic insight. Thus, there is a pressing need for computational methods that use genome-wide molecular data to aggregate GWAS associations and extract functional insight from seemingly unrelated SNPs. To this end, we recently introduced Sherlock, a Bayesian method that detects gene-disease associations through pattern matching between eQTL results and the full set of GWAS loci (He et al., 2013). Here we review the Bayesian formulation and present Empirical Sherlock, a robust, parameter-free approach to detecting associations between various molecular functions and GWAS traits. It uses an empirically derived null distribution to associate subsets of GWAS loci, grouped by their relevance to a particular molecular function, with the GWAS trait. In addition to eQTL, the method generalizes readily to almost any genome-wide functional characterization (e.g., DNA methylation or transcription factor binding). By avoiding null distributions that assume a particular theoretical form for the input GWAS, Empirical Sherlock yields results that are resistant to false discovery due to inflation of the output p-values. The core method requires no tunable parameters or prior probabilities and, when used with eQTL, permits an adjustment for pleiotropic SNPs that control the expression of many genes. In addition, Empirical Sherlock calculates significance directly, avoiding the inherent limitations of permutation-based tests. As we demonstrate using moderately powered GWAS for Crohn's disease, type 2 diabetes, and schizophrenia, it detects gene associations that are either validated through better-powered GWAS or supported by independent literature.
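
To make the idea of computing significance directly from an empirical null more concrete, the following is a minimal sketch. It is not the thesis's algorithm. It assumes that each SNP is scored by -log10(p), that scores are discretized into bins, that the subset score is the sum over its SNPs, and that SNPs are independent (linkage disequilibrium is ignored). The function names and the `bin_width` parameter are hypothetical. Under those assumptions, the null distribution of a k-SNP subset score is the k-fold convolution of the genome-wide empirical score distribution, so the tail probability can be computed exactly rather than estimated by permutation.

```python
import math
from collections import Counter


def score_bin(p, bin_width):
    """Map a GWAS p-value in (0, 1] to an integer bin of its -log10(p) score."""
    return int(-math.log10(p) / bin_width)


def empirical_pmf(pvalues, bin_width):
    """Empirical probability mass function of binned scores over all SNPs."""
    bins = [score_bin(p, bin_width) for p in pvalues]
    counts = Counter(bins)
    pmf = [0.0] * (max(bins) + 1)
    for b, c in counts.items():
        pmf[b] = c / len(bins)
    return pmf


def convolve(a, b):
    """Distribution of the sum of two independent binned scores."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        if x == 0.0:
            continue
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def direct_pvalue(all_pvalues, subset_pvalues, bin_width=0.1):
    """Tail probability of the subset's summed score under the empirical null.

    Assumption (illustrative only): the null for a subset of k SNPs is k
    independent draws from the genome-wide empirical score distribution.
    """
    pmf = empirical_pmf(all_pvalues, bin_width)
    null = [1.0]  # distribution of an empty sum: all mass at bin 0
    for _ in subset_pvalues:
        null = convolve(null, pmf)
    observed = sum(score_bin(p, bin_width) for p in subset_pvalues)
    return sum(null[observed:])


if __name__ == "__main__":
    genome_wide = [0.5, 0.5, 0.5, 0.05]  # toy data, not real GWAS results
    print(direct_pvalue(genome_wide, [0.05, 0.05], bin_width=1.0))
```

Because the null is built from the input GWAS itself, a sketch like this makes no assumption about the theoretical form of the p-value distribution, and its resolution is not limited by a number of permutations. A realistic implementation would also need to handle correlated SNPs, which this toy version does not.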
