eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Statistical Methods for Genome-Wide Association Studies on Biobank Data

Abstract

Genome-wide association studies (GWAS) are an important area of statistical genetics. They seek to identify single-nucleotide polymorphisms (SNPs) that are associated with a trait of interest. Large-scale resources of patient data, such as biobanks, that combine genetic data with phenotype data from electronic health records (EHR) are increasingly available to researchers. New GWAS techniques are needed to handle both the large sample sizes and the complex data types these resources generate. The first chapter addresses both issues by establishing an efficient method for conducting a genome-wide scan for SNPs associated with ordinal traits, which commonly arise from phenotyping algorithms for complex diseases. The second chapter focuses on estimating the effects of covariates on intra-individual variances in a framework that can scale to big longitudinal data. Within-subject variances of traits such as blood pressure have been found to be risk factors, independent of mean levels, for a variety of conditions such as cardiovascular disease. We develop a weighted method of moments (MoM) framework for fitting a mixed effects location-scale model that is robust to distributional assumptions and computationally tractable for biobank-sized data sets. The third chapter uses the framework from the second chapter to develop and conduct large-scale GWAS that identify variants associated with intra-individual variability of longitudinal traits. Across all of these projects, a main focus is ensuring that the methods scale to the large sample sizes common in biobank data sets.
