eScholarship
Open Access Publications from the University of California

UC San Diego

Methods for the analysis of human genetic variation in the search for the genetic basis of human disease

Abstract

Recent technological advances in molecular biology have ushered in the genome-wide association era of human genetics. Researchers can now simultaneously examine hundreds of thousands of single nucleotide polymorphisms (SNPs) in an individual at continually decreasing cost. In an effort to characterize the distribution of SNPs in human populations, a set of four million SNPs was collected in 269 individuals from four populations. This HapMap data set, in combination with high-throughput genotyping technology, has caused a fundamental shift in the methodologies of scientists searching for the relationship between genotype and phenotype. The genome-wide association study (GWAS) has become mainstream practice, leading to the discovery of a growing number of loci associated with the genetic basis of complex phenotypes, including many human diseases. This work describes novel methods, resources, tools, and techniques designed to improve our ability to interpret and utilize GWAS and HapMap data. The Weighted Haplotype (WHAP) association method leverages the linkage structure information from the HapMap to improve GWAS power by providing accurate statistics for unobserved SNPs without the cost of additional genotyping. The SAT-based tagging algorithm SATTagger identifies which SNPs to genotype as part of an association study and provides the first optimal genome-wide solution to this classic bioinformatics problem. The HapMap suffers from the fundamental limitation that at most 60 unrelated individuals are available per population. An analytical framework for analyzing the implications of a finite-sample HapMap is presented. The results of the first round of GWAS showed that effect sizes of causal variants were small and that larger sample sizes were required for adequate power. Meta-analysis provides a mechanism for overcoming this problem with the cost of additional genotyping. A new statistic for imputation-based meta-analysis in a GWAS is given.
Additional research is presented on MHC Class II binding prediction, a useful tool in understanding autoimmune and pathogenic diseases. A physics-based binding model is presented with an EM-like solution to find the optimal binding conformation.
