Computational Genetic Approaches for the Dissection of Complex Traits
- Furlotte, Nicholas A.
- Advisor(s): Eskin, Eleazar
Abstract
Over the past two decades, major technological innovations have transformed the field of genetics, allowing researchers to examine the relationship between genetic and phenotypic variation at an unprecedented level of granularity. As a result, genetics has increasingly become a data-driven science, demanding effective statistical procedures and efficient computational methods and necessitating a new interface that some refer to as computational genetics. In this dissertation, I focus on a few problems existing within this interface. First, I introduce a method for calculating gene coexpression in a way that is robust to statistical confounding introduced through expression heterogeneity. Heterogeneity in experimental conditions causes separate microarrays to be more correlated than expected by chance. This additional correlation between arrays induces correlation between gene expression measurements, in effect causing spurious gene coexpression. By formulating the problem of calculating coexpression in a linear mixed-model framework, I show how it is possible to account for the correlation between microarrays and produce coexpression values that are robust to expression heterogeneity. Second, I introduce a meta-analysis technique that allows genome-wide association studies to be combined across populations that are known to contain population structure. This development was motivated by a specific problem in mouse genetics, the aim of which is to utilize multiple mouse association studies jointly. I show that by combining the studies using meta-analysis, while accounting for population structure, the proposed method achieves increased statistical power and increased association resolution. Third, I introduce a computational and statistical procedure for performing genome-wide association using longitudinal measurements. I show that by accounting for the genetic and environmental correlation between measurements originating from the same individual, it is possible to increase association power. Finally, I introduce a statistical and computational construct called the matrix-variate linear mixed-model (mvLMM), which is used for multiple phenotype genome-wide association. I show how the application of this method results in increased association power over single trait mapping and leads to a dramatic reduction in computational time over classical multiple phenotype optimization procedures. For example, where a classically-based approach takes hours to perform parameter optimization for moderate sample sizes, mvLMM takes minutes. This technique is both a generalization of and an improvement on the previously proposed longitudinal analysis technique, and its innovation has the potential to impact many current problems in the field of computational genetics.