Applications of Cross-Validated Genetic Predictions from Linear Mixed Models and Other Results in Statistical Genetics
In this document I present statistical methods for the analysis of human genetic data. The methods are based on the predictions made by linear mixed models (LMMs). The key result is that vectors of out-of-sample predictions from an LMM, here named cvBLUPs, can be calculated efficiently and then used in novel applications. One application is as adjustment covariates in association studies, where they control for population structure and boost power as in LMM analyses, but within an efficient linear regression framework. cvBLUPs can also be interpreted as reference-free polygenic risk scores. Using this interpretation, I develop a method for trans-eQTL analysis in which eQTLs are identified by gene-based tests of association between cvBLUPs for expression at candidate regulatory genes and actual expression at other, remote genes. Analytic results are shown for the efficient calculation of effect size estimates and genetic predictions from sparse and multivariate extensions of the linear mixed model. In simulations, cross-validated predictions (CVPs) from sparse models are shown to boost power even more than cvBLUPs in association tests for traits with sparse genetic architectures. Finally, the results of an association study on a multi-phenotype metabolomic data set are presented.
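To give a concrete sense of how out-of-sample LMM predictions can be obtained without refitting the model once per individual, the following is a minimal sketch. It relies on the standard leave-one-out identity for linear smoothers and is not necessarily the derivation used in this work. It assumes a model with no fixed effects, a known genetic relationship matrix `K`, and a known variance ratio `delta` (residual variance divided by genetic variance). The function names `loo_blup` and `brute_force_loo` are hypothetical and introduced only for illustration.

```python
import numpy as np


def loo_blup(K, y, delta):
    """Leave-one-out BLUPs from a single fit (illustrative sketch).

    Assumes y = g + e with g ~ N(0, sg2 * K), e ~ N(0, se2 * I),
    delta = se2 / sg2 known, and no fixed effects.
    """
    n = K.shape[0]
    V = K + delta * np.eye(n)
    # H = K V^{-1}; K and V are symmetric, so (V^{-1} K)^T = K V^{-1}.
    H = np.linalg.solve(V, K).T
    blup = H @ y          # in-sample BLUPs
    h = np.diag(H)        # leverage of each individual on its own prediction
    # Standard leave-one-out shortcut for linear smoothers.
    return (blup - h * y) / (1.0 - h)


def brute_force_loo(K, y, delta):
    """Reference version: refit with each individual held out in turn."""
    n = K.shape[0]
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        V_sub = K[np.ix_(keep, keep)] + delta * np.eye(n - 1)
        out[i] = K[i, keep] @ np.linalg.solve(V_sub, y[keep])
    return out


if __name__ == "__main__":
    # Simulated toy data: 60 individuals, 300 standardized markers.
    rng = np.random.default_rng(0)
    Z = rng.standard_normal((60, 300))
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    K = Z @ Z.T / Z.shape[1]
    y = rng.standard_normal(60)
    fast = loo_blup(K, y, delta=1.0)
    slow = brute_force_loo(K, y, delta=1.0)
    print("max abs difference:", np.max(np.abs(fast - slow)))
```

The shortcut needs one linear solve with the full covariance matrix, whereas the brute-force version needs one solve per individual. In practice the variance ratio would be estimated rather than assumed, and fixed-effect covariates would need to be accounted for.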