eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Capturing hidden covariates with linear factor models and other statistical methods in differential gene expression and expression quantitative trait locus studies

Abstract

This work aims to provide value to three types of readers. First, for students in statistics, psychology, and the social sciences, I provide a summary and review of three classical statistical methods: factor analysis, principal component analysis (PCA), and probabilistic PCA (PPCA), all of which are linear factor models. These methods are widely used in many fields, including psychology, education, and computational biology, and are the cornerstones of many newer, more complicated methods. However, most available materials about them are either decades old (and so very long and written in old-style notation) or cursory. This work provides current coverage of them that is in-depth yet concise.

Second, for new computational biologists who are unfamiliar with differential gene expression (DE) analysis and quantitative trait locus (QTL) analysis, in particular expression quantitative trait locus (eQTL) analysis, I provide an introduction to both DE and eQTL analysis from a statistical perspective, with an emphasis on analysis with hidden covariates. I avoid unnecessary jargon and aim for this material to be accessible to those without much background in biology.

Third, for computational biologists and geneticists who need to work with newly developed computational methods such as surrogate variable analysis (SVA), probabilistic estimation of expression residuals (PEER), and hidden covariates with prior (HCP), I document these methods in a unified framework and explore their connections to classical methods such as factor analysis and PCA. To the best of my knowledge, such a precise and in-depth review of SVA, PEER, and HCP is not currently available elsewhere in the literature.

In short, this work aspires to be a useful reference manual for students and researchers working with linear factor models or newly developed methods for capturing hidden covariates in DE or eQTL analysis.
