A linear mixed model approach to gene expression-tumor aneuploidy association studies.
Author(s):
- Yao, Douglas W
- Balanis, Nikolas G
- Eskin, Eleazar
- Graeber, Thomas G
- et al.
Published Web Location: https://doi.org/10.1038/s41598-019-48302-1
Aneuploidy, defined as abnormal chromosome number or somatic DNA copy number, is a characteristic of many aggressive tumors and is thought to drive tumorigenesis. Gene expression-aneuploidy association studies have previously been conducted to explore cellular mechanisms associated with aneuploidy. However, in an observational setting, gene expression is influenced by many factors that can act as confounders between gene expression and aneuploidy, leading to spurious correlations between the two variables. These factors include known confounders such as sample purity and batch effects, as well as gene co-regulation, which induces correlations between the expression of causal and non-causal genes. We use a linear mixed-effects model (LMM) to account for the confounding effects of tumor purity and gene co-regulation on gene expression-aneuploidy associations. When applied to patient tumor data across diverse tumor types, the LMM both accounts for the impact of purity on aneuploidy measurements and identifies a new association between histone gene expression and aneuploidy.
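To make the modeling idea concrete, the following is a minimal sketch of an LMM association test on simulated data. It is not the authors' implementation. It assumes, for illustration only, that a per-sample aneuploidy score is the response, that purity and one gene's expression are fixed effects, and that a random effect with covariance proportional to a sample-by-sample expression similarity matrix `K` stands in for gene co-regulation. The function name `fit_lmm`, the grid search over the variance ratio, and all simulated quantities are hypothetical.

```python
import numpy as np

def fit_lmm(y, X, K, log_delta_grid=np.linspace(-5, 5, 101)):
    """Fit y = X b + u + e, u ~ N(0, sg2*K), e ~ N(0, se2*I).

    Uses profile maximum likelihood over delta = se2/sg2 on a grid
    (an assumed, simple optimizer). Returns fixed-effect estimates,
    approximate standard errors, and the maximized log-likelihood.
    """
    n = y.shape[0]
    S, U = np.linalg.eigh(K)          # K = U diag(S) U^T
    S = np.clip(S, 0.0, None)         # guard against tiny negative eigenvalues
    yr, Xr = U.T @ y, U.T @ X         # rotate so the covariance becomes diagonal
    best = None
    for log_delta in log_delta_grid:
        d = S + np.exp(log_delta)     # diagonal of covariance divided by sg2
        w = 1.0 / d
        XtWX = Xr.T @ (Xr * w[:, None])
        beta = np.linalg.solve(XtWX, Xr.T @ (w * yr))   # generalized least squares
        resid = yr - Xr @ beta
        sg2 = np.sum(w * resid**2) / n                  # ML estimate of sg2
        ll = -0.5 * (n * np.log(2 * np.pi * sg2) + np.sum(np.log(d)) + n)
        if best is None or ll > best[0]:
            best = (ll, beta, sg2 * np.linalg.inv(XtWX))
    ll, beta, cov = best
    return beta, np.sqrt(np.diag(cov)), ll

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n_samples, n_genes = 120, 40
purity = rng.uniform(0.3, 1.0, n_samples)
# Expression is shifted by purity, so purity confounds expression and the response.
E = rng.standard_normal((n_samples, n_genes)) + purity[:, None]
E = (E - E.mean(axis=0)) / E.std(axis=0)      # standardize each gene
K = E @ E.T / n_genes                          # sample-by-sample expression similarity
aneuploidy = 0.5 * purity + 0.3 * E[:, 0] + rng.standard_normal(n_samples)

# Test gene 0 with purity as a fixed-effect covariate.
X = np.column_stack([np.ones(n_samples), purity, E[:, 0]])
beta, se, ll = fit_lmm(aneuploidy, X, K)
print("gene effect:", beta[2], "std. error:", se[2])
```

When `K` is the zero matrix the random effect vanishes and the estimates reduce to ordinary least squares, which gives a quick sanity check on the sketch. In practice the tested gene would typically be excluded from `K`, and a dedicated LMM package would replace the grid search.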