Non-linear machine learning models incorporating SNPs and PRS improve polygenic prediction in diverse human populations
- Elgart, Michael;
- Lyons, Genevieve;
- Romero-Brufau, Santiago;
- Kurniansyah, Nuzulul;
- Brody, Jennifer A;
- Guo, Xiuqing;
- Lin, Henry J;
- Raffield, Laura;
- Gao, Yan;
- Chen, Han;
- de Vries, Paul;
- Lloyd-Jones, Donald M;
- Lange, Leslie A;
- Peloso, Gina M;
- Fornage, Myriam;
- Rotter, Jerome I;
- Rich, Stephen S;
- Morrison, Alanna C;
- Psaty, Bruce M;
- Levy, Daniel;
- Redline, Susan;
- Sofer, Tamar
- et al.
Abstract
Polygenic risk scores (PRS) are commonly used to quantify the inherited susceptibility to a trait, yet they fail to account for non-linear and interaction effects between single nucleotide polymorphisms (SNPs). We address this via a machine learning approach, validated in nine complex phenotypes in a multi-ancestry population. We use an ensemble method of SNP selection followed by gradient boosted trees (XGBoost) to allow for non-linearities and interaction effects. We compare our results to standard linear PRS models developed using PRSice, LDpred2, and lassosum2. Including a PRS as a feature in an XGBoost model increases the percentage variance explained, relative to the standard linear PRS model, by 22% for height, 27% for HDL cholesterol, 43% for body mass index, 50% for sleep duration, 58% for systolic blood pressure, 64% for total cholesterol, 66% for triglycerides, 77% for LDL cholesterol, and 100% for diastolic blood pressure. Multi-ancestry trained models perform similarly to models trained within specific racial/ethnic groups and are consistently superior to the standard linear PRS models. This work demonstrates an effective method to account for non-linearities and interaction effects in genetics-based prediction models.
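The core idea above (feed a linear PRS as one feature, alongside individual SNP genotypes, into a gradient boosted tree model so that SNP-SNP interactions can be learned) can be sketched on simulated data. This is a minimal illustration, not the authors' pipeline: the data, effect sizes, and the use of scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost are all assumptions for the sake of a self-contained example.

```python
# Sketch: linear PRS alone vs. boosted trees using PRS + SNPs as features.
# All data here are simulated; GradientBoostingRegressor stands in for XGBoost.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 2000, 50
G = rng.integers(0, 3, size=(n, p)).astype(float)   # genotype dosages (0/1/2)
w = rng.normal(0.0, 0.1, p)                         # toy per-SNP weights
prs = G @ w                                         # standard linear PRS

# Phenotype with a non-linear SNP-SNP interaction the linear PRS cannot capture
y = prs + 0.5 * G[:, 0] * G[:, 1] + rng.normal(0.0, 0.5, n)

X = np.column_stack([prs, G])                       # PRS as an extra feature
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

# Baseline: linear model on the PRS only (column 0)
lin = LinearRegression().fit(Xtr[:, [0]], ytr)
r2_lin = r2_score(yte, lin.predict(Xte[:, [0]]))

# Boosted trees on PRS + SNPs can recover the interaction term
gbt = GradientBoostingRegressor(random_state=0).fit(Xtr, ytr)
r2_gbt = r2_score(yte, gbt.predict(Xte))

print(f"linear PRS R^2: {r2_lin:.3f} | PRS+SNP boosted trees R^2: {r2_gbt:.3f}")
```

With an interaction term in the simulated phenotype, the tree model typically explains more test-set variance than the PRS-only linear baseline, mirroring the gains reported in the abstract.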