- Goyal, Jatin;
- Ng, Ding Quan;
- Zhang, Kevin;
- Chan, Alexandre;
- Lee, Joyce;
- Zheng, Kai;
- Hurley-Kim, Keri;
- Nguyen, Lee;
- He, Lu;
- Nguyen, Megan;
- McBane, Sarah;
- Li, Wei;
- Cadiz, Christine Luu
Introduction
Adverse drug events (ADEs) are associated with poor outcomes and increased costs but may be prevented with prediction tools. Using the National Institutes of Health All of Us (AoU) database, we employed machine learning (ML) to predict selective serotonin reuptake inhibitor (SSRI)-associated bleeding.
Methods
The AoU program, which began in 05/2018, continues to recruit individuals aged ≥ 18 years across the United States. Participants completed surveys and consented to contribute electronic health record (EHR) data for research. Using the EHR, we identified participants who were exposed to SSRIs (citalopram, escitalopram, fluoxetine, fluvoxamine, paroxetine, sertraline, vortioxetine). Features (n = 88) were selected with clinicians' input and comprised sociodemographic, lifestyle, comorbidity, and medication use information. We identified bleeding events with validated EHR algorithms and applied logistic regression, decision tree, random forest, and extreme gradient boosting to predict bleeding during SSRI exposure. We assessed model performance with the area under the receiver operating characteristic curve (AUC) and defined clinically significant features as those whose removal from the model resulted in a > 0.01 decline in AUC in three of the four ML models.
Results
There were 10,362 participants exposed to SSRIs, of whom 9.6% experienced a bleeding event during SSRI exposure. For each SSRI, performance across all four ML models was relatively consistent. AUCs from the best models ranged from 0.632 to 0.698. Clinically significant features included health literacy for escitalopram, and bleeding history and socioeconomic status for all SSRIs.
Conclusions
We demonstrated the feasibility of predicting ADEs using ML. Incorporating genomic features and drug interactions with deep learning models may improve ADE prediction.
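The feature-significance criterion described in the Methods (a > 0.01 decline in AUC after removing a feature, in three of four ML models) could be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names, the model and feature labels, and all numbers are hypothetical, and "three of four" is assumed here to mean "at least three". The AUC helper uses the standard rank-based (Mann-Whitney) formulation.

```python
from typing import Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def significant_features(
    full_auc: Dict[str, float],
    ablated_auc: Dict[str, Dict[str, float]],
    threshold: float = 0.01,  # AUC decline stated in the Methods
    min_models: int = 3,      # assumption: "at least three" of the four models
) -> List[str]:
    """Return features whose removal lowers AUC by more than `threshold`
    in at least `min_models` models.

    full_auc:    model name -> AUC with all features
    ablated_auc: feature name -> (model name -> AUC with that feature removed)
    """
    selected = []
    for feature, per_model in ablated_auc.items():
        n_declined = sum(
            1
            for model, full in full_auc.items()
            if full - per_model[model] > threshold
        )
        if n_declined >= min_models:
            selected.append(feature)
    return selected


# Hypothetical AUCs for illustration only; these are NOT study results.
full = {"logreg": 0.70, "tree": 0.66, "forest": 0.69, "xgb": 0.70}
ablated = {
    "feature_a": {"logreg": 0.68, "tree": 0.64, "forest": 0.67, "xgb": 0.695},
    "feature_b": {"logreg": 0.699, "tree": 0.655, "forest": 0.688, "xgb": 0.68},
}
flagged = significant_features(full, ablated)
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In this toy input, removing `feature_a` lowers AUC by more than 0.01 in three models, so it is flagged; removing `feature_b` does so in only one model, so it is not. In practice each ablated AUC would come from refitting the model without the feature and scoring it on held-out data.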