Background
Although electronic health records (EHR) have significant potential for the study of opioid use disorders (OUD), detecting OUD in clinical data is challenging. Models using EHR data to predict OUD often rely on case/control classifications focused on extreme opioid use. There is a need to expand this work to characterize the spectrum of problematic opioid use.
Methods
Using a large academic medical center database, we developed two data-driven methods of OUD detection: (1) a Comorbidity Score developed from a Phenome-Wide Association Study of phenotypes associated with OUD and (2) a Text-based Score using natural language processing to identify OUD-related concepts in clinical notes. We evaluated the performance of both scores against manual review using correlation coefficients, Wilcoxon rank sum tests, and the area under the receiver operating characteristic curve (AUC). Records with the highest Comorbidity and Text-based scores were re-evaluated by manual review to explore discrepancies.
Results
Both the Comorbidity and Text-based OUD risk scores were significantly elevated in the patients judged as High Evidence for OUD in the manual review compared to those with No Evidence (p = 1.3E-5 and 1.3E-6, respectively). The risk scores were positively correlated with each other (rho = 0.52, p < 0.001). AUCs for the Comorbidity and Text-based scores were high (0.79 and 0.76, respectively). Follow-up manual review of discrepant findings revealed strengths of data-driven methods over manual review, and opportunities for improvement in risk assessment.
Conclusion
Risk scores comprising comorbidities and text offer differing but synergistic insights into characterizing problematic opioid use. This pilot project establishes a foundation for more robust work in the future.
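
The evaluation statistics named in the Methods (rank correlation between the two scores, and the AUC, which is a rescaled Wilcoxon rank sum statistic) can be illustrated with a minimal sketch. This is not the study's analysis code: the function names are hypothetical, the toy values are invented for illustration and are not study data, and p-values are deliberately not computed.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auc_from_rank_sum(scores, labels):
    """AUC as U / (n_pos * n_neg), with U taken from the rank sum of positives."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Invented toy data: 1 = "High Evidence", 0 = "No Evidence" on manual review.
    review_label = [1, 1, 1, 0, 0, 0]
    comorbidity_score = [0.9, 0.7, 0.4, 0.5, 0.2, 0.1]
    text_score = [0.8, 0.3, 0.6, 0.4, 0.2, 0.1]

    print("AUC (comorbidity):", auc_from_rank_sum(comorbidity_score, review_label))
    print("AUC (text):", auc_from_rank_sum(text_score, review_label))
    print("Spearman rho:", spearman_rho(comorbidity_score, text_score))
```

Computing the AUC from the rank sum makes explicit why the Wilcoxon test and the AUC tend to agree: both summarize how often a High Evidence record outranks a No Evidence record on a given score.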