Objective
Depression is a prevalent disorder difficult to diagnose and treat. In particular, depressed patients exhibit largely unpredictable responses to treatment. Toward the goal of personalizing treatment for depression, we develop and evaluate computational models that use electronic health record (EHR) data for predicting the diagnosis and severity of depression, and response to treatment.Materials and methods
We develop regression-based models for predicting depression, its severity, and response to treatment from EHR data, using structured diagnosis and medication codes as well as free-text clinical reports. We used two datasets: 35,000 patients (5000 depressed) from the Palo Alto Medical Foundation and 5651 patients treated for depression from the Group Health Research Institute.Results
Our models are able to predict a future diagnosis of depression up to 12 months in advance (area under the receiver operating characteristic curve (AUC) 0.70-0.80). We can differentiate patients with severe baseline depression from those with minimal or mild baseline depression (AUC 0.72). Baseline depression severity was the strongest predictor of treatment response for medication and psychotherapy.Conclusions
It is possible to use EHR data to predict a diagnosis of depression up to 12 months in advance and to differentiate between extreme baseline levels of depression. The models use commonly available data on diagnosis, medication, and clinical progress notes, making them easily portable. The ability to automatically determine severity can facilitate assembly of large patient cohorts with similar severity from multiple sites, which may enable elucidation of the moderators of treatment response in the future.