Purpose
To develop models for progression of nonproliferative diabetic retinopathy (NPDR) to proliferative diabetic retinopathy (PDR) and determine if incorporating updated information improves model performance.
Design
Retrospective cohort study.
Participants
Electronic health record (EHR) data from a tertiary academic center, University of California San Francisco (UCSF), and a safety-net hospital, Zuckerberg San Francisco General (ZSFG) Hospital, were used to identify patients aged ≥ 18 years with a diagnosis of NPDR, a diagnosis of type 1 or 2 diabetes mellitus, ≥ 6 months of ophthalmology follow-up, and no diagnosis of PDR before the index date (date of first NPDR diagnosis in the EHR).
Methods
Four survival models were developed: Cox proportional hazards, Cox with backward selection, Cox with LASSO regression, and random survival forest. For each model, three variable sets were compared to determine the impact of including updated clinical information: Static0 (data up to the index date), Static6m (data updated 6 months after the index date), and Dynamic (data in Static0 plus changes in the data during the 6-month period). The UCSF data were split into 80% training and 20% testing (internal validation). The ZSFG data were used for external validation. Model performance was evaluated using Harrell's concordance index (C-index).
Main outcome measures
Time to PDR.
Results
The UCSF cohort included 1130 patients, of whom 92 (8.1%) progressed to PDR. The ZSFG cohort included 687 patients, of whom 30 (4.4%) progressed to PDR. All models performed similarly (C-indices ∼ 0.70) in internal validation. The random survival forest with the Static6m set performed best in external validation (C-index 0.76). Insurance and age were selected or ranked as highly important by all models. Other key predictors were NPDR severity, diabetic neuropathy, number of strokes, mean hemoglobin A1c, and number of hospital admissions.
Conclusions
Our models for progression of NPDR to PDR achieved acceptable predictive performance and validated well in an external setting. Updating the baseline variables with new clinical information did not consistently improve the predictive performance.
Financial disclosures
Proprietary or commercial disclosure may be found after the references.
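
Illustrative note on the evaluation metric
The Methods name Harrell's concordance index as the performance measure. As a reading aid only, the following is a minimal sketch of how that index is commonly computed for right-censored data; it is not the study's code. The function name, argument names, and the simplified handling of tied event times (such pairs are skipped) are assumptions made for illustration.

```python
def harrell_c_index(times, events, risks):
    """Sketch of Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per subject
    events : 1 if the event (e.g. progression) was observed, 0 if censored
    risks  : predicted risk score per subject (higher = earlier expected event)

    Assumption for illustration: pairs with identical times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored subject cannot be the earlier member of a
            # comparable pair, because the true ordering is unknown.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk had the earlier event
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not drawn from the study.
toy_times = [2, 4, 6, 8]
toy_events = [1, 0, 1, 0]
toy_risks = [0.9, 0.3, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering of risk and 1.0 to perfect ordering, which gives context for the reported C-indices of about 0.70 and 0.76.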