Machine-learning-based prediction models for high-need high-cost patients using nationwide clinical and claims data.
Published Web Location: https://doi.org/10.1038/s41746-020-00354-8
High-need, high-cost (HNHC) patients, usually defined as those in the top 5% of annual healthcare costs, account for as much as half of total healthcare costs. Accurately predicting future HNHC patients and designing targeted interventions for them could help control rapidly growing healthcare expenditures. To this end, we used a nationally representative random sample of the working-age population who underwent a screening program in Japan in 2013-2016, and developed five machine-learning-based models to predict which patients would be HNHC in the subsequent year. Predictors included demographics, blood pressure, laboratory tests (e.g., HbA1c, LDL-C, and AST), survey responses (e.g., smoking status, medications, and past medical history), and annual healthcare cost in the prior year. Our prediction models for HNHC patients, which combined clinical data from the national screening program with claims data, achieved a c-statistic of 0.84 (95% CI, 0.83-0.86) and outperformed traditional prediction models relying only on claims data.
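For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch of how a top-5% cost label and a c-statistic could be computed. It is not the authors' code: the function names `label_top_fraction` and `c_statistic`, the toy costs, and the toy risk scores are all hypothetical, and the c-statistic is computed by brute-force pair counting rather than by any particular library.

```python
def label_top_fraction(costs, fraction=0.05):
    """Flag the highest-cost `fraction` of patients as HNHC (1), others as 0."""
    n_top = max(1, round(len(costs) * fraction))
    # Patient indices ordered by annual cost, highest first.
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    top = set(order[:n_top])
    return [1 if i in top else 0 for i in range(len(costs))]


def c_statistic(labels, scores):
    """Share of (HNHC, non-HNHC) pairs in which the HNHC patient has the
    higher predicted risk; tied scores count as half a concordant pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Toy data (hypothetical): 20 patients with annual costs 100, 200, ..., 2000.
toy_costs = list(range(100, 2100, 100))
toy_labels = label_top_fraction(toy_costs)  # only the costliest patient is flagged

# Toy risk scores (hypothetical) for four patients, two of them HNHC.
toy_c = c_statistic([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs concordant
```

A c-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.84 should be read.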