- Karabayir, Ibrahim
- Butler, Liam
- Goldman, Samuel M
- Kamaleswaran, Rishikesan
- Gunturkun, Fatma
- Davis, Robert L
- Ross, G Webster
- Petrovitch, Helen
- Masaki, Kamal
- Tanner, Caroline M
- Tsivgoulis, Georgios
- Alexandrov, Andrei V
- Chinthala, Lokesh K
- Akbilgic, Oguz
Background
Parkinson's disease (PD) is a chronic, disabling neurodegenerative disorder.
Objective
To predict a future diagnosis of PD using questionnaires and simple non-invasive clinical tests.
Methods
Participants in the prospective Kuakini Honolulu-Asia Aging Study (HAAS) were evaluated biannually between 1995 and 2017 by PD experts using standard diagnostic criteria. Autopsies were sought on all deaths. We input simple clinical and risk factor variables into an ensemble tree-based machine learning algorithm and derived models to predict the probability of developing PD. We also investigated relationships between the predictive models and neuropathologic features such as nigral neuron density.
Results
The study sample included 292 subjects, 25 of whom developed PD within 3 years and 41 within 5 years. Of the 251 subjects not diagnosed with PD, 116 (46%) underwent autopsy. Light Gradient Boosting Machine modeling of 12 predictors correctly classified a high proportion of individuals who developed PD within 3 years (area under the curve [AUC] 0.82, 95% CI 0.76-0.89) or 5 years (AUC 0.77, 95% CI 0.71-0.84). A large proportion of controls who were misclassified as PD had Lewy pathology at autopsy, including 79% of those who died within 3 years. PD probability estimates correlated inversely with nigral neuron density, and the correlation was strongest in autopsies conducted within 3 years of the index date (r = -0.57, p < 0.01).
Conclusion
Machine learning can identify persons likely to develop PD during the prodromal period using questionnaires and simple non-invasive tests. Correlation with neuropathology suggests that true model accuracy may be considerably higher than estimates based solely on clinical diagnosis.
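
The Results report each AUC with a 95% CI. The abstract does not say how those intervals were computed, so the following is only an illustrative sketch of one common approach, a percentile bootstrap around a rank-based AUC. The function names (`auc`, `bootstrap_auc_ci`) are hypothetical, and the scores are synthetic. Only the sample size (292) and the number of 3-year cases (25) come from the text.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[round((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)]
    return lo, hi


# Synthetic risk scores: 25 cases and 267 non-cases (counts from the abstract;
# the score distributions are invented purely for illustration).
rng = random.Random(42)
labels = [1] * 25 + [0] * 267
scores = [rng.gauss(1.0, 1.0) if y else rng.gauss(0.0, 1.0) for y in labels]

point = auc(labels, scores)
lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

With only 25 cases, the interval width is driven mostly by the small positive class, which is one reason interval estimates matter more than the point AUC in a sample of this size.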