- Klann, Jeffrey G;
- Estiri, Hossein;
- Weber, Griffin M;
- Moal, Bertrand;
- Avillach, Paul;
- Hong, Chuan;
- Tan, Amelia LM;
- Beaulieu-Jones, Brett K;
- Castro, Victor;
- Maulhardt, Thomas;
- Geva, Alon;
- Malovini, Alberto;
- South, Andrew M;
- Visweswaran, Shyam;
- Morris, Michele;
- Samayamuthu, Malarkodi J;
- Omenn, Gilbert S;
- Ngiam, Kee Yuan;
- Mandl, Kenneth D;
- Boeker, Martin;
- Olson, Karen L;
- Mowery, Danielle L;
- Follett, Robert W;
- Hanauer, David A;
- Bellazzi, Riccardo;
- Moore, Jason H;
- Loh, Ne-Hooi Will;
- Bell, Douglas S;
- Wagholikar, Kavishwar B;
- Chiovato, Luca;
- Tibollo, Valentina;
- Rieg, Siegbert;
- Li, Anthony LLJ;
- Jouhet, Vianney;
- Schriver, Emily;
- Xia, Zongqi;
- Hutch, Meghan;
- Luo, Yuan;
- Kohane, Isaac S;
- The Consortium for Clinical Characterization of COVID-19 by EHR;
- Brat, Gabriel A;
- Murphy, Shawn N
Objective
The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity.
Materials and methods
Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site.
Results
The full 4CE severity phenotype had a pooled sensitivity of 0.73 and a pooled specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability (up to 0.65 across sites). At 1 pilot site, the expert-derived phenotype had a mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with precision and recall as low as 49% compared with chart review.
Discussion
We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions.
Conclusions
We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
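As an illustration of the validation metrics reported in the Results, the following minimal sketch shows how sensitivity and specificity could be computed at a single site when a binary phenotype flag is compared against the combined outcome of ICU admission and/or death. The function name, variable names, and example data are hypothetical and are not taken from the 4CE analysis code; the pooling across sites reported above is not reproduced here.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    flags: Iterable[bool], outcomes: Iterable[bool]
) -> Tuple[float, float]:
    """Compare a binary phenotype flag with a binary reference outcome.

    flags    -- True if the patient met the severity phenotype (hypothetical input)
    outcomes -- True if the patient was admitted to the ICU and/or died
    """
    tp = fp = tn = fn = 0
    for flagged, severe in zip(flags, outcomes):
        if flagged and severe:
            tp += 1  # phenotype positive, outcome positive
        elif flagged and not severe:
            fp += 1  # phenotype positive, outcome negative
        elif not flagged and severe:
            fn += 1  # phenotype negative, outcome positive
        else:
            tn += 1  # phenotype negative, outcome negative

    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up example data for six patients (illustration only).
example_flags = [True, True, False, False, True, False]
example_outcomes = [True, False, True, False, True, False]
sens, spec = sensitivity_specificity(example_flags, example_outcomes)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity here is the fraction of patients with the outcome whom the phenotype flags, and specificity is the fraction of patients without the outcome whom it leaves unflagged.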