eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Previously Published Works

Human chromosomal-scale length variation and severity of COVID-19 infection using the UK Biobank dataset

Abstract

Introduction

The course of COVID-19 in patients varies from asymptomatic to severe (acute respiratory distress, cytokine storm, and death). The basis for this range of symptoms is unknown. One possibility is that genetic variation is responsible for the highly variable response to infection. We evaluated how well a genetic risk score, based on chromosome-scale length variation and machine learning classification algorithms, could predict the severity of response to SARS-CoV-2 infection.

Methods

We compared 981 patients from the UK Biobank dataset who had a severe reaction to SARS-CoV-2 infection before 27 April 2020 to a similar number of age-matched patients drawn from the general UK Biobank population. For each patient, we built a profile of 88 numbers characterizing the chromosome-scale length variability of their germline DNA. Each number represented one quarter of one of the 22 autosomes (22 × 4 = 88). We used the machine learning algorithm XGBoost to build a classifier that could predict whether a person would have a severe reaction to COVID-19 based only on their 88-number profile.
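As a minimal sketch of how such an 88-number profile could be indexed, the snippet below maps each (autosome, quarter) pair to a position in a flat feature vector. The chromosome-major ordering and the names `feature_labels` and `feature_index` are assumptions made for illustration; the abstract does not state how the authors ordered or computed the values.

```python
from typing import List, Tuple

N_AUTOSOMES = 22               # chromosomes 1..22, as stated in the abstract
QUARTERS_PER_CHROMOSOME = 4    # each autosome is split into four quarters


def feature_labels() -> List[Tuple[int, int]]:
    """Return the (chromosome, quarter) pair for each profile position.

    Assumed ordering: chromosome-major, i.e. chr1 q1..q4, chr2 q1..q4, ...
    """
    return [
        (chrom, quarter)
        for chrom in range(1, N_AUTOSOMES + 1)
        for quarter in range(1, QUARTERS_PER_CHROMOSOME + 1)
    ]


def feature_index(chrom: int, quarter: int) -> int:
    """Return the 0-based position of a (chromosome, quarter) pair."""
    if not 1 <= chrom <= N_AUTOSOMES:
        raise ValueError("chrom must be in 1..22")
    if not 1 <= quarter <= QUARTERS_PER_CHROMOSOME:
        raise ValueError("quarter must be in 1..4")
    return (chrom - 1) * QUARTERS_PER_CHROMOSOME + (quarter - 1)


labels = feature_labels()
```

With this layout, a classifier input for one patient is simply a list of 88 floats whose position `feature_index(c, q)` holds the length-variation value for quarter `q` of chromosome `c`.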

Results

We found that the XGBoost classifier could differentiate between the two classes at a statistically significant level: p = 2 · 10 as measured against a randomized control, and p = 3 · 10 as measured against the expected value of a random guessing algorithm (AUC = 0.5). However, the AUC of the classifier was only 0.51, too low for a clinically useful test.
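To illustrate the kind of comparison described here, the sketch below computes an AUC from classifier scores and estimates a p-value by shuffling the class labels. It is a simplified stand-in, not the authors' procedure: it shuffles labels against fixed scores rather than retraining a classifier on each randomized dataset, and the function names and the default of 1000 permutations are assumptions.

```python
import random
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    scores higher than a randomly chosen negative (label 0); ties count 1/2.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(
    scores: Sequence[float],
    labels: Sequence[int],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """One-sided p-value: how often shuffled labels match or beat the observed AUC."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if auc(scores, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_permutations + 1)
```

An AUC of 0.5 corresponds to random guessing, so a value of 0.51 can be statistically distinguishable from chance in a large sample while still offering almost no practical discrimination between individuals.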

Conclusion
