eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Previously Published Works

Whole-genome demography of COVID-19 virus during its pandemic period and on panvalent vaccine design.

Abstract

With over 16 million submitted genomic sequences, the SARS-CoV-2 (SC2) virus, the cause of the most recent worldwide COVID-19 pandemic, has become the most sequenced of all known viruses, revealing, for example, a vast number of expanding viral lineages. Since the pandemic phase appears to be over, we performed a retrospective re-examination of the demographic grouping pattern of the virus and the genomic characteristics of its groups during the entire pandemic period up to the peak of the last pandemic wave. For our study, we extracted only unique viral sequences from the NCBI and converted each sequence to a relational vector indicating the presence or absence of each variational event relative to a reference sequence. Our study revealed several genomic features that are unexpected or that differ from those reported in previous studies. For example, approximately 44,000 variants with unique sequences emerged during the pandemic period. They fall into only four major viral-genomic groups, each with a set of mostly unique highly-conserved variant-genotypes (HCVGs). A small set of HCVGs from the first (ancestral) group was inherited by the three descendant groups, suggesting that the HCVGs of the next group may be predictable from those of the current group(s). Such a concept may be important in designing panvalent vaccines against current and future waves of viral infection.
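The presence/absence encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the sequences are already aligned to the reference and of equal length, it treats only single-base substitutions as variational events, and the names `call_variants` and `encode_presence_absence` are hypothetical.

```python
from typing import List, Set, Tuple

# A variational event: (1-based position, reference base, observed base).
# Assumption: substitutions only; insertions and deletions are not modelled.
Variant = Tuple[int, str, str]


def call_variants(reference: str, sequence: str) -> Set[Variant]:
    """Return the substitutions that distinguish `sequence` from `reference`."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be pre-aligned to the reference")
    return {
        (i + 1, ref_base, obs_base)
        for i, (ref_base, obs_base) in enumerate(zip(reference, sequence))
        if obs_base != ref_base and obs_base != "N"  # skip ambiguous bases
    }


def encode_presence_absence(
    reference: str, sequences: List[str]
) -> Tuple[List[Variant], List[List[int]]]:
    """Encode unique sequences as 0/1 vectors over all observed events."""
    unique = list(dict.fromkeys(sequences))  # drop duplicates, keep order
    per_sequence = [call_variants(reference, s) for s in unique]
    events = sorted(set().union(*per_sequence))  # shared column order
    vectors = [[1 if e in found else 0 for e in events] for found in per_sequence]
    return events, vectors


# Toy data, invented for illustration only.
reference = "ACGTAC"
sequences = ["ACGTAC", "ACTTAC", "ACTTAA", "ACTTAC"]
events, vectors = encode_presence_absence(reference, sequences)
```

Each row of `vectors` corresponds to one unique sequence and each column to one event in `events`. Vectors of this form can then be compared or clustered, which is one plausible route to the kind of grouping the abstract reports.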

