eScholarship
Open Access Publications from the University of California

Microbial species delineation using whole genome sequences

Author(s):
  • Kyrpides, Nikos
  • Mukherjee, Supratim
  • Ivanova, Natalia
  • Mavrommatis, Kostas
  • Pati, Amrita
  • Konstantinidis, Konstantinos
  • et al.
Abstract

Species assignment in prokaryotes currently relies on a manual, polyphasic approach that combines phenotypic traits with sequence information from phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform, and scalable approach that exploits the rich information in whole genome sequences is desirable, at least for the initial assignment of a species to an organism. We evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, and identified robust, widely applicable hard cutoffs on AF and gANI for species assignment. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species misassignments and facilitated species assignment for organisms without species definitions.
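
As a rough illustration of how hard cutoffs on gANI and AF could turn pairwise comparisons into species-level groups, the sketch below links two genomes whenever both values meet their cutoff and then reports the connected groups. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cluster_genomes`, the input layout, and the default cutoff values are hypothetical placeholders (the abstract does not state the cutoffs), and simple single-linkage grouping via union-find is swapped in for whatever clustering procedure the paper actually uses.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Placeholder cutoffs for illustration only; the abstract does not give values.
GANI_CUTOFF = 96.5  # percent identity (assumed)
AF_CUTOFF = 0.6     # fraction of the genome aligned (assumed)

# One pairwise comparison: (genome_a, genome_b, gANI, AF).
Pair = Tuple[str, str, float, float]


def cluster_genomes(
    genomes: Iterable[str],
    pairs: Iterable[Pair],
    gani_cutoff: float = GANI_CUTOFF,
    af_cutoff: float = AF_CUTOFF,
) -> List[Set[str]]:
    """Group genomes by single linkage over pairs that pass both cutoffs."""
    genomes = list(genomes)
    parent: Dict[str, str] = {g: g for g in genomes}

    def find(x: str) -> str:
        # Walk to the root of x's group, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, gani, af in pairs:
        # A pair is linked only if BOTH gANI and AF meet their cutoff.
        if gani >= gani_cutoff and af >= af_cutoff:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters: Dict[str, Set[str]] = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    # Sort clusters by their members so the output order is deterministic.
    return sorted(clusters.values(), key=lambda c: sorted(c))


if __name__ == "__main__":
    # Made-up values, purely to exercise the function.
    example_pairs = [
        ("A", "B", 98.2, 0.85),  # passes both cutoffs
        ("B", "C", 97.0, 0.70),  # passes both cutoffs
        ("C", "D", 99.0, 0.30),  # high gANI, but AF too low
        ("A", "D", 80.0, 0.90),  # high AF, but gANI too low
    ]
    print(cluster_genomes(["A", "B", "C", "D"], example_pairs))
```

With the made-up values in the usage section, genomes A, B, and C end up in one group and D stays on its own, because each of D's comparisons fails one of the two cutoffs. Single linkage can chain loosely related genomes together, so a stricter grouping rule may be preferable when cluster stability matters.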
