A Pangenomics-enriched Analysis of Klebsiella pneumoniae
- Balasubramanian, Archana
- Advisor(s): Palsson, Bernhard O
Abstract
Bacterial taxonomy has evolved significantly from its early days, when classifications were based solely on phenotypic traits, to the modern era, in which whole genome sequences allow for comprehensive genetic analyses. This dissertation explores the development and application of advanced computational methods for pangenome analysis and their implications for understanding bacterial phylogeny, with a focus on Klebsiella pneumoniae.
Traditional taxonomy, which relied heavily on observable characteristics, laid the foundation for classifying organisms but lacked the precision needed to distinguish closely related strains. With the advent of genomic sequencing, methods such as multi-locus sequence typing (MLST) and the Clermont quadruplex have provided more nuanced insights. Whole genome sequencing (WGS) has further revolutionized the field by enabling the use of metrics like the MASH distance to assess genetic relatedness on a genome-wide scale.
A core component of this research is the construction and analysis of the pangenome matrix, P, where rows represent genes and columns represent genomes, with entries indicating the presence or absence of specific genes. This matrix serves as a powerful tool for applying mathematical and machine learning techniques to bacterial classification. The variably present genes in the accessory genome, analyzed through these methods, allow for the definition of "phylons"—groups of genes that characterize specific phylogroups.
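As a rough illustration of the matrix P described above, the following minimal sketch builds a presence/absence matrix from a toy collection of genomes. The strain names, gene identifiers, and the strict "present in every genome" definition of the core genome are assumptions made for illustration only; they are not taken from the dissertation.

```python
# Hypothetical toy input: genome name -> set of gene-family identifiers.
genomes = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g3", "g4", "g5"},
}

# Fix a row order (genes) and a column order (genomes).
genome_names = sorted(genomes)
gene_names = sorted(set().union(*genomes.values()))

# P[i][j] = 1 if gene i is present in genome j, else 0.
P = [[int(gene in genomes[name]) for name in genome_names] for gene in gene_names]

# Assumed simplification: a gene is "core" only if it occurs in every genome;
# genes found in some but not all genomes are treated as accessory.
n_genomes = len(genome_names)
core = [g for g, row in zip(gene_names, P) if sum(row) == n_genomes]
accessory = [g for g, row in zip(gene_names, P) if 0 < sum(row) < n_genomes]

for gene, row in zip(gene_names, P):
    print(gene, row)
```

In practice the gene families would come from clustering annotated genes across many genomes, and the core threshold is often relaxed to allow for assembly and annotation gaps.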
Our study presents a novel classification schema derived from the pangenome matrix, demonstrating its ability to mirror classical phylogroup definitions with remarkable accuracy. We employed Non-Negative Matrix Factorization (NMF) to decompose the pangenome matrix, revealing distinct phylogroups and sub-phylogroups in K. pneumoniae. The accessory genome’s gene distribution facilitated the identification of phylons, providing a genetic basis for differentiating strains. NMF segregated genes into phylons in concordance with traditional phylo-grouping methods, highlighting the genetic underpinnings of phenotypic traits.
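To make the decomposition step more concrete, the sketch below factors a small accessory-genome matrix as P ≈ W H with non-negative factors, using the standard multiplicative update rules for the Frobenius-norm objective. The toy matrix, the rank of 2, the iteration count, and the rule of assigning each gene to its highest-weight column of W are all assumptions for illustration; the abstract does not state which NMF algorithm, rank, or assignment rule the study used.

```python
import numpy as np


def nmf(P, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix P (genes x genomes) as W @ H, W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_genes, n_genomes = P.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_genomes)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        H *= (W.T @ P) / (W.T @ W @ H + eps)
        W *= (P @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical accessory matrix: 6 genes (rows) x 4 genomes (columns),
# built so that two gene sets each mark one group of genomes.
P = np.array(
    [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ],
    dtype=float,
)

W, H = nmf(P, rank=2)

# Assumed assignment rule: each gene goes to the component with its largest
# weight in W, and each genome to the component with its largest weight in H.
phylon_of_gene = W.argmax(axis=1)
phylon_of_genome = H.argmax(axis=0)
error = np.linalg.norm(P - W @ H)

print("gene -> component:", phylon_of_gene)
print("genome -> component:", phylon_of_genome)
print("reconstruction error:", error)
```

Here the columns of W play the role of candidate phylons (gene sets), while the rows of H indicate how strongly each genome carries each one.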
Further, we explored the genetic composition of rare genomes by analyzing the frequency and distribution of transposable elements (TEs). This analysis shed light on the impact of horizontal gene transfer (HGT) and the assimilation of TEs into bacterial genomes, offering insights into the evolutionary history and adaptability of K. pneumoniae strains.
To delve deeper into the functional capabilities of bacterial strains, we utilized Bactabolize, a tool for high-throughput genome-scale metabolic model construction. By applying this tool to K. pneumoniae, we generated strain-specific metabolic models and predicted growth phenotypes under various conditions using Flux Balance Analysis (FBA). This approach allowed us to identify essential genes and metabolic pathways, providing a comprehensive understanding of the strains’ metabolic capacities and potential vulnerabilities.
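The idea behind FBA can be sketched on a toy network: fluxes v are chosen to maximize a growth objective subject to the steady-state constraint S v = 0 and per-reaction bounds, and a reaction is called essential if blocking it drives the optimum to zero. The example below does not use Bactabolize or a real K. pneumoniae model; the three-reaction network, its bounds, and the helper name `max_growth` are hypothetical, and the linear program is solved with SciPy's general-purpose `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network with metabolites A and B:
#   R_uptake:  (outside) -> A
#   R_convert: A -> B
#   R_biomass: B -> (growth)
reactions = ["R_uptake", "R_convert", "R_biomass"]
S = np.array(
    [
        [1, -1, 0],  # metabolite A: made by uptake, used by conversion
        [0, 1, -1],  # metabolite B: made by conversion, used by biomass
    ],
    dtype=float,
)
bounds = [(0, 10), (0, 1000), (0, 1000)]  # assumed flux limits


def max_growth(S, bounds, objective_index):
    """Maximize one flux subject to S @ v = 0 and the given bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate to maximize
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return -res.fun if res.success else 0.0


wild_type = max_growth(S, bounds, objective_index=2)

# Simulate a knockout by forcing the flux through R_convert to zero.
ko_bounds = list(bounds)
ko_bounds[1] = (0, 0)
knockout = max_growth(S, ko_bounds, objective_index=2)

# Assumed criterion: essential if the knockout abolishes growth.
essential = knockout < 1e-6 * max(wild_type, 1.0)
print("wild-type growth flux:", wild_type)
print("growth flux without R_convert:", knockout)
print("R_convert essential:", essential)
```

A genome-scale model works the same way in principle, with thousands of reactions and gene-to-reaction rules deciding which bounds a gene deletion closes.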
The findings of this research have broad implications for bacterial taxonomy, evolutionary biology, and clinical microbiology. The ability to rapidly classify pathogens based on genomic data can significantly enhance diagnostic accuracy and treatment decisions in clinical settings. Moreover, understanding the genetic basis of bacterial traits paves the way for developing targeted therapeutic strategies to combat antibiotic resistance.