UC San Diego
The Diminishing Role of Hybrid Genome Assemblies in Longitudinal and Automated Mutation Detection
- Author(s): Sales, Mia
- Advisor(s): Palsson, Bernhard O
- et al.
Hybrid genome assemblies, built from high-accuracy short reads and long reads that can span even large repeat regions, allow analysis of specific mutations that arise in longitudinal isolates of both clinical and lab-evolved samples. Presented here is an analysis of mutations that arise in a clinical isolate of methicillin-sensitive Staphylococcus aureus, Tx0117, which displays a high inoculum effect with cefazolin. Curing of the beta-lactamase gene created strain Tx0117c, which shows increased resistance to a variety of antimicrobial peptides. Analysis of the acquired mutations using breseq uncovered five additional genes affected by the curing process, which may contribute to the increased resistance observed. This thesis also includes a study on automating copy number variation detection in lab-evolved strains using the ALE system. A Python function that takes a map of coverage depth per base position as input is tested for its ability to accurately extract amplification events from only short-read data and a reference genome of the parent strain. The results are validated with an automated pipeline that uses hybrid genome assemblies of the evolved strains and blastn (version 2.9.0) to identify local alignments in the genomes. The sample sets analyzed suggest that short-read data alone is sufficient to identify amplifications by size, depth, and location. In addition, the new function allows efficient and accurate analysis of the flanking regions of amplifications and the genes they encode, providing insight into the mechanisms by which such events are acquired.
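As a rough illustration of the kind of depth-based detection described in the abstract, the following minimal sketch scans a map of coverage depth per base position and reports contiguous runs whose depth is well above the genome-wide median. It is not the thesis's implementation: the function name `find_amplifications`, the `min_fold` and `min_length` thresholds, and the use of the median as a baseline are all assumptions chosen for illustration.

```python
from statistics import median
from typing import Dict, List, Tuple


def find_amplifications(
    depth: Dict[int, int],
    min_fold: float = 1.8,    # hypothetical threshold: depth relative to baseline
    min_length: int = 1000,   # hypothetical minimum run length in bases
) -> List[Tuple[int, int, float]]:
    """Return (start, end, mean_fold) for contiguous elevated-depth runs.

    `depth` maps a 1-based reference position to its read depth.
    The baseline is the median depth over all positions (an assumption).
    """
    if not depth:
        return []
    baseline = median(depth.values())
    if baseline == 0:
        return []
    threshold = min_fold * baseline

    events: List[Tuple[int, int, float]] = []
    run_start = None   # first position of the currently open run, if any
    run_sum = 0        # summed depth over the open run
    prev = None        # previously visited position

    def close_run(end: int) -> None:
        # Record the open run if it is long enough to count as an event.
        length = end - run_start + 1
        if length >= min_length:
            events.append((run_start, end, run_sum / length / baseline))

    for pos in sorted(depth):
        elevated = depth[pos] >= threshold
        contiguous = prev is not None and pos == prev + 1
        # A run ends when depth drops or when positions are not adjacent.
        if run_start is not None and (not elevated or not contiguous):
            close_run(prev)
            run_start, run_sum = None, 0
        if elevated:
            if run_start is None:
                run_start = pos
            run_sum += depth[pos]
        prev = pos

    if run_start is not None:
        close_run(prev)
    return events


# Synthetic example: uniform 30x coverage with a 2 kb region at 90x.
example = {pos: 30 for pos in range(1, 10001)}
example.update({pos: 90 for pos in range(4001, 6001)})
print(find_amplifications(example))  # [(4001, 6000, 3.0)]
```

Each reported event carries a location (start and end), a size (end minus start plus one), and a relative depth, which correspond to the three properties the abstract says short-read data can recover. Real coverage is noisy, so a practical version would likely smooth the depth signal or merge nearby runs before thresholding.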