In all known cellular organisms, from the smallest bacteria to the tallest of trees, DNA stores the instructions for maintenance, growth and, crucially, reproduction that have allowed life to proliferate to the far reaches of our planet. When a cell divides to form two equivalent cells, it must first copy its genome using an enzyme known as a replicative DNA polymerase. This protein functions as part of a replisome, which is a group of specialized proteins that work in concert to faithfully duplicate DNA molecules several million bases in length. As a graduate student, I studied the atomic-level structure, biochemical properties and evolution of the replicative polymerase, and some of its helper proteins, in the replisome of the Gram-negative γ-proteobacterium Escherichia coli. Interestingly, although some replisome proteins are universally conserved, the replicative polymerases themselves are not, and those from Bacteria appear, on the basis of sequence and structure, unrelated to those from the two other domains of life, Eukaryota and Archaea. This lack of homology is evidence of the convergent evolution of two different types of polymerases and suggests that bacterial DNA replication may be a promising target for antibiotic development. Within bacteria, there are two main types of replicative polymerases: the PolC homologs and the DnaE1-pol homologs. PolC homologs appear to be limited to the low-GC Gram-positive bacteria that comprise the phyla Firmicutes and Tenericutes, while nearly all others use a DnaE1-pol.
In this thesis, I align a diverse set of DnaE1-pol homologs and identify among them two main versions, one like the polymerase from E. coli (Eco-like) and the other like that of Thermus aquaticus (Taq-like). Using a second alignment, I observe that the Eco-like polymerase almost always appears with an editing protein, DnaQ-exonuclease, which removes mismatch errors during DNA synthesis. Guided by these sequence data, I analyze previously solved structures of the E. coli and T. aquaticus polymerases and place the sequence and structure interpretations in the context of known bacterial phylogeny. Taq-like DnaE1-pol, which is a constitutively active polymerase that contains DNA-editing exonuclease activity in its PHP domain, is the ancestral version and, based on its presence in cyanobacteria, likely dates back roughly 3,500 million years, nearly to the origin of life on Earth. In contrast, Eco-like DnaE1-pol can adopt a distorted conformation incompatible with DNA synthesis and has a PHP domain that lacks editing activity and, instead, binds a DnaQ-exonuclease for editing in trans. Because mitochondria are descended from an α-proteobacterium, the Eco-like polymerase predates the emergence of eukaryotes more than 1,500 million years ago.
In addition to identifying the two main types of DnaE1-pol, I describe significant progress toward crystal structures of the E. coli replicative polymerase in complex with DNA. Furthermore, I demonstrate the precise structural conservation of the PHP domain in this polymerase by restoring metal binding to it using only three point mutations.