Engineering DNA Polymerases to Synthesize Unnatural Genetic Polymers for Applications in Synthetic Biology
- Author(s): Nikoomanzar, Ali
- Advisor(s): Chaput, John C
- et al.
Polymerases are an ancient family of enzymes that synthesize long strands of DNA and RNA. These enzymes have found practical value in many biotechnology applications due to their ability to copy and amplify long stretches of DNA with high fidelity and efficiency. In recent years, engineered versions of naturally occurring DNA polymerases have been developed that can synthesize unnatural genetic polymers (also referred to as xenonucleic acids or XNAs) with reasonable fidelity and primer-extension efficiency. One analog, α-L-threofuranosyl-(3’→2’) nucleic acid (TNA), is capable of forming antiparallel Watson-Crick duplex structures with complementary strands of DNA, RNA, and TNA, which is remarkable considering the structural differences between TNA and natural genetic polymers.
In this dissertation, we first developed methods and tools for purifying and measuring the activity of laboratory-evolved XNA polymerases. Next, robust pipelines for the directed evolution of a wide variety of nucleic acid-modifying enzymes (polymerases, ligases, and restriction endonucleases) for synthetic biology were created and validated in a high-throughput fashion. Finally, new approaches toward molecular evolution were applied to evolve two unique TNA synthetases with improved catalytic efficiency and substrate specificity over their wild-type counterparts.
Chapter 1 begins with a review of polymerase function and structure, illustrating the latest techniques that have been used to answer fundamental questions about the mechanism of DNA synthesis. We next examine several techniques that have been applied to engineer polymerases with desired functional properties. Specifically, we focus on droplet-based optical polymerase sorting (DrOPS), an advanced microfluidic screening technology used in Chapters 4-6 for evolving new XNA synthetases.
Chapter 2 describes an affinity chromatography-based polymerase purification protocol for studying the catalytic properties of thermophilic polymerases expressed as recombinant proteins in E. coli. This protocol was essential for the work presented in Chapters 3, 5, and 6, since many of the polymerases needed to synthesize XNA polymers are not commercially available and researchers must therefore express and purify these enzymes in-house. We discovered that exploiting the thermophilic nature of our recombinantly expressed polymerases by increasing the heat treatment step to 80°C for a full 60 minutes could remove > 90% of contaminating E. coli proteins. Furthermore, by placing the sample on ice for a full 30 minutes, we allowed maximum precipitation and aggregation to occur and, following ultracentrifugation, obtained lysate of > 95% purity. Previously reported protocols did not take full advantage of this property of the polymerase, which is curious given that the half-life of Kod polymerase is 12 hours at 95°C. The protocol details the steps needed to express, purify, and evaluate the activity of engineered polymerases with altered substrate recognition properties and produces ~20 mg of pure, nuclease-free polymerase per liter of E. coli culture.
Chapter 3 describes a highly parallel, low-cost method for measuring the average rate and substrate specificity of XNA polymerases in a standard qPCR instrument. This assay, termed polymerase kinetic profiling (PKPro), involves monitoring XNA synthesis on a self-priming template using high-resolution melting (HRM) fluorescent dyes that intercalate into the growing duplex as the template strand is copied into XNA. Mechanistically, HRM dyes function by intercalating between the base pairs of dsDNA and also by binding along the phosphate backbone via electrostatic interactions and groove binding. This presents synthetic biology researchers with a unique challenge; namely, the altered sugar-phosphate backbone structures of XNA/DNA heteroduplexes may not interact with commercially available HRM fluorescent dyes in the same manner as natural DNA/DNA homoduplexes. We discovered through our empirical analysis that using a non-optimal HRM dye for a given XNA system could lead to drastically different conclusions regarding the true rate of XNA synthesis. Using PKPro, we benchmarked for the first time three engineered polymerases capable of synthesizing non-cognate (RNA) or unnatural (2’-fluoroarabino nucleic acid (FANA) and TNA) genetic polymers and compared their substrate specificity ratios to determine the extent of their molecular recognition properties. On the basis of these results, we suggest that PKPro provides a powerful tool for evaluating the activity of XNA polymerases.
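To make the idea of a substrate specificity ratio concrete, the following is a minimal sketch rather than the PKPro analysis itself. It assumes that an average rate can be approximated as the template length divided by the time taken for the fluorescence signal to plateau, and that the specificity ratio is the quotient of two such rates. The function names and all numbers are hypothetical and are not taken from the dissertation.

```python
def average_rate(template_length_nt, time_to_plateau_s):
    """Average synthesis rate in nucleotides per second.

    Assumption: the fluorescence plateau marks full-length extension of
    the self-priming template, so rate = length / time.
    """
    if time_to_plateau_s <= 0:
        raise ValueError("time_to_plateau_s must be positive")
    return template_length_nt / time_to_plateau_s


def specificity_ratio(rate_substrate_a, rate_substrate_b):
    """Ratio of two average rates measured for the same polymerase."""
    if rate_substrate_b <= 0:
        raise ValueError("rate_substrate_b must be positive")
    return rate_substrate_a / rate_substrate_b


# Hypothetical example: one polymerase, one template, two substrates.
dna_rate = average_rate(template_length_nt=60, time_to_plateau_s=30.0)
tna_rate = average_rate(template_length_nt=60, time_to_plateau_s=600.0)
ratio = specificity_ratio(tna_rate, dna_rate)
print(f"TNA/DNA specificity ratio: {ratio:.3f}")
```

A ratio below 1 under these assumptions would indicate that the enzyme still prefers the natural substrate; a ratio above 1 would indicate a preference for the unnatural one.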
Chapter 4 describes the construction, validation, and application of a fluorescence-activated droplet sorting (FADS) instrument that was established to evolve enzymes for synthesizing and modifying XNAs. The microfluidic system enables droplet sorting at ∼2–3 kHz using fluorescent sensors that are responsive to enzymatic activity. The device was built as a dedicated instrument for directed evolution experiments, exploiting the superior stability of single-emulsion droplets over the double-emulsion droplets typically required for conventional fluorescence-activated cell sorting (FACS) instruments. This device was used extensively in Chapters 5 and 6 for screening large libraries of polymerase variants for TNA synthesis capabilities and for discovering mutations that repurpose the enzyme active site to nearly invert substrate specificity, with the eventual goal of evolving a TNA specialist. Furthermore, the custom nature of the FADS instrument allowed us to quickly adjust sensor designs or protein expression protocols in order to tune our selection stringencies to produce the desired outcome. Indeed, future protein engineers would do well to construct a device similar in design to ours in order to reduce technical bottlenecks in studying protein structure-function relationships.
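The sorting rate quoted above translates directly into screening capacity. The short calculation below is an illustrative sketch only: the 2 kHz and 3 kHz rates come from the text, while the library size, the coverage factor, and the function names are hypothetical, and the calculation ignores droplet occupancy and instrument downtime.

```python
SECONDS_PER_HOUR = 3600


def droplets_screened(sort_rate_hz, hours):
    """Number of droplets interrogated at a constant sorting rate."""
    return sort_rate_hz * SECONDS_PER_HOUR * hours


def hours_to_screen(library_size, sort_rate_hz, coverage=1.0):
    """Hours needed to screen a library at a given fold-coverage.

    Assumption: every droplet carries exactly one library member.
    """
    if sort_rate_hz <= 0:
        raise ValueError("sort_rate_hz must be positive")
    return library_size * coverage / (sort_rate_hz * SECONDS_PER_HOUR)


for rate_hz in (2000, 3000):
    per_hour = droplets_screened(rate_hz, hours=1)
    print(f"{rate_hz} Hz -> {per_hour:,} droplets per hour")

# Hypothetical library of one million variants screened at 10-fold coverage.
print(f"{hours_to_screen(1_000_000, 2000, coverage=10):.2f} h at 2 kHz")
```

Under these simplifying assumptions, a rate of 2–3 kHz corresponds to roughly 7 to 11 million droplets per hour.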
The ability to rapidly interrogate large swaths of protein sequence space to discover the molecular determinants of substrate specificity relies on efficient methods for performing selections in high throughput as well as library creation strategies that produce the requisite genetic diversity amongst individual clones. Furthermore, once these sites are elucidated, rapidly discovering the optimal amino acid sequence for a given unnatural function requires the construction of even more focused combinatorial saturation mutagenesis libraries. Chapter 5 describes a high-throughput microfluidic-based approach for mapping sequence–function relationships that combines DrOPS with deep mutational scanning (DMS). We applied this strategy to map the finger subdomain of a replicative DNA polymerase isolated from Thermococcus kodakarensis (Kod). From a single round of sorting, we discovered two cases of positive epistasis and demonstrated the near inversion of substrate specificity in a double mutant variant. This effort indicates that polymerase specificity may be governed by a small number of highly specific residues that can be elucidated by DMS without the need for iterative rounds of directed evolution, which contrasts markedly with previous polymerase engineering endeavors that required anywhere from 3 to 18 rounds of selection to discover efficient unnatural synthetases.
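As an illustration of how deep mutational scanning data can reveal positive epistasis, the sketch below uses one common convention, which is not necessarily the scoring scheme used in Chapter 5. It assumes that a variant's score is the log2 ratio of its frequency in the sorted pool to its frequency in the naive pool, and that epistasis is the deviation of the double mutant's score from the sum of the two single-mutant scores. All read counts and variant names are invented for the example.

```python
import math


def log2_enrichment(sorted_count, sorted_total, naive_count, naive_total,
                    pseudocount=0.5):
    """log2 of (frequency after sorting / frequency before sorting).

    The pseudocount (an assumption of this sketch) avoids division by
    zero for variants with no reads in one of the pools.
    """
    sorted_freq = (sorted_count + pseudocount) / (sorted_total + pseudocount)
    naive_freq = (naive_count + pseudocount) / (naive_total + pseudocount)
    return math.log2(sorted_freq / naive_freq)


def epistasis(score_double, score_single_a, score_single_b):
    """Deviation from additivity; a positive value is positive epistasis."""
    return score_double - (score_single_a + score_single_b)


# Hypothetical read counts: (sorted_count, naive_count) per variant.
SORTED_TOTAL, NAIVE_TOTAL = 100_000, 100_000
counts = {"mutA": (40, 20), "mutB": (60, 20), "mutA+mutB": (640, 20)}
scores = {
    name: log2_enrichment(s, SORTED_TOTAL, n, NAIVE_TOTAL)
    for name, (s, n) in counts.items()
}
eps = epistasis(scores["mutA+mutB"], scores["mutA"], scores["mutB"])
print(f"epistasis = {eps:.2f} (positive if > 0)")
```

With these invented counts the double mutant is enriched more than the two single mutants combined, which is the signature of positive epistasis under this definition.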
Chapter 6 describes a programmed allelic mutagenesis (PAM) strategy to comprehensively evaluate all possible single-point mutations in the entire catalytic domain of a replicative DNA polymerase. Most DNA polymerase libraries sample unknown portions of mutational space and are constrained by the limitations of random mutagenesis. By applying the PAM strategy with ultrafast high-throughput screening, we demonstrated how DNA polymerases could be mapped for allelic mutations that exhibit enhanced activity for unnatural nucleic acid substrates. We provided the first sequence–function map of an entire polymerase domain, which highlights the importance and flexibility of the thumb subdomain for modulating unnatural polymerase activity. Using our method, we discovered two mutations in the thumb subdomain, located in unstructured loop regions both proximal and distal to the polymerase core, that had a dramatic impact on the synthesis of the unnatural congener TNA. These residues would not have been predicted using standard rational design approaches and are good examples of the types of discoveries possible with our selection technique (DrOPS) and library construction method (PAM). Furthermore, by leveraging the power of long-read sequencing (PacBio), we have found a way to quickly determine fitness peaks from degenerate combinatorial site-saturation libraries. At a minimum, we predict that our discoveries will allow fellow protein engineers to quickly “re-engineer” previously discovered enzymes using a single round of selection and clone construction. Such methods bring us that much closer to building designer, custom-built proteins for biotechnology.
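To give a sense of the scale of a library covering all possible single-point mutations, the sketch below enumerates every single amino acid substitution of a short protein sequence. It is an illustration only and does not reproduce the PAM library design: the toy sequence, the mutation labelling scheme, and the function name are hypothetical, and the enumeration is done at the protein level rather than at the codon level.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues


def single_point_mutants(sequence, first_position=1):
    """Yield (label, mutant_sequence) for every single substitution.

    Labels follow the common wild-type/position/mutant form, e.g. "M1A".
    """
    for offset, wild_type in enumerate(sequence):
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the wild-type residue at this position
            mutant = sequence[:offset] + residue + sequence[offset + 1:]
            yield f"{wild_type}{first_position + offset}{residue}", mutant


toy_sequence = "MKV"  # hypothetical three-residue stand-in for a domain
mutants = list(single_point_mutants(toy_sequence))
print(len(mutants))  # 19 substitutions per position
```

Because each position contributes 19 substitutions, a domain of N residues yields 19 × N single mutants, which is why such a library calls for a high-throughput screen.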