Uncovering the diversity of CRISPR-Cas systems
CRISPR has revolutionized the speed and efficiency of genome editing. This powerful tool originates from an adaptive immune system found in prokaryotes that protects against viruses and other invasive nucleic acids. In these systems, genetic memories of prior infections are transcribed into guide RNAs that program an interference complex, such as Cas9, to make a targeted DNA break, halting the infection. These systems are present in nearly half of all prokaryotes and are highly diverse. This dissertation focuses on the diversity of CRISPR-Cas systems, the evolutionary pressures that have led to these elegant immune systems, and applications resulting from this investigation.
Although thousands of Cas9 orthologs have been sequenced, biochemical and genome editing experiments have largely focused on a small subset of representatives. To investigate this diversity further, we conducted biochemical interrogation of various Cas9 orthologs and found that all Cas9 proteins tested are robust single-stranded DNA (ssDNA) cutters. Moreover, we found that many smaller orthologs had limited ability to interrogate double-stranded DNA (dsDNA), explaining their unsuccessful use for genome editing. In this process, we identified a new Cas9 variant from thermophilic environments, GeoCas9. We developed this ortholog for genome editing in human cells and found that it was more resistant to degradation in human plasma than the widely used Cas9 from S. pyogenes, expanding CRISPR applications to thermophilic hosts. Our newfound understanding of Cas9 diversity led to curiosity about what factors were driving the diversification of this protein. Anti-CRISPRs (Acrs) are small, virally encoded proteins that have evolved to inactivate CRISPR in this microbial arms race. We studied three different Acrs using biochemical, structural, and genome editing experiments. Our results showed that these three Acrs used distinct mechanisms to inactivate Cas9. In addition to allowing precise control of CRISPR-Cas9 in cells, these results provided a window into one of the driving forces steering Cas9 evolution.
We next turned our attention to metagenomic datasets, with the hypothesis that Cas9 was not the only single-effector CRISPR system. Searching through terabase-scale metagenomic data, we identified two new types of CRISPR systems that we called CasX and CasY (reclassified as Cas12e and Cas12d, respectively). These new systems included the most compact CRISPR systems described to date and provided new, streamlined proteins for genome editing with less restrictive sequence requirements. Continuing our search of metagenomic data, we were surprised to find eight new varieties of CRISPR proteins that pushed the limit of size even further, at just a third of the size of a typical CRISPR effector. We found that these systems, in addition to Cas12a, were targeted ssDNA shredders, a property that we developed into a robust platform for rapid genotyping at single-nucleotide resolution. Together, the investigation of CRISPR adaptive immunity described here gives insight into the pressures that have shaped these immune systems and in turn provides new tools that enable us to edit genomes, control CRISPR cutting, and diagnose disease.