Bacteria and archaea have evolved multiple defense pathways for protection from invading viruses and plasmids. A recently discovered adaptive immune system relies on specialized genomic loci called CRISPR (clustered regularly interspaced short palindromic repeats), which function together with CRISPR-associated (Cas) proteins to target foreign nucleic acids for degradation. A hallmark feature of CRISPR-Cas immune systems is the use of non-coding RNA transcribed from the CRISPR locus (crRNA) to identify foreign DNA via RNA:DNA base-pairing. Conserved families of Cas enzymes play critical roles both in producing crRNAs and in cleaving DNA sequences targeted by crRNA guides. This work describes the basic functions of two such endonucleases, with a focus on engineering these systems for biotechnological applications.
CRISPR loci are initially transcribed as long precursor crRNAs (pre-crRNAs), which must be enzymatically cleaved to generate libraries of mature crRNAs that each target a unique DNA sequence. This processing event typically occurs on the 3' side of a stable RNA stem-loop structure and is catalyzed by Cas6. We show that one Cas6 family member, Csy4, recognizes its RNA substrate with extremely high affinity and exquisite specificity. Binding energy derives exclusively from interactions upstream of the scissile phosphate, allowing Csy4 to retain the cleavage product and sequester the crRNA for subsequent ribonucleoprotein complex formation. Using biochemical assays and three protein-RNA co-crystal structures, we reveal the chemical mechanism of RNA cleavage by Csy4 and identify the roles of an unusual catalytic dyad comprising histidine and serine residues. Our experiments highlight diverse modes of substrate recognition that enable Csy4 to accurately select CRISPR transcripts for processing while avoiding off-target RNA binding and cleavage.
Following crRNA biogenesis, one or more Cas proteins form large ribonucleoprotein complexes with the crRNA and use its sequence to target complementary nucleic acids. Cas9 is a DNA endonuclease found in some bacteria that uses a dual-guide RNA comprising crRNA and trans-activating crRNA (tracrRNA) to identify target DNA sites for cleavage. We unravel the mechanism of DNA interrogation by Cas9:RNA complexes using both single-molecule and bulk biochemical experiments. The target search process is guided by recognition of a trinucleotide sequence called the protospacer adjacent motif (PAM), which lies adjacent to potential target sites, and PAM binding triggers Cas9 catalytic activity. We also present three-dimensional structures of Cas9 from X-ray crystallography and electron microscopy experiments, which reveal RNA/DNA binding interfaces and the organization of both catalytic domains. Strikingly, RNA binding drives large-scale rearrangements of the Cas9 enzyme to form a central DNA-binding channel. This observation implicates RNA loading as a key step in Cas9 activation.
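The PAM-guided search described above can be illustrated computationally with a minimal sketch. It only enumerates candidate sites in a sequence string and does not model the biochemistry of the search. The PAM pattern 5'-NGG-3' (the motif recognized by Streptococcus pyogenes Cas9) and the 20-nucleotide protospacer length are assumptions not stated in this abstract, and the function name `find_pam_sites` is hypothetical.

```python
import re


def find_pam_sites(dna, pam_regex="[ACGT]GG", guide_len=20):
    """List candidate target sites on the given strand.

    Assumptions (not taken from the abstract): the PAM is 5'-NGG-3' and lies
    immediately 3' of a 20-nt protospacer. Only the strand passed in is
    scanned; pass the reverse complement to search the other strand.

    Returns a list of (protospacer_start, protospacer, pam) tuples.
    """
    dna = dna.upper()
    sites = []
    # A zero-width lookahead lets overlapping PAMs (e.g. in "AGGG") all match.
    for match in re.finditer(f"(?=({pam_regex}))", dna):
        pam_start = match.start()
        start = pam_start - guide_len
        if start < 0:
            # Not enough upstream sequence for a full-length protospacer.
            continue
        sites.append((start, dna[start:pam_start], match.group(1)))
    return sites


if __name__ == "__main__":
    example = "A" * 20 + "TGG"
    for start, protospacer, pam in find_pam_sites(example):
        print(start, protospacer, pam)
```

Restricting the scan to PAM-adjacent positions mirrors the idea that PAM recognition precedes interrogation of the flanking sequence; a real guide-design tool would also score mismatches and off-target sites, which this sketch omits.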
Cas9:RNA complexes have proven to be extremely effective genome engineering agents in animals and plants. Redesigning the sequence of the crRNA programs Cas9 to target virtually any desired DNA sequence inside the cell. We reveal that Cas9 can also be programmed to target single-stranded RNA substrates for both high-affinity binding and site-specific cleavage using PAM-presenting oligonucleotides. This approach enables the isolation of specific endogenous mRNA transcripts from cells. We believe that RNA targeting by Cas9 has the potential to transform the study of RNA function, much as site-specific DNA targeting has revolutionized genetic and genomic research.