While high-throughput sequencing of messenger (m)RNA has become routine, analogous measurements of small and non-coding RNA (ncRNA) populations remain challenging, limiting our view of the transcriptome. Likewise, sequencing-based measurement of protein synthesis by ribosome profiling has been possible for over a decade, but challenging workflows have limited its widespread adoption beyond specialists. From a technical standpoint, both endogenous small RNAs and ribosome-protected mRNA footprints (RPFs) lack well-defined 3′ sequences, such as polyadenylated tails, and thus require multiple enzymatic steps to append sequencing adapters before reverse transcription. These multi-step protocols have variable yields that introduce stochasticity, along with bias caused by preferential capture of enzymatically favorable substrates. Moreover, existing small RNA library generation approaches are labor intensive, unsuitable for automation, and require high RNA inputs, limiting the scale and scope of their use. In this dissertation, I created a new approach that removed these barriers, and I used it to uncover new biological insights. To address the technological shortcomings of small and ncRNA profiling, I co-invented, optimized, and applied a novel approach for small RNA library generation: Ordered Two-Template Relay (OTTR). This method takes advantage of a recombinantly expressed reverse transcriptase (RT) derived from a non-long terminal repeat R2 retroelement in Bombyx mori to seamlessly convert small RNA into cDNA with flanking sequencing adapters in a single-tube, 4-hour workflow. Critically, this dramatic simplification does not compromise the quality of the libraries; instead, it reduces library generation bias across a range of substrates, such as micro (mi)RNAs, RPFs, and transfer (t)RNAs and tRNA-derived fragments.
In this dissertation, I identified potential sources of bias in OTTR and modified the protocol to mitigate this residual bias. As of the completion of this dissertation, these optimized conditions have been made commercially available through the distribution and sale of OTTR cDNA library synthesis kits.
Beyond the invention and optimization of OTTR, I also completely redeveloped ribosome profiling, a technically demanding and bias-sensitive technique for quantifying transcriptome-wide ribosome occupancy through sequencing of RPFs generated by nuclease digestion. I found that OTTR reduced ribosome profiling library bias compared to the traditional ligation-based protocol, leading to the adoption of OTTR for ribosome profiling. Furthermore, I established P1 nuclease as a valuable tool for RPF generation independently of OTTR. I found that RPF production by P1 nuclease was less disruptive to RPF-ribosome interactions and limited ribosomal RNA (rRNA) cleavage, greatly improving the yield of RPF sequencing reads by reducing rRNA contaminants. These improvements also revealed a wide array of protected mRNA fragments resulting from several types of ribosome-ribosome collisions, which are increasingly recognized to offer insight into translation efficiency and nascent protein folding.
Ribosome-ribosome collisions have been observed in both eukaryotic and prokaryotic cells, and specialized pathways that resolve these collisions and destroy problematic mRNA are under intensive study for their role in protein quality control. In recent years, a connection between ribosome traffic flow and cellular stress has been uncovered. With collaborators, I applied ribosome profiling with P1 nuclease and OTTR to uncover an unappreciated connection between selenium deficiency and ribosome stalling on selenocysteine-encoding mRNAs linked to LRP8, a ferroptosis resistance factor upregulated in cancer cells. Beyond ribosome pausing, I synthesized and studied ribosome profiling libraries made to investigate different questions, such as how a SARS-CoV-2 protein remodeled the transcriptomic and translatomic landscape of a cell, how a ribosome profile changes during neuronal progenitor cell differentiation, or how T-cell activation poises certain mRNAs for robust translation.
Overall, OTTR enables rapid and efficient cDNA library synthesis across a range of small RNAs, is well suited for low-input samples and automation, and can even be extended to sequencing of highly fragmented and damaged DNA. This method closes the gap between transcriptomic mRNA profiling and small and ncRNA profiling, while opening the door for high-throughput studies on ncRNA abundance and processing patterns in normal and diseased cells.