Computational Methods for Analyzing RNA Sequencing to Study Post-Transcriptional Gene Regulation
- Author(s): Cass, Ashley Anne
- Advisor(s): Xiao, Xinshu; et al.
Since the completion of the Human Genome Project in 2003, massive DNA sequencing efforts have enabled gene mapping and enhanced our understanding of genetic variation. However, exactly how the same DNA sequence in every cell of an individual gives rise to vast biological variation is still not fully understood. In particular, the DNA sequence does not directly indicate which genes are expressed in different cell types, tissues, and disease states. With the advent of high-throughput RNA sequencing (RNA-Seq), gene expression and RNA isoform variation can be assayed cost- and time-efficiently under different conditions. In this work, we aimed to develop computational methods for analyzing RNA-Seq data to elucidate mechanisms of post-transcriptional gene regulation. The first chapter briefly introduces RNA biology, including concepts of co- and post-transcriptional gene regulation. The second chapter describes the identification of small cleavage-inducing RNAs (sciRNAs) and the RNAs they target for degradation through bioinformatic integration of small RNA sequencing and degradome sequencing, the latter of which captures RNA degradation products. This work revealed an expanded repertoire of sciRNAs and their targets, suggesting that small RNA-mediated cleavage is more widespread than previously appreciated. Post-transcriptional regulation is often mediated by cis-regulatory elements in 5’ and 3’ untranslated regions (UTRs), including sciRNA target motifs. Thus, alternative transcription start sites (ATSS) and alternative polyadenylation (APA) often affect post-transcriptional gene regulation through the inclusion or exclusion of cis-regulatory elements in UTRs. In chapter three, we describe mountainClimber, a novel method that overcomes several limitations of existing approaches for identifying ATSS and APA from RNA-Seq data.
In chapter four, we applied mountainClimber to thousands of RNA-Seq datasets derived from many human tissues in the largest study of ATSS and APA to date. In chapter five, we applied mountainClimber to chromatin-associated and poly(A)-selected RNA-Seq in murine macrophages with or without previous exposure to an endotoxin. This analysis revealed ATSS, APA, and alternative transcription end sites associated with tolerization of macrophages to endotoxins. Finally, we summarize our conclusions in chapter six.