The content and expression of the eukaryotic transcriptome are tightly regulated by sequence elements in the RNA. Consequently, genetic variation in these essential sequences can alter many aspects of the transcriptome. To date, however, the rules by which sequence elements regulate RNA remain poorly understood. In this dissertation, we provide new insights into the effects of genetic variants on alternative splicing and RNA abundance, as well as the molecular mechanisms underlying these functional effects.
The functional effects of rare variants remain poorly understood, and interpretation is especially difficult for those that occur in noncoding regions of the genome, such as the 3’ untranslated region (UTR). To address this challenge, we developed a massively parallel reporter assay, called MapUTR, to identify genetic variants that regulate mRNA abundance post-transcriptionally. We tested 17,301 rare single nucleotide variants (SNVs), a substantial portion of which demonstrated a functional role in regulating mRNA abundance through miRNA and/or RNA binding protein (RBP) interactions. Additionally, we assayed 11,929 cancer somatic mutations, allowing us to characterize the potential impact of noncoding mutations on carcinogenesis and prognosis.
Next, we investigated the global regulatory effects of exonic variants on alternative splicing using next-generation sequencing datasets generated by the Genotype-Tissue Expression (GTEx) project. Our analyses revealed >18,000 SNVs associated with >12,000 exons subject to genetically modulated alternative splicing (GMAS). We found that the genetic background and the tissue environment contributed to GMAS patterns to a similar extent, although the degree of allelic imbalance varied considerably between samples. We also developed a computational method to pinpoint functional splice-disrupting variants and further highlighted their relevance to phenotypic traits and diseases.
Finally, we leveraged long-read sequencing data from ENCODE to distinguish alternatively spliced exons that are primarily directed by cis-regulatory elements from those primarily directed by trans-acting factors, using a novel method, isoLASER. We identified 2,807 and 5,322 unique exonic parts under cis-directed splicing regulation in human and mouse tissues, respectively. We demonstrated that alternative splicing in general is governed by tissue-specific factors, whereas cis-directed splicing is dominated by the genetic background. The cis-directed events overlapped repeat elements and showed significant enrichment in genes related to immune function.