Functional Roles of Single Nucleotide Variants in Alternative Splicing
Next-generation sequencing has greatly facilitated large-scale analyses of the human genome. However, the function and biological significance of many single nucleotide variants, including single nucleotide polymorphisms (SNPs) and RNA editing sites, remain poorly understood. The effects of single nucleotide variants on transcriptional regulation of gene expression are relatively well studied, but their impact on co- and post-transcriptional control of gene expression has received less attention. The overarching goal of this project was to investigate the functional roles of single nucleotide variants in the regulation of RNA splicing.
To address this question, we first developed a method to identify intronic tag SNPs that can modulate alternative splicing. The cell fractionation RNA-seq datasets from the ENCODE Project made it possible for our method to detect intronic SNPs. This study predicted over 600 intronic and exonic SNPs relevant to splicing, and our comprehensive analyses revealed the genomic, evolutionary and regulatory features of these genetically modulated alternative splicing (GMAS) events.
Next, we performed a genome-wide study to uncover the interplay between individual steps of RNA processing, an essential aspect of gene regulation. In eukaryotes, nascent RNA transcripts undergo an intricate series of RNA processing steps to achieve mRNA maturation. RNA editing and alternative splicing are two major RNA processing steps that can introduce significant modifications to the final gene products. Here, we aimed to determine the impact of RNA editing sites on RNA splicing. RNA-seq datasets from different subcellular fractions enabled us to track RNA editing sites over the course of transcription. About 500 editing sites were observed to reside in 3′ acceptor sites, where they could abolish normal splicing of the associated exons. We also explored a mechanism by which editing sites affect splicing through remodeling of RNA secondary structure, and identified a subset of editing sites that belong to this category.
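To illustrate why an editing site within a 3′ acceptor site could abolish splicing, the following minimal sketch assumes the editing is adenosine-to-inosine (A-to-I), where inosine is read as guanosine, and that the acceptor is the canonical AG dinucleotide at the end of the intron. The function names and the toy sequences are hypothetical and are not taken from the analysis pipeline used in this work.

```python
# Illustrative sketch only: assumes A-to-I editing (inosine read as G) and a
# canonical AG acceptor dinucleotide. Names and sequences are hypothetical.

def acceptor_after_editing(intron_3p_end: str, edited_offset: int) -> str:
    """Return the last two intronic bases after editing one adenosine.

    intron_3p_end: intronic sequence (5'->3') ending at the intron/exon boundary.
    edited_offset: 0-based index of the edited adenosine in that sequence.
    """
    seq = intron_3p_end.upper()
    if seq[edited_offset] != "A":
        raise ValueError("A-to-I editing requires an adenosine at the edited position")
    # Model inosine as guanosine, since it is read as G.
    edited = seq[:edited_offset] + "G" + seq[edited_offset + 1:]
    return edited[-2:]


def disrupts_acceptor(intron_3p_end: str, edited_offset: int) -> bool:
    """True if the unedited intron ends in AG but the edited one does not."""
    original = intron_3p_end.upper()[-2:]
    return original == "AG" and acceptor_after_editing(intron_3p_end, edited_offset) != "AG"


# Editing the A of the terminal AG turns it into GG.
print(disrupts_acceptor("TTTCCTTTCAG", 9))
# Editing an upstream A leaves the terminal AG intact.
print(disrupts_acceptor("TTTCATTTCAG", 4))
```

This check considers only the terminal dinucleotide; it does not model the polypyrimidine tract, the branch point, or the effects of secondary structure.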
Lastly, analyses of GMAS events in GTEx datasets provided insights into the characteristics of splicing-related SNPs and their associated exons. We observed that GMAS exons were highly tissue-specific and individual-specific, and that rare GMAS events were more strongly associated with disease annotations than common events. In addition, we developed a new bioinformatic method to predict causal SNPs that alter splicing. This method identified over 600 causal SNPs for over 300 GMAS exons in the GTEx samples. Further analyses identified several splicing factors that were responsible for the GMAS patterns through their interactions with the putative causal SNPs.