Computational approaches to study splicing regulation in development and disease
Alternative splicing is an elaborately regulated co-/post-transcriptional process that dramatically expands the diversity and complexity of the eukaryotic transcriptome and proteome. A coordinated, cell type-specific alternative splicing network is essential for cell-fate determination and tissue-identity acquisition. Defects in the splicing machinery, including cis-acting elements and trans-acting factors, can result in extensive aberrant splicing, which has been implicated in a wide range of human diseases, especially cancers and neurological disorders. Large-scale RNA sequencing (RNA-seq) data accumulated in public repositories or generated by consortium projects provide an unprecedented resource for a more comprehensive elucidation of splicing regulation in development and disease. At the same time, these data pose new challenges for the development of computational tools that profile alternative splicing faster and interpret it more precisely. The first part of the dissertation presents rMATS-turbo (rMATS version 4.0.1 or above), an ultra-fast, memory-efficient computational tool for alternative splicing analysis. We present two major application scenarios of rMATS-turbo to demonstrate its capability for straightforward and fast splicing analysis. First, we describe a single-command differential splicing analysis between two cell lines, which yields robust identification of splicing alterations, including those derived from novel splice sites. Second, we demonstrate a workflow for comprehensive profiling of the splicing landscape using 1,019 RNA-seq datasets (18.58 terabases) from the Cancer Cell Line Encyclopedia. Benchmarks of time and memory consumption show that rMATS-turbo continues to perform well as read depth or sample size increases. These results illustrate the speed of rMATS-turbo, which makes it a useful tool for splicing analysis of large-scale RNA-seq data.
In the second and third parts of the dissertation, we used rMATS-turbo and other computational approaches to study the dynamics and regulation of splicing in tissue development and disease. In the second part, we sought to evaluate how alternative splicing, under the control of RNA-binding proteins (RBPs), affects cell fate commitment during induced osteogenic differentiation of human bone marrow-derived multipotent stem/stromal progenitor cells (MSPCs). Our analysis revealed temporal coordination between widespread alternative splicing changes and alterations in RBP expression. We also developed a new computational platform that uses time-course RNA-seq data to screen for key RBPs during development. Nine RBPs were identified as potential key splicing regulators during osteogenic differentiation. Perturbation of two candidate RBPs, KHDRBS3 and CPEB2, inhibited MSPC osteogenesis in vitro, validating our computational prediction of “driver” RBPs. In the third part of the dissertation, inspired by previous studies implying a link between PRMT9 and both splicing and brain development, we aimed to unravel the direct molecular, cellular, and pathological contributions of PRMT9 to neurological disorders. First, we showed that the autosomal recessive intellectual disability-associated variant, PRMT9 G189R, cannot catalyze SF3B2 methylation at R508 (R508me2s) and is extremely unstable. We also demonstrated that conditional knockout (KO) of Prmt9 in excitatory neurons impaired learning, memory, and the maturation of functional synapses in mice. Transcriptomic analysis revealed widespread splicing alterations but no steady-state gene expression changes in KO mice, which indicates that alternative splicing independently defines the brain-specific transcriptome in Prmt9 KO mice. Moreover, genes with splicing changes were enriched in neuron- and synapse-related pathways. Together, these findings indicate a PRMT9-SF3B2-splicing-synapse regulatory cascade linking PRMT9 with brain development.
Finally, we propose a working model in which PRMT9-mediated SF3B2 R508me2s regulates splicing through 3’ splice site competition by altering the SF3B2/pre-mRNA interaction. Overall, this work clarifies the molecular, cellular, and functional contributions of PRMT9 and deepens our insight into splicing regulation in the pathogenesis of intellectual disability and related disorders.