eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations

Characterizing transcript diversity using long-read RNA sequencing

Abstract

Alternative transcripts arise from the same gene via alternative TSS usage, splicing, and polyA site choice. Such transcripts can give rise to functional disparities in protein structure, post-transcriptional regulation, and translational efficiency. Moreover, their expression in appropriate spatiotemporal contexts is a key feature of eukaryotic genomes. However, detecting and quantifying these transcript isoforms across tissues, cell types, and species has been challenging because transcripts are typically longer than the short reads produced by standard RNA-seq. In contrast, long-read RNA-seq (LR-RNA-seq) provides complete transcript structures, enabling investigation of transcript features and usage with greater fidelity. Here, I describe my work on the application of LR-RNA-seq to characterizing and comparing full-length transcriptomes. First, I describe Swan, a software library I developed to facilitate visualization of full-length transcripts and to compare transcript usage between biological conditions. Next, I describe the ENCODE4 human and mouse LR-RNA-seq datasets, where I applied a novel triplet-based framework to harmonize and classify transcripts that share transcript start sites, exon junction chains, and transcript end sites. Lastly, I discuss the application of our single-nucleus LR-RNA-seq technique (LR-Split-seq) to two genetically distinct mouse strains to uncover cell type- and genotype-specific transcript usage patterns. Collectively, these projects form a solid foundation for future analyses of long-read transcriptomes to quantify changes in transcript diversity and transcript usage between samples, cell types, and genotypes within and between species.
