eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Identification and Characterization of Systems with Defects in Transcription Termination

Abstract

Transcription termination is a fundamental process in gene regulation and a critical step in mRNA maturation. Several cellular stresses have been found to disrupt it. When termination is disrupted, transcription continues past the annotated 3’ end of genes, a phenomenon called readthrough transcription. Readthrough has many downstream effects, including novel elongated transcripts, changes in epigenetic state, and large alterations in 3D genome structure. Given the variety of both causes and consequences of this phenotype, it is critical to develop methods that both identify and characterize defects of transcription termination (DoTT). In my first chapter, I present a software package called Automatic Readthrough Transcription Detection (ARTDeco), which quantifies readthrough transcription in data generated by next-generation sequencing (NGS) assays that measure transcription. We demonstrate ARTDeco’s ability to discriminate between systems with DoTT and those with normal transcription termination. ARTDeco quantifies the degree of readthrough transcription in a system using three separate metrics. It also distinguishes genes that are transcribed due to gene activation (called primary induction genes) from genes that are transcribed because readthrough transcription extends from the end of one gene through the body of its downstream gene (called read-in genes). We show that read-in genes represent analytical noise in the context of functional analyses. In addition, ARTDeco can identify downstream-of-gene (DoG) transcripts, which are intergenic transcripts originating from faulty termination. We show that ARTDeco can flexibly perform these functions across a variety of data types and organisms. In my second chapter, I deploy ARTDeco on NCBI’s Gene Expression Omnibus (GEO) repository of NGS data to search for signs of DoTT in virally infected samples. We find evidence that several viruses cause DoTT.
Among these viruses, we identify a likely mechanism for readthrough transcription in Rift Valley Fever Virus (RVFV). We confirm that the RVFV NSs protein causes DoTT by expressing it in THP-1 monocytes. Further, we compare the full range of transcriptional responses to NSs and to the NS1 protein from influenza A virus (IAV). We find that both proteins cause global readthrough transcription and disrupt interferon signaling in distinct ways. Finally, I develop a software package to address a different fundamental regulatory process: transcription initiation. Transcription initiation is known to occur when multiple transcription factors (TFs) bind to a regulatory sequence and recruit transcriptional machinery. Existing computational methods do not adequately capture the collaboration among TFs from sequence alone. I develop the Dual HOMER method, which employs successive rounds of motif enrichment to infer cooperativity between TFs in transcription start site regions (TSRs). We show that Dual HOMER recapitulates known interactions between TFs and yields novel insights into these interactions through the properties of the transcriptional network it generates. In all, this thesis advances the understanding of two fundamental biological processes and presents methods that yield biological insight into both.
