eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Investigating the Effects of Genetic Variation on Transcriptional Regulation

Abstract

Genome-wide association studies have identified thousands of genetic variants associated with increased disease risk. Many of these variants are located outside of protein-coding regions, suggesting that they exert regulatory effects on gene transcription. However, the effects of non-coding genetic variation on transcriptional regulation are not fully understood. One way of interpreting these variants is to link them to the specific DNA sequences recognized by transcription factors (TFs), known as motifs. I developed MAGGIE, a bioinformatic approach to identify functional motifs that mediate TF binding and function. Unlike many other motif analysis tools, MAGGIE associates motif mutations caused by non-coding variants with changes in TF binding or regulatory function, providing more direct insight into the regulatory effects of genetic variation. I demonstrated the strong performance of MAGGIE in various applications, including its ability to distinguish the divergent functions of distinct NF-kB factors in pro-inflammatory macrophages. As a detailed case study of the effects of non-coding variants, I applied MAGGIE to identify functional motifs for anti-inflammatory macrophages and discovered dominant TFs driving the anti-inflammatory response, which are also frequent targets of genetic variation that influences this response. In combination with an integrative analysis of transcriptomic and epigenomic data, I revealed quantitative variations in motif affinity underlying the divergent anti-inflammatory responses observed in genetically different mouse strains. By leveraging deep learning approaches, I pinpointed functional variants that alter functional motifs and provided strong evidence supporting the promise of deep learning for identifying functional variants. Finally, I went beyond individual motifs to systematically analyze the spacing between motifs and investigated its significance in the context of variant interpretation.
I found that most collaborative TFs do not require constrained spacing but instead tolerate a relaxed range of spacing between their motifs. Using synthetic genetic variations from mutagenesis experiments and millions of naturally occurring variations, I showed that spacing alterations at TF binding sites are generally tolerated by TF binding and regulatory function. Collectively, these findings advance our understanding of how non-coding genetic variation influences gene transcription and phenotypic diversity.
