A Multiplexed Approach to Defining Sequence-Function Relationships in Gene Expression using a Model Human Transcription Factor Binding Site
- Author(s): Davis, Jessica Elizabeth
- Advisor(s): Kosuri, Sriram; et al.
In this dissertation I present a complete characterization of the transcriptional activity of a model human transcription factor binding site (TFBS), the cAMP Response Element (CRE), across varied cis-regulatory element architectures. The arrangement and assortment of TFBSs within cis-regulatory elements drive specific gene regulatory responses, yet it remains difficult to predict gene expression from sequence alone. Part of the difficulty lies in our incomplete picture of how a single TFBS drives expression across regulatory architectures that differ in TFBS composition, TFBS affinity, TFBS number, distance between TFBSs, distance of TFBSs to transcription start sites, and sequence content surrounding TFBSs. To improve our understanding of sequence-function relationships in eukaryotic gene expression, we designed and assayed 9,126 synthetic regulatory elements that isolate these TFBS variables. We developed and employed massively parallel reporter assays (MPRAs) to enable episomal and genomic interrogation of synthetic regulatory element activities in a human cell line. Overall, we find that CRE number and affinity within regulatory elements largely determine expression, and that this relationship is shaped by the proximity of CREs to promoter elements. Expression depends not only on a CRE's overall distance to a downstream promoter but also on its precise positioning, following a ~10 bp periodicity along regulatory elements. Additionally, in the episomal MPRA, we find that the spacing between multiple CREs dictates the phasing of this expression periodicity in addition to overall expression. Lastly, we report differences between the single-copy genomic and episomal assays, highlighting the varied roles that certain TFBS variables play across regulatory contexts.