Transcriptional regulation has been implicated as a key mechanism by which changes in genotype, whether among human populations or between species, give rise to the diversity of phenotypes observed in nature, but a full understanding of how transcriptional regulation responds to genetic variation remains elusive. In the last decade, high-throughput sequencing has enabled quantitative, genome-wide investigation of these processes through readouts of transcriptional regulatory activity such as transcription factor binding, measured by chromatin immunoprecipitation followed by sequencing (ChIP-seq), and gene expression, measured by RNA-seq.
Previous studies of differences in transcriptional regulation between species supported a model in which mammalian transcriptional regulation exhibits greater plasticity and a faster rate of change than that of Drosophila, consistent with known differences in population size and generation time. However, by re-analyzing these data within a common framework, we find that gene expression and the binding patterns of all studied transcription factors, except for the chromatin organizer CTCF, diverge at indistinguishable rates in mammals, birds, and fruit flies, suggesting the existence of a transcriptional regulatory “clock” analogous to the molecular clock observed in protein sequences.
We next examined the role of the external environment in determining the transcriptional regulatory programs of microglia, a macrophage cell type specific to the central nervous system (CNS). In both mice and humans, we find that signals from the brain environment crucially determine both the enhancer landscape and gene expression programs, and that these programs are significantly enriched for genes associated with neurological disease in humans. Substantial differences between human and mouse microglia, and between ex vivo and in vitro cells, argue for improved model systems to understand the role of microglia in human health.
Finally, differences in elements of transcriptional regulation, such as transcription factor binding and histone tail modifications, have been observed to cluster in contiguous regions called cis-regulatory domains (CRDs). We developed algorithms for efficiently identifying genomic regions that are significantly enriched in quantitative trait differences such as ChIP-seq peak intensity. We applied these algorithms to data from five strains of laboratory mice and observed a concordance of CRD activity levels with clustered differences in chromatin activation state.
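
As a minimal illustration of what identifying a region enriched in quantitative trait differences can look like, the sketch below scans per-peak difference scores ordered along a chromosome for the contiguous run with the largest total score, then assesses it with a permutation test. This is not necessarily the algorithm developed in this work: the maximum-scoring-segment scan, the scoring scheme (positive for a strain difference, negative otherwise), and the names `max_scoring_segment` and `segment_p_value` are all assumptions made for illustration.

```python
import random


def max_scoring_segment(scores):
    """Return (best_sum, start, end) of the contiguous run with the largest
    sum, with `end` exclusive. Single linear pass (Kadane's algorithm)."""
    best_sum, best_start, best_end = float("-inf"), 0, 0
    run_sum, run_start = 0, 0
    for i, s in enumerate(scores):
        if run_sum <= 0:
            # A non-positive running total cannot help; restart the run here.
            run_sum, run_start = s, i
        else:
            run_sum += s
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_sum, best_start, best_end


def segment_p_value(scores, n_perm=1000, seed=0):
    """Permutation p-value for the best segment: shuffle peak order and count
    how often a shuffled track yields a segment at least as high-scoring."""
    rng = random.Random(seed)
    observed, start, end = max_scoring_segment(scores)
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_scoring_segment(shuffled)[0] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1), start, end


# Hypothetical scores: +2 where a peak differs between strains, -1 otherwise.
scores = [-1] * 10 + [2] * 5 + [-1] * 10
p_value, start, end = segment_p_value(scores)
print(f"peaks {start}..{end - 1} form the candidate domain, p = {p_value:.4f}")
```

Shuffling the peak order preserves the overall distribution of scores while destroying their spatial clustering, so the p-value reflects whether the differences are more contiguous than expected by chance.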