The ornate arrangement of diverse cells into the specialized tissues, organs, and higher structures characteristic of multicellular organisms is all encoded by the same genome sequence. Despite their differences, morphologically distinct cells (e.g. muscle cells and neurons) must transcribe many of the same genes, while morphologically indistinguishable cells must often transcribe distinct sets of genes (e.g. different odorant receptor cells). The ensemble of genes expressed in a given cell, and the relative frequency at which they are expressed, gives each cell its characteristic identity more so than the presence of individual genes. Understanding the genetic control of development and differentiation is therefore a question not so much of understanding the gene sequences themselves as of understanding the regulatory structure of the genome, which determines how they are deployed.
For development to unfold such that each embryo completes the process with all the correct parts in the correct positions, the process must be exceedingly precise. Though often taken for granted, this precision becomes particularly impressive if one considers the frequency with which mistakes are made in intelligently designed, human-built assembly processes. The developing animal must position components correctly on scales of microns (e.g. tissue boundaries) and nanometers (e.g. neuron junctions), has no external direction of assembly, and relies on thermal noise to position many of its components (including essentially all transcription factors, the proteins which regulate read access to the genome).
It is not sufficient for the process to be precise. It must also be robust to changes in the conditions in which it operates, such as different thermal environments, nutrient conditions, and chemical environments. This robustness enables a certain degree of plasticity, such that some components of the system can change and evolve new functions, without causing catastrophic failure of the rest of the system.
In my thesis research I have explored some of the molecular mechanisms of gene regulation that support the precise and robust expression of multicellular genomes. Rapid advances in post-genomic technologies have exposed a broad range of fundamental differences in the organization and regulation of multicellular genomes such as that of Drosophila. I have worked primarily on two phenomena: the use of promoter-proximal pausing as a regulatory strategy, and the use of multiple, apparently redundant regulatory sequences to drive expression of the same gene. Both phenomena were discovered through analysis of whole-genome polymerase and transcription factor binding data. Using quantitative high-resolution in situ hybridization and semi-automated computational image processing, I have studied the detailed differences in transcriptional activation and transcription frequency of genes regulated by these mechanisms. This analysis shows a strong correlation between more rapid and synchronous gene expression and regulation through release of promoter-proximal paused polymerase. Theoretical modeling demonstrates that such an effect can be expected from regulating release from a stable downstream state in a general assembly process (such as construction of the RNA Pol II pre-initiation complex).
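The intuition behind the modeling result can be illustrated with a deliberately simplified sketch. It is not the model used in the thesis: it assumes that assembly is a chain of irreversible steps with exponentially distributed waiting times, and the step rates, cell count, and function names are all hypothetical. Under those assumptions, a gene whose complex is pre-assembled and held before the final step responds to a signal sooner, and with less cell-to-cell spread in timing, than a gene that must traverse every step after the signal arrives.

```python
import random
import statistics


def activation_times(step_rates, n_cells, rng):
    """Sample, for each cell, the total time to complete every step in order.

    Each step is modeled as an irreversible exponential wait with the
    given rate (a simplifying assumption, not the thesis model).
    """
    return [
        sum(rng.expovariate(rate) for rate in step_rates)
        for _ in range(n_cells)
    ]


def summarize(times):
    """Return (mean, standard deviation) of activation times across cells."""
    return statistics.mean(times), statistics.pstdev(times)


rng = random.Random(0)
rates = [1.0, 1.0, 1.0, 1.0]  # hypothetical per-step rates, arbitrary units
n_cells = 5000

# Scheme A: the signal gates the first step, so all steps follow the signal.
initiation_regulated = activation_times(rates, n_cells, rng)

# Scheme B: the complex is pre-assembled and held before the last step,
# so only the final (release) step follows the signal.
release_regulated = activation_times(rates[-1:], n_cells, rng)

mean_init, sd_init = summarize(initiation_regulated)
mean_rel, sd_rel = summarize(release_regulated)

print(f"initiation-regulated: mean={mean_init:.2f}, sd={sd_init:.2f}")
print(f"release-regulated:    mean={mean_rel:.2f}, sd={sd_rel:.2f}")
```

In this toy setting the standard deviation of activation time serves as a crude proxy for asynchrony across cells. A more realistic treatment would include reversible steps and unequal rates.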
Comparing gene expression driven by multiple enhancers with overlapping activity to expression from constructs with only a single active enhancer revealed that the process by which an enhancer binds its target transcription factors and activates expression is often limiting enough that a second independent copy can produce detectable changes in the frequency of transcription. The resulting reduction of natural variation in gene activation is especially important under stress conditions, such as thermal stress or reduced levels of some of the activating factors. Robustness to this sort of variation may be important both for adaptation within a species and for the flexibility to modify interacting pathways in the course of evolution. These investigations also revealed a corrective propensity whereby the simultaneous activity of multiple enhancers, responding to repressors as well as activators, can give rise to correctly restricted gene expression even when the elements taken in isolation drive some degree of ectopic expression.
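The benefit of a second enhancer can be made concrete with a back-of-the-envelope calculation. The sketch below is an illustration only, not an analysis from the thesis: it assumes that each enhancer independently succeeds in activating the gene in a given nucleus with probability p, and the values of p chosen for "normal" and "stress" conditions are hypothetical.

```python
import math


def p_active(p_single, n_enhancers):
    """Probability that at least one of n independent enhancers activates
    the gene, given each succeeds with probability p_single."""
    return 1.0 - (1.0 - p_single) ** n_enhancers


def nucleus_variance(p):
    """Variance of the on/off state of a nucleus that is on with probability p."""
    return p * (1.0 - p)


# Hypothetical per-enhancer success probabilities (assumed, not measured).
conditions = {"normal": 0.8, "stress": 0.5}

for name, p in conditions.items():
    for n in (1, 2):
        p_on = p_active(p, n)
        print(
            f"{name:6s} enhancers={n}  "
            f"fraction active={p_on:.3f}  "
            f"variance={nucleus_variance(p_on):.3f}"
        )

# Drop in the fraction of active nuclei when moving from normal to stress.
loss_single = p_active(0.8, 1) - p_active(0.5, 1)
loss_double = p_active(0.8, 2) - p_active(0.5, 2)
print(f"loss with one enhancer:  {loss_single:.3f}")
print(f"loss with two enhancers: {loss_double:.3f}")
```

Under these assumptions, the two-enhancer case has both a higher fraction of active nuclei and a smaller loss of activity under stress. Real enhancers need not act independently, so this is at best a rough upper bound on the benefit of redundancy.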
So far, both of these mechanisms have been reliably documented only in multicellular systems, suggesting that the precision and robustness they confer may be an innovation of metazoans in response to the increased coordination required to keep many cells functioning in the tight cooperation of a multicellular organism. Doubtless this only scratches the surface of the mechanisms which ensure such precision and control. However, the rapid improvements in both genomic tools and imaging technology make this likely to be a promising field for further exploration for years to come.