Polyploidy, or whole genome duplication, is rampant among both extant and ancient flowering plant species. Whole genome duplications create simultaneous copies of all genes contained within a genome as well as their associated regulatory sequences. These duplications and the subsequent deletions of redundant coding and noncoding sequence both shape the natural evolution of plant genomes and provide a unique opportunity for researchers to characterize the regulatory sequences that determine when, in which cells, and in what quantities the mRNA encoded by particular genes will be produced.
This dissertation describes a model that explains both bias in gene loss between parental subgenomes and the escape from preferential retention of duplicated genes between sequential whole genome duplications. Bias in gene deletion between individual duplicated segments had been observed previously, but the publication of the sorghum and maize genomes provided an opportunity to demonstrate that this bias is a consistent mark distinguishing whole pairs of ancestral chromosomes, and that ongoing gene loss remains consistently biased between high and low gene loss subgenomes millions of generations after a whole genome duplication. Bias in both ancestral and ongoing gene loss is shown to be correlated with biased gene expression between parental subgenomes: genes on the low gene loss subgenome tend to show higher expression levels than duplicate copies of the same genes on the high gene loss subgenome. This phenomenon, originally referred to as genome dominance, although the literature has since become somewhat confused, provides an explanation both for biased gene loss between parental subgenomes and for the escape of deletion-resistant genes from the ratchet of ever-increasing copy numbers through continued whole genome duplications.
This dissertation also demonstrates the use of a polyploid lineage, in this case maize, as a deletion machine to rapidly characterize the function of regulatory sequences shared by orthologous genes within a clade. It was possible to develop testable hypotheses about the specific function of individual regulatory sequences by combining conserved noncoding sequence datasets and noncoding sequence deletions identified using comparative genomics with analysis and visualization of gene expression data from diverse organs, tissues, and cell types. As a test of the accuracy of this method, a putative pollen-specific enhancer of expression identified using expression data from maize was cloned from the orthologous sorghum gene and used to drive the expression of a reporter construct in Brachypodium distachyon. Polyploid deletion machines have the potential to radically accelerate the characterization of noncoding regulatory sequences, an area of genetics previously largely untouched by advances in next-generation sequencing technologies.