Genome Size and Transposable Element Content as Determined by High-Throughput Sequencing in Maize and Zea luxurians
The genome of maize (Zea mays ssp. mays) consists mostly of transposable elements (TEs) and varies in size among lines. This variation extends to other species in the genus Zea: although maize and Zea luxurians diverged only ∼140,000 years ago, their genomes differ in size by ∼50%. We used paired-end Illumina sequencing to evaluate the potential contribution of TEs to the genome size difference between these two species. We aligned the reads both to a filtered gene set and to an exemplar database of unique repeats representing 1,514 TE families; ∼85% of reads mapped against TE repeats in both species. The relative contribution of TE families to the B73 genome was highly correlated with previous estimates, suggesting that reliable estimates of TE content can be obtained from short high-throughput sequencing reads, even at low coverage. Because we used paired-end reads, we could assess whether a TE was near a gene by determining if one paired read mapped to a TE and the second read mapped to a gene. Using this method, Class 2 DNA elements were found significantly more often in genic regions than Class 1 RNA elements, but Class 1 elements were found more often near other TEs. Overall, we found that both Class 1 and 2 TE families account for ∼70% of the genome size difference between B73 and luxurians. Interestingly, the relative abundance of TE families was conserved between species (r = 0.97), suggesting genome-wide control of TE content rather than family-specific effects.