The pan-genome of Emiliania huxleyi
Coccolithophores are a major group of phytoplankton and play a significant role in the carbon cycle, as their calcareous exoskeletons comprise half the carbonate in deep-sea sediments. The bloom species Emiliania huxleyi is globally distributed with multiple strains isolated from around the world. A model for coccolithophore biology, it has been the object of numerous morphological, physiological, and transcriptomic studies, and it is now the first coccolithophore to have its genome sequenced. The reference genome provides an opportunity for us to address an outstanding issue of E. huxleyi biology: its remarkable morphological, physiological, and genomic variation between isolates. Variability suggests that E. huxleyi may have a "pan-genome", where different strains of a single species possess a heterogeneous gene complement. We use the JGI Annotation Pipeline and the JGI EST Pipeline to exploit the reference genome in combination with transcriptomic data from two other strains and discover potentially strain-specific genes. Such genes provide evidence of inter-strain heterogeneity, though they do not fully characterize the E. huxleyi pan-genome.