Complete chloroplast genome of Trachelium caeruleum: extensive rearrangements are associated with repeats and tRNAs
eScholarship
Open Access Publications from the University of California

Abstract

Chloroplast genome structure, gene order and content are highly conserved in land plants. We sequenced the complete chloroplast genome of Trachelium caeruleum (Campanulaceae), a member of an angiosperm family known for highly rearranged chloroplast genomes. The total genome size is 162,321 bp, with an inverted repeat (IR) of 27,273 bp, a large single-copy (LSC) region of 100,113 bp and a small single-copy (SSC) region of 7,661 bp. The genome encodes 115 unique genes, with 19 duplicated in the IR, a tRNA (trnI-CAU) duplicated once in the LSC and a protein-coding gene (psbJ) duplicated twice, for a total of 137 genes. Four genes (ycf15, rpl23, infA and accD) are truncated and likely nonfunctional; three others (clpP, ycf1 and ycf2) are so highly diverged that they may now be pseudogenes. The most conspicuous feature of the Trachelium genome is the presence of eighteen internally unrearranged blocks of genes that have been inverted or relocated within the genome, relative to the typical gene order of most angiosperm chloroplast genomes. Recombination between repeats and recombination between tRNAs have been suggested as two mechanisms of chloroplast genome rearrangement. We compared the relative number of repeats in Trachelium to eight other angiosperm chloroplast genomes, and evaluated the location of repeats and tRNAs in relation to rearrangements. Trachelium has the most numerous and the largest repeats, which are concentrated near inversion endpoints or other rearrangements. tRNAs occur at many but not all inversion endpoints. There is likely no single mechanism responsible for the remarkable number of alterations in this genome, but both repeats and tRNAs are clearly associated with these rearrangements.

Land plant chloroplast genomes are highly conserved in structure, gene order and content. The chloroplast genomes of ferns, the gymnosperm Ginkgo, and most angiosperms are nearly collinear, reflecting the gene order in lineages that diverged from lycopsids and the ancestral chloroplast gene order over 350 million years ago (Raubeson and Jansen, 1992).
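The gene tally reported in the abstract can be checked with a short calculation. The sketch below uses only the counts stated in the text; the variable names are illustrative, and reading "duplicated twice" as two additional copies of psbJ is an assumption, chosen because it reproduces the stated total.

```python
# Gene tally using the counts stated in the abstract.
unique_genes = 115

# Additional copies beyond the first copy of each gene.
extra_copies = {
    "IR duplicates": 19,  # genes with a second copy in the inverted repeat
    "trnI-CAU": 1,        # duplicated once in the LSC
    "psbJ": 2,            # assumption: "duplicated twice" means two extra copies
}

total_genes = unique_genes + sum(extra_copies.values())
print(total_genes)
```

Under that reading the sum matches the 137 genes given in the abstract.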
Although earlier mapping studies identified a number of taxa in which several rearrangements have occurred (reviewed in Raubeson and Jansen, 2005), an extraordinary number of chloroplast genome alterations are concentrated in several families in the angiosperm order Asterales (sensu APG II; Bremer et al., 2003). Gene mapping studies of representatives of the Campanulaceae (Cosner, 1993; Cosner et al., 1997, 2004) and Lobeliaceae (Knox et al., 1993; Knox and Palmer, 1999) identified large inversions, contraction and expansion of the inverted repeat regions, and several insertions and deletions in the cpDNAs of these closely related taxa. Detailed restriction site and gene mapping of the chloroplast genome of Trachelium caeruleum (Campanulaceae) identified seven to ten large inversions, families of repeats associated with rearrangements, possible transpositions, and even the disruption of operons (Cosner et al., 1997). Seventeen other members of the Campanulaceae were mapped and exhibit many additional rearrangements (Cosner et al., 2004). What happened in this lineage that made it susceptible to so many chloroplast genome rearrangements? How do normally highly conserved chloroplast genomes change? The cause of rearrangements in this group remains unclear because of the limited resolution of mapping techniques. Several mechanisms have been proposed to explain how rearrangements occur: recombination between repeats, transposition, or temporary instability due to loss of the inverted repeat (Raubeson and Jansen, 2005). Sequencing whole chloroplast genomes within the Campanulaceae offers a unique opportunity to examine both the extent and mechanisms of rearrangements within a phylogenetic framework. We report here the first complete chloroplast genome sequence of a member of the Campanulaceae, Trachelium caeruleum.
This work will serve as a benchmark for subsequent comparative sequencing and analysis of other members of this family and close relatives, with the goal of further understanding chloroplast genome evolution. We confirmed features previously identified through mapping, and discovered many additional structural changes, including several partial to entire gene duplications, deterioration of at least four normally conserved chloroplast genes into gene fragments, and the nature and position of numerous repeat elements at or near inversion endpoints. The focus of this paper is on analyses of sequences at or near these rearrangements in Trachelium caeruleum. Inversions are believed to arise through homologous recombination between repeat elements (Palmer, 1991; Knox et al., 1993). Repeats may facilitate inversions or other genome rearrangements (Achaz et al., 2003), and higher incidences of repeats have been correlated with greater numbers of rearrangements (Rocha, 2003). Alternatively, repeats may proliferate within a genome as a result of DNA strand repair mechanisms following a rearrangement event such as an inversion. Gene mapping studies previously identified five families of dispersed repeats in Trachelium at or near inversion endpoints (Cosner et al., 1997). Here we examine the sequences of these repeats and identify, map and characterize numerous additional repeats within the genome. We compare the number and size of repeats in typical unrearranged angiosperm chloroplast genomes with those in the highly rearranged chloroplast genome of Trachelium. The Trachelium chloroplast genome has the highest number and the largest repeats of diverse origin of any sequenced angiosperm chloroplast genome.
These repeats are generally clustered at or near rearrangements and they are of diverse origins: partial or entire chloroplast gene duplications, noncoding chloroplast sequences, or novel DNA with no clear sequence identity to any existing chloroplast DNA sequences. The Trachelium chloroplast genome is the most highly rearranged chloroplast genome sequenced from land plants, and its bizarre organization is clearly associated with the high incidence of dispersed repetitive DNA.
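The repeat-mediated inversion mechanism discussed above can be illustrated with a toy model. When two copies of a repeat lie in opposite orientation, recombination between them inverts the intervening segment. The sketch below is not the authors' analysis method; the sequences and the function name `invert_between_repeats` are hypothetical and serve only to show the geometry of the event.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def invert_between_repeats(genome, repeat):
    """Invert the segment flanked by `repeat` and its inverted copy.

    Hypothetical helper: models recombination between a repeat and the
    first downstream reverse-complement copy by replacing the segment
    between them with its reverse complement. Returns the genome
    unchanged if no such repeat pair is found.
    """
    inverted_copy = reverse_complement(repeat)
    first = genome.find(repeat)
    if first == -1:
        return genome
    segment_start = first + len(repeat)
    segment_end = genome.find(inverted_copy, segment_start)
    if segment_end == -1:
        return genome
    segment = genome[segment_start:segment_end]
    return genome[:segment_start] + reverse_complement(segment) + genome[segment_end:]


# Toy sequence (made up): flank + repeat + segment + inverted repeat + flank.
repeat = "GATTACA"
genome = "CC" + repeat + "AAAGGG" + reverse_complement(repeat) + "TT"

inverted = invert_between_repeats(genome, repeat)
restored = invert_between_repeats(inverted, repeat)
print(genome)
print(inverted)
```

Because the flanking repeats survive the event, a second recombination between the same pair restores the original order in this model, which is one reason repeats at inversion endpoints are informative but do not by themselves show whether a repeat caused an inversion or arose after it.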
