Lawrence Berkeley National Laboratory
Conservation patterns in different functional sequence categories of divergent Drosophila
Author(s):
- Papatsenko, Dmitri
- Kislyuk, Andrey
- Levine, Michael
- Dubchak, Inna
- et al.
We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of the recently completed genomes of Drosophila species: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. Based on these distributions, we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall level of sequence conservation. In coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs and non-functional sequences. At the same time, the blocks in coding regions carried a 3N+2 length signature characteristic of synonymous substitutions at third codon positions. The larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of transcription factor binding sites. We have also shown that the longest ungapped blocks, or 'ultraconserved' sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discuss how restrained conservation patterns may help in mapping functional sequence categories and improving genome annotation.
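As an illustration of the block statistics described in the abstract, the following minimal Python sketch extracts fully conserved ungapped blocks from one pairwise alignment and tallies their lengths modulo 3. It is not the authors' pipeline: the function names `conserved_block_lengths` and `length_mod3_fractions`, the use of `-` as the gap character, and the toy sequences are all assumptions made for this example. A 3N+2 signature would appear as an excess of block lengths with remainder 2 (lengths 2, 5, 8, ...), which is what one expects when mismatches fall mainly at third codon positions.

```python
from collections import Counter

GAP = "-"  # assumed gap character in the aligned sequences


def conserved_block_lengths(aligned_a, aligned_b):
    """Return lengths of maximal runs of identical, gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == b and a != GAP:
            run += 1  # column is identical and ungapped: extend the block
        else:
            if run:
                lengths.append(run)  # a mismatch or gap closes the block
            run = 0
    if run:
        lengths.append(run)  # block that reaches the end of the alignment
    return lengths


def length_mod3_fractions(lengths):
    """Fraction of blocks whose length leaves remainder 0, 1 or 2 modulo 3."""
    total = len(lengths)
    if total == 0:
        return {0: 0.0, 1: 0.0, 2: 0.0}
    counts = Counter(n % 3 for n in lengths)
    return {r: counts.get(r, 0) / total for r in range(3)}


# Toy in-frame coding alignment (hypothetical data): the two sequences
# differ only at the third positions of the second and third codons.
seq_a = "ATGGCCAAGCTGTAA"
seq_b = "ATGGCTAAACTGTAA"

lengths = conserved_block_lengths(seq_a, seq_b)
print(lengths)                         # [5, 2, 6]
print(length_mod3_fractions(lengths))  # remainder 2 holds two of three blocks
```

In this toy case the final block has length 6 only because it is cut off by the end of the alignment rather than by a mismatch; in a genome-wide analysis such edge blocks would be a small minority, and separate length histograms per functional category (coding, regulatory, intron, UTR) could then be compared.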