Lawrence Berkeley National Laboratory
Comparative Reannotation of 21 Aspergillus Genomes
- Author(s): Salamov, Asaf
- Riley, Robert
- Kuo, Alan
- Grigoriev, Igor
- et al.
We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotations of individual genomes may contain errors of several kinds, e.g. missing genes, incorrect exon-intron structures, 'chimeras' that fuse two or more real genes, or, conversely, real genes split into two or more models. The main premise behind the comparative modeling approach is that, for closely related genomes, most orthologous families share the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models the one that most closely resembles the orthologous genes from the other genomes. This procedure is iterated until no further change in gene models is observed. Across the Aspergillus genomes we predicted a total of 4503 new gene models (~2% per genome) supported by comparative analysis, and additionally corrected ~18% of the existing gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~3% increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.
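The select-and-iterate step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the representation of a gene model as a tuple of exon lengths, the `similarity` score, and the `select_models` function are all hypothetical simplifications chosen only to show the idea of picking, at each locus, the candidate closest to the orthologs currently chosen in the other genomes, and repeating until nothing changes.

```python
from typing import Dict, List, Tuple

# Assumption: a gene model is reduced to its exon lengths. A real pipeline
# would compare full exon-intron structures and protein alignments.
Model = Tuple[int, ...]


def similarity(a: Model, b: Model) -> float:
    """Toy score: 0 if exon counts differ, otherwise closeness of exon lengths."""
    if len(a) != len(b):
        return 0.0
    diff = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + diff)


def select_models(candidates: Dict[str, List[Model]],
                  max_iter: int = 100) -> Dict[str, Model]:
    """Pick one model per genome at a single orthologous locus.

    candidates maps a genome name to its competing models; the first
    model in each list plays the role of the initial annotation.
    """
    chosen = {genome: models[0] for genome, models in candidates.items()}
    for _ in range(max_iter):
        changed = False
        for genome, models in candidates.items():
            others = [m for g, m in chosen.items() if g != genome]

            def score(m: Model) -> float:
                return sum(similarity(m, o) for o in others)

            best = max(models, key=score)
            # Switch only on a strict improvement, so ties keep the current model.
            if score(best) > score(chosen[genome]):
                chosen[genome] = best
                changed = True
        if not changed:  # fixed point: no model changed in this pass
            break
    return chosen


# Hypothetical locus: genome A's initial two-exon model disagrees with the
# three-exon structure found in genomes B and C.
candidates = {
    "A": [(300, 150), (300, 120, 200)],
    "B": [(300, 120, 200)],
    "C": [(303, 120, 200), (303,)],
}
result = select_models(candidates)
```

In this toy example the initial two-exon model in genome A is replaced by the three-exon alternative, because that alternative matches the structure selected in B and C, while the models in B and C are left unchanged.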