UC San Diego
Evolution studied through protein structural domains
- Author(s): Yang, Song
- et al.
A protein structural domain is defined as a compact, spatially distinct part of a protein that can fold independently of neighboring sequences. Because the number of protein domains is limited, and domains are evolutionarily more conserved than protein sequences, they play an important role in our understanding of the structure, function and evolution of proteins. As fundamental evolutionary units, protein domains are associated with a variety of evolutionary processes such as domain genesis, loss, horizontal gene transfer (HGT) and domain combination. In the era of complete genomes, the number and types of protein domains in over 300 organisms across all major lineages of life can be retrieved using sophisticated domain assignment algorithms. Protein domain content is used as a structural attribute to determine the relatedness of organisms and to reconstruct a phylogenetic tree of life that is comparable to phylogenies derived from sequence analysis and from combinations of gene content and gene order. The positions and order of protein domains along the chromosomes of closely related species, as viewed by genome alignment plots, reflect genome rearrangement events such as duplication and inversion. The evolutionary history of individual domains and domain combinations can be illustrated by domain trees and combination trees. The evolutionary origins of all protein domains suggest that a large proportion of domains were invented at the root of life, and that during evolution organisms tend to create new proteins and functions through recombination and duplication of existing domains rather than through creation of new domains de novo. Taken together, this work shows the power of using protein structural domains to study evolution.
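As a rough illustration of how domain content could serve as an attribute for measuring the relatedness of organisms, the sketch below computes pairwise distances from domain presence/absence sets. The abstract does not state which distance measure the work uses; the Jaccard distance here is an assumption, and the organism names and domain identifiers are hypothetical placeholders rather than real assignments.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of domain identifiers."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty domain sets are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(domain_content):
    """Build a symmetric distance lookup keyed by (organism, organism) pairs.

    domain_content maps an organism name to the set of domains found in it.
    """
    names = sorted(domain_content)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = jaccard_distance(domain_content[x], domain_content[y])
        dist[(x, y)] = d
        dist[(y, x)] = d
    return dist


# Hypothetical toy data: placeholder organisms and domain labels.
toy_content = {
    "organism_A": {"d1", "d2", "d3", "d4"},
    "organism_B": {"d1", "d2", "d3", "d5"},
    "organism_C": {"d1", "d6"},
}

dist = distance_matrix(toy_content)
```

In this toy input, organism_A and organism_B share most of their domains and so end up closer to each other than either is to organism_C. A distance lookup of this kind could then be passed to a standard distance-based tree-building method such as neighbor joining, although whether that matches the procedure used in the work is not stated in the abstract.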