ASSESSMENT AND PREDICTION OF PROTEIN STRUCTURES
- Author(s): Eramian, David Edward
- Advisor(s): Sali, Andrej
- et al.
An ambitious goal of modern biology is to understand the structure(s), interaction(s), and function(s) of each protein within cells and organisms. Understanding the nature of the interactions a protein makes is important because no protein exists in isolation; each functions through interactions with other macromolecules. Knowledge about the function of proteins is essential to understanding biological processes. Structure is the unifying component: both interactions and functions are intrinsically related to structure, as the structure of a protein helps define its function and affects the nature, type, and number of interactions it has with other macromolecules. Great attention has been paid to the development of methods for both the theoretical prediction and experimental determination of protein structure. Though experimentally derived structures are more accurate, they are relatively scarce: of the millions of known protein sequences, far fewer than 1% have an experimentally solved structure. In the absence of an experimentally determined structure, computational models are often valuable for generating testable hypotheses and giving insight into existing experimental data. Such computational structure models are available for over two orders of magnitude more protein sequences than are experimentally determined structures, yet they suffer from two limitations that experimentally determined structures do not: they frequently contain significant errors, and their accuracy cannot be readily assessed. The research described herein sought to increase the accuracy and applicability of computational protein models by addressing these two limitations.
This broad goal was approached in four principal ways: (1) identifying the most native-like models from among sets of similar models; (2) predicting the absolute accuracy of protein structure models; (3) improving the accuracy of target/template alignments to increase the accuracy of comparative models built from distantly related template structures; and (4) developing a unified protein structure prediction protocol that makes the best use of all available information about the structure of a given protein, regardless of whether that information is based directly on experiment, on the broader knowledge base, on statistical potentials, or on intuition.