UC San Diego
Identifying protein sequence signatures for flexible regions of functional importance
- Author(s): Gu, Jenny
- et al.
The advent of structural genomics and high-throughput sequencing technology has given researchers an abundance of protein structures to visualize and sequences to analyze. Nevertheless, a general sequence-flexibility relationship that holds for all proteins has yet to be established. This topic is increasingly important given the growing evidence that entropy plays a larger role in the allosteric nature of proteins. Many recent papers reporting the dynamics of several different protein systems indicate that fluctuations within a protein can be redistributed by sequence modulations or intermolecular binding. Furthermore, proteins in solution constantly interconvert between multiple conformational states because of their dynamic nature. This dissertation focuses on understanding the local regions that facilitate the changes proteins require to fulfill their biological roles. These local regions include, but are not limited to, hinges, catalytic loops, and recognition loops. In this thesis, tools were developed to identify these flexible regions of functional importance using structure- and sequence-based methods. Transitional Dynamic Analysis identifies the redistribution of low-frequency fluctuations between two different protein conformational states. Positional Impact Vertex for Entropy Transfer identifies position-specific regions in the protein structure that have a large impact on global dynamics. With a third structure-based approach, we generated a training dataset used to create the Wiggle and Wiggle200 predictors, which identify these regions of interest using only sequence information. The results of this research contribute to the community by expanding our knowledge of protein sequence space and of the second-order information that can be extracted from it.