Prediction of Potential Host-Pathogen Protein Interactions by Structure
Proteins function through interactions with other biomolecules. Here I describe a series of tools developed and applied to study potential interactions between host and pathogen proteins. First, I describe PIBASE, a comprehensive relational database of structurally defined interfaces between pairs of protein domains. A diverse set of geometric, physicochemical, and topological properties is calculated to describe each complex and its domains, interfaces, and binding sites (http://salilab.org/pibase). This database supports observations ranging from the atomistic detail of individual interfaces to the structural organization of protein interaction space. Next, I present a comparative modeling method that uses experimentally determined structures of protein complexes as templates to predict the composition of protein complexes. Candidate complexes are built by comparative modeling of their components and then assessed with a statistical potential derived from binary domain interfaces in PIBASE. The predicted complexes are further filtered using functional annotation and subcellular localization data. The protocol was validated against experimentally observed interactions in Saccharomyces cerevisiae (http://salilab.org/modbase). Finally, I present a global computational protocol that generates testable predictions of potential host-pathogen protein interactions. The protocol first scans the complete host and pathogen genomes for proteins with similarity to components of known protein complexes, then assesses these putative interactions, using structure where available, and finally filters them by biological context, such as the stage-specific expression of pathogen proteins and the tissue expression of host proteins. The technique was applied to a set of ten pathogens, including species of Mycobacterium, Apicomplexa, and Kinetoplastida, that are responsible for "neglected" human diseases.
The method was assessed by (i) comparison to a set of known host-pathogen interactions, (ii) comparison to genomics data describing host and pathogen genes involved in infection, and (iii) analysis of the functional properties of the human proteins predicted to interact with pathogen proteins. The predictions include interactions known from previously characterized mechanisms, such as cytoadhesion and protease inhibition, as well as suspected interactions in hypothesized pathways, such as apoptotic pathways (http://salilab.org/hostpathogen). These results suggest that comparative protein structure modeling, in combination with genomic and proteomic data, can be a valuable tool for the study of interspecific protein interactions.
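
The three stages of the host-pathogen protocol described above (scan, assess, filter) can be illustrated with a minimal Python sketch. It is not the implementation behind the published predictions: the data model, the function names (`scan`, `assess`, `filter_by_context`, `predict`), the use of domain-family labels as the similarity criterion, and the convention that a lower score is better are all assumptions made for illustration. Pairs for which no structure-based score is available are retained unscored, mirroring the "using structure where available" step.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Template = Tuple[str, str]  # hypothetical: a pair of domain families seen interacting in a known complex


@dataclass(frozen=True)
class Protein:
    name: str
    families: FrozenSet[str]  # domain families detected in this protein (illustrative labels)
    context: FrozenSet[str]   # e.g. tissue (host) or life-cycle stage (pathogen)


@dataclass(frozen=True)
class Candidate:
    host: Protein
    pathogen: Protein
    template: Template
    score: Optional[float] = None  # None means no structure-based score was available


def scan(hosts: Iterable[Protein], pathogens: Iterable[Protein],
         templates: Iterable[Template]) -> List[Candidate]:
    """Stage 1: pair host and pathogen proteins that match both sides of a template."""
    templates = list(templates)
    found = []
    for h, p in product(list(hosts), list(pathogens)):
        for a, b in templates:
            # Accept the template in either orientation.
            if (a in h.families and b in p.families) or (b in h.families and a in p.families):
                found.append(Candidate(h, p, (a, b)))
    return found


def assess(candidates: Iterable[Candidate],
           scores: Dict[Tuple[str, str], float], cutoff: float) -> List[Candidate]:
    """Stage 2: keep pairs scoring at or below the cutoff; keep unscored pairs as they are."""
    kept = []
    for c in candidates:
        s = scores.get((c.host.name, c.pathogen.name))
        if s is None or s <= cutoff:  # assumption: lower score = more favorable interface
            kept.append(Candidate(c.host, c.pathogen, c.template, s))
    return kept


def filter_by_context(candidates: Iterable[Candidate], host_context: FrozenSet[str],
                      pathogen_context: FrozenSet[str]) -> List[Candidate]:
    """Stage 3: keep pairs whose host and pathogen contexts both overlap the allowed sets."""
    return [c for c in candidates
            if c.host.context & host_context and c.pathogen.context & pathogen_context]


def predict(hosts, pathogens, templates, scores, cutoff, host_context, pathogen_context):
    """Run the three stages in order and return the surviving candidates."""
    scored = assess(scan(hosts, pathogens, templates), scores, cutoff)
    return filter_by_context(scored, host_context, pathogen_context)
```

Keeping the stages as separate functions makes it possible to inspect how many candidate pairs survive each step, which is useful when comparing the output against a set of known interactions.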