Understanding genetic variation in the proteome: a multi-scale structural systems biology toolkit
- Author(s): Mih, Nathan Da-Wei
- Advisor(s): Palsson, Bernhard O
- et al.
Computational representations of metabolic networks have traditionally abstracted away the enzymatic components that catalyze their reactions. However, the long history of reductionism in studying biological components, specifically the three-dimensional structure of proteins, has produced a rich knowledgebase of experimental data and computational tools capable of describing atomic-level biochemical interactions and inferring their functional consequences. The challenge of integrating such information into large-scale computational models of cells lies at the intersection of structural and systems biology. This dissertation aims to explore and address the methodological issues that arise from integrating these two fields, centered on the creation of a computational framework that answers the question of how best to use structural data in genome-scale models of metabolism. I present a software framework, ssbio, and an associated pipeline that simplify the integration process and allow the construction of high-quality genome-scale models with protein structures (GEM-PROs). The framework allows for the rigorous consideration and consolidation of the often repetitive or incomplete data on proteins in 3D space. Next, I explore a deep integration of molecular modeling tools into metabolic models to understand the impact of non-synonymous mutations on protein-ligand interactions within the human erythrocyte. I then show how these models provide utility in a comparative analysis of Escherichia coli and Thermotoga maritima by elucidating the impact of the structural proteome on the temperature dependence of growth, the distribution of protein fold families, and substrate specificity. Finally, I extend this framework to examine whether hypothesized adaptations to oxidative stress are reflected in the structural proteomes of multiple strains of E. coli, and I incorporate a probabilistic model of oxidative damage, based on selected structural features, into a model of metabolism and protein synthesis. Overall, these studies demonstrate the utility of a multi-scale approach to understanding how changes to the structural proteome can impact the system in which they participate, and they provide a basis for future computational studies in structural systems biology.