Actionable Modeling: Elucidate Enzyme Interactions in Complex Biosynthesis Systems by Interpretable Models from Omics Data
eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

No data is associated with this publication.
Abstract

With advances in mass spectrometry, there is increasing demand for more effective and adaptive approaches to systematically extract biological insights from glycomic and lipidomic data. This thesis explores the application of Markov modeling to two key cellular processes, N-glycosylation and lipid biosynthesis.

In Chapter 2, we demonstrate that a Markov model of the N-glycosylation network captures intricate glycosyltransferase interactions by reproducing a set of glycoprofiles from glycoengineered CHO cells producing erythropoietin (EPO). We further validate the model’s parsimony and accuracy by predicting the glycoprofiles of other glycoengineered drugs from the trained models and their wildtype glycoprofiles. The results attest to the model’s ability to learn glycosyltransferase activities and substrate specificities. To increase the impact of this approach, we also develop GlycoMME to allow broader access to this modeling pipeline, presenting a promising direction for rational glycoengineering.

Chapter 3 extends the modeling approach to lipid biosynthesis, introducing the Lipid Synthesis Investigative Markov Model (LipidSIM). As a low-parameter, biologically interpretable framework, LipidSIM leverages the interdependency in lipidomic data to extract and quantify perturbations to lipid biosynthesis reactions, generating hypotheses that are directly testable with transcriptomic data. The method is showcased in 5 scenarios across 3 lipidomic datasets, and the results substantiate LipidSIM as a valuable tool for extracting insights from high-dimensional lipidomic data of different types.

In Chapter 4, a Taguchi design is used to systematically characterize the impact of 15 media supplements on CHO cell phenotypes, especially N-glycosylation. The analytical pipeline disentangles the impact of individual supplements at different concentration levels using a minimal number of experimental configurations. This approach answers the demand for more flexible strategies for controlling N-glycosylation beyond genetic engineering. When applied in conjunction with the Taguchi design, our modeling framework has the potential to facilitate rapid customization of media for the growing market of glycoprotein drugs.

This thesis encapsulates the innovative applications of probabilistic modeling in accounting for the biological dependency underlying omics data, offering insights into intricate cellular processes and motivating further exploration in actionable modeling of biological systems.
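
The abstract does not spell out how the Markov models are built, so the following is only a toy sketch of the general idea rather than the thesis's actual method. It assumes a small, acyclic reaction network in which intermediate species pass probability mass along reactions until it reaches terminal (absorbing) products, so that the final distribution over products plays the role of a measured profile. The state names (`S0`, `S1`, `S2`, `P_a`, `P_b`, `P_c`), the transition probabilities, and the function `absorption_profile` are all hypothetical.

```python
def absorption_profile(transitions, start, max_steps=1000, tol=1e-12):
    """Push probability mass from `start` through the chain until it is absorbed.

    `transitions` maps each non-absorbing state to {next_state: probability}.
    Any state without an entry in `transitions` is treated as absorbing.
    Returns {absorbing_state: probability}.
    """
    # Each row of outgoing probabilities must sum to 1.
    for state, row in transitions.items():
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError(f"outgoing probabilities of {state!r} do not sum to 1")

    mass = {start: 1.0}   # probability mass still moving through the network
    absorbed = {}         # probability mass that has reached a terminal product
    for _ in range(max_steps):
        if not mass:
            break
        next_mass = {}
        for state, p in mass.items():
            row = transitions.get(state)
            if row is None:
                # Absorbing state: the mass stays here for good.
                absorbed[state] = absorbed.get(state, 0.0) + p
                continue
            for target, q in row.items():
                next_mass[target] = next_mass.get(target, 0.0) + p * q
        # Drop negligible mass so the loop also ends on slowly draining cycles.
        mass = {s: p for s, p in next_mass.items() if p > tol}
    return absorbed


# Hypothetical toy network: three intermediates, three terminal products.
toy_network = {
    "S0": {"S1": 0.6, "P_a": 0.4},
    "S1": {"S2": 0.5, "P_b": 0.5},
    "S2": {"P_c": 1.0},
}

baseline = absorption_profile(toy_network, "S0")

# A hypothetical perturbation: the S1 -> S2 reaction is strongly reduced.
perturbed_network = dict(toy_network, S1={"S2": 0.1, "P_b": 0.9})
perturbed = absorption_profile(perturbed_network, "S0")

for product in sorted(baseline):
    print(product, round(baseline[product], 3), round(perturbed[product], 3))
```

In this sketch, fitting a model would mean choosing transition probabilities so that the computed product distribution matches an observed profile, and comparing fitted probabilities between conditions would point to the reactions that were perturbed. Whether the thesis's models follow this exact formulation is an assumption.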

Main Content

This item is under embargo until April 12, 2026.