Protein secretion is one of the most fundamental biological processes in all living organisms. The secretory pathway consists of a complex network of endomembrane systems and associated proteins that work together to synthesize, post-translationally modify, transport, and quality-control thousands of membrane and secreted proteins vital for cell-cell communication, cell adhesion, modification of the extracellular matrix, and immunity. Not surprisingly, perturbations to the secretory system are seen in various diseases, such as cancer [1], Alzheimer’s disease [2,3], Parkinson’s disease [4], and non-alcoholic fatty liver disease [5]. Furthermore, protein secretion is heavily exploited in biotechnology, as exemplified by the extensive use of “cell factories” to produce biopharmaceuticals and industrial enzymes. In fact, many of the most important biotherapeutics, including monoclonal antibodies, are produced via the secretory pathway in mammalian cell culture. Understanding protein secretion at the molecular level is therefore fundamental to elucidating the pathophysiology of these diseases and paramount for the efficient production of life-altering biotherapeutics.
In this doctoral dissertation, computational tools and methods are developed to enable systems biology analysis of the secretory pathway. First, transcriptomic and machine learning methods are used to decipher the determinants of recombinant protein yield across the human secretome in Chinese hamster ovary (CHO) cells, the most popular bioproduction host. Results from this study serve as a resource for future work, building the framework needed to understand protein production in this host system. Second, genome-scale models are used to build a user-friendly tool that predicts and quantifies the metabolic and secretory activity of cells. This tool enables anyone, regardless of computational background, to monitor and quantify multiple complex biological systems and to obtain a phenotypic interpretation of omics data. Finally, a comprehensive map of the secretory landscape, capturing the spatial and functional organization of the pathway, is presented. The map was built with a bottom-up systems biology approach: information and data were collected, assembled, integrated, and curated through extensive literature surveys and searches of numerous databases. By accounting for all the molecular machinery involved in protein secretion, one can more effectively diagnose the causes of variation in protein secretion using big data analysis techniques and tools from systems biology.