Constraint-based modeling of metabolism at the genome scale has been a successful field for two decades. Genome-scale modeling has now entered a new era with models of ever-increasing size, known as ME models, which account for every known reaction in the metabolic and gene expression networks. These models integrate both metabolic costs (operational expenses) and gene expression costs (capital expenses) in a quasi-econometric model of growing bacterial cells. This type of modeling presents new challenges but also brings exciting novel capabilities to constraint-based modeling.
An initial challenge posed by models of this size and scale is software. Existing software tools for constraint-based modeling do not scale to models the size of ME models. Additionally, they make assumptions about model structure which prevent them from representing the nonlinearities present in ME models. To address this, a new software package, COBRApy, was developed. Interoperability between constraint-based modeling frameworks was also already difficult, with different tools often producing different results for the same model. To address this shortcoming, a web-based model validator was deployed to help the constraint-based modeling community ensure that models compute identically across different tools. Finally, a new framework was written on top of COBRApy to build and simulate ME models quickly, accurately, and reproducibly.
To exploit the descriptive nature of ME models, new algorithms were developed to gain increased biological insight from model simulations. To improve the understanding of the transcriptional regulatory network in E. coli, a method was developed to predict transcription factor activity from ME simulations, and these predictions were successfully used to guide experimental design for identifying novel transcription factor binding sites. To improve the ability of the E. coli ME model to directly predict differential gene expression, a novel method was developed to estimate ME model parameters from high-throughput omics data. Finally, the ME model of the E. coli K-12 MG1655 strain was extended to build strain-specific ME models for 40 different E. coli strains.