-
Data-driven and Physics Informed Modelling of Chinese Hamster Ovary Cell Bioreactors
Authors:
Tianqi Cui,
Tom S. Bertalan,
Nelson Ndahiro,
Pratik Khare,
Michael Betenbaugh,
Costas Maranas,
Ioannis G. Kevrekidis
Abstract:
Fed-batch culture is an established operation mode for the production of biologics using mammalian cell cultures. Quantitative modeling integrates both kinetics for some key reaction steps and optimization-driven metabolic flux allocation, using flux balance analysis; this is known to lead to certain mathematical inconsistencies. Here, we propose a physically-informed data-driven hybrid model (a "gray box") to learn models of the dynamical evolution of Chinese Hamster Ovary (CHO) cell bioreactors from process data. The approach incorporates physical laws (e.g. mass balances) as well as kinetic expressions for metabolic fluxes. Machine learning (ML) is then used to (a) directly learn evolution equations (black-box modelling); (b) recover unknown physical parameters ("white-box" parameter fitting) or -- importantly -- (c) learn partially unknown kinetic expressions (gray-box modelling). We encode the convex optimization step of the overdetermined metabolic biophysical system as a differentiable, feed-forward layer into our architectures, connecting partial physical knowledge with data-driven machine learning.
Submitted 4 May, 2023;
originally announced May 2023.
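To illustrate the mass-balance component mentioned in the abstract, here is a minimal sketch of a fed-batch balance integrated with explicit Euler steps. The Monod growth law, the function and parameter names, and all numerical values are assumptions made for this example; none are taken from the paper, and the sketch leaves out the flux balance and machine learning layers entirely.

```python
def simulate_fed_batch(x0, s0, v0, feed_rate, s_feed,
                       mu_max, k_s, yield_xs, t_end, dt=0.001):
    """Explicit-Euler integration of fed-batch mass balances.

    The state is tracked as total amounts (biomass X*V, substrate S*V)
    plus the liquid volume V. Monod kinetics are an illustrative
    assumption, not the kinetic expressions used by the authors.
    """
    biomass = x0 * v0      # total biomass, g
    substrate = s0 * v0    # total substrate, g
    volume = v0            # liquid volume, L
    steps = int(round(t_end / dt))
    for _ in range(steps):
        s_conc = max(substrate / volume, 0.0)      # substrate concentration, g/L
        mu = mu_max * s_conc / (k_s + s_conc)      # specific growth rate, 1/h
        growth = mu * biomass                      # biomass formation rate, g/h
        biomass += dt * growth
        # Substrate balance: feed inflow minus consumption for growth.
        substrate += dt * (feed_rate * s_feed - growth / yield_xs)
        # Volume balance: only the feed stream changes the volume.
        volume += dt * feed_rate
    return biomass, substrate, volume


if __name__ == "__main__":
    # Hypothetical operating point, chosen only to exercise the function.
    b, s, v = simulate_fed_batch(x0=0.5, s0=10.0, v0=1.0, feed_rate=0.05,
                                 s_feed=100.0, mu_max=0.04, k_s=0.5,
                                 yield_xs=0.5, t_end=24.0)
    print(b, s, v)
```

Tracking total amounts rather than concentrations keeps the dilution term implicit, and the combined quantity biomass/yield + substrate changes only through the feed, which gives a simple consistency check on the integration.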
-
Accelerating flux balance calculations in genome-scale metabolic models by localizing the application of loopless constraints
Authors:
Siu Hung Joshua Chan,
Lin Wang,
Satyakam Dash,
Costas D. Maranas
Abstract:
Background: Genome-scale metabolic network models and constraint-based modeling techniques have become important tools for analyzing cellular metabolism. Thermodynamically infeasible cycles (TICs) causing unbounded metabolic flux ranges are often encountered. TICs satisfy the mass balance and directionality constraints but violate the second law of thermodynamics. Current practices involve implementing additional constraints to ensure not only optimal but also loopless flux distributions. However, the mixed integer linear programming problems required to solve become computationally intractable for genome-scale metabolic models. Results: We aimed to identify the fewest needed constraints sufficient for optimality under the loopless requirement. We found that loopless constraints are required only for the reactions that share elementary flux modes representing TICs with reactions that are part of the objective function. We put forth the concept of localized loopless constraints (LLCs) to enforce this minimal required set of loopless constraints. By combining with a novel procedure for minimal null-space calculation, the computational time for loopless flux variability analysis is reduced by a factor of 10-150 compared to the original loopless constraints and by 4-20 times compared to the currently fastest method Fast-SNP with the percent improvement increasing with model size. Importantly, LLCs offer a scalable strategy for loopless flux calculations for multi-compartment/multi-organism models of very large sizes (e.g. >10^4 reactions) not feasible before. Matlab functions are available at https://github.com/maranasgroup/lll-FVA.
Submitted 11 November, 2017;
originally announced November 2017.
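The statement that TICs satisfy mass balance yet violate thermodynamics can be made concrete on a toy network. The sketch below uses a hypothetical three-reaction cycle (A -> B, B -> C, C -> A) and arbitrary chemical potentials, all invented for illustration; it is not the LLC method and does not use the authors' Matlab functions.

```python
# Hypothetical toy network with internal reactions only:
#   r1: A -> B,  r2: B -> C,  r3: C -> A
# Rows are metabolites (A, B, C); columns are reactions (r1, r2, r3).
S = [
    [-1, 0, 1],   # A
    [1, -1, 0],   # B
    [0, 1, -1],   # C
]


def mat_vec(matrix, vector):
    """Return the product S * v, i.e. the net production of each metabolite."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def reaction_potential_changes(matrix, potentials):
    """Potential change of each reaction: sum_i S[i][j] * mu[i]."""
    n_met, n_rxn = len(matrix), len(matrix[0])
    return [sum(matrix[i][j] * potentials[i] for i in range(n_met))
            for j in range(n_rxn)]


# Any multiple of the loop flux satisfies steady state, so flux ranges
# around the cycle are unbounded unless loopless constraints are added.
loop_flux = [1.0, 1.0, 1.0]
scaled_loop = [1000.0 * f for f in loop_flux]

# Arbitrary (assumed) chemical potentials for A, B, C.
mu = [3.0, 1.0, 2.0]
delta_mu = reaction_potential_changes(S, mu)

if __name__ == "__main__":
    print("S * v for the loop:", mat_vec(S, scaled_loop))
    print("potential changes around the loop:", delta_mu)
    print("sum around the loop:", sum(delta_mu))
```

Because the potential changes around a closed cycle always sum to zero, they cannot all be negative at once, so a flux that runs every reaction of the cycle in the forward direction cannot be thermodynamically driven, whatever potentials are chosen.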
-
Creation and analysis of biochemical constraint-based models: the COBRA Toolbox v3.0
Authors:
Laurent Heirendt,
Sylvain Arreckx,
Thomas Pfau,
Sebastián N. Mendoza,
Anne Richelle,
Almut Heinken,
Hulda S. Haraldsdóttir,
Jacek Wachowiak,
Sarah M. Keating,
Vanja Vlasov,
Stefania Magnusdóttir,
Chiam Yu Ng,
German Preciat,
Alise Žagare,
Siu H. J. Chan,
Maike K. Aurich,
Catherine M. Clancy,
Jennifer Modamio,
John T. Sauls,
Alberto Noronha,
Aarash Bordbar,
Benjamin Cousins,
Diana C. El Assal,
Luis V. Valcarcel,
Iñigo Apaolaza
, et al. (30 additional authors not shown)
Abstract:
COnstraint-Based Reconstruction and Analysis (COBRA) provides a molecular mechanistic framework for integrative analysis of experimental data and quantitative prediction of physicochemically and biochemically feasible phenotypic states. The COBRA Toolbox is a comprehensive software suite of interoperable COBRA methods. It has found widespread applications in biology, biomedicine, and biotechnology because its functions can be flexibly combined to implement tailored COBRA protocols for any biochemical network. Version 3.0 includes new methods for quality controlled reconstruction, modelling, topological analysis, strain and experimental design, network visualisation as well as network integration of chemoinformatic, metabolomic, transcriptomic, proteomic, and thermochemical data. New multi-lingual code integration also enables an expansion in COBRA application scope via high-precision, high-performance, and nonlinear numerical optimisation solvers for multi-scale, multi-cellular and reaction kinetic modelling, respectively. This protocol can be adapted for the generation and analysis of a constraint-based model in a wide variety of molecular systems biology scenarios. This protocol is an update to the COBRA Toolbox 1.0 and 2.0. The COBRA Toolbox 3.0 provides an unparalleled depth of constraint-based reconstruction and analysis methods.
Submitted 23 February, 2018; v1 submitted 11 October, 2017;
originally announced October 2017.
-
Large-scale inference and graph theoretical analysis of gene-regulatory networks in B. subtilis
Authors:
C. Christensen,
A. Gupta,
C. D. Maranas,
R. Albert
Abstract:
We present the methods and results of a two-stage modeling process that generates candidate gene-regulatory networks of the bacterium B. subtilis from experimentally obtained, yet mathematically underdetermined microchip array data. By employing a computational, linear correlative procedure to generate these networks, and by analyzing the networks from a graph theoretical perspective, we are able to verify the biological viability of our inferred networks, and we demonstrate that our networks' graph theoretical properties are remarkably similar to those of other biological systems. In addition, by comparing our inferred networks to those of a previous, noisier implementation of the linear inference process [17], we are able to identify trends in graph theoretical behavior that occur both in our networks as well as in their perturbed counterparts. These commonalities in behavior at multiple levels of complexity allow us to ascertain the level of complexity to which our process is robust to noise.
Submitted 18 July, 2006;
originally announced July 2006.
-
Elucidation of Directionality for Co-Expressed Genes: Predicting Intra-Operon Termination Sites
Authors:
Anshuman Gupta,
Costas D. Maranas,
Reka Albert
Abstract:
We present a novel framework for inferring regulatory and sequence-level information from gene co-expression networks. The key idea of our methodology is the systematic integration of network inference and network topological analysis approaches for uncovering biological insights. We determine the gene co-expression network of Bacillus subtilis using Affymetrix GeneChip time series data and show how the inferred network topology can be linked to sequence-level information hard-wired in the organism's genome. We propose a systematic way for determining the correlation threshold at which two genes are assessed to be co-expressed by using the clustering coefficient and we expand the scope of the gene co-expression network by proposing the slope ratio metric as a means for incorporating directionality on the edges. We show through specific examples for B. subtilis that by incorporating expression level information in addition to the temporal expression patterns, we can uncover sequence-level biological insights. In particular, we are able to identify a number of cases where (i) the co-expressed genes are part of a single transcriptional unit or operon and (ii) the inferred directionality arises due to the presence of intra-operon transcription termination sites.
Submitted 16 November, 2005;
originally announced November 2005.
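The abstract describes thresholding pairwise expression correlations and using the clustering coefficient to guide the choice of threshold. A minimal sketch of those two building blocks follows. The abstract does not state the exact selection rule, so the code only computes the average clustering coefficient for a given threshold; the gene names, expression profiles, and the use of signed Pearson correlation are assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def build_coexpression_network(profiles, threshold):
    """Link two genes when their correlation is at least `threshold`."""
    adjacency = {gene: set() for gene in profiles}
    for g1, g2 in combinations(profiles, 2):
        if pearson(profiles[g1], profiles[g2]) >= threshold:
            adjacency[g1].add(g2)
            adjacency[g2].add(g1)
    return adjacency


def average_clustering(adjacency):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for node, neighbours in adjacency.items():
        k = len(neighbours)
        if k < 2:
            continue
        # Count edges that connect pairs of this node's neighbours.
        links = sum(1 for a, b in combinations(neighbours, 2)
                    if b in adjacency[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adjacency)


# Hypothetical time-series profiles (not real GeneChip data).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [3.0, 5.0, 7.0, 9.0],
    "g4": [1.0, -1.0, -1.0, 1.0],
}

if __name__ == "__main__":
    for threshold in (0.5, 0.9):
        network = build_coexpression_network(profiles, threshold)
        print(threshold, average_clustering(network))
```

Sweeping the threshold and recording the average clustering coefficient at each value gives a curve from which a cutoff could be chosen; how the authors read that curve is not specified in the abstract.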