-
Data-Driven Observability Decomposition with Koopman Operators for Optimization of Output Functions of Nonlinear Systems
Authors:
Shara Balakrishnan,
Aqib Hasnain,
Robert Egbert,
Enoch Yeung
Abstract:
When complex systems with nonlinear dynamics achieve an output performance objective, only a fraction of the state dynamics significantly impacts that output. Those minimal state dynamics can be identified using the differential geometric approach to the observability of nonlinear systems, but the theory is limited to only analytical systems. In this paper, we extend the notion of nonlinear observable decomposition to the more general class of data-informed systems. We employ Koopman operator theory, which encapsulates nonlinear dynamics in linear models, allowing us to bridge the gap between linear and nonlinear observability notions. We propose a new algorithm to learn Koopman operator representations that capture the system dynamics while ensuring that the output performance measure is in the span of its observables. We show that a transformation of this linear, output-inclusive Koopman model renders a new minimum Koopman representation. This representation embodies only the observable portion of the nonlinear observable decomposition of the original system. A prime application of this theory is to identify genes in biological systems that correspond to specific phenotypes, the performance measure. We simulate two biological gene networks and demonstrate that the observability of Koopman operators can successfully identify genes that drive each phenotype. We anticipate our novel system identification tool will effectively discover reduced gene networks that drive complex behaviors in biological systems.
Submitted 17 October, 2022;
originally announced October 2022.
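As an illustration of the linear observability notion that the abstract above carries over to Koopman models, the sketch below computes the rank of the observability matrix for a small lifted linear model. It is a minimal sketch and not the authors' algorithm: the model form z[k+1] = K z[k], y[k] = W z[k], the matrices `K` and `W`, and all helper names are hypothetical values chosen for illustration. A rank below the number of observables suggests that part of the lifted state never influences the output.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(rows):
    """Exact matrix rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def observability_matrix(K, W):
    """Stack W, WK, ..., WK^(n-1) for the model z[k+1] = K z[k], y = W z."""
    blocks, current = [], W
    for _ in range(len(K)):
        blocks.extend(current)
        current = matmul(current, K)
    return blocks


# Hypothetical 3-observable model: the third observable is decoupled
# from the other two and is not measured by the output.
K = [[Fraction(1, 2), 1, 0],
     [0, Fraction(1, 3), 0],
     [0, 0, Fraction(1, 4)]]
W = [[1, 0, 0]]

obs_rank = rank(observability_matrix(K, W))
print("observable dimension:", obs_rank, "of", len(K))
```

In this toy model only two of the three lifted coordinates affect the output, so a reduced model of dimension two would reproduce the same output behavior.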
-
The Effect of Sensor Fusion on Data-Driven Learning of Koopman Operators
Authors:
Shara Balakrishnan,
Aqib Hasnain,
Rob Egbert,
Enoch Yeung
Abstract:
Dictionary methods for system identification typically rely on one set of measurements to learn governing dynamics of a system. In this paper, we investigate how fusion of output measurements with state measurements affects the dictionary selection process in Koopman operator learning problems. While prior methods use dynamical conjugacy to show a direct link between Koopman eigenfunctions in two distinct data spaces (measurement channels), we explore the specific case where output measurements are nonlinear, non-invertible functions of the system state. This setup reflects the measurement constraints of many classes of physical systems, e.g., biological measurement data, where one type of measurement does not directly transform to another. We propose output constrained Koopman operators (OC-KOs) as a new framework to fuse two measurement sets. We show that OC-KOs are effective for sensor fusion by proving that when learning a Koopman operator, output measurement functions serve to constrain the space of potential Koopman observables and their eigenfunctions. Further, low-dimensional output measurements can be embedded to inform selection of Koopman dictionary functions for high-dimensional models. We propose two algorithms to identify OC-KO representations directly from data: a direct optimization method that uses state and output data simultaneously and a sequential optimization method. We prove a theorem to show that the solution spaces of the two optimization problems are equivalent. We illustrate these findings with a theoretical example and two numerical simulations.
Submitted 29 June, 2021;
originally announced June 2021.
-
Prediction of fitness in bacteria with causal jump dynamic mode decomposition
Authors:
Shara Balakrishnan,
Aqib Hasnain,
Nibodh Boddupalli,
Dennis M. Joshy,
Robert G. Egbert,
Enoch Yeung
Abstract:
In this paper, we consider the problem of learning a predictive model for population cell growth dynamics as a function of the media conditions. We first introduce a generic data-driven framework for training operator-theoretic models to predict cell growth rate. We then introduce the experimental design and data generated in this study, namely growth curves of Pseudomonas putida as a function of casein and glucose concentrations. We use a data-driven approach for model identification, specifically the nonlinear autoregressive (NAR) model, to represent the dynamics. We show theoretically that Hankel DMD can be used to obtain a solution of the NAR model. We show that it identifies a constrained NAR model. To obtain a more general solution, we define a causal state space system using 1-step, 2-step, ..., τ-step predictors of the NAR model and identify a Koopman operator for this model using extended dynamic mode decomposition. We call the hybrid scheme causal-jump dynamic mode decomposition and illustrate it on a growth profile or fitness prediction challenge as a function of different input growth conditions. We show that our model is able to recapitulate training growth curve data with 96.6% accuracy and predict test growth curve data with 91% accuracy.
Submitted 22 June, 2020;
originally announced June 2020.
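The Hankel DMD step named in the abstract above starts from a time-delay embedding of the measured series. The sketch below shows only that embedding step, under assumptions: the function name `hankel_snapshots`, the `delays` parameter, and the toy series are hypothetical and do not come from the paper. Fitting a linear map from each delay vector to its successor (for example by least squares) would be the next step and is not shown.

```python
def hankel_snapshots(series, delays):
    """Build delay-embedded snapshot pairs from a scalar time series.

    Returns (X, Y), where X[i] is the window series[i : i + delays]
    and Y[i] is the same window shifted forward by one sample.
    """
    n_pairs = len(series) - delays
    if delays < 1 or n_pairs < 1:
        raise ValueError("need 1 <= delays < len(series)")
    X = [list(series[i:i + delays]) for i in range(n_pairs)]
    Y = [list(series[i + 1:i + 1 + delays]) for i in range(n_pairs)]
    return X, Y


# Toy series standing in for a growth curve (hypothetical values).
growth = [1, 2, 4, 8, 16]
X, Y = hankel_snapshots(growth, delays=2)
for x, y in zip(X, Y):
    print(x, "->", y)
```

Each pair shares all but one entry, which is the structural constraint that distinguishes a delay-embedded model from a general state-space model.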
-
Steady state programming of controlled nonlinear systems via deep dynamic mode decomposition
Authors:
Aqib Hasnain,
Nibodh Boddupalli,
Shara Balakrishnan,
Enoch Yeung
Abstract:
This paper describes the optimal selection of a control policy to program the steady state of controlled nonlinear systems with hyperbolic fixed points. This work is motivated by the field of synthetic biology, in which saddle points are common (along with limit cycles), and the aim is to program cells to perform both digital and analog computation, though developing genetic digital computation has been the main focus. We frame the analog computing challenge as that of generating a steady state input-output function inside living cells. To program the steady state, a data-driven approach is taken wherein an approximation of the Koopman operator, identified via deep dynamic mode decomposition, is used to describe the dynamics of the system linearly. The new representation of the dynamics is then used to solve an optimization problem for the input which maximizes a direction in state space. Some added structure on the Koopman operator learning process for controlled systems is given for dynamics that are separable in the state and input. Finally, the methods are demonstrated on simulation examples of an incoherent feedforward loop and a combinatorial promoter system, two common network architectures seen in the field of synthetic biology.
Submitted 9 June, 2020; v1 submitted 29 September, 2019;
originally announced September 2019.
-
A data-driven method for quantifying the impact of a genetic circuit on its host
Authors:
Aqib Hasnain,
Subhrajit Sinha,
Yuval Dorfan,
Amin Espah Borujeni,
Yongjin Park,
Paul Maschhoff,
Uma Saxena,
Joshua Urrutia,
Niall Gaffney,
Diveena Becker,
Atsede Siba,
Narendra Maheshri,
Ben Gordon,
Chris Voigt,
Enoch Yeung
Abstract:
Genetic circuits are designed to implement certain logic in living cells, keeping burden on the host cell minimal. However, manipulating the genome often will have a significant impact for various reasons (usage of the cell machinery to express new genes, toxicity of genes, interactions with native genes, etc.). In this work we utilize Koopman operator theory to construct data-driven models of transcriptomic-level dynamics from noisy and temporally sparse RNAseq measurements. We show how Koopman models can be used to quantify impact on genetic circuits. We consider an experimental example, using high-throughput RNAseq measurements collected from wild-type E. coli, single gate components transformed in E. coli, and a NAND circuit composed from individual gates in E. coli, to explore how Koopman subspace functions encode increasing circuit interference on E. coli chassis dynamics. The algorithm provides a novel method for quantifying the impact of synthetic biological circuits on host-chassis dynamics.
Submitted 13 September, 2019;
originally announced September 2019.
-
Koopman Operators for Generalized Persistence of Excitation Conditions for Nonlinear Systems
Authors:
Nibodh Boddupalli,
Aqib Hasnain,
Sai Pushpak Nandanoori,
Enoch Yeung
Abstract:
It is hard to identify nonlinear biological models strictly from data, with results that are often sensitive to experimental conditions. Automated experimental workflows and liquid handling enable unprecedented throughput, as well as the capacity to generate extremely large datasets. We seek to develop generalized identifiability conditions for informing the design of automated experiments to discover predictive nonlinear biological models. For linear systems, identifiability is characterized by persistence of excitation conditions. For nonlinear systems, no such persistence of excitation conditions exist. We use the input-Koopman operator method to model nonlinear systems and derive identifiability conditions for open-loop systems initialized from a single initial condition. We show that nonlinear identifiability is intrinsically tied to the rank of a given dataset's power spectral density, transformed through the lifted Koopman observable space. We illustrate these identifiability conditions with a simulated synthetic gene circuit model, the repressilator. We illustrate how rank degeneracy in datasets results in overfitted nonlinear models of the repressilator, resulting in poor predictive accuracy. Our findings provide novel experimental design criteria for discovery of globally predictive nonlinear models of biological phenomena.
Submitted 13 September, 2019; v1 submitted 24 June, 2019;
originally announced June 2019.
-
Optimal reporter placement in sparsely measured genetic networks using the Koopman operator
Authors:
Aqib Hasnain,
Nibodh Boddupalli,
Enoch Yeung
Abstract:
Optimal sensor placement is an important yet unsolved problem in control theory. In biological organisms, genetic activity is often highly nonlinear, making it difficult to design libraries of promoters to act as reporters of the cell state. We make use of the Koopman observability gramian to develop an algorithm for optimal sensor (or reporter) placement for discrete time nonlinear dynamical systems to ease the difficulty of design of the promoter library. This is possible because the Koopman operator represents the evolution of a nonlinear system linearly by lifting the states to an infinite-dimensional space of observables. The Koopman framework ideally demands high temporal resolution, but data in biology are often sampled sparsely in time. Therefore we compute what we call the temporally fine-grained Koopman operator from the temporally coarse-grained Koopman operator, the latter of which is identified from the sparse data. The optimal placement of sensors then corresponds to maximizing the observability of the fine-grained system. We demonstrate the algorithm on a simulation example of a circadian oscillator.
Submitted 18 September, 2019; v1 submitted 3 June, 2019;
originally announced June 2019.
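To make the idea of ranking sensor locations by an observability gramian concrete, the sketch below scores each candidate single-state sensor for a small discrete-time linear model. This is a sketch under stated assumptions rather than the paper's algorithm: the finite horizon, the use of the gramian trace as the score, the matrix `A`, and all function names are hypothetical choices made for illustration. For a sensor row c, the trace of the finite-horizon gramian equals the sum over k of the squared norm of c A^k.

```python
def vecmat(v, A):
    """Multiply row vector v by matrix A (list of rows)."""
    return [sum(v[k] * A[k][j] for k in range(len(v)))
            for j in range(len(A[0]))]


def gramian_trace(A, c, horizon):
    """Trace of sum_{k<horizon} (A^T)^k c^T c A^k for a single sensor row c."""
    total, row = 0.0, list(c)
    for _ in range(horizon):
        total += sum(x * x for x in row)
        row = vecmat(row, A)
    return total


def best_single_sensor(A, horizon):
    """Score a sensor on each state and return (best index, all scores)."""
    n = len(A)
    scores = []
    for i in range(n):
        c = [1.0 if j == i else 0.0 for j in range(n)]
        scores.append(gramian_trace(A, c, horizon))
    best = max(range(n), key=scores.__getitem__)
    return best, scores


# Hypothetical 2-state model x[k+1] = A x[k]: state 1 is driven by state 0,
# so a sensor on state 1 also carries information about state 0.
A = [[0.5, 0.0],
     [1.0, 0.5]]
best, scores = best_single_sensor(A, horizon=3)
print("scores:", scores, "best sensor index:", best)
```

Other scalar summaries of the gramian, such as its smallest eigenvalue or determinant, are common alternatives to the trace and can rank sensors differently.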
-
Querying over Federated SPARQL Endpoints ---A State of the Art Survey
Authors:
Nur Aini Rakhmawati,
Jürgen Umbrich,
Marcel Karnstedt,
Ali Hasnain,
Michael Hausenblas
Abstract:
The increasing amount of Linked Data and its inherently distributed nature have, in recent years, attracted significant attention to data search throughout the research community and amongst practitioners. Inspired by research results from traditional distributed databases, different approaches for managing federation over SPARQL Endpoints have been introduced. SPARQL is the standardised query language for RDF, the default data model used in Linked Data deployments, and SPARQL Endpoints are a popular access mechanism provided by many Linked Open Data (LOD) repositories. In this paper, we initially give an overview of the federation framework infrastructure and then proceed with a comparison of existing SPARQL federation frameworks. Finally, we highlight shortcomings in existing frameworks, which we hope will help spawn new research directions.
Submitted 7 June, 2013;
originally announced June 2013.
-
Microrheological Characterisation of Anisotropic Materials
Authors:
I A Hasnain,
A M Donald
Abstract:
We describe the measurement of anisotropic viscoelastic moduli in complex soft materials, such as biopolymer gels, via video particle tracking microrheology of colloid tracer particles. The use of a correlation tensor to find the axes of maximum anisotropy, and hence the mechanical director, is described. The moduli of an aligned DNA gel are reported, as a test of the technique; this may have implications for high DNA concentrations in vivo. We also discuss the errors in microrheological measurement, and describe the use of frequency space filtering to improve displacement resolution, and hence probe these typically high modulus materials.
Submitted 3 March, 2006; v1 submitted 6 June, 2005;
originally announced June 2005.