-
Feynman on Artificial Intelligence and Machine Learning, with Updates
Authors:
Eric Mjolsness
Abstract:
I present my recollections of Richard Feynman's mid-1980s interest in artificial intelligence and neural networks, set in the technical context of the physics-related approaches to neural networks of that time. I attempt to evaluate his ideas in the light of the substantial advances in the field since then, and vice versa. There are aspects of Feynman's interests that I think have been largely achieved and others that remain excitingly open, notably in computational science, and potentially including the revival of symbolic methods therein.
Submitted 31 August, 2022;
originally announced September 2022.
-
Physics-based machine learning for modeling stochastic IP3-dependent calcium dynamics
Authors:
Oliver K. Ernst,
Tom Bartol,
Terrence Sejnowski,
Eric Mjolsness
Abstract:
We present a machine learning method for model reduction which incorporates domain-specific physics through candidate functions. Our method estimates an effective probability distribution and differential equation model from stochastic simulations of a reaction network. The close connection between reduced and fine scale descriptions allows approximations derived from the master equation to be introduced into the learning problem. This representation is shown to improve generalization and allows a large reduction in network size for a classic model of inositol trisphosphate (IP3) dependent calcium oscillations in non-excitable cells.
Submitted 10 September, 2021;
originally announced September 2021.
-
Diff2Dist: Learning Spectrally Distinct Edge Functions, with Applications to Cell Morphology Analysis
Authors:
Cory Braker Scott,
Eric Mjolsness,
Diane Oyen,
Chie Kodera,
David Bouchez,
Magalie Uyttewaal
Abstract:
We present a method for learning "spectrally descriptive" edge weights for graphs. We generalize a previously known distance measure on graphs (Graph Diffusion Distance), thereby allowing it to be tuned to minimize an arbitrary loss function. Because all steps involved in calculating this modified GDD are differentiable, we demonstrate that it is possible for a small neural network model to learn edge weights which minimize loss. GDD alone does not effectively discriminate between graphs constructed from shoot apical meristem images of wild-type vs. mutant Arabidopsis thaliana specimens. However, training edge weights and kernel parameters with contrastive loss produces a learned distance metric with large margins between these graph categories. We demonstrate this by showing improved performance of a simple k-nearest-neighbors classifier on the learned distance matrix. We also demonstrate a further application of this method to biological image analysis: once trained, we use our model to compute the distance between the biological graphs and a set of graphs output by a cell division simulator. This allows us to identify simulation parameter regimes which are similar to each class of graph in our original dataset.
Submitted 29 June, 2021;
originally announced June 2021.
-
A dynamical system model for predicting gene expression from the epigenome
Authors:
James Brunner,
Jacob Kim,
Timothy Downing,
Eric Mjolsness,
Kord M. Kober
Abstract:
Gene regulation is an important fundamental biological process. The regulation of gene expression is managed through a variety of methods including epigenetic processes (e.g., DNA methylation). Understanding the role of epigenetic changes in gene expression is a fundamental question of molecular biology. Predictions of gene expression values from epigenetic data have tremendous research and clinical potential. Despite active research, studies to date have focused on using statistical models to predict gene expression from methylation data. In contrast, dynamical systems can be used to generate a model to predict gene expression using epigenetic data and a gene regulatory network (GRN) which can also serve as a mechanistic hypothesis. Here we present a novel stochastic dynamical systems model that predicts gene expression levels from methylation data of genes in a given GRN. We provide an evaluation of the model using real patient data and a GRN created from robust reference sources. Software for dataset preparation, model parameter fitting and prediction generation, and reporting is available at https://github.com/kordk/stoch_epi_lib.
Submitted 28 May, 2021; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Graph Prolongation Convolutional Networks: Explicitly Multiscale Machine Learning on Graphs with Applications to Modeling of Cytoskeleton
Authors:
C. B. Scott,
Eric Mjolsness
Abstract:
We define a novel type of ensemble Graph Convolutional Network (GCN) model. Using optimized linear projection operators to map between spatial scales of graph, this ensemble model learns to aggregate information from each scale for its final prediction. We calculate these linear projection operators as the infima of an objective function relating the structure matrices used for each GCN. Equipped with these projections, our model (a Graph Prolongation-Convolutional Network) outperforms other GCN ensemble models at predicting the potential energy of monomer subunits in a coarse-grained mechanochemical simulation of microtubule bending. We demonstrate these performance gains by measuring an estimate of the FLOPs spent to train each model, as well as wall-clock time. Because our model learns at multiple scales, it is possible to train at each scale according to a predetermined schedule of coarse vs. fine training. We examine several such schedules adapted from the Algebraic Multigrid (AMG) literature, and quantify the computational benefit of each. We also compare this model to another model which features an optimized coarsening of the input graph. Finally, we derive backpropagation rules for the input of our network model with respect to its output, and discuss how our method may be extended to very large graphs.
Submitted 6 April, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.
-
Novel diffusion-derived distance measures for graphs
Authors:
C. B. Scott,
Eric Mjolsness
Abstract:
We define a new family of similarity and distance measures on graphs, and explore their theoretical properties in comparison to conventional distance metrics. These measures are defined by the solution(s) to an optimization problem which attempts to find a map minimizing the discrepancy between two graph Laplacian exponential matrices, under norm-preserving and sparsity constraints. Variants of the distance metric are introduced to consider such optimized maps under sparsity constraints as well as fixed time-scaling between the two Laplacians. The objective function of this optimization is multimodal and has discontinuous slope, and is hence difficult for univariate optimizers to solve. We demonstrate a novel procedure for efficiently calculating these optima for two of our distance measure variants. We present numerical experiments demonstrating that (a) upper bounds of our distance metrics can be used to distinguish between lineages of related graphs; (b) our procedure is faster at finding the required optima, by as much as a factor of 10^3; and (c) the upper bounds satisfy the triangle inequality exactly under some assumptions and approximately under others. We also derive an upper bound for the distance between two graph products, in terms of the distance between the two pairs of factors. Additionally, we present several possible applications, including the construction of infinite "graph limits" by means of Cauchy sequences of graphs related to one another by our distance measure.
Submitted 9 September, 2019;
originally announced September 2019.
-
Structural Commutation Relations for Stochastic Labelled Graph Grammar Rule Operators
Authors:
Eric Mjolsness
Abstract:
We show how to calculate the operator algebra and the operator Lie algebra of a stochastic labelled-graph grammar. More specifically, we carry out a generic calculation of the product (and therefore the commutator) of time-evolution operators for any two labelled-graph grammar rewrite rules, where the operator corresponding to each rule is defined in terms of elementary two-state creation/annihilation operators. The resulting graph grammar algebra has the following properties: (1) The product and commutator of two such operators is a sum of such operators with integer coefficients. Thus, the algebra and the Lie algebra occur entirely at the structural (or graph-combinatorial) level of graph grammar rules, lifted from the level of elementary creation/annihilation operators (an improvement over [1], Propositions 1 and 2). (2) The product of the off-diagonal (state-changing) parts of two such graph rule operators is a sum of off-diagonal graph rule operators with non-negative integer coefficients. (3) These results apply whether the semantics of a graph grammar rule leaves behind hanging edges (Theorem 1), or removes hanging edges (Theorem 2). (4) The algebra is constructive in terms of elementary two-state creation/annihilation operators (Corollaries 3 and 8). These results are useful because dynamical transformations of labelled graphs comprise a general modeling framework, and algebraic commutators of time-evolution operators have many analytic uses including designing simulation algorithms and estimating their errors.
Submitted 9 September, 2019;
originally announced September 2019.
-
Deep Learning Moment Closure Approximations using Dynamic Boltzmann Distributions
Authors:
Oliver K. Ernst,
Tom Bartol,
Terrence Sejnowski,
Eric Mjolsness
Abstract:
The moments of spatial probabilistic systems are often given by an infinite hierarchy of coupled differential equations. Moment closure methods are used to approximate a subset of low order moments by terminating the hierarchy at some order and replacing higher order terms with functions of lower order ones. For a given system, it is not known beforehand which closure approximation is optimal, i.e. which higher order terms are relevant in the current regime. Further, the generalization of such approximations is typically poor, as higher order corrections may become relevant over long timescales. We have developed a method to learn moment closure approximations directly from data using dynamic Boltzmann distributions (DBDs). The dynamics of the distribution are parameterized using basis functions from finite element methods, such that the approach can be applied without knowing the true dynamics of the system under consideration. We use the hierarchical architecture of deep Boltzmann machines (DBMs) with multinomial latent variables to learn closure approximations for progressively higher order spatial correlations. The learning algorithm uses a centering transformation, allowing the dynamic DBM to be trained without the need for pre-training. We demonstrate the method for a Lotka-Volterra system on a lattice, a typical example in spatial chemical reaction networks. The approach can be applied broadly to learn deep generative models in applications where infinite systems of differential equations arise.
Submitted 28 May, 2019;
originally announced May 2019.
-
Learning Moment Closure in Reaction-Diffusion Systems with Spatial Dynamic Boltzmann Distributions
Authors:
Oliver K. Ernst,
Tom Bartol,
Terrence Sejnowski,
Eric Mjolsness
Abstract:
Many physical systems are described by probability distributions that evolve in both time and space. Modeling these systems is often challenging due to the large state space and analytically intractable or computationally expensive dynamics. To address these problems, we study a machine learning approach to model reduction based on the Boltzmann machine. Given the form of the reduced model Boltzmann distribution, we introduce an autonomous differential equation system for the interactions appearing in the energy function. The reduced model can treat systems in continuous space (described by continuous random variables), for which we formulate a variational learning problem using the adjoint method for the right hand sides of the differential equations. This approach allows a physical model for the reduced system to be enforced by a suitable parameterization of the differential equations. In this work, the parameterization we employ uses the basis functions from finite element methods, which can be used to model any physical system. One application domain for such physics-informed learning algorithms is modeling reaction-diffusion systems. We study a lattice version of the Rössler chaotic oscillator, which illustrates the accuracy of the moment closure approximation made by the method, and its dimensionality reduction power.
Submitted 22 April, 2019; v1 submitted 26 August, 2018;
originally announced August 2018.
-
Multilevel Artificial Neural Network Training for Spatially Correlated Learning
Authors:
C. B. Scott,
Eric Mjolsness
Abstract:
Multigrid modeling algorithms are a technique used to accelerate relaxation models running on a hierarchy of similar graphlike structures. We introduce and demonstrate a new method for training neural networks which uses multilevel methods. Using an objective function derived from a graph-distance metric, we perform orthogonally-constrained optimization to find optimal prolongation and restriction maps between graphs. We compare and contrast several methods for performing this numerical optimization, and additionally present some new theoretical results on upper bounds of this type of objective function. Once calculated, these optimal maps between graphs form the core of Multiscale Artificial Neural Network (MsANN) training, a new procedure we present which simultaneously trains a hierarchy of neural network models of varying spatial resolution. Parameter information is passed between members of this hierarchy according to standard coarsening and refinement schedules from the multiscale modelling literature. In our machine learning experiments, these models are able to learn faster than default training, achieving a comparable level of error in an order of magnitude fewer training examples.
Submitted 20 May, 2019; v1 submitted 14 June, 2018;
originally announced June 2018.
-
Prospects for Declarative Mathematical Modeling of Complex Biological Systems
Authors:
Eric Mjolsness
Abstract:
Declarative modeling uses symbolic expressions to represent models. With such expressions one can formalize high-level mathematical computations on models that would be difficult or impossible to perform directly on a lower-level simulation program, in a general-purpose programming language. Examples of such computations on models include model analysis, relatively general-purpose model-reduction maps, and the initial phases of model implementation, all of which should preserve or approximate the mathematical semantics of a complex biological model. The potential advantages are particularly relevant in the case of developmental modeling, wherein complex spatial structures exhibit dynamics at molecular, cellular, and organogenic levels to relate genotype to multicellular phenotype. Multiscale modeling can benefit from both the expressive power of declarative modeling languages and the application of model reduction methods to link models across scale. Based on previous work, here we define declarative modeling of complex biological systems by defining the operator algebra semantics of an increasingly powerful series of declarative modeling languages including reaction-like dynamics of parameterized and extended objects; we define semantics-preserving implementation and semantics-approximating model reduction transformations; and we outline a "meta-hierarchy" for organizing declarative models and the mathematical methods that can fruitfully manipulate them.
Submitted 30 June, 2019; v1 submitted 30 April, 2018;
originally announced April 2018.
-
Learning Dynamic Boltzmann Distributions as Reduced Models of Spatial Chemical Kinetics
Authors:
Oliver K. Ernst,
Thomas Bartol,
Terrence Sejnowski,
Eric Mjolsness
Abstract:
Finding reduced models of spatially-distributed chemical reaction networks requires an estimation of which effective dynamics are relevant. We propose a machine learning approach to this coarse graining problem, where a maximum entropy approximation is constructed that evolves slowly in time. The dynamical model governing the approximation is expressed as a functional, allowing a general treatment of spatial interactions. In contrast to typical machine learning approaches which estimate the interaction parameters of a graphical model, we derive Boltzmann-machine like learning algorithms to estimate directly the functionals dictating the time evolution of these parameters. By incorporating analytic solutions from simple reaction motifs, an efficient simulation method is demonstrated for systems ranging from toy problems to basic biologically relevant networks. The broadly applicable nature of our approach to learning spatial dynamics suggests promising applications to multiscale methods for spatial networks, as well as to further problems in machine learning.
Submitted 2 March, 2018;
originally announced March 2018.
-
A Hierarchical Exact Accelerated Stochastic Simulation Algorithm
Authors:
David Orendorff,
Eric Mjolsness
Abstract:
A new algorithm, "HiER-leap", is derived which improves on the computational properties of the ER-leap algorithm for exact accelerated simulation of stochastic chemical kinetics. Unlike ER-leap, HiER-leap utilizes a hierarchical or divide-and-conquer organization of reaction channels into tightly coupled "blocks" and is thereby able to speed up systems with many reaction channels. Like ER-leap, HiER-leap is based on the use of upper and lower bounds on the reaction propensities to define a rejection sampling algorithm with inexpensive early rejection and acceptance steps. But in HiER-leap, large portions of intra-block sampling may be done in parallel. An accept/reject step is used to synchronize across blocks. This method scales well when many reaction channels are present and has desirable asymptotic properties. The algorithm is exact, parallelizable and achieves a significant speedup over SSA and ER-leap on certain problems. This algorithm offers a potentially important step towards efficient in silico modeling of entire organisms.
Submitted 17 December, 2012;
originally announced December 2012.
-
Compositional Stochastic Modeling and Probabilistic Programming
Authors:
Eric Mjolsness
Abstract:
Probabilistic programming is related to a compositional approach to stochastic modeling by switching from discrete to continuous time dynamics. In continuous time, an operator-algebra semantics is available in which processes proceeding in parallel (and possibly interacting) have summed time-evolution operators. From this foundation, algorithms for simulation, inference and model reduction may be systematically derived. The useful consequences are potentially far-reaching in computational science, machine learning and beyond. Hybrid compositional stochastic modeling/probabilistic programming approaches may also be possible.
Submitted 3 December, 2012;
originally announced December 2012.
-
Time-Ordered Product Expansions for Computational Stochastic Systems Biology
Authors:
Eric Mjolsness
Abstract:
The time-ordered product framework of quantum field theory can also be used to understand salient phenomena in stochastic biochemical networks. It is used here to derive Gillespie's Stochastic Simulation Algorithm (SSA) for chemical reaction networks; consequently, the SSA can be interpreted in terms of Feynman diagrams. It is also used here to derive other, more general simulation and parameter-learning algorithms including simulation algorithms for networks of stochastic reaction-like processes operating on parameterized objects, and also hybrid stochastic reaction/differential equation models in which systems of ordinary differential equations evolve the parameters of objects that can also undergo stochastic reactions. Thus, the time-ordered product expansion (TOPE) can be used systematically to derive simulation and parameter-fitting algorithms for stochastic systems.
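For readers unfamiliar with the SSA mentioned in this abstract, the following is a minimal sketch of the standard "direct method" form of Gillespie's algorithm, applied to a hypothetical birth-death network. It is an illustration only and does not reproduce the paper's time-ordered product derivation. The function name `gillespie_ssa`, its arguments, and the rate constants in `birth_death_propensity` are all assumptions made for this example.

```python
import random


def gillespie_ssa(x0, stoich, propensity, t_end, rng):
    """Simulate one trajectory with the direct-method SSA.

    x0         -- initial copy numbers, one entry per species
    stoich     -- stoich[j] is the state change caused by reaction j
    propensity -- function mapping a state to a list of reaction propensities
    t_end      -- simulation stops before the first event later than t_end
    rng        -- a random.Random instance (passed in for reproducibility)

    Returns a list of (time, state) pairs, starting with (0.0, x0).
    """
    t = 0.0
    x = list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        a = propensity(x)
        a_total = sum(a)
        if a_total <= 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick reaction j with probability a[j] / a_total.
        r = rng.random() * a_total
        j = 0
        acc = a[0]
        while acc <= r and j < len(a) - 1:
            j += 1
            acc += a[j]
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory


def birth_death_propensity(x):
    """Hypothetical network: 0 -> X at rate 2.0, X -> 0 at rate 0.5 per molecule."""
    return [2.0, 0.5 * x[0]]


# One species, two reactions: birth adds a molecule, death removes one.
BIRTH_DEATH_STOICH = [[1], [-1]]
```

A trajectory can be drawn with `gillespie_ssa([5], BIRTH_DEATH_STOICH, birth_death_propensity, 10.0, random.Random(0))`. Passing the random generator explicitly keeps runs reproducible; the linear scan over cumulative propensities is the simplest channel-selection rule and is adequate only for small networks.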
Submitted 24 September, 2012;
originally announced September 2012.
-
Tessellations and Pattern Formation in Plant Growth and Development
Authors:
Bruce E Shapiro,
Henrik Jonsson,
Patrick Sahlin,
Marcus Heisler,
Adrienne Roeder,
Michael Burl,
Elliot M Meyerowitz,
Eric D Mjolsness
Abstract:
The shoot apical meristem (SAM) is a dome-shaped collection of cells at the apex of growing plants from which all above-ground tissue ultimately derives. In Arabidopsis thaliana (thale cress), a small flowering weed of the Brassicaceae family (related to mustard and cabbage), the SAM typically contains some three to five hundred cells that range from five to ten microns in diameter. These cells are organized into several distinct zones that maintain their topological and functional relationships throughout the life of the plant. As the plant grows, organs (primordia) form on its surface flanks in a phyllotactic pattern that develop into new shoots, leaves, and flowers. Cross-sections through the meristem reveal a pattern of polygonal tessellation that is suggestive of Voronoi diagrams derived from the centroids of cellular nuclei. In this chapter we explore some of the properties of these patterns within the meristem and explore the applicability of simple, standard mathematical models of their geometry.
Submitted 13 September, 2012;
originally announced September 2012.
-
Stochastic Process Semantics for Dynamical Grammar Syntax: An Overview
Authors:
Eric Mjolsness
Abstract:
We define a class of probabilistic models in terms of an operator algebra of stochastic processes, and a representation for this class in terms of stochastic parameterized grammars. A syntactic specification of a grammar is mapped to semantics given in terms of a ring of operators, so that grammatical composition corresponds to operator addition or multiplication. The operators are generators for the time-evolution of stochastic processes. Within this modeling framework one can express data clustering models, logic programs, ordinary and stochastic differential equations, graph grammars, and stochastic chemical reaction kinetics. This mathematical formulation connects these apparently distant fields to one another and to mathematical methods from quantum field theory and operator algebra.
Submitted 19 November, 2005;
originally announced November 2005.