arXiv:2406.06812 [pdf, other]

On Learning what to Learn: heterogeneous observations of dynamics and establishing (possibly causal) relations among them

Authors: David W. Sroczynski, Felix Dietrich, Eleni D. Koronaki, Ronen Talmon, Ronald R. Coifman, Erik Bollt, Ioannis G. Kevrekidis

Abstract: Before we attempt to learn a function between two (sets of) observables of a physical process, we must first decide what the inputs and what the outputs of the desired function are going to be. Here we demonstrate two distinct, data-driven ways of initially deciding "the right quantities" to relate through such a function, and then proceed to learn it. This is accomplished by processing multiple simultaneous heterogeneous data streams (ensembles of time series) from observations of a physical system: multiple observation processes of the system. We thus determine (a) what subsets of observables are common between the observation processes (and therefore observable from each other, relatable through a function); and (b) what information is unrelated to these common observables, and therefore particular to each observation process, and not contributing to the desired function. Any data-driven function approximation technique can subsequently be used to learn the input-output relation, from k-nearest neighbors and Geometric Harmonics to Gaussian Processes and Neural Networks. Two particular "twists" of the approach are discussed. The first has to do with the identifiability of particular quantities of interest from the measurements. We now construct mappings from a single set of observations of one process to entire level sets of measurements of the process, consistent with this single set. The second attempts to relate our framework to a form of causality: if one of the observation processes measures "now", while the second observation process measures "in the future", the function to be learned among what is common across observation processes constitutes a dynamical model for the system evolution.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2405.20836 [pdf, other]

Solving partial differential equations with sampled neural networks

Authors: Chinmay Datar, Taniya Kapoor, Abhishek Chandra, Qing Sun, Iryna Burak, Erik Lien Bolager, Anna Veselovska, Massimo Fornasier, Felix Dietrich

Abstract: Approximation of solutions to partial differential equations (PDE) is an important problem in computational science and engineering. Using neural networks as an ansatz for the solution has proven a challenge in terms of training time and approximation accuracy. In this contribution, we discuss how sampling the hidden weights and biases of the ansatz network from data-agnostic and data-dependent probability distributions allows us to progress on both challenges. In most examples, the random sampling schemes outperform iterative, gradient-based optimization of physics-informed neural networks regarding training time and accuracy by several orders of magnitude. For time-dependent PDE, we construct neural basis functions only in the spatial domain and then solve the associated ordinary differential equation with classical methods from scientific computing over a long time horizon. This alleviates one of the greatest challenges for neural PDE solvers because it does not require us to parameterize the solution in time. For second-order elliptic PDE in Barron spaces, we prove the existence of sampled networks with $L^2$ convergence to the solution. We demonstrate our approach on several time-dependent and static PDEs. We also illustrate how sampled networks can effectively solve inverse problems in this setting. Benefits compared to common numerical schemes include spectral convergence and mesh-free construction of basis functions.

Submitted 31 May, 2024; originally announced May 2024.

Comments: 16 pages, 15 figures

arXiv:2403.16215 [pdf, other]

Systematic construction of continuous-time neural networks for linear dynamical systems

Authors: Chinmay Datar, Adwait Datar, Felix Dietrich, Wil Schilders

Abstract: Discovering a suitable neural network architecture for modeling complex dynamical systems poses a formidable challenge, often involving extensive trial and error and navigation through a high-dimensional hyper-parameter space. In this paper, we discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems, namely, Linear Time-Invariant (LTI) systems. We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as a solution of a first-order or second-order Ordinary Differential Equation (ODE). Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute sparse architecture and network parameters directly from the given LTI system, leveraging its properties. We bring forth a novel neural architecture paradigm featuring horizontal hidden layers and provide insights into why employing conventional neural architectures with vertical hidden layers may not be favorable. We also provide an upper bound on the numerical errors of our neural networks. Finally, we demonstrate the high accuracy of our constructed networks on three numerical examples.

Submitted 24 March, 2024; originally announced March 2024.

Comments: 37 pages, 25 figures

MSC Class: 93B17; 65L70; 68T07 ACM Class: I.2.m; G.1.3; G.1.7

arXiv:2306.16830 [pdf, other]

Sampling weights of deep neural networks

Authors: Erik Lien Bolager, Iryna Burak, Chinmay Datar, Qing Sun, Felix Dietrich

Abstract: We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks. In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed to obtain a trained network. The sampling is based on the idea of random feature models. However, instead of a data-agnostic distribution, e.g., a normal distribution, we use both the input and the output training data to sample shallow and deep networks. We prove that sampled networks are universal approximators. For Barron functions, we show that the $L^2$-approximation error of sampled shallow networks decreases with the square root of the number of neurons. Our sampling scheme is invariant to rigid body transformations and scaling of the input data, which implies many popular pre-processing techniques are not required. In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures.

Submitted 12 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

Comments: 42 pages incl. references and appendix, 15 figures

MSC Class: 68T07 ACM Class: G.1; G.3

arXiv:2305.16227 [pdf, other]

Transporting Densities Across Dimensions

Authors: Michael Plainer, Felix Dietrich, Ioannis G. Kevrekidis

Abstract: Even the best scientific equipment can only partially observe reality. Recorded data is often lower-dimensional, e.g., two-dimensional pictures of the three-dimensional world. Combining data from multiple experiments then results in a marginal density. This work shows how to transport such lower-dimensional marginal densities into a more informative, higher-dimensional joint space by leveraging time-delayed measurements from an observation process. This can augment the information from scientific equipment to construct a more coherent view. Classical transportation algorithms can be used when the source and target dimensions match. Our approach allows the transport of samples between spaces of different dimensions by exploiting information from the sample collection process. We reconstruct the surface of an implant from partial recordings of bacteria moving on it and construct a joint space for satellites orbiting the Earth by combining one-dimensional, time-delayed altitude measurements.

Submitted 25 May, 2023; originally announced May 2023.

MSC Class: 58K05; 60G30; 37C20; 62-07

arXiv:2304.11925 [pdf, ps, other]

Data-driven modelling of brain activity using neural networks, Diffusion Maps, and the Koopman operator

Authors: Ioannis K. Gallos, Daniel Lehmberg, Felix Dietrich, Constantinos Siettos

Abstract: We propose a machine-learning approach to model long-term out-of-sample dynamics of brain activity from task-dependent fMRI data. Our approach is a three stage one. First, we exploit Diffusion maps (DMs) to discover a set of variables that parametrize the low-dimensional manifold on which the emergent high-dimensional fMRI time series evolve. Then, we construct reduced-order-models (ROMs) on the embedded manifold via two techniques: Feedforward Neural Networks (FNNs) and the Koopman operator. Finally, for predicting the out-of-sample long-term dynamics of brain activity in the ambient fMRI space, we solve the pre-image problem coupling DMs with Geometric Harmonics (GH) when using FNNs and the Koopman modes per se. For our illustrations, we have assessed the performance of the two proposed schemes using a benchmark fMRI dataset with recordings during a visuo-motor task. The results suggest that just a few (for the particular task, five) non-linear coordinates of the high-dimensional fMRI time series provide a good basis for modelling and out-of-sample prediction of the brain activity. Furthermore, we show that the proposed approaches outperform the one-step ahead predictions of the naive random walk model, which, in contrast to our scheme, relies on the knowledge of the signals in the previous time step. Importantly, we show that the proposed Koopman operator approach provides, for any practical purposes, equivalent results to the FNN-GH approach, thus bypassing the need to train a non-linear map and to use GH to extrapolate predictions in the ambient fMRI space; one can use instead the low-frequency truncation of the DMs function space of L^2-integrable functions, to predict the entire list of coordinate functions in the fMRI space and to solve the pre-image problem.

Submitted 24 April, 2023; originally announced April 2023.

MSC Class: 65P99; 46T10; 37N30; 68T05; 37M10

arXiv:2303.03260 [pdf, other]

doi 10.1016/j.cma.2023.116278

On the Use of Neural Networks for Full Waveform Inversion

Authors: Leon Herrmann, Tim Bürchner, Felix Dietrich, Stefan Kollmannsberger

Abstract: Neural networks have recently gained attention in solving inverse problems. One prominent methodology are Physics-Informed Neural Networks (PINNs) which can solve both forward and inverse problems. In the paper at hand, full waveform inversion is the considered inverse problem. The performance of PINNs is compared against classical adjoint optimization, focusing on three key aspects: the forward-solver, the neural network Ansatz for the inverse field, and the sensitivity computation for the gradient-based minimization. Starting from PINNs, each of these key aspects is adapted individually until the classical adjoint optimization emerges. It is shown that it is beneficial to use the neural network only for the discretization of the unknown material field, where the neural network produces reconstructions without oscillatory artifacts as typically encountered in classical full waveform inversion approaches. Due to this finding, a hybrid approach is proposed. It exploits both the efficient gradient computation with the continuous adjoint method as well as the neural network Ansatz for the unknown material field. This new hybrid approach outperforms Physics-Informed Neural Networks and the classical adjoint optimization in settings of two and three-dimensional examples.

Submitted 30 January, 2023; originally announced March 2023.

Journal ref: Computer Methods in Applied Mechanics and Engineering, 2023

arXiv:2211.12386 [pdf, other]

A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms

Authors: Danimir T. Doncevic, Alexander Mitsos, Yue Guo, Qianxiao Li, Felix Dietrich, Manuel Dahmen, Ioannis G. Kevrekidis

Abstract: Meta-learning of numerical algorithms for a given task consists of the data-driven identification and adaptation of an algorithmic structure and the associated hyperparameters. To limit the complexity of the meta-learning problem, neural architectures with a certain inductive bias towards favorable algorithmic structures can, and should, be used. We generalize our previously introduced Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms. In contrast to off-the-shelf deep learning approaches, it features a distinct division into modules for generation of information and for the subsequent assembly of this information towards a solution. Local information in the form of a subspace is generated by subordinate, inner, iterations of recurrent function evaluations starting at the current outer iterate. The update to the next outer iterate is computed as a linear combination of these evaluations, reducing the residual in this space, and constitutes the output of the network. We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta integrators for ordinary differential equations. Due to its modularity, the superstructure can be readily extended with functionalities needed to represent more general classes of iterative algorithms traditionally based on Taylor series expansions.

Submitted 6 July, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: manuscript (22 pages, 9 figures), supporting information (11 pages, 9 figures)

arXiv:2205.00286 [pdf, other]

Learning Effective SDEs from Brownian Dynamics Simulations of Colloidal Particles

Authors: Nikolaos Evangelou, Felix Dietrich, Juan M. Bello-Rivas, Alex Yeh, Rachel Stein, Michael A. Bevan, Ioannis G. Kevrekidis

Abstract: We construct a reduced, data-driven, parameter dependent effective Stochastic Differential Equation (eSDE) for electric-field mediated colloidal crystallization using data obtained from Brownian Dynamics Simulations. We use Diffusion Maps (a manifold learning algorithm) to identify a set of useful latent observables. In this latent space we identify an eSDE using a deep learning architecture inspired by numerical stochastic integrators and compare it with the traditional Kramers-Moyal expansion estimation. We show that the obtained variables and the learned dynamics accurately encode the physics of the Brownian Dynamic Simulations. We further illustrate that our reduced model captures the dynamics of corresponding experimental data. Our dimension reduction/reduced model identification approach can be easily ported to a broad class of particle systems dynamics experiments/models.

Submitted 30 January, 2023; v1 submitted 30 April, 2022; originally announced May 2022.

Comments: 21 pages, 16 figures, 2 tables

arXiv:2204.12536 [pdf, other]

doi 10.1016/j.jcp.2023.112072

Double Diffusion Maps and their Latent Harmonics for Scientific Computations in Latent Space

Authors: Nikolaos Evangelou, Felix Dietrich, Eliodoro Chiavazzo, Daniel Lehmberg, Marina Meila, Ioannis G. Kevrekidis

Abstract: We introduce a data-driven approach to building reduced dynamical models through manifold learning; the reduced latent space is discovered using Diffusion Maps (a manifold learning technique) on time series data. A second round of Diffusion Maps on those latent coordinates allows the approximation of the reduced dynamical models. This second round enables mapping the latent space coordinates back to the full ambient space (what is called lifting); it also enables the approximation of full state functions of interest in terms of the reduced coordinates. In our work, we develop and test three different reduced numerical simulation methodologies, either through pre-tabulation in the latent space and integration on the fly or by going back and forth between the ambient space and the latent space. The data-driven latent space simulation results, based on the three different approaches, are validated through (a) the latent space observation of the full simulation through the Nyström Extension formula, or through (b) lifting the reduced trajectory back to the full ambient space, via Latent Harmonics. Latent space modeling often involves additional regularization to favor certain properties of the space over others, and the mapping back to the ambient space is then constructed mostly independently from these properties; here, we use the same data-driven approach to construct the latent space and then map back to the ambient space.

Submitted 26 April, 2022; originally announced April 2022.

Comments: 25 pages, 21 figures, 4 tables

arXiv:2110.06717 [pdf, other]

On the Parameter Combinations That Matter and on Those That do Not

Authors: Nikolaos Evangelou, Noah J. Wichrowski, George A. Kevrekidis, Felix Dietrich, Mahdi Kooshkbaghi, Sarah McFann, Ioannis G. Kevrekidis

Abstract: We present a data-driven approach to characterizing nonidentifiability of a model's parameters and illustrate it through dynamic as well as steady kinetic models. By employing Diffusion Maps and their extensions, we discover the minimal combinations of parameters required to characterize the output behavior of a chemical system: a set of effective parameters for the model. Furthermore, we introduce and use a Conformal Autoencoder Neural Network technique, as well as a kernel-based Jointly Smooth Function technique, to disentangle the redundant parameter combinations that do not affect the output behavior from the ones that do. We discuss the interpretability of our data-driven effective parameters, and demonstrate the utility of the approach both for behavior prediction and parameter estimation. In the latter task, it becomes important to describe level sets in parameter space that are consistent with a particular output behavior. We validate our approach on a model of multisite phosphorylation, where a reduced set of effective parameters (nonlinear combinations of the physical ones) has previously been established analytically.

Submitted 9 June, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: 47 pages, 23 figures, 4 tables, submitted to PNAS Nexus, revised and expanded in response to reviewers' comments

MSC Class: 37E99 (Primary); 68T07 (Secondary)

arXiv:2110.02296 [pdf, other]

On the Correspondence between Gaussian Processes and Geometric Harmonics

Authors: Felix Dietrich, Juan M. Bello-Rivas, Ioannis G. Kevrekidis

Abstract: We discuss the correspondence between Gaussian process regression and Geometric Harmonics, two similar kernel-based methods that are typically used in different contexts. Research communities surrounding the two concepts often pursue different goals. Results from both camps can be successfully combined, providing alternative interpretations of uncertainty in terms of error estimation, or leading towards accelerated Bayesian Optimization due to dimensionality reduction.

Submitted 5 October, 2021; originally announced October 2021.

Comments: 26 pages, 9 figures

MSC Class: 42-XX; 42-08; 60G15

arXiv:2107.13735 [pdf, other]

doi 10.1063/5.0065093

Learning the temporal evolution of multivariate densities via normalizing flows

Authors: Yubin Lu, Romit Maulik, Ting Gao, Felix Dietrich, Ioannis G. Kevrekidis, Jinqiao Duan

Abstract: In this work, we propose a method to learn multivariate probability distributions using sample path data from stochastic differential equations. Specifically, we consider temporally evolving probability distributions (e.g., those produced by integrating local or nonlocal Fokker-Planck equations). We analyze this evolution through machine learning assisted construction of a time-dependent mapping that takes a reference distribution (say, a Gaussian) to each and every instance of our evolving distribution. If the reference distribution is the initial condition of a Fokker-Planck equation, what we learn is the time-T map of the corresponding solution. Specifically, the learned map is a multivariate normalizing flow that deforms the support of the reference density to the support of each and every density snapshot in time. We demonstrate that this approach can approximate probability density function evolutions in time from observed sampled data for systems driven by both Brownian and Lévy noise. We present examples with two- and three-dimensional, uni- and multimodal distributions to validate the method.

Submitted 3 May, 2022; v1 submitted 29 July, 2021; originally announced July 2021.

arXiv:2105.01303 [pdf, other]

doi 10.1137/21M1418629

Personalized Algorithm Generation: A Case Study in Learning ODE Integrators

Authors: Yue Guo, Felix Dietrich, Tom Bertalan, Danimir T. Doncevic, Manuel Dahmen, Ioannis G. Kevrekidis, Qianxiao Li

Abstract: We study the learning of numerical algorithms for scientific computing, which combines mathematically driven, handcrafted design of general algorithm structure with a data-driven adaptation to specific classes of tasks. This represents a departure from the classical approaches in numerical analysis, which typically do not feature such learning-based adaptations. As a case study, we develop a machine learning approach that automatically learns effective solvers for initial value problems in the form of ordinary differential equations (ODEs), based on the Runge-Kutta (RK) integrator architecture. We show that we can learn high-order integrators for targeted families of differential equations without the need for computing integrator coefficients by hand. Moreover, we demonstrate that in certain cases we can obtain superior performance to classical RK methods. This can be attributed to certain properties of the ODE families being identified and exploited by the approach. Overall, this work demonstrates an effective learning-based approach to the design of algorithms for the numerical solution of differential equations. This can be readily extended to other numerical tasks.

Submitted 9 July, 2022; v1 submitted 4 May, 2021; originally announced May 2021.

MSC Class: 65L06; 68T07; 65L05

arXiv:1907.10807 [pdf, other]

doi 10.1137/19M1277059

On the Koopman operator of algorithms

Authors: Felix Dietrich, Thomas N. Thiem, Ioannis G. Kevrekidis

Abstract: A systematic mathematical framework for the study of numerical algorithms would allow comparisons, facilitate conjugacy arguments, as well as enable the discovery of improved, accelerated, data-driven algorithms. Over the course of the last century, the Koopman operator has provided a mathematical framework for the study of dynamical systems, which facilitates conjugacy arguments and can provide efficient reduced descriptions. More recently, numerical approximations of the operator have enabled the analysis of a large number of deterministic and stochastic dynamical systems in a completely data-driven, essentially equation-free pipeline. Discrete or continuous time numerical algorithms (integrators, nonlinear equation solvers, optimization algorithms) are themselves dynamical systems. In this paper, we use this insight to leverage the Koopman operator framework in the data-driven study of such algorithms and discuss benefits for analysis and acceleration of numerical computation. For algorithms acting on high-dimensional spaces by quickly contracting them towards low-dimensional manifolds, we demonstrate how basis functions adapted to the data help to construct efficient reduced representations of the operator. Our illustrative examples include the gradient descent and Nesterov optimization algorithms, as well as the Newton-Raphson algorithm.

Submitted 19 May, 2020; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: 27 pages, 11 figures

MSC Class: 47B33; 68W40; 37C10

Journal ref: SIAM Journal on Applied Dynamical Systems, 2020, Vol. 19, No. 2: pp. 860-885

arXiv:1812.01173 [pdf, ps, other]

Some manifold learning considerations towards explicit model predictive control

Authors: Robert J. Lovelett, Felix Dietrich, Seungjoon Lee, Ioannis G. Kevrekidis

Abstract: Model predictive control (MPC) is a de facto standard control algorithm across the process industries. There remain, however, applications where MPC is impractical because an optimization problem is solved at each time step. We present a link between explicit MPC formulations and manifold learning to enable facilitated prediction of the MPC policy. Our method uses a similarity measure informed by control policies and system state variables, to "learn" an intrinsic parametrization of the MPC controller using a diffusion maps algorithm, which will also discover a low-dimensional control law when it exists as a smooth, nonlinear combination of the state variables. We use function approximation algorithms to project points from state space to the intrinsic space, and from the intrinsic space to policy space. The approach is illustrated first by "learning" the intrinsic variables for MPC control of constrained linear systems, and then by designing controllers for an unstable nonlinear reactor.

Submitted 8 July, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

arXiv:1712.07144 [pdf, other]

On Matching, and Even Rectifying, Dynamical Systems through Koopman Operator Eigenfunctions

Authors: Erik M. Bollt, Qianxiao Li, Felix Dietrich, Ioannis Kevrekidis

Abstract: Matching dynamical systems, through different forms of conjugacies and equivalences, has long been a fundamental concept, and a powerful tool, in the study and classification of nonlinear dynamic behavior (e.g. through normal forms). In this paper we will argue that the use of the Koopman operator and its spectrum is particularly well suited for this endeavor, both in theory, but also especially in view of recent data-driven algorithm developments. We believe, and document through illustrative examples, that this can nontrivially extend the use and applicability of the Koopman spectral theoretical and computational machinery beyond modeling and prediction, towards what can be considered as a systematic discovery of "Cole-Hopf-type" transformations for dynamics.

Submitted 6 March, 2018; v1 submitted 19 December, 2017; originally announced December 2017.

Comments: 34 pages, 10 figures

arXiv:1712.05145 [pdf, ps, other]

Derivation of higher-order terms in FFT-based numerical homogenization

Authors: Felix Dietrich, Dennis Merkert, Bernd Simeon

Abstract: In this paper, we first introduce the reader to the Basic Scheme of Moulinec and Suquet in the setting of quasi-static linear elasticity, which takes advantage of the fast Fourier transform on homogenized microstructures to accelerate otherwise time-consuming computations. By means of an asymptotic expansion, a hierarchy of linear problems is derived, whose solutions are looked at in detail. It is highlighted how these generalized homogenization problems depend on each other. We extend the Basic Scheme to fit this new problem class and give some numerical results for the first two problem orders.

Submitted 14 December, 2017; originally announced December 2017.

Comments: pre-print for conference proceeding of ENUMATH 2017

MSC Class: 74B05; 65T50; 35B27; 74E30; 41A58

arXiv:1707.00225 [pdf, other]

doi 10.1063/1.4993854

Extended dynamic mode decomposition with dictionary learning: a data-driven adaptive spectral decomposition of the Koopman operator

Authors: Qianxiao Li, Felix Dietrich, Erik M. Bollt, Ioannis G. Kevrekidis

Abstract: Numerical approximation methods for the Koopman operator have advanced considerably in the last few years. In particular, data-driven approaches such as dynamic mode decomposition (DMD) and its generalization, the extended-DMD (EDMD), are becoming increasingly popular in practical applications. The EDMD improves upon the classical DMD by the inclusion of a flexible choice of dictionary of observables that spans a finite dimensional subspace on which the Koopman operator can be approximated. This enhances the accuracy of the solution reconstruction and broadens the applicability of the Koopman formalism. Although the convergence of the EDMD has been established, applying the method in practice requires a careful choice of the observables to improve convergence with just a finite number of terms. This is especially difficult for high dimensional and highly nonlinear systems. In this paper, we employ ideas from machine learning to improve upon the EDMD method. We develop an iterative approximation algorithm which couples the EDMD with a trainable dictionary represented by an artificial neural network. Using the Duffing oscillator and the Kuramoto Sivashinsky PDE as examples, we show that our algorithm can effectively and efficiently adapt the trainable dictionary to the problem at hand to achieve good reconstruction accuracy without the need to choose a fixed dictionary a priori. Furthermore, to obtain a given accuracy we require fewer dictionary terms than EDMD with fixed dictionaries. This alleviates an important shortcoming of the EDMD algorithm and enhances the applicability of the Koopman framework to practical problems.

Submitted 1 July, 2017; originally announced July 2017.
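
To make the role of the dictionary in the abstract above more concrete, here is a minimal sketch of EDMD with a fixed, hand-picked dictionary; it is not the trainable neural-network dictionary that the paper proposes. The toy map `x -> 0.9 * x`, the monomial dictionary `{x, x^2}`, and the function names are assumptions made for this example.

```python
import numpy as np

def dictionary(x):
    """Evaluate the fixed dictionary {x, x^2} at each sample (one row per sample)."""
    return np.stack([x, x**2], axis=1)

def edmd_matrix(x, y):
    """Least-squares Koopman approximation K with dictionary(x) @ K ~ dictionary(y)."""
    psi_x, psi_y = dictionary(x), dictionary(y)
    K, *_ = np.linalg.lstsq(psi_x, psi_y, rcond=None)
    return K

# Hypothetical data: snapshot pairs (x, y) of the linear map y = 0.9 * x.
x = np.linspace(-1.0, 1.0, 50)
y = 0.9 * x

K = edmd_matrix(x, y)
koopman_eigs = np.sort(np.linalg.eigvals(K).real)
print(koopman_eigs)
```

For this toy map both dictionary elements are exact Koopman eigenfunctions, so the fit should return eigenvalues close to 0.81 and 0.9. For nonlinear systems a fixed dictionary generally does not span an invariant subspace, which is the shortcoming the abstract says a trainable dictionary is meant to address.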

arXiv:1506.04793 [pdf, other]

doi 10.1137/15M1043613

Numerical Model Construction with Closed Observables

Authors: Felix Dietrich, Gerta Köster, Hans-Joachim Bungartz

Abstract: Performing analysis, optimization and control using simulations of many-particle systems is computationally demanding when no macroscopic model for the dynamics of the variables of interest is available. In case observations on the macroscopic scale can only be produced via legacy simulator code or live experiments, finding a model for these macroscopic variables is challenging. In this paper, we employ time-lagged embedding theory to construct macroscopic numerical models from output data of a black box, such as a simulator or live experiments. Since the state space variables of the constructed, coarse model are dynamically closed and observable by an observation function, we call these variables closed observables. The approach is an online-offline procedure, as model construction from observation data is performed offline and the new model can then be used in an online phase, independent of the original. We illustrate the theoretical findings with numerical models constructed from time series of a two-dimensional ordinary differential equation system, and from the density evolution of a transport-diffusion system. Applicability is demonstrated in a real-world example, where passengers leave a train and the macroscopic model for the density flow onto the platform is constructed with our approach. If only the macroscopic variables are of interest, simulation runtimes with the numerical model are three orders of magnitude lower compared to simulations with the original fine scale model. We conclude with a brief discussion of possibilities of numerical model construction in systematic upscaling, network optimization and uncertainty quantification.

Submitted 18 October, 2015; v1 submitted 15 June, 2015; originally announced June 2015.

Comments: 24 pages, 19 figures

MSC Class: 70-08

ACM Class: I.6.6; G.1.0
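
The time-lagged embedding mentioned in the abstract above can be sketched as follows. This shows only the basic delay-coordinate construction, not the paper's model-building procedure; the function name `delay_embed` and the parameters `dim` and `lag` are assumptions made for this example.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Stack delayed copies of a scalar time series into delay-coordinate vectors.

    Row i of the result is (s[i], s[i + lag], ..., s[i + (dim - 1) * lag]).
    """
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - (dim - 1) * lag
    if n_rows <= 0:
        raise ValueError("series too short for the requested embedding")
    columns = [series[i * lag : i * lag + n_rows] for i in range(dim)]
    return np.stack(columns, axis=1)

# Hypothetical observation sequence standing in for black-box output.
signal = np.arange(10.0)
embedded = delay_embed(signal, dim=3, lag=2)
print(embedded.shape)
print(embedded[0], embedded[-1])
```

Each row of `embedded` can then serve as a candidate state vector for a surrogate model; suitable values of `dim` and `lag` depend on the system and are not prescribed here.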

Showing 1–20 of 20 results for author: Dietrich, F