-
On Learning what to Learn: heterogeneous observations of dynamics and establishing (possibly causal) relations among them
Authors:
David W. Sroczynski,
Felix Dietrich,
Eleni D. Koronaki,
Ronen Talmon,
Ronald R. Coifman,
Erik Bollt,
Ioannis G. Kevrekidis
Abstract:
Before we attempt to learn a function between two (sets of) observables of a physical process, we must first decide what the inputs and what the outputs of the desired function are going to be. Here we demonstrate two distinct, data-driven ways of initially deciding ``the right quantities'' to relate through such a function, and then proceed to learn it. This is accomplished by processing multiple simultaneous heterogeneous data streams (ensembles of time series) from observations of a physical system: multiple observation processes of the system. We thus determine (a) what subsets of observables are common between the observation processes (and therefore observable from each other, relatable through a function); and (b) what information is unrelated to these common observables, and therefore particular to each observation process, and not contributing to the desired function. Any data-driven function approximation technique can subsequently be used to learn the input-output relation, from k-nearest neighbors and Geometric Harmonics to Gaussian Processes and Neural Networks. Two particular ``twists'' of the approach are discussed. The first has to do with the identifiability of particular quantities of interest from the measurements. We now construct mappings from a single set of observations of one process to entire level sets of measurements of the process, consistent with this single set. The second attempts to relate our framework to a form of causality: if one of the observation processes measures ``now'', while the second observation process measures ``in the future'', the function to be learned among what is common across observation processes constitutes a dynamical model for the system evolution.
Submitted 10 June, 2024;
originally announced June 2024.
-
Solving partial differential equations with sampled neural networks
Authors:
Chinmay Datar,
Taniya Kapoor,
Abhishek Chandra,
Qing Sun,
Iryna Burak,
Erik Lien Bolager,
Anna Veselovska,
Massimo Fornasier,
Felix Dietrich
Abstract:
Approximation of solutions to partial differential equations (PDE) is an important problem in computational science and engineering. Using neural networks as an ansatz for the solution has proven a challenge in terms of training time and approximation accuracy. In this contribution, we discuss how sampling the hidden weights and biases of the ansatz network from data-agnostic and data-dependent probability distributions allows us to progress on both challenges. In most examples, the random sampling schemes outperform iterative, gradient-based optimization of physics-informed neural networks regarding training time and accuracy by several orders of magnitude. For time-dependent PDE, we construct neural basis functions only in the spatial domain and then solve the associated ordinary differential equation with classical methods from scientific computing over a long time horizon. This alleviates one of the greatest challenges for neural PDE solvers because it does not require us to parameterize the solution in time. For second-order elliptic PDE in Barron spaces, we prove the existence of sampled networks with $L^2$ convergence to the solution. We demonstrate our approach on several time-dependent and static PDEs. We also illustrate how sampled networks can effectively solve inverse problems in this setting. Benefits compared to common numerical schemes include spectral convergence and mesh-free construction of basis functions.
Submitted 31 May, 2024;
originally announced May 2024.
-
Multi-fidelity Gaussian process surrogate modeling for regression problems in physics
Authors:
Kislaya Ravi,
Vladyslav Fediukov,
Felix Dietrich,
Tobias Neckel,
Fabian Buse,
Michael Bergmann,
Hans-Joachim Bungartz
Abstract:
One of the main challenges in surrogate modeling is the limited availability of data due to resource constraints associated with computationally expensive simulations. Multi-fidelity methods provide a solution by chaining models in a hierarchy with increasing fidelity, associated with lower error, but increasing cost. In this paper, we compare different multi-fidelity methods employed in constructing Gaussian process surrogates for regression. Non-linear autoregressive methods in the existing literature are primarily confined to two-fidelity models, and we extend these methods to handle more than two levels of fidelity. Additionally, we propose enhancements for an existing method incorporating delay terms by introducing a structured kernel. We demonstrate the performance of these methods across various academic and real-world scenarios. Our findings reveal that multi-fidelity methods generally have a smaller prediction error for the same computational cost as compared to the single-fidelity method, although their effectiveness varies across different scenarios.
Submitted 18 April, 2024;
originally announced April 2024.
-
Systematic construction of continuous-time neural networks for linear dynamical systems
Authors:
Chinmay Datar,
Adwait Datar,
Felix Dietrich,
Wil Schilders
Abstract:
Discovering a suitable neural network architecture for modeling complex dynamical systems poses a formidable challenge, often involving extensive trial and error and navigation through a high-dimensional hyper-parameter space. In this paper, we discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems, namely, Linear Time-Invariant (LTI) systems. We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as a solution of a first-order or second-order Ordinary Differential Equation (ODE). Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute sparse architecture and network parameters directly from the given LTI system, leveraging its properties. We bring forth a novel neural architecture paradigm featuring horizontal hidden layers and provide insights into why employing conventional neural architectures with vertical hidden layers may not be favorable. We also provide an upper bound on the numerical errors of our neural networks. Finally, we demonstrate the high accuracy of our constructed networks on three numerical examples.
Submitted 24 March, 2024;
originally announced March 2024.
-
Gappy local conformal auto-encoders for heterogeneous data fusion: in praise of rigidity
Authors:
Erez Peterfreund,
Iryna Burak,
Ofir Lindenbaum,
Jim Gimlett,
Felix Dietrich,
Ronald R. Coifman,
Ioannis G. Kevrekidis
Abstract:
Fusing measurements from multiple, heterogeneous, partial sources, observing a common object or process, poses challenges due to the increasing availability of numbers and types of sensors. In this work we propose, implement and validate an end-to-end computational pipeline in the form of a multiple-auto-encoder neural network architecture for this task. The inputs to the pipeline are several sets of partial observations, and the result is a globally consistent latent space, harmonizing (rigidifying, fusing) all measurements. The key enabler is the availability of multiple slightly perturbed measurements of each instance: local measurement "bursts" that allow us to estimate the local distortion induced by each instrument. We demonstrate the approach in a sequence of examples, starting with simple two-dimensional data sets and proceeding to a Wi-Fi localization problem and to the solution of a "dynamical puzzle" arising in spatio-temporal observations of the solutions of Partial Differential Equations.
Submitted 20 December, 2023;
originally announced December 2023.
-
Sampling weights of deep neural networks
Authors:
Erik Lien Bolager,
Iryna Burak,
Chinmay Datar,
Qing Sun,
Felix Dietrich
Abstract:
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks. In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed to obtain a trained network. The sampling is based on the idea of random feature models. However, instead of a data-agnostic distribution, e.g., a normal distribution, we use both the input and the output training data to sample shallow and deep networks. We prove that sampled networks are universal approximators. For Barron functions, we show that the $L^2$-approximation error of sampled shallow networks decreases with the square root of the number of neurons. Our sampling scheme is invariant to rigid body transformations and scaling of the input data, which implies many popular pre-processing techniques are not required. In numerical experiments, we demonstrate that sampled networks achieve accuracy comparable to iteratively trained ones, but can be constructed orders of magnitude faster. Our test cases involve a classification benchmark from OpenML, sampling of neural operators to represent maps in function spaces, and transfer learning using well-known architectures.
Submitted 12 November, 2023; v1 submitted 29 June, 2023;
originally announced June 2023.
-
Transporting Densities Across Dimensions
Authors:
Michael Plainer,
Felix Dietrich,
Ioannis G. Kevrekidis
Abstract:
Even the best scientific equipment can only partially observe reality. Recorded data is often lower-dimensional, e.g., two-dimensional pictures of the three-dimensional world. Combining data from multiple experiments then results in a marginal density. This work shows how to transport such lower-dimensional marginal densities into a more informative, higher-dimensional joint space by leveraging time-delayed measurements from an observation process. This can augment the information from scientific equipment to construct a more coherent view. Classical transportation algorithms can be used when the source and target dimensions match. Our approach allows the transport of samples between spaces of different dimensions by exploiting information from the sample collection process. We reconstruct the surface of an implant from partial recordings of bacteria moving on it and construct a joint space for satellites orbiting the Earth by combining one-dimensional, time-delayed altitude measurements.
Submitted 25 May, 2023;
originally announced May 2023.
-
Data-driven modelling of brain activity using neural networks, Diffusion Maps, and the Koopman operator
Authors:
Ioannis K. Gallos,
Daniel Lehmberg,
Felix Dietrich,
Constantinos Siettos
Abstract:
We propose a machine-learning approach to model long-term out-of-sample dynamics of brain activity from task-dependent fMRI data. Our approach is a three-stage one. First, we exploit Diffusion Maps (DMs) to discover a set of variables that parametrize the low-dimensional manifold on which the emergent high-dimensional fMRI time series evolve. Then, we construct reduced-order models (ROMs) on the embedded manifold via two techniques: Feedforward Neural Networks (FNNs) and the Koopman operator. Finally, for predicting the out-of-sample long-term dynamics of brain activity in the ambient fMRI space, we solve the pre-image problem coupling DMs with Geometric Harmonics (GH) when using FNNs and the Koopman modes per se. For our illustrations, we have assessed the performance of the two proposed schemes using a benchmark fMRI dataset with recordings during a visuo-motor task. The results suggest that just a few (for the particular task, five) non-linear coordinates of the high-dimensional fMRI time series provide a good basis for modelling and out-of-sample prediction of the brain activity. Furthermore, we show that the proposed approaches outperform the one-step ahead predictions of the naive random walk model, which, in contrast to our scheme, relies on the knowledge of the signals in the previous time step. Importantly, we show that the proposed Koopman operator approach provides, for all practical purposes, equivalent results to the FNN-GH approach, thus bypassing the need to train a non-linear map and to use GH to extrapolate predictions in the ambient fMRI space; one can use instead the low-frequency truncation of the DMs function space of $L^2$-integrable functions, to predict the entire list of coordinate functions in the fMRI space and to solve the pre-image problem.
Submitted 24 April, 2023;
originally announced April 2023.
-
On the Use of Neural Networks for Full Waveform Inversion
Authors:
Leon Herrmann,
Tim Bürchner,
Felix Dietrich,
Stefan Kollmannsberger
Abstract:
Neural networks have recently gained attention in solving inverse problems. One prominent methodology are Physics-Informed Neural Networks (PINNs) which can solve both forward and inverse problems. In the paper at hand, full waveform inversion is the considered inverse problem. The performance of PINNs is compared against classical adjoint optimization, focusing on three key aspects: the forward-solver, the neural network Ansatz for the inverse field, and the sensitivity computation for the gradient-based minimization. Starting from PINNs, each of these key aspects is adapted individually until the classical adjoint optimization emerges. It is shown that it is beneficial to use the neural network only for the discretization of the unknown material field, where the neural network produces reconstructions without oscillatory artifacts as typically encountered in classical full waveform inversion approaches. Due to this finding, a hybrid approach is proposed. It exploits both the efficient gradient computation with the continuous adjoint method as well as the neural network Ansatz for the unknown material field. This new hybrid approach outperforms Physics-Informed Neural Networks and the classical adjoint optimization in settings of two and three-dimensional examples.
Submitted 30 January, 2023;
originally announced March 2023.
-
A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms
Authors:
Danimir T. Doncevic,
Alexander Mitsos,
Yue Guo,
Qianxiao Li,
Felix Dietrich,
Manuel Dahmen,
Ioannis G. Kevrekidis
Abstract:
Meta-learning of numerical algorithms for a given task consists of the data-driven identification and adaptation of an algorithmic structure and the associated hyperparameters. To limit the complexity of the meta-learning problem, neural architectures with a certain inductive bias towards favorable algorithmic structures can, and should, be used. We generalize our previously introduced Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms. In contrast to off-the-shelf deep learning approaches, it features a distinct division into modules for generation of information and for the subsequent assembly of this information towards a solution. Local information in the form of a subspace is generated by subordinate, inner, iterations of recurrent function evaluations starting at the current outer iterate. The update to the next outer iterate is computed as a linear combination of these evaluations, reducing the residual in this space, and constitutes the output of the network. We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta integrators for ordinary differential equations. Due to its modularity, the superstructure can be readily extended with functionalities needed to represent more general classes of iterative algorithms traditionally based on Taylor series expansions.
Submitted 6 July, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Safe Policy Improvement Approaches and their Limitations
Authors:
Philipp Scholl,
Felix Dietrich,
Clemens Otte,
Steffen Udluft
Abstract:
Safe Policy Improvement (SPI) is an important technique for offline reinforcement learning in safety critical applications as it improves the behavior policy with a high probability. We classify various SPI approaches from the literature into two groups, based on how they utilize the uncertainty of state-action pairs. Focusing on the Soft-SPIBB (Safe Policy Improvement with Soft Baseline Bootstrapping) algorithms, we show that their claim of being provably safe does not hold. Based on this finding, we develop adaptations, the Adv-Soft-SPIBB algorithms, and show that they are provably safe. A heuristic adaptation, Lower-Approx-Soft-SPIBB, yields the best performance among all SPIBB algorithms in extensive experiments on two benchmarks. We also check the safety guarantees of the provably safe algorithms and show that huge amounts of data are necessary such that the safety bounds become useful in practice.
Submitted 1 August, 2022;
originally announced August 2022.
-
Learning Effective SDEs from Brownian Dynamics Simulations of Colloidal Particles
Authors:
Nikolaos Evangelou,
Felix Dietrich,
Juan M. Bello-Rivas,
Alex Yeh,
Rachel Stein,
Michael A. Bevan,
Ioannis G. Kevrekidis
Abstract:
We construct a reduced, data-driven, parameter dependent effective Stochastic Differential Equation (eSDE) for electric-field mediated colloidal crystallization using data obtained from Brownian Dynamics Simulations. We use Diffusion Maps (a manifold learning algorithm) to identify a set of useful latent observables. In this latent space we identify an eSDE using a deep learning architecture inspired by numerical stochastic integrators and compare it with the traditional Kramers-Moyal expansion estimation. We show that the obtained variables and the learned dynamics accurately encode the physics of the Brownian Dynamic Simulations. We further illustrate that our reduced model captures the dynamics of corresponding experimental data. Our dimension reduction/reduced model identification approach can be easily ported to a broad class of particle systems dynamics experiments/models.
Submitted 30 January, 2023; v1 submitted 30 April, 2022;
originally announced May 2022.
-
Double Diffusion Maps and their Latent Harmonics for Scientific Computations in Latent Space
Authors:
Nikolaos Evangelou,
Felix Dietrich,
Eliodoro Chiavazzo,
Daniel Lehmberg,
Marina Meila,
Ioannis G. Kevrekidis
Abstract:
We introduce a data-driven approach to building reduced dynamical models through manifold learning; the reduced latent space is discovered using Diffusion Maps (a manifold learning technique) on time series data. A second round of Diffusion Maps on those latent coordinates allows the approximation of the reduced dynamical models. This second round enables mapping the latent space coordinates back to the full ambient space (what is called lifting); it also enables the approximation of full state functions of interest in terms of the reduced coordinates. In our work, we develop and test three different reduced numerical simulation methodologies, either through pre-tabulation in the latent space and integration on the fly or by going back and forth between the ambient space and the latent space. The data-driven latent space simulation results, based on the three different approaches, are validated through (a) the latent space observation of the full simulation through the Nyström Extension formula, or through (b) lifting the reduced trajectory back to the full ambient space, via Latent Harmonics. Latent space modeling often involves additional regularization to favor certain properties of the space over others, and the mapping back to the ambient space is then constructed mostly independently from these properties; here, we use the same data-driven approach to construct the latent space and then map back to the ambient space.
Submitted 26 April, 2022;
originally announced April 2022.
-
Safe Policy Improvement Approaches on Discrete Markov Decision Processes
Authors:
Philipp Scholl,
Felix Dietrich,
Clemens Otte,
Steffen Udluft
Abstract:
Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy. Building on SPI with Soft Baseline Bootstrapping (Soft-SPIBB) by Nadjahi et al., we identify theoretical issues in their approach, provide a corrected theory, and derive a new algorithm that is provably safe on finite Markov Decision Processes (MDP). Additionally, we provide a heuristic algorithm that exhibits the best performance among many state of the art SPI algorithms on two different benchmarks. Furthermore, we introduce a taxonomy of SPI algorithms and empirically show an interesting property of two classes of SPI algorithms: while the mean performance of algorithms that incorporate the uncertainty as a penalty on the action-value is higher, actively restricting the set of policies more consistently produces good policies and is, thus, safer.
Submitted 28 January, 2022;
originally announced January 2022.
-
Quantum Process Tomography of Unitary Maps from Time-Delayed Measurements
Authors:
Irene López Gutiérrez,
Felix Dietrich,
Christian B. Mendl
Abstract:
Quantum process tomography conventionally uses a multitude of initial quantum states and then performs state tomography on the process output. Here we propose and study an alternative approach which requires only a single (or few) known initial states together with time-delayed measurements for reconstructing the unitary map and corresponding Hamiltonian of the time dynamics. The overarching mathematical framework and feasibility guarantee of our method is provided by the Takens embedding theorem. We explain in detail how the reconstruction of a single qubit Hamiltonian works in this setting, and provide numerical methods and experiments for general few-qubit and lattice systems with local interactions. In particular, the method allows to find the Hamiltonian of a two qubit system by observing only one of the qubits.
Submitted 22 February, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
-
On the Parameter Combinations That Matter and on Those That do Not
Authors:
Nikolaos Evangelou,
Noah J. Wichrowski,
George A. Kevrekidis,
Felix Dietrich,
Mahdi Kooshkbaghi,
Sarah McFann,
Ioannis G. Kevrekidis
Abstract:
We present a data-driven approach to characterizing nonidentifiability of a model's parameters and illustrate it through dynamic as well as steady kinetic models. By employing Diffusion Maps and their extensions, we discover the minimal combinations of parameters required to characterize the output behavior of a chemical system: a set of effective parameters for the model. Furthermore, we introduce and use a Conformal Autoencoder Neural Network technique, as well as a kernel-based Jointly Smooth Function technique, to disentangle the redundant parameter combinations that do not affect the output behavior from the ones that do. We discuss the interpretability of our data-driven effective parameters, and demonstrate the utility of the approach both for behavior prediction and parameter estimation. In the latter task, it becomes important to describe level sets in parameter space that are consistent with a particular output behavior. We validate our approach on a model of multisite phosphorylation, where a reduced set of effective parameters (nonlinear combinations of the physical ones) has previously been established analytically.
Submitted 9 June, 2022; v1 submitted 13 October, 2021;
originally announced October 2021.
-
On the Correspondence between Gaussian Processes and Geometric Harmonics
Authors:
Felix Dietrich,
Juan M. Bello-Rivas,
Ioannis G. Kevrekidis
Abstract:
We discuss the correspondence between Gaussian process regression and Geometric Harmonics, two similar kernel-based methods that are typically used in different contexts. Research communities surrounding the two concepts often pursue different goals. Results from both camps can be successfully combined, providing alternative interpretations of uncertainty in terms of error estimation, or leading towards accelerated Bayesian Optimization due to dimensionality reduction.
Submitted 5 October, 2021;
originally announced October 2021.
-
Training Algorithm Matters for the Performance of Neural Network Potential: A Case Study of Adam and the Kalman Filter Optimizers
Authors:
Yunqi Shao,
Florian M. Dietrich,
Carl Nettelblad,
Chao Zhang
Abstract:
One hidden yet important issue for developing neural network potentials (NNPs) is the choice of training algorithm. Here we compare the performance of two popular training algorithms, the adaptive moment estimation algorithm (Adam) and the Extended Kalman Filter algorithm (EKF), using the Behler-Parrinello neural network (BPNN) and two publicly accessible datasets of liquid water [Proc. Natl. Acad. Sci. U.S.A. 2016, 113, 8368-8373 and Proc. Natl. Acad. Sci. U.S.A. 2019, 116, 1110-1115]. This is achieved by implementing EKF in TensorFlow. It is found that NNPs trained with EKF are more transferable and less sensitive to the value of the learning rate, as compared to Adam. In both cases, error metrics of the validation set do not always serve as a good indicator for the actual performance of NNPs. Instead, we show that their performance correlates well with a Fisher information based similarity measure.
Submitted 9 November, 2021; v1 submitted 8 September, 2021;
originally announced September 2021.
-
Learning the temporal evolution of multivariate densities via normalizing flows
Authors:
Yubin Lu,
Romit Maulik,
Ting Gao,
Felix Dietrich,
Ioannis G. Kevrekidis,
Jinqiao Duan
Abstract:
In this work, we propose a method to learn multivariate probability distributions using sample path data from stochastic differential equations. Specifically, we consider temporally evolving probability distributions (e.g., those produced by integrating local or nonlocal Fokker-Planck equations). We analyze this evolution through machine learning assisted construction of a time-dependent mapping that takes a reference distribution (say, a Gaussian) to each and every instance of our evolving distribution. If the reference distribution is the initial condition of a Fokker-Planck equation, what we learn is the time-T map of the corresponding solution. Specifically, the learned map is a multivariate normalizing flow that deforms the support of the reference density to the support of each and every density snapshot in time. We demonstrate that this approach can approximate probability density function evolutions in time from observed sampled data for systems driven by both Brownian and Lévy noise. We present examples with two- and three-dimensional, uni- and multimodal distributions to validate the method.
Submitted 3 May, 2022; v1 submitted 29 July, 2021;
originally announced July 2021.
-
Learning effective stochastic differential equations from microscopic simulations: linking stochastic numerics to deep learning
Authors:
Felix Dietrich,
Alexei Makeev,
George Kevrekidis,
Nikolaos Evangelou,
Tom Bertalan,
Sebastian Reich,
Ioannis G. Kevrekidis
Abstract:
We identify effective stochastic differential equations (SDE) for coarse observables of fine-grained particle- or agent-based simulations; these SDE then provide useful coarse surrogate models of the fine scale dynamics. We approximate the drift and diffusivity functions in these effective SDE through neural networks, which can be thought of as effective stochastic ResNets. The loss function is inspired by, and embodies, the structure of established stochastic numerical integrators (here, Euler-Maruyama and Milstein); our approximations can thus benefit from backward error analysis of these underlying numerical schemes. They also lend themselves naturally to "physics-informed" gray-box identification when approximate coarse models, such as mean field equations, are available. Existing numerical integration schemes for Langevin-type equations and for stochastic partial differential equations (SPDE) can also be used for training; we demonstrate this on a stochastically forced oscillator and the stochastic wave equation. Our approach does not require long trajectories, works on scattered snapshot data, and is designed to naturally handle different time steps per snapshot. We consider both the case where the coarse collective observables are known in advance, as well as the case where they must be found in a data-driven manner.
Submitted 24 July, 2022; v1 submitted 10 June, 2021;
originally announced June 2021.
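The idea of a loss function shaped by a stochastic integrator can be illustrated with a minimal sketch, assuming a scalar SDE and an Euler-Maruyama step: each snapshot pair is scored by the Gaussian transition density that one integrator step implies. The function name `euler_maruyama_nll`, the Ornstein-Uhlenbeck test data, and the use of fixed callables in place of neural networks are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def euler_maruyama_nll(x0, x1, h, drift, diffusivity):
    """Mean negative log-likelihood of snapshot pairs (x0 -> x1 after time h).

    One Euler-Maruyama step implies x1 ~ N(x0 + h * drift(x0),
    h * diffusivity(x0)**2); h may differ from snapshot to snapshot.
    """
    mean = x0 + h * drift(x0)
    var = h * diffusivity(x0) ** 2
    return np.mean(0.5 * np.log(2.0 * np.pi * var) + (x1 - mean) ** 2 / (2.0 * var))

# Synthetic scattered snapshots of dx = -x dt + 0.5 dW (hypothetical test case).
rng = np.random.default_rng(0)
n = 20000
x0 = rng.uniform(-2.0, 2.0, n)
h = rng.uniform(0.005, 0.02, n)            # a different time step per snapshot
x1 = x0 - h * x0 + 0.5 * np.sqrt(h) * rng.standard_normal(n)

def const_sigma(x):
    return 0.5 * np.ones_like(x)

loss_true = euler_maruyama_nll(x0, x1, h, lambda x: -x, const_sigma)
loss_wrong_drift = euler_maruyama_nll(x0, x1, h, lambda x: x, const_sigma)
loss_wrong_sigma = euler_maruyama_nll(x0, x1, h, lambda x: -x, np.ones_like)
```

In a learning setting, `drift` and `diffusivity` would be trainable models and this quantity would be minimized over their parameters; here the candidates are only compared, and the generating pair should score lowest.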
-
Personalized Algorithm Generation: A Case Study in Learning ODE Integrators
Authors:
Yue Guo,
Felix Dietrich,
Tom Bertalan,
Danimir T. Doncevic,
Manuel Dahmen,
Ioannis G. Kevrekidis,
Qianxiao Li
Abstract:
We study the learning of numerical algorithms for scientific computing, which combines mathematically driven, handcrafted design of general algorithm structure with a data-driven adaptation to specific classes of tasks. This represents a departure from the classical approaches in numerical analysis, which typically do not feature such learning-based adaptations. As a case study, we develop a machine learning approach that automatically learns effective solvers for initial value problems in the form of ordinary differential equations (ODEs), based on the Runge-Kutta (RK) integrator architecture. We show that we can learn high-order integrators for targeted families of differential equations without the need for computing integrator coefficients by hand. Moreover, we demonstrate that in certain cases we can obtain superior performance to classical RK methods. This can be attributed to certain properties of the ODE families being identified and exploited by the approach. Overall, this work demonstrates an effective learning-based approach to the design of algorithms for the numerical solution of differential equations. This can be readily extended to other numerical tasks.
Submitted 9 July, 2022; v1 submitted 4 May, 2021;
originally announced May 2021.
-
Learning emergent PDEs in a learned emergent space
Authors:
Felix P. Kemeth,
Tom Bertalan,
Thomas Thiem,
Felix Dietrich,
Sung Joon Moon,
Carlo R. Laing,
Ioannis G. Kevrekidis
Abstract:
We extract data-driven, intrinsic spatial coordinates from observations of the dynamics of large systems of coupled heterogeneous agents. These coordinates then serve as an emergent space in which to learn predictive models in the form of partial differential equations (PDEs) for the collective description of the coupled-agent system. They play the role of the independent spatial variables in this PDE (as opposed to the dependent, possibly also data-driven, state variables). This leads to an alternative description of the dynamics, local in these emergent coordinates, thus facilitating an alternative modeling path for complex coupled-agent systems. We illustrate this approach on a system where each agent is a limit cycle oscillator (a so-called Stuart-Landau oscillator); the agents are heterogeneous (they each have a different intrinsic frequency $ω$) and are coupled through the ensemble average of their respective variables. After fast initial transients, we show that the collective dynamics on a slow manifold can be approximated through a learned model based on local "spatial" partial derivatives in the emergent coordinates. The model is then used for prediction in time, as well as to capture collective bifurcations when system parameters vary. The proposed approach thus integrates the automatic, data-driven extraction of emergent space coordinates parametrizing the agent dynamics, with machine-learning assisted identification of an "emergent PDE" description of the dynamics in this parametrization.
Submitted 23 December, 2020;
originally announced December 2020.
-
Transformations between deep neural networks
Authors:
Tom Bertalan,
Felix Dietrich,
Ioannis G. Kevrekidis
Abstract:
We propose to test, and when possible establish, an equivalence between two different artificial neural networks by attempting to construct a data-driven transformation between them, using manifold-learning techniques. In particular, we employ diffusion maps with a Mahalanobis-like metric. If the construction succeeds, the two networks can be thought of as belonging to the same equivalence class.
We first discuss transformation functions between only the outputs of the two networks; we then also consider transformations that take into account outputs (activations) of a number of internal neurons from each network. In general, Whitney's theorem dictates the number of measurements from one of the networks required to reconstruct each and every feature of the second network. The construction of the transformation function relies on a consistent, intrinsic representation of the network input space.
We illustrate our algorithm by matching neural network pairs trained to learn (a) observations of scalar functions; (b) observations of two-dimensional vector fields; and (c) representations of images of a moving three-dimensional object (a rotating horse). The construction of such equivalence classes across different network instantiations clearly relates to transfer learning. We also expect that it will be valuable in establishing equivalence between different Machine Learning-based models of the same phenomenon observed through different instruments and by different research groups.
Submitted 14 January, 2021; v1 submitted 10 July, 2020;
originally announced July 2020.
-
LOCA: LOcal Conformal Autoencoder for standardized data coordinates
Authors:
Erez Peterfreund,
Ofir Lindenbaum,
Felix Dietrich,
Tom Bertalan,
Matan Gavish,
Ioannis G. Kevrekidis,
Ronald R. Coifman
Abstract:
We propose a deep-learning based method for obtaining standardized data coordinates from scientific measurements. Data observations are modeled as samples from an unknown, non-linear deformation of an underlying Riemannian manifold, which is parametrized by a few normalized latent variables. By leveraging a repeated measurement sampling strategy, we present a method for learning an embedding in $\mathbb{R}^d$ that is isometric to the latent variables of the manifold. These data coordinates, being invariant under smooth changes of variables, enable matching between different instrumental observations of the same phenomenon. Our embedding is obtained using a LOcal Conformal Autoencoder (LOCA), an algorithm that constructs an embedding to rectify deformations by using a local z-scoring procedure while preserving relevant geometric information. We demonstrate the isometric embedding properties of LOCA on various model settings and observe that it exhibits promising interpolation and extrapolation capabilities. Finally, we apply LOCA to single-site Wi-Fi localization data, and to $3$-dimensional curved surface estimation based on a $2$-dimensional projection.
Submitted 14 January, 2021; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Spectral Discovery of Jointly Smooth Features for Multimodal Data
Authors:
Felix Dietrich,
Or Yair,
Rotem Mulayoff,
Ronen Talmon,
Ioannis G. Kevrekidis
Abstract:
In this paper, we propose a spectral method for deriving functions that are jointly smooth on multiple observed manifolds. This allows us to register measurements of the same phenomenon by heterogeneous sensors, and to reject sensor-specific noise. Our method is unsupervised and primarily consists of two steps. First, using kernels, we obtain a subspace spanning smooth functions on each separate manifold. Then, we apply a spectral method to the obtained subspaces and discover functions that are jointly smooth on all manifolds. We show analytically that our method is guaranteed to provide a set of orthogonal functions that are as jointly smooth as possible, ordered by increasing Dirichlet energy from the smoothest to the least smooth. In addition, we show that the extracted functions can be efficiently extended to unseen data using the Nyström method. We demonstrate the proposed method on both simulated and real measured data and compare the results to nonlinear variants of the seminal Canonical Correlation Analysis (CCA). Particularly, we show superior results for sleep stage identification. In addition, we show how the proposed method can be leveraged for finding minimal realizations of parameter spaces of nonlinear dynamical systems.
Submitted 29 April, 2021; v1 submitted 9 April, 2020;
originally announced April 2020.
-
Track Seed Classification with Deep Neural Networks
Authors:
Felix Dietrich
Abstract:
Future upgrades to the LHC will pose considerable challenges for traditional particle track reconstruction methods. We investigate how artificial Neural Networks and Deep Learning could be used to complement existing algorithms to increase performance. Generating seeds of detector hits is an important phase during the beginning of track reconstruction and improving the current heuristics of seed generation seems like a feasible task. We find that given sufficient training data, a comparatively compact, standard feed-forward neural network can be trained to classify seeds with great accuracy and at high speeds. Thanks to immense parallelization benefits, it might even be worthwhile to completely replace the seed generation process with the Neural Network instead of just improving the seed quality of existing generators.
Submitted 15 October, 2019;
originally announced October 2019.
-
On Learning Hamiltonian Systems from Data
Authors:
Tom Bertalan,
Felix Dietrich,
Igor Mezić,
Ioannis G. Kevrekidis
Abstract:
Concise, accurate descriptions of physical systems through their conserved quantities abound in the natural sciences. In data science, however, current research often focuses on regression problems, without routinely incorporating additional assumptions about the system that generated the data. Here, we propose to explore a particular type of underlying structure in the data: Hamiltonian systems, where an "energy" is conserved. Given a collection of observations of such a Hamiltonian system over time, we extract phase space coordinates and a Hamiltonian function of them that acts as the generator of the system dynamics. The approach employs an autoencoder neural network component to estimate the transformation from observations to the phase space of a Hamiltonian system. An additional neural network component is used to approximate the Hamiltonian function on this constructed space, and the two components are trained jointly. As an alternative approach, we also demonstrate the use of Gaussian processes for the estimation of such a Hamiltonian. After two illustrative examples, we extract an underlying phase space as well as the generating Hamiltonian from a collection of movies of a pendulum. The approach is fully data-driven, and does not assume a particular form of the Hamiltonian function.
Submitted 4 February, 2020; v1 submitted 29 July, 2019;
originally announced July 2019.
-
On the Koopman operator of algorithms
Authors:
Felix Dietrich,
Thomas N. Thiem,
Ioannis G. Kevrekidis
Abstract:
A systematic mathematical framework for the study of numerical algorithms would allow comparisons, facilitate conjugacy arguments, as well as enable the discovery of improved, accelerated, data-driven algorithms. Over the course of the last century, the Koopman operator has provided a mathematical framework for the study of dynamical systems, which facilitates conjugacy arguments and can provide efficient reduced descriptions. More recently, numerical approximations of the operator have enabled the analysis of a large number of deterministic and stochastic dynamical systems in a completely data-driven, essentially equation-free pipeline. Discrete or continuous time numerical algorithms (integrators, nonlinear equation solvers, optimization algorithms) are themselves dynamical systems. In this paper, we use this insight to leverage the Koopman operator framework in the data-driven study of such algorithms and discuss benefits for analysis and acceleration of numerical computation. For algorithms acting on high-dimensional spaces by quickly contracting them towards low-dimensional manifolds, we demonstrate how basis functions adapted to the data help to construct efficient reduced representations of the operator. Our illustrative examples include the gradient descent and Nesterov optimization algorithms, as well as the Newton-Raphson algorithm.
Submitted 19 May, 2020; v1 submitted 24 July, 2019;
originally announced July 2019.
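The observation that an algorithm is itself a dynamical system can be made concrete with a minimal sketch, assuming gradient descent on a quadratic objective and linear observables only. The matrix `A`, the step size `eta`, and the plain least-squares (DMD-style) fit are illustrative assumptions; the data-adapted basis functions discussed in the abstract are not reproduced here.

```python
import numpy as np

# Gradient descent on f(x) = 0.5 * x^T A x is the linear map
# x_{k+1} = (I - eta * A) x_k, so its iterates form a dynamical system.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
eta = 0.1

def gradient_descent_step(x):
    return x - eta * (A @ x)

# Collect snapshot pairs (x_k, x_{k+1}) from several short optimizer runs.
rng = np.random.default_rng(1)
snapshots, images = [], []
for _ in range(5):
    x = rng.standard_normal(2)
    for _ in range(10):
        x_next = gradient_descent_step(x)
        snapshots.append(x)
        images.append(x_next)
        x = x_next
X = np.array(snapshots).T   # shape (2, 50)
Y = np.array(images).T      # shape (2, 50)

# Least-squares operator K with Y ≈ K X, restricted to linear observables.
K = Y @ np.linalg.pinv(X)
dmd_eigs = np.sort(np.linalg.eigvals(K).real)

# For this quadratic, the eigenvalues should equal 1 - eta * lambda_i(A).
expected = np.sort(1.0 - eta * np.linalg.eigvalsh(A))
```

The recovered eigenvalues are the per-iteration contraction factors of the optimizer along the eigendirections of `A`, which is the kind of convergence information a Koopman-based analysis exposes from iterate data alone.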
-
A geometric approach to the transport of discontinuous densities
Authors:
Caroline Moosmüller,
Felix Dietrich,
Ioannis G. Kevrekidis
Abstract:
Different observations of a relation between inputs ("sources") and outputs ("targets") are often reported in terms of histograms (discretizations of the source and the target densities). Transporting these densities to each other provides insight regarding the underlying relation. In (forward) uncertainty quantification, one typically studies how the distribution of inputs to a system affects the distribution of the system responses. Here, we focus on the identification of the system (the transport map) itself, once the input and output distributions are determined, and suggest a modification of current practice by including data from what we call "an observation process". We hypothesize that there exists a smooth manifold underlying the relation; the sources and the targets are then partial observations (possibly projections) of this manifold. Knowledge of such a manifold implies knowledge of the relation, and thus of "the right" transport between source and target observations. When the source-target observations are not bijective (when the manifold is not the graph of a function over both observation spaces, either because folds over them give rise to density singularities, or because it marginalizes over several observables), recovery of the manifold is obscured. Using ideas from attractor reconstruction in dynamical systems, we demonstrate how additional information in the form of short histories of an observation process can help us recover the underlying manifold. The types of additional information employed and the relation to optimal transport based solely on density observations is illustrated and discussed, along with limitations in the recovery of the true underlying relation.
Submitted 24 July, 2019; v1 submitted 18 July, 2019;
originally announced July 2019.
-
Domain Adaptation with Optimal Transport on the Manifold of SPD matrices
Authors:
Or Yair,
Felix Dietrich,
Ronen Talmon,
Ioannis G. Kevrekidis
Abstract:
In this paper, we address the problem of Domain Adaptation (DA) using Optimal Transport (OT) on Riemannian manifolds. We model the difference between two domains by a diffeomorphism and use the polar factorization theorem to claim that OT is indeed optimal for DA in a well-defined sense, up to a volume preserving map. We then focus on the manifold of Symmetric and Positive-Definite (SPD) matrices, whose structure provided a useful context in recent applications. We demonstrate the polar factorization theorem on this manifold. Due to the uniqueness of the weighted Riemannian mean, and by exploiting existing regularized OT algorithms, we formulate a simple algorithm that maps the source domain to the target domain. We test our algorithm on two Brain-Computer Interface (BCI) data sets and observe state of the art performance.
Submitted 27 July, 2020; v1 submitted 3 June, 2019;
originally announced June 2019.
-
Status of the undulator-based ILC positron source
Authors:
Felix Dietrich,
Gudrid Moortgat-Pick,
Sabine Riemann,
Peter Sievers,
Andriy Ushakov
Abstract:
The design of the positron source for the International Linear Collider (ILC) is still under consideration. The baseline design plans to use the electron beam for the positron production before it goes to the IP. The high-energy electrons pass a long helical undulator and generate an intense circularly polarized photon beam which hits a thin conversion target to produce $e^+e^-$ pairs. The resulting positron beam is longitudinally polarized which provides an important benefit for precision physics analyses. In this paper the status of the design studies is presented with focus on ILC250. In particular, the target design and cooling as well as issues of the optical matching device are important for the positron yield. Some possibilities to optimize the system are discussed.
Submitted 20 February, 2019;
originally announced February 2019.
-
Linking Gaussian Process regression with data-driven manifold embeddings for nonlinear data fusion
Authors:
Seungjoon Lee,
Felix Dietrich,
George E. Karniadakis,
Ioannis G. Kevrekidis
Abstract:
In statistical modeling with Gaussian Process regression, it has been shown that combining (few) high-fidelity data with (many) low-fidelity data can enhance prediction accuracy, compared to prediction based on the few high-fidelity data only. Such information fusion techniques for multifidelity data commonly approach the high-fidelity model $f_h(t)$ as a function of two variables $(t,y)$, and then use $f_l(t)$ as the $y$ data. More generally, the high-fidelity model can be written as a function of several variables $(t,y_1,y_2,\ldots)$; the low-fidelity model $f_l$ and, say, some of its derivatives, can then be substituted for these variables. In this paper, we will explore mathematical algorithms for multifidelity information fusion that use such an approach towards improving the representation of the high-fidelity function with only a few training data points. Given that $f_h$ may not be a simple function -- and sometimes not even a function -- of $f_l$, we demonstrate that using additional functions of $t$, such as derivatives or shifts of $f_l$, can drastically improve the approximation of $f_h$ through Gaussian Processes. We also point out a connection with "embedology" techniques from topology and dynamical systems.
Submitted 16 December, 2018;
originally announced December 2018.
-
Some manifold learning considerations towards explicit model predictive control
Authors:
Robert J. Lovelett,
Felix Dietrich,
Seungjoon Lee,
Ioannis G. Kevrekidis
Abstract:
Model predictive control (MPC) is a de facto standard control algorithm across the process industries. There remain, however, applications where MPC is impractical because an optimization problem is solved at each time step. We present a link between explicit MPC formulations and manifold learning to enable facilitated prediction of the MPC policy. Our method uses a similarity measure informed by control policies and system state variables, to "learn" an intrinsic parametrization of the MPC controller using a diffusion maps algorithm, which will also discover a low-dimensional control law when it exists as a smooth, nonlinear combination of the state variables. We use function approximation algorithms to project points from state space to the intrinsic space, and from the intrinsic space to policy space. The approach is illustrated first by "learning" the intrinsic variables for MPC control of constrained linear systems, and then by designing controllers for an unstable nonlinear reactor.
Submitted 8 July, 2019; v1 submitted 3 December, 2018;
originally announced December 2018.
-
Manifold Learning for Organizing Unstructured Sets of Process Observations
Authors:
Felix Dietrich,
Mahdi Kooshkbaghi,
Erik M. Bollt,
Ioannis G. Kevrekidis
Abstract:
Data mining is routinely used to organize ensembles of short temporal observations so as to reconstruct useful, low-dimensional realizations of an underlying dynamical system. In this paper, we use manifold learning to organize unstructured ensembles of observations ("trials") of a system's response surface. We have no control over where every trial starts; and during each trial operating conditions are varied by turning "agnostic" knobs, which change system parameters in a systematic but unknown way. As one (or more) knobs "turn" we record (possibly partial) observations of the system response. We demonstrate how such partial and disorganized observation ensembles can be integrated into coherent response surfaces whose dimension and parametrization can be systematically recovered in a data-driven fashion. The approach can be justified through the Whitney and Takens embedding theorems, allowing reconstruction of manifolds/attractors through different types of observations. We demonstrate our approach by organizing unstructured observations of response surfaces, including the reconstruction of a cusp bifurcation surface for Hydrogen combustion in a Continuous Stirred Tank Reactor. Finally, we demonstrate how this observation-based reconstruction naturally leads to informative transport maps between input parameter space and output/state variable spaces.
Submitted 21 June, 2019; v1 submitted 30 October, 2018;
originally announced October 2018.
-
The ILC positron target cooled by thermal radiation
Authors:
Sabine Riemann,
Felix Dietrich,
Gudrid Moortgat-Pick,
Peter Sievers,
Andriy Ushakov
Abstract:
The design of the conversion target for the undulator-based ILC positron source is still under development. One important issue is the cooling of the target. Here, the status of the design studies for cooling by thermal radiation is presented.
Submitted 31 January, 2018;
originally announced January 2018.
-
On Matching, and Even Rectifying, Dynamical Systems through Koopman Operator Eigenfunctions
Authors:
Erik M. Bollt,
Qianxiao Li,
Felix Dietrich,
Ioannis Kevrekidis
Abstract:
Matching dynamical systems, through different forms of conjugacies and equivalences, has long been a fundamental concept, and a powerful tool, in the study and classification of nonlinear dynamic behavior (e.g. through normal forms). In this paper we will argue that the use of the Koopman operator and its spectrum is particularly well suited for this endeavor, both in theory, but also especially in view of recent data-driven algorithm developments. We believe, and document through illustrative examples, that this can nontrivially extend the use and applicability of the Koopman spectral theoretical and computational machinery beyond modeling and prediction, towards what can be considered as a systematic discovery of "Cole-Hopf-type" transformations for dynamics.
Submitted 6 March, 2018; v1 submitted 19 December, 2017;
originally announced December 2017.
-
Derivation of higher-order terms in FFT-based numerical homogenization
Authors:
Felix Dietrich,
Dennis Merkert,
Bernd Simeon
Abstract:
In this paper, we first introduce the reader to the Basic Scheme of Moulinec and Suquet in the setting of quasi-static linear elasticity, which takes advantage of the fast Fourier transform on homogenized microstructures to accelerate otherwise time-consuming computations. By means of an asymptotic expansion, a hierarchy of linear problems is derived, whose solutions are looked at in detail. It is highlighted how these generalized homogenization problems depend on each other. We extend the Basic Scheme to fit this new problem class and give some numerical results for the first two problem orders.
Submitted 14 December, 2017;
originally announced December 2017.
-
An Emergent Space for Distributed Data with Hidden Internal Order through Manifold Learning
Authors:
Felix P. Kemeth,
Sindre W. Haugland,
Felix Dietrich,
Tom Bertalan,
Kevin Höhlein,
Qianxiao Li,
Erik M. Bollt,
Ronen Talmon,
Katharina Krischer,
Ioannis G. Kevrekidis
Abstract:
Manifold-learning techniques are routinely used in mining complex spatiotemporal data to extract useful, parsimonious data representations/parametrizations; these are, in turn, useful in nonlinear model identification tasks. We focus here on the case of time series data that can ultimately be modelled as a spatially distributed system (e.g. a partial differential equation, PDE), but where we do not know the space in which this PDE should be formulated. Hence, even the spatial coordinates for the distributed system themselves need to be identified - to emerge from - the data mining process. We will first validate this emergent space reconstruction for time series sampled without space labels in known PDEs; this brings up the issue of observability of physical space from temporal observation data, and the transition from spatially resolved to lumped (order-parameter-based) representations by tuning the scale of the data mining kernels. We will then present actual emergent space discovery illustrations. Our illustrative examples include chimera states (states of coexisting coherent and incoherent dynamics), and chaotic as well as quasiperiodic spatiotemporal dynamics, arising in partial differential equations and/or in heterogeneous networks. We also discuss how data-driven spatial coordinates can be extracted in ways invariant to the nature of the measuring instrument. Such gauge-invariant data mining can go beyond the fusion of heterogeneous observations of the same system, to the possible matching of apparently different systems.
Submitted 6 December, 2018; v1 submitted 17 August, 2017;
originally announced August 2017.
-
Why is solar cycle 24 an inefficient producer of high-energy particle events?
Authors:
Rami Vainio,
Osku Raukunen,
Allan J. Tylka,
William F. Dietrich,
Alexandr Afanasiev
Abstract:
The aim of the study is to investigate the reason for the low productivity of high-energy SEPs in the present solar cycle. We employ scaling laws derived from diffusive shock acceleration theory and simulation studies including proton-generated upstream Alfvén waves to find out how the changes observed in the long-term average properties of the erupting and ambient coronal and/or solar wind plasma would affect the ability of shocks to accelerate particles to the highest energies. Provided that self-generated turbulence dominates particle transport around coronal shocks, it is found that the most crucial factors controlling the diffusive shock acceleration process are the number density of seed particles and the plasma density of the ambient medium. Assuming that suprathermal populations provide a fraction of the particles injected to shock acceleration in the corona, we show that the lack of most energetic particle events as well as the lack of low charge-to-mass ratio ion species in the present cycle can be understood as a result of the reduction of average coronal plasma and suprathermal densities in the present cycle over the previous one.
Submitted 3 July, 2017;
originally announced July 2017.
-
Extended dynamic mode decomposition with dictionary learning: a data-driven adaptive spectral decomposition of the Koopman operator
Authors:
Qianxiao Li,
Felix Dietrich,
Erik M. Bollt,
Ioannis G. Kevrekidis
Abstract:
Numerical approximation methods for the Koopman operator have advanced considerably in the last few years. In particular, data-driven approaches such as dynamic mode decomposition (DMD) and its generalization, the extended-DMD (EDMD), are becoming increasingly popular in practical applications. The EDMD improves upon the classical DMD by the inclusion of a flexible choice of dictionary of observables that spans a finite dimensional subspace on which the Koopman operator can be approximated. This enhances the accuracy of the solution reconstruction and broadens the applicability of the Koopman formalism. Although the convergence of the EDMD has been established, applying the method in practice requires a careful choice of the observables to improve convergence with just a finite number of terms. This is especially difficult for high dimensional and highly nonlinear systems. In this paper, we employ ideas from machine learning to improve upon the EDMD method. We develop an iterative approximation algorithm which couples the EDMD with a trainable dictionary represented by an artificial neural network. Using the Duffing oscillator and the Kuramoto Sivashinsky PDE as examples, we show that our algorithm can effectively and efficiently adapt the trainable dictionary to the problem at hand to achieve good reconstruction accuracy without the need to choose a fixed dictionary a priori. Furthermore, to obtain a given accuracy we require fewer dictionary terms than EDMD with fixed dictionaries. This alleviates an important shortcoming of the EDMD algorithm and enhances the applicability of the Koopman framework to practical problems.
Submitted 1 July, 2017;
originally announced July 2017.
-
Using Raspberry Pi for scientific video observation of pedestrians during a music festival
Authors:
Daniel H. Biedermann,
Felix Dietrich,
Oliver Handel,
Peter M. Kielar,
Michael Seitz
Abstract:
The document serves as a reference for researchers trying to capture a large portion of a mass event on video for several hours, while using a very limited budget.
Submitted 1 November, 2015;
originally announced November 2015.
-
Numerical Model Construction with Closed Observables
Authors:
Felix Dietrich,
Gerta Köster,
Hans-Joachim Bungartz
Abstract:
Performing analysis, optimization and control using simulations of many-particle systems is computationally demanding when no macroscopic model for the dynamics of the variables of interest is available. In case observations on the macroscopic scale can only be produced via legacy simulator code or live experiments, finding a model for these macroscopic variables is challenging.
In this paper, we employ time-lagged embedding theory to construct macroscopic numerical models from output data of a black box, such as a simulator or live experiments. Since the state space variables of the constructed, coarse model are dynamically closed and observable by an observation function, we call these variables closed observables. The approach is an online-offline procedure, as model construction from observation data is performed offline and the new model can then be used in an online phase, independent of the original. We illustrate the theoretical findings with numerical models constructed from time series of a two-dimensional ordinary differential equation system, and from the density evolution of a transport-diffusion system. Applicability is demonstrated in a real-world example, where passengers leave a train and the macroscopic model for the density flow onto the platform is constructed with our approach. If only the macroscopic variables are of interest, simulation runtimes with the numerical model are three orders of magnitude lower compared to simulations with the original fine scale model. We conclude with a brief discussion of possibilities of numerical model construction in systematic upscaling, network optimization and uncertainty quantification.
Submitted 18 October, 2015; v1 submitted 15 June, 2015;
originally announced June 2015.
-
Derivation of an optical potential for statically deformed rare-earth nuclei from a global spherical potential
Authors:
G. P. A. Nobre,
A. Palumbo,
F. S. Dietrich,
M. Herman,
D. Brown,
S. Hoblit
Abstract:
The coupled-channel theory is a natural way of treating nonelastic channels, in particular those arising from collective excitations characterized by nuclear deformations. A proper treatment of such excitations is often essential to the accurate description of experimental nuclear-reaction data and to the prediction of a wide variety of scattering observables. Stimulated by recent work substantiating the near validity of the adiabatic approximation in coupled-channel calculations for scattering on statically deformed nuclei, we explore the possibility of generalizing a global spherical optical model potential (OMP) to make it usable in coupled-channel calculations on this class of nuclei. To do this, we have deformed the Koning-Delaroche global spherical potential for neutrons, coupling a sufficient number of states of the ground state band to ensure convergence. We present an extensive study of the effects of collective couplings and nuclear deformations on integrated cross sections as well as on angular distributions for neutron-induced reactions on statically deformed nuclei in the rare-earth region. We choose isotopes of three rare-earth elements (Gd, Ho, W), which are known to be nearly perfect rotors, to exemplify the results of the proposed method. Predictions from our model for total, elastic and inelastic cross sections, as well as for elastic and inelastic angular distributions, are in reasonable agreement with measured experimental data. These results suggest that the deformed Koning-Delaroche potential provides a useful regional neutron optical potential for the statically deformed rare earth nuclei.
Submitted 22 December, 2014;
originally announced December 2014.
-
Gradient Navigation Model for Pedestrian Dynamics
Authors:
Felix Dietrich,
Gerta Köster
Abstract:
We present a new microscopic ODE-based model for pedestrian dynamics: the Gradient Navigation Model. The model uses a superposition of gradients of distance functions to directly change the direction of the velocity vector. The velocity is then integrated to obtain the location. The approach differs fundamentally from force based models needing only three equations to derive the ODE system, as opposed to four in, e.g., the Social Force Model. Also, as a result, pedestrians are no longer subject to inertia. Several other advantages ensue: Model induced oscillations are avoided completely since no actual forces are present. The derivatives in the equations of motion are smooth and therefore allow the use of fast and accurate high order numerical integrators. At the same time, existence and uniqueness of the solution to the ODE system follow almost directly from the smoothness properties. In addition, we introduce a method to calibrate parameters by theoretical arguments based on empirically validated assumptions rather than by numerical tests. These parameters, combined with the accurate integration, yield simulation results with no collisions of pedestrians. Several empirically observed system phenomena emerge without the need to recalibrate the parameter set for each scenario: obstacle avoidance, lane formation, stop-and-go waves and congestion at bottlenecks. The density evolution in the latter is shown to be quantitatively close to controlled experiments. Likewise, we observe a dependence of the crowd velocity on the local density that compares well with benchmark fundamental diagrams.
Submitted 14 May, 2014; v1 submitted 2 January, 2014;
originally announced January 2014.
-
Towards an optical potential for rare-earths through coupled channels
Authors:
G. P. A. Nobre,
F. S. Dietrich,
M. Herman,
A. Palumbo,
S. Hoblit,
D. Brown
Abstract:
The coupled-channel theory is a natural way of treating nonelastic channels, in particular those arising from collective excitations, defined by nuclear deformations. Proper treatment of such excitations is often essential to the accurate description of reaction experimental data. Previous works have applied different models to specific nuclei with the purpose of determining angular-integrated cross sections. In this work, we present an extensive study of the effects of collective couplings and nuclear deformations on integrated cross sections as well as on angular distributions in a consistent manner for neutron-induced reactions on nuclei in the rare-earth region. This specific subset of the nuclide chart was chosen precisely because of a clear static deformation pattern. We analyze the convergence of the coupled-channel calculations regarding the number of states being explicitly coupled. Inspired by the work done by Dietrich et al., a model for deforming the spherical Koning-Delaroche optical potential as function of quadrupole and hexadecupole deformations is also proposed. We demonstrate that the obtained results of calculations for total, elastic and inelastic cross sections, as well as elastic and inelastic angular distributions correspond to a remarkably good agreement with experimental data for scattering energies above around a few MeV.
Submitted 7 November, 2013;
originally announced November 2013.
-
Coupled-channel optical model potential for rare earth nuclei
Authors:
M. Herman,
G. P. A. Nobre,
A. Palumbo,
F. S. Dietrich,
D. Brown,
S. Hoblit
Abstract:
Inspired by the recent work by Dietrich et al., substantiating validity of the adiabatic assumption in coupled-channel calculations, we explore the possibility of generalizing a global spherical optical model potential (OMP) to make it usable in coupled-channel calculations on statically deformed nuclei. The generalization consists in adding the coupling of the ground state rotational band, deforming the potential by introducing appropriate quadrupole and hexadecupole deformation and correcting the OMP radius to preserve volume integral of the spherical OMP. We choose isotopes of three rare-earth elements (W, Ho, Gd), which are known to be nearly perfect rotors, to perform a consistent test of our conjecture on integrated cross sections as well as on angular distributions for elastic and inelastic neutron scattering. When doing this we employ the well-established Koning-Delaroche global spherical potential and experimentally determined deformations without any adjustments. We observe a dramatically improved agreement with experimental data compared to spherical optical model calculations. The effect of changing the OMP radius to preserve volume integral is moderate but visibly improves agreement at lower incident energies. We find that seven collective states need to be considered for the coupled-channel calculations to converge. Our results for total, elastic, inelastic, and capture cross sections, as well as elastic and inelastic angular distributions are in remarkable agreement with experimental data. This result confirms that the adiabatic assumption holds and can extend applicability of the global spherical OMP to rotational nuclei in the rare-earth region, essentially without any free parameter. Thus, quite reliable coupled-channel calculations can be performed on such nuclei even when the experimental data, and consequently a specific coupled-channel potential, are not available.
Submitted 6 February, 2014; v1 submitted 5 November, 2013;
originally announced November 2013.
-
Towards a coupled-channel optical potential for rare-earth nuclei
Authors:
G. P. A. Nobre,
A. Palumbo,
D. Brown,
M. Herman,
S. Hoblit,
F. S. Dietrich
Abstract:
We present an outline of an extensive study of the effects of collective couplings and nuclear deformations on integrated cross sections as well as on angular distributions in a consistent manner for neutron-induced reactions on nuclei in the rare-earth region. This specific subset of the nuclide chart was chosen precisely because of a clear static deformation pattern. We analyze the convergence of the coupled-channel calculations regarding the number of states being explicitly coupled. A model for deforming the spherical Koning-Delaroche optical potential as function of quadrupole and hexadecupole deformations is also proposed, inspired by previous works. We demonstrate that the obtained results of calculations for total, elastic, inelastic, and capture cross sections, as well as elastic and inelastic angular distributions are in remarkably good agreement with experimental data for scattering energies around a few MeV.
Submitted 2 November, 2013;
originally announced November 2013.
-
Towards a Microscopic Reaction Description Based on Energy-Density-Functional Structure Models
Authors:
G. P. A. Nobre,
F. S. Dietrich,
J. E. Escher,
I. J. Thompson,
M. Dupuis,
J. Terasaki,
J. Engel
Abstract:
A microscopic calculation of reaction cross sections for nucleon-nucleus scattering has been performed by explicitly coupling the elastic channel to all particle-hole excitations in the target and one-nucleon pickup channels. The particle-hole states may be regarded as doorway states through which the flux flows to more complicated configurations, and subsequently to long-lived compound nucleus resonances. Target excitations for $^{40,48}$Ca, $^{58}$Ni, $^{90}$Zr and $^{144}$Sm were described in a random-phase framework using a Skyrme functional. Reaction cross sections obtained agree very well with experimental data and predictions of a state-of-the-art fitted optical potential. Couplings between inelastic states were found to be negligible, while the pickup channels contribute significantly. The effect of resonances from higher-order channels was assessed. Elastic angular distributions were also calculated within the same method, achieving good agreement with experimental data. For the first time observed absorptions are completely accounted for by explicit channel coupling, for incident energies between 10 and 70 MeV, with consistent angular distribution results.
Submitted 8 December, 2011; v1 submitted 4 October, 2011;
originally announced October 2011.
-
Reaction cross-section predictions for nucleon induced reactions
Authors:
G. P. A. Nobre,
I. J. Thompson,
J. E. Escher,
F. S. Dietrich
Abstract:
A microscopic calculation of the optical potential for nucleon-nucleus scattering has been performed by explicitly coupling the elastic channel to all the particle-hole (p-h) excitation states in the target and to all relevant pickup channels. These p-h states may be regarded as doorway states through which the flux flows to more complicated configurations, and to long-lived compound nucleus resonances. We calculated the reaction cross sections for the nucleon induced reactions on the targets $^{40,48}$Ca, $^{58}$Ni, $^{90}$Zr and $^{144}$Sm using the QRPA description of target excitations, coupling to all inelastic open channels, and coupling to all transfer channels corresponding to the formation of a deuteron. The results of such calculations were compared to predictions of a well-established optical potential and with experimental data, reaching very good agreement. The inclusion of couplings to pickup channels was an important contribution to the absorption. For the first time, calculations of excitations account for all of the observed reaction cross-sections, at least for incident energies above 10 MeV.
Submitted 28 July, 2010;
originally announced July 2010.
-
Coupled-channels calculations of nonelastic cross sections using a density-functional structure model
Authors:
G. P. A. Nobre,
F. S. Dietrich,
J. E. Escher,
I. J. Thompson,
M. Dupuis,
J. Terasaki,
J. Engel
Abstract:
A microscopic calculation of the reaction cross-section for nucleon-nucleus scattering has been performed by explicitly coupling the elastic channel to all particle-hole (p-h) excitation states in the target and to all one-nucleon pickup channels. The p-h states may be regarded as doorway states through which the flux flows to more complicated configurations, and subsequently to long-lived compound nucleus resonances. Target excitations for 40,48Ca, 58Ni, 90Zr and 144Sm were described in a QRPA framework using a Skyrme functional. Reaction cross sections calculated in this approach were compared to predictions of a fitted optical potential and to experimental data, reaching very good agreement. Couplings between inelastic states were found to be negligible, while the couplings to pickup channels contribute significantly. For the first time observed reaction cross-sections are completely accounted for by explicit channel coupling, for incident energies between 10 and 40 MeV.
Submitted 18 October, 2010; v1 submitted 1 June, 2010;
originally announced June 2010.