Modeling Unknown Stochastic Dynamical System Subject to External Excitation
Yuan Chen and Dongbin Xiu
Abstract
We present a numerical method for learning unknown nonautonomous stochastic dynamical systems, i.e., stochastic systems subject to time-dependent excitation or control signals. Our basic assumption is that the governing equations for the stochastic system are unavailable. However, short bursts of input/output (I/O) data consisting of certain known excitation signals and their corresponding system responses are available. When a sufficient amount of such I/O data is available, our method is capable of learning the unknown dynamics and producing an accurate predictive model for the stochastic responses of the system subject to arbitrary excitation signals not in the training data. Our method has two key components: (1) a local approximation of the training I/O data to transfer the learning into a parameterized form; and (2) a generative model to approximate the underlying unknown stochastic flow map in distribution. After presenting the method in detail, we present a comprehensive set of numerical examples to demonstrate the performance of the proposed method, especially for long-term system predictions.
keywords:
Data-driven modeling, stochastic dynamical systems, deep neural networks, nonautonomous system
MSC codes: 60H10, 60H35, 62M45, 65C30
1 Introduction
There has been a growing interest in recovering/discovering unknown dynamical systems from observational data. Most of the existing studies focus on deterministic systems, with methods such as physics-informed neural networks (PINNs) ([39, 40]), SINDy ([5]), Fourier neural operator (FNO) ([24]), computational graph completion ([30]), sparsity promoting methods ([41, 42, 19]), flow map learning (FML) ([38, 11]), to name a few.
Learning unknown stochastic systems is notably more challenging, as the stochastic noises in the systems usually cannot be directly observed. Existing work utilizes Gaussian processes ([48, 1, 12, 29]), polynomial approximations ([44, 22]), deep neural networks (DNNs) ([8, 47, 9, 49, 14, 50]), etc. More recently, a stochastic extension of the deterministic flow map learning (FML) approach ([38, 11]) was proposed. It employs generative models such as GANs (generative adversarial networks) ([10]) or autoencoders ([46]) to model the underlying stochasticity. However, most, if not all, of these methods are developed for autonomous systems, where time-invariance (in distribution) holds true and is critical to the method development.
The focus and contribution of this paper is the learning and modeling of unknown non-autonomous stochastic systems. More specifically, we consider SDEs with unknown governing equations and subject to time-dependent external excitation or control signals. Our goal is to develop a method that can capture the stochastic dynamics of the unknown systems by using short-term data consisting of input/output (I/O) relations between excitation signals and their corresponding system responses. We remark that there exist some studies on modeling deterministic non-autonomous systems, using methodologies such as Dynamic Mode Decomposition (DMD) ([25, 34]), SINDy ([6]), Koopman operator ([35]), FML ([36]), etc. These methods are not applicable to stochastic non-autonomous systems.
The proposed method in this paper has two key components. First, the method utilizes the observational I/O data to construct an accurate representation of the unknown stochastic dynamics of the system. This is accomplished by a generative model that learns the stochastic mapping of the system between two consecutive discrete time steps. The learning of this stochastic flow map is similar to the work of [10, 46], which extended the deterministic FML to stochastic systems. While [10, 46] utilized GANs and autoencoders as the generative models, in this paper we employ conditional normalizing flow (cf. [31]). Normalizing flow has been widely adopted as a probabilistic model for generating data with desired distributions. Its applications include image and video generation [21], statistical inference and sampling [27, 43], reinforcement learning [18], as well as scientific computing [26, 23, 17, 13]. The second key component of the proposed method is the local parameterization of the excitation signal in the training I/O data. The method was first introduced in [36] for deterministic nonautonomous systems. We adopt a similar idea and extend it to stochastic systems. The approach seeks to parameterize the excitation signals in the training data via a localized polynomial over one time step. This then transforms the learning problem into a parametric learning problem between the coefficients of the local polynomials and the system responses. This is a critical component, as it allows the learned system to conduct long-term system predictions under arbitrary excitation signals that are never seen in the training data. Although the proposed method requires a large number of short bursts of data, the overall demand for data may not be as large, because the burst length of the training I/O data is as short as two time steps. Once trained, the learned model is able to simulate the unknown stochastic system over very long time horizons and subject to arbitrary excitation/control signals.
We demonstrate this important property in several of our numerical examples. The learning is performed using training I/O data observed over only nondimensional time units. However, the system predictions by the learned system can be accurate for time units as large as and beyond, and under excitation signals not in the training data.
2 Setup
Let be an event space and a finite time horizon. We consider a -dimensional () stochastic process driven by an unknown (non-autonomous) stochastic differential equation (SDE) subject to external inputs
(1) |
where is -dimensional () Brownian motion, drift function, diffusion function, and and time-dependent external inputs into the stochastic system. In practice, the inputs can be external excitation signals or control signals. Throughout this paper, we will generally refer to them as excitations and denote to consolidate the notation.
Our basic assumption is that the SDE is unknown, in the sense that the functions and are not known. Also, the driving Brownian motion cannot be observed. However, we have input-output (I/O) time history data between the excitations , i.e., the inputs, and system response , i.e., the output,
(2) |
Our goal is to construct a numerical model for the unknown system (1) such that it can produce accurate predictions of the system response for arbitrarily given excitations that are not observed in the training data (2).
2.1 Problem Statement
The method presented in this paper is based on a discrete-time setting. Let be discrete time points. For simplicity, we assume the time steps are of uniform length , . Suppose we observe I/O sequences of solution responses subject to input excitations: for ,
(3) |
where is the length of the -th observation sequence. Note that each sequence of the I/O data can cover different time spans. Also, one may have more information about the excitation beyond its point values. For example, the analytical form of may be known within the time interval for some sequences.
The objective is to construct a numerical model to predict the system response of (1) subject to arbitrary excitations. More specifically, given an initial condition and excitation signal that is not in the training I/O data (3), we require the model prediction to approximate the true system response , i.e.,
(4) |
where stands for approximation in distribution. Note that since in general the stochastic driving term cannot be directly observed, a weak approximation, such as approximation in distribution, is typically the most one can achieve from a mathematical point of view.
2.2 Related Work and Contribution
The method developed in this paper builds on two recent works: flow map learning (FML) for modeling deterministic unknown dynamical systems and its extension to modeling stochastic dynamical systems.
Consider an unknown deterministic autonomous system, , , where is unknown. The FML method seeks to approximate the unknown flow map by using observation data. More specifically, by using data on over one time step , the FML method constructs a model
where is a numerical approximation of the true flow map over one time step . Once constructed, the FML model can be used as a time marching scheme to predict the system response under a given initial condition. This framework was proposed in [38], with extensions to partially observed systems [16], parametric systems [37], as well as non-autonomous deterministic systems [36].
For learning an autonomous stochastic system, where represents an unknown stochastic process driving the system, the work of [10] developed stochastic flow map learning (sFML). Assuming the system satisfies the time-homogeneous property ([28]) , , the method uses the observation data on the state variable to construct a one-step generative model
where is a random variable with known distribution (e.g., standard Gaussian). The function , termed the stochastic flow map, approximates the conditional distribution . Subsequently, the sFML model becomes a weak approximation, in distribution, to the true stochastic dynamics. Different generative models can be employed under the sFML framework. For example, generative adversarial networks (GANs) are used in [10], and an autoencoder is employed in [46].
The primary contribution of this paper is the development of data-driven modeling for unknown stochastic systems subject to external excitations. To accomplish this, we extend the sFML framework ([10]), which was developed for autonomous systems, to non-autonomous stochastic systems. To learn the system I/O responses, we employ the local parameterization technique developed for non-autonomous deterministic systems ([36]). The method parameterizes the input excitations in the data and transforms the learning problem into learning a parametric dynamical system. For the stochastic non-autonomous systems considered in this paper, we incorporate the method into a generative model in the sFML framework. In particular, we use normalizing flow as the generative model, which has not been considered in stochastic dynamical system learning. We shall demonstrate that the newly developed method is highly effective in modeling unknown stochastic systems under excitations that are not present in the training data.
3 Methodology
In this section, we describe the proposed learning method in detail.
3.1 Parameterization of Inputs
Consider the unknown SDE (1) over a time interval , ,
(5) |
which can be written equivalently as,
(6) |
By using the compact notation , we now consider the excitation in the time interval . Given the information of the excitation in the training data (3), we construct a parameterized form
(7) |
where is a set of prescribed analytical basis functions and
(8) |
are the expansion coefficients. In principle, one can choose any suitable basis functions. Since the time interval usually has a (very) small step size , it suffices to use low-order polynomials. In fact, low-degree monomial bases, , , would be sufficient for most problems. When , the parameterization takes the form of a piecewise constant function; when , a piecewise linear function.
The local parameterization of is carried out based on the information one has about the excitations. If the excitations are only known at the discrete time instances, as shown in (3), then it is natural to utilize piecewise linear polynomial,
If more information about is available, one can construct a higher degree polynomial. Note that since the time step is usually small, a quadratic polynomial can be highly accurate in any time interval . We remark that in the representation, only the values of the excitations at and are needed. The values of the time and are not required.
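The local parameterization can be illustrated with a minimal sketch. It assumes a monomial basis in the local time variable `tau` (measured from the left end of the interval) and covers two cases discussed in this section: a linear polynomial built from the two endpoint values, and a degree-2 Taylor polynomial when derivatives of the excitation are known. The function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def local_linear_coeffs(gamma_n, gamma_np1, dt):
    """Coefficients (c0, c1) of c0 + c1*tau on [0, dt], from endpoint values only."""
    c0 = np.asarray(gamma_n, dtype=float)
    c1 = (np.asarray(gamma_np1, dtype=float) - c0) / dt
    return np.stack([c0, c1], axis=-1)

def local_taylor_coeffs(gamma, dgamma, ddgamma, t_n):
    """Degree-2 Taylor coefficients about t_n in the monomial basis 1, tau, tau^2."""
    return np.array([gamma(t_n), dgamma(t_n), 0.5 * ddgamma(t_n)])

def eval_local_poly(coeffs, tau):
    """Evaluate sum_k coeffs[k] * tau**k at local time tau."""
    coeffs = np.asarray(coeffs, dtype=float)
    return coeffs @ (tau ** np.arange(coeffs.shape[-1]))

dt = 0.01  # illustrative step size, not taken from the paper
lin = local_linear_coeffs(1.0, 1.5, dt)
tay = local_taylor_coeffs(np.sin, np.cos, lambda t: -np.sin(t), t_n=0.3)
```

Note that the linear coefficients depend only on the two excitation values and the step size, not on the absolute times, which is consistent with the remark above.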
3.2 Parametric Stochastic Flow Map
By replacing by the local polynomial (7), we transform the system (6) into
(9) |
where the excitation signals have been parameterized by via a set of parameters . Compared to (6), the transformed system (9) contains possible numerical error introduced by the parameterization of the excitations over the time domain . The error can be made arbitrarily small if one uses higher degree polynomials when is sufficiently small.
By using subscript to denote the time level and letting
the parameterized system (9) indicates that there exists a mapping
(10) |
where is what we shall call the parametric stochastic flow map, which is parameterized by . It is an unknown operator as the functions and are unknown in the original system (1).
Remark 3.1.
It is important to recognize that for the Brownian motion , or in general for Lévy processes (càdlàg stochastic processes with stationary independent increments), the process is stationary and independent of . Therefore, only the time difference matters, and the values of and do not. Consequently, we have suppressed the time variable and in (10).
3.3 Stochastic Flow Map Learning
In this section, we describe our main method of stochastic flow map learning (sFML), which constructs a generative model to approximate the stochastic flow map (10) by using the trajectory data (3).
3.3.1 Training Data
To construct the training data set, we reorganize the original training data set (3) into pairs of consecutive time instances. Since from the -th trajectory, , we can extract such pairs, there is a total of I/O data pairs from the data set (3):
(11) |
Next, we apply the local parameterization procedure from Section 3.1 to each pair of the input data and obtain its parameterization , , in the form of (7). We now have
(12) |
Since the values of the time variables do not matter, see Remark 3.1, we again suppress the time variables and write our training data set as
(13) |
where is the total number of the parametric data pairs. In this way, each -th entry of the data set is a trajectory of length two over one time step , starting with its “initial condition” at , ending one step later at , and driven by a known excitation parameterized by and an unknown stationary stochastic process .
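A minimal sketch of this reorganization is given below. It assumes each observed sequence is stored as an array of states and an array of excitation values on a uniform time grid, and it uses the linear (two-endpoint) parameterization. The helper names are hypothetical.

```python
import numpy as np

def trajectory_to_pairs(x, gamma, dt):
    """Split one I/O sequence into (state, parameters, next state) one-step pairs."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)              # (L+1, d)
    gamma = np.asarray(gamma, dtype=float).reshape(len(gamma), -1)  # (L+1, q)
    x0, x1 = x[:-1], x[1:]
    c0 = gamma[:-1]                      # constant coefficient
    c1 = (gamma[1:] - gamma[:-1]) / dt   # linear coefficient
    params = np.concatenate([c0, c1], axis=1)
    return x0, params, x1

def build_training_set(trajectories, dt):
    """Pool the pairs of all sequences; absolute time stamps are discarded."""
    parts = [trajectory_to_pairs(x, g, dt) for x, g in trajectories]
    return tuple(np.concatenate(p, axis=0) for p in zip(*parts))

# Two toy sequences of different lengths (4 and 3 time points).
trajs = [(np.arange(4.0), np.zeros(4)), (np.arange(3.0), np.ones(3))]
x0, params, x1 = build_training_set(trajs, dt=0.01)
```

Sequences of different lengths can be pooled freely because only consecutive pairs enter the data set.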
3.3.2 Generative Model
In stochastic flow map learning (sFML), we seek to approximate the parametric stochastic flow map (10) via a recursive generative model in the form of
(14) |
where is a random variable of known distribution. Again, since the stationary stochastic process in (10) is not observed, the constructed sFML model (14) is expected to be a weak approximation of (10), and in this particular case, approximation in distribution.
In order to construct the sFML model (14), we execute the model for one time step over ,
(15) |
and utilize the training data set (13) to learn the unknown operator . Note that the random variable is not in the training data set. In practice, one chooses with a known distribution, typically a standard Gaussian, and a specified dimension . The presence of the random variable enables (15) to be a stochastic generative model that can produce random realizations. Several methods exist to construct stochastic generative models, e.g., GANs, diffusion models, normalizing flows, autoencoders, etc. In this paper, we adopt normalizing flow for (15).
3.3.3 Normalizing Flow Model
Normalizing flows are generative models that produce tractable distributions to enable efficient and accurate sampling and density evaluation. A normalizing flow is a transformation of a simple probability distribution, e.g., a standard normal, into a more complex distribution by a sequence of diffeomorphisms. Let be a random variable with a known and tractable distribution . Let be a diffeomorphism, whose inverse is , and . Then, using the change of variables formula, one obtains the probability density of :
where is the Jacobian of and is the Jacobian of . When the target complex distribution is given, usually as a set of samples of , one seeks from a parameterized family , where the parameter is optimized to match the target distribution. Also, to circumvent the difficulty of constructing a complicated nonlinear function , one utilizes a composition of (much) simpler diffeomorphisms: . It can be shown that remains a diffeomorphism with its inverse . There exists a large body of literature on normalizing flows. We refer the interested reader to the review articles [20, 32].
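As a small illustration of the change of variables formula and of composing simple diffeomorphisms, the sketch below uses scalar affine maps only. This is an assumption made for brevity and is far simpler than the flows used in practice. The class and function names are hypothetical.

```python
import numpy as np

def std_normal_logpdf(z):
    """Log-density of the scalar standard normal base distribution."""
    return -0.5 * (z**2 + np.log(2.0 * np.pi))

class Affine:
    """Scalar diffeomorphism T(z) = shift + scale * z with scale != 0."""
    def __init__(self, shift, scale):
        self.shift, self.scale = float(shift), float(scale)
    def forward(self, z):
        return self.shift + self.scale * z
    def inverse(self, x):
        return (x - self.shift) / self.scale
    def log_abs_det_inverse(self, x):
        # The Jacobian of the inverse map is 1/scale everywhere.
        return -np.log(abs(self.scale)) * np.ones_like(x)

def flow_logpdf(x, layers):
    """Log-density of x = T_K(...T_1(z)...), z ~ N(0, 1), via change of variables."""
    x = np.asarray(x, dtype=float)
    log_det = np.zeros_like(x)
    for layer in reversed(layers):       # undo the last map first
        log_det = log_det + layer.log_abs_det_inverse(x)
        x = layer.inverse(x)
    return std_normal_logpdf(x) + log_det

layers = [Affine(1.0, 2.0), Affine(-3.0, 0.5)]   # composite map: z -> z - 2.5
xs = np.linspace(-5.0, 0.0, 11)
log_p = flow_logpdf(xs, layers)
```

Here the composition of the two affine maps is a shift by -2.5, so the resulting density is a unit-variance Gaussian centered at -2.5, which gives a closed-form check.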
In our setting, we seek to construct the one-step generative model (15) by using the training data (13). Let be a random variable with a known distribution. In our approach, we choose to be -dimensional standard normal. Let be a diffeomorphism with a set of parameters . Our objective is to find such that follows the distribution of in (13).
Since the distribution of clearly depends on and , we constrain the choice of to be a function of and . We define
(16) |
where is a DNN mapping with trainable hyperparameters . No special DNN structure is required, and we adopt the straightforward fully connected feedforward DNN for . This effectively defines
(17) |
where the diffeomorphism is effectively parameterized by the trainable hyperparameters of the DNN. Let be the inverse of . We have .
The invertibility of allows us to compute:
(18) |
The hyperparameters are determined by maximizing the expected log-likelihood, which is accomplished by minimizing its negative as the loss function,
where is the distribution from the training data set (13) and computed as
(19) |
Several designs for the invertible map have been developed and studied extensively in the literature. These include, for example, masked autoregressive flow (MAF) [33], real-valued non-volume preserving (RealNVP) [15], neural ordinary differential equations (Neural ODE) [7], etc. In this paper, we adopt the MAF approach, where the dimension of the parameter in (16) is set to be , where is the dimension of the dynamical system. For the technical details of MAF, see [33].
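The negative log-likelihood loss can be sketched for the simplest conditional flow, a single affine layer whose shift and log-scale are produced by a function of the current state and the excitation parameters. This is an illustrative simplification and not the MAF architecture of [33]. The functions `good_net` and `poor_net` are hypothetical stand-ins for a trained and an untrained network, and the synthetic data are generated only for this check.

```python
import numpy as np

def conditional_affine_nll(x1, x0, gamma, net):
    """Mean negative log-likelihood of x1 under x1 = mu + exp(log_sigma) * z,
    where z is standard normal and (mu, log_sigma) = net(x0, gamma)."""
    mu, log_sigma = net(x0, gamma)
    z = (x1 - mu) * np.exp(-log_sigma)                         # inverse map
    log_p = -0.5 * (z**2 + np.log(2.0 * np.pi)) - log_sigma    # change of variables
    return -np.mean(np.sum(log_p, axis=-1))

# Synthetic one-step data (hypothetical): x1 = x0 + 0.5*gamma + 0.1*noise.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.uniform(-1.0, 1.0, size=(n, 1))
gamma = rng.uniform(-1.0, 1.0, size=(n, 1))
x1 = x0 + 0.5 * gamma + 0.1 * rng.standard_normal((n, 1))

def good_net(x0, gamma):
    """Conditioner matching the data-generating mechanism."""
    return x0 + 0.5 * gamma, np.full_like(x0, np.log(0.1))

def poor_net(x0, gamma):
    """Conditioner that ignores the excitation and overestimates the spread."""
    return x0, np.full_like(x0, np.log(0.5))

nll_good = conditional_affine_nll(x1, x0, gamma, good_net)
nll_poor = conditional_affine_nll(x1, x0, gamma, poor_net)
```

In an actual training loop the conditioner would be a DNN and the loss would be minimized with a gradient-based optimizer; the comparison here only shows that the loss favors the conditioner that matches the data.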
3.4 DNN Model Structure and System Prediction
An illustration of the proposed sFML model structure can be found in Figure 1. This is in direct correspondence with (17). Minimization of the loss function (19), using the data set (13), results in the training of the DNN hyperparameters . Once the training is completed and fixed, (17) effectively defines the one-step sFML model (15):
where we have suppressed the fixed parameter .
![Refer to caption](x1.png)
Iterative execution of the one-step sFML model allows one to conduct system predictions under excitations that are not in the training data. For a given (new) excitation signal , we first conduct its parameterization in the form of (7), to obtain its local parameter for , for any . The sFML system then produces the system prediction, for a given initial condition ,
(20) |
where are i.i.d. -dimensional standard normal random variables.
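The prediction procedure can be sketched as a simple time-marching loop. The function `stand_in_step` below is a hypothetical placeholder for the trained one-step model, included only so that the loop runs; the excitation is parameterized locally by a linear polynomial from its two endpoint values, which is one of the options described in Section 3.1.

```python
import numpy as np

def rollout(one_step, x0, gamma, dt, n_steps, n_samples, rng):
    """March an ensemble forward under a new excitation gamma(t)."""
    x = np.tile(np.atleast_1d(np.asarray(x0, dtype=float)), (n_samples, 1))
    paths = [x]
    for n in range(n_steps):
        t_n = n * dt
        g0, g1 = gamma(t_n), gamma(t_n + dt)
        params = np.array([g0, (g1 - g0) / dt])   # local linear parameterization
        z = rng.standard_normal(x.shape)          # fresh i.i.d. latent samples
        x = one_step(x, params, z)
        paths.append(x)
    return np.stack(paths)                        # (n_steps + 1, n_samples, d)

def stand_in_step(x, params, z, dt=0.01):
    """Hypothetical placeholder for the trained one-step generative model."""
    return x + (params[0] - x) * dt + 0.1 * np.sqrt(dt) * z

rng = np.random.default_rng(0)
paths = rollout(stand_in_step, x0=1.5, gamma=np.sin, dt=0.01,
                n_steps=1000, n_samples=500, rng=rng)
```

The excitation enters the loop only through its local parameters on each step, so any signal that can be evaluated on the time grid can be used at prediction time.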
4 Numerical Examples
In this section, we present several numerical tests to demonstrate the performance of our proposed method. After presenting results for an Ornstein-Uhlenbeck (OU) process and a nonlinear SDE, we focus on nonlinear SDE systems for long-term predictions. These include a stochastic predator-prey model and a stochastic oscillator with double well potential. In both cases, we study very long-term predictions of the learned sFML models. In particular, for the stochastic oscillator, we utilize a periodic excitation signal that is known to generate the well-known “stochastic resonance” phenomenon.
In all the examples, the true SDE systems are known. However, the known SDEs are used only to generate the training data set (13). We solve the true systems by the Euler-Maruyama method with a time step . The “initial conditions” in (13) are sampled uniformly in a domain , specified in each example, and the excitations are local polynomials whose coefficients are sampled in a domain specified for each example.
In our sFML model (Figure 1), the DNN has 3 layers, each with 20 nodes, and utilizes activation function. We employ a cyclic learning rate with a base rate and a maximum rate , , and step size . The cycle is set for every training epochs and with a decay scale . A small weight decay of on the gradient updates is also used to help stabilize the training. In our examples, the DNN training is usually conducted for epochs.
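The data generation procedure can be sketched as follows. The drift, diffusion, sampling ranges, step size, and number of Euler-Maruyama substeps below are hypothetical choices made only for illustration and are not taken from the examples in this paper.

```python
import numpy as np

def em_one_interval(x0, coeffs, drift, diffusion, dt, n_sub, rng):
    """Advance states x0 (N, d) over one interval of length dt with n_sub
    Euler-Maruyama substeps. coeffs (N, p+1) holds the monomial coefficients
    of the local excitation polynomial in the local time tau."""
    x = np.array(x0, dtype=float)
    h = dt / n_sub
    powers = np.arange(coeffs.shape[1])
    for k in range(n_sub):
        tau = k * h
        g = (coeffs * tau**powers).sum(axis=1, keepdims=True)   # excitation value
        dw = np.sqrt(h) * rng.standard_normal(x.shape)          # Brownian increment
        x = x + drift(x, g) * h + diffusion(x, g) * dw
    return x

def drift(x, g):
    """Hypothetical drift: linear restoring force plus additive excitation."""
    return -x + g

def diffusion(x, g):
    """Hypothetical constant diffusion."""
    return 0.3 * np.ones_like(x)

rng = np.random.default_rng(0)
n_pairs, dt = 10000, 0.01
x0 = rng.uniform(-1.0, 1.0, size=(n_pairs, 1))       # sampled "initial conditions"
coeffs = rng.uniform(-1.0, 1.0, size=(n_pairs, 3))   # degree-2 polynomial coefficients
x1 = em_one_interval(x0, coeffs, drift, diffusion, dt, n_sub=10, rng=rng)
```

Each row of `(x0, coeffs, x1)` is then one entry of a training set in the form of (13).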
4.1 Linear SDE with Control
We first consider an Ornstein–Uhlenbeck (OU) process with control/excitation. Two cases are considered: when the control is in the drift and when the control is in both the drift and the diffusion. Note that since the true equations are not known, one has no information on “where” the excitations act on the system. The sFML approach also does not seek to recover the drift or diffusion terms.
4.1.1 OU with Drift Control
We first consider an Ornstein–Uhlenbeck (OU) process,
(21) |
where and are set as and , and the control signal is applied to the drift. The training data set (13) is generated by sampling in and using a Taylor polynomial of degree for the control . This introduces 3 parameters for , which are sampled from . A total of trajectory pairs are used in the training data set (13), where the time step .
Once the sFML model (14) is trained, we conduct system prediction for up to , which requires 1,000 time steps.
![Refer to caption](x2.png)
![Refer to caption](x3.png)
![Refer to caption](x4.png)
![Refer to caption](x5.png)
![Refer to caption](x6.png)
![Refer to caption](x7.png)
In Figure 2, we compare some sample trajectory paths produced by the ground truth (left) and the learned sFML model (right), with an initial condition and a “new” control signal . We observe that the two sets appear visually similar to each other. To further validate the sFML model prediction, we compute the mean and standard deviation of the solution averaged over trajectories. The sFML model predictions are shown in Figure 3, along with the reference ground truth. In Figure 4, we also show the comparison of the solution probability distributions at time . We observe good agreement between the learned sFML model and the true model. This verifies that the sFML model indeed provides an accurate approximation in distribution.
We now present the results under a different setting: the initial condition , and the excitation . The sample solution trajectories are shown in Figure 5 and the solution mean and standard deviation averaged over trajectories are shown in Figure 6. Again, we observe good agreement between the sFML model prediction and the ground truth.
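The comparison of means, standard deviations, and distributions described above can be sketched as below. The two ensembles are synthetic Gaussian samples standing in for ground-truth and model trajectories; they are hypothetical and serve only to make the snippet runnable.

```python
import numpy as np

def ensemble_stats(paths):
    """paths: (n_steps + 1, n_samples, d) -> sample mean and std at every time."""
    return paths.mean(axis=1), paths.std(axis=1, ddof=1)

def histogram_density(samples, bins, value_range):
    """Histogram estimate of the probability density at one time instance."""
    density, edges = np.histogram(samples, bins=bins, range=value_range, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density

rng = np.random.default_rng(0)
ref = rng.normal(1.0, 0.3, size=(11, 20000, 1))   # stand-in for ground truth
mod = rng.normal(1.0, 0.3, size=(11, 20000, 1))   # stand-in for model output

mean_ref, std_ref = ensemble_stats(ref)
mean_mod, std_mod = ensemble_stats(mod)
max_mean_err = np.abs(mean_ref - mean_mod).max()
max_std_err = np.abs(std_ref - std_mod).max()

centers, dens_ref = histogram_density(ref[-1, :, 0], 40, (-0.5, 2.5))
_, dens_mod = histogram_density(mod[-1, :, 0], 40, (-0.5, 2.5))
```

Because the model is only expected to match the true system in distribution, such ensemble statistics, rather than path-by-path errors, are the natural quantities to compare.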
![Refer to caption](x8.png)
![Refer to caption](x9.png)
![Refer to caption](x10.png)
4.1.2 OU with Full Control
We then consider the following OU process with control on both drift and diffusion terms:
(22) |
where , and and are the excitation/control signals. To generate training data, we conduct the local parameterization of and with 2nd degree Taylor polynomials, resulting in , . Moreover, we generate training data pairs with initial conditions uniformly sampled from and .
![Refer to caption](x11.png)
![Refer to caption](x12.png)
![Refer to caption](x13.png)
![Refer to caption](x14.png)
![Refer to caption](x15.png)
![Refer to caption](x16.png)
To examine the performance of the learned sFML model, we conduct a simulation with an initial condition and excitations and . (Note that the excitations are not the Taylor polynomials in the training data set.) Some sample solution trajectories are shown in Figure 7. The mean and standard deviation of the solution are shown in Figure 8. In Figure 9, we also show the comparison of the probability distribution of the solution at . We observe good agreement between the sFML model prediction and the ground truth.
4.2 Nonlinear SDEs with Control
We now consider a nonlinear system of SDEs, inspired by an example in Section 2.3.2 of [45]:
(23) |
where and are independent Brownian motions, , , , and the function contains a control signal :
To generate the training data, we simulate the system with sample paths over one time step from initial conditions sampled uniformly in and under controls given by 2nd-degree Taylor polynomials with coefficients sampled from .
![Refer to caption](x17.png)
![Refer to caption](x18.png)
![Refer to caption](x19.png)
![Refer to caption](x20.png)
![Refer to caption](x21.png)
![Refer to caption](x22.png)
![Refer to caption](x23.png)
![Refer to caption](x24.png)
![Refer to caption](x25.png)
![Refer to caption](x26.png)
![Refer to caption](x27.png)
![Refer to caption](x28.png)
For the learned sFML model, we conduct system predictions with an initial condition and . In Figure 10, we plot a few sample phase portraits from ground truth (left), as well as from the sFML model prediction (right). They appear to be visually in agreement. The mean and standard deviation of the system prediction by the sFML model are shown in Figure 11, along with those of the true solution. In Figure 12, we also show the comparison of reference and learned density functions of the test trajectory at time . We observe that the sFML model exhibits good accuracy in these predictions.
4.3 Stochastic Predator-Prey Model
We then consider a stochastic Lotka-Volterra system with a time-dependent excitation :
(24) |
where and are independent Brownian motions, and . The training data are generated by simulating solution samples for one step , from initial conditions in and under excitations of 2nd-degree Taylor polynomials whose coefficients are from .
Once we have the trained model, we conduct system prediction with an initial condition , and excitation . We conduct relatively long-term prediction for time up to . (Note that the training data are of length .) In Figure 13, we plot a few sample phase portraits of the system. Good visual agreement between the sFML prediction and the ground truth can be observed. To examine the accuracy more closely, we present the mean and standard deviation of the system in Figure 14. We observe good predictive accuracy of the sFML model for up to .
![Refer to caption](x29.png)
![Refer to caption](x30.png)
![Refer to caption](x31.png)
![Refer to caption](x32.png)
4.4 Stochastic Resonance
Finally, we consider the following SDE with a double-well potential and excitation,
(25) |
where is a parameter, and is the excitation. When , there is no excitation to the system. The solution would exhibit random transitions between two metastable states and . The transition probability depends on the parameters . When , an excitation is exerted on the system. If the excitation is periodic, under the right circumstances the random transition between the two metastable states becomes synchronized with the periodicity of the excitation, resulting in the so-called stochastic resonance, cf. [4, 2, 3].
Here, we demonstrate that the proposed sFML method can accurately model and predict the long-term system behavior using only very short bursts of measurement data. Our data are trajectories of one step () length, with initial conditions sampled from and under piecewise constant excitations sampled from .
Once the sFML model is trained, we conduct system prediction under various excitations. In particular, we choose , with and . These parameters are chosen according to [4], to ensure the occurrence of stochastic resonance. An exceptionally long-term system prediction is conducted by the sFML model, for time up to . The result is shown in the top of Figure 15, where we also plot the (rescaled) periodic excitation as a light grey line in the background. We can clearly observe the synchronization between the random transition and the periodic excitation, i.e., the stochastic resonance. For reference, we also conduct the sFML system prediction with , i.e., no excitation. The solution, shown in the bottom of Figure 15, exhibits the expected random transition between the two metastable states. We emphasize that in this case the transition probability is very small, . The learned sFML model is capable of capturing such a small probability event. We remark again that the training data are pairwise data separated by one time step. Thus, none of the (long-term) system behaviors can be observed in the training data.
![Refer to caption](x33.png)
![Refer to caption](x34.png)
5 Conclusion
In this paper, we presented a general numerical framework for modeling unknown nonautonomous stochastic systems by using observed trajectory data. To overcome the difficulties brought by the external time-dependent inputs, we transform the original system into a local parametric stochastic system. We accomplish this by locally parameterizing the time-dependent external inputs on several discrete time points. The resulting stochastic system is then driven by a stationary parametric stochastic flow map. A normalizing flow model is devised to approximate the parametric stochastic flow map. By using a comprehensive set of numerical examples, we demonstrated that the proposed approach is effective and accurate in modeling a variety of unknown stochastic systems. The learned model can conduct exceptionally long-term system predictions, subject to arbitrary external excitations that are not contained in the training data.
References
- [1] C. Archambeau, D. Cornford, M. Opper, and J. Shawe-Taylor, Gaussian process approximations of stochastic differential equations, in Gaussian Processes in Practice, N. D. Lawrence, A. Schwaighofer, and J. Quiñonero Candela, eds., vol. 1 of Proceedings of Machine Learning Research, Bletchley Park, UK, 12–13 Jun 2007, PMLR, pp. 1–16, https://proceedings.mlr.press/v1/archambeau07a.html.
- [2] R. Benzi, G. Parisi, A. Sutera, and A. Vulpiani, Stochastic resonance in climatic change, Tellus, 34 (1982), pp. 10–16.
- [3] R. Benzi, G. Parisi, A. Sutera, and A. Vulpiani, A theory of stochastic resonance in climatic change, SIAM J. Appl. Math., 43 (1983), pp. 565–578, https://doi.org/10.1137/0143037.
- [4] R. Benzi, A. Sutera, and A. Vulpiani, The mechanism of stochastic resonance, J. Phys. A, 14 (1981), pp. L453–L457, http://stacks.iop.org/0305-4470/14/L453.
- [5] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci. USA, 113 (2016), pp. 3932–3937, https://doi.org/10.1073/pnas.1517384113.
- [6] S. L. Brunton, J. L. Proctor, and J. N. Kutz, Sparse identification of nonlinear dynamics with control (SINDYc), IFAC-PapersOnLine, 49 (2016), pp. 710–715, https://doi.org/10.1016/j.ifacol.2016.10.249, https://www.sciencedirect.com/science/article/pii/S2405896316318298. 10th IFAC Symposium on Nonlinear Control Systems NOLCOS 2016.
- [7] R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, eds., vol. 31, Curran Associates, Inc., 2018, https://proceedings.neurips.cc/paper_files/paper/2018/file/69386f6bb1dfed68692a24c8686939b9-Paper.pdf.
- [8] X. Chen, J. Duan, J. Hu, and D. Li, Data-driven method to learn the most probable transition pathway and stochastic differential equation, Phys. D, 443 (2023), pp. Paper No. 133559, 15, https://doi.org/10.1016/j.physd.2022.133559.
- [9] X. Chen, L. Yang, J. Duan, and G. E. Karniadakis, Solving inverse stochastic problems from discrete particle observations using the Fokker-Planck equation and physics-informed neural networks, SIAM J. Sci. Comput., 43 (2021), pp. B811–B830, https://doi.org/10.1137/20M1360153.
- [10] Y. Chen and D. Xiu, Learning stochastic dynamical system via flow map operator, J. Comput. Phys., 508 (2024), p. Paper No. 112984, https://doi.org/10.1016/j.jcp.2024.112984.
- [11] V. Churchill and D. Xiu, Flow map learning for unknown dynamical systems: Overview, implementation, and benchmarks, Journal of Machine Learning for Modeling and Computing, 4 (2023), pp. 173–201.
- [12] M. Darcy, B. Hamzi, G. Livieri, H. Owhadi, and P. Tavallali, One-shot learning of stochastic differential equations with data adapted kernels, Phys. D, 444 (2023), pp. Paper No. 133583, 18, https://doi.org/10.1016/j.physd.2022.133583.
- [13] R. Deng, B. Chang, M. A. Brubaker, G. Mori, and A. Lehrmann, Modeling continuous stochastic processes with dynamic normalizing flows, in Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, eds., vol. 33, Curran Associates, Inc., 2020, pp. 7805–7815, https://proceedings.neurips.cc/paper_files/paper/2020/file/58c54802a9fb9526cd0923353a34a7ae-Paper.pdf.
- [14] F. Dietrich, A. Makeev, G. Kevrekidis, N. Evangelou, T. Bertalan, S. Reich, and I. G. Kevrekidis, Learning effective stochastic differential equations from microscopic simulations: linking stochastic numerics to deep learning, Chaos, 33 (2023), pp. Paper No. 023121, 19, https://doi.org/10.1063/5.0113632.
- [15] L. Dinh, J. Sohl-Dickstein, and S. Bengio, Density estimation using real NVP, in International Conference on Learning Representations, 2017, https://openreview.net/forum?id=HkpbnH9lx.
- [16] X. Fu, L.-B. Chang, and D. Xiu, Learning reduced systems via deep neural networks with memory, J. Machine Learning Model. Comput., 1 (2020), pp. 97–118.
- [17] L. Guo, H. Wu, and T. Zhou, Normalizing field flows: Solving forward and inverse stochastic differential equations using physics-informed flow models, Journal of Computational Physics, 461 (2022), p. 111202, https://doi.org/10.1016/j.jcp.2022.111202, https://www.sciencedirect.com/science/article/pii/S0021999122002649.
- [18] T. Haarnoja, K. Hartikainen, P. Abbeel, and S. Levine, Latent space policies for hierarchical reinforcement learning, in Proceedings of the 35th International Conference on Machine Learning, J. Dy and A. Krause, eds., vol. 80 of Proceedings of Machine Learning Research, PMLR, 10–15 Jul 2018, pp. 1851–1860, https://proceedings.mlr.press/v80/haarnoja18a.html.
- [19] S. H. Kang, W. Liao, and Y. Liu, IDENT: identifying differential equations with numerical time evolution, J. Sci. Comput., 87 (2021), pp. Paper No. 1, 27, https://doi.org/10.1007/s10915-020-01404-9.
- [20] I. Kobyzev, S. Prince, and M. Brubaker, Normalizing flows: An introduction and review of current methods, IEEE Trans. Pattern Anal. Machine Intel., 43 (2021), pp. 3964–3979.
- [21] V. Laparra, G. Camps-Valls, and J. Malo, Iterative Gaussianization: From ICA to random rotations, IEEE Transactions on Neural Networks, 22 (2011), pp. 537–549, https://doi.org/10.1109/TNN.2011.2106511.
- [22] Y. Li and J. Duan, A data-driven approach for discovering stochastic dynamical systems with non-Gaussian Lévy noise, Phys. D, 417 (2021), pp. Paper No. 132830, 12, https://doi.org/10.1016/j.physd.2020.132830.
- [23] Y. Li, Y. Lu, S. Xu, and J. Duan, Extracting stochastic dynamical systems with $\alpha$-stable Lévy noise from data, J. Stat. Mech. Theory Exp., (2022), pp. Paper No. 023405, 23, https://doi.org/10.1088/1742-5468/ac4e87.
- [24] Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, Fourier neural operator for parametric partial differential equations, in International Conference on Learning Representations, 2021, https://openreview.net/forum?id=c8P9NQVtmnO.
- [25] H. Lu and D. M. Tartakovsky, Data-driven models of nonautonomous systems, J. Comput. Phys., 507 (2024), p. Paper No. 112976, https://doi.org/10.1016/j.jcp.2024.112976.
- [26] Y. Lu, R. Maulik, T. Gao, F. Dietrich, I. G. Kevrekidis, and J. Duan, Learning the temporal evolution of multivariate densities via normalizing flows, Chaos, 32 (2022), pp. Paper No. 033121, 17, https://doi.org/10.1063/5.0065093.
- [27] T. Müller, B. McWilliams, F. Rousselle, M. Gross, and J. Novák, Neural importance sampling, ACM Trans. Graph., 38 (2019), https://doi.org/10.1145/3341156.
- [28] B. Øksendal, Stochastic differential equations, in Stochastic differential equations, Springer, 2003, pp. 65–84.
- [29] M. Opper, Variational inference for stochastic differential equations, Ann. Phys., 531 (2019), pp. 1800233, 9, https://doi.org/10.1002/andp.201800233.
- [30] H. Owhadi, Computational graph completion, Research in the Mathematical Sciences, 9 (2022), p. 27.
- [31] G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mohamed, and B. Lakshminarayanan, Normalizing flows for probabilistic modeling and inference, J. Mach. Learn. Res., 22 (2021), pp. Paper No. 57, 64.
- [32] G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mohamed, and B. Lakshminarayanan, Normalizing flows for probabilistic modeling and inference, J. Machine Learning Res., 22 (2021), pp. 1–64.
- [33] G. Papamakarios, T. Pavlakou, and I. Murray, Masked autoregressive flow for density estimation, in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds., vol. 30, Curran Associates, Inc., 2017, https://proceedings.neurips.cc/paper_files/paper/2017/file/6c1da886822c67822bcf3679d04369fa-Paper.pdf.
- [34] J. L. Proctor, S. L. Brunton, and J. N. Kutz, Dynamic mode decomposition with control, SIAM J. Appl. Dyn. Syst., 15 (2016), pp. 142–161, https://doi.org/10.1137/15M1013857.
- [35] J. L. Proctor, S. L. Brunton, and J. N. Kutz, Generalizing Koopman theory to allow for inputs and control, SIAM J. Appl. Dyn. Syst., 17 (2018), pp. 909–930, https://doi.org/10.1137/16M1062296.
- [36] T. Qin, Z. Chen, J. D. Jakeman, and D. Xiu, Data-driven learning of nonautonomous systems, SIAM J. Sci. Comput., 43 (2021), pp. A1607–A1624, https://doi.org/10.1137/20M1342859.
- [37] T. Qin, Z. Chen, J. D. Jakeman, and D. Xiu, Deep learning of parameterized equations with applications to uncertainty quantification, Int. J. Uncertain. Quantif., 11 (2021), pp. 63–82, https://doi.org/10.1615/Int.J.UncertaintyQuantification.2020034123.
- [38] T. Qin, K. Wu, and D. Xiu, Data driven governing equations approximation using deep neural networks, J. Comput. Phys., 395 (2019), pp. 620–635, https://doi.org/10.1016/j.jcp.2019.06.042.
- [39] M. Raissi, P. Perdikaris, and G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics, 378 (2019), pp. 686–707, https://doi.org/10.1016/j.jcp.2018.10.045.
- [40] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Multistep neural networks for data-driven discovery of nonlinear dynamical systems, arXiv preprint arXiv:1801.01236, (2018).
- [41] H. Schaeffer and S. G. McCalla, Sparse model selection via integral terms, Phys. Rev. E, 96 (2017), pp. 023302, 7, https://doi.org/10.1103/physreve.96.023302.
- [42] H. Schaeffer, G. Tran, and R. Ward, Extracting sparse high-dimensional dynamics from limited data, SIAM J. Appl. Math., 78 (2018), pp. 3279–3295, https://doi.org/10.1137/18M116798X.
- [43] J. Song, S. Zhao, and S. Ermon, A-NICE-MC: Adversarial training for MCMC, in Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds., vol. 30, Curran Associates, Inc., 2017, https://proceedings.neurips.cc/paper_files/paper/2017/file/2417dc8af8570f274e6775d4d60496da-Paper.pdf.
- [44] Y. Wang, H. Fang, J. Jin, G. Ma, X. He, X. Dai, Z. Yue, C. Cheng, H.-T. Zhang, D. Pu, D. Wu, Y. Yuan, J. Gonçalves, J. Kurths, and H. Ding, Data-driven discovery of stochastic differential equations, Engineering, 17 (2022), pp. 244–252, https://doi.org/10.1016/j.eng.2022.02.007.
- [45] E. Weinan, Principles of multiscale modeling, Cambridge University Press, 2011.
- [46] Z. Xu, Y. Chen, Q. Chen, and D. Xiu, Modeling unknown stochastic dynamical system via autoencoder, arXiv preprint arXiv:2312.10001, (2023).
- [47] L. Yang, C. Daskalakis, and G. E. Karniadakis, Generative ensemble regression: Learning particle dynamics from observations of ensembles with physics-informed deep generative models, SIAM Journal on Scientific Computing, 44 (2022), pp. B80–B99, https://doi.org/10.1137/21M1413018.
- [48] C. Yildiz, M. Heinonen, J. Intosalmi, H. Mannerstrom, and H. Lahdesmaki, Learning stochastic differential equations with Gaussian processes without gradient matching, in 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), IEEE, 2018, pp. 1–6.
- [49] J. Zhang, S. Zhang, and G. Lin, MultiAuto-DeepONet: A multi-resolution autoencoder DeepONet for nonlinear dimension reduction, uncertainty quantification and operator learning of forward and inverse stochastic problems, arXiv preprint arXiv:2204.03193, (2022).
- [50] A. Zhu and Q. Li, DynGMA: a robust approach for learning stochastic differential equations from data, arXiv preprint arXiv:2402.14475, (2024).