Thermodynamics-informed super-resolution of scarce temporal dynamics data

Carlos Bermejo-Barbanoj ESI Group-UZ Chair of the National Strategy on Artificial Intelligence.
Aragon Institute of Engineering Research (I3A). Universidad de Zaragoza. Zaragoza, Spain.
Beatriz Moya CNRS@CREATE LTD. Singapore. Alberto Badías ETSIAE, Universidad Politécnica de Madrid. Madrid, Spain. Francisco Chinesta CNRS@CREATE LTD. Singapore. ESI Group chair. PIMM Lab. ENSAM Institute of Technology. Paris, France. Elías Cueto ESI Group-UZ Chair of the National Strategy on Artificial Intelligence.
Aragon Institute of Engineering Research (I3A). Universidad de Zaragoza. Zaragoza, Spain.
Abstract

We present a method to increase the resolution of measurements of a physical system and subsequently predict its time evolution using thermodynamics-aware neural networks. Our method uses adversarial autoencoders, which reduce the dimensionality of the full order model to a set of latent variables that are forced to match a prior, for example a normal distribution. Adversarial autoencoders are seen as generative models, and they can be trained to generate high-resolution samples from low-resolution inputs, meaning they can address the so-called super-resolution problem. Then, a second neural network is trained to learn the physical structure of the latent variables and predict their temporal evolution. This neural network is known as a structure-preserving neural network. It learns the metriplectic structure of the system and applies a physical bias to ensure that the first and second principles of thermodynamics are fulfilled. The integrated trajectories are decoded to their original dimensionality, as well as to the higher-dimensional space produced by the adversarial autoencoder, and they are compared to the ground truth solution. The method is tested with two examples of flow over a cylinder, where the fluid properties are varied between both examples.

Keywords Deep learning; Super-resolution; Reduced order model; Autoencoder; Thermodynamics; GENERIC; Structure-preserving

1 Introduction

Resolution augmentation techniques, frequently known as super-resolution, aim to enhance the level of detail of data, often an image, by computational means. Their main goal is to produce a high-resolution version of a low-resolution input, improving overall detail and, in some cases, revealing smaller details that may not appear in the original input. Although these techniques have been studied extensively for years by the computer vision community, recent advances in machine learning have given the field an important boost, as they make it possible to generate better-quality high-resolution samples while improving efficiency. Deep learning based approaches, such as Convolutional Neural Networks (CNNs) [1] and Generative Adversarial Networks (GANs) [2, 3, 4], have proved to be an efficient way to augment the resolution of an image, as they are able to learn a mapping from low-resolution to high-resolution images, surpassing more traditional techniques like interpolation-based methods. One field that could benefit from these recent advances in super-resolution is that of predictive digital twins for physical systems. When developing a digital twin of a system, sensors are commonly employed to capture information about the fields of interest. Although sensors provide accurate measurements, their placement is usually restricted to specific areas due to physical limitations, resulting in sparse spatial measurements. Super-resolution techniques can be applied to this partial information to generate dense output fields [5, 6, 7].

While the Big Data paradigm has become famous in news media of all kinds, the reality is that big data is rarely available in engineering applications. Sensors are often expensive, and the storage, curation and subsequent handling of large amounts of data is no easy task. The result is that we are often faced with situations where less information is available than we would like or need. Given this situation, the application of super-resolution techniques to the world of time series forecasting becomes an urgent necessity.

The most widespread super-resolution techniques (mainly in the world of computational imaging) use black-box techniques to generate the missing information. Logically, this provides much better results than simple interpolation. However, such techniques show severe limitations, and new approaches have recently been tried for the prediction of physical phenomena. Since we deal with physical phenomena, a logical and a priori very attractive option is to take advantage of the scientific knowledge developed over centuries of research to complement the missing information. Thus, for example, if we are faced with a fluid mechanics problem, the imposition of the Navier-Stokes equations provides valuable information to achieve a successful super-resolution [8].

Predictive digital twins constitute a natural field of application of these techniques [9, 10]. They aim to predict the time evolution of the real physical system they represent. However, these systems commonly exhibit complex behaviours, which makes their real-time prediction difficult. In order to obtain a complete analysis of the phenomena that describe the behaviour of those systems, physical simulations must be performed using computational tools such as the Finite Element Method (FEM) for solid mechanics and Computational Fluid Dynamics (CFD) for fluid mechanics. These tools require the discretization of the domain into fine meshes, in most cases with millions of degrees of freedom. As a result, simulations are generally very computationally expensive, often making it impossible to obtain near real-time predictions of the time evolution of the system. One approach to overcome this problem is to use model order reduction (MOR) methods, as the solution of the system is often contained in a lower-dimensional space, as stated in the manifold hypothesis [11]. Basic approaches like Proper Orthogonal Decomposition (POD) [12] rely on linear transformations to project the information onto a low-dimensional space, but they usually fail to model complex nonlinear phenomena. The reduced order modelling (ROM) community has developed some techniques to overcome this limitation and obtain nonlinear mappings, like Locally Linear Embedding (LLE) [13] and kernel Principal Component Analysis (k-PCA) [14], but in recent years deep learning based methods [15] have been gaining popularity, with autoencoders [16] being the most common approach. Some works within the ROM community have addressed the multi-fidelity problem, for instance the Non-Intrusive Reduced Basis (NIRB) method [17, 18]. Those methods could benefit from the advantages of super-resolution techniques to obtain high-fidelity data or to enhance the outputs.
Multi-scale problems could also benefit from the interaction of both methods, as the NIRB could handle the large scale information, while the super-resolution could refine the fine details, leading to high resolution results. In the present work we focus on autoencoders, as they have proven their capabilities to produce highly nonlinear manifolds for a wide range of applications that include physical simulations [19, 20]. Moreover, some autoencoder architectures exhibit generative capabilities, which makes them a feasible option to generate high-resolution outputs from low-resolution data.

In order to predict the time evolution of the analysed systems, deep learning approaches can also be applied. While these approaches have classically been seen as black boxes that require large amounts of data and fail to generalize, leading to unreliable predictions, in recent years there has been a growing interest in physics-consistent deep learning. These techniques add some physical knowledge of the system to the neural network to guarantee the physical consistency of the solution, minimizing the amount of data needed and improving generalization capabilities. Some works in this field are based on solving the PDEs that govern the problem, which leads to very accurate results [21, 22]. The main drawback of these methods is that they require some knowledge of the governing equations of the phenomena, and in practical applications these are often not fully known. An alternative approach is thus to enforce more general physics, or physics of a higher epistemic level. In this last case, thermodynamics comes into play as a natural choice when more detailed information is missing. Some approaches impose the so-called GENERIC (General Equation for Non-Equilibrium Reversible-Irreversible Coupling) metriplectic structure [23, 24] of the problem, by means of the so-called Structure-Preserving Neural Networks [25] and Thermodynamics-Informed Graph Neural Networks [26]. These neural networks lead to a thermodynamically consistent prediction that can be applied to both conservative and dissipative systems. Recently, new insights into how the fulfillment of the first and second laws of thermodynamics can be imposed on the learning process have been reported in [27, 28]. Previous works [29] have demonstrated the efficiency of combining model order reduction by autoencoders and time evolution prediction by structure-preserving neural networks, leading to fast and accurate predictions.

The aim of this work is to develop a method to augment the resolution of the low-resolution fields of the state variables of a system and, consequently, to predict a physically consistent evolution of this system in the high-resolution regime. The proposed methodology is very general, as the formulation used to predict the time evolution of the system is valid for a wide variety of dynamical systems, although we focus on fluid mechanics. The resulting high-resolution reconstruction of the system dynamics is guaranteed to fulfill the first and second principles of thermodynamics (energy conservation and non-negative entropy production).

In this way, both the super-resolution of the state variables of our system and the prediction of the time evolution of their dynamics will be carried out under the perspective of the same formalism, the so-called GENERIC equation, whose usefulness and physical correctness for a multitude of phenomena has already been demonstrated in previous works [30].

The structure of the paper is as follows. A description of the problem setup is presented in Section 2. The methodology is presented in Section 3, where both the model order reduction autoencoder and the GENERIC formalism used to predict the evolution are described. In Section 4 two examples are analysed: the flow past a cylinder in a Newtonian and in a non-Newtonian setting. Finally, the conclusions of the paper are discussed in Section 5.

2 Problem statement

In this work we propose a framework to estimate the temporal evolution of a physical system from data, and to augment its spatial resolution, under the assumption of scarce data. We apply super-resolution techniques based on deep learning, making use of the so-called dynamical system equivalence of scientific machine learning [31]. We assume a dynamical system governed by a set of state variables $\boldsymbol{x} \in \mathcal{M} \subseteq \mathbb{R}^{D}$, with $\mathcal{M}$ the state space of these variables, assumed to be a differentiable manifold in $\mathbb{R}^{D}$, thanks to the widespread manifold hypothesis [11]. The full-order model of a physical phenomenon can be expressed as a system of differential equations that give the temporal evolution of a set of state variables $\boldsymbol{x}$,

$$\dot{\boldsymbol{x}} = \frac{d\boldsymbol{x}}{dt} = \boldsymbol{F}\left(\boldsymbol{x}, t\right), \quad t \in \mathcal{I} = \left(0, T\right], \quad \boldsymbol{x}\left(0\right) = \boldsymbol{x}_{0}, \tag{1}$$

where $t$ refers to the time coordinate in the time interval $\mathcal{I}$ and $\boldsymbol{F}\left(\boldsymbol{x}, t\right)$ is an a priori unknown nonlinear function that represents the flow map of the governing variables. The identification of this function $\boldsymbol{F}$ from data is precisely the objective of this work, where we assume that we work in a scarce data scenario.

Since the most common applications of such techniques (such as the aforementioned digital twins) are additionally subject to strong real-time constraints, it is also assumed that there is a need to work on reduced models of the physics under study. The dimensionality reduction procedure looks for a simpler representation of the full-order state vector $\boldsymbol{x}$ through a set of reduced variables (also called latent variables in the machine learning literature) $\boldsymbol{z} \in \mathcal{N} \subseteq \mathbb{R}^{d}$, contained in a manifold with a dimensionality lower than that of the original space $\mathcal{M}$. The mapping between both spaces is denoted by $\phi: \mathcal{M} \subseteq \mathbb{R}^{D} \rightarrow \mathbb{R}^{d}$, where $d \ll D$. An inverse mapping $\phi^{-1}$ undoes the transformation, recovering the information in the full-order space.

The goal of this paper is two-fold. The first objective is to find a mapping $\phi$ for a dynamical system governed by Eq. (1) that allows us to predict its temporal evolution under stringent real-time constraints on a reduced-order manifold, and then to augment the dimensionality of the data back to that of the original, full-order state space manifold. The mapping $\phi$ makes it possible to learn the underlying physics of the system in the reduced space $\mathcal{N}$ and then predict its temporal evolution. In order to obtain a physically consistent prediction, the solution must fulfill the laws of thermodynamics, which are enforced by assuming that its evolution occurs under the GENERIC framework. The second objective is to achieve this while simultaneously augmenting the spatial resolution of the data to generate a dense solution field, thus obtaining a solution space with a higher dimensionality than the original space $\mathcal{M}$.

3 Methodology

The proposed framework splits the problem into two main steps. First, the low-resolution, full order model is encoded (projected) onto a reduced-order manifold (or latent space) with an autoencoder, thus achieving a nonlinear mapping $\phi$. The autoencoder learns a coded representation of the physical system, which makes it possible to work with the data in a compact form. Moreover, the autoencoder is trained to generate high-resolution fields of the state variables from the low-resolution input data.

Then, a structure-preserving neural network is trained with (low resolution, but full order) simulation data so as to obtain a temporal prediction of the evolution of the dynamics of the system. This network predicts the time evolution of the system by using the GENERIC formalism. Finally, these latent variables are projected back by the decoder to both the original manifold of the low-resolution full order model and a higher resolution manifold. A general scheme of this procedure is shown in Fig. 1. In this work, the full order model data has been generated in silico, although this procedure could be applied to measurements coming from a real physical system.

Figure 1: Scheme of the proposed framework. First, an encoder is used to reduce the dimensionality of the problem, obtaining a set of reduced variables or latent code. Then, a structure-preserving neural network (SPNN) is trained to integrate the time evolution of the reduced variables of the system. Thus, given the state of the system at time instant $n$, the net obtains the state at time instant $n + \Delta n$. Finally, the decoder is used to recover the data to its original dimensionality and to generate the output at a resolution that is higher than that of the input.

3.1 Model reduction with Adversarial Autoencoders

An autoencoder is a type of neural network that learns an efficient codification or embedding of data. This results in a dimensionality reduction of the input information into a set of latent variables, which ideally contain the same information as the original data. The classical autoencoder architecture is composed of two basic elements: an encoder, $\varepsilon_{\phi}$, that maps the high-dimensional information into a low-dimensional code, and a decoder, $\delta_{\phi}$, that applies the inverse operation, recovering the information into the original full-order manifold.

$$\varepsilon_{\phi}: \mathbb{R}^{D} \rightarrow \mathbb{R}^{d}, \quad \boldsymbol{z} = \varepsilon_{\phi}\left(\boldsymbol{x}\right), \tag{2}$$
$$\delta_{\phi}: \mathbb{R}^{d} \rightarrow \mathbb{R}^{D}, \quad \hat{\boldsymbol{x}} = \delta_{\phi}\left(\boldsymbol{z}\right). \tag{3}$$

In this work, we use adversarial autoencoders (AAEs) [32]. This kind of autoencoder forces the latent vector to follow a desired distribution or prior, similarly to variational autoencoders (VAEs) [33]. However, instead of predicting a mean and standard deviation to make the latent code follow the prior, the prior is enforced by an additional network called the discriminator. This allows the latent variables to follow not only normal distributions, as in VAEs, but also more complex ones.

The discriminator is a simple neural network, usually a Multilayer Perceptron (MLP), that takes as input the latent code generated by the autoencoder and a random sample drawn from the prior. It compares both to determine how close the latent code is to the prior. As training advances, the latent code produced by the AAE gets closer to the prior, which means that the discriminator finds it harder to discern whether a sample comes from the prior or from the autoencoder.

AAEs are seen as a mix between VAEs and Generative Adversarial Networks (GANs) [2], as they force the latent code to follow a prior but make use of a discriminator to ensure that this prior is matched. Like VAEs and GANs, AAEs are considered generative models. This is a very useful feature for the proposed task, as they can be used to generate a high-resolution output from a low-resolution input. The resolution augmentation is achieved by training the decoder of the AAE to output both the low-dimensional data (same as the input, as in classical autoencoders) and the high-dimensional data, supervising the training with the ground truth information. The AAE scheme can be seen in Fig. 2.
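As an illustration of the two-output decoder described above, the following is a minimal NumPy sketch of a forward pass with untrained random weights. The layer sizes, the `tanh` activations, the shared decoder trunk with two output heads, and all names (`dense`, `mlp`, `forward`, `head_lr`, `head_hr`) are assumptions made for the example, not the architecture used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: D (low-res input), d (latent), D_hr (high-res output).
D, d, D_hr, hidden = 64, 4, 256, 32


def dense(n_in, n_out):
    """Random weights and zero bias for one fully connected layer (untrained)."""
    W = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_in, n_out))
    return W, np.zeros(n_out)


def mlp(x, layers):
    """Apply the layers in order, with tanh on all but the last one."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.tanh(x)
    return x


encoder = [dense(D, hidden), dense(hidden, d)]
decoder_trunk = [dense(d, hidden)]   # shared part of the decoder (assumption)
head_lr = [dense(hidden, D)]         # reconstructs the low-resolution input
head_hr = [dense(hidden, D_hr)]      # generates the high-resolution field


def forward(x):
    """Encode x, then decode to low- and high-resolution outputs."""
    z = mlp(x, encoder)
    h = np.tanh(mlp(z, decoder_trunk))
    return z, mlp(h, head_lr), mlp(h, head_hr)


x = rng.normal(size=(10, D))         # batch of 10 low-resolution snapshots
z, x_hat, X_hat = forward(x)
print(z.shape, x_hat.shape, X_hat.shape)
```

In a trained model both heads would be supervised, the first against the input itself and the second against the high-resolution ground truth.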

The loss function of the autoencoder is therefore composed of three terms:

  • Low-Resolution data loss: The low-resolution output of the autoencoder, $\hat{\boldsymbol{x}}$, must match the ground truth, in this case the input of the network, the low-resolution pressure and velocity fields, $\boldsymbol{x}$. The accuracy of the network is evaluated using the mean squared error:

    $$\mathcal{L}_{\text{mse, LR}}^{\text{AAE}} = \frac{1}{\mathtt{n}_{\text{snap}}} \sum_{i=1}^{\mathtt{n}_{\text{snap}}} \left(\boldsymbol{x}_{i} - \hat{\boldsymbol{x}}_{i}\right)^{2}, \tag{4}$$

    where the subscript $i$ refers to the snapshot number, $i = 1, \ldots, \mathtt{n}_{\text{snap}}$.

  • High-Resolution data loss: The high-resolution output of the autoencoder, $\hat{\boldsymbol{X}}$, must match the ground truth, the high-resolution pressure and velocity fields obtained from the in-silico simulations, $\boldsymbol{X}$. As with the low-resolution data, the accuracy of the autoencoder is evaluated using the mean squared error:

    $$\mathcal{L}_{\text{mse, HR}}^{\text{AAE}} = \frac{1}{\mathtt{n}_{\text{snap}}} \sum_{i=1}^{\mathtt{n}_{\text{snap}}} \left(\boldsymbol{X}_{i} - \hat{\boldsymbol{X}}_{i}\right)^{2}. \tag{5}$$

  • Adversarial loss: The third term of the loss function is the contribution of the discriminator, $\mathcal{L}_{\text{adv}}^{\text{AAE}}$. This term measures how closely the distribution obtained by the encoder matches the proposed prior distribution.

The final loss function of the autoencoder is a weighted sum of all the terms. A hyperparameter, $\lambda_{\text{adv}}^{\text{AAE}}$, is added to control the influence of the adversarial term on the total loss function,

$$\mathcal{L}^{\text{AAE}} = \mathcal{L}_{\text{rec}}^{\text{AAE}} + \lambda_{\text{adv}}^{\text{AAE}} \cdot \mathcal{L}_{\text{adv}}^{\text{AAE}}, \tag{6}$$

where the $\mathcal{L}_{\text{rec}}^{\text{AAE}}$ term represents the reconstruction capabilities of the autoencoder and is composed of the terms associated with the low-resolution and high-resolution fields,

$$\mathcal{L}_{\text{rec}}^{\text{AAE}} = \mathcal{L}_{\text{mse, LR}}^{\text{AAE}} + \mathcal{L}_{\text{mse, HR}}^{\text{AAE}}. \tag{7}$$
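The combination in Eqs. (4) to (7) can be sketched in a few lines of NumPy. The text does not specify the form of the adversarial term, so the sketch assumes a common cross-entropy choice, $-\text{mean}(\log D(\boldsymbol{z}))$, where $D(\boldsymbol{z}) \in (0, 1)$ is the discriminator output on encoded samples. The squared term is read here as a squared Euclidean norm per snapshot. The function names and the default value of `lambda_adv` are hypothetical.

```python
import numpy as np


def mse(target, prediction):
    """Mean over snapshots (rows) of the squared Euclidean error, as in Eqs. (4)-(5)."""
    return float(np.mean(np.sum((target - prediction) ** 2, axis=1)))


def adversarial_loss(disc_out_on_codes, eps=1e-12):
    """Assumed form: -mean(log D(z)); small when the discriminator outputs values near 1."""
    p = np.clip(disc_out_on_codes, eps, 1.0)
    return float(-np.mean(np.log(p)))


def aae_loss(x, x_hat, X, X_hat, disc_out_on_codes, lambda_adv=1e-3):
    """Total loss of Eq. (6), with the reconstruction term of Eq. (7)."""
    rec = mse(x, x_hat) + mse(X, X_hat)
    return rec + lambda_adv * adversarial_loss(disc_out_on_codes)


# Tiny hand-checkable example: two snapshots, 3 low-res and 5 high-res values each.
x, x_hat = np.zeros((2, 3)), np.ones((2, 3))   # low-res error of 3 per snapshot
X, X_hat = np.zeros((2, 5)), np.zeros((2, 5))  # perfect high-res reconstruction
disc_out = np.array([0.5, 0.5])                # discriminator is undecided
total = aae_loss(x, x_hat, X, X_hat, disc_out, lambda_adv=1.0)
print(total)
```

In this example the total is the low-resolution error plus $\log 2$ from the undecided discriminator. In practice $\lambda_{\text{adv}}^{\text{AAE}}$ would be tuned so that neither term dominates.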
Figure 2: Scheme of the adversarial autoencoder (AAE). The autoencoder takes a snapshot of the simulation as input and learns an encoded representation of the data. The decoder recovers the data to its original dimensionality, $\hat{\boldsymbol{x}}_{\text{AAE}}$, and it is also trained to generate a higher-resolution output, $\hat{\boldsymbol{X}}_{\text{AAE}}$, achieving "super-resolution" of the analysed simulation. The discriminator takes as input a sample that follows the proposed distribution or prior (in this work a normal distribution) and the latent code generated by the autoencoder. The discriminator output lies in the range between 0 and 1. An output close to 0 means that the latent code does not match the prior distribution, while an output close to 1 means that the autoencoder is able to produce a latent code that fits the prior distribution.

3.2 Learning the dynamical evolution of the system by Structure-Preserving Neural Network

One of our main interests is to develop a framework that satisfies a priori, by construction, known principles of physics about the phenomenon at hand. This is crucial, as we want our framework to provide credible, robust and accurate predictions to help in fields like decision-making, and this can only be achieved with predictions that fulfill the basic principles of physics. In our approach, this is achieved by using physics principles as inductive bias. An inductive bias is a set of assumptions about the data that prioritise one solution over the rest (precisely, the one fulfilling known physical principles), preventing the learning process from finding a local minimum of the loss function.

Perhaps the most popular method in our community at this moment is that of the so-called Physics-Informed Neural Networks (PINNs) [21], which enforce the fulfillment of a particular partial differential equation that governs the system. However, there are situations where the governing equations are not well known or cannot be applied easily. In other situations, there are models that are well known but nevertheless provide unconvincing results in predicting the evolution of the system. In this case, a very attractive option is to learn only the "ignorance" about the physical behaviour, so that the prediction is the sum of the evolution predicted by the model and the prediction of the learnt ignorance model about the system. This is the approach that has been followed, for example, in [9, 34].

For that reason, we want to guarantee the physical meaning of the solution, but without enforcing any particular physical equation.

For this purpose, we use a structure-preserving neural network (SPNN) [25]. Structure-preserving neural networks refer to a class of methods that are constructed to satisfy some high-level epistemic properties of the problem, for example, the principles of thermodynamics. SPNNs can be applied to conservative and dissipative problems, ensuring that the principles of thermodynamics are satisfied by construction. This property allows us to use the laws of thermodynamics as an inductive bias [35], ensuring the physical consistency of the results.

3.2.1 GENERIC Formalism

To guarantee the physical meaning of the solution, we enforce the "General Equation for Non-Equilibrium Reversible-Irreversible Coupling", usually referred to as the GENERIC formalism [23, 24]. This formalism is a generalization of the classic Hamiltonian formulation to dissipative systems. This approach assumes the reversible or conservative contribution to be of Hamiltonian form, thus requiring an energy function and a Poisson bracket. The irreversible contribution to the energetic balance is generated by the non-equilibrium entropy and an irreversible or friction bracket [36].

The GENERIC formulation of time evolution for non-equilibrium systems, parameterised by a set of state variables $\boldsymbol{z}$ able to describe the evolution of the energy of the system (the choice is thus not unique), is given by:

$$\frac{d\boldsymbol{z}}{dt} = \{\boldsymbol{z}, E\} + \left[\boldsymbol{z}, S\right], \tag{8}$$

where the so-called Poisson bracket $\{\cdot,\cdot\}$ and dissipative bracket $[\cdot,\cdot]$ have been used. For practical use, the bracket notation is often reformulated using two linear operators:

$$\boldsymbol{L}: T^{*}\mathcal{M} \rightarrow T\mathcal{M}, \quad \boldsymbol{M}: T^{*}\mathcal{M} \rightarrow T\mathcal{M}, \tag{9}$$

where $T^{*}\mathcal{M}$ and $T\mathcal{M}$ represent, respectively, the cotangent and tangent bundles of the state space $\mathcal{M}$. The operator $\boldsymbol{L}$ represents the Poisson bracket and must be skew-symmetric, while the operator $\boldsymbol{M}$, the friction matrix, describes the irreversible part of the system and must be positive semidefinite to ensure that the dissipation rate is non-negative. For phenomena involving plasticity, for instance, this approach may not be valid, and a more general form of the dissipative term should be considered, such as the one developed in [37, 38], among other references.
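One simple way to obtain operators with these two algebraic properties from unconstrained matrices (for example, raw network outputs) is sketched below. This is an illustrative construction, assumed here rather than taken from the text; the helper names `skew_operator` and `psd_operator` and the dimension `d` are hypothetical.

```python
import numpy as np


def skew_operator(A):
    """Return L = A - A^T, which is skew-symmetric for any square A."""
    return A - A.T


def psd_operator(B):
    """Return M = B B^T, which is symmetric positive semidefinite for any B."""
    return B @ B.T


rng = np.random.default_rng(0)
d = 4                                   # hypothetical number of state variables
L = skew_operator(rng.normal(size=(d, d)))
M = psd_operator(rng.normal(size=(d, d)))

v = rng.normal(size=d)                  # arbitrary test vector
print(np.allclose(L, -L.T))             # skew-symmetry of L
print(abs(v @ L @ v) < 1e-10)           # quadratic form of a skew matrix vanishes
print(v @ M @ v >= 0.0)                 # quadratic form of a PSD matrix is non-negative
```

Note that this construction only covers skew-symmetry and positive semidefiniteness; any further condition relating the operators to the energy and entropy gradients has to be handled separately.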

Replacing the original bracket formulation in Eq. (8) with the respective operators, the time evolution equation for the state variables $\boldsymbol{z}$ is derived,

$$\frac{d\boldsymbol{z}}{dt} = \boldsymbol{L} \frac{\partial E}{\partial \boldsymbol{z}} + \boldsymbol{M} \frac{\partial S}{\partial \boldsymbol{z}}. \tag{10}$$

The equation is completed by adding the so-called degeneracy conditions:

$$\{S,\boldsymbol{z}\}=\left[E,\boldsymbol{z}\right]=\boldsymbol{0}. \tag{11}$$

The first expression states that the entropy is a degenerate functional of the Poisson bracket, which reflects the reversible nature of the Hamiltonian contribution to the dynamics. The second expression states that the energy is a degenerate functional of the friction matrix, so the total energy of the system is conserved. These conditions can be rewritten in matrix form in terms of the previously defined operators $\boldsymbol{L}$ and $\boldsymbol{M}$, which results in the following degeneracy conditions:

$$\boldsymbol{L}\frac{\partial S}{\partial\boldsymbol{z}}=\boldsymbol{M}\frac{\partial E}{\partial\boldsymbol{z}}=\boldsymbol{0}. \tag{12}$$

The degeneracy conditions, together with the non-negativeness of the irreversible bracket, guarantee that the first (energy conservation) and second (entropy inequality) laws of thermodynamics are fulfilled:

$$\frac{dE}{dt}=0,\quad\frac{dS}{dt}\geq 0. \tag{13}$$
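
The step from Eqs. (10) and (12) to Eq. (13) can be made explicit with the chain rule. The short derivation below is a sketch that additionally assumes $\boldsymbol{M}$ is symmetric, which is the usual convention for the friction matrix but is an assumption here:

```latex
\begin{aligned}
\frac{dE}{dt} &= \left(\frac{\partial E}{\partial\boldsymbol{z}}\right)^{\top}\frac{d\boldsymbol{z}}{dt}
 = \left(\frac{\partial E}{\partial\boldsymbol{z}}\right)^{\top}\boldsymbol{L}\,\frac{\partial E}{\partial\boldsymbol{z}}
 + \left(\boldsymbol{M}\frac{\partial E}{\partial\boldsymbol{z}}\right)^{\top}\frac{\partial S}{\partial\boldsymbol{z}} = 0, \\
\frac{dS}{dt} &= \left(\frac{\partial S}{\partial\boldsymbol{z}}\right)^{\top}\frac{d\boldsymbol{z}}{dt}
 = -\left(\boldsymbol{L}\frac{\partial S}{\partial\boldsymbol{z}}\right)^{\top}\frac{\partial E}{\partial\boldsymbol{z}}
 + \left(\frac{\partial S}{\partial\boldsymbol{z}}\right)^{\top}\boldsymbol{M}\,\frac{\partial S}{\partial\boldsymbol{z}} \geq 0.
\end{aligned}
```

In the first line, the quadratic form of a skew-symmetric matrix vanishes and the second term vanishes by the degeneracy condition on $\boldsymbol{M}$. In the second line, the first term vanishes by the degeneracy condition on $\boldsymbol{L}$, and the remaining quadratic form is non-negative because $\boldsymbol{M}$ is positive semidefinite.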

3.2.2 Structure-Preserving Neural Networks

The structure-preserving neural networks impose the GENERIC formalism to guarantee the thermodynamical consistency of the solution. In order to work with the data coming from the simulation, the GENERIC formalism is discretized along time intervals $\Delta t$,

$$\frac{\boldsymbol{z}_{n+1}-\boldsymbol{z}_{n}}{\Delta t}=\mathsf{L}_{n}\left(\frac{\mathsf{D}E}{\mathsf{D}\boldsymbol{z}}\right)_{n}+\mathsf{M}_{n}\left(\frac{\mathsf{D}S}{\mathsf{D}\boldsymbol{z}}\right)_{n}, \tag{14}$$

where we employ the subscript $n$ to refer to time instant $t=n\Delta t$ and, therefore, $n+1$ to refer to $t+\Delta t=(n+1)\Delta t$.

In this scheme the time derivative is approximated by a forward-Euler difference with time increment $\Delta t$. The accuracy and stability of different time discretisations of the GENERIC equation have been analysed in depth in [39, 40]. The Poisson and friction operators are discretized as $\mathsf{L}_{n}$ and $\mathsf{M}_{n}$. Similarly, the energy and entropy gradients are discretized as $\left(\frac{\mathsf{D}E}{\mathsf{D}\boldsymbol{z}}\right)_{n}$ and $\left(\frac{\mathsf{D}S}{\mathsf{D}\boldsymbol{z}}\right)_{n}$. Eq. (14) can be rearranged into the proposed integration scheme, which predicts the temporal evolution of the system:

$$\boldsymbol{z}_{n+1}=\boldsymbol{z}_{n}+\Delta t\cdot\left(\mathsf{L}_{n}\left(\frac{\mathsf{D}E}{\mathsf{D}\boldsymbol{z}}\right)_{n}+\mathsf{M}_{n}\left(\frac{\mathsf{D}S}{\mathsf{D}\boldsymbol{z}}\right)_{n}\right). \tag{15}$$

Additionally, discretized degeneracy conditions are added to ensure the thermodynamical consistency of the prediction:

$$\mathsf{L}_{n}\left(\frac{\mathsf{D}S}{\mathsf{D}\boldsymbol{z}}\right)_{n}=\boldsymbol{0},\quad\mathsf{M}_{n}\left(\frac{\mathsf{D}E}{\mathsf{D}\boldsymbol{z}}\right)_{n}=\boldsymbol{0}. \tag{16}$$
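
As a concrete illustration of Eqs. (15) and (16), the following minimal Python sketch applies one forward-Euler GENERIC step to a toy damped oscillator with an entropy variable. The toy system, the function names and the value of `gamma` are illustrative assumptions and are not taken from the simulations in this work; in the proposed method the operators and gradients are predicted by the network rather than written by hand.

```python
def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def generic_euler_step(z, L, M, dE, dS, dt):
    """One forward-Euler step of Eq. (15): z + dt * (L dE + M dS)."""
    rev = matvec(L, dE)  # reversible (Hamiltonian) contribution
    irr = matvec(M, dS)  # irreversible (dissipative) contribution
    return [zi + dt * (r + i) for zi, r, i in zip(z, rev, irr)]


def damped_oscillator_operators(z, gamma=0.1):
    """Toy GENERIC system (illustrative assumption, not the paper's model).

    State z = (q, p, s) with energy E = q^2/2 + p^2/2 + s and entropy S = s.
    """
    q, p, _ = z
    L = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    M = [[0.0, 0.0, 0.0],
         [0.0, gamma, -gamma * p],
         [0.0, -gamma * p, gamma * p * p]]
    dE = [q, p, 1.0]      # gradient of E with respect to z
    dS = [0.0, 0.0, 1.0]  # gradient of S with respect to z
    return L, M, dE, dS


z0 = [1.0, 0.5, 0.0]
L, M, dE, dS = damped_oscillator_operators(z0)
# Residuals of both degeneracy conditions in Eq. (16), concatenated.
degeneracy = matvec(L, dS) + matvec(M, dE)
z1 = generic_euler_step(z0, L, M, dE, dS, dt=0.005)
```

For this state the degeneracy residuals vanish and the step gives approximately `z1 = [1.0025, 0.49475, 0.000125]`, so the entropy variable increases, as required by the second law.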

The GENERIC structure is imposed on the encoded space learnt by the adversarial autoencoder, similarly to [29, 41]. The SPNN is a feed-forward neural network composed of a set of fully connected layers. The input of the net is the encoded state vector at a given time step, $\boldsymbol{z}_{n}^{\text{AAE}}$. The output of the net is a vector containing the predicted $\mathsf{L}_{n}$ and $\mathsf{M}_{n}$ matrices, as well as the predicted energy and entropy gradients, $\left(\frac{\mathsf{D}E}{\mathsf{D}\boldsymbol{z}}\right)_{n}$ and $\left(\frac{\mathsf{D}S}{\mathsf{D}\boldsymbol{z}}\right)_{n}$.

In practice, to enforce the skew-symmetry of $\boldsymbol{L}$ and the positive semi-definiteness of $\boldsymbol{M}$, the network outputs the entries of a pair of matrices $\boldsymbol{l}$ and $\boldsymbol{m}$, reshaped as lower-triangular matrices, from which the operators are built as

$$\boldsymbol{L}=\boldsymbol{l}-\boldsymbol{l}^{\top},\qquad\boldsymbol{M}=\boldsymbol{m}\boldsymbol{m}^{\top}. \tag{17}$$

Then, using the integration scheme shown in Eq. (15), the reduced-space state vector at the next time step, $\boldsymbol{z}_{n+1}^{\text{SPNN}}$, is obtained.
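
The construction in Eq. (17) can be sketched in plain Python as follows. The sketch assumes that $\boldsymbol{l}$ is strictly lower-triangular and that $\boldsymbol{m}$ is lower-triangular including its diagonal; this is an inference, made because it gives $d^2$ matrix entries plus $2d$ gradient entries, which matches the output size $d(d+2)$ reported in Section 4. The function name and the ordering of the entries in the flat output vector are hypothetical.

```python
def split_spnn_output(out, d):
    """Split a flat output of length d*(d+2) into L, M, dE and dS.

    The ordering (l entries, m entries, dE, dS) is a hypothetical choice.
    """
    n_l = d * (d - 1) // 2  # strictly lower-triangular entries of l
    n_m = d * (d + 1) // 2  # lower-triangular entries of m, with diagonal
    assert len(out) == n_l + n_m + 2 * d == d * (d + 2)

    l = [[0.0] * d for _ in range(d)]
    m = [[0.0] * d for _ in range(d)]
    it = iter(out)
    for i in range(d):
        for j in range(i):
            l[i][j] = next(it)
    for i in range(d):
        for j in range(i + 1):
            m[i][j] = next(it)
    rest = list(it)
    dE, dS = rest[:d], rest[d:]

    # Eq. (17): L = l - l^T is skew-symmetric, M = m m^T is positive semidefinite.
    L = [[l[i][j] - l[j][i] for j in range(d)] for i in range(d)]
    M = [[sum(m[i][k] * m[j][k] for k in range(d)) for j in range(d)]
         for i in range(d)]
    return L, M, dE, dS


d = 5
out = [0.1 * (k + 1) for k in range(d * (d + 2))]  # stand-in for a network output
L, M, dE, dS = split_spnn_output(out, d)
```

Because the two properties hold by construction for any values of the raw outputs, they do not need to be penalized in the loss; only the degeneracy conditions of Eq. (16) do.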

Figure 3: Scheme of the structure-preserving neural network (SPNN). The SPNN is trained to predict the full time evolution of the latent variables generated by the AAE by applying the GENERIC structure of the underlying physics of the problem. The network takes the current snapshot as input and outputs the $\boldsymbol{L}$ and $\boldsymbol{M}$ matrices, as well as the energy and entropy gradients. Then, they are integrated following the GENERIC formalism, as shown in Eq. (15), and the latent variables of the next snapshot are obtained. This process can be done iteratively, obtaining the rollout prediction of the full simulation.

The loss function used to train the SPNN is composed of two terms:

  • Data loss: The output of the integration scheme, $\boldsymbol{z}_{n+1}^{\text{SPNN}}$, must match the ground truth, in this case the encoded state vector $\boldsymbol{z}_{n+1}^{\text{AAE}}$ produced by the autoencoder. The accuracy of the network is evaluated using the mean squared error:

    $$\mathcal{L}_{\text{mse}}^{\text{SPNN}}=\frac{1}{\mathtt{n}_{\text{snap}}}\sum_{n=0}^{\mathtt{n}_{\text{snap}}}\left(\boldsymbol{z}_{n+1}^{\text{AAE}}-\boldsymbol{z}_{n+1}^{\text{SPNN}}\right)^{2}. \tag{18}$$
  • Degeneracy conditions loss: The loss function includes the fulfilment of the degeneracy conditions, ensuring the thermodynamical consistency of the solution. They are measured as the sum of the squared values of both conditions:

    $$\mathcal{L}_{\text{degen}}^{\text{SPNN}}=\frac{1}{\mathtt{n}_{\text{snap}}}\sum_{n=0}^{\mathtt{n}_{\text{snap}}}\left(\left(\mathsf{L}_{n}\left(\frac{\mathsf{D}S}{\mathsf{D}\boldsymbol{z}}\right)_{n}\right)^{2}+\left(\mathsf{M}_{n}\left(\frac{\mathsf{D}E}{\mathsf{D}\boldsymbol{z}}\right)_{n}\right)^{2}\right). \tag{19}$$

The final loss function is a weighted sum of both terms. A hyperparameter $\lambda_{\text{mse}}^{\text{SPNN}}$ is introduced to weight the data term and balance the two contributions,

$$\mathcal{L}^{\text{SPNN}}=\lambda_{\text{mse}}^{\text{SPNN}}\cdot\mathcal{L}_{\text{mse}}^{\text{SPNN}}+\mathcal{L}_{\text{degen}}^{\text{SPNN}}. \tag{20}$$
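
The two loss terms and their weighted sum can be sketched in plain Python as below. The sketch reads the squared vectors in Eqs. (18) and (19) as squared Euclidean norms and averages over the snapshots provided; both choices, as well as the function names, are assumptions made for illustration. In the toy call the degeneracy conditions are deliberately violated so that both terms are non-zero.

```python
def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def sq_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in v)


def spnn_loss(z_aae_next, z_spnn_next, Ls, Ms, dEs, dSs, lambda_mse):
    """Return (total, mse, degen): Eq. (20) and its two terms."""
    n = len(z_aae_next)
    # Data term, Eq. (18): mismatch between encoded and integrated states.
    mse = sum(sq_norm([a - b for a, b in zip(za, zs)])
              for za, zs in zip(z_aae_next, z_spnn_next)) / n
    # Degeneracy term, Eq. (19): residuals of L dS = 0 and M dE = 0.
    degen = sum(sq_norm(matvec(L, dS)) + sq_norm(matvec(M, dE))
                for L, M, dE, dS in zip(Ls, Ms, dEs, dSs)) / n
    return lambda_mse * mse + degen, mse, degen


# One toy snapshot with d = 2 (values chosen only for illustration).
z_aae = [[1.0, 2.0]]
z_spnn = [[1.1, 1.8]]
Ls = [[[0.0, 1.0], [-1.0, 0.0]]]
Ms = [[[1.0, 0.0], [0.0, 0.0]]]
dEs = [[0.2, 3.0]]
dSs = [[0.0, 0.1]]
total, mse, degen = spnn_loss(z_aae, z_spnn, Ls, Ms, dEs, dSs, lambda_mse=1e2)
```

With a data weight of $10^{2}$, a data mismatch counts a hundred times more than a degeneracy residual of the same size, so the weight controls how strongly the fit to the data is favoured over the soft thermodynamic constraint.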

4 Results

4.1 Example 1: Flow past a cylinder of a Newtonian fluid

4.1.1 Database generation

The first example consists of an unsteady flow past a cylindrical obstacle. The geometry of the obstacle is fixed for all cases and the flow conditions are varied by modifying the freestream velocity, which results in a variable Reynolds number and generates a Kármán vortex street that exhibits periodic behaviour in the stationary regime. The state variables for the flow past a cylinder are the velocity and pressure fields,

$$\mathcal{S}=\{\boldsymbol{x}=\left(\boldsymbol{u},P\right)\in\mathbb{R}^{2}\times\mathbb{R}\}. \tag{21}$$

The ground truth simulations are computed by solving the 2D Navier-Stokes equations with OpenFOAM [42]. A no-slip condition is applied on the cylinder surface. The fluid is assumed to be Newtonian, with density $\rho=1$ and dynamic viscosity $\mu=10^{-3}$. The freestream velocity lies within the interval $\boldsymbol{u}\in\left[0.8,3.4\right]$, resulting in a total of $N_{\text{sim}}=27$ cases. Each case is discretized in $\mathtt{n}_{\text{snap}}=800$ time increments of $\Delta t=0.005$.

The input of the autoencoder is the low-resolution velocity and pressure fields, with size $3\times16\times48$, while the output is the velocity (two components) and pressure (a scalar) fields at the original resolution and at a higher one, with sizes $3\times16\times48$ and $3\times64\times192$, respectively. Both the encoder and decoder use convolutional layers with $N_{\text{ch}}=64$ channels and a kernel size of $k=3$, following a ResNet-like structure [43]. The number of latent variables at the bottleneck is set to $d=5$. The activation function is the Leaky-ReLU with a negative slope of $0.1$, except for the last layer of both the encoder and decoder, where linear activations are used. The adversarial weight hyperparameter is set to $\lambda_{\text{adv}}^{\text{AAE}}=10^{-3}$. The optimizer is Adam [44], with a learning rate of $l_{r}^{\text{AAE}}=10^{-4}$ that decreases by one order of magnitude at epochs 600 and 1200, a weight decay of $w_{d}^{\text{AAE}}=10^{-4}$, and a total of $N_{\text{epochs}}=1800$ epochs. The latent variables obtained at the bottleneck are then used as input for the structure-preserving neural network which, as explained before, operates in the latent manifold. The training and validation loss curves for the autoencoder are shown in Fig. 4(a).

The SPNN input size coincides with the AAE latent dimension, $N_{\text{in}}^{\text{SPNN}}=d=5$, while the output size is $N_{\text{out}}^{\text{SPNN}}=d\cdot\left(d+2\right)=35$, see Eq. (17). The SPNN has $N_{\text{hl}}^{\text{SPNN}}=5$ hidden layers with 100 neurons each, with Leaky-ReLU activations in the hidden layers and a linear activation in the last layer. The data weight hyperparameter of Eq. (20) is set to $\lambda_{\text{mse}}^{\text{SPNN}}=10^{2}$. The SPNN is trained for $N_{\text{epochs}}=4500$ epochs using the Adam optimizer. The learning rate is set to $l_{\text{r}}^{\text{SPNN}}=10^{-3}$, decreasing by one order of magnitude at epochs 1500 and 3000. The weight decay is set to $w_{\text{d}}^{\text{SPNN}}=10^{-4}$ and noise with variance $\sigma_{\text{noise}}^{2}=10^{-6}$ is added to the training set. Fig. 4(b) shows the training and validation curves for the SPNN.
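
The stepwise learning-rate decay described above can be written as a small function. This is a sketch only: the function name is hypothetical, and whether the drop takes effect exactly at the milestone epoch or one epoch later is an assumption. In PyTorch, a schedule of this kind is typically obtained with `torch.optim.lr_scheduler.MultiStepLR` and `gamma=0.1`, although the text does not state which scheduler was used.

```python
def step_lr(epoch, base_lr, milestones, factor=0.1):
    """Learning rate after one decay by `factor` per milestone already reached."""
    n_drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** n_drops


# SPNN schedule of this example: 1e-3, dropping at epochs 1500 and 3000.
spnn_lr = [step_lr(e, 1e-3, (1500, 3000)) for e in (0, 1499, 1500, 3000, 4499)]
```

The sampled values are approximately `1e-3, 1e-3, 1e-4, 1e-5, 1e-5`, so the last third of the training runs at a learning rate two orders of magnitude below the initial one.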

Figure 4: Training and validation loss curves for (a) the adversarial autoencoder (left) and (b) the SPNN (right).

4.1.2 Results

Fig. 6 shows the pressure and velocity fields predicted by the autoencoder in low and high resolution, as well as the absolute error for each field. The AAE prediction shows good agreement with the ground truth, both for the reconstructed low-resolution fields and for the generated high-resolution ones. Fig. 9 shows a box plot of the error for the train and test sets; the mean error is lower than 3% for the pressure and velocity fields in both low and high resolution.
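
For reference, a relative L2 error of the kind reported in Fig. 9 can be computed per field and per snapshot as sketched below. The exact normalization used for the figures is not spelled out in this section, so the definition $\lVert\hat{x}-x\rVert_2/\lVert x\rVert_2$ and the function name are assumptions.

```python
import math


def relative_l2_error(pred, truth):
    """||pred - truth||_2 / ||truth||_2 for two equally sized flattened fields."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)))
    den = math.sqrt(sum(t ** 2 for t in truth))
    return num / den


truth = [3.0, 4.0]  # toy field values, for illustration only
pred = [3.0, 4.1]
err = relative_l2_error(pred, truth)
```

In this toy case the error is about 0.02, that is, 2% of the norm of the reference field.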

Figure 6: Results of the prediction made by the adversarial autoencoder (AAE). 6(a): Low-resolution ground truth (GT), AAE prediction and absolute error for $P$, $U_{x}$ and $U_{y}$. The ground truth is the input of the AAE, which predicts the low-resolution fields and the high-resolution fields shown in Fig. 6(b). 6(b): High-resolution GT, AAE prediction and absolute error fields for the same snapshot shown in Fig. 6(a).
Figure 8: Results of the prediction made by the AAE (cont.). 8(a): Low-resolution fields for a different snapshot. 8(b): High-resolution GT, AAE prediction and absolute error fields for the snapshot shown in Fig. 8(a).
Figure 9: Box plots of the relative L2 error of the autoencoder for all the snapshots of the Newtonian fluid, for both train and test cases, in low (left) and high (right) resolution. The state variables represented are pressure ($P$) and velocity ($U_{x}$ and $U_{y}$).

To assess the benefit of the proposed method, we compare it with a classical resolution augmentation technique that is common in the computer vision field: bicubic interpolation. The comparison is shown in Table 1. The AAE yields lower errors than bicubic interpolation for every field while being considerably faster, with a speed-up of about $37\times$.

Table 1: Comparison between the proposed method and bicubic interpolation. For each variable, the mean relative error is shown, together with the reconstruction time for the whole dataset.

| Quantity | AAE | Bicubic interpolation |
|---|---|---|
| $P$ (-) | 0.0247 | 0.0621 |
| $U_{x}$ (-) | 0.0105 | 0.0285 |
| $U_{y}$ (-) | 0.0196 | 0.0524 |
| Time (s) | 711.05 | 26303.31 |
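
The speed-up and the error ratios follow directly from the values in Table 1; the short script below simply redoes that arithmetic. The variable names are illustrative.

```python
# (AAE, bicubic) mean relative errors copied from Table 1.
errors = {
    "P": (0.0247, 0.0621),
    "U_x": (0.0105, 0.0285),
    "U_y": (0.0196, 0.0524),
}
# (AAE, bicubic) reconstruction times for the whole dataset, in seconds.
times = (711.05, 26303.31)

speedup = times[1] / times[0]
error_ratios = {name: bicubic / aae for name, (aae, bicubic) in errors.items()}
```

The time ratio rounds to 37, and the bicubic error is roughly 2.5 to 2.7 times the AAE error for each of the three fields.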

Fig. 10 shows the comparison between the AAE latent variables and the rollout prediction made by the SPNN for the case with input velocity $u=3.4$, the worst-case scenario among all those considered. The SPNN is able to integrate the latent variables in the reduced space in good agreement with the original AAE encoding, which is considered the ground truth for the SPNN.

The SPNN is also compared with a black box (BB) neural network. The black box neural network predicts the increment of the latent variables $\boldsymbol{z}$ and uses a forward Euler integration scheme to obtain the next snapshot of the simulation. The black box is trained using the same hyperparameters as the SPNN, except for the output size, which has the same dimension as the input; for this example, $N_{\text{in}}^{\text{BB}}=N_{\text{out}}^{\text{BB}}=5$. Results are compared in Fig. 11, which shows the mean and standard deviation of the accumulated velocity error, with a 95% confidence interval, for every simulation at each snapshot. The SPNN error (Fig. 11(a)) rises as the prediction advances, which is expected because forward Euler is a first-order integration scheme, but the network is still able to converge to the solution. Meanwhile, the black box neural network (Fig. 11(b)) is not able to integrate the predicted trajectory and diverges from the ground truth, which indicates that the thermodynamic bias guides the network towards a meaningful solution.
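
The rollout procedure used for both networks can be sketched as a loop that feeds each prediction back as the next input. In the sketch below, `toy_rate` is a hypothetical stand-in for a trained black-box network (a harmonic oscillator right-hand side, chosen only because its exact energy is known); it is not the model used in this work. It illustrates how an explicit Euler rollout accumulates error even when the predicted rate is exact.

```python
def rollout(step, z0, n_steps):
    """Apply `step` repeatedly, feeding each prediction back as the next input."""
    traj = [list(z0)]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return traj


def make_black_box_step(predict_rate, dt):
    """Unconstrained update z_{n+1} = z_n + dt * f(z_n)."""
    def step(z):
        return [zi + dt * fi for zi, fi in zip(z, predict_rate(z))]
    return step


def toy_rate(z):
    """Hypothetical stand-in for a network: dq/dt = p, dp/dt = -q."""
    return [z[1], -z[0]]


traj = rollout(make_black_box_step(toy_rate, dt=0.005), [1.0, 0.0], n_steps=800)
energy = [0.5 * (q * q + p * p) for q, p in traj]
```

For this toy system the energy grows by a factor $(1+\Delta t^{2})$ at every step, which amounts to roughly 2% after 800 steps, although the exact dynamics conserve it.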

Figure 10: Results of the SPNN integration with respect to the ground truth (GT) latent variables obtained by the AAE for two different simulations of the Newtonian fluid. The dashed line corresponds to the SPNN prediction while the continuous line is the ground truth. Each latent variable is represented by a distinct color to simplify the comparison between the ground truth and the SPNN prediction. Left: simulation at $\mathrm{U_{in}}=1.4$. Right: simulation at $\mathrm{U_{in}}=3.4$.
Figure 11: Evolution of the relative error of the velocity during the whole integration, for all the simulations: (a) SPNN, (b) black box. Although the SPNN error accumulates during the rollout as a consequence of the explicit Euler integration scheme employed, the SPNN is able to converge to a meaningful solution, whereas the black box approach fails during the rollout and diverges from the solution.

The ground truth simulations were performed on a MacBook Pro M1 Pro. Each simulation took around 20 minutes to complete. The AAE and the SPNN were trained using the PyTorch framework. The computer used to train both networks was a Linux-based machine equipped with an Intel i9-13900K CPU and an NVIDIA RTX 4090 GPU. The AAE training time was approximately 4 hours, while the SPNN took around 20 minutes to train. At inference time, the prediction of the latent variables is obtained in 1-2 seconds on a MacBook Pro M1 Pro, while rendering the video for the complete simulation takes around 15 seconds, a considerable speed-up compared with the computational cost of running the high-fidelity simulation.

4.2 Example 2: Flow past a cylinder of a non-Newtonian fluid

4.2.1 Database generation

The second example uses the same geometry as the previous one, but the fluid is replaced by a non-Newtonian fluid. As in the previous case, the flow conditions are obtained by varying the initial velocity of the flow, which results in different Reynolds numbers. The state variables for the non-Newtonian flow past a cylinder are the velocity, shear rate and pressure fields, although good prediction results can be achieved by using only the pressure and velocity fields obtained from the solver:

$$\mathcal{S}=\{\boldsymbol{x}=\left(\boldsymbol{u},\dot{\gamma},P\right)\in\mathbb{R}^{2}\times\mathbb{R}\times\mathbb{R}\}. \tag{22}$$

Ground truth simulations are obtained by solving the 2D Navier-Stokes equations using OpenFOAM [42]. A no-slip condition is applied at the wall of the cylinder. In this example, a non-Newtonian fluid behaviour is applied using the Herschel-Bulkley model in OpenFOAM, defined by the following parameters: $\rho=1$, $\nu_{0}=0.00125$, $\tau_{0}=0.00125$, $k=0.015625$, $n=1.88$. The freestream velocity lies within the interval $\boldsymbol{u}\in\left[1.0,2.0\right]$, with speed increments of $0.1$, which results in a total of $N_{\text{sim}}=11$ cases. Each simulation is discretized in $\mathtt{n}_{\text{snap}}=600$ time increments of $\Delta t=0.005$.
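
As a quick consistency check of the figures quoted above, the number of cases and the simulated time span per case can be recomputed as follows. The variable names are illustrative, and the simulated time is expressed in the same (unstated) units as $\Delta t$.

```python
u_min, u_max, du = 1.0, 2.0, 0.1
n_sim = round((u_max - u_min) / du) + 1
# Rounding removes floating-point noise such as 1.2000000000000002.
velocities = [round(u_min + i * du, 10) for i in range(n_sim)]

n_snap, dt = 600, 0.005
simulated_time = n_snap * dt
```

This reproduces the 11 freestream velocities from 1.0 to 2.0 and gives a simulated span of about 3 time units per case.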

The input of the autoencoder is the low-resolution velocity and pressure fields, with size $3\times16\times48$, while the output is the velocity and pressure fields at the original resolution and at a higher one, with sizes $3\times16\times48$ and $3\times64\times192$, respectively. Both the encoder and decoder use convolutional layers with $N_{\text{ch}}=64$ channels and a kernel size of $k=3$, following a ResNet-like structure [43]. The number of latent variables at the bottleneck is set to $d=6$. The activation function is the Leaky-ReLU with a negative slope of $0.1$, except for the last layer of both the encoder and decoder, where linear activations are used. The adversarial weight hyperparameter is set to $\lambda_{\text{adv}}^{\text{AAE}}=10^{-3}$. The optimizer is Adam [44], with a learning rate of $l_{\text{r}}^{\text{AAE}}=10^{-4}$ that decreases by one order of magnitude at epochs 600 and 1200, a weight decay of $w_{\text{d}}^{\text{AAE}}=10^{-6}$, and a total of $N_{\text{epochs}}=1800$ epochs. The latent variables obtained at the bottleneck are then used as input for the structure-preserving neural network.
Fig. 12(a) shows the training and validation loss for the adversarial autoencoder.

The SPNN input size coincides with the AAE latent dimension, $N_{\text{in}}^{\text{SPNN}}=d=6$, while the output size is $N_{\text{out}}^{\text{SPNN}}=d\cdot(d+2)=48$. The SPNN has $N_{\text{hl}}^{\text{SPNN}}=5$ hidden layers of 120 neurons each, with leaky-ReLU activations and a linear activation in the last layer. The data weight hyperparameter is set to $\lambda_{\text{data}}^{\text{SPNN}}=10^{2}$. The SPNN is trained for $N_{\text{epochs}}=6000$ epochs using the Adam optimizer and a batch size of $B_{\text{size}}=128$. The learning rate is set to $l_{\text{r}}^{\text{SPNN}}=10^{-3}$ and decreased by one order of magnitude at epochs 2000 and 4000. The weight decay is set to $w_{\text{d}}^{\text{SPNN}}=10^{-4}$, and noise with variance $\sigma_{\text{noise}}^{2}=10^{-5}$ is added to the training set. Fig. 12(b) shows the training and validation loss for the SPNN.
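One way to read the output size $d\cdot(d+2)$ is as the number of entries needed to build the GENERIC operators and gradients: $d(d-1)/2$ entries for a skew-symmetric $L$, $d(d+1)/2$ entries for a symmetric positive semi-definite $M$, and $d$ entries each for the energy and entropy gradients. The sketch below illustrates this reading together with a forward Euler update of $\dot{z}=L\,\nabla E+M\,\nabla S$. The split of the output vector, the function names and the time step are assumptions made for illustration; they are not taken from the implementation used in this work, and the degeneracy conditions $L\,\nabla S=0$ and $M\,\nabla E=0$ are not enforced here.

```python
import random

# Minimal sketch (assumed parametrization, hypothetical names): one possible way
# to interpret a flat network output of length d*(d+2) as GENERIC building blocks.

def assemble_generic(out, d):
    """Split `out` into skew-symmetric L, PSD M, and gradients dE, dS."""
    n_l = d * (d - 1) // 2   # strictly lower-triangular entries used for L
    n_m = d * (d + 1) // 2   # lower-triangular entries (with diagonal) used for M
    assert len(out) == n_l + n_m + 2 * d == d * (d + 2)
    it = iter(out)
    l = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i):
            l[i][j] = next(it)
    m = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            m[i][j] = next(it)
    # L = l - l^T is skew-symmetric; M = m m^T is symmetric positive semi-definite.
    L = [[l[i][j] - l[j][i] for j in range(d)] for i in range(d)]
    M = [[sum(m[i][k] * m[j][k] for k in range(d)) for j in range(d)]
         for i in range(d)]
    dE = [next(it) for _ in range(d)]
    dS = [next(it) for _ in range(d)]
    return L, M, dE, dS

def matvec(A, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def euler_step(z, out, dt):
    """Forward Euler update z + dt * (L dE + M dS)."""
    L, M, dE, dS = assemble_generic(out, len(z))
    reversible = matvec(L, dE)
    irreversible = matvec(M, dS)
    return [zi + dt * (r + s) for zi, r, s in zip(z, reversible, irreversible)]

if __name__ == "__main__":
    random.seed(0)
    d = 6
    out = [random.uniform(-1.0, 1.0) for _ in range(d * (d + 2))]  # stand-in for a network output
    z_next = euler_step([0.0] * d, out, dt=1e-2)  # dt is an arbitrary illustrative value
    print(len(out), z_next)
```

By construction, $\nabla E^{\top}L\,\nabla E=0$ and $\nabla S^{\top}M\,\nabla S\geq 0$ for any output vector, which is the kind of structural property that a black-box network with $d$ outputs does not provide.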

Figure 12: Training and validation loss curves for the non-Newtonian example: (a) adversarial autoencoder (left) and (b) SPNN (right).

4.2.2 Results

Fig. 14 shows the pressure and velocity fields predicted by the autoencoder at both low and high resolution, together with the absolute error for each field. The low- and high-resolution fields reconstructed by the autoencoder show good agreement with the ground truth fields obtained from the CFD simulation. Box plots of the error of the state variables for the train and test cases are shown in Fig. 17; the mean error is lower than 3% for the pressure and velocity fields at both low and high resolution.

Figure 14: Results of the AAE for two different snapshots. 14(a): Low-resolution ground truth (GT), AAE prediction and absolute error for the $P$, $U_{x}$ and $U_{y}$ fields for the first snapshot. 14(b): High-resolution GT, AAE prediction and absolute error fields for the same snapshot shown in Fig. 14(a). 16(a): Low-resolution fields for the second snapshot. 16(b): High-resolution fields for the snapshot shown in Fig. 16(a).
Figure 16: Results of the AAE for two different snapshots (cont.): (a) low resolution and (b) high resolution for the second snapshot.
Figure 17: Box plots of the relative L2 error of the autoencoder for all the snapshots of the non-Newtonian fluid, for both train and test cases, at low (left) and high (right) resolution.
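For reference, a relative L2 error between a predicted field and its ground truth is commonly computed as $\lVert \hat{u}-u\rVert_{2}/\lVert u\rVert_{2}$ over the flattened field. The sketch below assumes this standard definition; the exact normalization used for the box plots is not restated in this section, and the function name is hypothetical.

```python
import math

# Minimal sketch, assuming the standard definition ||pred - truth||_2 / ||truth||_2
# applied to a flattened field (hypothetical helper, not the authors' code).

def relative_l2_error(pred, truth):
    """Relative L2 error between two equally sized sequences of floats."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)))
    den = math.sqrt(sum(t ** 2 for t in truth))
    if den == 0.0:
        raise ValueError("ground truth has zero norm")
    return num / den

if __name__ == "__main__":
    truth = [3.0, 4.0]   # toy values, not data from this work
    pred = [3.0, 4.1]
    print(relative_l2_error(pred, truth))
```

Under this definition, a value below 0.03 corresponds to the 3% threshold quoted above.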

As in the previous case, the autoencoder is compared with bicubic interpolation; the comparison is shown in Table 2. As expected from the results obtained for the Newtonian fluid case, the AAE clearly outperforms bicubic interpolation, with relative errors at most about half as large for every field. The difference is even more pronounced in reconstruction time, where the AAE is almost 45 times faster than bicubic interpolation.

Table 2: Comparison between the AAE and bicubic interpolation. For every variable, the relative error is shown, together with the reconstruction time for the whole dataset.
Variable      AAE       Bicubic interpolation
$P$ (-)       0.0257    0.0557
$U_{x}$ (-)   0.0113    0.0228
$U_{y}$ (-)   0.0240    0.0639
Time (s)      197.57    8818.12
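The speed-up and error ratios implied by Table 2 follow directly from its entries. The short sketch below only reproduces this arithmetic from the tabulated values; the variable names are illustrative.

```python
# Values copied from Table 2: (AAE, bicubic interpolation).
relative_error = {
    "P": (0.0257, 0.0557),
    "U_x": (0.0113, 0.0228),
    "U_y": (0.0240, 0.0639),
}
time_s = (197.57, 8818.12)

# How many times faster the AAE reconstructs the whole dataset.
speedup = time_s[1] / time_s[0]

# How many times larger the bicubic error is for each field.
error_ratio = {name: bicubic / aae for name, (aae, bicubic) in relative_error.items()}

if __name__ == "__main__":
    print(f"speed-up: {speedup:.1f}x")
    for name, ratio in error_ratio.items():
        print(f"{name}: bicubic error is {ratio:.2f}x the AAE error")
```

The resulting speed-up is about 44.6, and the bicubic error is between roughly 2 and 2.7 times the AAE error, depending on the field.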

Fig. 18 compares the latent variables obtained by the AAE encoding, which are taken as ground truth, with the prediction made by the SPNN. As in the previous case, the SPNN successfully integrates the latent variables in the reduced space, closely following the original AAE encoding.

As before, the SPNN is compared with a black-box neural network trained with the same hyperparameters as the SPNN, except for the output size: in this example $N_{\text{in}}^{\text{BB}}=N_{\text{out}}^{\text{BB}}=6$, which matches the bottleneck of the AAE and the input of the SPNN. The results are compared in Fig. 19, which shows the mean and standard deviation of the accumulated velocity error, with a 95% confidence interval, over all simulations at each snapshot. The SPNN error (Fig. 19(a)) remains lower than that of the black-box neural network (Fig. 19(b)), which indicates that the thermodynamic bias helps the network converge to the correct solution.

Figure 18: Results of the SPNN integration with respect to the latent variables obtained by the AAE (ground truth) for two different simulations for the non-Newtonian fluid. The dashed line corresponds to the SPNN prediction while the continuous line is the ground truth. Each latent variable is represented in a different color to simplify the identification of the ground truth and its corresponding predicted value. Left: simulation at $\mathrm{U_{in}}=1.4$. Right: simulation at $\mathrm{U_{in}}=1.7$.
Figure 19: Evolution of the relative error of the velocity during the whole integration for all the simulations: (a) SPNN, (b) black box. The SPNN error remains lower than that of the black-box neural network during the whole prediction rollout, indicating that the physical bias helps to reach the correct solution.

The ground truth simulations were performed on a MacBook Pro M1 Pro, with each simulation taking around 30 minutes to complete. Both the AAE and the SPNN were trained on a Linux-based machine equipped with an Intel i9-13900K CPU and an NVIDIA RTX 4090 GPU, using the PyTorch framework. Training the AAE took 2.5 hours on the NVIDIA GPU, while training the SPNN took around 20 minutes. Regarding inference time, each prediction of the latent variables can be obtained in less than 1 second on a MacBook Pro M1 Pro, although rendering the video with the complete prediction takes about 10 seconds, which is still considerably faster than computing the high-fidelity model.

5 Conclusions

In this work we have presented a new methodology to increase the spatial resolution of predictions obtained by learned simulators while ensuring that the prediction is thermodynamics-aware, that is, that it satisfies the basic principles of thermodynamics. The proposed AAE architecture is able to encode the information into a reduced-order space and to produce high-resolution output fields from low-resolution inputs thanks to its generative capabilities. The AAE has been compared to a classical resolution augmentation technique, bicubic interpolation. Not only does the AAE outperform bicubic interpolation, but it is also considerably faster, making it feasible for quasi-real-time or even real-time applications. Additionally, the AAE resolution augmentation technique can be applied to a wider range of geometries than bicubic interpolation. The structure-preserving neural network is able to estimate the evolution of the encoded variables in the reduced space, and the decoder then re-projects the SPNN prediction to the original and higher-resolution spaces. The SPNN has been compared to a black-box approach, which it outperforms thanks to the GENERIC formalism, as it adds physical constraints to the prediction that act as an inductive bias. The results show good agreement between our predictions and the synthetic ground truth obtained by CFD for the two examples analysed. However, the current work has some limitations that could be addressed in the future:

  • Database: The present work makes use of a synthetic database generated by a CFD tool. However, real data coming from sensors could be used to train a system to work with real-world digital twins. Additionally, the database could be augmented with different geometry cases, improving the generalization to unseen geometries.

  • Integration scheme: In this work, an Euler integration scheme is used. This integration scheme is simple, and higher-order integration schemes such as the midpoint rule, Heun’s method or a Runge-Kutta method [45, 28] could improve the accuracy of the SPNN. This would also allow the network to work with larger time increments. However, increasing the complexity of the integration scheme would require more forward passes of the neural network for each time step, slowing down the training process.

  • Net architecture: Graph Neural Networks (GNNs) [35, 46] could be used to take advantage of their ability to handle unstructured data, in contrast to convolutional neural networks, which require grid-structured information. Thus, GNNs could be applied to real-world applications, e.g., a digital twin of a system whose sensors are not evenly distributed.
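The trade-off mentioned under the integration scheme item can be illustrated on a toy problem. The sketch below compares forward Euler with Heun’s method on the scalar equation $\dot{z}=-z$, which stands in for the learned right-hand side; it is not the SPNN integrator, and the function names and step size are illustrative assumptions. Heun’s method evaluates the right-hand side twice per step, which in the learned setting would mean two forward passes of the network.

```python
import math

# Minimal sketch on a toy ODE (dz/dt = -z); `f` stands in for the learned
# right-hand side. Names and step size are illustrative assumptions.

def euler_step(f, z, dt):
    """Forward Euler: one evaluation of f per step."""
    return z + dt * f(z)

def heun_step(f, z, dt):
    """Heun's method: two evaluations of f per step."""
    k1 = f(z)
    k2 = f(z + dt * k1)
    return z + 0.5 * dt * (k1 + k2)

def integrate(step, f, z0, dt, n_steps):
    """Apply `step` repeatedly starting from z0."""
    z = z0
    for _ in range(n_steps):
        z = step(f, z, dt)
    return z

def decay(z):
    """Toy right-hand side with exact solution z(t) = z0 * exp(-t)."""
    return -z

exact = math.exp(-1.0)
err_euler = abs(integrate(euler_step, decay, 1.0, 0.1, 10) - exact)
err_heun = abs(integrate(heun_step, decay, 1.0, 0.1, 10) - exact)

if __name__ == "__main__":
    print("Euler error:", err_euler)
    print("Heun error: ", err_heun)
```

On this toy problem Heun’s method is markedly more accurate than Euler for the same step size, at the price of twice as many right-hand-side evaluations per step.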

Acknowledgements

This work was supported by the Spanish Ministry of Science and Innovation, AEI/10.13039/501100011033, through Grant number PID2020-113463RB-C31 and by the Ministry for Digital Transformation and the Civil Service, through the ENIA 2022 Chairs for the creation of university-industry chairs in AI, through Grant TSI-100930-2023-1.

This material is also based upon work supported in part by the Army Research Laboratory and the Army Research Office under contract/grant number W911NF2210271.

This research is also part of the DesCartes programme and is supported by the National Research Foundation, Prime Minister's Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) programme.

The authors also acknowledge the support of ESI Group through the chairs at the University of Zaragoza and at ENSAM Institute of Technology.

References

  • [1] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • [2] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K.Q. Weinberger, editors, Advances in Neural Information Processing Systems, volume 27. Curran Associates, Inc., 2014.
  • [3] Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, and Wenzhe Shi. Photo-realistic single image super-resolution using a generative adversarial network, 2017.
  • [4] Xintao Wang, Ke Yu, Shixiang Wu, **** Gu, Yihao Liu, Chao Dong, Chen Change Loy, Yu Qiao, and Xiaoou Tang. Esrgan: Enhanced super-resolution generative adversarial networks, 2018.
  • [5] Mathis Bode, Michael Gauding, Zeyu Lian, Dominik Denker, Marco Davidovic, Konstantin Kleinheinz, Jenia Jitsev, and Heinz Pitsch. Using physics-informed enhanced super-resolution generative adversarial networks for subfilter modeling in turbulent reactive flows. Proceedings of the Combustion Institute, 38(2):2617–2625, 2021.
  • [6] Kai Fukami, Koji Fukagata, and Kunihiko Taira. Super-resolution analysis via machine learning: a survey for fluid flows. Theoretical and Computational Fluid Dynamics, 37(4):421–444, 2023.
  • [7] Linqi Yu, Mustafa Z. Yousif, Meng Zhang, Sergio Hoyas, Ricardo Vinuesa, and Hee-Chang Lim. Three-dimensional ESRGAN for super-resolution reconstruction of turbulent flows with tricubic interpolation-based transfer learning. Physics of Fluids, 34(12):125126, 12 2022.
  • [8] Daniel Kelshaw, Georgios Rigas, and Luca Magri. Physics-informed cnns for super-resolution of sparse observations on dynamical systems, 2022.
  • [9] Francisco Chinesta, Elias Cueto, Emmanuelle Abisset-Chavanne, Jean Louis Duval, and Fouad El Khaldi. Virtual, digital and hybrid twins: a new paradigm in data-based engineering and engineered data. Archives of computational methods in engineering, 27:105–134, 2020.
  • [10] Adil Rasheed, Omer San, and Trond Kvamsdal. Digital twin: Values, challenges and enablers. arXiv preprint arXiv:1910.01719, 2019.
  • [11] Charles Fefferman, Sanjoy Mitter, and Hariharan Narayanan. Testing the manifold hypothesis. Journal of the American Mathematical Society, 29(4):983–1049, October 2016.
  • [12] S. Niroomandi, I. Alfaro, E. Cueto, and F. Chinesta. Real-time deformable models of non-linear tissues by model reduction techniques. Computer Methods and Programs in Biomedicine, 91(3):223–231, 2008.
  • [13] Alberto Badías, Sarah Curtit, David González, Icíar Alfaro, Francisco Chinesta, and Elías Cueto. An augmented reality platform for interactive aerodynamic design and analysis. International Journal for Numerical Methods in Engineering, 120(1):125–138, 2019.
  • [14] Beatriz Moya, Iciar Alfaro, David Gonzalez, Francisco Chinesta, and Elías Cueto. Physically sound, self-learning digital twins for sloshing fluids. PLoS One, 15(6):e0234569, 2020.
  • [15] Zulkeefal Dar, Joan Baiges, and Ramon Codina. Artificial neural network based correction for reduced order models in computational fluid mechanics. Computer Methods in Applied Mechanics and Engineering, 415:116232, 2023.
  • [16] Ian J. Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, Cambridge, MA, USA, 2016. http://www.deeplearningbook.org.
  • [17] Rachida Chakir, Benjamin Streichenberger, and P. Chatellier. A non-intrusive reduced basis method for urban flows simulation. 01 2021.
  • [18] Elise Grosjean and Yvon Maday. Error estimate of the Non-Intrusive Reduced Basis (NIRB) two-grid method with parabolic equations. The SMAI Journal of computational mathematics, 9:227–256, 2023.
  • [19] Hamidreza Eivazi, Soledad Le Clainche, Sergio Hoyas, and Ricardo Vinuesa. Towards extraction of orthogonal and parsimonious non-linear modes from turbulent flows. Expert Systems with Applications, 202:117038, 2022.
  • [20] Yuning Wang, Alberto Solera-Rico, Carlos Sanmiguel Vila, and Ricardo Vinuesa. Towards optimal $\beta$-variational autoencoders combined with transformers for reduced-order modelling of turbulent flows. International Journal of Heat and Fluid Flow, 105:109254, 2024.
  • [21] M. Raissi, P. Perdikaris, and G.E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
  • [22] Chayan Banerjee, Kien Nguyen, Clinton Fookes, and George Karniadakis. Physics-informed computer vision: A review and perspectives, 2023.
  • [23] Miroslav Grmela and Hans Christian Öttinger. Dynamics and thermodynamics of complex fluids. i. development of a general formalism. Phys. Rev. E, 56:6620–6632, Dec 1997.
  • [24] Hans Christian Öttinger and Miroslav Grmela. Dynamics and thermodynamics of complex fluids. ii. illustrations of a general formalism. Phys. Rev. E, 56:6633–6655, Dec 1997.
  • [25] Quercus Hernández, Alberto Badías, David González, Francisco Chinesta, and Elías Cueto. Structure-preserving neural networks. Journal of Computational Physics, 426:109950, 2021.
  • [26] Quercus Hernandez, Alberto Badias, Francisco Chinesta, and Elias Cueto. Thermodynamics-informed graph neural networks. IEEE Transactions on Artificial Intelligence, pages 1–1, 2022.
  • [27] Kook** Lee, Nathaniel A. Trask, and Panos Stinis. Machine learning structure preserving brackets for forecasting irreversible processes, 2021.
  • [28] Zhen Zhang, Yeonjong Shin, and George Em Karniadakis. GFINNs: GENERIC formalism informed neural networks for deterministic and stochastic dynamical systems. Philosophical Transactions of the Royal Society A, 380(2229):20210207, 2022.
  • [29] Quercus Hernandez, Alberto Badías, David González, Francisco Chinesta, and Elías Cueto. Deep learning of thermodynamics-aware reduced-order models from data. Computer Methods in Applied Mechanics and Engineering, 379:113763, 2021.
  • [30] Michal Pavelka, Václav Klika, and Miroslav Grmela. Multiscale thermo-dynamics: introduction to GENERIC. Walter de Gruyter GmbH & Co KG, 2018.
  • [31] Weinan E. A proposal on machine learning via dynamical systems. Communications in Mathematics and Statistics, 5(1):1–11, Mar 2017.
  • [32] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, and Ian Goodfellow. Adversarial autoencoders. In International Conference on Learning Representations, 2016.
  • [33] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
  • [34] Beatriz Moya, Alberto Badías, Icíar Alfaro, Francisco Chinesta, and Elías Cueto. Digital twins that learn and correct themselves. International Journal for Numerical Methods in Engineering, 123(13):3034–3044, 2022.
  • [35] Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, Yujia Li, and Razvan Pascanu. Relational inductive biases, deep learning, and graph networks, 2018.
  • [36] Philip J. Morrison. A paradigm for joined hamiltonian and dissipative systems. Physica D: Nonlinear Phenomena, 18(1):410–419, 1986.
  • [37] Alexander Mielke. On thermodynamically consistent models and gradient structures for thermoplasticity. GAMM-Mitteilungen, 34(1):51–58, 2011.
  • [38] Alexander Mielke. Formulation of thermoelastic dissipative material behavior using generic. Continuum Mechanics and Thermodynamics, 23(3):233–256, 2011.
  • [39] Ignacio Romero. Algorithms for coupled problems that preserve symmetries and the laws of thermodynamics: Part i: Monolithic integrators and their application to finite strain thermoelasticity. Computer Methods in Applied Mechanics and Engineering, 199(25-28):1841–1858, 2010.
  • [40] Ignacio Romero. Algorithms for coupled problems that preserve symmetries and the laws of thermodynamics: Part ii: Fractional step methods. Computer Methods in Applied Mechanics and Engineering, 199(33-36):2235–2248, 2010.
  • [41] Beatriz Moya, Alberto Badías, David González, Francisco Chinesta, and Elías Cueto. Physics perception in sloshing scenes with guaranteed thermodynamic consistency. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(2):2136–2150, 2023.
  • [42] H. G. Weller, G. Tabor, H. Jasak, and C. Fureby. A tensorial approach to computational continuum mechanics using object-oriented techniques. Computers in Physics, 12(6):620–631, 11 1998.
  • [43] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
  • [44] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. 2017.
  • [45] Yi-Jen Wang and Chin-Teng Lin. Runge-kutta neural network for identification of dynamical systems in high accuracy. IEEE Transactions on Neural Networks, 9(2):294–307, 1998.
  • [46] Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. Geometric deep learning: Going beyond euclidean data. IEEE Signal Processing Magazine, 34(4):18–42, 2017.