Search | arXiv e-print repository

Easy attention: A simple attention mechanism for temporal predictions with transformers

Authors: Marcial Sanchis-Agudo, Yuning Wang, Roger Arnau, Luca Guastoni, Jasmin Lim, Karthik Duraisamy, Ricardo Vinuesa

Abstract: To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention which we demonstrate in time-series reconstruction and prediction. While the standard self attention only makes use of the inner product of queries and keys, it is demonstrated that the keys, queries and softmax are not nece… ▽ More To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention which we demonstrate in time-series reconstruction and prediction. While the standard self attention only makes use of the inner product of queries and keys, it is demonstrated that the keys, queries and softmax are not necessary for obtaining the attention score required to capture long-term dependencies in temporal sequences. Through the singular-value decomposition (SVD) on the softmax attention score, we further observe that self attention compresses the contributions from both queries and keys in the space spanned by the attention score. Therefore, our proposed easy-attention method directly treats the attention scores as learnable parameters. This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems exhibiting more robustness and less complexity than self attention or the widely-used long short-term memory (LSTM) network. We show the improved performance of the easy-attention method in the Lorenz system, a turbulence shear flow and a model of a nuclear reactor. △ Less

Submitted 15 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: 15 pages and 6 figures

arXiv:2203.00974 [pdf, other]

Predicting the temporal dynamics of turbulent channels through deep learning

Authors: Giuseppe Borrelli, Luca Guastoni, Hamidreza Eivazi, Philipp Schlatter, Ricardo Vinuesa

Abstract: The success of recurrent neural networks (RNNs) has been demonstrated in many applications related to turbulence, including flow control, optimization, turbulent features reproduction as well as turbulence prediction and modeling. With this study we aim to assess the capability of these networks to reproduce the temporal evolution of a minimal turbulent channel flow. We first obtain a data-driven… ▽ More The success of recurrent neural networks (RNNs) has been demonstrated in many applications related to turbulence, including flow control, optimization, turbulent features reproduction as well as turbulence prediction and modeling. With this study we aim to assess the capability of these networks to reproduce the temporal evolution of a minimal turbulent channel flow. We first obtain a data-driven model based on a modal decomposition in the Fourier domain (which we denote as FFT-POD) of the time series sampled from the flow. This particular case of turbulent flow allows us to accurately simulate the most relevant coherent structures close to the wall. Long-short-term-memory (LSTM) networks and a Koopman-based framework (KNF) are trained to predict the temporal dynamics of the minimal-channel-flow modes. Tests with different configurations highlight the limits of the KNF method compared to the LSTM, given the complexity of the flow under study. Long-term prediction for LSTM show excellent agreement from the statistical point of view, with errors below 2% for the best models with respect to the reference. Furthermore, the analysis of the chaotic behaviour through the use of the Lyapunov exponents and of the dynamic behaviour through Poincaré maps emphasizes the ability of the LSTM to reproduce the temporal dynamics of turbulence. Alternative reduced-order models (ROMs), based on the identification of different turbulent structures, are explored and they continue to show a good potential in predicting the temporal dynamics of the minimal channel. △ Less

Submitted 2 March, 2022; originally announced March 2022.

arXiv:2005.02762 [pdf, other]

Recurrent neural networks and Koopman-based frameworks for temporal predictions in a low-order model of turbulence

Authors: Hamidreza Eivazi, Luca Guastoni, Philipp Schlatter, Hossein Azizpour, Ricardo Vinuesa

Abstract: The capabilities of recurrent neural networks and Koopman-based frameworks are assessed in the prediction of temporal dynamics of the low-order model of near-wall turbulence by Moehlis et al. (New J. Phys. 6, 56, 2004). Our results show that it is possible to obtain excellent reproductions of the long-term statistics and the dynamic behavior of the chaotic system with properly trained long-short-t… ▽ More The capabilities of recurrent neural networks and Koopman-based frameworks are assessed in the prediction of temporal dynamics of the low-order model of near-wall turbulence by Moehlis et al. (New J. Phys. 6, 56, 2004). Our results show that it is possible to obtain excellent reproductions of the long-term statistics and the dynamic behavior of the chaotic system with properly trained long-short-term memory (LSTM) networks, leading to relative errors in the mean and the fluctuations below $1\%$. Besides, a newly developed Koopman-based framework, called Koopman with nonlinear forcing (KNF), leads to the same level of accuracy in the statistics at a significantly lower computational expense. Furthermore, the KNF framework outperforms the LSTM network when it comes to short-term predictions. We also observe that using a loss function based only on the instantaneous predictions of the chaotic system can lead to suboptimal reproductions in terms of long-term statistics. Thus, we propose a model-selection criterion based on the computed statistics which allows to achieve excellent statistical reconstruction even on small datasets, with minimal loss of accuracy in the instantaneous predictions. △ Less

Submitted 14 April, 2021; v1 submitted 1 May, 2020; originally announced May 2020.

Comments: International Journal of Heat and Fluid Flow. arXiv admin note: substantial text overlap with arXiv:2002.01222

arXiv:2002.01222 [pdf, ps, other]

On the use of recurrent neural networks for predictions of turbulent flows

Authors: Luca Guastoni, Prem A. Srinivasan, Hossein Azizpour, Philipp Schlatter, Ricardo Vinuesa

Abstract: In this paper, the prediction capabilities of recurrent neural networks are assessed in the low-order model of near-wall turbulence by Moehlis {\it et al.} (New J. Phys. {\bf 6}, 56, 2004). Our results show that it is possible to obtain excellent predictions of the turbulence statistics and the dynamic behavior of the flow with properly trained long short-term memory (LSTM) networks, leading to re… ▽ More In this paper, the prediction capabilities of recurrent neural networks are assessed in the low-order model of near-wall turbulence by Moehlis {\it et al.} (New J. Phys. {\bf 6}, 56, 2004). Our results show that it is possible to obtain excellent predictions of the turbulence statistics and the dynamic behavior of the flow with properly trained long short-term memory (LSTM) networks, leading to relative errors in the mean and the fluctuations below $1\%$. We also observe that using a loss function based only on the instantaneous predictions of the flow may not lead to the best predictions in terms of turbulence statistics, and it is necessary to define a stop** criterion based on the computed statistics. Furthermore, more sophisticated loss functions, including not only the instantaneous predictions but also the averaged behavior of the flow, may lead to much faster neural network training. △ Less

Submitted 4 February, 2020; originally announced February 2020.

Comments: Conference paper presented at 11th International Symposium on Turbulence and Shear Flow Phenomena (TSFP11) at Southampton, UK, July 30 to August 2, 2019

Showing 1–4 of 4 results for author: Guastoni, L