-
Solving Partial Differential Equations with Equivariant Extreme Learning Machines
Authors:
Hans Harder,
Jean Rabault,
Ricardo Vinuesa,
Mikael Mortensen,
Sebastian Peitz
Abstract:
We utilize extreme-learning machines for the prediction of partial differential equations (PDEs). Our method splits the state space into multiple windows that are predicted individually using a single model. Despite requiring only few data points (in some cases, our method can learn from a single full-state snapshot), it still achieves high accuracy and can predict the flow of PDEs over long time…
▽ More
We utilize extreme-learning machines for the prediction of partial differential equations (PDEs). Our method splits the state space into multiple windows that are predicted individually using a single model. Despite requiring only few data points (in some cases, our method can learn from a single full-state snapshot), it still achieves high accuracy and can predict the flow of PDEs over long time horizons. Moreover, we show how additional symmetries can be exploited to increase sample efficiency and to enforce equivariance.
△ Less
Submitted 24 May, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
Profiling Irony & Stereotype: Exploring Sentiment, Topic, and Lexical Features
Authors:
Tibor L. R. Krols,
Marie Mortensen,
Ninell Oldenburg
Abstract:
Social media has become a very popular source of information. With this popularity comes an interest in systems that can classify the information produced. This study tries to create such a system detecting irony in Twitter users. Recent work emphasize the importance of lexical features, sentiment features and the contrast herein along with TF-IDF and topic models. Based on a thorough feature sele…
▽ More
Social media has become a very popular source of information. With this popularity comes an interest in systems that can classify the information produced. This study tries to create such a system detecting irony in Twitter users. Recent work emphasize the importance of lexical features, sentiment features and the contrast herein along with TF-IDF and topic models. Based on a thorough feature selection process, the resulting model contains specific sub-features from these areas. Our model reaches an F1-score of 0.84, which is above the baseline. We find that lexical features, especially TF-IDF, contribute the most to our models while sentiment and topic modeling features contribute less to overall performance. Lastly, we highlight multiple interesting and important paths for further exploration.
△ Less
Submitted 8 November, 2023;
originally announced November 2023.
-
Effective control of two-dimensional Rayleigh--Bénard convection: invariant multi-agent reinforcement learning is all you need
Authors:
Colin Vignon,
Jean Rabault,
Joel Vasanth,
Francisco Alcántara-Ávila,
Mikael Mortensen,
Ricardo Vinuesa
Abstract:
Rayleigh-Bénard convection (RBC) is a recurrent phenomenon in several industrial and geoscience flows and a well-studied system from a fundamental fluid-mechanics viewpoint. However, controlling RBC, for example by modulating the spatial distribution of the bottom-plate heating in the canonical RBC configuration, remains a challenging topic for classical control-theory methods. In the present work…
▽ More
Rayleigh-Bénard convection (RBC) is a recurrent phenomenon in several industrial and geoscience flows and a well-studied system from a fundamental fluid-mechanics viewpoint. However, controlling RBC, for example by modulating the spatial distribution of the bottom-plate heating in the canonical RBC configuration, remains a challenging topic for classical control-theory methods. In the present work, we apply deep reinforcement learning (DRL) for controlling RBC. We show that effective RBC control can be obtained by leveraging invariant multi-agent reinforcement learning (MARL), which takes advantage of the locality and translational invariance inherent to RBC flows inside wide channels. The MARL framework applied to RBC allows for an increase in the number of control segments without encountering the curse of dimensionality that would result from a naive increase in the DRL action-size dimension. This is made possible by the MARL ability for re-using the knowledge generated in different parts of the RBC domain. We show in a case study that MARL DRL is able to discover an advanced control strategy that destabilizes the spontaneous RBC double-cell pattern, changes the topology of RBC by coalescing adjacent convection cells, and actively controls the resulting coalesced cell to bring it to a new stable configuration. This modified flow configuration results in reduced convective heat transfer, which is beneficial in several industrial processes. Therefore, our work both shows the potential of MARL DRL for controlling large RBC systems, as well as demonstrates the possibility for DRL to discover strategies that move the RBC configuration between different topological configurations, yielding desirable heat-transfer characteristics. These results are useful for both gaining further understanding of the intrinsic properties of RBC, as well as for develo** industrial applications.
△ Less
Submitted 13 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Fast parallel multidimensional FFT using advanced MPI
Authors:
Lisandro Dalcin,
Mikael Mortensen,
David E Keyes
Abstract:
We present a new method for performing global redistributions of multidimensional arrays essential to parallel fast Fourier (or similar) transforms. Traditional methods use standard all-to-all collective communication of contiguous memory buffers, thus necessary requiring local data realignment steps intermixed in-between redistribution and transform steps. Instead, our method takes advantage of s…
▽ More
We present a new method for performing global redistributions of multidimensional arrays essential to parallel fast Fourier (or similar) transforms. Traditional methods use standard all-to-all collective communication of contiguous memory buffers, thus necessary requiring local data realignment steps intermixed in-between redistribution and transform steps. Instead, our method takes advantage of subarray datatypes and generalized all-to-all scatter/gather from the MPI-2 standard to communicate discontiguous memory buffers, effectively eliminating the need for local data realignments. Despite generalized all-to-all communication of discontiguous data being generally slower, our proposal economizes in local work. For a range of strong and weak scaling tests, we found the overall performance of our method to be on par and often better than well-established libraries like MPI-FFTW, P3DFFT, and 2DECOMP&FFT. We provide compact routines implemented at the highest possible level using the MPI bindings for the C programming language. These routines apply to any global redistribution, over any two directions of a multidimensional array, decomposed on arbitrary Cartesian processor grids (1D slabs, 2D pencils, or even higher-dimensional decompositions). The high level implementation makes the code easy to read, maintain, and eventually extend. Our approach enables for future speedups from optimizations in the internal datatype handling engines within MPI implementations.
△ Less
Submitted 25 April, 2018;
originally announced April 2018.
-
Massively parallel implementation in Python of a pseudo-spectral DNS code for turbulent flows
Authors:
Mikael Mortensen
Abstract:
Direct Numerical Simulations (DNS) of the Navier Stokes equations is a valuable research tool in fluid dynamics, but there are very few publicly available codes and, due to heavy number crunching, codes are usually written in low-level languages. In this work a \textasciitilde{}100 line standard scientific Python DNS code is described that nearly matches the performance of pure C for thousands of…
▽ More
Direct Numerical Simulations (DNS) of the Navier Stokes equations is a valuable research tool in fluid dynamics, but there are very few publicly available codes and, due to heavy number crunching, codes are usually written in low-level languages. In this work a \textasciitilde{}100 line standard scientific Python DNS code is described that nearly matches the performance of pure C for thousands of processors and billions of unknowns. With optimization of a few routines in Cython, it is found to match the performance of a more or less identical solver implemented from scratch in C++. Keys to the efficiency of the solver are the mesh decomposition and three dimensional FFT routines, implemented directly in Python using MPI, wrapped through MPI for Python, and a serial FFT module (both numpy.fft or pyFFTW may be used). Two popular decomposition strategies, slab and pencil, have been implemented and tested.
△ Less
Submitted 1 July, 2016;
originally announced July 2016.
-
Oasis: a high-level/high-performance open source Navier-Stokes solver
Authors:
Mikael Mortensen,
Kristian Valen-Sendstad
Abstract:
Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a hig…
▽ More
Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver, that is using piecewise linear elements for both velocity and pressure, is shown reproduce very well the classical, spectral, turbulent channel simulations of Moser, Kim and Mansour at $Re_τ=180$ [Phys. Fluids, vol 11(4), p. 964]. The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.
△ Less
Submitted 11 February, 2016;
originally announced February 2016.
-
High performance Python for direct numerical simulations of turbulent flows
Authors:
Mikael Mortensen,
Hans Petter Langtangen
Abstract:
Direct Numerical Simulations (DNS) of the Navier Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches…
▽ More
Direct Numerical Simulations (DNS) of the Navier Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches the performance of C++ for thousands of processors and billions of unknowns. We also describe a version optimized through Cython, that is found to match the speed of C++. The solvers are written from scratch in Python, both the mesh, the MPI domain decomposition, and the temporal integrators. The solvers have been verified and benchmarked on the Shaheen supercomputer at the KAUST supercomputing laboratory, and we are able to show very good scaling up to several thousand cores.
A very important part of the implementation is the mesh decomposition (we implement both slab and pencil decompositions) and 3D parallel Fast Fourier Transforms (FFT). The mesh decomposition and FFT routines have been implemented in Python using serial FFT routines (either NumPy, pyFFTW or any other serial FFT module), NumPy array manipulations and with MPI communications handled by MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT in Python for a slab mesh decomposition using 4 lines of compact Python code, for which the parallel performance on Shaheen is found to be slightly better than similar routines provided through the FFTW library. For a pencil mesh decomposition 7 lines of code is required to execute a transform.
△ Less
Submitted 11 February, 2016;
originally announced February 2016.
-
Estimating the Impact of Unknown Unknowns on Aggregate Query Results
Authors:
Yeounoh Chung,
Michael Lind Mortensen,
Carsten Binnig,
Tim Kraska
Abstract:
It is common practice for data scientists to acquire and integrate disparate data sources to achieve higher quality results. But even with a perfectly cleaned and merged data set, two fundamental questions remain: (1) is the integrated data set complete and (2) what is the impact of any unknown (i.e., unobserved) data on query results?
In this work, we develop and analyze techniques to estimate…
▽ More
It is common practice for data scientists to acquire and integrate disparate data sources to achieve higher quality results. But even with a perfectly cleaned and merged data set, two fundamental questions remain: (1) is the integrated data set complete and (2) what is the impact of any unknown (i.e., unobserved) data on query results?
In this work, we develop and analyze techniques to estimate the impact of the unknown data (a.k.a., unknown unknowns) on simple aggregate queries. The key idea is that the overlap between different data sources enables us to estimate the number and values of the missing data items. Our main techniques are parameter-free and do not assume prior knowledge about the distribution. Through a series of experiments, we show that estimating the impact of unknown unknowns is invaluable to better assess the results of aggregate queries over integrated data sources.
△ Less
Submitted 26 December, 2015; v1 submitted 20 July, 2015;
originally announced July 2015.
-
A FEniCS-Based Programming Framework for Modeling Turbulent Flow by the Reynolds-Averaged Navier-Stokes Equations
Authors:
Mikael Mortensen,
Hans Petter Langtangen,
Garth N. Wells
Abstract:
Finding an appropriate turbulence model for a given flow case usually calls for extensive experimentation with both models and numerical solution methods. This work presents the design and implementation of a flexible, programmable software framework for assisting with numerical experiments in computational turbulence. The framework targets Reynolds-averaged Navier-Stokes models, discretized by fi…
▽ More
Finding an appropriate turbulence model for a given flow case usually calls for extensive experimentation with both models and numerical solution methods. This work presents the design and implementation of a flexible, programmable software framework for assisting with numerical experiments in computational turbulence. The framework targets Reynolds-averaged Navier-Stokes models, discretized by finite element methods. The novel implementation makes use of Python and the FEniCS package, the combination of which leads to compact and reusable code, where model- and solver-specific code resemble closely the mathematical formulation of equations and algorithms. The presented ideas and programming techniques are also applicable to other fields that involve systems of nonlinear partial differential equations. We demonstrate the framework in two applications and investigate the impact of various linearizations on the convergence properties of nonlinear solvers for a Reynolds-averaged Navier-Stokes model.
△ Less
Submitted 31 March, 2011; v1 submitted 14 February, 2011;
originally announced February 2011.