-
Solving Partial Differential Equations with Equivariant Extreme Learning Machines
Authors:
Hans Harder,
Jean Rabault,
Ricardo Vinuesa,
Mikael Mortensen,
Sebastian Peitz
Abstract:
We utilize extreme-learning machines for the prediction of partial differential equations (PDEs). Our method splits the state space into multiple windows that are predicted individually using a single model. Despite requiring only few data points (in some cases, our method can learn from a single full-state snapshot), it still achieves high accuracy and can predict the flow of PDEs over long time horizons. Moreover, we show how additional symmetries can be exploited to increase sample efficiency and to enforce equivariance.
Submitted 24 May, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
Profiling Irony & Stereotype: Exploring Sentiment, Topic, and Lexical Features
Authors:
Tibor L. R. Krols,
Marie Mortensen,
Ninell Oldenburg
Abstract:
Social media has become a very popular source of information. With this popularity comes an interest in systems that can classify the information produced. This study aims to build such a system for detecting irony in Twitter users. Recent work emphasizes the importance of lexical features, sentiment features and the contrast herein along with TF-IDF and topic models. Based on a thorough feature selection process, the resulting model contains specific sub-features from these areas. Our model reaches an F1-score of 0.84, which is above the baseline. We find that lexical features, especially TF-IDF, contribute the most to our models while sentiment and topic modeling features contribute less to overall performance. Lastly, we highlight multiple interesting and important paths for further exploration.
Submitted 8 November, 2023;
originally announced November 2023.
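The TF-IDF weighting that the abstract above singles out as the most useful feature group can be illustrated with a small sketch. This is not the authors' pipeline: the toy documents, the whitespace tokenizer, and the function name `tf_idf` are assumptions made for illustration, and the plain textbook weighting tf × log(N / df) is used without smoothing.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy tweets, not data from the paper.
docs = [
    "great another monday",
    "great weather today",
    "great i love waiting in line",
]
weights = tf_idf(docs)
```

A term that occurs in every document (here `great`) receives weight zero, while terms confined to one document are weighted up, which is why TF-IDF tends to surface distinctive vocabulary.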
-
Effective control of two-dimensional Rayleigh--Bénard convection: invariant multi-agent reinforcement learning is all you need
Authors:
Colin Vignon,
Jean Rabault,
Joel Vasanth,
Francisco Alcántara-Ávila,
Mikael Mortensen,
Ricardo Vinuesa
Abstract:
Rayleigh-Bénard convection (RBC) is a recurrent phenomenon in several industrial and geoscience flows and a well-studied system from a fundamental fluid-mechanics viewpoint. However, controlling RBC, for example by modulating the spatial distribution of the bottom-plate heating in the canonical RBC configuration, remains a challenging topic for classical control-theory methods. In the present work, we apply deep reinforcement learning (DRL) for controlling RBC. We show that effective RBC control can be obtained by leveraging invariant multi-agent reinforcement learning (MARL), which takes advantage of the locality and translational invariance inherent to RBC flows inside wide channels. The MARL framework applied to RBC allows for an increase in the number of control segments without encountering the curse of dimensionality that would result from a naive increase in the DRL action-size dimension. This is made possible by the MARL ability for re-using the knowledge generated in different parts of the RBC domain. We show in a case study that MARL DRL is able to discover an advanced control strategy that destabilizes the spontaneous RBC double-cell pattern, changes the topology of RBC by coalescing adjacent convection cells, and actively controls the resulting coalesced cell to bring it to a new stable configuration. This modified flow configuration results in reduced convective heat transfer, which is beneficial in several industrial processes. Therefore, our work both shows the potential of MARL DRL for controlling large RBC systems, as well as demonstrates the possibility for DRL to discover strategies that move the RBC configuration between different topological configurations, yielding desirable heat-transfer characteristics. These results are useful for both gaining further understanding of the intrinsic properties of RBC, as well as for developing industrial applications.
Submitted 13 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Two-phase flow simulations at 0-4 degrees inclination in an eccentric annulus
Authors:
C. Friedemann,
M. Mortensen,
J. Nossen
Abstract:
Multiphase flow simulations were run in an eccentric annulus. The dimensions of the annulus were 0.1 and 0.05 m for the outer and inner cylinders, respectively, and the mixture velocities were between 1.2 and 4.2 m/s. The simulations were compared with fully eccentric and completely concentric experiments conducted at the Institute for Energy Technology. The purpose of this paper is to explore the effect of the holdup fraction and interior pipe's position on the pressure gradient and flow regime. The comparisons indicate that moving the pipe from an entirely eccentric to a partially eccentric configuration has a drastic impact on the pressure gradient. In all cases where the inner pipe was changed from a completely eccentric geometry to a less eccentric configuration, we notice an increase of 48-303 % of the mean pressure gradient. By comparison, cases where the pipe was moved from a concentric to a more eccentric configuration result in less drastic pressure gradient changes. Two cases were within 22 % of the experimental results for mean, max, and min pressure gradient, while the last two cases exceeded the minimum and mean pressure gradients by 25-250 %, respectively. We rarely observed a change of flow regime as an effect of moving the inner pipe; 2 out of the 8 horizontal cases indicate transition from wavy flow to slug flow or significantly larger waves. The most prominent and frequent discrepancies identified were altered slug and wave frequencies. Through the simulations, we notice that there is an increased pressure gradient accompanying an increased holdup fraction when the phase-averaged velocities were the same. Corresponding to a fractional holdup increase of 0.177, 0.244, 0.063, and 0.073, the increase in simulated pressure gradient for each case of the same mixture flow rate and mesh density was 80, 300, 614 and 367 Pa/m respectively, or 116, 244, 61.5 and 25 %.
Submitted 10 January, 2020;
originally announced January 2020.
-
Ocellaris: a discontinuous Galerkin finite element solver for two-phase flows with high density differences
Authors:
Tormod Landet,
Mikael Mortensen
Abstract:
In free-surface flows, such as breaking ocean waves, the momentum field will have a discontinuity at the interface between the two immiscible fluids, air and water, but still be smooth in most of the domain. Using a higher-order numerical method is more efficient than increasing the number of low-order computational cells in areas where the solution is smooth, but higher-order approximations cause convective instabilities at discontinuities. In Ocellaris we use slope limiting of discontinuous Galerkin solutions to stabilise finite element simulations of flows with large density jumps, which would otherwise blow up due to Gibbs oscillations resulting from approximating a factor 1000 sharp jump (air to water) by higher-order shape functions.
We have previously shown a slope-limiting procedure for velocity fields that is able to stabilise 2D free-surface simulations running on a single CPU. In this paper our solver is extended to 3D and coupled to an algebraic pressure-correction scheme that retains the exact incompressibility of the direct solution used in the 2D simulations. We have tested the method on a common 3D dam-breaking test case and compared the free-surface evolution and impact pressures to experimental results. We also show how a forcing-zone approach can be used to simulate a surface-piercing vertical cylinder in an infinite wave field. In both cases the free-surface elevation and the forces and pressures compare well with published experiments. The Ocellaris solver is available as an open-source and well-documented program along with the input files needed to replicate the included results (www.ocellaris.org).
Submitted 30 March, 2019;
originally announced April 2019.
-
On exactly incompressible DG FEM pressure splitting schemes for the Navier-Stokes equation
Authors:
Tormod Landet,
Mikael Mortensen
Abstract:
We compare three iterative pressure correction schemes for solving the Navier-Stokes equations with a focus on exactly divergence free solution with higher order discontinuous Galerkin discretisations. The investigated schemes are the incremental pressure correction scheme on the standard differential form (IPCS-D), the same scheme on algebraic form (IPCS-A), and the semi-implicit method for pressure linked equations (SIMPLE). We show algebraically and through numerical examples that the IPCS-A and SIMPLE schemes are exactly mass conserving due to the algebraic pressure correction, while the IPCS-D scheme cannot be exactly divergence free due to the stabilisation terms required in the pressure Poisson equation. The SIMPLE scheme requires a significantly higher number of pressure correction iterations to obtain converged results than the IPCS-A scheme, so for efficient and mass conserving simulation the IPCS-A method is the best option among the three evaluated schemes.
Submitted 20 March, 2019;
originally announced March 2019.
-
More efficient time integration for Fourier pseudo-spectral DNS of incompressible turbulence
Authors:
David I. Ketcheson,
Mikael Mortensen,
Matteo Parsani,
Nathanael Schilling
Abstract:
Time integration of Fourier pseudo-spectral DNS is usually performed using the classical fourth-order accurate Runge-Kutta method, or other methods of second or third order, with a fixed step size. We investigate the use of higher-order Runge-Kutta pairs and automatic step size control based on local error estimation. We find that the fifth-order accurate Runge-Kutta pair of Bogacki & Shampine gives much greater accuracy at a significantly reduced computational cost. Specifically, we demonstrate speedups of 2x-10x for the same accuracy. Numerical tests (including the Taylor-Green vortex, Rayleigh-Taylor instability, and homogeneous isotropic turbulence) confirm the reliability and efficiency of the method. We also show that adaptive time stepping provides a significant computational advantage for some problems (like the development of a Rayleigh-Taylor instability) without compromising accuracy.
Submitted 7 November, 2019; v1 submitted 24 October, 2018;
originally announced October 2018.
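Automatic step-size control with an embedded Runge-Kutta pair, as described in the abstract above, can be sketched as follows. For brevity the sketch uses the lower-order Bogacki-Shampine 3(2) pair rather than the fifth-order pair studied in the paper, and it integrates a scalar test ODE instead of a DNS. The controller constants (safety factor 0.9, growth limits 0.2 and 5) and the function name `bs23_adaptive` are common textbook choices assumed here, not values taken from the paper.

```python
import math


def bs23_adaptive(f, t0, y0, t_end, tol=1e-8, h=0.1):
    """Integrate y' = f(t, y) with the Bogacki-Shampine 3(2) pair."""
    t, y = t0, y0
    n_steps = 0
    while t < t_end:
        h = min(h, t_end - t)  # do not step past the end point
        k1 = f(t, y)
        k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
        k3 = f(t + 0.75 * h, y + 0.75 * h * k2)
        y3 = y + h * (2 * k1 + 3 * k2 + 4 * k3) / 9  # third-order solution
        k4 = f(t + h, y3)
        # Embedded second-order solution, used only for the error estimate.
        y2 = y + h * (7 * k1 / 24 + k2 / 4 + k3 / 3 + k4 / 8)
        err = abs(y3 - y2)
        if err <= tol:  # accept the step, otherwise retry with smaller h
            t, y = t + h, y3
            n_steps += 1
        # The error estimate scales like h**3, hence the exponent 1/3.
        factor = 5.0 if err == 0.0 else 0.9 * (tol / err) ** (1 / 3)
        h *= min(5.0, max(0.2, factor))
    return y, n_steps


# Test problem y' = -y, y(0) = 1, with exact solution exp(-t).
y_end, n_steps = bs23_adaptive(lambda t, y: -y, 0.0, 1.0, 1.0)
```

The step size is never prescribed by the user beyond an initial guess; rejected steps shrink it and accepted steps let it grow, so the cost follows the local difficulty of the problem.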
-
Fast parallel multidimensional FFT using advanced MPI
Authors:
Lisandro Dalcin,
Mikael Mortensen,
David E Keyes
Abstract:
We present a new method for performing global redistributions of multidimensional arrays essential to parallel fast Fourier (or similar) transforms. Traditional methods use standard all-to-all collective communication of contiguous memory buffers, thus necessarily requiring local data realignment steps intermixed in-between redistribution and transform steps. Instead, our method takes advantage of subarray datatypes and generalized all-to-all scatter/gather from the MPI-2 standard to communicate discontiguous memory buffers, effectively eliminating the need for local data realignments. Despite generalized all-to-all communication of discontiguous data being generally slower, our proposal economizes in local work. For a range of strong and weak scaling tests, we found the overall performance of our method to be on par with, and often better than, well-established libraries like MPI-FFTW, P3DFFT, and 2DECOMP&FFT. We provide compact routines implemented at the highest possible level using the MPI bindings for the C programming language. These routines apply to any global redistribution, over any two directions of a multidimensional array, decomposed on arbitrary Cartesian processor grids (1D slabs, 2D pencils, or even higher-dimensional decompositions). The high level implementation makes the code easy to read, maintain, and eventually extend. Our approach allows for future speedups from optimizations in the internal datatype handling engines within MPI implementations.
Submitted 25 April, 2018;
originally announced April 2018.
-
Slope limiting the velocity field in a discontinuous Galerkin divergence free two-phase flow solver
Authors:
Tormod Landet,
Kent-Andre Mardal,
Mikael Mortensen
Abstract:
Solving the Navier-Stokes equations when the density field contains a large sharp discontinuity, such as a water/air free surface, is numerically challenging. Convective instabilities cause Gibbs oscillations which quickly destroy the solution. We investigate the use of slope limiters for the velocity field to overcome this problem in a way that does not compromise on the mass conservation properties. The equations are discretised using the interior penalty discontinuous Galerkin finite element method that is divergence free to machine precision.
A slope limiter made specifically for exactly divergence free (solenoidal) fields is presented and used to illustrate the difficulties in obtaining convectively stable fields that are also exactly solenoidal. The lessons learned from this are applied in constructing a simpler method based on the use of an existing scalar slope limiter applied to each velocity component.
We show by numerical examples how both presented slope limiting methods are vastly superior to the naive non-limited method. The methods can solve difficult two-phase problems with high density ratios and high Reynolds numbers (typical for marine and offshore water/air simulations) in a way that conserves mass and stops unbounded energy growth caused by the Gibbs phenomenon.
Submitted 19 March, 2018;
originally announced March 2018.
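The idea of applying a scalar slope limiter to each velocity component can be illustrated with the classical minmod limiter acting on one-dimensional cell averages. This is a generic textbook limiter chosen for illustration, not the limiter developed in the paper, and the function names and sample data are hypothetical.

```python
def minmod(a, b):
    """Return the smaller-magnitude argument if signs agree, else zero."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def limited_slopes(cell_means):
    """Minmod-limited slope per cell from neighbouring cell averages."""
    slopes = [0.0] * len(cell_means)  # boundary cells keep zero slope
    for i in range(1, len(cell_means) - 1):
        backward = cell_means[i] - cell_means[i - 1]
        forward = cell_means[i + 1] - cell_means[i]
        slopes[i] = minmod(backward, forward)
    return slopes


# Hypothetical velocity component with a sharp jump between cells 2 and 3.
u = [0.0, 0.1, 0.2, 1.0, 1.0, 1.0]
slopes = limited_slopes(u)
```

In smooth regions the limiter leaves a gentle slope in place, while at the top of the jump (cell 3) the slope is set to zero, which is the mechanism that suppresses overshoots next to a discontinuity.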
-
Shenfun -- automating the spectral Galerkin method
Authors:
Mikael Mortensen
Abstract:
With the shenfun Python module (github.com/spectralDNS/shenfun) an effort is made towards automating the implementation of the spectral Galerkin method for simple tensor product domains, consisting of (currently) one non-periodic and any number of periodic directions. The user interface to shenfun is intentionally made very similar to FEniCS (fenicsproject.org). Partial Differential Equations are represented through weak variational forms and solved using efficient direct solvers where available. MPI decomposition is achieved through the mpi4py-fft module (bitbucket.org/mpi4py/mpi4py-fft), and all developed solvers may, with no additional effort, be run on supercomputers using thousands of processors. Complete solvers are shown for the linear Poisson and biharmonic problems, as well as the nonlinear and time-dependent Ginzburg-Landau equation.
Submitted 10 August, 2017;
originally announced August 2017.
-
A spectral-Galerkin turbulent channel flow solver for large-scale simulations
Authors:
Mikael Mortensen
Abstract:
A fully (pseudo-)spectral solver for direct numerical simulations of large-scale turbulent channel flows is described. The solver utilizes the Chebyshev base functions suggested by J. Shen [SIAM J. Sci. Comput., 16, 1, 1995], that lead to stable and robust numerical schemes, even at very large scale. New and fast algorithms for the direct solution of the linear systems are devised, and algorithms and matrices for all required scalar products and transforms are provided. We validate the solver for very high Reynolds numbers. Specifically, the solver is shown to reproduce the first order statistics of Hoyas and Jiménez [Phys. Fluids, 18(1), 2006], for a channel flow at $Re_τ=2000$. The solver is available through the open source project spectralDNS [https://github.com/spectralDNS].
Submitted 13 January, 2017;
originally announced January 2017.
-
Preconditioning trace coupled 3$d$-1$d$ systems using fractional Laplacian
Authors:
Miroslav Kuchta,
Kent-Andre Mardal,
Mikael Mortensen
Abstract:
Multiscale or multiphysics problems often involve coupling of partial differential equations posed on domains of different dimensionality. In this work we consider a simplified model problem of a 3d-1d coupling and the main objective is to construct algorithms that may utilize standard multilevel algorithms for the 3d domain, which has the dominating computational complexity. Preconditioning for a system of two elliptic problems posed, respectively, in a three dimensional domain and an embedded one dimensional curve and coupled by the trace constraint is discussed. Investigating numerically the properties of the well-defined discrete trace operator, it is found that negative fractional Sobolev norms are suitable preconditioners for the Schur complement of the system. The norms are employed to construct a robust block diagonal preconditioner for the coupled problem.
Submitted 8 April, 2018; v1 submitted 12 December, 2016;
originally announced December 2016.
-
On the Singular Neumann Problem in Linear Elasticity
Authors:
Miroslav Kuchta,
Kent-Andre Mardal,
Mikael Mortensen
Abstract:
The Neumann problem of linear elasticity is singular with a kernel formed by the rigid motions of the body. There are several tricks that are commonly used to obtain a non-singular linear system. However, they often cause reduced accuracy or lead to poor convergence of the iterative solvers. In this paper, different well-posed formulations of the problem are studied through discretization by the finite element method, and preconditioning strategies based on operator preconditioning are discussed. For each formulation we derive preconditioners that are independent of the discretization parameter. Preconditioners that are robust with respect to the first Lamé constant are constructed for the pure displacement formulations, while a preconditioner that is robust in both Lamé constants is constructed for the mixed formulation. It is shown that, for convergence in the first Sobolev norm, it is crucial to respect the orthogonality constraint derived from the continuous problem. Based on this observation a modification to the conjugate gradient method is proposed that achieves optimal error convergence of the computed solution.
Submitted 8 April, 2018; v1 submitted 29 September, 2016;
originally announced September 2016.
-
Massively parallel implementation in Python of a pseudo-spectral DNS code for turbulent flows
Authors:
Mikael Mortensen
Abstract:
Direct Numerical Simulation (DNS) of the Navier-Stokes equations is a valuable research tool in fluid dynamics, but there are very few publicly available codes and, due to heavy number crunching, codes are usually written in low-level languages. In this work a ~100-line standard scientific Python DNS code is described that nearly matches the performance of pure C for thousands of processors and billions of unknowns. With optimization of a few routines in Cython, it is found to match the performance of a more or less identical solver implemented from scratch in C++. Keys to the efficiency of the solver are the mesh decomposition and three dimensional FFT routines, implemented directly in Python using MPI, wrapped through MPI for Python, and a serial FFT module (either numpy.fft or pyFFTW may be used). Two popular decomposition strategies, slab and pencil, have been implemented and tested.
Submitted 1 July, 2016;
originally announced July 2016.
-
Efficient preconditioners for saddle point systems with trace constraints coupling 2D and 1D domains
Authors:
Miroslav Kuchta,
Magne Nordaas,
Joris C. G. Verschaeve,
Mikael Mortensen,
Kent-Andre Mardal
Abstract:
We study preconditioners for a model problem describing the coupling of two elliptic subproblems posed over domains with different topological dimension by a parameter dependent constraint. A pair of parameter robust and efficient preconditioners is proposed and analyzed. Robustness and efficiency of the preconditioners are demonstrated by numerical experiments.
Submitted 23 May, 2016;
originally announced May 2016.
-
Oasis: a high-level/high-performance open source Navier-Stokes solver
Authors:
Mikael Mortensen,
Kristian Valen-Sendstad
Abstract:
Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver that uses piecewise linear elements for both velocity and pressure is shown to reproduce very well the classical, spectral, turbulent channel simulations of Moser, Kim and Mansour at $Re_τ=180$ [Phys. Fluids, vol 11(4), p. 964]. The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.
Submitted 11 February, 2016;
originally announced February 2016.
-
High performance Python for direct numerical simulations of turbulent flows
Authors:
Mikael Mortensen,
Hans Petter Langtangen
Abstract:
Direct Numerical Simulation (DNS) of the Navier-Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches the performance of C++ for thousands of processors and billions of unknowns. We also describe a version optimized through Cython, that is found to match the speed of C++. The solvers are written from scratch in Python, including the mesh, the MPI domain decomposition, and the temporal integrators. The solvers have been verified and benchmarked on the Shaheen supercomputer at the KAUST supercomputing laboratory, and we are able to show very good scaling up to several thousand cores.
A very important part of the implementation is the mesh decomposition (we implement both slab and pencil decompositions) and 3D parallel Fast Fourier Transforms (FFT). The mesh decomposition and FFT routines have been implemented in Python using serial FFT routines (either NumPy, pyFFTW or any other serial FFT module), NumPy array manipulations and with MPI communications handled by MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT in Python for a slab mesh decomposition using 4 lines of compact Python code, for which the parallel performance on Shaheen is found to be slightly better than similar routines provided through the FFTW library. For a pencil mesh decomposition 7 lines of code are required to execute a transform.
Submitted 11 February, 2016;
originally announced February 2016.
-
Estimating the Impact of Unknown Unknowns on Aggregate Query Results
Authors:
Yeounoh Chung,
Michael Lind Mortensen,
Carsten Binnig,
Tim Kraska
Abstract:
It is common practice for data scientists to acquire and integrate disparate data sources to achieve higher quality results. But even with a perfectly cleaned and merged data set, two fundamental questions remain: (1) is the integrated data set complete and (2) what is the impact of any unknown (i.e., unobserved) data on query results?
In this work, we develop and analyze techniques to estimate the impact of the unknown data (a.k.a., unknown unknowns) on simple aggregate queries. The key idea is that the overlap between different data sources enables us to estimate the number and values of the missing data items. Our main techniques are parameter-free and do not assume prior knowledge about the distribution. Through a series of experiments, we show that estimating the impact of unknown unknowns is invaluable to better assess the results of aggregate queries over integrated data sources.
Submitted 26 December, 2015; v1 submitted 20 July, 2015;
originally announced July 2015.
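The key idea in the abstract above, that overlap between data sources carries information about how many items are still missing, can be illustrated with the classical Lincoln-Petersen capture-recapture estimator for two sources. This estimator is a simple stand-in chosen for illustration and is not claimed to be the technique developed in the paper; the source contents and the function name are hypothetical.

```python
def lincoln_petersen(source_a, source_b):
    """Estimate the total population size from two overlapping samples."""
    a, b = set(source_a), set(source_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("no overlap: the estimate is undefined")
    return len(a) * len(b) / overlap


# Hypothetical company lists gathered from two independent data sources.
source_a = {"acme", "globex", "initech", "umbrella"}
source_b = {"globex", "initech", "hooli"}

estimated_total = lincoln_petersen(source_a, source_b)  # 4 * 3 / 2
observed = len(source_a | source_b)
estimated_missing = estimated_total - observed
```

A large overlap suggests the sources have nearly exhausted the population, whereas a small overlap implies many unseen items; the estimate assumes the two sources sample independently, which real integrated data sets may violate.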
-
A FEniCS-Based Programming Framework for Modeling Turbulent Flow by the Reynolds-Averaged Navier-Stokes Equations
Authors:
Mikael Mortensen,
Hans Petter Langtangen,
Garth N. Wells
Abstract:
Finding an appropriate turbulence model for a given flow case usually calls for extensive experimentation with both models and numerical solution methods. This work presents the design and implementation of a flexible, programmable software framework for assisting with numerical experiments in computational turbulence. The framework targets Reynolds-averaged Navier-Stokes models, discretized by finite element methods. The novel implementation makes use of Python and the FEniCS package, the combination of which leads to compact and reusable code, where model- and solver-specific code resemble closely the mathematical formulation of equations and algorithms. The presented ideas and programming techniques are also applicable to other fields that involve systems of nonlinear partial differential equations. We demonstrate the framework in two applications and investigate the impact of various linearizations on the convergence properties of nonlinear solvers for a Reynolds-averaged Navier-Stokes model.
Submitted 31 March, 2011; v1 submitted 14 February, 2011;
originally announced February 2011.