
Relevance of the Basset history term for Lagrangian particle dynamics

Authors: Julio Urizarna-Carasa, Daniel Ruprecht, Alexandra von Kameke, Kathrin Padberg-Gehle

Abstract: The movement of small but finite spherical particles in a fluid can be described by the Maxey-Riley equation (MRE) if they are too large to be considered passive tracers. The MRE contains an integral "history term" modeling wake effects, which causes the force acting on a particle at some given time to depend on its full past trajectory. The history term causes complications in the numerical solution of the MRE and is therefore often neglected, despite both numerical and experimental evidence that its effects are generally not negligible. By numerically computing trajectories with and without the history term of a large number of particles in different flow fields, we investigate its impact on the large-scale Lagrangian dynamics of simulated particles. We show that for moderate to large Stokes numbers, ignoring the history term leads to significant differences in clustering patterns. Furthermore, we compute finite-time Lyapunov exponents and show that, even for small particles, the differences in the resulting scalar field from ignoring the Basset history term (BHT) can be significant, in particular if the underlying flow is turbulent.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2404.06400 [pdf, other]

Dynamic Deep Learning Based Super-Resolution For The Shallow Water Equations

Authors: Maximilian Witte, Fabricio Rodrigues Lapolli, Philip Freese, Sebastian Götschel, Daniel Ruprecht, Peter Korn, Christopher Kadow

Abstract: Using the nonlinear shallow water equations as benchmark, we demonstrate that a simulation with the ICON-O ocean model with a 20 km resolution that is frequently corrected by a U-net-type neural network can achieve discretization errors of a simulation with 10 km resolution. The network, originally developed for image-based super-resolution in post-processing, is trained to compute the difference between solutions on both meshes and is used to correct the coarse mesh every 12 h. Our setup is the Galewsky test case, modeling transition of a barotropic instability into turbulent flow. We show that the ML-corrected coarse resolution run correctly maintains a balanced flow and captures the transition to turbulence in line with the higher resolution simulation. After 8 days of simulation, the $L_2$-error of the corrected run is similar to a simulation run on the finer mesh. While mass is conserved in the corrected runs, we observe some spurious generation of kinetic energy.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 17 pages, 12 figures

MSC Class: 65M99; 68T07; 86-08; 35-11

arXiv:2404.05556 [pdf, other]

doi 10.1016/j.compfluid.2024.106321

Bathymetry reconstruction from experimental data using PDE-constrained optimisation

Authors: Judith Angel, Jörn Behrens, Sebastian Götschel, Marten Hollm, Daniel Ruprecht, Robert Seifried

Abstract: Knowledge of the bottom topography, also called bathymetry, of rivers, seas or the ocean is important for many areas of maritime science and civil engineering. While direct measurements are possible, they are time consuming and expensive. Therefore, many approaches have been proposed to infer the bathymetry from measurements of surface waves. Mathematically, this is an inverse problem where an unknown system state needs to be reconstructed from observations with a suitable model for the flow as constraint. In many cases, the shallow water equations can be used to describe the flow. While theoretical studies of the efficacy of such a PDE-constrained optimisation approach for bathymetry reconstruction exist, there seem to be few publications that study its application to data obtained from real-world measurements. This paper shows that the approach can, at least qualitatively, reconstruct a Gaussian-shaped bathymetry in a wave flume from measurements of the water height at up to three points. Achieved normalized root mean square errors (NRMSE) are in line with other approaches.

Submitted 8 April, 2024; originally announced April 2024.

Journal ref: Computers & Fluids 278, pp. 106321, 2024

arXiv:2403.20135 [pdf, other]

Parallel performance of shared memory parallel spectral deferred corrections

Authors: Philip Freese, Sebastian Götschel, Thibaut Lunet, Daniel Ruprecht, Martin Schreiber

Abstract: We investigate parallel performance of parallel spectral deferred corrections, a numerical approach that provides small-scale parallelism for the numerical solution of initial value problems. The scheme is applied to the shallow water equation and uses an IMEX splitting that integrates fast modes implicitly and slow modes explicitly in order to be efficient. We describe parallel $\texttt{OpenMP}$-based implementations of parallel SDC in two well established simulation codes: the finite volume based operational ocean model $\texttt{ICON-O}$ and the spherical harmonics based research code $\texttt{SWEET}$. The implementations are benchmarked on a single node of the JUSUF ($\texttt{SWEET}$) and JUWELS ($\texttt{ICON-O}$) system at Jülich Supercomputing Centre. We demonstrate a reduction of time-to-solution across a range of accuracies. For $\texttt{ICON-O}$, we show speedup over the currently used Adams--Bashforth-2 integrator with $\texttt{OpenMP}$ loop parallelization. For $\texttt{SWEET}$, we show speedup over serial spectral deferred corrections and a second order implicit-explicit integrator.

Submitted 29 March, 2024; originally announced March 2024.

Comments: 14 pages, 4 figures

MSC Class: 65Y05; 65M08; 65M70

arXiv:2403.19736 [pdf, other]

Physics-Informed Neural Networks for Satellite State Estimation

Authors: Jacob Varey, Jessica D. Ruprecht, Michael Tierney, Ryan Sullenberger

Abstract: The Space Domain Awareness (SDA) community routinely tracks satellites in orbit by fitting an orbital state to observations made by the Space Surveillance Network (SSN). In order to fit such orbits, an accurate model of the forces that are acting on the satellite is required. Over the past several decades, high-quality, physics-based models have been developed for satellite state estimation and propagation. These models are exceedingly good at estimating and propagating orbital states for non-maneuvering satellites; however, there are several classes of anomalous accelerations that a satellite might experience which are not well-modeled, such as satellites that use low-thrust electric propulsion to modify their orbit. Physics-Informed Neural Networks (PINNs) are a valuable tool for these classes of satellites as they combine physics models with Deep Neural Networks (DNNs), which are highly expressive and versatile function approximators. By combining a physics model with a DNN, the machine learning model need not learn astrodynamics, which results in more efficient and effective utilization of machine learning resources. This paper details the application of PINNs to estimate the orbital state and a continuous, low-amplitude anomalous acceleration profile for satellites. The PINN is trained to learn the unknown acceleration by minimizing the mean square error of observations. We evaluate the performance of pure physics models with PINNs in terms of their observation residuals and their propagation accuracy beyond the fit span of the observations. For a two-day simulation of a GEO satellite using an unmodeled acceleration profile on the order of $10^{-8} \text{ km/s}^2$, the PINN outperformed the best-fit physics model by orders of magnitude for both observation residuals (123 arcsec vs 1.00 arcsec) as well as propagation accuracy (3860 km vs 164 km after five days).

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.18641 [pdf, other]

Improving Efficiency of Parallel Across the Method Spectral Deferred Corrections

Authors: Gayatri Čaklović, Thibaut Lunet, Sebastian Götschel, Daniel Ruprecht

Abstract: Parallel-across-the-method time integration can provide small scale parallelism when solving initial value problems. Spectral deferred corrections (SDC) with a diagonal sweeper, which is closely related to iterated Runge-Kutta methods proposed by Van der Houwen and Sommeijer, can use a number of threads equal to the number of quadrature nodes in the underlying collocation method. However, convergence speed, efficiency and stability depend critically on the coefficients used. Previous approaches have used numerical optimization to find good parameters. Instead, we propose an ansatz that allows one to find optimal parameters analytically. We show that the resulting parallel SDC methods provide stability domains and convergence order very similar to those of well established serial SDC variants. Using a model for computational cost that assumes 80% efficiency of an implementation of parallel SDC, we show that our variants are competitive with serial SDC, previously published parallel SDC coefficients as well as Picard iteration, explicit RKM-4 and an implicit fourth-order diagonally implicit Runge-Kutta method.

Submitted 27 March, 2024; originally announced March 2024.

Comments: 24 pages

MSC Class: 65R20; 45L05; 65L20 ACM Class: G.1.7; G.1.8

arXiv:2403.13515 [pdf, other]

Efficient numerical methods for the Maxey-Riley equations with Basset history term

Authors: Julio Urizarna-Carasa, Leon Schlegel, Daniel Ruprecht

Abstract: The Maxey-Riley equations (MRE) describe the motion of a finite-sized, spherical particle in a fluid. Because of wake effects, the force acting on a particle depends on its past trajectory. This is modelled by an integral term in the MRE, also called Basset force, that makes its numerical solution challenging and memory intensive. A recent approach proposed by Prasath, Vasan and Govindarajan exploits connections between the integral term and fractional derivatives to reformulate the MRE as a time-dependent partial differential equation on a semi-infinite pseudo-space. They also propose a numerical algorithm based on polynomial expansions. This paper develops a numerical approach based on finite differences instead, by adopting techniques by Koleva and by Fazio and Janelli to cope with the issues of having an unbounded spatial domain. For particles of varying size and density, we compare the convergence order and computational efficiency of the polynomial expansion by Prasath et al., our finite difference schemes and a direct integrator for the MRE based on multi-step methods proposed by Daitche.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.13454 [pdf, other]

Adaptive time step selection for Spectral Deferred Corrections

Authors: Thomas Baumann, Sebastian Götschel, Thibaut Lunet, Daniel Ruprecht, Robert Speck

Abstract: Spectral Deferred Corrections (SDC) is an iterative method for the numerical solution of ordinary differential equations. It works by refining the numerical solution for an initial value problem by approximately solving differential equations for the error, and can be interpreted as a preconditioned fixed-point iteration for solving the fully implicit collocation problem. We adopt techniques from embedded Runge-Kutta Methods (RKM) to SDC in order to provide a mechanism for adaptive time step size selection and thus increase computational efficiency of SDC. We propose two SDC-specific estimates of the local error that are generic and require only minimal problem specific tuning. We demonstrate a gain in efficiency over standard SDC with fixed step size, compare efficiency favorably against state-of-the-art adaptive RKM and show that due to its iterative nature, adaptive SDC can cope efficiently with silent data corruption.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 34 pages including references, 12 figures. Submitted to Springer Numerical Algorithms

MSC Class: 65Y05 ACM Class: G.1.0
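
The abstract above mentions borrowing adaptive step size selection from embedded Runge-Kutta methods. As a rough illustration only, the following sketch shows the textbook step size controller used with embedded error estimates; it is not taken from the paper, and the function name `new_step_size`, the safety factor and the clamping bounds are all assumptions chosen for the example.

```python
def new_step_size(dt, err, tol, order, safety=0.9, min_factor=0.2, max_factor=5.0):
    """Propose the next step size from a local error estimate.

    Hypothetical helper, not the paper's implementation. `order` is the
    order of the lower-order solution in the embedded pair, so the local
    error is assumed to scale like dt**(order + 1).
    """
    if err == 0.0:
        # No measurable error: grow the step as much as the clamp allows.
        return dt * max_factor
    # Scale the step so the predicted error matches the tolerance.
    factor = safety * (tol / err) ** (1.0 / (order + 1))
    # Clamp the change to avoid wild oscillations in the step size.
    return dt * min(max_factor, max(min_factor, factor))


def step_accepted(err, tol):
    """A step is accepted when the estimated error is within tolerance."""
    return err <= tol
```

A driver loop would typically recompute a rejected step with the smaller proposal and carry the proposal forward after an accepted step.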

arXiv:2303.03848 [pdf, other]

doi 10.1007/978-3-031-39698-4_44

Parareal with a physics-informed neural network as coarse propagator

Authors: Abdul Qadir Ibrahim, Sebastian Götschel, Daniel Ruprecht

Abstract: Parallel-in-time algorithms provide an additional layer of concurrency for the numerical integration of models based on time-dependent differential equations. Methods like Parareal, which parallelize across multiple time steps, rely on a computationally cheap and coarse integrator to propagate information forward in time, while a parallelizable expensive fine propagator provides accuracy. Typically, the coarse method is a numerical integrator using lower resolution, reduced order or a simplified model. Our paper proposes to use a physics-informed neural network (PINN) instead. We demonstrate for the Black-Scholes equation, a partial differential equation from computational finance, that Parareal with a PINN coarse propagator provides better speedup than a numerical coarse propagator. Training and evaluating a neural network are both tasks whose computing patterns are well suited for GPUs. By contrast, mesh-based algorithms with their low computational intensity struggle to perform well. We show that moving the coarse propagator PINN to a GPU while running the numerical fine propagator on the CPU further improves Parareal's single-node performance. This suggests that integrating machine learning techniques into parallel-in-time integration methods and exploiting their differences in computing patterns might offer a way to better utilize heterogeneous architectures.

Submitted 5 June, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

Comments: 13 pages, 7 figures

MSC Class: 65Y05; 68T07; 65M55

Journal ref: In: Cano, J., Dikaiakos, M.D., Papadopoulos, G.A., Pericàs, M., Sakellariou, R. (eds) Euro-Par 2023: Parallel Processing. Euro-Par 2023. Lecture Notes in Computer Science, vol 14100. Springer, Cham

arXiv:2203.16069 [pdf, other]

doi 10.1137/22M1487163

A unified analysis framework for iterative parallel-in-time algorithms

Authors: M. J. Gander, T. Lunet, D. Ruprecht, R. Speck

Abstract: Parallel-in-time integration has been the focus of intensive research efforts over the past two decades due to the advent of massively parallel computer architectures and the scaling limits of purely spatial parallelization. Various iterative parallel-in-time (PinT) algorithms have been proposed, like Parareal, PFASST, MGRIT, and Space-Time Multi-Grid (STMG). These methods have been described using different notations, and the convergence estimates that are available are difficult to compare. We describe Parareal, PFASST, MGRIT and STMG for the Dahlquist model problem using a common notation and give precise convergence estimates using generating functions. This allows us, for the first time, to directly compare their convergence. We prove that all four methods eventually converge super-linearly, and also compare them numerically. The generating function framework provides further opportunities to explore and analyze existing and new methods.

Submitted 28 April, 2023; v1 submitted 30 March, 2022; originally announced March 2022.

Journal ref: SIAM Journal on Scientific Computing 45(5), pp. A2275 - A2303, 2023

arXiv:2111.10228 [pdf, other]

Impact of spatial coarsening on Parareal convergence

Authors: Judith Angel, Sebastian Götschel, Daniel Ruprecht

Abstract: We study the impact of spatial coarsening on the convergence of the Parareal algorithm, both theoretically and numerically. For initial value problems with a normal system matrix, we prove a lower bound for the Euclidean norm of the iteration matrix. When there is no physical or numerical diffusion, an immediate consequence is that the norm of the iteration matrix cannot be smaller than unity as soon as the coarse problem has fewer degrees-of-freedom than the fine. This prevents a theoretical guarantee for monotonic convergence, which is necessary to obtain meaningful speedups. For diffusive problems, in the worst-case where the iteration error contracts only as fast as the powers of the iteration matrix norm, making Parareal as accurate as the fine method will take about as many iterations as there are processors, making meaningful speedup impossible. Numerical examples with a non-normal system matrix show that for diffusive problems good speedup is possible, but that for non-diffusive problems the negative impact of spatial coarsening on convergence is big.

Submitted 19 November, 2021; originally announced November 2021.

arXiv:2102.11670 [pdf, other]

doi 10.1007/978-3-030-75933-9_4

Twelve Ways To Fool The Masses When Giving Parallel-In-Time Results

Authors: Sebastian Goetschel, Michael Minion, Daniel Ruprecht, Robert Speck

Abstract: Getting good speedup -- let alone high parallel efficiency -- for parallel-in-time (PinT) integration examples can be frustratingly difficult. The high complexity and large number of parameters in PinT methods can easily (and unintentionally) lead to numerical experiments that overestimate the algorithm's performance. In the tradition of Bailey's article "Twelve ways to fool the masses when giving performance results on parallel computers", we discuss and demonstrate pitfalls to avoid when evaluating performance of PinT methods. Despite being written in a light-hearted tone, this paper is intended to raise awareness that there are many ways to unintentionally fool yourself and others and that by avoiding these fallacies more meaningful PinT performance results can be obtained.

Submitted 23 February, 2021; originally announced February 2021.

Journal ref: In: Ong B., Schroder J., Shipton J., Friedhoff S. (eds) Parallel-in-Time Integration Methods. PinT 2020. Springer Proceedings in Mathematics & Statistics, vol 356. Springer, Cham

arXiv:1912.05958 [pdf, other]

doi 10.1007/s00791-020-00327-0

Parareal with a Learned Coarse Model for Robotic Manipulation

Authors: Wisdom Agboh, Oliver Grainger, Daniel Ruprecht, Mehmet Dogar

Abstract: A key component of many robotics model-based planning and control algorithms is physics predictions, that is, forecasting a sequence of states given an initial state and a sequence of controls. This process is slow and a major computational bottleneck for robotics planning algorithms. Parallel-in-time integration methods can help to leverage parallel computing to accelerate physics predictions and thus planning. The Parareal algorithm iterates between a coarse serial integrator and a fine parallel integrator. A key challenge is to devise a coarse model that is computationally cheap but accurate enough for Parareal to converge quickly. Here, we investigate the use of a deep neural network physics model as a coarse model for Parareal in the context of robotic manipulation. In simulated experiments using the physics engine Mujoco as fine propagator we show that the learned coarse model leads to faster Parareal convergence than a coarse physics-based model. We further show that the learned coarse model allows Parareal to be applied to scenarios with multiple objects, where the physics-based coarse model is not applicable. Finally, we conduct experiments on a real robot and show that Parareal predictions are close to real-world physics predictions for robotic pushing of multiple objects. Videos are at https://youtu.be/wCh2o1rf-gA.

Submitted 19 June, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

Comments: Accepted to Computing and Visualization in Science (special issue on parallel-in-time)

Journal ref: Computing and Visualization in Science 23(8), 2020
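
Several abstracts in this listing rely on the Parareal iteration between a cheap serial coarse integrator and an accurate fine integrator that can run in parallel. The following is a minimal sketch of that iteration for the scalar test equation y' = lam * y, not the setup of any listed paper. The choice of backward Euler as coarse propagator, RK4 as fine propagator, and all function names are assumptions made for illustration, and the fine solves are run in a plain loop rather than in parallel.

```python
def coarse(y, dt, lam):
    """Coarse propagator (assumed): one backward Euler step over a time slice."""
    return y / (1.0 - lam * dt)


def fine(y, dt, lam, substeps=20):
    """Fine propagator (assumed): classical RK4 with several substeps per slice."""
    h = dt / substeps
    for _ in range(substeps):
        k1 = lam * y
        k2 = lam * (y + 0.5 * h * k1)
        k3 = lam * (y + 0.5 * h * k2)
        k4 = lam * (y + h * k3)
        y = y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def parareal(y0, t_end, n_slices, n_iter, lam):
    """Return the Parareal approximations at the slice boundaries."""
    dt = t_end / n_slices
    # Iteration 0: serial sweep with the coarse propagator only.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], dt, lam))
    for _ in range(n_iter):
        # These fine solves are independent and could run in parallel.
        f_old = [fine(u[n], dt, lam) for n in range(n_slices)]
        g_old = [coarse(u[n], dt, lam) for n in range(n_slices)]
        # Serial correction sweep: new coarse value plus fine-minus-coarse defect.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n], dt, lam) + f_old[n] - g_old[n])
        u = new
    return u
```

After as many iterations as there are time slices, the result matches the serial fine solution up to round-off, so useful speedup requires convergence in far fewer iterations.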

arXiv:1903.08470 [pdf, other]

doi 10.1007/978-3-030-95459-8_44

Combining Coarse and Fine Physics for Manipulation using Parallel-in-Time Integration

Authors: Wisdom C. Agboh, Daniel Ruprecht, Mehmet R. Dogar

Abstract: We present a method for fast and accurate physics-based predictions during non-prehensile manipulation planning and control. Given an initial state and a sequence of controls, the problem of predicting the resulting sequence of states is a key component of a variety of model-based planning and control algorithms. We propose combining a coarse (i.e. computationally cheap but not very accurate) predictive physics model, with a fine (i.e. computationally expensive but accurate) predictive physics model, to generate a hybrid model that is at the required speed and accuracy for a given manipulation task. Our approach is based on the Parareal algorithm, a parallel-in-time integration method used for computing numerical solutions for general systems of ordinary differential equations. We adapt Parareal to combine a coarse pushing model with an off-the-shelf physics engine to deliver physics-based predictions that are as accurate as the physics engine but run in substantially less wall-clock time, thanks to parallelization across time. We use these physics-based predictions in a model-predictive-control framework based on trajectory optimization, to plan pushing actions that avoid an obstacle and reach a goal location. We show that with hybrid physics models, we can achieve the same success rates as the planner that uses the off-the-shelf physics engine directly, but significantly faster. We present experiments in simulation and on a real robotic setup. Videos are available here: https://youtu.be/5e9oTeu4JOU

Submitted 30 August, 2019; v1 submitted 20 March, 2019; originally announced March 2019.

Comments: International Symposium on Robotics Research (ISRR), 2019

Journal ref: In: Asfour T., Yoshida E., Park J., Christensen H., Khatib O. (eds) Robotics Research. ISRR 2019. Springer Proceedings in Advanced Robotics, vol 20. Springer, Cham

arXiv:1812.08117 [pdf, other]

doi 10.1016/j.jcpx.2019.100036

An arbitrary order time-stepping algorithm for tracking particles in inhomogeneous magnetic fields

Authors: Krasymyr Tretiak, Daniel Ruprecht

Abstract: The Lorentz equations describe the motion of electrically charged particles in electric and magnetic fields and are used widely in plasma physics. The most popular numerical algorithm for solving them is the Boris method, a variant of the Störmer-Verlet algorithm. Boris' method is phase space volume conserving and simulated particles typically remain near the correct trajectory. However, it is only second order accurate. Therefore, in scenarios where it is not enough to know that a particle stays on the right trajectory but one needs to know where on the trajectory the particle is at a given time, Boris' method requires very small time steps to deliver accurate phase information, making it computationally expensive. We derive an improved version of the high-order Boris spectral deferred correction algorithm (Boris-SDC) by adopting a convergence acceleration strategy for second order problems based on the Generalised Minimum Residual (GMRES) method. Our new algorithm is easy to implement as it still relies on the standard Boris method. Like Boris-SDC it can deliver arbitrary order of accuracy through simple changes of runtime parameters but possesses better long-term energy stability. We demonstrate for two examples, a magnetic mirror trap and the Solov'ev equilibrium, that the new method can deliver better accuracy at lower computational cost compared to the standard Boris method. While our examples are motivated by tracking ions in the magnetic field of a nuclear fusion reactor, the introduced algorithm can potentially deliver similar improvements in efficiency for other applications.

Submitted 2 August, 2019; v1 submitted 19 December, 2018; originally announced December 2018.

Journal ref: Journal of Computational Physics: X 4, pp. 100036, 2019
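
For readers unfamiliar with the standard Boris method that the abstract above builds on, here is a minimal sketch of one Boris step in its common leapfrog form (half electric kick, magnetic rotation, half electric kick). It illustrates the classical second order scheme only, not the Boris-SDC or GMRES-accelerated variants of the paper; the function names, the staggered-velocity convention and the uniform-field usage are assumptions for the example.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def boris_step(x, v, E, B, q_over_m, dt):
    """Advance position x and velocity v by one Boris step (illustrative sketch)."""
    # First half of the electric acceleration.
    v_minus = [v[i] + 0.5 * q_over_m * dt * E[i] for i in range(3)]
    # Rotation around the magnetic field direction.
    t = [0.5 * q_over_m * dt * B[i] for i in range(3)]
    t2 = sum(ti * ti for ti in t)
    s = [2.0 * ti / (1.0 + t2) for ti in t]
    c1 = cross(v_minus, t)
    v_prime = [v_minus[i] + c1[i] for i in range(3)]
    c2 = cross(v_prime, s)
    v_plus = [v_minus[i] + c2[i] for i in range(3)]
    # Second half of the electric acceleration.
    v_new = [v_plus[i] + 0.5 * q_over_m * dt * E[i] for i in range(3)]
    # Position update with the new velocity.
    x_new = [x[i] + dt * v_new[i] for i in range(3)]
    return x_new, v_new


def speed(v):
    """Euclidean norm of a velocity vector."""
    return math.sqrt(sum(vi * vi for vi in v))
```

With a vanishing electric field the velocity update is a pure rotation, so the particle speed stays constant up to round-off regardless of the step size; the phase along the orbit, however, is only second order accurate.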

arXiv:1707.03581 [pdf, other]

doi 10.1007/s00466-018-1540-6

Toward transient finite element simulation of thermal deformation of machine tools in real-time

Authors: Andreas Naumann, Daniel Ruprecht, Joerg Wensch

Abstract: Finite element models without simplifying assumptions can accurately describe the spatial and temporal distribution of heat in machine tools as well as the resulting deformation. In principle, this allows correcting for displacements of the Tool Centre Point and enables high precision manufacturing. However, the computational cost of FEM models and restriction to generic algorithms in commercial tools like ANSYS prevent their operational use since simulations have to run faster than real-time. For the case where heat diffusion is slow compared to machine movement, we introduce a tailored implicit-explicit multi-rate time stepping method of higher order based on spectral deferred corrections. Using the open-source FEM library DUNE, we show that fully coupled simulations of the temperature field are possible in real-time for a machine consisting of a stock sliding up and down on rails attached to a stand.

Submitted 12 July, 2017; originally announced July 2017.

Journal ref: Computational Mechanics 62(5), pp. 929 - 942, 2018

arXiv:1705.06149 [pdf, other]

doi 10.1007/978-3-319-09063-4_2

Parallel-in-Space-and-Time Simulation of the Three-Dimensional, Unsteady Navier-Stokes Equations for Incompressible Flow

Authors: Roberto Croce, Daniel Ruprecht, Rolf Krause

Abstract: In this paper we combine the Parareal parallel-in-time method together with spatial parallelization and investigate this space-time parallel scheme by means of solving the three-dimensional incompressible Navier-Stokes equations. Parallelization of time stepping provides a new direction of parallelization and allows employing additional cores to further speed up simulations after spatial parallelization has saturated. We report on numerical experiments performed on a Cray XE6, simulating a driven cavity flow with and without obstacles. Distributed memory parallelization is used in both space and time, featuring up to 2,048 cores in total. It is confirmed that the space-time-parallel method can provide speedup beyond the saturation of the spatial parallelization.

Submitted 17 May, 2017; originally announced May 2017.

Journal ref: Modeling, Simulation and Optimization of Complex Processes - HPSC 2012, Springer International Publishing, pages 13-23, 2014

arXiv:1701.01359 [pdf, other]

doi 10.1007/s00791-018-0296-z

Wave propagation characteristics of Parareal

Authors: Daniel Ruprecht

Abstract: The paper derives and analyses the (semi-)discrete dispersion relation of the Parareal parallel-in-time integration method. It investigates Parareal's wave propagation characteristics with the aim to better understand what causes the well documented stability problems for hyperbolic equations. The analysis shows that the instability is caused by convergence of the amplification factor to the exact value from above for medium to high wave numbers. Phase errors in the coarse propagator are identified as the culprit, which suggests that specifically tailored coarse level methods could provide a remedy.

Submitted 14 October, 2017; v1 submitted 5 January, 2017; originally announced January 2017.

Journal ref: Computing and Visualization in Science 19(1), pp. 1- 17, 2018

arXiv:1510.08334 [pdf, other]

doi 10.1016/j.parco.2016.12.001

Toward fault-tolerant parallel-in-time integration with PFASST

Authors: Robert Speck, Daniel Ruprecht

Abstract: We introduce and analyze different strategies for the parallel-in-time integration method PFASST to recover from hard faults and subsequent data loss. Since PFASST stores solutions at multiple time steps on different processors, information from adjacent steps can be used to recover after a processor has failed. PFASST's multi-level hierarchy allows to use the coarse level for correcting the reconstructed solution, which can help to minimize overhead. A theoretical model is devised linking overhead to the number of additional PFASST iterations required for convergence after a fault. The potential efficiency of different strategies is assessed in terms of required additional iterations for examples of diffusive and advective type.

Submitted 31 May, 2016; v1 submitted 28 October, 2015; originally announced October 2015.

Journal ref: Parallel Computing 62, pp. 20 - 37, 2017

arXiv:1510.02237 [pdf, ps, other]

doi 10.1016/j.compfluid.2012.02.015

Explicit Parallel-in-time Integration of a Linear Acoustic-Advection System

Authors: Daniel Ruprecht, Rolf Krause

Abstract: The applicability of the Parareal parallel-in-time integration scheme for the solution of a linear, two-dimensional hyperbolic acoustic-advection system, which is often used as a test case for integration schemes for numerical weather prediction (NWP), is addressed. Parallel-in-time schemes are a possible way to increase, on the algorithmic level, the amount of parallelism, a requirement arising from the rapidly growing number of CPUs in high performance computer systems. A recently introduced modification of the "parallel implicit time-integration algorithm" could successfully solve hyperbolic problems arising in structural dynamics. It has later been cast into the framework of Parareal. The present paper adapts this modified Parareal and employs it for the solution of a hyperbolic flow problem, where the initial value problem solved in parallel arises from the spatial discretization of a partial differential equation by a finite difference method. It is demonstrated that the modified Parareal is stable and can produce reasonably accurate solutions while allowing for a noticeable reduction of the time-to-solution. The implementation relies on integration schemes already widely used in NWP (RK-3, partially split forward Euler, forward-backward). It is demonstrated that using an explicit partially split scheme for the coarse integrator allows to avoid the use of an implicit scheme while still achieving speedup.

Submitted 8 October, 2015; originally announced October 2015.

Journal ref: Computers & Fluids 59, pp. 72-83, 2012

arXiv:1509.06935 [pdf, other]

doi 10.1007/978-3-319-64203-1_48

Shared Memory Pipelined Parareal

Authors: Daniel Ruprecht

Abstract: For the parallel-in-time integration method Parareal, pipelining can be used to hide some of the cost of the serial correction step and improve its efficiency. The paper introduces a basic OpenMP implementation of pipelined Parareal and compares it to a standard MPI-based variant. Both versions yield almost identical runtimes, but, depending on the compiler, the OpenMP variant consumes about 7% less energy and has a significantly smaller memory footprint. However, its higher implementation complexity might make it difficult to use in legacy codes and in combination with spatial parallelisation.

Submitted 11 November, 2019; v1 submitted 23 September, 2015; originally announced September 2015.

MSC Class: 68W10; 65Y05; 68N19

Journal ref: In: Rivera F., Pena T., Cabaleiro J. (eds) Euro-Par 2017: Parallel Processing. Lecture Notes in Computer Science, vol 10417. Springer

arXiv:1509.04252 [pdf, other]

Parareal convergence for 2D unsteady flow around a cylinder

Authors: Andreas Kreienbuehl, Arne Naegel, Daniel Ruprecht, Andreas Vogel, Gabriel Wittum, Rolf Krause

Abstract: In this technical report we study the convergence of Parareal for 2D incompressible flow around a cylinder for different viscosities. Two methods are used as fine integrator: backward Euler and a fractional step method. It is found that Parareal converges better for the implicit Euler, likely because it under-resolves the fine-scale dynamics as a result of numerical diffusion.

Submitted 14 September, 2015; originally announced September 2015.

Comments: 16 pages, 7 figures

arXiv:1509.01572 [pdf, other]

doi 10.2140/camcos.2017.12.109

Time parallel gravitational collapse simulation

Authors: Andreas Kreienbuehl, Pietro Benedusi, Daniel Ruprecht, Rolf Krause

Abstract: This article demonstrates the applicability of the parallel-in-time method Parareal to the numerical solution of the Einstein gravity equations for the spherical collapse of a massless scalar field. To account for the shrinking of the spatial domain in time, a tailored load balancing scheme is proposed and compared to load balancing based on number of time steps alone. The performance of Parareal is studied for both the sub-critical and black hole case; our experiments show that Parareal generates substantial speedup and, in the super-critical regime, can reproduce Choptuik's black hole mass scaling law.

Submitted 28 December, 2016; v1 submitted 4 September, 2015; originally announced September 2015.

Comments: 16 pages, 8 figures, 1 listing, and 1 table

Journal ref: Communications in Applied Mathematics and Computational Science 12-1 (2017), 109--128

arXiv:1502.03645 [pdf, other]

doi 10.1007/s00791-015-0246-y

Numerical simulation of skin transport using Parareal

Authors: Andreas Kreienbuehl, Arne Naegel, Daniel Ruprecht, Robert Speck, Gabriel Wittum, Rolf Krause

Abstract: In-silico investigation of skin permeation is an important but also computationally demanding problem. To resolve all scales involved in full detail will not only require exascale computing capacities but also suitable parallel algorithms. This article investigates the applicability of the time-parallel Parareal algorithm to a brick and mortar setup, a precursory problem to skin permeation. The C++ library Lib4PrM implementing Parareal is combined with the UG4 simulation framework, which provides the spatial discretization and parallelization. The combination's performance is studied with respect to convergence and speedup. It is confirmed that anisotropies in the domain and jumps in diffusion coefficients only have a minor impact on Parareal's convergence. The influence of load imbalances in time due to differences in number of iterations required by the spatial solver as well as spatio-temporal weak scaling is discussed.

Submitted 27 July, 2015; v1 submitted 12 February, 2015; originally announced February 2015.

Comments: 11 pages, 8 figures

Journal ref: Computing and Visualization in Science 17(2), pp. 99-108, 2015

arXiv:1409.8563 [pdf, other]

doi 10.1016/j.amc.2014.12.055

A stencil-based implementation of Parareal in the C++ domain specific embedded language STELLA

Authors: Andrea Arteaga, Daniel Ruprecht, Rolf Krause

Abstract: In view of the rapid rise of the number of cores in modern supercomputers, time-parallel methods that introduce concurrency along the temporal axis are becoming increasingly popular. For the solution of time-dependent partial differential equations, these methods can add another direction for concurrency on top of spatial parallelization. The paper presents an implementation of the time-parallel Parareal method in a C++ domain specific language for stencil computations (STELLA). STELLA provides both an OpenMP and a CUDA backend for a shared memory parallelization, using the CPU or GPU inside a node for the spatial stencils. Here, we intertwine this node-wise spatial parallelism with the time-parallel Parareal. This is done by adding an MPI-based implementation of Parareal, which allows us to parallelize in time across nodes. The performance of Parareal with both backends is analyzed in terms of speedup, parallel efficiency and energy-to-solution for an advection-diffusion problem with a time-dependent diffusion coefficient.

Submitted 3 December, 2014; v1 submitted 30 September, 2014; originally announced September 2014.

Journal ref: Applied Mathematics and Computation 267, pp. 727-741, 2015

arXiv:1407.6486 [pdf, other]

doi 10.1137/14097536X

Interweaving PFASST and Parallel Multigrid

Authors: Michael Minion, Robert Speck, Matthias Bolten, Matthew Emmett, Daniel Ruprecht

Abstract: The parallel full approximation scheme in space and time (PFASST) introduced by Emmett and Minion in 2012 is an iterative strategy for the temporal parallelization of ODEs and discretized PDEs. As the name suggests, PFASST is similar in spirit to a space-time FAS multigrid method performed over multiple time-steps in parallel. However, since the original focus of PFASST has been on the performance of the method in terms of time parallelism, the solution of any spatial system arising from the use of implicit or semi-implicit temporal methods within PFASST have simply been assumed to be solved to some desired accuracy completely at each sub-step and each iteration by some unspecified procedure. It hence is natural to investigate how iterative solvers in the spatial dimensions can be interwoven with the PFASST iterations and whether this strategy leads to a more efficient overall approach. This paper presents an initial investigation on the relative performance of different strategies for coupling PFASST iterations with multigrid methods for the implicit treatment of diffusion terms in PDEs. In particular, we compare full accuracy multigrid solves at each sub-step with a small fixed number of multigrid V-cycles. This reduces the cost of each PFASST iteration at the possible expense of a corresponding increase in the number of PFASST iterations needed for convergence. Parallel efficiency of the resulting methods is explored through numerical examples.

Submitted 30 March, 2015; v1 submitted 24 July, 2014; originally announced July 2014.

Journal ref: SIAM Journal on Scientific Computing 37(5), pp. S244-S263, 2015

arXiv:1311.4588 [pdf, ps, other]

doi 10.1007/978-3-319-10705-9_19

Convergence of Parareal for the Navier-Stokes equations depending on the Reynolds number

Authors: Johannes Steiner, Daniel Ruprecht, Robert Speck, Rolf Krause

Abstract: The paper presents first a linear stability analysis for the time-parallel Parareal method, using an IMEX Euler as coarse and a Runge-Kutta-3 method as fine propagator, confirming that dominant imaginary eigenvalues negatively affect Parareal's convergence. This suggests that when Parareal is applied to the nonlinear Navier-Stokes equations, problems for small viscosities could arise. Numerical results for a driven cavity benchmark are presented, confirming that Parareal's convergence can indeed deteriorate as viscosity decreases and the flow becomes increasingly dominated by convection. The effect is found to strongly depend on the spatial resolution.

Submitted 15 October, 2014; v1 submitted 18 November, 2013; originally announced November 2013.

Journal ref: Lecture Notes in Computational Science and Engineering 103, Springer International Publishing, pages 195 - 202, 2015

arXiv:1307.7867 [pdf, other]

doi 10.3233/978-1-61499-381-0-263

A space-time parallel solver for the three-dimensional heat equation

Authors: Robert Speck, Daniel Ruprecht, Matthew Emmett, Matthias Bolten, Rolf Krause

Abstract: The paper presents a combination of the time-parallel "parallel full approximation scheme in space and time" (PFASST) with a parallel multigrid method (PMG) in space, resulting in a mesh-based solver for the three-dimensional heat equation with a uniquely high degree of efficient concurrency. Parallel scaling tests are reported on the Cray XE6 machine "Monte Rosa" on up to 16,384 cores and on the IBM Blue Gene/Q system "JUQUEEN" on up to 65,536 cores. The efficacy of the combined spatial- and temporal parallelization is shown by demonstrating that using PFASST in addition to PMG significantly extends the strong-scaling limit. Implications of using spatial coarsening strategies in PFASST's multi-level hierarchy in large-scale parallel simulations are discussed.

Submitted 14 July, 2014; v1 submitted 30 July, 2013; originally announced July 2013.

Comments: 10 pages

Journal ref: Advances in Parallel Computing 25, IOS Press, pages 263 - 272, 2014

Showing 1–28 of 28 results for author: Ruprecht, D