Showing 1–2 of 2 results for author: Lippuner, J

Search v0.5.6 released 2020-02-24

arXiv:2202.12309

cs.DC astro-ph.IM

doi 10.1177/10943420221143775

Parthenon -- a performance portable block-structured adaptive mesh refinement framework

Authors: Philipp Grete, Joshua C. Dolence, Jonah M. Miller, Joshua Brown, Ben Ryan, Andrew Gaspar, Forrest Glines, Sriram Swaminarayan, Jonas Lippuner, Clell J. Solomon, Galen Shipman, Christoph Junghans, Daniel Holladay, James M. Stone, Luke F. Roberts

Abstract: On the path to exascale the landscape of computer device architectures and corresponding programming models has become much more diverse. While various low-level performance portable programming models are available, support at the application level lacks behind. To address this issue, we present the performance portable block-structured adaptive mesh refinement (AMR) framework Parthenon, derived from the well-tested and widely used Athena++ astrophysical magnetohydrodynamics code, but generalized to serve as the foundation for a variety of downstream multi-physics codes. Parthenon adopts the Kokkos programming model, and provides various levels of abstractions from multi-dimensional variables, to packages defining and separating components, to launching of parallel compute kernels. Parthenon allocates all data in device memory to reduce data movement, supports the logical packing of variables and mesh blocks to reduce kernel launch overhead, and employs one-sided, asynchronous MPI calls to reduce communication overhead in multi-node simulations. Using a hydrodynamics miniapp, we demonstrate weak and strong scaling on various architectures including AMD and NVIDIA GPUs, Intel and AMD x86 CPUs, IBM Power9 CPUs, as well as Fujitsu A64FX CPUs. At the largest scale on Frontier (the first TOP500 exascale machine), the miniapp reaches a total of $1.7\times10^{13}$ zone-cycles/s on 9,216 nodes (73,728 logical GPUs) at ~92% weak scaling parallel efficiency (starting from a single node). In combination with being an open, collaborative project, this makes Parthenon an ideal framework to target exascale simulations in which the downstream developers can focus on their specific application rather than on the complexity of handling massively-parallel, device-accelerated AMR.

Submitted 21 November, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

Comments: 17 pages, 11 figures, accepted for publication in IJHPCA, Codes available at https://github.com/parthenon-hpc-lab

Report number: LA-UR-22-21270

arXiv:1609.00098

astro-ph.HE cs.DC gr-qc physics.comp-ph

doi 10.1016/j.jcp.2016.12.059

SpECTRE: A Task-based Discontinuous Galerkin Code for Relativistic Astrophysics

Authors: Lawrence E. Kidder, Scott E. Field, Francois Foucart, Erik Schnetter, Saul A. Teukolsky, Andy Bohn, Nils Deppe, Peter Diener, François Hébert, Jonas Lippuner, Jonah Miller, Christian D. Ott, Mark A. Scheel, Trevor Vincent

Abstract: We introduce a new relativistic astrophysics code, SpECTRE, that combines a discontinuous Galerkin method with a task-based parallelism model. SpECTRE's goal is to achieve more accurate solutions for challenging relativistic astrophysics problems such as core-collapse supernovae and binary neutron star mergers. The robustness of the discontinuous Galerkin method allows for the use of high-resolution shock capturing methods in regions where (relativistic) shocks are found, while exploiting high-order accuracy in smooth regions. A task-based parallelism model allows efficient use of the largest supercomputers for problems with a heterogeneous workload over disparate spatial and temporal scales. We argue that the locality and algorithmic structure of discontinuous Galerkin methods will exhibit good scalability within a task-based parallelism framework. We demonstrate the code on a wide variety of challenging benchmark problems in (non)-relativistic (magneto)-hydrodynamics. We demonstrate the code's scalability including its strong scaling on the NCSA Blue Waters supercomputer up to the machine's full capacity of 22,380 nodes using 671,400 threads.

Submitted 21 July, 2017; v1 submitted 31 August, 2016; originally announced September 2016.

Comments: 41 pages, 13 figures, and 7 tables. Ancillary data contains simulation input files

Journal ref: Journal of Computational Physics, Volume 335, 2017, Pages 84-114
