Showing 1–11 of 11 results for author: Daiß, G

  1. arXiv:2407.00026

    cs.DC

    Distributed astrophysics simulations using Octo-Tiger with RISC-V CPUs using HPX and Kokkos

    Authors: Patrick Diehl, Gregor Daiß, Steven R. Brandt, Alireza Kheirkhahan, Srinivas Yadav Singanaboina, Dominic Marcello, Chris Taylor, John Leidel, Hartmut Kaiser

    Abstract: In recent years, interest in RISC-V computing architectures has moved from academic to mainstream, especially in the field of High Performance Computing where energy limitations are increasingly a point of concern. The results presented in this paper are part of a longer-term evaluation of RISC-V's viability for HPC applications. In this work, we use the Octo-Tiger multi-physics, multi-scale, 3D…

    Submitted 10 May, 2024; originally announced July 2024.

  2. HPX with Spack and Singularity Containers: Evaluating Overheads for HPX/Kokkos using an astrophysics application

    Authors: Patrick Diehl, Steven R. Brandt, Gregor Daiß, Hartmut Kaiser

    Abstract: Cloud computing for high performance computing resources is an emerging topic. This service is of interest to researchers who care about reproducible computing, for software packages with complex installations, and for companies or researchers who need the compute resources only occasionally or do not want to run and maintain a supercomputer on their own. The connection between HPC and containers…

    Submitted 7 May, 2024; v1 submitted 11 February, 2024; originally announced May 2024.

  3. Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger

    Authors: Patrick Diehl, Gregor Daiss, Steven R. Brandt, Alireza Kheirkhahan, Hartmut Kaiser, Christopher Taylor, John Leidel

    Abstract: In recent years, computers based on the RISC-V architecture have raised broad interest in the high-performance computing (HPC) community. As the RISC-V community develops the core instruction set architecture (ISA) along with ISA extensions, the HPC community has been actively ensuring HPC applications and environments are supported. In this context, assessing the performance of asynchronous many-…

    Submitted 17 August, 2023; originally announced September 2023.

  4. Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku

    Authors: Patrick Diehl, Gregor Daiß, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Dirk Pflüger

    Abstract: The increasing availability of machines relying on non-GPU architectures, such as ARM A64FX in high-performance computing, provides a set of interesting challenges to application developers. In addition to requiring code portability across different parallelization schemes, programs targeting these architectures have to be highly adaptable in terms of compute kernel sizes to accommodate different…

    Submitted 15 March, 2023; originally announced April 2023.

  5. Stellar Mergers with HPX-Kokkos and SYCL: Methods of using an Asynchronous Many-Task Runtime System with SYCL

    Authors: Gregor Daiß, Patrick Diehl, Hartmut Kaiser, Dirk Pflüger

    Abstract: Ranging from NVIDIA GPUs to AMD GPUs and Intel GPUs: Given the heterogeneity of available accelerator cards within current supercomputers, portability is a key aspect for modern HPC applications. In Octo-Tiger, we rely on Kokkos and its various execution spaces for portable compute kernels. In turn, we use HPX to coordinate kernel launches, CPU tasks, and communication. This combination allows us…

    Submitted 8 May, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

  6. From Merging Frameworks to Merging Stars: Experiences using HPX, Kokkos and SIMD Types

    Authors: Gregor Daiß, Srinivas Yadav Singanaboina, Patrick Diehl, Hartmut Kaiser, Dirk Pflüger

    Abstract: Octo-Tiger, a large-scale 3D AMR code for the merger of stars, uses a combination of HPX, Kokkos and explicit SIMD types, aiming to achieve performance-portability for a broad range of heterogeneous hardware. However, on A64FX CPUs, we encountered several missing pieces, hindering performance by causing problems with the SIMD vectorization. Therefore, we add std::experimental::simd as an option to…

    Submitted 8 May, 2023; v1 submitted 26 September, 2022; originally announced October 2022.

  7. From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels

    Authors: Gregor Daiß, Patrick Diehl, Dominic Marcello, Alireza Kheirkhahan, Hartmut Kaiser, Dirk Pflüger

    Abstract: Meeting both scalability and performance portability requirements is a challenge for any HPC application, especially for adaptively refined ones. In Octo-Tiger, an astrophysics application for the simulation of stellar mergers, we approach this with existing solutions: We employ HPX to obtain fine-grained tasks to easily distribute work and finely overlap communication and computation. For the com…

    Submitted 4 March, 2023; v1 submitted 26 September, 2022; originally announced October 2022.

  8. arXiv:2210.06437

    cs.DC

    Distributed, combined CPU and GPU profiling within HPX using APEX

    Authors: Patrick Diehl, Gregor Daiss, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Geoffrey C. Clayton, Dirk Pflueger

    Abstract: Benchmarking and comparing performance of a scientific simulation across hardware platforms is a complex task. When the simulation in question is constructed with an asynchronous, many-task (AMT) runtime offloading work to GPUs, the task becomes even more complex. In this paper, we discuss the use of a uniquely suited performance measurement library, APEX, to capture the performance behavior of a…

    Submitted 21 September, 2022; originally announced October 2022.

  9. Octo-Tiger's New Hydro Module and Performance Using HPX+CUDA on ORNL's Summit

    Authors: Patrick Diehl, Gregor Daiß, Dominic Marcello, Kevin Huck, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Dirk Pflüger

    Abstract: Octo-Tiger is a code for modeling three-dimensional self-gravitating astrophysical fluids. It was particularly designed for the study of dynamical mass transfer between interacting binary stars. Octo-Tiger is parallelized for distributed systems using the asynchronous many-task runtime system, the C++ standard library for parallelism and concurrency (HPX) and utilizes CUDA for its gravity solver.…

    Submitted 26 July, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Cluster

  10. Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application

    Authors: Patrick Diehl, Dominic Marcello, Parsa Amini, Hartmut Kaiser, Sagiv Shiber, Geoffrey C. Clayton, Juhan Frank, Gregor Daiß, Dirk Pflüger, David Eder, Alice Koniges, Kevin Huck

    Abstract: Analyzing performance within asynchronous many-task-based runtime systems is challenging because millions of tasks are launched concurrently. Especially for long-term runs the amount of data collected becomes overwhelming. We study HPX and its performance-counter framework and APEX to collect performance data and energy consumption. We added HPX application-specific performance counters to the Oct…

    Submitted 9 June, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

  11. From Piz Daint to the Stars: Simulation of Stellar Mergers using High-Level Abstractions

    Authors: Gregor Daiß, Parsa Amini, John Biddiscombe, Patrick Diehl, Juhan Frank, Kevin Huck, Hartmut Kaiser, Dominic Marcello, David Pfander, Dirk Pflüger

    Abstract: We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite volume grid-based hydrodynamics simulation code with Adaptive Mesh Refinement which is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems,…

    Submitted 9 August, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: Accepted at SC19