Showing 1–13 of 13 results for author: Huck, K

  1. arXiv:2406.19058  [pdf, other]

    physics.comp-ph cs.DC cs.PF physics.plasm-ph

    Understanding the Impact of openPMD on BIT1, a Particle-in-Cell Monte Carlo Code, through Instrumentation, Monitoring, and In-Situ Analysis

    Authors: Jeremy J. Williams, Stefan Costea, Allen D. Malony, David Tskhakaya, Leon Kos, Ales Podolnik, Jakub Hromadka, Kevin Huck, Erwin Laure, Stefano Markidis

    Abstract: Particle-in-Cell Monte Carlo simulations on large-scale systems play a fundamental role in understanding the complexities of plasma dynamics in fusion devices. Efficient handling and analysis of vast datasets are essential for advancing these simulations. Previously, we addressed this challenge by integrating openPMD with BIT1, a Particle-in-Cell Monte Carlo code, streamlining data streaming and s…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted by the Euro-Par 2024 workshops (PHYSHPC 2024); prepared in the standardized Springer LNCS format; 12 pages, including the main text, references, and figures

  2. arXiv:2401.16971  [pdf, other]

    cs.DC

    Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations

    Authors: Francieli Boito, Jim Brandt, Valeria Cardellini, Philip Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin Huck, Thomas Ilsche, Thomas Jakobsche, Terry Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Ivy Peng, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael Showerman, Devesh Tiwari, et al. (2 additional authors not shown)

    Abstract: Many High Performance Computing (HPC) facilities have developed and deployed frameworks in support of continuous monitoring and operational data analytics (MODA) to help improve efficiency and throughput. Because of the complexity and scale of systems and workflows and the need for low-latency response to address dynamic circumstances, automated feedback and response have the potential to be more…

    Submitted 30 January, 2024; originally announced January 2024.

  3. arXiv:2304.11205  [pdf, ps, other]

    cs.DC

    STaKTAU: profiling HPC applications' operating system usage

    Authors: Camille Coti, Kevin Huck, Allen D. Malony

    Abstract: This paper presents an approach for measuring the time spent by HPC applications in the operating system's kernel. We use the SystemTap interface to insert timers before and after system calls, and take advantage of its stability to design a tool that can be used with multiple versions of the kernel. We evaluate its performance overhead, using an OS-intensive mini-benchmark and a raytracing mini ap…

    Submitted 21 April, 2023; originally announced April 2023.

  4. Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku

    Authors: Patrick Diehl, Gregor Daiß, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Dirk Pflüger

    Abstract: The increasing availability of machines relying on non-GPU architectures, such as ARM A64FX in high-performance computing, provides a set of interesting challenges to application developers. In addition to requiring code portability across different parallelization schemes, programs targeting these architectures have to be highly adaptable in terms of compute kernel sizes to accommodate different…

    Submitted 15 March, 2023; originally announced April 2023.

  5. arXiv:2210.06437  [pdf, other]

    cs.DC

    Distributed, combined CPU and GPU profiling within HPX using APEX

    Authors: Patrick Diehl, Gregor Daiss, Kevin Huck, Dominic Marcello, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Geoffrey C. Clayton, Dirk Pflueger

    Abstract: Benchmarking and comparing performance of a scientific simulation across hardware platforms is a complex task. When the simulation in question is constructed with an asynchronous, many-task (AMT) runtime offloading work to GPUs, the task becomes even more complex. In this paper, we discuss the use of a uniquely suited performance measurement library, APEX, to capture the performance behavior of a…

    Submitted 21 September, 2022; originally announced October 2022.

  6. arXiv:2208.00109  [pdf, other]

    cs.HC

    Traveler: Navigating Task Parallel Traces for Performance Analysis

    Authors: Sayef Azad Sakin, Alex Bigelow, R. Tohid, Connor Scully-Allison, Carlos Scheidegger, Steven R. Brandt, Christopher Taylor, Kevin A. Huck, Hartmut Kaiser, Katherine E. Isaacs

    Abstract: Understanding the behavior of software in execution is a key step in identifying and fixing performance issues. This is especially important in high performance computing contexts where even minor performance tweaks can translate into large savings in terms of computational resource use. To aid performance analysis, developers may collect an execution trace - a chronological log of program activit…

    Submitted 3 September, 2022; v1 submitted 29 July, 2022; originally announced August 2022.

    Comments: IEEE VIS 2022

  7. Octo-Tiger's New Hydro Module and Performance Using HPX+CUDA on ORNL's Summit

    Authors: Patrick Diehl, Gregor Daiß, Dominic Marcello, Kevin Huck, Sagiv Shiber, Hartmut Kaiser, Juhan Frank, Dirk Pflüger

    Abstract: Octo-Tiger is a code for modeling three-dimensional self-gravitating astrophysical fluids. It was particularly designed for the study of dynamical mass transfer between interacting binary stars. Octo-Tiger is parallelized for distributed systems using the asynchronous many-task runtime system, the C++ standard library for parallelism and concurrency (HPX) and utilizes CUDA for its gravity solver.…

    Submitted 26 July, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Cluster

  8. arXiv:2105.00027  [pdf, other]

    cs.DC cond-mat.mtrl-sci cond-mat.str-el cond-mat.supr-con

    Memory Reduction using a Ring Abstraction over GPU RDMA for Distributed Quantum Monte Carlo Solver

    Authors: Weile Wei, Eduardo D'Azevedo, Kevin Huck, Arghya Chatterjee, Oscar Hernandez, Hartmut Kaiser

    Abstract: Scientific applications that run on leadership computing facilities often face the challenge of being unable to fit leading science cases onto accelerator devices due to memory constraints (memory-bound applications). In this work, the authors studied one such US Department of Energy mission-critical condensed matter physics application, Dynamical Cluster Approximation (DCA++), and this paper disc…

    Submitted 13 May, 2021; v1 submitted 30 April, 2021; originally announced May 2021.

  9. Performance Measurements within Asynchronous Task-based Runtime Systems: A Double White Dwarf Merger as an Application

    Authors: Patrick Diehl, Dominic Marcello, Parsa Amini, Hartmut Kaiser, Sagiv Shiber, Geoffrey C. Clayton, Juhan Frank, Gregor Daiß, Dirk Pflüger, David Eder, Alice Koniges, Kevin Huck

    Abstract: Analyzing performance within asynchronous many-task-based runtime systems is challenging because millions of tasks are launched concurrently. Especially for long-term runs the amount of data collected becomes overwhelming. We study HPX and its performance-counter framework and APEX to collect performance data and energy consumption. We added HPX application-specific performance counters to the Oct…

    Submitted 9 June, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

  10. arXiv:2010.07098  [pdf, other]

    cs.DC cond-mat.mtrl-sci cond-mat.str-el cond-mat.supr-con

    Performance Analysis of a Quantum Monte Carlo Application on Multiple Hardware Architectures Using the HPX Runtime

    Authors: Weile Wei, Arghya Chatterjee, Kevin Huck, Oscar Hernandez, Hartmut Kaiser

    Abstract: This paper describes how we successfully used the HPX programming model to port the DCA++ application on multiple architectures that include POWER9, x86, ARM v8, and NVIDIA GPUs. We describe the lessons we can learn from this experience as well as the benefits of enabling the HPX in the application to improve the CPU threading part of the code, which led to an overall 21% improvement across archit…

    Submitted 19 October, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

  11. arXiv:2008.13742  [pdf, other]

    cs.DC cs.PF

    Chimbuko: A Workflow-Level Scalable Performance Trace Analysis Tool

    Authors: Sungsoo Ha, Wonyong Jeong, Gyorgy Matyasfalvi, Cong Xie, Kevin Huck, Jong Youl Choi, Abid Malik, Li Tang, Hubertus Van Dam, Line Pouchard, Wei Xu, Shinjae Yoo, Nicholas D'Imperio, Kerstin Kleese Van Dam

    Abstract: Because of the limits input/output systems currently impose on high-performance computing systems, a new generation of workflows that include online data reduction and analysis is emerging. Diagnosing their performance requires sophisticated performance analysis capabilities due to the complexity of execution patterns and underlying hardware, and no tool could handle the voluminous performance tra…

    Submitted 31 August, 2020; originally announced August 2020.

  12. From Piz Daint to the Stars: Simulation of Stellar Mergers using High-Level Abstractions

    Authors: Gregor Daiß, Parsa Amini, John Biddiscombe, Patrick Diehl, Juhan Frank, Kevin Huck, Hartmut Kaiser, Dominic Marcello, David Pfander, Dirk Pflüger

    Abstract: We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite volume grid-based hydrodynamics simulation code with Adaptive Mesh Refinement which is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems,…

    Submitted 9 August, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: Accepted at SC19

  13. Asynchronous Execution of Python Code on Task Based Runtime Systems

    Authors: R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, Steven Brandt, Hartmut Kaiser

    Abstract: Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and artificial intelligence (AI), from utilizing performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenienc…

    Submitted 22 October, 2018; v1 submitted 17 October, 2018; originally announced October 2018.