Showing 1–7 of 7 results for author: Wahlgren, J

  1. arXiv:2407.07850

    cs.DC

    Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper

    Authors: Gabin Schieffer, Jacob Wahlgren, Jie Ren, Jennifer Faj, Ivy Peng

    Abstract: Memory management across discrete CPU and GPU physical memory is traditionally achieved through explicit GPU allocations and data copy or unified virtual memory. The Grace Hopper Superchip, for the first time, supports an integrated CPU-GPU system page table, hardware-level addressing of system allocated memory, and cache-coherent NVLink-C2C interconnect, bringing an alternative solution for enabl…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted to ICPP '24 (The 53rd International Conference on Parallel Processing)

  2. arXiv:2406.11760

    cs.DC

    Understanding Layered Portability from HPC to Cloud in Containerized Environments

    Authors: Daniel Medeiros, Gabin Schieffer, Jacob Wahlgren, Ivy Peng

    Abstract: Recent development in lightweight OS-level virtualization, containers, provides a potential solution for running HPC applications on the cloud platform. In this work, we focus on the impact of different layers in a containerized environment when migrating HPC containers from a dedicated HPC system to a cloud platform. On three ARM-based platforms, including the latest Nvidia Grace CPU, we use six…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Submitted to ISC Workshop - Workshop on Converged Computing '24, preprint

  3. A Quantitative Approach for Adopting Disaggregated Memory in HPC Systems

    Authors: Jacob Wahlgren, Gabin Schieffer, Maya Gokhale, Ivy Peng

    Abstract: Memory disaggregation has recently been adopted in data centers to improve resource utilization, motivated by cost and sustainability. Recent studies on large-scale HPC facilities have also highlighted memory underutilization. A promising and non-disruptive option for memory disaggregation is rack-scale memory pooling, where shared memory pools supplement node-local memory. This work outlines the…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted to SC23 (The International Conference for High Performance Computing, Networking, Storage, and Analysis 2023)

  4. arXiv:2308.00763

    cs.DC

    Boosting the Performance of Object Tracking with a Half-Precision Particle Filter on GPU

    Authors: Gabin Schieffer, Nattawat Pornthisan, Daniel Araújo de Medeiros, Stefano Markidis, Jacob Wahlgren, Ivy Peng

    Abstract: High-performance GPU-accelerated particle filter methods are critical for object detection applications, ranging from autonomous driving, robot localization, to time-series prediction. In this work, we investigate the design, development and optimization of particle-filter using half-precision on CUDA cores and compare their performance and accuracy with single- and double-precision baselines on N…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 12 pages, 8 figures, conference. To be published in The 21st International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (HeteroPar2023)

  5. arXiv:2307.14860

    cs.PF

    Quantum Computer Simulations at Warp Speed: Assessing the Impact of GPU Acceleration

    Authors: Jennifer Faj, Ivy Peng, Jacob Wahlgren, Stefano Markidis

    Abstract: Quantum computer simulators are crucial for the development of quantum computing. In this work, we investigate the suitability and performance impact of GPU and multi-GPU systems on a widely used simulation tool - the state vector simulator Qiskit Aer. In particular, we evaluate the performance of both Qiskit's default Nvidia Thrust backend and the recent Nvidia cuQuantum backend on Nvidia A100 GP…

    Submitted 27 July, 2023; originally announced July 2023.

  6. Evaluating Emerging CXL-enabled Memory Pooling for HPC Systems

    Authors: Jacob Wahlgren, Maya Gokhale, Ivy B. Peng

    Abstract: Current HPC systems provide memory resources that are statically configured and tightly coupled with compute nodes. However, workloads on HPC systems are evolving. Diverse workloads lead to a need for configurable memory resources to achieve high performance and utilization. In this study, we evaluate a memory subsystem design leveraging CXL-enabled memory pooling. Two promising use cases of compo…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: 10 pages, 13 figures. Accepted for publication in Workshop on Memory Centric High Performance Computing (MCHPC'22) at SC22

  7. arXiv:2207.07098

    cs.MS cs.CE cs.DC physics.flu-dyn

    Large-Scale Direct Numerical Simulations of Turbulence Using GPUs and Modern Fortran

    Authors: Martin Karp, Daniele Massaro, Niclas Jansson, Alistair Hart, Jacob Wahlgren, Philipp Schlatter, Stefano Markidis

    Abstract: We present our approach to making direct numerical simulations of turbulence with applications in sustainable shipping. We use modern Fortran and the spectral element method to leverage and scale on supercomputers powered by the Nvidia A100 and the recent AMD Instinct MI250X GPUs, while still providing support for user software developed in Fortran. We demonstrate the efficiency of our approach by…

    Submitted 23 June, 2022; originally announced July 2022.

    Comments: 13 pages, 7 figures

    ACM Class: G.4; J.2