Showing 1–4 of 4 results for author: Matthes, A

Searching in archive cs.

  1. LLAMA: The Low-Level Abstraction For Memory Access

    Authors: Bernhard Manfred Gruber, Guilherme Amadio, Jakob Blomer, Alexander Matthes, René Widera, Michael Bussmann

    Abstract: The performance gap between CPU and memory widens continuously. Choosing the best memory layout for each hardware architecture is increasingly important as more and more programs become memory bound. For portable codes that run across heterogeneous hardware architectures, the choice of the memory layout for data structures is ideally decoupled from the rest of a program. This can be accomplished v…

    Submitted 9 March, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: 39 pages, 10 figures, 11 listings

    Journal ref: Softw Pract Exper. 2022; 1-27

  2. Tuning and optimization for a variety of many-core architectures without changing a single line of implementation code using the Alpaka library

    Authors: Alexander Matthes, René Widera, Erik Zenker, Benjamin Worpitz, Axel Huebl, Michael Bussmann

    Abstract: We present an analysis on optimizing performance of a single C++11 source code using the Alpaka hardware abstraction library. For this we use the general matrix multiplication (GEMM) algorithm in order to show that compilers can optimize Alpaka code effectively when tuning key parameters of the algorithm. We do not intend to rival existing, highly optimized DGEMM versions, but merely choose this e…

    Submitted 30 June, 2017; originally announced June 2017.

    Comments: Accepted paper for the P^3MA workshop at the ISC 2017 in Frankfurt

    Journal ref: J.M. Kunkel et al. (Eds.): ISC High Performance Workshops 2017, LNCS 10524, pp. 496-514, 2017

  3. On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective

    arXiv:1706.00522 (cs.PF; physics.comp-ph)

    Authors: Axel Huebl, Rene Widera, Felix Schmitt, Alexander Matthes, Norbert Podhorszki, Jong Youl Choi, Scott Klasky, Michael Bussmann

    Abstract: We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threa…

    Submitted 1 June, 2017; originally announced June 2017.

    Comments: 15 pages, 5 figures, accepted for DRBSD-1 in conjunction with ISC'17

    ACM Class: D.4.8; B.4.3; I.6.6

    Journal ref: J.M. Kunkel et al. (Eds.): ISC High Performance Workshops 2017, LNCS 10524, pp. 15-29, 2017

  4. In situ, steerable, hardware-independent and data-structure agnostic visualization with ISAAC

    Authors: Alexander Matthes, Axel Huebl, René Widera, Sebastian Grottel, Stefan Gumhold, Michael Bussmann

    Abstract: The computation power of supercomputers grows faster than the bandwidth of their storage and network. Especially applications using hardware accelerators like Nvidia GPUs cannot save enough data to be analyzed in a later step. There is a high risk of losing important scientific information. We introduce the in situ template library ISAAC which enables arbitrary applications like scientific simula…

    Submitted 28 November, 2016; originally announced November 2016.

    Journal ref: Supercomputing Frontiers and Innovations, [S.l.], v. 3, n. 4, p. 30-48, Oct. 2016