Showing 1–13 of 13 results for author: Jansson, N

  1. arXiv:2405.05640  [pdf, other]

    cs.DC cs.MS physics.flu-dyn

    Experience and Analysis of Scalable High-Fidelity Computational Fluid Dynamics on Modular Supercomputing Architectures

    Authors: Martin Karp, Estela Suarez, Jan H. Meinke, Måns I. Andersson, Philipp Schlatter, Stefano Markidis, Niclas Jansson

    Abstract: The never-ending computational demand from simulations of turbulence makes computational fluid dynamics (CFD) a prime application use case for current and future exascale systems. High-order finite element methods, such as the spectral element method, have been gaining traction as they offer high performance on both multicore CPUs and modern GPU-based accelerators. In this work, we assess how high…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 13 pages, 5 figures, 3 tables, preprint

    ACM Class: J.2; C.1.4; G.4

  2. arXiv:2405.05639  [pdf, other]

    cs.DC

    Supercomputers as a Continous Medium

    Authors: Martin Karp, Niclas Jansson, Philipp Schlatter, Stefano Markidis

    Abstract: As supercomputers' complexity has grown, the traditional boundaries between processor, memory, network, and accelerators have blurred, making a homogeneous computer model, in which the overall computer system is modeled as a continuous medium with homogeneously distributed computational power, memory, and data movement transfer capabilities, an intriguing and powerful abstraction. By applying a ho…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 3 tables

    ACM Class: F.1; F.2; I.6

  3. arXiv:2401.14576  [pdf]

    cs.DC cs.PF

    Accelerating Scientific Application through Transparent I/O Interposition

    Authors: Steven W. D. Chien, Kento Sato, Artur Podobas, Niclas Jansson, Stefano Markidis, Michio Honda

    Abstract: The ability to handle a large volume of data generated by scientific applications is crucial. We have seen an increase in the heterogeneity of storage technologies available to scientific applications, such as burst buffers, local temporary block storage, managed cloud parallel file systems (PFS), and non-POSIX object stores. However, scientific applications designed for traditional HPC systems ca…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Submitted to HPDC 2024

  4. arXiv:2305.01338  [pdf, other]

    eess.SY cs.LG

    Physics-Informed Learning Using Hamiltonian Neural Networks with Output Error Noise Models

    Authors: Sarvin Moradi, Nick Jaensson, Roland Tóth, Maarten Schoukens

    Abstract: In order to make data-driven models of physical systems interpretable and reliable, it is essential to include prior physical knowledge in the modeling framework. Hamiltonian Neural Networks (HNNs) implement Hamiltonian theory in deep learning and form a comprehensive framework for modeling autonomous energy-conservative systems. Despite being suitable to estimate a wide range of physical system b…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Preprint submitted to IFAC 2023

  5. arXiv:2207.07098  [pdf, other]

    cs.MS cs.CE cs.DC physics.flu-dyn

    Large-Scale Direct Numerical Simulations of Turbulence Using GPUs and Modern Fortran

    Authors: Martin Karp, Daniele Massaro, Niclas Jansson, Alistair Hart, Jacob Wahlgren, Philipp Schlatter, Stefano Markidis

    Abstract: We present our approach to making direct numerical simulations of turbulence with applications in sustainable shipping. We use modern Fortran and the spectral element method to leverage and scale on supercomputers powered by the Nvidia A100 and the recent AMD Instinct MI250X GPUs, while still providing support for user software developed in Fortran. We demonstrate the efficiency of our approach by…

    Submitted 23 June, 2022; originally announced July 2022.

    Comments: 13 pages, 7 figures

    ACM Class: G.4; J.2

  6. arXiv:2109.03592  [pdf, ps, other]

    cs.DC

    Strong Scaling of OpenACC enabled Nek5000 on several GPU based HPC systems

    Authors: Jonathan Vincent, Jing Gong, Martin Karp, Adam Peplinski, Niclas Jansson, Artur Podobas, Andreas Jocksch, Jie Yao, Fazle Hussain, Stefano Markidis, Matts Karlsson, Dirk Pleiter, Erwin Laure, Philipp Schlatter

    Abstract: We present new results on the strong parallel scaling for the OpenACC-accelerated implementation of the high-order spectral element fluid dynamics solver Nek5000. The test case considered consists of a direct numerical simulation of fully-developed turbulent flow in a straight pipe, at two different Reynolds numbers $Re_τ=360$ and $Re_τ=550$, based on friction velocity and pipe radius. The strong…

    Submitted 4 November, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

    Comments: 9 pages, 8 figures. Submitted to HPC-Asia 2022 conference, updated to address reviewers comments

    ACM Class: G.4; J.2; C.1

  7. A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays

    Authors: Martin Karp, Artur Podobas, Tobias Kenter, Niclas Jansson, Christian Plessl, Philipp Schlatter, Stefano Markidis

    Abstract: The impending termination of Moore's law motivates the search for new forms of computing to continue the performance scaling we have grown accustomed to. Among the many emerging Post-Moore computing candidates, perhaps none is as salient as the Field-Programmable Gate Array (FPGA), which offers the means of specializing and customizing the hardware to the computation at hand. In this work, we de…

    Submitted 2 November, 2021; v1 submitted 27 August, 2021; originally announced August 2021.

    Comments: 12 pages, 3 figures, 3 tables, Accepted to HPC Asia 2022

    ACM Class: G.4; J.2; C.1

  8. arXiv:2107.01243  [pdf]

    cs.MS

    Neko: A Modern, Portable, and Scalable Framework for High-Fidelity Computational Fluid Dynamics

    Authors: Niclas Jansson, Martin Karp, Artur Podobas, Stefano Markidis, Philipp Schlatter

    Abstract: Recent trends and advancement in including more diverse and heterogeneous hardware in High-Performance Computing is challenging software developers in their pursuit for good performance and numerical stability. The well-known maxim "software outlives hardware" may no longer necessarily hold true, and developers are today forced to re-factor their codebases to leverage these powerful new systems. C…

    Submitted 2 July, 2021; originally announced July 2021.

  9. arXiv:2106.04979  [pdf]

    cs.DC

    Benchmarking the Nvidia GPU Lineage: From Early K80 to Modern A100 with Asynchronous Memory Transfers

    Authors: Martin Svedin, Steven W. D. Chien, Gibson Chikafa, Niclas Jansson, Artur Podobas

    Abstract: For many, Graphics Processing Units (GPUs) provide a source of reliable computing power. Recently, Nvidia introduced its 9th generation HPC-grade GPUs, the Ampere 100, claiming significant performance improvements over previous generations, particularly for AI-workloads, as well as introducing new architectural features such as asynchronous data movement. But how well does the A100 perform on non…

    Submitted 3 July, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 7 pages

  10. arXiv:2103.09683  [pdf, other]

    cs.DC

    Accelerating Radiation Therapy Dose Calculation with Nvidia GPUs

    Authors: Felix Liu, Niclas Jansson, Artur Podobas, Albin Fredriksson, Stefano Markidis

    Abstract: Radiation Treatment Planning (RTP) is the process of planning the appropriate external beam radiotherapy to combat cancer in human patients. RTP is a complex and compute-intensive task, which often takes a long time (several hours) to compute. Reducing this time allows for higher productivity at clinics and more sophisticated treatment planning, which can materialize in better treatments. The stat…

    Submitted 19 September, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

  11. arXiv:2010.13463  [pdf]

    cs.DC

    High-Performance Spectral Element Methods on Field-Programmable Gate Arrays

    Authors: Martin Karp, Artur Podobas, Niclas Jansson, Tobias Kenter, Christian Plessl, Philipp Schlatter, Stefano Markidis

    Abstract: Improvements in computer systems have historically relied on two well-known observations: Moore's law and Dennard's scaling. Today, both these observations are ending, forcing computer users, researchers, and practitioners to abandon the general-purpose architectures' comforts in favor of emerging post-Moore systems. Among the most salient of these post-Moore systems is the Field-Programmable Gate…

    Submitted 4 May, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: 10 pages, IEEE International Parallel and Distributed Processing Symposium 2021 (IPDPS'21)

    ACM Class: G.4; J.2; C.1

  12. arXiv:2005.13425  [pdf]

    cs.DC

    Optimization of Tensor-product Operations in Nekbone on GPUs

    Authors: Martin Karp, Niclas Jansson, Artur Podobas, Philipp Schlatter, Stefano Markidis

    Abstract: In the CFD solver Nek5000, the computation is dominated by the evaluation of small tensor operations. Nekbone is a proxy app for Nek5000 and has previously been ported to GPUs with a mixed OpenACC and CUDA approach. In this work, we continue this effort and optimize the main tensor-product operation in Nekbone further. Our optimization is done in CUDA and uses a different, 2D, thread structure to…

    Submitted 27 May, 2020; originally announced May 2020.

    Comments: 4 pages, 4 figures

    ACM Class: G.4; J.2

  13. arXiv:1808.04099  [pdf, other]

    cs.CE physics.flu-dyn

    CUBE: A scalable framework for large-scale industrial simulations

    Authors: Niclas Jansson, Rahul Bale, Keiji Onishi, Makoto Tsubokura

    Abstract: Writing high performance solvers for engineering applications is a delicate task. These codes are often developed on an application to application basis, highly optimized to solve a certain problem. Here, we present our work on developing a general simulation framework for efficient computation of time resolved approximations of complex industrial flow problems - Complex Unified Building cubE meth…

    Submitted 13 August, 2018; originally announced August 2018.