-
GPU Accelerated Implicit Kinetic Meshfree Method based on Modified LU-SGS
Authors:
Mayuri Verma,
Anil Nemili,
Nischay Ram Mamidi
Abstract:
This report presents the GPU acceleration of implicit kinetic meshfree methods using modified LU-SGS algorithms. The meshfree scheme is based on the least squares kinetic upwind method (LSKUM). In the existing matrix-free LU-SGS approaches for kinetic meshfree methods, the products of split flux Jacobians and increments in conserved vectors are approximated by increments in the split fluxes. In our modified LU-SGS approach, the Jacobian vector products are computed exactly using algorithmic differentiation (AD). The implicit GPU solvers with exact and approximate computation of the Jacobian vector products are applied to the standard test cases for two-dimensional inviscid flows. Numerical results have shown that the GPU solvers with the exact computation of the Jacobian vector products are computationally more efficient and yield better convergence rates than the solvers with approximations to the Jacobian vector products. Benchmarks are presented to assess the performance of implicit GPU solvers compared to the explicit GPU solver and the implicit serial LSKUM solver.
Submitted 11 June, 2024;
originally announced June 2024.
-
Regent based parallel meshfree LSKUM solver for heterogenous HPC platforms
Authors:
Sanath Salil,
Nischay Ram Mamidi,
Anil Nemili,
Elliott Slaughter
Abstract:
Regent is an implicitly parallel programming language that allows the development of a single codebase for heterogeneous platforms targeting CPUs and GPUs. This paper presents the development of a parallel meshfree solver in Regent for two-dimensional inviscid compressible flows. The meshfree solver is based on the least squares kinetic upwind method. Example codes are presented to show the difference between the Regent and CUDA-C implementations of the meshfree solver on a GPU node. For CPU parallel computations, details are presented on how the data communication and synchronisation are handled by Regent and Fortran+MPI codes. The Regent solver is verified by applying it to the standard test cases for inviscid flows. Benchmark simulations are performed on coarse to very fine point distributions to assess the solver's performance. The computational efficiency of the Regent solver on an A100 GPU is compared with an equivalent meshfree solver written in CUDA-C. The codes are then profiled to investigate the differences in their performance. The performance of the Regent solver on CPU cores is compared with an equivalent explicitly parallel Fortran meshfree solver based on MPI. Scalability results are shown to offer insights into performance.
Submitted 19 March, 2024;
originally announced March 2024.
-
On the performance of GPU accelerated q-LSKUM based meshfree solvers in Fortran, C++, Python, and Julia
Authors:
Nischay Ram Mamidi,
Kumar Prasun,
Dhruv Saxena,
Anil Nemili,
Bharatkumar Sharma,
S. M. Deshpande
Abstract:
This report presents a comprehensive analysis of the performance of GPU accelerated meshfree CFD solvers for two-dimensional compressible flows in Fortran, C++, Python, and Julia. The programming model CUDA is used to develop the GPU codes. The meshfree solver is based on the least squares kinetic upwind method with entropy variables (q-LSKUM). To assess the computational efficiency of the GPU solvers and to compare their relative performance, benchmark calculations are performed on seven levels of point distribution. To analyse the difference in their run-times, the computationally intensive kernel is profiled. Various performance metrics are investigated from the profiled data to determine the cause of observed variation in run-times. To address some of the performance related issues, various optimisation strategies are employed. The optimised GPU codes are compared with the naive codes, and conclusions are drawn from their performance.
Submitted 16 August, 2021;
originally announced August 2021.
-
Chaotic Behavior of Stiff ODEs and Their Derivatives: An Illustrative Example
Authors:
Emre Özkaya,
Nicolas R. Gauger,
Anil Nemili
Abstract:
In the following, an illustrative example concerning difficulties in differentiating stiff ODEs is presented. In the given example, the solution of a completely deterministic system becomes chaotic due to computational noise introduced by the numerical algorithm.
Submitted 7 October, 2016;
originally announced October 2016.