Search | arXiv e-print repository

doi 10.1145/3569951.3597545

Optimization and Portability of a Fusion OpenACC-based FORTRAN HPC Code from NVIDIA to AMD GPUs

Authors: Igor Sfiligoi, Emily A. Belli, Jeff Candy, Reuben D. Budiardja

Abstract: NVIDIA has been the main provider of GPU hardware in HPC systems for over a decade. Most applications that benefit from GPUs have thus been developed and optimized for the NVIDIA software stack. Recent exascale HPC systems are, however, introducing GPUs from other vendors, e.g. with the AMD GPU-based OLCF Frontier system just becoming available. AMD GPUs cannot be directly accessed using the NVIDIA software stack, and require a porting effort by the application developers. This paper provides an overview of our experience porting and optimizing the CGYRO code, a widely-used fusion simulation tool based on FORTRAN with OpenACC-based GPU acceleration. While the porting from the NVIDIA compilers was relatively straightforward using the CRAY compilers on the AMD systems, the performance optimization required more fine-tuning. In the optimization effort, we uncovered code sections that had performed well on NVIDIA GPUs, but were unexpectedly slow on AMD GPUs. After AMD-targeted code optimizations, performance on AMD GPUs has increased to meet our expectations. Modest speed improvements were also seen on NVIDIA GPUs, which was an unexpected benefit of this exercise.

Submitted 17 May, 2023; originally announced May 2023.

Comments: 6 pages, 4 figures, 2 tables, To be published in Proceedings of PEARC23

Journal ref: Practice and Experience in Advanced Research Computing (PEARC '23). Association for Computing Machinery, New York, NY, USA, 246-250. (2023)

arXiv:2305.09683 [pdf, other]

Flexible, integrated modeling of tokamak stability, transport, equilibrium, and pedestal physics

Authors: B. C. Lyons, J. McClenaghan, T. Slendebroek, O. Meneghini, T. F. Neiser, S. P. Smith, D. B. Weisberg, E. A. Belli, J. Candy, J. M. Hanson, L. L. Lao, N. C. Logan, S. Saarelma, O. Sauter, P. B. Snyder, G. M. Staebler, K. E. Thome, A. D. Turnbull

Abstract: The STEP (Stability, Transport, Equilibrium, and Pedestal) integrated-modeling tool has been developed in OMFIT to predict stable, tokamak equilibria self-consistently with core-transport and pedestal calculations. STEP couples theory-based codes to integrate a variety of physics, including MHD stability, transport, equilibrium, pedestal formation, and current-drive, heating, and fueling. The input/output of each code is interfaced with a centralized ITER-IMAS data structure, allowing codes to be run in any order and enabling open-loop, feedback, and optimization workflows. This paradigm simplifies the integration of new codes, making STEP highly extensible. STEP has been verified against a published benchmark of six different integrated models. Core-pedestal calculations with STEP have been successfully validated against individual DIII-D H-mode discharges and across more than 500 discharges of the $H_{98,y2}$ database, with a mean error in confinement time from experiment less than 19%. STEP has also reproduced results in less conventional DIII-D scenarios, including negative-central-shear and negative-triangularity plasmas. Predictive STEP modeling has been used to assess performance in several tokamak reactors. Simulations of a high-field, large-aspect-ratio reactor show significantly lower fusion power than predicted by a zero-dimensional study, demonstrating the limitations of scaling-law extrapolations. STEP predictions have found promising EXCITE scenarios, including a high-pressure, 80%-bootstrap-fraction plasma. ITER modeling with STEP has shown that pellet fueling enhances fusion gain in both the baseline and advanced-inductive scenarios. Finally, STEP predictions for the SPARC baseline scenario are in good agreement with published results from the physics basis.

Submitted 12 May, 2023; originally announced May 2023.

Comments: 15 pages, 11 figures. Associated with invited talk at 63rd Annual Meeting of the APS Division of Plasma Physics: https://meetings.aps.org/Meeting/DPP21/Session/NI02.1 . The following article has been submitted to Physics of Plasmas. After it is published, it will be found at https://publishing.aip.org/resources/librarians/products/journals/

arXiv:2205.09682 [pdf]

doi 10.1145/3491418.3535130

Comparing single-node and multi-node performance of an important fusion HPC code benchmark

Authors: Emily A. Belli, Jeff Candy, Igor Sfiligoi, Frank Würthwein

Abstract: Fusion simulations have traditionally required the use of leadership scale High Performance Computing (HPC) resources in order to produce advances in physics. The impressive improvements in compute and memory capacity of many-GPU compute nodes are now allowing for some problems that once required a multi-node setup to be also solvable on a single node. When possible, the increased interconnect bandwidth can result in order of magnitude higher science throughput, especially for communication-heavy applications. In this paper we analyze the performance of the fusion simulation tool CGYRO, an Eulerian gyrokinetic turbulence solver designed and optimized for collisional, electromagnetic, multiscale simulation, which is widely used in the fusion research community. Due to the nature of the problem, the application has to work on a large multi-dimensional computational mesh as a whole, requiring frequent exchange of large amounts of data between the compute processes. In particular, we show that the average-scale nl03 benchmark CGYRO simulation can be run at an acceptable speed on a single Google Cloud instance with 16 A100 GPUs, outperforming 8 NERSC Perlmutter Phase1 nodes, 16 ORNL Summit nodes and 256 NERSC Cori nodes. Moving from a multi-node to a single-node GPU setup we get comparable simulation times using less than half the number of GPUs. Larger benchmark problems, however, still require a multi-node HPC setup due to GPU memory capacity needs, since at the time of writing no vendor offers nodes with a sufficient GPU memory setup. The upcoming external NVSWITCH does however promise to deliver an almost equivalent solution for up to 256 NVIDIA GPUs.

Submitted 19 May, 2022; originally announced May 2022.

Comments: 6 pages, 1 table, 1 figure, to be published in proceedings of PEARC22

Journal ref: PEARC '22: Practice and Experience in Advanced Research Computing (2022) 10 1-4

arXiv:1407.1191 [pdf, other]

doi 10.1088/0741-3335/57/1/014031

Theoretical description of heavy impurity transport and its application to the modelling of tungsten in JET and ASDEX Upgrade

Authors: F. J. Casson, C. Angioni, E. A. Belli, R. Bilato, P. Mantica, T. Odstrcil, T. Puetterich, M. Valisa, L. Garzotti, C. Giroud, J. Hobirk, C. F. Maggi, J. Mlynar, M. L. Reinke, JET EFDA contributors, ASDEX-Upgrade team

Abstract: Recent developments in theory-based modelling of core heavy impurity transport are presented, and shown to be necessary for quantitative description of present experiments in JET and ASDEX Upgrade. The treatment of heavy impurities is complicated by their large mass and charge, which result in a strong response to plasma rotation or any small background electrostatic field in the plasma, such as that generated by anisotropic external heating. These forces lead to strong poloidal asymmetries of impurity density, which have recently been added to numerical tools describing both neoclassical and turbulent transport. Modelling predictions of the steady-state two-dimensional tungsten impurity distribution are compared with experimental densities interpreted from soft X-ray diagnostics. The modelling identifies neoclassical transport enhanced by poloidal asymmetries as the dominant mechanism responsible for tungsten accumulation in the central core of the plasma. Depending on the bulk plasma profiles, neoclassical temperature screening can prevent accumulation, and can be enhanced by externally heated species, demonstrated here in ICRH plasmas.

Submitted 4 July, 2014; originally announced July 2014.

Comments: 9 pages, 11 figures, EPS Berlin Plasma Physics 2014, Invited speaker, submitted to Plasma Physics and Controlled Fusion

Journal ref: Plasma Phys. Control. Fusion 57 (2015) 014031

arXiv:1402.0309 [pdf, ps, other]

doi 10.1088/0741-3335/56/12/124005

Impurity transport in Alcator C-Mod in the presence of poloidal density variation induced by ion cyclotron resonance heating

Authors: Albert Mollén, István Pusztai, Matthew L. Reinke, Yevgen O. Kazakov, Nathan T. Howard, Emily A. Belli, Tünde Fülöp, the Alcator C-Mod Team

Abstract: Impurity particle transport in an ion cyclotron resonance heated Alcator C-Mod discharge is studied with local gyrokinetic simulations and a theoretical model including the effect of poloidal asymmetries and elongation. In spite of the strong minority temperature anisotropy in the deep core region, the poloidal asymmetries are found to have a negligible effect on the turbulent impurity transport due to low magnetic shear in this region, in agreement with the experimental observations. According to the theoretical model, in outer core regions poloidal asymmetries may contribute to the reduction of the impurity peaking, but uncertainties in atomic physics processes prevent quantitative comparison with experiments.

Submitted 10 November, 2014; v1 submitted 3 February, 2014; originally announced February 2014.

Comments: 32 pages, 12 figures

Journal ref: Plasma Phys. Control. Fusion 56 (2014) 124005

arXiv:1304.3633 [pdf, other]

doi 10.1103/PhysRevLett.111.055005

Intrinsic rotation driven by non-Maxwellian equilibria in tokamak plasmas

Authors: M. Barnes, F. I. Parra, J. P. Lee, E. A. Belli, M. F. F. Nave, A. E. White

Abstract: The effect of small deviations from a Maxwellian equilibrium on turbulent momentum transport in tokamak plasmas is considered. These non-Maxwellian features, arising from diamagnetic effects, introduce a strong dependence of the radial flux of co-current toroidal angular momentum on collisionality: As the plasma goes from nearly collisionless to weakly collisional, the flux reverses direction from radially inward to outward. This indicates a collisionality-dependent transition from peaked to hollow rotation profiles, consistent with experimental observations of intrinsic rotation.

Submitted 12 April, 2013; originally announced April 2013.

Comments: 5 pages, 3 figures

arXiv:1109.4558 [pdf, ps, other]

doi 10.1063/1.3662064

Simulating Gyrokinetic Microinstabilities in Stellarator Geometry with GS2

Authors: J. A. Baumgaertel, E. A. Belli, W. Dorland, W. Guttenfelder, G. W. Hammett, D. R. Mikkelsen, G. Rewoldt, W. M. Tang, P. Xanthopoulos

Abstract: The nonlinear gyrokinetic code GS2 has been extended to treat non-axisymmetric stellarator geometry. Electromagnetic perturbations and multiple trapped particle regions are allowed. Here, linear, collisionless, electrostatic simulations of the quasi-axisymmetric, three-field period National Compact Stellarator Experiment (NCSX) design QAS3-C82 have been successfully benchmarked against the eigenvalue code FULL. Quantitatively, the linear stability calculations of GS2 and FULL agree to within ~10%.

Submitted 21 September, 2011; originally announced September 2011.

Comments: Submitted to Physics of Plasmas. 9 pages, 14 figures

Journal ref: Phys. Plasmas 18, 122301 (2011)

Showing 1–7 of 7 results for author: Belli, E A