-
Ookami: Deployment and Initial Experiences
Authors:
Andrew Burford,
Alan C. Calder,
David Carlson,
Barbara Chapman,
Firat Coşkun,
Tony Curtis,
Catherine Feldman,
Robert J. Harrison,
Yan Kang,
Benjamin Michalowicz,
Eric Raut,
Eva Siegmann,
Daniel G. Wood,
Robert L. DeLeon,
Mathew Jones,
Nikolay A. Simakov,
Joseph P. White,
Dossay Oryspayev
Abstract:
Ookami is a computer technology testbed supported by the United States National Science Foundation. It provides researchers with access to the A64FX processor developed by Fujitsu in collaboration with RIKEN for the Japanese path to exascale computing, as deployed in Fugaku, the fastest computer in the world. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain familiar and successful programming models while achieving very high performance for a wide range of applications. We review relevant technology and system details, and the main body of the paper focuses on initial experiences with the hardware and software ecosystem for micro-benchmarks, mini-apps, and full applications, and starts to answer questions about where such technologies fit into the NSF ecosystem.
Submitted 16 June, 2021;
originally announced June 2021.
-
Modernizing Titan2D, a Parallel AMR Geophysical Flow Code to Support Multiple Rheologies and Extendability
Authors:
Nikolay A. Simakov,
Renette L. Jones-Ivey,
Ali Akhavan-Safaei,
Hossein Aghakhani,
Matthew D. Jones,
Abani K. Patra
Abstract:
In this work, we report on strategies and results of our initial approach to modernizing the Titan2D code. Titan2D is a geophysical mass flow simulation code designed for modeling volcanic flows, debris avalanches, and landslides over a realistic terrain model. It solves an underlying hyperbolic system of partial differential equations using a parallel adaptive mesh Godunov scheme. The following work was done during code refactoring and modernization. To facilitate user input, a two-level Python interface was developed. This design permits large changes at the C++ and Python low level while maintaining a stable high-level interface exposed to the end user. Multiple diverged forks implementing different material models were merged back together. The data storage layout was changed from a linked list of structures to a structure-of-arrays representation for better memory access and in preparation for further work on better utilization of vectorized instructions. The existing MPI parallelization was augmented with OpenMP parallelization. The performance of a hash table used to store references to mesh elements and nodes was improved by switching from a linked list for overflow entries to dynamic arrays, allowing the implementation of a binary search algorithm. The introduction of the new data layout made it possible to reduce the number of hash table look-ups by replacing them with direct use of indexes from the storage class. The modifications led to an 8-9 times performance improvement for serial execution.
Submitted 2 October, 2019;
originally announced October 2019.
-
Effect of Meltdown and Spectre Patches on the Performance of HPC Applications
Authors:
Nikolay A. Simakov,
Martins D. Innus,
Matthew D. Jones,
Joseph P. White,
Steven M. Gallo,
Robert L. DeLeon,
Thomas R. Furlani
Abstract:
In this work we examine how the updates addressing the Meltdown and Spectre vulnerabilities impact the performance of HPC applications. To study this, we use the application kernel module of XDMoD to test performance before and after the application of the vulnerability patches. We tested the performance difference for multiple applications and benchmarks, including NWChem, NAMD, HPCC, IOR, MDTest and IMB. The results show that although some specific functions can have performance decreased by as much as 74%, the majority of individual metrics indicate little to no decrease in performance. The real-world applications show a 2-3% decrease in performance for single-node jobs and a 5-11% decrease for parallel multi-node jobs.
Submitted 16 January, 2018; v1 submitted 12 January, 2018;
originally announced January 2018.
-
A Workload Analysis of NSF's Innovative HPC Resources Using XDMoD
Authors:
Nikolay A. Simakov,
Joseph P. White,
Robert L. DeLeon,
Steven M. Gallo,
Matthew D. Jones,
Jeffrey T. Palmer,
Benjamin Plessinger,
Thomas R. Furlani
Abstract:
Workload characterization is an integral part of performance analysis of high performance computing (HPC) systems. An understanding of workload properties sheds light on resource utilization and can be used to inform performance optimization at both the software and system configuration levels. It can provide information on how computational science usage modalities are changing that could potentially aid holistic capacity planning for the wider HPC ecosystem. Here, we report on the results of a detailed workload analysis of the portfolio of supercomputers comprising the NSF Innovative HPC program in order to characterize its past and current workload and look for trends to understand how the broad portfolio of computational science research is being supported and how it is changing over time. The workload analysis also sought to illustrate a wide variety of usage patterns and performance requirements for jobs running on these systems. File system performance, memory utilization and the types of parallelism employed by users (MPI, threads, etc.) were also studied for all systems for which job-level performance data was available.
Submitted 12 January, 2018;
originally announced January 2018.