-
Workflows to driving high-performance interactive supercomputing for urgent decision making
Authors:
Nick Brown,
Rupert Nash,
Gordon Gibb,
Evgenij Belikov,
Artur Podobas,
Wei Der Chien,
Stefano Markidis,
Markus Flatken,
Andreas Gerndt
Abstract:
Interactive urgent computing is a small but growing user of supercomputing resources. However there are numerous technical challenges that must be overcome to make supercomputers fully suited to the wide range of urgent workloads which could benefit from the computational power delivered by such instruments. An important question is how to connect the different components of an urgent workload; na…
▽ More
Interactive urgent computing is a small but growing user of supercomputing resources. However there are numerous technical challenges that must be overcome to make supercomputers fully suited to the wide range of urgent workloads which could benefit from the computational power delivered by such instruments. An important question is how to connect the different components of an urgent workload; namely the users, the simulation codes, and external data sources, together in a structured and accessible manner.
In this paper we explore the role of workflows from both the perspective of marshalling and control of urgent workloads, and at the individual HPC machine level. Ultimately requiring two workflow systems, by using a space weather prediction urgent use-cases, we explore the benefit that these two workflow systems provide especially when one exploits the flexibility enabled by them interoperating.
△ Less
Submitted 28 June, 2022;
originally announced June 2022.
-
Utilising urgent computing to tackle the spread of mosquito-borne diseases
Authors:
Nick Brown,
Rupert Nash,
Piero Poletti,
Giorgio Guzzetta,
Mattia Manica,
Agnese Zardini,
Markus Flatken,
Jules Vidal,
Charles Gueunet,
Evgenij Belikov,
Julien Tierny,
Artur Podobas,
Wei Der Chien,
Stefano Markidis,
Andreas Gerndt
Abstract:
It is estimated that around 80\% of the world's population live in areas susceptible to at-least one major vector borne disease, and approximately 20% of global communicable diseases are spread by mosquitoes. Furthermore, the outbreaks of such diseases are becoming more common and widespread, with much of this driven in recent years by socio-demographic and climatic factors. These trends are causi…
▽ More
It is estimated that around 80\% of the world's population live in areas susceptible to at-least one major vector borne disease, and approximately 20% of global communicable diseases are spread by mosquitoes. Furthermore, the outbreaks of such diseases are becoming more common and widespread, with much of this driven in recent years by socio-demographic and climatic factors. These trends are causing significant worry to global health organisations, including the CDC and WHO, and-so an important question is the role that technology can play in addressing them.
In this work we describe the integration of an epidemiology model, which simulates the spread of mosquito-borne diseases, with the VESTEC urgent computing ecosystem. The intention of this work is to empower human health professionals to exploit this model and more easily explore the progression of mosquito-borne diseases. Traditionally in the domain of the few research scientists, by leveraging state of the art visualisation and analytics techniques, all supported by running the computational workloads on HPC machines in a seamless fashion, we demonstrate the significant advantages that such an integration can provide. Furthermore we demonstrate the benefits of using an ecosystem such as VESTEC, which provides a framework for urgent computing, in supporting the easy adoption of these technologies by the epidemiologists and disaster response professionals more widely.
△ Less
Submitted 10 November, 2021;
originally announced November 2021.
-
The role of interactive super-computing in using HPC for urgent decision making
Authors:
Nick Brown,
Rupert Nash,
Gordon Gibb,
Bianca Prodan,
Max Kontak,
Vyacheslav Olshevsky,
Wei Der Chien
Abstract:
Technological advances are creating exciting new opportunities that have the potential to move HPC well beyond traditional computational workloads. In this paper we focus on the potential for HPC to be instrumental in responding to disasters such as wildfires, hurricanes, extreme flooding, earthquakes, tsunamis, winter weather conditions, and accidents. Driven by the VESTEC EU funded H2020 project…
▽ More
Technological advances are creating exciting new opportunities that have the potential to move HPC well beyond traditional computational workloads. In this paper we focus on the potential for HPC to be instrumental in responding to disasters such as wildfires, hurricanes, extreme flooding, earthquakes, tsunamis, winter weather conditions, and accidents. Driven by the VESTEC EU funded H2020 project, our research looks to prove HPC as a tool not only capable of simulating disasters once they have happened, but also one which is able to operate in a responsive mode, supporting disaster response teams making urgent decisions in real-time. Whilst this has the potential to revolutionise disaster response, it requires the ability to drive HPC interactively, both from the user's perspective and also based upon the arrival of data. As such interactivity is a critical component in enabling HPC to be exploited in the role of supporting disaster response teams so that urgent decision makers can make the correct decision first time, every time.
△ Less
Submitted 17 October, 2020;
originally announced October 2020.
-
NVIDIA Tensor Core Programmability, Performance & Precision
Authors:
Stefano Markidis,
Steven Wei Der Chien,
Erwin Laure,
Ivy Bo Peng,
Jeffrey S. Vetter
Abstract:
The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core" that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to pro…
▽ More
The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core" that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to program NVIDIA Tensor Cores, their performances and the precision loss due to computation in mixed precision.
Currently, NVIDIA provides three different ways of programming matrix-multiply-and-accumulate on Tensor Cores: the CUDA Warp Matrix Multiply Accumulate (WMMA) API, CUTLASS, a templated library based on WMMA, and cuBLAS GEMM. After experimenting with different approaches, we found that NVIDIA Tensor Cores can deliver up to 83 Tflops/s in mixed precision on a Tesla V100 GPU, seven and three times the performance in single and half precision respectively. A WMMA implementation of batched GEMM reaches a performance of 4 Tflops/s. While precision loss due to matrix multiplication with half precision input might be critical in many HPC applications, it can be considerably reduced at the cost of increased computation. Our results indicate that HPC applications using matrix multiplications can strongly benefit from using of NVIDIA Tensor Cores.
△ Less
Submitted 11 March, 2018;
originally announced March 2018.