-
The role of interactive super-computing in using HPC for urgent decision making
Authors:
Nick Brown,
Rupert Nash,
Gordon Gibb,
Bianca Prodan,
Max Kontak,
Vyacheslav Olshevsky,
Wei Der Chien
Abstract:
Technological advances are creating exciting new opportunities that have the potential to move HPC well beyond traditional computational workloads. In this paper we focus on the potential for HPC to be instrumental in responding to disasters such as wildfires, hurricanes, extreme flooding, earthquakes, tsunamis, winter weather conditions, and accidents. Driven by the VESTEC EU-funded H2020 project, our research looks to prove HPC as a tool not only capable of simulating disasters once they have happened, but also one which is able to operate in a responsive mode, supporting disaster response teams making urgent decisions in real-time. Whilst this has the potential to revolutionise disaster response, it requires the ability to drive HPC interactively, both from the user's perspective and also based upon the arrival of data. As such, interactivity is a critical component in enabling HPC to be exploited in the role of supporting disaster response teams, so that urgent decision makers can make the correct decision first time, every time.
Submitted 17 October, 2020;
originally announced October 2020.
-
Automated classification of plasma regions using 3D particle energy distributions
Authors:
Vyacheslav Olshevsky,
Yuri V. Khotyaintsev,
Ahmad Lalti,
Andrey Divin,
Gian Luca Delzanno,
Sven Anderzen,
Pawel Herman,
Steven W. D. Chien,
Levon Avanov,
Andrew P. Dimmock,
Stefano Markidis
Abstract:
We investigate the properties of the ion sky maps produced by the Dual Ion Spectrometers (DIS) from the Fast Plasma Investigation (FPI). We have trained a convolutional neural network classifier to predict four regions crossed by the MMS on the dayside magnetosphere: solar wind, ion foreshock, magnetosheath, and magnetopause using solely DIS spectrograms. The accuracy of the classifier is >98%. We use the classifier to detect mixed plasma regions, in particular to find the bow shock regions. A similar approach can be used to identify the magnetopause crossings and reveal regions prone to magnetic reconnection. Data processing through the trained classifier is fast and efficient and thus can be used for classification for the whole MMS database.
Submitted 21 September, 2021; v1 submitted 15 August, 2019;
originally announced August 2019.
-
Multi-GPU Acceleration of the iPIC3D Implicit Particle-in-Cell Code
Authors:
Chaitanya Prasad Sishtla,
Steven W. D. Chien,
Vyacheslav Olshevsky,
Erwin Laure,
Stefano Markidis
Abstract:
iPIC3D is a widely used massively parallel Particle-in-Cell code for the simulation of space plasmas. However, its current implementation does not support execution on multiple GPUs. In this paper, we describe the porting of the iPIC3D particle mover to GPUs and the optimization steps to increase the performance and parallel scaling on multiple GPUs. We analyze the strong scaling of the mover on two GPU clusters and evaluate its performance and acceleration. The optimized GPU versions, which use pinned memory and asynchronous data prefetching, outperform their corresponding CPU versions by 5-10x on two different systems equipped with NVIDIA K80 and V100 GPUs.
Submitted 7 April, 2019;
originally announced April 2019.
-
TensorFlow Doing HPC
Authors:
Steven W. D. Chien,
Stefano Markidis,
Vyacheslav Olshevsky,
Yaroslav Bulatov,
Erwin Laure,
Jeffrey S. Vetter
Abstract:
TensorFlow is a popular emerging open-source programming framework supporting the execution of distributed applications on heterogeneous hardware. While TensorFlow has been initially designed for developing Machine Learning (ML) applications, in fact TensorFlow aims at supporting the development of a much broader range of application kinds that are outside the ML domain and can possibly include HPC applications. However, very few experiments have been conducted to evaluate TensorFlow performance when running HPC workloads on supercomputers. This work addresses this lack by designing four traditional HPC benchmark applications: STREAM, matrix-matrix multiply, Conjugate Gradient (CG) solver and Fast Fourier Transform (FFT). We analyze their performance on two supercomputers with accelerators and evaluate the potential of TensorFlow for developing HPC applications. Our tests show that TensorFlow can fully take advantage of high performance networks and accelerators on supercomputers. Running our TensorFlow STREAM benchmark, we obtain over 50% of theoretical communication bandwidth on our testing platform. We find an approximately 2x, 1.7x and 1.8x performance improvement when increasing the number of GPUs from two to four in the matrix-matrix multiply, CG and FFT applications respectively. All our performance results demonstrate that TensorFlow has high potential of emerging also as an HPC programming framework for heterogeneous supercomputers.
Submitted 11 March, 2019;
originally announced March 2019.
-
A Fast Algorithm for the Inversion of Quasiseparable Vandermonde-like Matrices
Authors:
Sirani M. Perera,
Grigory Bonik,
Vadim Olshevsky
Abstract:
The results on Vandermonde-like matrices were introduced as a generalization of polynomial Vandermonde matrices, and the displacement structure of these matrices was used to derive an inversion formula. In this paper we first present a fast Gaussian elimination algorithm for the polynomial Vandermonde-like matrices. Later we use the said algorithm to derive fast inversion algorithms for quasiseparable, semiseparable and well-free Vandermonde-like matrices having $\mathcal{O}(n^2)$ complexity. To do so we identify structures of displacement operators in terms of generators and the recurrence relations (2-term and 3-term) between the columns of the basis transformation matrices for quasiseparable, semiseparable and well-free polynomials. Finally we present an $\mathcal{O}(n^2)$ algorithm to compute the inversion of quasiseparable Vandermonde-like matrices.
Submitted 8 January, 2014;
originally announced January 2014.