-
Quantum Computing and Visualization: A Disruptive Technological Change Ahead
Authors:
E. Wes Bethel,
Mercy G. Amankwah,
Jan Balewski,
Roel Van Beeumen,
Daan Camps,
Daniel Huang,
Talita Perciano
Abstract:
The focus of this Visualization Viewpoints article is to provide some background on Quantum Computing (QC), to explore ideas related to how visualization helps in understanding QC, and to examine how QC might be useful for visualization with the growth and maturation of both technologies in the future. In a quickly evolving technology landscape, QC is emerging as a promising pathway to overcome the growth limits in classical computing. In some cases, QC platforms offer the potential to vastly outperform the familiar classical computer by solving problems more quickly, or by solving problems that may be intractable on any known classical platform. As further performance gains for classical computing platforms are limited by diminishing Moore's Law scaling, QC platforms might be viewed as a potential successor to the current field of exascale-class platforms. While present-day QC hardware platforms are still limited in scale, the field of quantum computing is robust and rapidly advancing in terms of hardware capabilities, software environments for developing quantum algorithms, and educational programs for training the next generation of scientists and engineers. After a brief introduction to QC concepts, the focus of this article is to explore the interplay between the fields of visualization and QC. First, visualization has played a role in QC by providing the means to show representations of the quantum state of single qubits in superposition states and multiple qubits in entangled states. Second, there are a number of ways in which the field of visual data exploration and analysis may potentially benefit from this disruptive new technology, though there are challenges going forward.
Submitted 11 October, 2023; v1 submitted 7 October, 2023;
originally announced October 2023.
-
MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems
Authors:
Steven Farrell,
Murali Emani,
Jacob Balma,
Lukas Drescher,
Aleksandr Drozd,
Andreas Fink,
Geoffrey Fox,
David Kanter,
Thorsten Kurth,
Peter Mattson,
Dawei Mu,
Amit Ruhela,
Kento Sato,
Koichi Shirahata,
Tsuguchika Tabaru,
Aristeidis Tsaris,
Jan Balewski,
Ben Cumming,
Takumi Danjo,
Jens Domke,
Takaaki Fukai,
Naoto Fukumoto,
Tatsuya Fukushi,
Balazs Gerofi,
Takumi Honda,
et al. (18 additional authors not shown)
Abstract:
Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning applications that are representative of real-world scientific use cases. MLPerf is a community-driven standard to benchmark machine learning workloads, focusing on end-to-end performance metrics. In this paper, we introduce MLPerf HPC, a benchmark suite of large-scale scientific machine learning training applications driven by the MLCommons Association. We present the results from the first submission round, including a diverse set of some of the world's largest HPC systems. We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance. As a result, we gain a quantitative understanding of optimizations on different subsystems such as staging and on-node loading of data, compute-unit utilization, and communication scheduling, enabling overall $>10 \times$ (end-to-end) performance improvements through system scaling. Notably, our analysis shows a scale-dependent interplay between the dataset size, a system's memory hierarchy, and training convergence that underlines the importance of near-compute storage. To overcome the data-parallel scalability challenge at large batch sizes, we discuss specific learning techniques and hybrid data-and-model parallelism that are effective on large systems. We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behavior to parameterize extended roofline performance models in future rounds.
Submitted 26 October, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Authors:
Yosuke Oyama,
Naoya Maruyama,
Nikoli Dryden,
Erin McCarthy,
Peter Harrington,
Jan Balewski,
Satoshi Matsuoka,
Peter Nugent,
Brian Van Essen
Abstract:
We present scalable hybrid-parallel algorithms for training large-scale 3D convolutional neural networks. Emerging deep learning-based scientific workflows often require model training with large, high-dimensional samples, which can make training much more costly and even infeasible due to excessive memory usage. We solve these challenges by extensively applying hybrid parallelism throughout the end-to-end training pipeline, including both computations and I/O. Our hybrid-parallel algorithm extends the standard data parallelism with spatial parallelism, which partitions a single sample in the spatial domain, realizing strong scaling beyond the mini-batch dimension with a larger aggregated memory capacity. We evaluate our proposed training algorithms with two challenging 3D CNNs, CosmoFlow and 3D U-Net. Our comprehensive performance studies show that good weak and strong scaling can be achieved for both networks using up to 2K GPUs. More importantly, we enable training of CosmoFlow with much larger samples than previously possible, realizing an order-of-magnitude improvement in prediction accuracy.
Submitted 25 July, 2020;
originally announced July 2020.