Showing 1–18 of 18 results for author: Chien, S W

  1. arXiv:2406.15686  [pdf, other]

    cs.CR cs.NI

    The Case for Transport-Level Encryption in Datacenter Networks

    Authors: Tianyi Gao, Xinshu Ma, Suhas Narreddy, Eugenio Luo, Steven W. D. Chien, Michio Honda

    Abstract: Cloud applications need network data encryption to isolate from other tenants and protect their data from potential eavesdroppers in the network infrastructure. This paper presents SDP, a protocol design for emerging datacenter transport protocols, such as pHost, NDP, and Homa, to integrate data encryption with the use of existing NIC offloading of cryptographic operations designed for TLS over TC…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2401.14576  [pdf]

    cs.DC cs.PF

    Accelerating Scientific Application through Transparent I/O Interposition

    Authors: Steven W. D. Chien, Kento Sato, Artur Podobas, Niclas Jansson, Stefano Markidis, Michio Honda

    Abstract: The ability to handle a large volume of data generated by scientific applications is crucial. We have seen an increase in the heterogeneity of storage technologies available to scientific applications, such as burst buffers, local temporary block storage, managed cloud parallel file systems (PFS), and non-POSIX object stores. However, scientific applications designed for traditional HPC systems ca…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Submitted to HPDC 2024

  3. arXiv:2107.06676  [pdf, other]

    cs.LG cs.CE cs.DC cs.NE

    Higgs Boson Classification: Brain-inspired BCPNN Learning with StreamBrain

    Authors: Martin Svedin, Artur Podobas, Steven W. D. Chien, Stefano Markidis

    Abstract: One of the most promising approaches for data analysis and exploration of large data sets is Machine Learning techniques that are inspired by brain models. Such methods use alternative learning rules potentially more efficiently than established learning rules. In this work, we focus on the potential of brain-inspired ML for exploiting High-Performance Computing (HPC) resources to solve ML problem…

    Submitted 17 August, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

    Comments: Accepted for publication at The 2nd Workshop on Artificial Intelligence and Machine Learning for Scientific Applications (AI4S 2021)

  4. arXiv:2106.05373  [pdf, other]

    cs.DC cs.LG cs.NE

    StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs

    Authors: Artur Podobas, Martin Svedin, Steven W. D. Chien, Ivy B. Peng, Naresh Balaji Ravichandran, Pawel Herman, Anders Lansner, Stefano Markidis

    Abstract: The modern deep learning method based on backpropagation has surged in popularity and has been used in multiple domains and application areas. At the same time, there are other -- less-known -- machine learning algorithms with a mature and solid theoretical foundation whose performance remains unexplored. One such example is the brain-like Bayesian Confidence Propagation Neural Network (BCPNN). In…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: Accepted for publication at the International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies (HEART 2021)

  5. arXiv:2106.04979  [pdf]

    cs.DC

    Benchmarking the Nvidia GPU Lineage: From Early K80 to Modern A100 with Asynchronous Memory Transfers

    Authors: Martin Svedin, Steven W. D. Chien, Gibson Chikafa, Niclas Jansson, Artur Podobas

    Abstract: For many, Graphics Processing Units (GPUs) provide a source of reliable computing power. Recently, Nvidia introduced its 9th generation HPC-grade GPUs, the Ampere 100, claiming significant performance improvements over previous generations, particularly for AI-workloads, as well as introducing new architectural features such as asynchronous data movement. But how well does the A100 perform on non…

    Submitted 3 July, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 7 pages

  6. sputniPIC: an Implicit Particle-in-Cell Code for Multi-GPU Systems

    Authors: Steven W. D. Chien, Jonas Nylund, Gabriel Bengtsson, Ivy B. Peng, Artur Podobas, Stefano Markidis

    Abstract: Large-scale simulations of plasmas are essential for advancing our understanding of fusion devices, space, and astrophysical systems. Particle-in-Cell (PIC) codes have demonstrated their success in simulating numerous plasma phenomena on HPC systems. Today, flagship supercomputers feature multiple GPUs per compute node to achieve unprecedented computing power at high power efficiency. PIC codes re…

    Submitted 10 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at the 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2020)

  7. tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads

    Authors: Steven W. D. Chien, Artur Podobas, Ivy B. Peng, Stefano Markidis

    Abstract: Machine Learning applications on HPC systems have been gaining popularity in recent years. The upcoming large scale systems will offer tremendous parallelism for training through GPUs. However, another heavy aspect of Machine Learning is I/O, and this can potentially be a performance bottleneck. TensorFlow, one of the most popular Deep-Learning platforms, now offers a new profiler interface and al…

    Submitted 11 August, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at the 2020 International Conference on Cluster Computing (CLUSTER 2020)

  8. Performance Evaluation of Advanced Features in CUDA Unified Memory

    Authors: Steven W. D. Chien, Ivy B. Peng, Stefano Markidis

    Abstract: CUDA Unified Memory improves the GPU programmability and also enables GPU memory oversubscription. Recently, two advanced memory features, memory advises and asynchronous prefetch, have been introduced. In this work, we evaluate the new features on two platforms that feature different CPUs, GPUs, and interconnects. We derive a benchmark suite for the experiments and stress the memory system to eva…

    Submitted 21 October, 2019; originally announced October 2019.

    Comments: Accepted for publication at Workshop on Memory Centric High Performance Computing (MCHPC'19) in SC19

  9. arXiv:1908.05715  [pdf, other]

    physics.space-ph cs.LG eess.IV

    Automated classification of plasma regions using 3D particle energy distributions

    Authors: Vyacheslav Olshevsky, Yuri V. Khotyaintsev, Ahmad Lalti, Andrey Divin, Gian Luca Delzanno, Sven Anderzen, Pawel Herman, Steven W. D. Chien, Levon Avanov, Andrew P. Dimmock, Stefano Markidis

    Abstract: We investigate the properties of the ion sky maps produced by the Dual Ion Spectrometers (DIS) from the Fast Plasma Investigation (FPI). We have trained a convolutional neural network classifier to predict four regions crossed by the MMS on the dayside magnetosphere: solar wind, ion foreshock, magnetosheath, and magnetopause using solely DIS spectrograms. The accuracy of the classifier is >98%. We…

    Submitted 21 September, 2021; v1 submitted 15 August, 2019; originally announced August 2019.

    Comments: Accepted to JGR: Space Physics

  10. Posit NPB: Assessing the Precision Improvement in HPC Scientific Applications

    Authors: Steven W. D. Chien, Ivy B. Peng, Stefano Markidis

    Abstract: Floating-point operations can significantly impact the accuracy and performance of scientific applications on large-scale parallel systems. Recently, an emerging floating-point format called Posit has attracted attention as an alternative to the standard IEEE floating-point formats because it could enable higher precision than IEEE formats using the same number of bits. In this work, we first expl…

    Submitted 12 July, 2019; originally announced July 2019.

    Comments: Accepted for publication in PPAM 2019 conference

  11. Multi-GPU Acceleration of the iPIC3D Implicit Particle-in-Cell Code

    Authors: Chaitanya Prasad Sishtla, Steven W. D. Chien, Vyacheslav Olshevsky, Erwin Laure, Stefano Markidis

    Abstract: iPIC3D is a widely used massively parallel Particle-in-Cell code for the simulation of space plasmas. However, its current implementation does not support execution on multiple GPUs. In this paper, we describe the porting of iPIC3D particle mover to GPUs and the optimization steps to increase the performance and parallel scaling on multiple GPUs. We analyze the strong scaling of the mover on two G…

    Submitted 7 April, 2019; originally announced April 2019.

    Comments: Accepted for publication in ICCS 2019

  12. TensorFlow Doing HPC

    Authors: Steven W. D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey S. Vetter

    Abstract: TensorFlow is a popular emerging open-source programming framework supporting the execution of distributed applications on heterogeneous hardware. While TensorFlow has been initially designed for developing Machine Learning (ML) applications, in fact TensorFlow aims at supporting the development of a much broader range of application kinds that are outside the ML domain and can possibly include HP…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: Accepted for publication at The Ninth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES'19)

  13. Particle-in-Cell Simulations of Plasma Dynamics in Cometary Environment

    Authors: Chaitanya Prasad Sishtla, Vyacheslav Olshevsky, Steven W. D. Chien, Stefano Markidis, Erwin Laure

    Abstract: We perform and analyze global Particle-in-Cell (PIC) simulations of the interaction between solar wind and an outgassing comet with the goal of studying the plasma kinetic dynamics of a cometary environment. To achieve this, we design and implement a new numerical method in the iPIC3D code to model outgassing from the comet: new plasma particles are ejected from the comet "surface" at each computa…

    Submitted 28 January, 2019; originally announced January 2019.

    Comments: 11 pages, 5 figures, ASTRONUM-2018

    Journal ref: Journal of Physics: Conference Series, Volume 1225 (2019), 012009

  14. Characterizing Deep-Learning I/O Workloads in TensorFlow

    Authors: Steven W. D. Chien, Stefano Markidis, Chaitanya Prasad Sishtla, Luis Santos, Pawel Herman, Sai Narasimhamurthy, Erwin Laure

    Abstract: The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. In fact, during the training, a considerably high number of relatively small files are first loaded and pre-processed on CPUs and then moved to accelerators for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to resta…

    Submitted 6 October, 2018; originally announced October 2018.

    Comments: Accepted for publication at pdsw-DISCS 2018

  15. arXiv:1807.05183  [pdf, other]

    physics.comp-ph physics.flu-dyn physics.plasm-ph

    PolyPIC: the Polymorphic-Particle-in-Cell Method for Fluid-Kinetic Coupling

    Authors: Stefano Markidis, Vyacheslav Olshevsky, Chaitanya Prasad Sishtla, Steven Wei-der Chien, Erwin Laure, Giovanni Lapenta

    Abstract: Particle-in-Cell (PIC) methods are widely used computational tools for fluid and kinetic plasma modeling. While both the fluid and kinetic PIC approaches have been successfully used to target either kinetic or fluid simulations, little was done to combine fluid and kinetic particles under the same PIC framework. This work addresses this issue by proposing a new PIC method, PolyPIC, that uses polym…

    Submitted 13 July, 2018; originally announced July 2018.

    Comments: Submitted to Frontiers

    Journal ref: Frontiers in Physics, 6 (2018), 100

  16. The SAGE Project: a Storage Centric Approach for Exascale Computing

    Authors: Sai Narasimhamurthy, Nikita Danilov, Sining Wu, Ganesan Umanesan, Steven Wei-der Chien, Sergio Rivas-Gomez, Ivy Bo Peng, Erwin Laure, Shaun de Witt, Dirk Pleiter, Stefano Markidis

    Abstract: SAGE (Percipient StorAGe for Exascale Data Centric Computing) is a European Commission funded project towards the era of Exascale computing. Its goal is to design and implement a Big Data/Extreme Computing (BDEC) capable infrastructure with associated software stack. The SAGE system follows a "storage centric" approach as it is capable of storing and processing large data volumes at the Exascale r…

    Submitted 6 July, 2018; originally announced July 2018.

    Comments: Submitted to Computing Frontiers 2018. arXiv admin note: substantial text overlap with arXiv:1805.00556

  17. Exploring Scientific Application Performance Using Large Scale Object Storage

    Authors: Steven Wei-der Chien, Stefano Markidis, Rami Karim, Erwin Laure, Sai Narasimhamurthy

    Abstract: One of the major performance and scalability bottlenecks in large scientific applications is parallel reading and writing to supercomputer I/O systems. The usage of parallel file systems and consistency requirements of POSIX, that all the traditional HPC parallel I/O interfaces adhere to, pose limitations to the scalability of scientific applications. Object storage is a widely used storage techno…

    Submitted 6 July, 2018; originally announced July 2018.

    Comments: Preprint submitted to WOPSSS workshop at ISC 2018

  18. NVIDIA Tensor Core Programmability, Performance & Precision

    Authors: Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, Jeffrey S. Vetter

    Abstract: The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called "Tensor Core" that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to pro…

    Submitted 11 March, 2018; originally announced March 2018.

    Comments: This paper has been accepted by the Eighth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 2018