Showing 1–3 of 3 results for author: Noh, S H

  1. arXiv:2304.08925  [pdf, other]

    cs.LG cs.PF

    Understand Data Preprocessing for Effective End-to-End Training of Deep Neural Networks

    Authors: ** Gong, Yuxin Ma, Cheng Li, Xiaosong Ma, Sam H. Noh

    Abstract: In this paper, we primarily focus on understanding the data preprocessing pipeline for DNN Training in the public cloud. First, we run experiments to test the performance implications of the two major data preprocessing methods using either raw data or record files. The preliminary results show that data preprocessing is a clear bottleneck, even with the most efficient software and hardware config…

    Submitted 18 April, 2023; originally announced April 2023.

  2. arXiv:2005.14038  [pdf, other]

    cs.DC

    HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism

    Authors: Jay H. Park, Gyeongchan Yun, Chang M. Yi, Nguyen T. Nguyen, Seungmin Lee, Jaesik Choi, Sam H. Noh, Young-ri Choi

    Abstract: Deep Neural Network (DNN) models have continuously been growing in size in order to improve the accuracy and quality of the models. Moreover, for training of large DNN models, the use of heterogeneous GPUs is inevitable due to the short release cycle of new GPU architectures. In this paper, we investigate how to enable training of large DNN models on a heterogeneous GPU cluster that possibly inclu…

    Submitted 28 May, 2020; originally announced May 2020.

  3. arXiv:1901.05803  [pdf, other]

    cs.DC

    Accelerated Training for CNN Distributed Deep Learning through Automatic Resource-Aware Layer Placement

    Authors: Jay H. Park, Sunghwan Kim, **won Lee, Myeongjae Jeon, Sam H. Noh

    Abstract: The Convolutional Neural Network (CNN) model, often used for image classification, requires significant training time to obtain high accuracy. To this end, distributed training is performed with the parameter server (PS) architecture using multiple servers. Unfortunately, scalability has been found to be poor in existing architectures. We find that the PS network is the bottleneck as it communicat…

    Submitted 17 January, 2019; originally announced January 2019.