
Showing 1–4 of 4 results for author: Kawachiya, K

  1. arXiv:2008.08272  [pdf, other]

    cs.PL cs.LG

    Compiling ONNX Neural Network Models Using MLIR

    Authors: Tian Jin, Gheorghe-Teodor Bercea, Tung D. Le, Tong Chen, Gong Su, Haruki Imai, Yasushi Negishi, Anh Leu, Kevin O'Brien, Kiyokuni Kawachiya, Alexandre E. Eichenberger

    Abstract: Deep neural network models are becoming increasingly popular and have been used in various tasks such as computer vision, speech recognition, and natural language processing. Machine learning models are commonly trained in a resource-rich environment and then deployed in a distinct environment such as high availability machines or edge devices. To assist the portability of models, the open-source…

    Submitted 30 September, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

    Comments: 8 pages

  2. arXiv:1907.05013  [pdf, other]

    cs.LG cs.DC cs.PF

    Profiling based Out-of-core Hybrid Method for Large Neural Networks

    Authors: Yuki Ito, Haruki Imai, Tung Le Duc, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Toshio Endo

    Abstract: GPUs are widely used to accelerate deep learning with neural networks (NNs). On the other hand, since GPU memory capacity is limited, it is difficult to implement efficient programs that compute large NNs on GPUs. To compute NNs exceeding GPU memory capacity, the data-swapping method and the recomputing method have been proposed in existing work. However, in these methods, performance overhead occurs due to data movem…

    Submitted 11 July, 2019; originally announced July 2019.

    Comments: 15 pages

  3. arXiv:1812.07816  [pdf]

    cs.LG cs.CV cs.PF stat.ML

    Fast and Accurate 3D Medical Image Segmentation with Data-swapping Method

    Authors: Haruki Imai, Samuel Matzek, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya

    Abstract: Deep neural network models used for medical image segmentation are large because they are trained with high-resolution three-dimensional (3D) images. Graphics processing units (GPUs) are widely used to accelerate training. However, the memory on a GPU is not large enough to train the models. A popular approach to tackling this problem is the patch-based method, which divides a large image into sm…

    Submitted 19 December, 2018; originally announced December 2018.

    Comments: 13 pages

    ACM Class: C.4; I.2.6; I.2.10; I.4.6; I.4.9; J.4

  4. arXiv:1807.02037  [pdf, other]

    cs.LG cs.AI stat.ML

    TFLMS: Large Model Support in TensorFlow by Graph Rewriting

    Authors: Tung D. Le, Haruki Imai, Yasushi Negishi, Kiyokuni Kawachiya

    Abstract: While accelerators such as GPUs have limited memory, deep neural networks are becoming larger and will not fit with the memory limitation of accelerators for training. We propose an approach to tackle this problem by rewriting the computational graph of a neural network, in which swap-out and swap-in operations are inserted to temporarily store intermediate results on CPU memory. In particular, we…

    Submitted 2 October, 2019; v1 submitted 5 July, 2018; originally announced July 2018.

    Comments: A new version of TFLMS was published at ISMM 2019 (https://dl.acm.org/citation.cfm?id=3329984)