Showing 1–6 of 6 results for author: Nurvitadhi, E

  1. arXiv:2301.04767  [pdf, other]

    cs.AR

    RAD-Sim: Rapid Architecture Exploration for Novel Reconfigurable Acceleration Devices

    Authors: Andrew Boutros, Eriko Nurvitadhi, Vaughn Betz

    Abstract: With the continued growth in field-programmable gate array (FPGA) capacity and their incorporation into new environments such as datacenters, we have witnessed the introduction of a new class of reconfigurable acceleration devices (RADs) that go beyond conventional FPGA architectures. These devices combine a reconfigurable fabric with coarse-grained domain-specialized accelerator blocks all connec…

    Submitted 11 January, 2023; originally announced January 2023.

    Comments: Published in the 2022 proceedings of the International Conference on Field-Programmable Logic and Applications (FPL)

  2. arXiv:2204.10943  [pdf, other]

    cs.DC cs.AI

    FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems

    Authors: Rui Ma, Evangelos Georganas, Alexander Heinecke, Andrew Boutros, Eriko Nurvitadhi

    Abstract: Rapid advances in artificial intelligence (AI) technology have led to significant accuracy improvements in a myriad of application domains at the cost of larger and more compute-intensive models. Training such models on massive amounts of data typically requires scaling to many compute nodes and relies heavily on collective communication algorithms, such as all-reduce, to exchange the weight gradi…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: 5 pages, 4 figures

  3. arXiv:1806.11547  [pdf]

    cs.DC cs.AR

    Exploration of Low Numeric Precision Deep Learning Inference Using Intel FPGAs

    Authors: Philip Colangelo, Nasibeh Nasiri, Asit Mishra, Eriko Nurvitadhi, Martin Margala, Kevin Nealis

    Abstract: CNNs have been shown to maintain reasonable classification accuracy when quantized to lower precisions. Quantizing to sub 8-bit activations and weights can result in accuracy falling below an acceptable threshold. Techniques exist for closing the accuracy gap of limited numeric precision typically by increasing computation. This results in a trade-off between throughput and accuracy and can be tai…

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: To Appear In The 26th IEEE International Symposium on Field-Programmable Custom Computing Machines

  4. arXiv:1709.01134  [pdf, ps, other]

    cs.CV cs.LG cs.NE

    WRPN: Wide Reduced-Precision Networks

    Authors: Asit Mishra, Eriko Nurvitadhi, Jeffrey J Cook, Debbie Marr

    Abstract: For computer vision applications, prior works have shown the efficacy of reducing numeric precision of model parameters (network weights) in deep neural networks. Activation maps, however, occupy a large memory footprint during both the training and inference step when using mini-batches of inputs. One way to reduce this large memory footprint is to reduce the precision of activations. However, pa…

    Submitted 4 September, 2017; originally announced September 2017.

  5. arXiv:1704.03079  [pdf, ps, other]

    cs.LG cs.AI cs.CV cs.NE

    WRPN: Training and Inference using Wide Reduced-Precision Networks

    Authors: Asit Mishra, Jeffrey J Cook, Eriko Nurvitadhi, Debbie Marr

    Abstract: For computer vision applications, prior works have shown the efficacy of reducing the numeric precision of model parameters (network weights) in deep neural networks but also that reducing the precision of activations hurts model accuracy much more than reducing the precision of model parameters. We study schemes to train networks from scratch using reduced-precision activations without hurting th…

    Submitted 10 April, 2017; originally announced April 2017.

    Comments: Under submission to CVPR Workshop

  6. arXiv:1610.00324  [pdf, other]

    cs.LG cs.NE

    Accelerating Deep Convolutional Networks using low-precision and sparsity

    Authors: Ganesh Venkatesh, Eriko Nurvitadhi, Debbie Marr

    Abstract: We explore techniques to significantly improve the compute efficiency and performance of Deep Convolution Networks without impacting their accuracy. To improve the compute efficiency, we focus on achieving high accuracy with extremely low-precision (2-bit) weight networks, and to accelerate the execution time, we aggressively skip operations on zero-values. We achieve the highest reported accuracy…

    Submitted 2 October, 2016; originally announced October 2016.