Showing 1–18 of 18 results for author: Hubara, I

Searching in archive cs.
  1. arXiv:2401.14110  [pdf, other]

    cs.LG cs.AI cs.AR

    Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators

    Authors: Yaniv Blumenfeld, Itay Hubara, Daniel Soudry

    Abstract: The majority of the research on the quantization of Deep Neural Networks (DNNs) is focused on reducing the precision of tensors visible by high-level frameworks (e.g., weights, activations, and gradients). However, current hardware still relies on high-accuracy core operations. Most significant is the operation of accumulating products. This high-precision accumulation operation is gradually becom…

    Submitted 25 January, 2024; originally announced January 2024.

  2. arXiv:2203.10991  [pdf, other]

    cs.LG cs.AI

    Minimum Variance Unbiased N:M Sparsity for the Neural Gradients

    Authors: Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry

    Abstract: In deep learning, fine-grained N:M sparsity reduces the data footprint and bandwidth of a General Matrix multiply (GEMM) up to x2, and doubles throughput by skipping computation of zero values. So far, it was mainly only used to prune weights to accelerate the forward and backward phases. We examine how this method can be used also for the neural gradients (i.e., loss gradients with respect to the…

    Submitted 9 June, 2024; v1 submitted 21 March, 2022; originally announced March 2022.

  3. arXiv:2102.08124  [pdf, other]

    cs.AI

    Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

    Authors: Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Seffi Naor, Daniel Soudry

    Abstract: Unstructured pruning reduces the memory footprint in deep neural networks (DNNs). Recently, researchers proposed different types of structural pruning intending to reduce also the computation complexity. In this work, we first suggest a new measure called mask-diversity which correlates with the expected accuracy of the different types of structural pruning. We focus on the recently suggested N:M…

    Submitted 20 October, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

  4. arXiv:2006.10518  [pdf, other]

    cs.LG stat.ML

    Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming

    Authors: Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry

    Abstract: Lately, post-training quantization methods have gained considerable attention, as they are simple to use, and require only a small unlabeled calibration set. This small dataset cannot be used to fine-tune the model without significant over-fitting. Instead, these methods only use the calibration set to set the activations' dynamic ranges. However, such methods always resulted in significant accura…

    Submitted 14 December, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  5. arXiv:1912.12636  [pdf, other]

    cs.ET cs.AR cs.LG cs.NE

    Training of Quantized Deep Neural Networks using a Magnetic Tunnel Junction-Based Synapse

    Authors: Tzofnat Greenberg Toledo, Ben Perach, Itay Hubara, Daniel Soudry, Shahar Kvatinsky

    Abstract: Quantized neural networks (QNNs) are being actively researched as a solution for the computational complexity and memory intensity of deep neural networks. This has sparked efforts to develop algorithms that support both inference and training with quantized weight and activation values, without sacrificing accuracy. A recent example is the GXNOR framework for stochastic training of ternary (TNN)…

    Submitted 29 May, 2022; v1 submitted 29 December, 2019; originally announced December 2019.

    Comments: Published in Semiconductor Science and Technology, Vol 36

    Journal ref: Semicond. Sci. Technol. 36 114003 (2021)

  6. arXiv:1912.01274  [pdf, other]

    cs.LG cs.CV stat.ML

    The Knowledge Within: Methods for Data-Free Model Compression

    Authors: Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry

    Abstract: Recently, an extensive amount of research has been focused on compressing and accelerating Deep Neural Networks (DNN). So far, high compression rate algorithms require part of the training dataset for a low precision calibration, or a fine-tuning process. However, this requirement is unacceptable when the data is unavailable or contains sensitive information, as in medical and biometric use-cases.…

    Submitted 6 April, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

  7. arXiv:1911.02549  [pdf, other]

    cs.LG cs.PF stat.ML

    MLPerf Inference Benchmark

    Authors: Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, et al. (22 additional authors not shown)

    Abstract: Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devic…

    Submitted 9 May, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: ISCA 2020

  8. arXiv:1908.08986  [pdf, other]

    cs.CV cs.LG stat.ML

    Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency

    Authors: Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry

    Abstract: Convolutional neural networks (CNNs) are commonly trained using a fixed spatial image size predetermined for a given model. Although trained on images of a specific size, it is well established that CNNs can be used to evaluate a wide range of image sizes at test time, by adjusting the size of intermediate feature maps. In this work, we describe and evaluate a novel mixed-size training regime that…

    Submitted 12 August, 2019; originally announced August 2019.

  9. arXiv:1901.09335  [pdf, other]

    cs.LG stat.ML

    Augment your batch: better training with larger batches

    Authors: Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry

    Abstract: Large-batch SGD is important for scaling training of deep neural networks. However, without fine-tuning hyperparameter schedules, the generalization of the model may be hampered. We propose to use batch augmentation: replicating instances of samples within the same batch with different data augmentations. Batch augmentation acts as a regularizer and an accelerator, increasing both generalization a…

    Submitted 27 January, 2019; originally announced January 2019.

  10. arXiv:1805.11046  [pdf, other]

    cs.LG stat.ML

    Scalable Methods for 8-bit Training of Neural Networks

    Authors: Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

    Abstract: Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e. after the network has been trained. Extensive research in the field suggests many different quantization schemes. Still, the number of bits required, as well as the best quantization scheme, are yet unknown. Our theoretical analysis suggests that most of the training process is robust to…

    Submitted 17 June, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

  11. arXiv:1801.04540  [pdf, other]

    cs.LG cs.CV stat.ML

    Fix your classifier: the marginal value of training the last weight layer

    Authors: Elad Hoffer, Itay Hubara, Daniel Soudry

    Abstract: Neural networks are commonly used as models for classification for a wide variety of tasks. Typically, a learned affine transformation is placed at the end of such models, yielding a per-class value used for classification. This classifier can have a vast number of parameters, which grows linearly with the number of possible classes, thus requiring increasingly more resources. In this work we argu…

    Submitted 20 March, 2018; v1 submitted 14 January, 2018; originally announced January 2018.

    Comments: https://openreview.net/forum?id=S1Dh8Tg0-

    Journal ref: International Conference on Learning Representations 2018

  12. arXiv:1705.08741  [pdf, other]

    stat.ML cs.LG

    Train longer, generalize better: closing the generalization gap in large batch training of neural networks

    Authors: Elad Hoffer, Itay Hubara, Daniel Soudry

    Abstract: Background: Deep learning models are typically trained using stochastic gradient descent or one of its variants. These methods update the weights using their gradient, estimated from a small fraction of the training data. It has been observed that when using large batch sizes there is a persistent degradation in generalization performance - known as the "generalization gap" phenomena. Identifying…

    Submitted 1 January, 2018; v1 submitted 24 May, 2017; originally announced May 2017.

    Journal ref: Advances in Neural Information Processing Systems 30 2017; pages 1729-1739; http://papers.nips.cc/paper/6770-train-longer-generalize-better-closing-the-generalization-gap-in-large-batch-training-of-neural-networks

  13. arXiv:1611.06996  [pdf, ps, other]

    stat.ML cs.LG

    Spatial contrasting for deep unsupervised learning

    Authors: Elad Hoffer, Itay Hubara, Nir Ailon

    Abstract: Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks. They are, however, most suited for supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques. These attempts require different architectures and training me…

    Submitted 21 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  14. arXiv:1611.02205  [pdf, other]

    cs.LG cs.AI

    Playing SNES in the Retro Learning Environment

    Authors: Nadav Bhonker, Shai Rozenberg, Itay Hubara

    Abstract: Mastering a video game requires skill, tactics and strategy. While these attributes may be acquired naturally by human players, teaching them to a computer program is a far more challenging task. In recent years, extensive research was carried out in the field of reinforcement learning and numerous algorithms were introduced, aiming to learn how to perform human tasks such as playing video games.…

    Submitted 7 February, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

  15. arXiv:1610.00243  [pdf, other]

    cs.LG cs.AI stat.ML

    Deep unsupervised learning through spatial contrasting

    Authors: Elad Hoffer, Itay Hubara, Nir Ailon

    Abstract: Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks. They are, however, most suited for supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques. These attempts require different architectures and training me…

    Submitted 4 December, 2018; v1 submitted 2 October, 2016; originally announced October 2016.

  16. arXiv:1609.07061  [pdf, other]

    cs.NE cs.LG

    Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations

    Authors: Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio

    Abstract: We introduce a method to train Quantized Neural Networks (QNNs) --- neural networks with extremely low precision (e.g., 1-bit) weights and activations, at run-time. At train-time the quantized weights and activations are used for computing the parameter gradients. During the forward pass, QNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operati…

    Submitted 22 September, 2016; originally announced September 2016.

    Comments: arXiv admin note: text overlap with arXiv:1602.02830

  17. arXiv:1602.02830  [pdf, other]

    cs.LG

    Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1

    Authors: Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio

    Abstract: We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time. At training-time the binary weights and activations are used for computing the parameters gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substan…

    Submitted 17 March, 2016; v1 submitted 8 February, 2016; originally announced February 2016.

    Comments: 11 pages and 3 figures

  18. arXiv:1602.02505   

    cs.LG cs.NE

    Binarized Neural Networks

    Authors: Itay Hubara, Daniel Soudry, Ran El Yaniv

    Abstract: We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time and when computing the parameters' gradient at train-time. We conduct two sets of experiments, each based on a different framework, namely Torch7 and Theano, where we train BNNs on MNIST, CIFAR-10 and SVHN, and achieve nearly state-of-the-art results. During the forward…

    Submitted 10 March, 2016; v1 submitted 8 February, 2016; originally announced February 2016.

    Comments: This is an obsolete version; the up-to-date version is available here: arXiv:1602.02830