Showing 1–10 of 10 results for author: Hubara, I

  1. arXiv:2006.10518  [pdf, other]

    cs.LG stat.ML

    Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming

    Authors: Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry

    Abstract: Lately, post-training quantization methods have gained considerable attention, as they are simple to use, and require only a small unlabeled calibration set. This small dataset cannot be used to fine-tune the model without significant over-fitting. Instead, these methods only use the calibration set to set the activations' dynamic ranges. However, such methods always resulted in significant accura…

    Submitted 14 December, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  2. arXiv:1912.01274  [pdf, other]

    cs.LG cs.CV stat.ML

    The Knowledge Within: Methods for Data-Free Model Compression

    Authors: Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry

    Abstract: Recently, an extensive amount of research has been focused on compressing and accelerating Deep Neural Networks (DNN). So far, high compression rate algorithms require part of the training dataset for a low precision calibration, or a fine-tuning process. However, this requirement is unacceptable when the data is unavailable or contains sensitive information, as in medical and biometric use-cases.…

    Submitted 6 April, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

  3. arXiv:1911.02549  [pdf, other]

    cs.LG cs.PF stat.ML

    MLPerf Inference Benchmark

    Authors: Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, et al. (22 additional authors not shown)

    Abstract: Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devic…

    Submitted 9 May, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: ISCA 2020

  4. arXiv:1908.08986  [pdf, other]

    cs.CV cs.LG stat.ML

    Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency

    Authors: Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry

    Abstract: Convolutional neural networks (CNNs) are commonly trained using a fixed spatial image size predetermined for a given model. Although trained on images of a specific size, it is well established that CNNs can be used to evaluate a wide range of image sizes at test time, by adjusting the size of intermediate feature maps. In this work, we describe and evaluate a novel mixed-size training regime that…

    Submitted 12 August, 2019; originally announced August 2019.

  5. arXiv:1901.09335  [pdf, other]

    cs.LG stat.ML

    Augment your batch: better training with larger batches

    Authors: Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry

    Abstract: Large-batch SGD is important for scaling training of deep neural networks. However, without fine-tuning hyperparameter schedules, the generalization of the model may be hampered. We propose to use batch augmentation: replicating instances of samples within the same batch with different data augmentations. Batch augmentation acts as a regularizer and an accelerator, increasing both generalization a…

    Submitted 27 January, 2019; originally announced January 2019.

  6. arXiv:1805.11046  [pdf, other]

    cs.LG stat.ML

    Scalable Methods for 8-bit Training of Neural Networks

    Authors: Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

    Abstract: Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e. after the network has been trained. Extensive research in the field suggests many different quantization schemes. Still, the number of bits required, as well as the best quantization scheme, are yet unknown. Our theoretical analysis suggests that most of the training process is robust to…

    Submitted 17 June, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

  7. arXiv:1801.04540  [pdf, other]

    cs.LG cs.CV stat.ML

    Fix your classifier: the marginal value of training the last weight layer

    Authors: Elad Hoffer, Itay Hubara, Daniel Soudry

    Abstract: Neural networks are commonly used as models for classification for a wide variety of tasks. Typically, a learned affine transformation is placed at the end of such models, yielding a per-class value used for classification. This classifier can have a vast number of parameters, which grows linearly with the number of possible classes, thus requiring increasingly more resources. In this work we argu…

    Submitted 20 March, 2018; v1 submitted 14 January, 2018; originally announced January 2018.

    Comments: https://openreview.net/forum?id=S1Dh8Tg0-

    Journal ref: International Conference on Learning Representations 2018

  8. arXiv:1705.08741  [pdf, other]

    stat.ML cs.LG

    Train longer, generalize better: closing the generalization gap in large batch training of neural networks

    Authors: Elad Hoffer, Itay Hubara, Daniel Soudry

    Abstract: Background: Deep learning models are typically trained using stochastic gradient descent or one of its variants. These methods update the weights using their gradient, estimated from a small fraction of the training data. It has been observed that when using large batch sizes there is a persistent degradation in generalization performance - known as the "generalization gap" phenomena. Identifying…

    Submitted 1 January, 2018; v1 submitted 24 May, 2017; originally announced May 2017.

    Journal ref: Advances in Neural Information Processing Systems 30 2017; pages 1729-1739; http://papers.nips.cc/paper/6770-train-longer-generalize-better-closing-the-generalization-gap-in-large-batch-training-of-neural-networks

  9. arXiv:1611.06996  [pdf, ps, other]

    stat.ML cs.LG

    Spatial contrasting for deep unsupervised learning

    Authors: Elad Hoffer, Itay Hubara, Nir Ailon

    Abstract: Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks. They are, however, most suited for supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques. These attempts require different architectures and training me…

    Submitted 21 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  10. arXiv:1610.00243  [pdf, other]

    cs.LG cs.AI stat.ML

    Deep unsupervised learning through spatial contrasting

    Authors: Elad Hoffer, Itay Hubara, Nir Ailon

    Abstract: Convolutional networks have marked their place over the last few years as the best performing model for various visual tasks. They are, however, most suited for supervised learning from large amounts of labeled data. Previous attempts have been made to use unlabeled data to improve model performance by applying unsupervised techniques. These attempts require different architectures and training me…

    Submitted 4 December, 2018; v1 submitted 2 October, 2016; originally announced October 2016.