Showing 1–17 of 17 results for author: Banner, R

Searching in archive cs.
  1. arXiv:2306.10598  [pdf, other]

    cs.LG

    DropCompute: simple and more robust distributed synchronous training via compute variance reduction

    Authors: Niv Giladi, Shahar Gottlieb, Moran Shkolnik, Asaf Karnieli, Ron Banner, Elad Hoffer, Kfir Yehuda Levy, Daniel Soudry

    Abstract: Background: Distributed training is essential for large scale training of deep neural networks (DNNs). The dominant methods for large scale DNN training are synchronous (e.g. All-Reduce), but these require waiting for all workers in each step. Thus, these methods are limited by the delays caused by straggling workers. Results: We study a typical scenario in which workers are straggling due to vari…

    Submitted 24 September, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: https://github.com/paper-submissions/dropcompute

    Journal ref: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  2. arXiv:2203.10991  [pdf, other]

    cs.LG cs.AI

    Minimum Variance Unbiased N:M Sparsity for the Neural Gradients

    Authors: Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry

    Abstract: In deep learning, fine-grained N:M sparsity reduces the data footprint and bandwidth of a General Matrix multiply (GEMM) up to x2, and doubles throughput by skipping computation of zero values. So far, it was mainly only used to prune weights to accelerate the forward and backward phases. We examine how this method can be used also for the neural gradients (i.e., loss gradients with respect to the…

    Submitted 9 June, 2024; v1 submitted 21 March, 2022; originally announced March 2022.

  3. arXiv:2202.02783  [pdf, other]

    cs.LG cs.CV

    Energy awareness in low precision neural networks

    Authors: Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov, Tomer Michaeli

    Abstract: Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices. Existing approaches for reducing power consumption rely on quite general principles, including avoidance of multiplication operations and aggressive quantization of weights and activations. However, these methods do not take into account the precise power consumed by each module in the network, a…

    Submitted 6 February, 2022; originally announced February 2022.

  4. arXiv:2201.12843  [pdf, other]

    cs.LG

    Graph Representation Learning via Aggregation Enhancement

    Authors: Maxim Fishman, Chaim Baskin, Evgenii Zheltonozhskii, Almog David, Ron Banner, Avi Mendelson

    Abstract: Graph neural networks (GNNs) have become a powerful tool for processing graph-structured data but still face challenges in effectively aggregating and propagating information between layers, which limits their performance. We tackle this problem with the kernel regression (KR) approach, using KR loss as the primary loss in self-supervised settings or as a regularization term in supervised settings…

    Submitted 8 February, 2023; v1 submitted 30 January, 2022; originally announced January 2022.

  5. arXiv:2112.10769  [pdf, other]

    cs.LG

    Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats

    Authors: Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

    Abstract: Quantization of the weights and activations is one of the main methods to reduce the computational footprint of Deep Neural Networks (DNNs) training. Current methods enable 4-bit quantization of the forward phase. However, this constitutes only a third of the training process. Reducing the computational footprint of the entire training process requires the quantization of the neural gradients, i.e…

    Submitted 9 June, 2024; v1 submitted 19 December, 2021; originally announced December 2021.

  6. arXiv:2102.08124  [pdf, other]

    cs.AI

    Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

    Authors: Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Seffi Naor, Daniel Soudry

    Abstract: Unstructured pruning reduces the memory footprint in deep neural networks (DNNs). Recently, researchers proposed different types of structural pruning intending to reduce also the computation complexity. In this work, we first suggest a new measure called mask-diversity which correlates with the expected accuracy of the different types of structural pruning. We focus on the recently suggested N:M…

    Submitted 20 October, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

  7. arXiv:2012.05328  [pdf, other]

    cs.CV

    GAN "Steerability" without optimization

    Authors: Nurit Spingarn-Eliezer, Ron Banner, Tomer Michaeli

    Abstract: Recent research has shown remarkable success in revealing "steering" directions in the latent spaces of pre-trained GANs. These directions correspond to semantically meaningful image transformations (e.g., shift, zoom, color manipulations), and have similar interpretable effects across all categories that the GAN can generate. Some methods focus on user-specified transformations, while others disco…

    Submitted 24 January, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

  8. arXiv:2006.10518  [pdf, other]

    cs.LG stat.ML

    Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming

    Authors: Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry

    Abstract: Lately, post-training quantization methods have gained considerable attention, as they are simple to use, and require only a small unlabeled calibration set. This small dataset cannot be used to fine-tune the model without significant over-fitting. Instead, these methods only use the calibration set to set the activations' dynamic ranges. However, such methods always resulted in significant accura…

    Submitted 14 December, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  9. arXiv:2006.08173  [pdf, other]

    cs.CV cs.LG

    Neural gradients are near-lognormal: improved quantized and sparse training

    Authors: Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry

    Abstract: While training can mostly be accelerated by reducing the time needed to propagate neural gradients back throughout the model, most previous works focus on the quantization/pruning of weights and activations. These methods are often not applicable to neural gradients, which have very different statistical properties. Distinguished from weights and activations, we find that the distribution of neura…

    Submitted 12 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

  10. arXiv:2002.07686  [pdf, other]

    cs.LG cs.CV stat.ML

    Robust Quantization: One Model to Rule Them All

    Authors: Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser

    Abstract: Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and precise way quantization is performed. Robust quantization offers an alternative approach with improved tolerance to different classes of data-types and quantization policies. It opens up new exciting applications where the qua…

    Submitted 22 October, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

  11. arXiv:1911.07190  [pdf, other]

    cs.LG cs.CV

    Loss Aware Post-training Quantization

    Authors: Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

    Abstract: Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. Additionally, we show that the structure is flat and separable…

    Submitted 16 March, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

  12. arXiv:1909.11481  [pdf, other]

    cs.CV cs.LG

    CAT: Compression-Aware Training for bandwidth reduction

    Authors: Chaim Baskin, Brian Chmiel, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

    Abstract: Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression m…

    Submitted 25 September, 2019; originally announced September 2019.

  13. arXiv:1909.07636  [pdf, other]

    cs.CV

    Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks

    Authors: Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser

    Abstract: Convolutional neural networks (CNNs) introduce state-of-the-art results for various tasks with the price of high computational demands. Inspired by the observation that spatial correlation exists in CNN output feature maps (ofms), we propose a method to dynamically predict whether ofm activations are zero-valued or not according to their neighboring activation values, thereby avoiding zero-valued…

    Submitted 13 July, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

  14. Feature Map Transform Coding for Energy-Efficient CNN Inference

    Authors: Brian Chmiel, Chaim Baskin, Ron Banner, Evgenii Zheltonozhskii, Yevgeny Yermolin, Alex Karbachevsky, Alex M. Bronstein, Avi Mendelson

    Abstract: Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a…

    Submitted 26 September, 2019; v1 submitted 26 May, 2019; originally announced May 2019.

  15. arXiv:1810.05723  [pdf, other]

    cs.CV

    Post-training 4-bit quantization of convolution networks for rapid-deployment

    Authors: Ron Banner, Yury Nahshan, Elad Hoffer, Daniel Soudry

    Abstract: Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources. Neural network quantization has significant benefits in reducing the amount of intermediate results, but it often requires the full datasets and time-consuming fine tuning to recover the accuracy lost after quantization. This paper introduces the…

    Submitted 29 May, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

  16. arXiv:1805.11046  [pdf, other]

    cs.LG stat.ML

    Scalable Methods for 8-bit Training of Neural Networks

    Authors: Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

    Abstract: Quantized Neural Networks (QNNs) are often used to improve network efficiency during the inference phase, i.e. after the network has been trained. Extensive research in the field suggests many different quantization schemes. Still, the number of bits required, as well as the best quantization scheme, are yet unknown. Our theoretical analysis suggests that most of the training process is robust to…

    Submitted 17 June, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

  17. arXiv:1803.01814  [pdf, other]

    stat.ML cs.LG

    Norm matters: efficient and accurate normalization schemes in deep networks

    Authors: Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry

    Abstract: Over the past few years, Batch-Normalization has been commonly used in deep networks, allowing faster training and high performance for a wide variety of applications. However, the reasons behind its merits remained unanswered, with several shortcomings that hindered its use for certain tasks. In this work, we present a novel view on the purpose and function of normalization methods and weight-dec…

    Submitted 7 February, 2019; v1 submitted 5 March, 2018; originally announced March 2018.

    Comments: http://papers.nips.cc/paper/7485-norm-matters-efficient-and-accurate-normalization-schemes-in-deep-networks

    Journal ref: NeurIPS 2018