Showing 1–12 of 12 results for author: Gilboa, D

  1. arXiv:2310.07136

    quant-ph stat.ML

    Exponential Quantum Communication Advantage in Distributed Inference and Learning

    Authors: Hagay Michaeli, Dar Gilboa, Daniel Soudry, Jarrod R. McClean

    Abstract: Training and inference with large machine learning models that far exceed the memory capacity of individual devices necessitates the design of distributed architectures, forcing one to contend with communication constraints. We present a framework for distributed computation over a quantum network in which data is encoded into specialized quantum states. We prove that for models within this framew…

    Submitted 21 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  2. arXiv:2206.04615

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  3. arXiv:2107.14324

    stat.ML cs.LG math.OC

    Deep Networks Provably Classify Data on Curves

    Authors: Tingran Wang, Sam Buchanan, Dar Gilboa, John Wright

    Abstract: Data with low-dimensional nonlinear structure are ubiquitous in engineering and scientific problems. We study a model problem with such structure -- a binary classification task that uses a deep fully-connected neural network to classify data drawn from two disjoint smooth curves on the unit sphere. Aside from mild regularity conditions, we place no restrictions on the configuration of the curves.…

    Submitted 28 October, 2021; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: NeurIPS 2021

  4. arXiv:2106.04741

    stat.ML cs.LG stat.ME

    Marginalizable Density Models

    Authors: Dar Gilboa, Ari Pakman, Thibault Vatter

    Abstract: Probability density models based on deep networks have achieved remarkable success in modeling complex high-dimensional datasets. However, unlike kernel density estimators, modern neural models do not yield marginals or conditionals in closed form, as these quantities require the evaluation of seldom tractable integrals. In this work, we present the Marginalizable Density Model Approximator (MDMA)…

    Submitted 8 June, 2021; originally announced June 2021.

  5. arXiv:2102.00218

    cs.IT stat.CO

    Estimating the Unique Information of Continuous Variables

    Authors: Ari Pakman, Amin Nejatbakhsh, Dar Gilboa, Abdullah Makkeh, Luca Mazzucato, Michael Wibral, Elad Schneidman

    Abstract: The integration and transfer of information from multiple sources to multiple targets is a core motive of neural systems. The emerging field of partial information decomposition (PID) provides a novel information-theoretic lens into these mechanisms by identifying synergistic, redundant, and unique contributions to the mutual information between one and several variables. While many works have stu…

    Submitted 26 October, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

    Journal ref: NeurIPS 2021

  6. arXiv:2008.11245

    stat.ML cs.LG math.OC

    Deep Networks and the Multiple Manifold Problem

    Authors: Sam Buchanan, Dar Gilboa, John Wright

    Abstract: We study the multiple manifold problem, a binary classification task modeled on applications in machine vision, in which a deep fully-connected neural network is trained to separate two low-dimensional submanifolds of the unit sphere. We provide an analysis of the one-dimensional case, proving for a simple manifold configuration that when the network depth $L$ is large relative to certain geometri…

    Submitted 6 May, 2021; v1 submitted 25 August, 2020; originally announced August 2020.

    Comments: ICLR 2021

  7. arXiv:2007.01038

    cs.LG stat.ML

    Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?

    Authors: Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

    Abstract: Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We construct a deep convolutional network with identical features by initializing almost all the weights to $0$. The architecture also enables perfect signal…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: ICML 2020

  8. arXiv:1912.05137   

    cs.LG stat.ML

    Is Feature Diversity Necessary in Neural Network Initialization?

    Authors: Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

    Abstract: Standard practice in training neural networks involves initializing the weights in an independent fashion. The results of recent work suggest that feature "diversity" at initialization plays an important role in training the network. However, other initialization schemes with reduced feature diversity have also been shown to be viable. In this work, we conduct a series of experiments aimed at eluc…

    Submitted 3 July, 2020; v1 submitted 11 December, 2019; originally announced December 2019.

    Comments: This paper has been substantially modified, updated, and expanded with additional content (arXiv:2007.01038). To avoid confusion, we are withdrawing the old version of this article

  9. arXiv:1909.11572

    cs.LG stat.ML

    Wider Networks Learn Better Features

    Authors: Dar Gilboa, Guy Gur-Ari

    Abstract: Transferability of learned features between tasks can massively reduce the cost of training a neural network on a novel task. We investigate the effect of network width on learned features using activation atlases --- a visualization technique that captures features the entire hidden state responds to, as opposed to individual neurons alone. We find that, while individual neurons do not learn inte…

    Submitted 25 September, 2019; originally announced September 2019.

  10. arXiv:1906.00771

    stat.ML cs.LG

    A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off

    Authors: Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

    Abstract: Reducing the precision of weights and activation functions in neural network training, with minimal impact on performance, is essential for the deployment of these models in resource-constrained environments. We apply mean-field techniques to networks with quantized activations in order to evaluate the degree to which quantization degrades signal propagation at initialization. We derive initializa…

    Submitted 31 October, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: NIPS 2019

  11. arXiv:1901.08987

    cs.LG stat.ML

    Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs

    Authors: Dar Gilboa, Bo Chang, Minmin Chen, Greg Yang, Samuel S. Schoenholz, Ed H. Chi, Jeffrey Pennington

    Abstract: Training recurrent neural networks (RNNs) on long sequence tasks is plagued with difficulties arising from the exponential explosion or vanishing of signals as they propagate forward or backward through the network. Many techniques have been proposed to ameliorate these issues, including various algorithmic and architectural modifications. Two of the most successful RNN architectures, the LSTM and…

    Submitted 23 May, 2019; v1 submitted 25 January, 2019; originally announced January 2019.

  12. arXiv:1609.00770

    stat.CO stat.ML

    Stochastic Bouncy Particle Sampler

    Authors: Ari Pakman, Dar Gilboa, David Carlson, Liam Paninski

    Abstract: We introduce a novel stochastic version of the non-reversible, rejection-free Bouncy Particle Sampler (BPS), a Markov process whose sample trajectories are piecewise linear. The algorithm is based on simulating first arrival times in a doubly stochastic Poisson process using the thinning method, and allows efficient sampling of Bayesian posteriors in big datasets. We prove that in the BPS no bias…

    Submitted 13 June, 2017; v1 submitted 2 September, 2016; originally announced September 2016.

    Comments: ICML Camera ready version