
Showing 1–12 of 12 results for author: Nachum, I

Searching in archive cs.
  1. arXiv:2309.13658  [pdf, other]

    cs.LG cs.NE stat.ML

    Fantastic Generalization Measures are Nowhere to be Found

    Authors: Michael Gastpar, Ido Nachum, Jonathan Shafer, Thomas Weinberger

    Abstract: We study the notion of a generalization bound being uniformly tight, meaning that the difference between the bound and the population loss is small for all learning algorithms and all population distributions. Numerous generalization bounds have been proposed in the literature as potential explanations for the ability of neural networks to generalize in the overparameterized setting. However, in t…

    Submitted 28 November, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: 34 pages, 1 figure. Minor fix: subsection 6.2 -> section 7

  2. arXiv:2206.13257  [pdf, ps, other]

    cs.LG cs.IT

    Finite Littlestone Dimension Implies Finite Information Complexity

    Authors: Aditya Pradeep, Ido Nachum, Michael Gastpar

    Abstract: We prove that every online learnable class of functions of Littlestone dimension $d$ admits a learning algorithm with finite information complexity. Towards this end, we use the notion of a globally stable algorithm. Generally, the information complexity of such a globally stable algorithm is large yet finite, roughly exponential in $d$. We also show there is room for improvement; for a canonical…

    Submitted 27 June, 2022; originally announced June 2022.

  3. arXiv:2111.02155  [pdf, ps, other]

    cs.LG cs.NE math.PR

    A Johnson--Lindenstrauss Framework for Randomly Initialized CNNs

    Authors: Ido Nachum, Jan Hązła, Michael Gastpar, Anatoly Khina

    Abstract: How does the geometric representation of a dataset change after the application of each randomly initialized layer of a neural network? The celebrated Johnson--Lindenstrauss lemma answers this question for linear fully-connected neural networks (FNNs), stating that the geometry is essentially preserved. For FNNs with the ReLU activation, the angle between two inputs contracts according to a known…

    Submitted 7 March, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

  4. arXiv:2111.02154  [pdf, ps, other]

    cs.LG cs.NE

    Regularization by Misclassification in ReLU Neural Networks

    Authors: Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff

    Abstract: We study the implicit bias of ReLU neural networks trained by a variant of SGD where at each step, the label is changed with probability $p$ to a random label (label smoothing being a close variant of this procedure). Our experiments demonstrate that label noise propels the network to a sparse solution in the following sense: for a typical input, a small fraction of neurons are active, and the fir…

    Submitted 3 November, 2021; originally announced November 2021.

  5. arXiv:2004.09590  [pdf, other]

    cs.IT

    Almost-Reed--Muller Codes Achieve Constant Rates for Random Errors

    Authors: Emmanuel Abbe, Jan Hązła, Ido Nachum

    Abstract: This paper considers '$δ$-almost Reed-Muller codes', i.e., linear codes spanned by evaluations of all but a $δ$ fraction of monomials of degree at most $d$. It is shown that for any $δ> 0$ and any $\varepsilon>0$, there exists a family of $δ$-almost Reed-Muller codes of constant rate that correct $1/2-\varepsilon$ fraction of random errors with high probability. For exact Reed-Muller codes, the an…

    Submitted 5 October, 2021; v1 submitted 20 April, 2020; originally announced April 2020.

  6. arXiv:1907.00560  [pdf, ps, other]

    cs.LG stat.ML

    On Symmetry and Initialization for Neural Networks

    Authors: Ido Nachum, Amir Yehudayoff

    Abstract: This work provides an additional step in the theoretical understanding of neural networks. We consider neural networks with one hidden layer and show that when learning symmetric functions, one can choose initial conditions so that standard SGD training efficiently produces generalization guarantees. We empirically verify this and show that this does not hold when the initial conditions are chosen…

    Submitted 1 July, 2019; originally announced July 2019.

  7. arXiv:1811.09923  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Average-Case Information Complexity of Learning

    Authors: Ido Nachum, Amir Yehudayoff

    Abstract: How many bits of information are revealed by a learning algorithm for a concept class of VC-dimension $d$? Previous works have shown that even for $d=1$ the amount of information may be unbounded (tend to $\infty$ with the universe size). Can it be that all concepts in the class require leaking a large amount of information? We show that typically concepts do not require leakage. There exists a pr…

    Submitted 24 November, 2018; originally announced November 2018.

  8. arXiv:1806.05570  [pdf, ps, other]

    cs.CV

    Direct Automated Quantitative Measurement of Spine via Cascade Amplifier Regression Network

    Authors: Shumao Pang, Stephanie Leung, Ilanit Ben Nachum, Qianjin Feng, Shuo Li

    Abstract: Automated quantitative measurement of the spine (i.e., multiple indices estimation of heights, widths, areas, and so on for the vertebral body and disc) is of the utmost importance in clinical spinal disease diagnoses, such as osteoporosis, intervertebral disc degeneration, and lumbar disc herniation, yet still an unprecedented challenge due to the variety of spine structure and the high dimension…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: Accepted by MICCAI 2018

  9. arXiv:1806.05403  [pdf, ps, other]

    cs.LG stat.ML

    On the Perceptron's Compression

    Authors: Shay Moran, Ido Nachum, Itai Panasoff, Amir Yehudayoff

    Abstract: We study and provide exposition to several phenomena that are related to the perceptron's compression. One theme concerns modifications of the perceptron algorithm that yield better guarantees on the margin of the hyperplane it outputs. These modifications can be useful in training neural networks as well, and we demonstrate them with some experimental data. In a second theme, we deduce conclusion…

    Submitted 14 June, 2018; originally announced June 2018.

  10. arXiv:1804.05474  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    A Direct Sum Result for the Information Complexity of Learning

    Authors: Ido Nachum, Jonathan Shafer, Amir Yehudayoff

    Abstract: How many bits of information are required to PAC learn a class of hypotheses of VC dimension $d$? The mathematical setting we follow is that of Bassily et al. (2018), where the value of interest is the mutual information $\mathrm{I}(S;A(S))$ between the input sample $S$ and the hypothesis outputted by the learning algorithm $A$. We introduce a class of functions of VC dimension $d$ over the domain…

    Submitted 15 April, 2018; originally announced April 2018.

  11. arXiv:1710.05233  [pdf, ps, other]

    cs.LG cs.AI cs.CR cs.IT

    Learners that Use Little Information

    Authors: Raef Bassily, Shay Moran, Ido Nachum, Jonathan Shafer, Amir Yehudayoff

    Abstract: We study learning algorithms that are restricted to using a small amount of information from their input sample. We introduce a category of learning algorithms we term $d$-bit information learners, which are algorithms whose output conveys at most $d$ bits of information of their input. A central theme in this work is that such algorithms generalize. We focus on the learning capacity of these al…

    Submitted 27 February, 2018; v1 submitted 14 October, 2017; originally announced October 2017.

  12. arXiv:1705.09728  [pdf, other]

    cs.CV

    Direct Estimation of Regional Wall Thicknesses via Residual Recurrent Neural Network

    Authors: Wufeng Xue, Ilanit Ben Nachum, Sachin Pandey, James Warrington, Stephanie Leung, Shuo Li

    Abstract: Accurate estimation of regional wall thicknesses (RWT) of left ventricular (LV) myocardium from cardiac MR sequences is of significant importance for identification and diagnosis of cardiac disease. Existing RWT estimation still relies on segmentation of LV myocardium, which requires strong prior information and user interaction. No work has been devoted into direct estimation of RWT from cardiac…

    Submitted 26 May, 2017; originally announced May 2017.

    Comments: To appear as an oral paper in IPMI2017