Showing 1–13 of 13 results for author: Ringel, Z

Searching in archive cs.
  1. arXiv:2406.02663  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cs.LG

    Symmetric Kernels with Non-Symmetric Data: A Data-Agnostic Learnability Bound

    Authors: Itay Lavie, Zohar Ringel

    Abstract: Kernel ridge regression (KRR) and Gaussian processes (GPs) are fundamental tools in statistics and machine learning with recent applications to highly over-parameterized deep neural networks. The ability of these tools to learn a target function is directly related to the eigenvalues of their kernel sampled on the input data. Targets having support on higher eigenvalues are more learnable. While k…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2405.06008  [pdf, other]

    cs.LG cond-mat.dis-nn hep-th stat.ML

    Wilsonian Renormalization of Neural Network Gaussian Processes

    Authors: Jessica N. Howard, Ro Jefferson, Anindita Maiti, Zohar Ringel

    Abstract: Separating relevant and irrelevant information is key to any modeling process or scientific inquiry. Theoretical physics offers a powerful tool for achieving this in the form of the renormalization group (RG). Here we demonstrate a practical approach to performing Wilsonian RG in the context of Gaussian Process (GP) Regression. We systematically integrate out the unlearnable modes of the GP kernel…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 24 pages, 1 figure

  3. arXiv:2402.05173  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Towards Understanding Inductive Bias in Transformers: A View From Infinity

    Authors: Itay Lavie, Guy Gur-Ari, Zohar Ringel

    Abstract: We study inductive bias in Transformers in the infinitely over-parameterized Gaussian process limit and argue transformers tend to be biased towards more permutation symmetric functions in sequence space. We show that the representation theory of the symmetric group can be used to give quantitative analytical predictions when the dataset is symmetric to permutations between tokens. We present a si…

    Submitted 28 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: ICML 2024

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  4. arXiv:2310.03789  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Grokking as a First Order Phase Transition in Two Layer Networks

    Authors: Noa Rubin, Inbar Seroussi, Zohar Ringel

    Abstract: A key property of deep neural networks (DNNs) is their ability to learn new features during training. This intriguing aspect of deep learning stands out most clearly in recently reported Grokking phenomena. While mainly reflected as a sudden increase in test accuracy, Grokking is also believed to be a beyond lazy-learning/Gaussian Process (GP) phenomenon involving feature learning. Here we apply a…

    Submitted 5 May, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  5. arXiv:2307.14653  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Speed Limits for Deep Learning

    Authors: Inbar Seroussi, Alexander A. Alemi, Moritz Helias, Zohar Ringel

    Abstract: State-of-the-art neural networks require extreme computational power to train. It is therefore natural to wonder whether they are optimally trained. Here we apply a recent advancement in stochastic thermodynamics which allows bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network, based on the ratio of their Wasserstein-2…

    Submitted 27 July, 2023; originally announced July 2023.

  6. arXiv:2307.06362  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Spectral-Bias and Kernel-Task Alignment in Physically Informed Neural Networks

    Authors: Inbar Seroussi, Asaf Miron, Zohar Ringel

    Abstract: Physically informed neural networks (PINNs) are a promising emerging method for solving differential equations. As in many other deep learning approaches, the choice of PINN design and training protocol requires careful craftsmanship. Here, we suggest a comprehensive theoretical framework that sheds light on this important problem. Leveraging an equivalence between infinitely over-parameterized ne…

    Submitted 5 October, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

  7. arXiv:2112.15383  [pdf, other]

    stat.ML cs.LG physics.data-an

    Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs

    Authors: Inbar Seroussi, Gadi Naveh, Zohar Ringel

    Abstract: Deep neural networks (DNNs) are powerful tools for compressing and distilling information. Their scale and complexity, often involving billions of inter-dependent parameters, render direct microscopic analysis difficult. Under such circumstances, a common strategy is to identify slow variables that average the erratic behavior of the fast microscopic variables. Here, we identify a similar separati…

    Submitted 22 September, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

  8. arXiv:2106.04110  [pdf, other]

    cs.LG cond-mat.stat-mech stat.ML

    A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs

    Authors: Gadi Naveh, Zohar Ringel

    Abstract: Deep neural networks (DNNs) in the infinite width/channel limit have received much attention recently, as they provide a clear analytical window to deep learning via mappings to Gaussian Processes (GPs). Despite its theoretical appeal, this viewpoint lacks a crucial ingredient of deep learning in finite DNNs, lying at the heart of their success -- feature learning. Here we consider DNNs trained w…

    Submitted 8 June, 2021; originally announced June 2021.

    Comments: 9 pages of main text, 23 pages of appendices, 5 figures total

  9. arXiv:2012.01447  [pdf, other]

    cond-mat.stat-mech cond-mat.dis-nn cs.IT

    Relevance in the Renormalization Group and in Information Theory

    Authors: Amit Gordon, Aditya Banerjee, Maciej Koch-Janusz, Zohar Ringel

    Abstract: The analysis of complex physical systems hinges on the ability to extract the relevant degrees of freedom from among the many others. Though much hope is placed in machine learning, it also brings challenges, chief of which is interpretability. It is often unclear what relation, if any, the architecture- and training-dependent learned "relevant" features bear to standard objects of physical theory…

    Submitted 2 December, 2020; originally announced December 2020.

    Journal ref: Phys. Rev. Lett. 126, 240601 (2021)

  10. arXiv:2004.01190  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG cs.NE

    Predicting the outputs of finite deep neural networks trained with noisy gradients

    Authors: Gadi Naveh, Oded Ben-David, Haim Sompolinsky, Zohar Ringel

    Abstract: A recent line of works studied wide deep neural networks (DNNs) by approximating them as Gaussian Processes (GPs). A DNN trained with gradient flow was shown to map to a GP governed by the Neural Tangent Kernel (NTK), whereas earlier works showed that a DNN with an i.i.d. prior over its weights maps to the so-called Neural Network Gaussian Process (NNGP). Here we consider a DNN training protocol,…

    Submitted 30 September, 2021; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: 8 pages + appendix, 7 figures overall

  11. arXiv:1906.05301  [pdf, other]

    cs.LG cond-mat.stat-mech cs.NE physics.data-an stat.ML

    Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective

    Authors: Omry Cohen, Or Malka, Zohar Ringel

    Abstract: In the past decade, deep neural networks (DNNs) came to the fore as the leading machine learning algorithms for a variety of tasks. Their rise was founded on market needs and engineering craftsmanship, the latter based more on trial and error than on theory. While still far behind the application forefront, the theoretical study of DNNs has recently made important advancements in analyzing the hi…

    Submitted 26 November, 2020; v1 submitted 12 June, 2019; originally announced June 2019.

    Journal ref: Phys. Rev. Research 3, 023034 (2021)

  12. arXiv:1902.02354  [pdf, other]

    cs.LG cond-mat.dis-nn cs.AI stat.ML

    The role of a layer in deep neural networks: a Gaussian Process perspective

    Authors: Oded Ben-David, Zohar Ringel

    Abstract: A fundamental question in deep learning concerns the role played by individual layers in a deep neural network (DNN) and the transferable properties of the data representations which they learn. To the extent that layers have clear roles, one should be able to optimize them separately using layer-wise loss functions. Such loss functions would describe what is the set of good data representations a…

    Submitted 13 June, 2019; v1 submitted 6 February, 2019; originally announced February 2019.

  13. arXiv:1704.06279  [pdf, other]

    cond-mat.dis-nn cond-mat.stat-mech cs.IT cs.LG stat.ML

    Mutual Information, Neural Networks and the Renormalization Group

    Authors: Maciej Koch-Janusz, Zohar Ringel

    Abstract: Physical systems differing in their microscopic details often display strikingly similar behaviour when probed at macroscopic scales. Those universal properties, largely determining their physical characteristics, are revealed by the powerful renormalization group (RG) procedure, which systematically retains "slow" degrees of freedom and integrates out the rest. However, the important degrees of…

    Submitted 24 September, 2018; v1 submitted 20 April, 2017; originally announced April 2017.

    Comments: The accepted (substantially extended) version

    Journal ref: Nature Physics 14, 578–582 (2018)