Showing 1–3 of 3 results for author: Ujváry, S

  1. arXiv:2405.01964  [pdf, other]

    stat.ML cs.LG

    Position: Understanding LLMs Requires More Than Statistical Generalization

    Authors: Patrik Reizinger, Szilvia Ujváry, Anna Mészáros, Anna Kerekes, Wieland Brendel, Ferenc Huszár

    Abstract: The last decade has seen blossoming research in deep learning theory attempting to answer, "Why does deep learning generalize?" A powerful shift in perspective precipitated this progress: the study of overparametrized models in the interpolation regime. In this paper, we argue that another perspective shift is due, since some of the desirable qualities of LLMs are not a consequence of good statist…

    Submitted 17 June, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted as a position paper at ICML2024, Code: https://github.com/rpatrik96/llm-non-identifiability

  2. arXiv:2310.20053  [pdf, other]

    stat.ML cs.LG

    Estimating optimal PAC-Bayes bounds with Hamiltonian Monte Carlo

    Authors: Szilvia Ujváry, Gergely Flamich, Vincent Fortuin, José Miguel Hernández Lobato

    Abstract: An important yet underexplored question in the PAC-Bayes literature is how much tightness we lose by restricting the posterior family to factorized Gaussian distributions when optimizing a PAC-Bayes bound. We investigate this issue by estimating data-independent PAC-Bayes bounds using the optimal posteriors, comparing them to bounds obtained using MFVI. Concretely, we (1) sample from the optimal G…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Mathematics of Modern Machine Learning Workshop at NeurIPS 2023

    ACM Class: G.3

  3. arXiv:2210.10452  [pdf, other]

    stat.ML cs.LG

    Rethinking Sharpness-Aware Minimization as Variational Inference

    Authors: Szilvia Ujváry, Zsigmond Telek, Anna Kerekes, Anna Mészáros, Ferenc Huszár

    Abstract: Sharpness-aware minimization (SAM) aims to improve the generalisation of gradient-based learning by seeking out flat minima. In this work, we establish connections between SAM and Mean-Field Variational Inference (MFVI) of neural network parameters. We show that both these methods have interpretations as optimizing notions of flatness, and when using the reparametrisation trick, they both boil dow…

    Submitted 19 October, 2022; originally announced October 2022.