Showing 1–6 of 6 results for author: Súkeník, P

  1. arXiv:2405.14468  [pdf, other]

    cs.LG math.OC stat.ML

    Neural Collapse versus Low-rank Bias: Is Deep Neural Collapse Really Optimal?

    Authors: Peter Súkeník, Marco Mondelli, Christoph Lampert

    Abstract: Deep neural networks (DNNs) exhibit a surprising structure in their final layer known as neural collapse (NC), and a growing body of works has currently investigated the propagation of neural collapse to earlier layers of DNNs -- a phenomenon called deep neural collapse (DNC). However, existing theoretical results are restricted to special cases: linear models, only two layers or binary classifica…

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2402.13728  [pdf, other]

    cs.LG stat.ML

    Average gradient outer product as a mechanism for deep neural collapse

    Authors: Daniel Beaglehole, Peter Súkeník, Marco Mondelli, Mikhail Belkin

    Abstract: Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data representations in the final layers of Deep Neural Networks (DNNs). Though the phenomenon has been measured in a variety of settings, its emergence is typically explained via data-agnostic approaches, such as the unconstrained features model. In this work, we introduce a data-dependent setting where DNC forms due to…

    Submitted 23 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  3. arXiv:2305.13165  [pdf, other]

    cs.LG stat.ML

    Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model

    Authors: Peter Súkeník, Marco Mondelli, Christoph Lampert

    Abstract: Neural collapse (NC) refers to the surprising structure of the last layer of deep neural networks in the terminal phase of gradient descent training. Recently, an increasing amount of experimental evidence has pointed to the propagation of NC to earlier layers of neural networks. However, while the NC in the last layer is well studied theoretically, much less is known about its multi-layered count…

    Submitted 22 May, 2023; originally announced May 2023.

  4. arXiv:2210.05657  [pdf, other]

    cs.CV cs.AI

    The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

    Authors: Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Nießner, Laura Leal-Taixé, Ismail Elezi

    Abstract: Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers or MLP-based architectures have started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for use in low-data regimes. In this work, we propose a simple yet…

    Submitted 13 October, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022, Homepage: https://peter-kocsis.github.io/LowDataGeneralization/ 24 pages, 14 figures, 12 tables

    ACM Class: I.2.10; I.5.1; I.4.8

  5. arXiv:2208.13499  [pdf, other]

    cs.LG cs.AI stat.ML

    Generalization In Multi-Objective Machine Learning

    Authors: Peter Súkeník, Christoph H. Lampert

    Abstract: Modern machine learning tasks often require considering not just one but multiple objectives. For example, besides the prediction quality, this could be the efficiency, robustness or fairness of the learned models, or any of their combinations. Multi-objective learning offers a natural framework for handling such problems without having to commit to early trade-offs. Surprisingly, statistical lear…

    Submitted 29 August, 2022; originally announced August 2022.

    Comments: 17 pages, 2 figures; Peter Súkeník and Christoph H. Lampert contributed equally

  6. arXiv:2110.05365  [pdf, other]

    cs.LG cs.AI stat.ML

    Intriguing Properties of Input-dependent Randomized Smoothing

    Authors: Peter Súkeník, Aleksei Kuvshinov, Stephan Günnemann

    Abstract: Randomized smoothing is currently considered the state-of-the-art method to obtain certifiably robust classifiers. Despite its remarkable performance, the method is associated with various serious problems such as "certified accuracy waterfalls", certification vs. accuracy trade-off, or even fairness issues. Input-dependent smoothing approaches have been proposed with the intention of overcoming thes…

    Submitted 8 March, 2024; v1 submitted 11 October, 2021; originally announced October 2021.