Showing 1–3 of 3 results for author: Rofin, M

  1. arXiv:2402.09963

    cs.LG

    Why are Sensitive Functions Hard for Transformers?

    Authors: Michael Hahn, Mark Rofin

    Abstract: Empirical studies have identified a range of learnability biases and limitations of transformers, such as a persistent difficulty in learning to compute simple formal languages such as PARITY, and a bias towards low-degree functions. However, theoretical understanding remains limited, with existing expressiveness theory either overpredicting or underpredicting realistic learning abilities. We prov…

    Submitted 27 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: ACL 2024

  2. arXiv:2211.12092

    cs.CL cs.LG

    Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models

    Authors: Mark Rofin, Nikita Balagansky, Daniil Gavrilov

    Abstract: The simplest way to obtain continuous interpolation between two points in high dimensional space is to draw a line between them. While previous works focused on the general connectivity between model parameters, we explored linear interpolation for parameters of pre-trained models after fine-tuning. Surprisingly, we could perform linear interpolation without a performance drop in intermediate poin…

    Submitted 22 November, 2022; originally announced November 2022.

  3. Vote'n'Rank: Revision of Benchmarking with Social Choice Theory

    Authors: Mark Rofin, Vladislav Mikhailov, Mikhail Florinskiy, Andrey Kravchenko, Elena Tutubalina, Tatiana Shavrina, Daniel Karabekyan, Ekaterina Artemova

    Abstract: The development of state-of-the-art systems in different applied areas of machine learning (ML) is driven by benchmarks, which have shaped the paradigm of evaluating generalisation capabilities from multiple perspectives. Although the paradigm is shifting towards more fine-grained evaluation across diverse tasks, the delicate question of how to aggregate the performances has received particular in…

    Submitted 12 February, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: To appear in EACL 2023 (main)