Showing 1–5 of 5 results for author: Behrens, F

  1. arXiv:2402.03902

    cs.LG

    A phase transition between positional and semantic learning in a solvable model of dot-product attention

    Authors: Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová

    Abstract: We investigate how a dot-product attention layer learns a positional attention matrix (with tokens attending to each other based on their respective positions) and a semantic attention matrix (with tokens attending to each other based on their meaning). For an algorithmic task, we experimentally show how the same simple architecture can learn to implement a solution using either the positional or…

    Submitted 6 February, 2024; originally announced February 2024.

  2. arXiv:2306.07104

    cs.LG cond-mat.dis-nn stat.ML

    Unveiling the Hessian's Connection to the Decision Boundary

    Authors: Mahalakshmi Sabanayagam, Freya Behrens, Urte Adomaityte, Anna Dawid

    Abstract: Understanding the properties of well-generalizing minima is at the heart of deep learning research. On the one hand, the generalization of neural networks has been connected to the decision boundary complexity, which is hard to study in the high-dimensional input space. Conversely, the flatness of a minimum has become a controversial proxy for generalization. In this work, we provide the missing l…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 14 pages, 6 figures + 18-page appendices with 19 figures. Any feedback is very welcome! Code is available at https://github.com/Shmoo137/Hessian-and-Decision-Boundary

  3. arXiv:2202.10379

    cond-mat.dis-nn cond-mat.stat-mech cs.DM math.PR

    (Dis)assortative Partitions on Random Regular Graphs

    Authors: Freya Behrens, Gabriel Arpino, Yaroslav Kivva, Lenka Zdeborová

    Abstract: We study the problem of assortative and disassortative partitions on random $d$-regular graphs. Nodes in the graph are partitioned into two non-empty groups. In the assortative partition every node requires at least $H$ of their neighbors to be in their own group. In the disassortative partition they require less than $H$ neighbors to be in their own group. Using the cavity method based on analysi…

    Submitted 2 May, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: 21 pages; Corrected usage of the word "planted" in Section 4

    Journal ref: J. Phys. A: Math. Theor. 55 395004 (2022)

  4. arXiv:2102.03815

    cs.LG stat.ML

    Bandits for Learning to Explain from Explanations

    Authors: Freya Behrens, Stefano Teso, Davide Mottin

    Abstract: We introduce Explearn, an online algorithm that learns to jointly output predictions and explanations for those predictions. Explearn leverages Gaussian Processes (GP)-based contextual bandits. This brings two key benefits. First, GPs naturally capture different kinds of explanations and enable the system designer to control how explanations generalize across the space by virtue of choosing a suit…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: Accepted at the Explainable Agency in Artificial Intelligence Workshop, hosted at the 35th AAAI Conference on Artificial Intelligence, February 2-9, 2021

  5. arXiv:2010.01930

    cs.LG cs.IT eess.SP stat.ML

    Neurally Augmented ALISTA

    Authors: Freya Behrens, Jonathan Sauder, Peter Jung

    Abstract: It is well-established that many iterative sparse reconstruction algorithms can be unrolled to yield a learnable neural network for improved empirical performance. A prime example is learned ISTA (LISTA) where weights, step sizes and thresholds are learned from training data. Recently, Analytic LISTA (ALISTA) has been introduced, combining the strong empirical performance of a fully learned approa…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 10 pages, 9 figures