Showing 1–4 of 4 results for author: Gabrielsson, R B

  1. arXiv:2406.04047  [pdf, other]

    stat.ML cs.LG

    Slicing Mutual Information Generalization Bounds for Neural Networks

    Authors: Kimia Nadjahi, Kristjan Greenewald, Rickard Brüel Gabrielsson, Justin Solomon

    Abstract: The ability of machine learning (ML) algorithms to generalize well to unseen data has been studied through the lens of information theory, by bounding the generalization error with the input-output mutual information (MI), i.e., the MI between the training data and the learned hypothesis. Yet, these bounds have limited practicality for modern ML applications (e.g., deep learning), due to the diffi…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  2. arXiv:2402.16842  [pdf, other]

    cs.LG

    Asymmetry in Low-Rank Adapters of Foundation Models

    Authors: Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon

    Abstract: Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. Specifically,…

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 17 pages, 2 figures, 9 tables

  3. arXiv:1811.01122  [pdf, other]

    cs.LG cs.AI math.AT stat.ML

    Topological Approaches to Deep Learning

    Authors: Gunnar Carlsson, Rickard Brüel Gabrielsson

    Abstract: We perform topological data analysis on the internal states of convolutional deep neural networks to develop an understanding of the computations that they perform. We apply this understanding to modify the computations so as to (a) speed up computations and (b) improve generalization from one data set of digits to another. One byproduct of the analysis is the production of a geometry on new sets…

    Submitted 2 November, 2018; originally announced November 2018.

    Comments: 23 pages, 10 figures

    MSC Class: 68T05; 55N35; 62-07

  4. arXiv:1810.03234  [pdf, other]

    cs.CV

    Exposition and Interpretation of the Topology of Neural Networks

    Authors: Rickard Brüel Gabrielsson, Gunnar Carlsson

    Abstract: Convolutional neural networks (CNN's) are powerful and widely used tools. However, their interpretability is far from ideal. One such shortcoming is the difficulty of deducing a network's ability to generalize to unseen data. We use topological data analysis to show that the information encoded in the weights of a CNN can be organized in terms of a topological data model and demonstrate how such i…

    Submitted 18 October, 2019; v1 submitted 7 October, 2018; originally announced October 2018.