Showing 1–9 of 9 results for author: Daxberger, E

  1. arXiv:2309.04354 [pdf, other]

    cs.CV cs.LG stat.ML

    Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts

    Authors: Erik Daxberger, Floris Weers, Bowen Zhang, Tom Gunter, Ruoming Pang, Marcin Eichner, Michael Emmersberger, Yinfei Yang, Alexander Toshev, Xianzhi Du

    Abstract: Sparse Mixture-of-Experts models (MoEs) have recently gained popularity due to their ability to decouple model size from inference efficiency by only activating a small subset of the model parameters for any given input token. As such, sparse MoEs have enabled unprecedented scalability, resulting in tremendous successes across domains such as natural language processing and computer vision. In thi…

    Submitted 8 September, 2023; originally announced September 2023.

  2. arXiv:2206.08900 [pdf, other]

    stat.ML cs.AI cs.LG

    Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

    Authors: Javier Antorán, David Janz, James Urquhart Allingham, Erik Daxberger, Riccardo Barbano, Eric Nalisnick, José Miguel Hernández-Lobato

    Abstract: The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selecti…

    Submitted 8 December, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Paper appearing at ICML 2022

  3. arXiv:2111.03577 [pdf, other]

    cs.LG stat.ML

    Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning

    Authors: Runa Eschenhagen, Erik Daxberger, Philipp Hennig, Agustinus Kristiadi

    Abstract: Deep neural networks are prone to overconfident predictions on outliers. Bayesian neural networks and deep ensembles have both been shown to mitigate this problem to some extent. In this work, we aim to combine the benefits of the two approaches by proposing to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep ne…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: Bayesian Deep Learning Workshop, NeurIPS 2021

  4. arXiv:2106.14806 [pdf, other]

    cs.LG stat.ML

    Laplace Redux -- Effortless Bayesian Deep Learning

    Authors: Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig

    Abstract: Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection. The Laplace approximation (LA) is a classic, and arguably the simplest family of approximations for the intractable posteriors of deep neural networks. Yet, despite its simplicity, the L…

    Submitted 14 March, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 camera-ready version; source code: https://github.com/AlexImmer/Laplace

  5. arXiv:2010.14689 [pdf, other]

    cs.LG stat.ML

    Bayesian Deep Learning via Subnetwork Inference

    Authors: Erik Daxberger, Eric Nalisnick, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights ar…

    Submitted 14 March, 2022; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: ICML 2021; 22 pages, extended version with supplementary material

  6. arXiv:2006.09191 [pdf, other]

    cs.LG stat.ML

    Sample-Efficient Optimization in the Latent Space of Deep Generative Models via Weighted Retraining

    Authors: Austin Tripp, Erik Daxberger, José Miguel Hernández-Lobato

    Abstract: Many important problems in science and engineering, such as drug design, involve optimizing an expensive black-box objective function over a complex, high-dimensional, and structured input space. Although machine learning techniques have shown promise in solving such problems, existing approaches substantially lack sample efficiency. We introduce an improved method for efficient black-box optimiza…

    Submitted 25 October, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: 23 pages, 14 figures; Includes supplementary material; NeurIPS 2020

  7. arXiv:1912.05651 [pdf, other]

    cs.LG stat.ML

    Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection

    Authors: Erik Daxberger, José Miguel Hernández-Lobato

    Abstract: Despite their successes, deep neural networks may make unreliable predictions when faced with test data drawn from a distribution different to that of the training data, constituting a major problem for AI safety. While this has recently motivated the development of methods to detect such out-of-distribution (OoD) inputs, a robust solution is still lacking. We propose a new probabilistic, unsuperv…

    Submitted 15 July, 2020; v1 submitted 11 December, 2019; originally announced December 2019.

    Comments: 21 pages, extended version with supplementary material

  8. Mixed-Variable Bayesian Optimization

    Authors: Erik Daxberger, Anastasia Makarova, Matteo Turchetta, Andreas Krause

    Abstract: The optimization of expensive to evaluate, black-box, mixed-variable functions, i.e. functions that have continuous and discrete inputs, is a difficult and yet pervasive problem in science and engineering. In Bayesian optimization (BO), special cases of this problem that consider fully continuous or fully discrete domains have been widely studied. However, few methods exist for mixed-variable doma…

    Submitted 4 August, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: IJCAI 2020 camera-ready; 17 pages, extended version with supplementary material

    Journal ref: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), 2020, pages 2633-2639

  9. arXiv:1807.00228 [pdf, other]

    cs.AI cs.IR cs.LG

    Embedding Models for Episodic Knowledge Graphs

    Authors: Yunpu Ma, Volker Tresp, Erik Daxberger

    Abstract: In recent years a number of large-scale triple-oriented knowledge graphs have been generated and various models have been proposed to perform learning in those graphs. Most knowledge graphs are static and reflect the world in its current state. In reality, of course, the state of the world is changing: a healthy person becomes diagnosed with a disease and a new president is inaugurated. In this pa…

    Submitted 3 December, 2018; v1 submitted 30 June, 2018; originally announced July 2018.

    Comments: 26 pages