Showing 1–10 of 10 results for author: Allingham, J U

Searching in archive cs.
  1. arXiv:2403.01946  [pdf, other]

    cs.LG

    A Generative Model of Symmetry Transformations

    Authors: James Urquhart Allingham, Bruno Kacper Mlodozeniec, Shreyas Padhy, Javier Antorán, David Krueger, Richard E. Turner, Eric Nalisnick, José Miguel Hernández-Lobato

    Abstract: Correctly capturing the symmetry transformations of data can lead to efficient models with strong generalization capabilities, though methods incorporating symmetries often require prior knowledge. While recent advancements have been made in learning those symmetries directly from the dataset, most of this work has focused on the discriminative setting. In this paper, we take inspiration from grou…

    Submitted 20 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  2. arXiv:2306.02652  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards Anytime Classification in Early-Exit Architectures by Enforcing Conditional Monotonicity

    Authors: Metod Jazbec, James Urquhart Allingham, Dan Zhang, Eric Nalisnick

    Abstract: Modern predictive models are often deployed to environments in which computational budgets are dynamic. Anytime algorithms are well-suited to such environments as, at any point during computation, they can output a prediction whose quality is a function of computation time. Early-exit neural networks have garnered attention in the context of anytime computation due to their capability to provide i…

    Submitted 29 October, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  3. arXiv:2302.06235  [pdf, other]

    cs.LG cs.CV stat.ML

    A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

    Authors: James Urquhart Allingham, Jie Ren, Michael W Dusenberry, Xiuye Gu, Yin Cui, Dustin Tran, Jeremiah Zhe Liu, Balaji Lakshminarayanan

    Abstract: Contrastively trained text-image models have the remarkable ability to perform zero-shot classification, that is, classifying previously unseen images into categories that the model has never been explicitly trained to identify. However, these zero-shot classifiers need prompt engineering to achieve high accuracy. Prompt engineering typically requires hand-crafting a set of prompts for individual…

    Submitted 15 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023. 23 pages, 10 tables, 3 figures

  4. arXiv:2206.08900  [pdf, other]

    stat.ML cs.AI cs.LG

    Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

    Authors: Javier Antorán, David Janz, James Urquhart Allingham, Erik Daxberger, Riccardo Barbano, Eric Nalisnick, José Miguel Hernández-Lobato

    Abstract: The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selecti…

    Submitted 8 December, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Paper appearing at ICML 2022

  5. arXiv:2112.06926  [pdf, other]

    cs.LG stat.ML

    Addressing Bias in Active Learning with Depth Uncertainty Networks... or Not

    Authors: Chelsea Murray, James U. Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: Farquhar et al. [2021] show that correcting for active learning bias with underparameterised models leads to improved downstream performance. For overparameterised models such as NNs, however, correction leads either to decreased or unchanged performance. They suggest that this is due to an "overfitting bias" which offsets the active learning bias. We show that depth uncertainty networks operate i…

    Submitted 13 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2112.06796

  6. arXiv:2112.06796  [pdf, other]

    cs.LG stat.ML

    Depth Uncertainty Networks for Active Learning

    Authors: Chelsea Murray, James U. Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: In active learning, the size and complexity of the training dataset changes over time. Simple models that are well specified by the amount of data available at the start of active learning might suffer from bias as more points are actively sampled. Flexible models that might be well suited to the full dataset can suffer from overfitting towards the start of active learning. We tackle this problem…

    Submitted 4 May, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

  7. arXiv:2110.03360  [pdf, other]

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio…

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  8. arXiv:2010.14689  [pdf, other]

    cs.LG stat.ML

    Bayesian Deep Learning via Subnetwork Inference

    Authors: Erik Daxberger, Eric Nalisnick, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights ar…

    Submitted 14 March, 2022; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: ICML 2021; 22 pages, extended version with supplementary material

  9. arXiv:2006.08437  [pdf, other]

    stat.ML cs.LG

    Depth Uncertainty in Neural Networks

    Authors: Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato

    Abstract: Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes, making them unsuitable for applications where computational resources are limited. To solve this, we perform probabilistic reasoning over the depth of neural networks. Different depths correspond to subnetworks which share weights and whose predictions are combined via marginalisation, yielding mo…

    Submitted 7 December, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Published at NeurIPS 2020

  10. arXiv:2002.02797  [pdf, other]

    stat.ML cs.LG

    Variational Depth Search in ResNets

    Authors: Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato

    Abstract: One-shot neural architecture search allows joint learning of weights and network architecture, reducing computational cost. We limit our search space to the depth of residual networks and formulate an analytically tractable variational objective that allows for obtaining an unbiased approximate posterior over depths in one-shot. We propose a heuristic to prune our networks based on this distributi…

    Submitted 1 April, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Appearing at the 1st ICLR workshop on Neural Architecture Search 2020