Showing 1–16 of 16 results for author: Antorán, J

  1. arXiv:2405.18457

    cs.LG stat.ML

    Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes

    Authors: Jihao Andreas Lin, Shreyas Padhy, Bruno Mlodozeniec, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: Scaling hyperparameter optimisation to very large datasets remains an open problem in the Gaussian process community. This paper focuses on iterative methods, which use linear system solvers, like conjugate gradients, alternating projections or stochastic gradient descent, to construct an estimate of the marginal likelihood gradient. We discuss three key improvements which are applicable across so…

    Submitted 6 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Preprint. arXiv admin note: text overlap with arXiv:2405.18328

  2. arXiv:2404.19157

    stat.ML cs.LG

    Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks

    Authors: Javier Antoran

    Abstract: Large neural networks trained on large datasets have become the dominant paradigm in machine learning. These systems rely on maximum likelihood point estimates of their parameters, precluding them from expressing model uncertainty. This may result in overconfident predictions and it prevents the use of deep learning models for sequential decision making. This thesis develops scalable methods to eq…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: PhD Thesis, University of Cambridge

  3. arXiv:2310.20581

    cs.LG stat.ML

    Stochastic Gradient Descent for Gaussian Processes Done Right

    Authors: Jihao Andreas Lin, Shreyas Padhy, Javier Antorán, Austin Tripp, Alexander Terenin, Csaba Szepesvári, José Miguel Hernández-Lobato, David Janz

    Abstract: As is well known, both sampling from the posterior and computing the mean of the posterior in Gaussian process regression reduces to solving a large linear system of equations. We study the use of stochastic gradient descent for solving this linear system, and show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- stochastic gra…

    Submitted 28 April, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

  4. arXiv:2307.06093

    cs.LG stat.ML

    Online Laplace Model Selection Revisited

    Authors: Jihao Andreas Lin, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: The Laplace approximation provides a closed-form model selection objective for neural networks (NN). Online variants, which optimise NN parameters jointly with hyperparameters, like weight decay strength, have seen renewed interest in the Bayesian deep learning community. However, these methods violate Laplace's method's critical assumption that the approximation is performed around a mode of the…

    Submitted 9 January, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: Advances in Approximate Bayesian Inference 2023

  5. arXiv:2306.11589

    cs.LG stat.ML

    Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent

    Authors: Jihao Andreas Lin, Javier Antorán, Shreyas Padhy, David Janz, José Miguel Hernández-Lobato, Alexander Terenin

    Abstract: Gaussian processes are a powerful framework for quantifying uncertainty and for sequential decision-making but are limited by the requirement of solving linear systems. In general, this has a cubic cost in dataset size and is sensitive to conditioning. We explore stochastic gradient algorithms as a computationally efficient method of approximately solving these linear systems: we develop low-varia…

    Submitted 15 January, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

    Journal ref: Advances in Neural Information Processing Systems, 2023

  6. arXiv:2210.04994

    stat.ML cs.AI cs.LG

    Sampling-based inference for large linear models, with application to linearised Laplace

    Authors: Javier Antorán, Shreyas Padhy, Riccardo Barbano, Eric Nalisnick, David Janz, José Miguel Hernández-Lobato

    Abstract: Large-scale linear models are ubiquitous throughout machine learning, with contemporary application as surrogate models for neural network uncertainty quantification; that is, the linearised Laplace method. Alas, the computational cost associated with Bayesian linear models constrains this method's application to small networks, small output spaces and small datasets. We address this limitation by…

    Submitted 16 March, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Published at ICLR 2023. This latest Arxiv version is extended with a demonstration of the proposed methods on the Imagenet dataset

  7. arXiv:2206.08900

    stat.ML cs.AI cs.LG

    Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

    Authors: Javier Antorán, David Janz, James Urquhart Allingham, Erik Daxberger, Riccardo Barbano, Eric Nalisnick, José Miguel Hernández-Lobato

    Abstract: The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selecti…

    Submitted 8 December, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Paper appearing at ICML 2022

  8. arXiv:2203.00479

    eess.IV cs.LG stat.ML

    Uncertainty Estimation for Computed Tomography with a Linearised Deep Image Prior

    Authors: Javier Antorán, Riccardo Barbano, Johannes Leuschner, José Miguel Hernández-Lobato, Bangti Jin

    Abstract: Existing deep-learning based tomographic image reconstruction methods do not provide accurate estimates of reconstruction uncertainty, hindering their real-world deployment. This paper develops a method, termed as the linearised deep image prior (DIP), to estimate the uncertainty associated with reconstructions produced by the DIP with total variation regularisation (TV). Specifically, we endow th…

    Submitted 4 November, 2022; v1 submitted 28 February, 2022; originally announced March 2022.

  9. arXiv:2202.02195

    stat.ML cs.LG

    Deep End-to-end Causal Inference

    Authors: Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang

    Abstract: Causal inference is essential for data-driven decision making across domains such as business engagement, medical treatment and policy making. However, research on causal discovery has evolved separately from inference methods, preventing straight-forward combination of methods from both fields. In this work, we develop Deep End-to-end Causal Inference (DECI), a single flow-based non-linear additi…

    Submitted 20 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

  10. arXiv:2112.06926

    cs.LG stat.ML

    Addressing Bias in Active Learning with Depth Uncertainty Networks... or Not

    Authors: Chelsea Murray, James U. Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: Farquhar et al. [2021] show that correcting for active learning bias with underparameterised models leads to improved downstream performance. For overparameterised models such as NNs, however, correction leads either to decreased or unchanged performance. They suggest that this is due to an "overfitting bias" which offsets the active learning bias. We show that depth uncertainty networks operate i…

    Submitted 13 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2112.06796

  11. arXiv:2112.06796

    cs.LG stat.ML

    Depth Uncertainty Networks for Active Learning

    Authors: Chelsea Murray, James U. Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: In active learning, the size and complexity of the training dataset changes over time. Simple models that are well specified by the amount of data available at the start of active learning might suffer from bias as more points are actively sampled. Flexible models that might be well suited to the full dataset can suffer from overfitting towards the start of active learning. We tackle this problem…

    Submitted 4 May, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

  12. arXiv:2010.14689

    cs.LG stat.ML

    Bayesian Deep Learning via Subnetwork Inference

    Authors: Erik Daxberger, Eric Nalisnick, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato

    Abstract: The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights ar…

    Submitted 14 March, 2022; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: ICML 2021; 22 pages, extended version with supplementary material

  13. arXiv:2006.08437

    stat.ML cs.LG

    Depth Uncertainty in Neural Networks

    Authors: Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato

    Abstract: Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes, making them unsuitable for applications where computational resources are limited. To solve this, we perform probabilistic reasoning over the depth of neural networks. Different depths correspond to subnetworks which share weights and whose predictions are combined via marginalisation, yielding mo…

    Submitted 7 December, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Published at NeurIPS 2020

  14. arXiv:2006.06848

    stat.ML cs.LG

    Getting a CLUE: A Method for Explaining Uncertainty Estimates

    Authors: Javier Antorán, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández-Lobato

    Abstract: Both uncertainty estimation and interpretability are important factors for trustworthy machine learning systems. However, there is little work at the intersection of these two areas. We address this gap by proposing a novel method for interpreting uncertainty estimates from differentiable probabilistic models, like Bayesian Neural Networks (BNNs). Our method, Counterfactual Latent Uncertainty Expl…

    Submitted 18 March, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Accepted as an oral presentation at ICLR 2021

  15. arXiv:2002.02797

    stat.ML cs.LG

    Variational Depth Search in ResNets

    Authors: Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato

    Abstract: One-shot neural architecture search allows joint learning of weights and network architecture, reducing computational cost. We limit our search space to the depth of residual networks and formulate an analytically tractable variational objective that allows for obtaining an unbiased approximate posterior over depths in one-shot. We propose a heuristic to prune our networks based on this distributi…

    Submitted 1 April, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Appearing at the 1st ICLR workshop on Neural Architecture Search 2020

  16. Disentangling and Learning Robust Representations with Natural Clustering

    Authors: Javier Antoran, Antonio Miguel

    Abstract: Learning representations that disentangle the underlying factors of variability in data is an intuitive way to achieve generalization in deep models. In this work, we address the scenario where generative factors present a multimodal distribution due to the existence of class distinction in the data. We propose N-VAE, a model which is capable of separating factors of variation which are exclusive…

    Submitted 5 November, 2019; v1 submitted 27 January, 2019; originally announced January 2019.

    Comments: Accepted at ICMLA 2019