Showing 1–11 of 11 results for author: Kirsch, A

  1. arXiv:2304.08151  [pdf, other]

    cs.LG stat.ML

    Prediction-Oriented Bayesian Active Learning

    Authors: Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth

    Abstract: Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score. We highlight that this can be suboptimal from the perspective of predictive performance. For example, BALD lacks a notion of an input distribution and so is prone to prioritise data of limited relevance. To add…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Published at AISTATS 2023

  2. arXiv:2302.08981  [pdf, other]

    cs.LG stat.ML

    Black-Box Batch Active Learning for Regression

    Authors: Andreas Kirsch

    Abstract: Batch active learning is a popular approach for efficiently training machine learning models on large, initially unlabelled datasets by repeatedly acquiring labels for batches of data points. However, many recent batch active learning methods are white-box approaches and are often limited to differentiable parametric models: they score unlabeled points using acquisition functions based on model em…

    Submitted 7 July, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: 12 pages + 11 pages appendix

  3. arXiv:2207.07411  [pdf, other]

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, et al. (1 additional author not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  4. arXiv:2205.08766  [pdf, other]

    cs.LG stat.ML

    Marginal and Joint Cross-Entropies & Predictives for Online Bayesian Inference, Active Learning, and Active Sampling

    Authors: Andreas Kirsch, Jannik Kossen, Yarin Gal

    Abstract: Principled Bayesian deep learning (BDL) does not live up to its potential when we only focus on marginal predictive distributions (marginal predictives). Recent works have highlighted the importance of joint predictives for (Bayesian) sequential decision making from a theoretical and synthetic perspective. We provide additional practical arguments grounded in real-world applications for focusing o…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: 10 pages + references

  5. arXiv:2111.02275  [pdf, other]

    cs.LG stat.ML

    Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data

    Authors: Andrew Jesson, Panagiotis Tigas, Joost van Amersfoort, Andreas Kirsch, Uri Shalit, Yarin Gal

    Abstract: Estimating personalized treatment effects from high-dimensional observational data is essential in situations where experimental designs are infeasible, unethical, or expensive. Existing approaches rely on fitting deep models on outcomes observed for treated and control populations. However, when measuring individual outcomes is costly, as is the case of a tumor biopsy, a sample-efficient strategy…

    Submitted 1 February, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

    Comments: 24 pages, 8 Figures, 5 tables, NeurIPS 2021

  6. arXiv:2106.12062  [pdf, other]

    cs.LG stat.ML

    A Practical & Unified Notation for Information-Theoretic Quantities in ML

    Authors: Andreas Kirsch, Yarin Gal

    Abstract: A practical notation can convey valuable intuitions and concisely express new ideas. Information theory is of importance to machine learning, but the notation for information-theoretic quantities is sometimes opaque. We propose a practical and unified notation and extend it to include information-theoretic quantities between observed outcomes (events) and random variables. This includes the point-…

    Submitted 2 December, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  7. arXiv:2106.12059  [pdf, other]

    cs.LG stat.ML

    Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning

    Authors: Andreas Kirsch, Sebastian Farquhar, Parmida Atighehchian, Andrew Jesson, Frederic Branchaud-Charron, Yarin Gal

    Abstract: We examine a simple stochastic strategy for adapting well-known single-point acquisition functions to allow batch active learning. Unlike acquiring the top-K points from the pool set, score- or rank-based sampling takes into account that acquisition scores change as new data are acquired. This simple strategy for adapting standard single-sample acquisition strategies can even perform just as well…

    Submitted 19 September, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: TMLR Paper: https://openreview.net/forum?id=vcHwQyNBjW

  8. arXiv:2106.11719  [pdf, other]

    cs.LG stat.ML

    Test Distribution-Aware Active Learning: A Principled Approach Against Distribution Shift and Outliers

    Authors: Andreas Kirsch, Tom Rainforth, Yarin Gal

    Abstract: Expanding on MacKay (1992), we argue that conventional model-based methods for active learning - like BALD - have a fundamental shortfall: they fail to directly account for the test-time distribution of the input variables. This can lead to pathologies in the acquisition strategy, as what is maximally informative for model parameters may not be maximally informative for prediction: for example, wh…

    Submitted 21 November, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  9. arXiv:2102.11582  [pdf, other]

    cs.LG stat.ML

    Deep Deterministic Uncertainty: A Simple Baseline

    Authors: Jishnu Mukhoti, Andreas Kirsch, Joost van Amersfoort, Philip H. S. Torr, Yarin Gal

    Abstract: Reliable uncertainty from deterministic single-forward pass models is sought after because conventional methods of uncertainty quantification are computationally expensive. We take two complex single-forward-pass uncertainty approaches, DUQ and SNGP, and examine whether they mainly rely on a well-regularized feature space. Crucially, without using their more complex methods for estimating uncertai…

    Submitted 28 January, 2022; v1 submitted 23 February, 2021; originally announced February 2021.

  10. arXiv:2003.12537  [pdf, other]

    cs.LG stat.ML

    Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning

    Authors: Andreas Kirsch, Clare Lyle, Yarin Gal

    Abstract: The Information Bottleneck principle offers both a mechanism to explain how deep neural networks train and generalize, as well as a regularized objective with which to train models. However, multiple competing objectives are proposed in the literature, and the information-theoretic quantities used in these objectives are difficult to compute for large deep neural networks, which in turn limits the…

    Submitted 5 January, 2021; v1 submitted 27 March, 2020; originally announced March 2020.

  11. arXiv:1906.08158  [pdf, other]

    cs.LG stat.ML

    BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning

    Authors: Andreas Kirsch, Joost van Amersfoort, Yarin Gal

    Abstract: We develop BatchBALD, a tractable approximation to the mutual information between a batch of points and model parameters, which we use as an acquisition function to select multiple informative points jointly for the task of deep Bayesian active learning. BatchBALD is a greedy linear-time $1 - \frac{1}{e}$-approximate algorithm amenable to dynamic programming and efficient caching. We compare Batch…

    Submitted 28 October, 2019; v1 submitted 19 June, 2019; originally announced June 2019.