Showing 1–50 of 162 results for author: Gal, Y

  1. arXiv:2406.15927  [pdf, other]

    cs.CL cs.AI cs.LG

    Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs

    Authors: Jannik Kossen, Jiatong Han, Muhammed Razzak, Lisa Schut, Shreshth Malik, Yarin Gal

    Abstract: We propose semantic entropy probes (SEPs), a cheap and reliable method for uncertainty quantification in Large Language Models (LLMs). Hallucinations, which are plausible-sounding but factually incorrect and arbitrary model generations, present a major challenge to the practical adoption of LLMs. Recent work by Farquhar et al. (2024) proposes semantic entropy (SE), which can detect hallucinations…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: First three authors contributed equally

  2. arXiv:2406.12011  [pdf, other]

    cs.LG cs.CY

    The Benefits and Risks of Transductive Approaches for AI Fairness

    Authors: Muhammed Razzak, Andreas Kirsch, Yarin Gal

    Abstract: Recently, transductive learning methods, which leverage holdout sets during training, have gained popularity for their potential to improve speed, accuracy, and fairness in machine learning models. Despite this, the composition of the holdout set itself, particularly the balance of sensitive sub-groups, has been largely overlooked. Our experiments on CIFAR and CelebA datasets show that composition…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.10023  [pdf, other]

    cs.LG cs.CL stat.ML

    Deep Bayesian Active Learning for Preference Modeling in Large Language Models

    Authors: Luckeciano C. Melo, Panagiotis Tigas, Alessandro Abate, Yarin Gal

    Abstract: Leveraging human preferences for steering the behavior of Large Language Models (LLMs) has demonstrated notable success in recent years. Nonetheless, data selection and labeling are still a bottleneck for these systems, particularly at large scale. Hence, selecting the most informative points for acquiring human feedback may considerably reduce the cost of preference labeling and unleash the furth…

    Submitted 14 June, 2024; originally announced June 2024.

  4. arXiv:2406.07457  [pdf, other]

    cs.LG stat.ML

    Estimating the Hallucination Rate of Generative AI

    Authors: Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P. Cunningham, David Blei

    Abstract: This work is about estimating the hallucination rate for in-context learning (ICL) with Generative AI. In ICL, a conditional generative model (CGM) is prompted with a dataset and asked to make a prediction based on that dataset. The Bayesian interpretation of ICL assumes that the CGM is calculating a posterior predictive distribution over an unknown Bayesian model of a latent parameter and data. W…

    Submitted 11 June, 2024; originally announced June 2024.

  5. arXiv:2406.03209  [pdf, other]

    cs.LG cs.AI

    Challenges and Considerations in the Evaluation of Bayesian Causal Discovery

    Authors: Amir Mohammad Karimi Mamaghan, Panagiotis Tigas, Karl Henrik Johansson, Yarin Gal, Yashas Annadani, Stefan Bauer

    Abstract: Representing uncertainty in causal discovery is a crucial component for experimental design, and more broadly, for safe and reliable causal decision making. Bayesian Causal Discovery (BCD) offers a principled approach to encapsulating this uncertainty. Unlike non-Bayesian causal discovery, which relies on a single estimated causal graph and model parameters for assessment, evaluating BCD presents…

    Submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2405.20003  [pdf, other]

    cs.LG cs.AI cs.CL

    Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities

    Authors: Alexander Nikitin, Jannik Kossen, Yarin Gal, Pekka Marttinen

    Abstract: Uncertainty quantification in Large Language Models (LLMs) is crucial for applications where safety and reliability are important. In particular, uncertainty can be used to improve the trustworthiness of LLMs by detecting factually incorrect model responses, commonly called hallucinations. Critically, one should seek to capture the model's semantic uncertainty, i.e., the uncertainty over the meani…

    Submitted 30 May, 2024; originally announced May 2024.

  7. arXiv:2405.05852  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO stat.ML

    Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

    Authors: Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

    Abstract: Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs. Such capabilities are difficult to learn solely from task-specific data. This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains. However, commonly used…

    Submitted 9 May, 2024; originally announced May 2024.

  8. arXiv:2404.03713  [pdf, other]

    cs.LG cs.AI cs.CV cs.HC

    Explaining Explainability: Understanding Concept Activation Vectors

    Authors: Angus Nicolson, Lisa Schut, J. Alison Noble, Yarin Gal

    Abstract: Recent interpretability methods propose using concept-based explanations to translate the internal representations of deep learning models into a language that humans are familiar with: concepts. This requires understanding which concepts are present in the representation space of a neural network. One popular method for finding concepts is Concept Activation Vectors (CAVs), which are learnt using…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: (54 pages, 39 figures)

    ACM Class: I.2.6

  9. arXiv:2403.05534  [pdf, other]

    cs.CL

    Bayesian Preference Elicitation with Language Models

    Authors: Kunal Handa, Yarin Gal, Ellie Pavlick, Noah Goodman, Jacob Andreas, Alex Tamkin, Belinda Z. Li

    Abstract: Aligning AI systems to users' interests requires understanding and incorporating humans' complex values and preferences. Recently, language models (LMs) have been used to gather information about the preferences of human users. This preference data can be used to fine-tune or guide other LMs and/or AI systems. However, LMs have been shown to struggle with crucial aspects of preference learning: qu…

    Submitted 8 March, 2024; originally announced March 2024.

  10. arXiv:2312.17210  [pdf, other]

    stat.ML cs.AI cs.LG

    Continual Learning via Sequential Function-Space Variational Inference

    Authors: Tim G. J. Rudner, Freddie Bickford Smith, Qixuan Feng, Yee Whye Teh, Yarin Gal

    Abstract: Sequential Bayesian inference over predictive functions is a natural framework for continual learning from streams of data. However, applying it to neural networks has proved challenging in practice. Addressing the drawbacks of existing techniques, we propose an optimization objective derived by formulating continual learning as sequential function-space variational inference. In contrast to exist…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

  11. arXiv:2312.17199  [pdf, other]

    stat.ML cs.AI cs.LG

    Tractable Function-Space Variational Inference in Bayesian Neural Networks

    Authors: Tim G. J. Rudner, Zonghao Chen, Yee Whye Teh, Yarin Gal

    Abstract: Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  12. arXiv:2312.17168  [pdf, other]

    cs.LG cs.AI

    Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?

    Authors: Gunshi Gupta, Tim G. J. Rudner, Rowan Thomas McAllister, Adrien Gaidon, Yarin Gal

    Abstract: Causal confusion is a phenomenon where an agent learns a policy that reflects imperfect spurious correlations in the data. Such a policy may falsely appear to be optimal during training if most of the training data contain such spurious correlations. This phenomenon is particularly pronounced in domains such as robotics, with potentially large gaps between the open- and closed-loop performance of…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 2nd Conference on Causal Learning and Reasoning (CLeaR 2021)

  13. arXiv:2312.13419  [pdf, other]

    cond-mat.stat-mech quant-ph

    Entanglement Dynamics in Monitored Systems and the Role of Quantum Jumps

    Authors: Youenn Le Gal, Xhek Turkeshi, Marco Schirò

    Abstract: Monitored quantum many-body systems display a rich pattern of entanglement dynamics, which is unique to this non-unitary setting. This work studies the effect of quantum jumps on the entanglement dynamics beyond the no-click limit corresponding to a deterministic non-Hermitian evolution. We consider two examples, a monitored SSH model and a quantum Ising chain, for which we show the jumps have rem…

    Submitted 27 June, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: 12 + 5 pages, 7 + 5 figures. Major overhaul of the text

  14. arXiv:2312.04064  [pdf, other]

    q-bio.QM cs.LG stat.ME

    DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design

    Authors: Clare Lyle, Arash Mehrjou, Pascal Notin, Andrew Jesson, Stefan Bauer, Yarin Gal, Patrick Schwab

    Abstract: The discovery of therapeutics to treat genetically-driven pathologies relies on identifying genes involved in the underlying disease mechanisms. Existing approaches search over the billions of potential interventions to maximize the expected influence on the target phenotype. However, to reduce the risk of failure in future stages of trials, practical experiment design aims to find a set of interv…

    Submitted 7 December, 2023; originally announced December 2023.

    Journal ref: International Conference on Machine Learning, 2023

  15. arXiv:2311.01009  [pdf, other]

    cs.CV cs.AI

    Revamping AI Models in Dermatology: Overcoming Critical Challenges for Enhanced Skin Lesion Diagnosis

    Authors: Deval Mehta, Brigid Betz-Stablein, Toan D Nguyen, Yaniv Gal, Adrian Bowling, Martin Haskett, Maithili Sashindranath, Paul Bonnington, Victoria Mar, H Peter Soyer, Zongyuan Ge

    Abstract: The surge in developing deep learning models for diagnosing skin lesions through image analysis is notable, yet their clinical black faces challenges. Current dermatology AI models have limitations: limited number of possible diagnostic outputs, lack of real-world testing on uncommon skin lesions, inability to detect out-of-distribution images, and over-reliance on dermoscopic images. To address t…

    Submitted 2 November, 2023; originally announced November 2023.

  16. arXiv:2311.00444  [pdf, other]

    cs.LG

    Form follows Function: Text-to-Text Conditional Graph Generation based on Functional Requirements

    Authors: Peter A. Zachares, Vahan Hovhannisyan, Alan Mosca, Yarin Gal

    Abstract: This work focuses on the novel problem setting of generating graphs conditioned on a description of the graph's functional requirements in a downstream task. We pose the problem as a text-to-text generation problem and focus on the approach of fine-tuning a pretrained large language model (LLM) to generate graphs. We propose an inductive bias which incorporates information about the structure of t…

    Submitted 1 November, 2023; originally announced November 2023.

  17. arXiv:2309.15840  [pdf, other]

    cs.CL cs.AI cs.LG

    How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

    Authors: Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner

    Abstract: Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by asking a…

    Submitted 26 September, 2023; originally announced September 2023.

  18. arXiv:2308.13320  [pdf, other]

    cs.LG cs.CV

    Fine-tuning can cripple your foundation model; preserving features may be the solution

    Authors: Jishnu Mukhoti, Yarin Gal, Philip H. S. Torr, Puneet K. Dokania

    Abstract: Pre-trained foundation models, due to their enormous capacity and exposure to vast amounts of data during pre-training, are known to have learned plenty of real-world concepts. An important step in making these pre-trained models effective on downstream tasks is to fine-tune them on related datasets. While various fine-tuning methods have been devised and have been shown to be highly effective, we…

    Submitted 1 July, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Published in TMLR: https://openreview.net/forum?id=kfhoeZCeW7

  19. arXiv:2307.12375  [pdf, other]

    cs.CL cs.AI cs.LG

    In-Context Learning Learns Label Relationships but Is Not Conventional Learning

    Authors: Jannik Kossen, Yarin Gal, Tom Rainforth

    Abstract: The predictions of Large Language Models (LLMs) on downstream tasks often improve significantly when including examples of the input--label relationship in the context. However, there is currently no consensus about how this in-context learning (ICL) ability of LLMs works. For example, while Xie et al. (2021) liken ICL to a general-purpose learning algorithm, Min et al. (2022) argue ICL does not e…

    Submitted 13 March, 2024; v1 submitted 23 July, 2023; originally announced July 2023.

    Comments: Accepted for publication at ICLR 2024

  20. arXiv:2307.10719  [pdf, other]

    cs.AI cs.CL cs.CR cs.LG

    LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?

    Authors: David Glukhov, Ilia Shumailov, Yarin Gal, Nicolas Papernot, Vardan Papyan

    Abstract: Large language models (LLMs) have exhibited impressive capabilities in comprehending complex instructions. However, their blind adherence to provided instructions has led to concerns regarding risks of malicious use. Existing defence mechanisms, such as model fine-tuning or output censorship using LLMs, have proven to be fallible, as LLMs can still generate problematic responses. Commonly employed…

    Submitted 20 July, 2023; originally announced July 2023.

  21. arXiv:2306.15058  [pdf, other]

    cs.LG stat.ML

    BatchGFN: Generative Flow Networks for Batch Active Learning

    Authors: Shreshth A. Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal

    Abstract: We introduce BatchGFN -- a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward. With an appropriate reward function to quantify the utility of acquiring a batch, such as the joint mutual information between the batch and the model parameters, BatchGFN is able to construct highly informative batches for active…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted at the Structured Probabilistic Inference & Generative Modeling workshop, ICML 2023

  22. arXiv:2306.01460  [pdf, other]

    cs.LG

    ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages

    Authors: Andrew Jesson, Chris Lu, Gunshi Gupta, Angelos Filos, Jakob Nicolaus Foerster, Yarin Gal

    Abstract: This paper introduces an effective and practical step toward approximate Bayesian inference in on-policy actor-critic deep reinforcement learning. This step manifests as three simple modifications to the Asynchronous Advantage Actor-Critic (A3C) algorithm: (1) applying a ReLU function to advantage estimates, (2) spectral normalization of actor-critic weights, and (3) incorporating dropout as a Bay…

    Submitted 24 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  23. arXiv:2305.17493  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR cs.CV

    The Curse of Recursion: Training on Generated Data Makes Models Forget

    Authors: Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson

    Abstract: Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such language models to the general public. It is now clear that large language models (LLMs) are here to stay, and will bring about drastic change in the whole ecosystem of online text and images. In this paper…

    Submitted 14 April, 2024; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: Fixed typos in eqn 4,5

  24. arXiv:2304.08151  [pdf, other]

    cs.LG stat.ML

    Prediction-Oriented Bayesian Active Learning

    Authors: Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth

    Abstract: Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score. We highlight that this can be suboptimal from the perspective of predictive performance. For example, BALD lacks a notion of an input distribution and so is prone to prioritise data of limited relevance. To add…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Published at AISTATS 2023

  25. arXiv:2304.03609  [pdf, other]

    cs.CL cs.LG

    Revisiting Automated Prompting: Are We Actually Doing Better?

    Authors: Yulin Zhou, Yiren Zhao, Ilia Shumailov, Robert Mullins, Yarin Gal

    Abstract: Current literature demonstrates that Large Language Models (LLMs) are great few-shot learners, and prompting significantly increases their performance on a range of downstream tasks in a few-shot learning setting. An attempt to automate human-led prompting followed, with some progress achieved. In particular, subsequent work demonstrates automation can outperform fine-tuning in certain K-shot lear…

    Submitted 22 June, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

  26. arXiv:2302.10607  [pdf, other]

    cs.LG cs.AI stat.ME

    Differentiable Multi-Target Causal Bayesian Experimental Design

    Authors: Yashas Annadani, Panagiotis Tigas, Desi R. Ivanova, Andrew Jesson, Yarin Gal, Adam Foster, Stefan Bauer

    Abstract: We introduce a gradient-based approach for the problem of Bayesian optimal experimental design to learn causal models in a batch setting -- a critical component for causal discovery from finite data where interventions can be costly or risky. Existing methods rely on greedy approximations to construct a batch of experiments while using black-box methods to optimize over a single target-state pair…

    Submitted 2 June, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Camera-ready version ICML 2023

  27. arXiv:2302.09664  [pdf, other]

    cs.CL cs.AI cs.LG

    Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation

    Authors: Lorenz Kuhn, Yarin Gal, Sebastian Farquhar

    Abstract: We introduce a method to measure uncertainty in large language models. For tasks like question answering, it is essential to know when we can trust the natural language outputs of foundation models. We show that measuring uncertainty in natural language is challenging because of "semantic equivalence" -- different sentences can mean the same thing. To overcome these challenges we introduce semanti…

    Submitted 15 April, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: International Conference on Learning Representations 2023 (Spotlight)

  28. arXiv:2301.11921  [pdf, other]

    physics.data-an cs.LG physics.ao-ph

    Using uncertainty-aware machine learning models to study aerosol-cloud interactions

    Authors: Maëlys Solal, Andrew Jesson, Yarin Gal, Alyson Douglas

    Abstract: Aerosol-cloud interactions (ACI) include various effects that result from aerosols entering a cloud, and affecting cloud properties. In general, an increase in aerosol concentration results in smaller droplet sizes which leads to larger, brighter, longer-lasting clouds that reflect more sunlight and cool the Earth. The strength of the effect is however heterogeneous, meaning it depends on the surr…

    Submitted 30 November, 2022; originally announced January 2023.

  29. arXiv:2212.13936  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

    Authors: Tim G. J. Rudner, Cong Lu, Michael A. Osborne, Yarin Gal, Yee Whye Teh

    Abstract: KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological traini…

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  30. arXiv:2212.07769  [pdf, other]

    cs.CL cs.AI cs.LG

    CLAM: Selective Clarification for Ambiguous Questions with Generative Language Models

    Authors: Lorenz Kuhn, Yarin Gal, Sebastian Farquhar

    Abstract: Users often ask dialogue systems ambiguous questions that require clarification. We show that current language models rarely ask users to clarify ambiguous questions and instead provide incorrect answers. To address this, we introduce CLAM: a framework for getting language models to selectively ask for clarification about ambiguous user questions. In particular, we show that we can prompt language…

    Submitted 20 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  31. arXiv:2211.12717  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks

    Authors: Neil Band, Tim G. J. Rudner, Qixuan Feng, Angelos Filos, Zachary Nado, Michael W. Dusenberry, Ghassen Jerfel, Dustin Tran, Yarin Gal

    Abstract: Bayesian deep learning seeks to equip deep neural networks with the ability to precisely quantify their predictive uncertainty, and has promised to make deep learning more reliable for safety-critical real-world applications. Yet, existing Bayesian deep learning methods fall short of this promise; new methods continue to be evaluated on unrealistic test beds that do not reflect the complexities of…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: Published in Neural Information Processing Systems (NeurIPS) 2021 Datasets and Benchmarks Track Proceedings. First two authors contributed equally. Code available at https://rebrand.ly/retina-benchmark

  32. arXiv:2211.06903  [pdf, other]

    astro-ph.EP astro-ph.IM cs.LG

    Discovering Long-period Exoplanets using Deep Learning with Citizen Science Labels

    Authors: Shreshth A. Malik, Nora L. Eisner, Chris J. Lintott, Yarin Gal

    Abstract: Automated planetary transit detection has become vital to prioritize candidates for expert analysis given the scale of modern telescopic surveys. While current methods for short-period exoplanet detection work effectively due to periodicity in the light curves, there lacks a robust approach for detecting single-transit events. However, volunteer-labelled transits recently collected by the Planet H…

    Submitted 13 November, 2022; originally announced November 2022.

    Comments: Accepted at the Machine Learning and the Physical Sciences workshop, NeurIPS 2022

  33. arXiv:2210.11937  [pdf, other]

    cond-mat.stat-mech cond-mat.str-el quant-ph

    Volume-to-Area Law Entanglement Transition in a non-Hermitian Free Fermionic Chain

    Authors: Youenn Le Gal, Xhek Turkeshi, Marco Schirò

    Abstract: We consider the dynamics of the non-Hermitian Su-Schrieffer-Heeger model arising as the no-click limit of a continuously monitored free fermion chain where particles and holes are measured on two sublattices. The model has $\mathcal{PT}$-symmetry, which we show to spontaneously break as a function of the strength of measurement backaction, resulting in a spectral transition where quasiparticles ac…

    Submitted 22 February, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: version 3, 4 figures, footnote

    Journal ref: SciPost Phys. 14, 138 (2023)

  34. arXiv:2209.13569  [pdf, other]

    cs.LG stat.ML

    Exploring Low Rank Training of Deep Neural Networks

    Authors: Siddhartha Rao Kamalakara, Acyr Locatelli, Bharat Venkitesh, Jimmy Ba, Yarin Gal, Aidan N. Gomez

    Abstract: Training deep neural networks in low rank, i.e. with factorised layers, is of particular interest to the community: it offers efficiency over unfactorised training in terms of both memory consumption and training time. Prior work has focused on low rank approximations of pre-trained networks and training in low rank space with additional objectives, offering various ad hoc explanations for chosen…

    Submitted 27 September, 2022; originally announced September 2022.

  35. arXiv:2209.05842  [pdf, other]

    cs.CV

    Skin Lesion Recognition with Class-Hierarchy Regularized Hyperbolic Embeddings

    Authors: Zhen Yu, Toan Nguyen, Yaniv Gal, Lie Ju, Shekhar S. Chandra, Lei Zhang, Paul Bonnington, Victoria Mar, Zhiyong Wang, Zongyuan Ge

    Abstract: In practice, many medical datasets have an underlying taxonomy defined over the disease label space. However, existing classification algorithms for medical diagnoses often assume semantically independent labels. In this study, we aim to leverage class hierarchy with deep learning algorithms for more accurate and reliable skin lesion recognition. We propose a hyperbolic network to learn image embe…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: in The 25th International Conference on Medical Image Computing and Computer Assisted Intervention

  36. arXiv:2208.09512  [pdf, other]

    astro-ph.SR astro-ph.IM cs.CV cs.LG

    Exploring the Limits of Synthetic Creation of Solar EUV Images via Image-to-Image Translation

    Authors: Valentina Salvatelli, Luiz F. G. dos Santos, Souvik Bose, Brad Neuberg, Mark C. M. Cheung, Miho Janvier, Meng Jin, Yarin Gal, Atilim Gunes Baydin

    Abstract: The Solar Dynamics Observatory (SDO), a NASA multi-spectral decade-long mission that has been daily producing terabytes of observational data from the Sun, has been recently used as a use-case to demonstrate the potential of machine learning methodologies and to pave the way for future deep-space mission planning. In particular, the idea of using image-to-image translation to virtually produce ext…

    Submitted 19 August, 2022; originally announced August 2022.

    Comments: 16 pages, 8 figures. To be published on ApJ (submitted on Feb 21st, accepted on July 28th)

    Journal ref: ApJ 937 (2022) 100

  37. arXiv:2208.00549  [pdf, other]

    cs.LG cs.AI cs.IR cs.IT

    Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities

    Authors: Andreas Kirsch, Yarin Gal

    Abstract: Recently proposed methods in data subset selection, that is active learning and active sampling, use Fisher information, Hessians, similarity matrices based on gradients, and gradient lengths to estimate how informative data is for a model's training. Are these different approaches connected, and if so, how? We revisit the fundamentals of Bayesian optimal experiment design and show that these rece…

    Submitted 6 November, 2022; v1 submitted 31 July, 2022; originally announced August 2022.

    Comments: 18.5 pages main paper, 31 pages total

  38. arXiv:2207.07411  [pdf, other]

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, et al. (1 additional author not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  39. arXiv:2206.15186  [pdf, other]

    cs.CV cs.AI cs.LG

    Out-of-Distribution Detection for Long-tailed and Fine-grained Skin Lesion Images

    Authors: Deval Mehta, Yaniv Gal, Adrian Bowling, Paul Bonnington, Zongyuan Ge

    Abstract: Recent years have witnessed a rapid development of automated methods for skin lesion diagnosis and classification. Due to an increasing deployment of such systems in clinics, it has become important to develop a more robust system towards various Out-of-Distribution (OOD) samples (unknown skin lesions and conditions). However, the current deep learning models trained for skin lesion classification…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted to MICCAI 2022 (top 13% paper; early accept)

  40. arXiv:2206.07137  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt

    Authors: Sören Mindermann, Jan Brauner, Muhammed Razzak, Mrinank Sharma, Andreas Kirsch, Winnie Xu, Benedikt Höltgen, Aidan N. Gomez, Adrien Morisot, Sebastian Farquhar, Yarin Gal

    Abstract: Training on web-scale data can take months. But most computation and time is wasted on redundant and noisy points that are already learnt or not learnable. To accelerate training, we introduce Reducible Holdout Loss Selection (RHO-LOSS), a simple but principled technique which selects approximately those points for training that most reduce the model's generalization loss. As a result, RHO-LOSS mi…

    Submitted 26 September, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  41. arXiv:2206.02126  [pdf, other]

    cs.LG

    Learning Dynamics and Generalization in Reinforcement Learning

    Authors: Clare Lyle, Mark Rowland, Will Dabney, Marta Kwiatkowska, Yarin Gal

    Abstract: Solving a reinforcement learning (RL) problem poses two competing challenges: fitting a potentially discontinuous value function, and generalizing well to new observations. In this paper, we analyze the learning dynamics of temporal difference algorithms to gain novel insight into the tension between these two objectives. We show theoretically that temporal difference learning encourages agents to…

    Submitted 5 June, 2022; originally announced June 2022.

  42. arXiv:2205.13760  [pdf, other]

    cs.LG

    Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

    Authors: Pascal Notin, Mafalda Dias, Jonathan Frazer, Javier Marchena-Hurtado, Aidan Gomez, Debora S. Marks, Yarin Gal

    Abstract: The ability to accurately model the fitness landscape of protein sequences is critical to a wide range of applications, from quantifying the effects of human variants on disease likelihood, to predicting immune-escape mutations in viruses and designing novel biotherapeutic proteins. Deep generative models of protein sequences trained on multiple sequence alignments have been the most successful ap…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: ICML 2022

  43. arXiv:2205.12734  [pdf, other]

    physics.space-ph astro-ph.IM astro-ph.SR cs.LG

    Global geomagnetic perturbation forecasting using Deep Learning

    Authors: Vishal Upendran, Panagiotis Tigas, Banafsheh Ferdousi, Teo Bloch, Mark C. M. Cheung, Siddha Ganju, Asti Bhatt, Ryan M. McGranaghan, Yarin Gal

    Abstract: Geomagnetically Induced Currents (GICs) arise from spatio-temporal changes to Earth's magnetic field which arise from the interaction of the solar wind with Earth's magnetosphere, and drive catastrophic destruction to our technologically dependent society. Hence, computational models to forecast GICs globally with large forecast horizon, high spatial resolution and temporal cadence are of increasi…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: 23 pages, 8 figures, 5 tables; accepted for publication in AGU: Spaceweather

  44. arXiv:2205.08766  [pdf, other]

    cs.LG stat.ML

    Marginal and Joint Cross-Entropies & Predictives for Online Bayesian Inference, Active Learning, and Active Sampling

    Authors: Andreas Kirsch, Jannik Kossen, Yarin Gal

    Abstract: Principled Bayesian deep learning (BDL) does not live up to its potential when we only focus on marginal predictive distributions (marginal predictives). Recent works have highlighted the importance of joint predictives for (Bayesian) sequential decision making from a theoretical and synthetic perspective. We provide additional practical arguments grounded in real-world applications for focusing o…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: 10 pages + references

  45. arXiv:2204.10022  [pdf, other]

    cs.LG stat.ML

    Scalable Sensitivity and Uncertainty Analysis for Causal-Effect Estimates of Continuous-Valued Interventions

    Authors: Andrew Jesson, Alyson Douglas, Peter Manshausen, Maëlys Solal, Nicolai Meinshausen, Philip Stier, Yarin Gal, Uri Shalit

    Abstract: Estimating the effects of continuous-valued interventions from observational data is a critically important task for climate science, healthcare, and economics. Recent work focuses on designing neural network architectures and regularization functions to allow for scalable estimation of average and individual-level dose-response curves from high-dimensional, large-sample data. Such methodologies a…

    Submitted 12 October, 2022; v1 submitted 21 April, 2022; originally announced April 2022.

    Comments: 33 pages

  46. arXiv:2203.02016  [pdf, other]

    cs.LG cs.AI stat.ML

    Interventions, Where and How? Experimental Design for Causal Models at Scale

    Authors: Panagiotis Tigas, Yashas Annadani, Andrew Jesson, Bernhard Schölkopf, Yarin Gal, Stefan Bauer

    Abstract: Causal discovery from observational and interventional data is challenging due to limited data and non-identifiability: factors that introduce uncertainty in estimating the underlying structural causal model (SCM). Selecting experiments (interventions) based on the uncertainty arising from both factors can expedite the identification of the SCM. Existing methods in experimental design for causal d…

    Submitted 21 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Presented at the thirty-sixth Conference on Neural Information Processing Systems (2022)

  47. arXiv:2202.08132  [pdf, other]

    cs.LG

    Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

    Authors: Milad Alizadeh, Shyam A. Tailor, Luisa M Zintgraf, Joost van Amersfoort, Sebastian Farquhar, Nicholas Donald Lane, Yarin Gal

    Abstract: Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods are insufficient to enable this optimization and lead to a large degradation in model performance. In this paper, we identify a fundamental limitation in the formulation of…

    Submitted 5 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

  48. arXiv:2202.06881  [pdf, other]

    cs.LG stat.ML

    Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation

    Authors: Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

    Abstract: We propose Active Surrogate Estimators (ASEs), a new method for label-efficient model evaluation. Evaluating model performance is a challenging and important problem when labels are expensive. ASEs address this active testing problem using a surrogate-based estimation approach that interpolates the errors of points with unknown labels, rather than forming a Monte Carlo estimator. ASEs actively lea…

    Submitted 18 October, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: Accepted for publication at NeurIPS 2022

  49. arXiv:2202.05097  [pdf, other]

    astro-ph.IM physics.ins-det

    EXCESS workshop: Descriptions of rising low-energy spectra

    Authors: P. Adari, A. Aguilar-Arevalo, D. Amidei, G. Angloher, E. Armengaud, C. Augier, L. Balogh, S. Banik, D. Baxter, C. Beaufort, G. Beaulieu, V. Belov, Y. Ben Gal, G. Benato, A. Benoît, A. Bento, L. Bergé, A. Bertolini, R. Bhattacharyya, J. Billard, I. M. Bloch, A. Botti, R. Breier, G. Bres, J-. L. Bret, et al. (281 additional authors not shown)

    Abstract: Many low-threshold experiments observe sharply rising event rates of yet unknown origins below a few hundred eV, and larger than expected from known backgrounds. Due to the significant impact of this excess on the dark matter or neutrino sensitivity of these experiments, a collective effort has been started to share the knowledge about the individual observations. For this, the EXCESS Workshop was…

    Submitted 4 March, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: 44 pages, 20 figures; Editors: A. Fuss, M. Kaznacheeva, F. Reindl, F. Wagner; updated copyright statements and funding information

    Journal ref: SciPost Phys. Proc. 9, 001 (2022)

  50. arXiv:2202.01851  [pdf, other]

    cs.LG cs.AI

    A Note on "Assessing Generalization of SGD via Disagreement"

    Authors: Andreas Kirsch, Yarin Gal

    Abstract: Several recent works find empirically that the average test error of deep neural networks can be estimated via the prediction disagreement of models, which does not require labels. In particular, Jiang et al. (2022) show for the disagreement between two separately trained networks that this `Generalization Disagreement Equality' follows from the well-calibrated nature of deep ensembles under the n…

    Submitted 6 November, 2022; v1 submitted 3 February, 2022; originally announced February 2022.