Showing 51–100 of 162 results for author: Gal, Y

  1. arXiv:2112.13023  [pdf, other]

    cs.LG cs.AI

    DARTS without a Validation Set: Optimizing the Marginal Likelihood

    Authors: Miroslav Fil, Binxin Ru, Clare Lyle, Yarin Gal

    Abstract: The success of neural architecture search (NAS) has historically been limited by excessive compute requirements. While modern weight-sharing NAS methods such as DARTS are able to finish the search in single-digit GPU days, extracting the final best architecture from the shared weights is notoriously unreliable. Training-Speed-Estimate (TSE), a recently developed generalization estimator with a Bay…

    Submitted 24 December, 2021; originally announced December 2021.

    Comments: Presented at the 5th Workshop on Meta-Learning at NeurIPS 2021

  2. arXiv:2112.10074  [pdf, other]

    eess.IV cs.CV cs.LG

    QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation - Analysis of Ranking Scores and Benchmarking Results

    Authors: Raghav Mehta, Angelos Filos, Ujjwal Baid, Chiharu Sako, Richard McKinley, Michael Rebsamen, Katrin Datwyler, Raphael Meier, Piotr Radojewski, Gowtham Krishnan Murugesan, Sahil Nalawade, Chandan Ganesh, Ben Wagner, Fang F. Yu, Baowei Fei, Ananth J. Madhuranthakam, Joseph A. Maldjian, Laura Daza, Catalina Gomez, Pablo Arbelaez, Chengliang Dai, Shuo Wang, Hadrien Reynaud, Yuan-han Mo, Elsa Angelini, et al. (67 additional authors not shown)

    Abstract: Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying…

    Submitted 23 August, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA): https://www.melba-journal.org/papers/2022:026.html

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 1 (2022)

  3. arXiv:2112.00856  [pdf, other]

    cs.LG

    Decomposing Representations for Deterministic Uncertainty Estimation

    Authors: Haiwen Huang, Joost van Amersfoort, Yarin Gal

    Abstract: Uncertainty estimation is a key component in any deployed machine learning system. One way to evaluate uncertainty estimation is using "out-of-distribution" (OoD) detection, that is, distinguishing between the training data distribution and an unseen different data distribution using uncertainty. In this work, we show that current feature density based uncertainty estimators cannot perform well co…

    Submitted 1 December, 2021; originally announced December 2021.

  4. arXiv:2111.15639  [pdf, other]

    cs.CV cs.LG stat.ML

    DeDUCE: Generating Counterfactual Explanations Efficiently

    Authors: Benedikt Höltgen, Lisa Schut, Jan M. Brauner, Yarin Gal

    Abstract: When an image classifier outputs a wrong class label, it can be helpful to see what changes in the image would lead to a correct classification. This is the aim of algorithms generating counterfactual explanations. However, there is no easily scalable method to generate such counterfactuals. We develop a new algorithm providing counterfactual explanations for large image classifiers trained with s…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: Presented at the 1st Workshop on eXplainable AI approaches for debugging and diagnosis (XAI4Debugging@NeurIPS2021)

  5. arXiv:2111.07679  [pdf, other]

    stat.ML cs.LG

    Contrastive Representation Learning with Trainable Augmentation Channel

    Authors: Masanori Koyama, Kentaro Minami, Takeru Miyato, Yarin Gal

    Abstract: In contrastive representation learning, data representation is trained so that it can classify the image instances even when the images are altered by augmentations. However, depending on the datasets, some augmentations can damage the information of the images beyond recognition, and such augmentations can result in collapsed representations. We present a partial solution to this problem by forma…

    Submitted 15 November, 2021; originally announced November 2021.

  6. arXiv:2111.03231  [pdf, other]

    eess.IV cs.CV

    Multi-Spectral Multi-Image Super-Resolution of Sentinel-2 with Radiometric Consistency Losses and Its Effect on Building Delineation

    Authors: Muhammed Razzak, Gonzalo Mateo-Garcia, Luis Gómez-Chova, Yarin Gal, Freddie Kalaitzis

    Abstract: High resolution remote sensing imagery is used in broad range of tasks, including detection and classification of objects. High-resolution imagery is however expensive, while lower resolution imagery is often freely available and can be used by the public for range of social good applications. To that end, we curate a multi-spectral multi-image super-resolution dataset, using PlanetScope imagery f…

    Submitted 4 November, 2021; originally announced November 2021.

  7. arXiv:2111.02275  [pdf, other]

    cs.LG stat.ML

    Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data

    Authors: Andrew Jesson, Panagiotis Tigas, Joost van Amersfoort, Andreas Kirsch, Uri Shalit, Yarin Gal

    Abstract: Estimating personalized treatment effects from high-dimensional observational data is essential in situations where experimental designs are infeasible, unethical, or expensive. Existing approaches rely on fitting deep models on outcomes observed for treated and control populations. However, when measuring individual outcomes is costly, as is the case of a tumor biopsy, a sample-efficient strategy…

    Submitted 1 February, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

    Comments: 24 pages, 8 Figures, 5 tables, NeurIPS 2021

  8. arXiv:2111.00079  [pdf, other]

    cs.CV cs.LG

    Deep Deterministic Uncertainty for Semantic Segmentation

    Authors: Jishnu Mukhoti, Joost van Amersfoort, Philip H. S. Torr, Yarin Gal

    Abstract: We extend Deep Deterministic Uncertainty (DDU), a method for uncertainty estimation using feature space densities, to semantic segmentation. DDU enables quantifying and disentangling epistemic and aleatoric uncertainty in a single forward pass through the model. We study the similarity of feature representations of pixels at different locations for the same class and conclude that it is feasible t…

    Submitted 29 October, 2021; originally announced November 2021.

  9. arXiv:2110.15084  [pdf, other]

    physics.ao-ph cs.LG physics.data-an

    Using Non-Linear Causal Models to Study Aerosol-Cloud Interactions in the Southeast Pacific

    Authors: Andrew Jesson, Peter Manshausen, Alyson Douglas, Duncan Watson-Parris, Yarin Gal, Philip Stier

    Abstract: Aerosol-cloud interactions include a myriad of effects that all begin when aerosol enters a cloud and acts as cloud condensation nuclei (CCN). An increase in CCN results in a decrease in the mean cloud droplet size (r$_{e}$). The smaller droplet size leads to brighter, more expansive, and longer lasting clouds that reflect more incoming sunlight, thus cooling the earth. Globally, aerosol-cloud int…

    Submitted 3 November, 2021; v1 submitted 28 October, 2021; originally announced October 2021.

  10. arXiv:2110.11875  [pdf, other]

    cs.LG stat.ML

    GeneDisco: A Benchmark for Experimental Design in Drug Discovery

    Authors: Arash Mehrjou, Ashkan Soleymani, Andrew Jesson, Pascal Notin, Yarin Gal, Stefan Bauer, Patrick Schwab

    Abstract: In vitro cellular experimentation with genetic interventions, using for example CRISPR technologies, is an essential step in early-stage drug discovery and target validation that serves to assess initial hypotheses about causal associations between biological mechanisms and disease pathologies. With billions of potential hypotheses to test, the experimental design space for in vitro genetic experi…

    Submitted 22 October, 2021; originally announced October 2021.

  11. arXiv:2107.14261  [pdf, other]

    physics.acc-ph cs.LG

    Quantifying Uncertainty for Machine Learning Based Diagnostic

    Authors: Owen Convery, Lewis Smith, Yarin Gal, Adi Hanuka

    Abstract: Virtual Diagnostic (VD) is a deep learning tool that can be used to predict a diagnostic output. VDs are especially useful in systems where measuring the output is invasive, limited, costly or runs the risk of damaging the output. Given a prediction, it is necessary to relay how reliable that prediction is. This is known as 'uncertainty quantification' of a prediction. In this paper, we use ensemb…

    Submitted 29 July, 2021; originally announced July 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2105.04654

  12. arXiv:2107.07455  [pdf, other]

    cs.LG cs.AI stat.ML

    Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

    Authors: Andrey Malinin, Neil Band, Alexander Ganshin, German Chesnokov, Yarin Gal, Mark J. F. Gales, Alexey Noskov, Andrey Ploskonosov, Liudmila Prokhorenkova, Ivan Provilkov, Vatsal Raina, Vyas Raina, Denis Roginskiy, Mariya Shmatova, Panos Tigas, Boris Yangel

    Abstract: There has been significant research done on developing methods for improving robustness to distributional shift and uncertainty estimation. In contrast, only limited work has examined developing standard datasets and benchmarks for assessing these approaches. Additionally, most work on uncertainty estimation and robustness has developed new techniques based on small-scale regression or image class…

    Submitted 11 February, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

  13. arXiv:2107.02565  [pdf, other]

    cs.LG cs.IT

    Prioritized training on points that are learnable, worth learning, and not yet learned (workshop version)

    Authors: Sören Mindermann, Muhammed Razzak, Winnie Xu, Andreas Kirsch, Mrinank Sharma, Adrien Morisot, Aidan N. Gomez, Sebastian Farquhar, Jan Brauner, Yarin Gal

    Abstract: We introduce Goldilocks Selection, a technique for faster model training which selects a sequence of training points that are "just right". We propose an information-theoretic acquisition function -- the reducible validation loss -- and compute it with a small proxy model -- GoldiProx -- to efficiently choose training points that maximize information about a validation set. We show that the "hard"…

    Submitted 17 October, 2023; v1 submitted 6 July, 2021; originally announced July 2021.

    Journal ref: ICML 2021 Workshop on Subset Selection in Machine Learning

  14. arXiv:2107.00096  [pdf, other]

    cs.LG stat.ML

    Improving black-box optimization in VAE latent space using decoder uncertainty

    Authors: Pascal Notin, José Miguel Hernández-Lobato, Yarin Gal

    Abstract: Optimization in the latent space of variational autoencoders is a promising approach to generate high-dimensional discrete objects that maximize an expensive black-box property (e.g., drug-likeness in molecular generation, function approximation with arithmetic expressions). However, existing methods lack robustness as they may decide to explore areas of the latent space for which no data was avai…

    Submitted 30 June, 2021; originally announced July 2021.

  15. arXiv:2106.12062  [pdf, other]

    cs.LG stat.ML

    A Practical & Unified Notation for Information-Theoretic Quantities in ML

    Authors: Andreas Kirsch, Yarin Gal

    Abstract: A practical notation can convey valuable intuitions and concisely express new ideas. Information theory is of importance to machine learning, but the notation for information-theoretic quantities is sometimes opaque. We propose a practical and unified notation and extend it to include information-theoretic quantities between observed outcomes (events) and random variables. This includes the point-…

    Submitted 2 December, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  16. arXiv:2106.12059  [pdf, other]

    cs.LG stat.ML

    Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning

    Authors: Andreas Kirsch, Sebastian Farquhar, Parmida Atighehchian, Andrew Jesson, Frederic Branchaud-Charron, Yarin Gal

    Abstract: We examine a simple stochastic strategy for adapting well-known single-point acquisition functions to allow batch active learning. Unlike acquiring the top-K points from the pool set, score- or rank-based sampling takes into account that acquisition scores change as new data are acquired. This simple strategy for adapting standard single-sample acquisition strategies can even perform just as well…

    Submitted 19 September, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: TMLR Paper: https://openreview.net/forum?id=vcHwQyNBjW

  17. arXiv:2106.11719  [pdf, other]

    cs.LG stat.ML

    Test Distribution-Aware Active Learning: A Principled Approach Against Distribution Shift and Outliers

    Authors: Andreas Kirsch, Tom Rainforth, Yarin Gal

    Abstract: Expanding on MacKay (1992), we argue that conventional model-based methods for active learning - like BALD - have a fundamental shortfall: they fail to directly account for the test-time distribution of the input variables. This can lead to pathologies in the acquisition strategy, as what is maximally informative for model parameters may not be maximally informative for prediction: for example, wh…

    Submitted 21 November, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  18. arXiv:2106.07780  [pdf, other]

    cs.LG

    KL Guided Domain Adaptation

    Authors: A. Tuan Nguyen, Toan Tran, Yarin Gal, Philip H. S. Torr, Atılım Güneş Baydin

    Abstract: Domain adaptation is an important problem and often needed for real-world applications. In this problem, instead of i.i.d. training and testing datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in…

    Submitted 14 March, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: Accepted to ICLR2022

  19. arXiv:2106.04015  [pdf, other]

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, et al. (1 additional author not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu…

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  20. arXiv:2106.02584  [pdf, other]

    cs.LG stat.ML

    Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

    Authors: Jannik Kossen, Neil Band, Clare Lyle, Aidan N. Gomez, Tom Rainforth, Yarin Gal

    Abstract: We challenge a common assumption underlying most supervised deep learning: that a model makes a prediction depending only on its parameters and the features of a single input. To this end, we introduce a general-purpose deep learning architecture that takes as input the entire dataset instead of processing one datapoint at a time. Our approach uses self-attention to reason about relationships betw…

    Submitted 1 February, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Accepted for publication at NeurIPS 2021. First two authors contributed equally

  21. arXiv:2106.02469  [pdf, other]

    cs.LG stat.ML

    Can convolutional ResNets approximately preserve input distances? A frequency analysis perspective

    Authors: Lewis Smith, Joost van Amersfoort, Haiwen Huang, Stephen Roberts, Yarin Gal

    Abstract: ResNets constrained to be bi-Lipschitz, that is, approximately distance preserving, have been a crucial component of recently proposed techniques for deterministic uncertainty quantification in neural models. We show that theoretical justifications for recent regularisation schemes trying to enforce such a constraint suffer from a crucial flaw -- the theoretical link between the regularisation sch…

    Submitted 17 June, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Main paper 10 pages including references, appendix 10 pages. 7 figures and 6 tables including appendix

  22. Uncertainty Quantification for Virtual Diagnostic of Particle Accelerators

    Authors: Owen Convery, Lewis Smith, Yarin Gal, Adi Hanuka

    Abstract: Virtual Diagnostic (VD) is a computational tool based on deep learning that can be used to predict a diagnostic output. VDs are especially useful in systems where measuring the output is invasive, limited, costly or runs the risk of altering the output. Given a prediction, it is necessary to relay how reliable that prediction is, i.e. quantify the uncertainty of the prediction. In this paper, we u…

    Submitted 10 May, 2021; originally announced May 2021.

  23. arXiv:2104.10190  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Outcome-Driven Reinforcement Learning via Variational Inference

    Authors: Tim G. J. Rudner, Vitchyr H. Pong, Rowan McAllister, Yarin Gal, Sergey Levine

    Abstract: While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it. In this paper, we view reinforcement learning as inferring policies that achieve desired outcomes, rath…

    Submitted 28 December, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: Published in Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  24. arXiv:2104.04785  [pdf, other]

    cs.CV cs.LG eess.IV

    Physically-Consistent Generative Adversarial Networks for Coastal Flood Visualization

    Authors: Björn Lütjens, Brandon Leshchinskiy, Christian Requena-Mesa, Farrukh Chishtie, Natalia Díaz-Rodríguez, Océane Boulais, Aruna Sankaranarayanan, Margaux Masson-Forsythe, Aaron Piña, Yarin Gal, Chedy Raïssi, Alexander Lavin, Dava Newman

    Abstract: As climate change increases the intensity of natural disasters, society needs better tools for adaptation. Floods, for example, are the most frequent natural disaster, and better tools for flood risk communication could increase the support for flood-resilient infrastructure development. Our work aims to enable more visual communication of large-scale climate impacts via visualizing the output of…

    Submitted 21 February, 2023; v1 submitted 10 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: text overlap with arXiv:2010.08103

  25. arXiv:2103.08951  [pdf, other]

    cs.LG stat.AP

    Generating Interpretable Counterfactual Explanations By Implicit Minimisation of Epistemic and Aleatoric Uncertainties

    Authors: Lisa Schut, Oscar Key, Rory McGrath, Luca Costabello, Bogdan Sacaleanu, Medb Corcoran, Yarin Gal

    Abstract: Counterfactual explanations (CEs) are a practical tool for demonstrating why machine learning classifiers make particular decisions. For CEs to be useful, it is important that they are easy for users to interpret. Existing methods for generating interpretable CEs rely on auxiliary generative models, which may not be suitable for complex datasets, and incur engineering overhead. We introduce a simp…

    Submitted 16 March, 2021; originally announced March 2021.

    Comments: 21 pages, 13 Figures

    Journal ref: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021

  26. arXiv:2103.06002  [pdf, other]

    cs.LG stat.ML

    Robustness to Pruning Predicts Generalization in Deep Neural Networks

    Authors: Lorenz Kuhn, Clare Lyle, Aidan N. Gomez, Jonas Rothfuss, Yarin Gal

    Abstract: Existing generalization measures that aim to capture a model's simplicity based on parameter counts or norms fail to explain generalization in overparameterized deep neural networks. In this paper, we introduce a new, theoretically motivated measure of a network's simplicity which we call prunability: the smallest fraction of the network's parameters that can be kept while pruning without a…

    Submitted 10 March, 2021; originally announced March 2021.

  27. arXiv:2103.05331  [pdf, other]

    stat.ML cs.LG

    Active Testing: Sample-Efficient Model Evaluation

    Authors: Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

    Abstract: We introduce a new framework for sample-efficient model evaluation that we call active testing. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of labeling test data, typically unrealistically assuming large test sets for model evaluation. This creates a disconnect to real applications, where test labels are…

    Submitted 14 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: Published at the 38th International Conference on Machine Learning (ICML 2021)

  28. arXiv:2103.04850  [pdf, other]

    cs.LG stat.ML

    Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding

    Authors: Andrew Jesson, Sören Mindermann, Yarin Gal, Uri Shalit

    Abstract: We study the problem of learning conditional average treatment effects (CATE) from high-dimensional, observational data with unobserved confounders. Unobserved confounders introduce ignorance -- a level of unidentifiability -- about an individual's response to treatment by inducing bias in CATE estimates. We present a new parametric interval estimator suited for high-dimensional data, that estimat…

    Submitted 1 February, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: 19 pages, 5 figures, ICML 2021

    Journal ref: PMLR 139 (2021) 4829-4838

  29. arXiv:2102.12560  [pdf, other]

    cs.LG cs.AI

    PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

    Authors: Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

    Abstract: We study reinforcement learning (RL) with no-reward demonstrations, a setting in which an RL agent has access to additional data from the interaction of other agents with the same environment. However, it has no access to the rewards or goals of these agents, and their objectives and levels of expertise may vary widely. These assumptions are common in multi-agent settings, such as autonomous drivi…

    Submitted 10 June, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

    Comments: The last two authors contributed equally. Accepted at ICML 2021

  30. arXiv:2102.11582  [pdf, other]

    cs.LG stat.ML

    Deep Deterministic Uncertainty: A Simple Baseline

    Authors: Jishnu Mukhoti, Andreas Kirsch, Joost van Amersfoort, Philip H. S. Torr, Yarin Gal

    Abstract: Reliable uncertainty from deterministic single-forward pass models is sought after because conventional methods of uncertainty quantification are computationally expensive. We take two complex single-forward-pass uncertainty approaches, DUQ and SNGP, and examine whether they mainly rely on a well-regularized feature space. Crucially, without using their more complex methods for estimating uncertai…

    Submitted 28 January, 2022; v1 submitted 23 February, 2021; originally announced February 2021.

  31. arXiv:2102.11409  [pdf, other]

    cs.LG stat.ML

    On Feature Collapse and Deep Kernel Learning for Single Forward Pass Uncertainty

    Authors: Joost van Amersfoort, Lewis Smith, Andrew Jesson, Oscar Key, Yarin Gal

    Abstract: Inducing point Gaussian process approximations are often considered a gold standard in uncertainty estimation since they retain many of the properties of the exact GP and scale to large datasets. A major drawback is that they have difficulty scaling to high dimensional inputs. Deep Kernel Learning (DKL) promises a solution: a deep feature extractor transforms the inputs over which an inducing poin…

    Submitted 7 March, 2022; v1 submitted 22 February, 2021; originally announced February 2021.

  32. arXiv:2102.08414  [pdf, other]

    astro-ph.GA cs.CV

    Galaxy Zoo DECaLS: Detailed Visual Morphology Measurements from Volunteers and Deep Learning for 314,000 Galaxies

    Authors: Mike Walmsley, Chris Lintott, Tobias Geron, Sandor Kruk, Coleman Krawczyk, Kyle W. Willett, Steven Bamford, Lee S. Kelvin, Lucy Fortson, Yarin Gal, William Keel, Karen L. Masters, Vihang Mehta, Brooke D. Simmons, Rebecca Smethurst, Lewis Smith, Elisabeth M. Baeten, Christine Macmillan

    Abstract: We present Galaxy Zoo DECaLS: detailed visual morphological classifications for Dark Energy Camera Legacy Survey images of galaxies within the SDSS DR8 footprint. Deeper DECaLS images (r=23.6 vs. r=22.2 from SDSS) reveal spiral arms, weak bars, and tidal features not previously visible in SDSS imaging. To best exploit the greater depth of DECaLS images, volunteers select from a new set of answers…

    Submitted 3 January, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted by MNRAS July '21. Open access DOI below. Data at https://doi.org/10.5281/zenodo.4196266. Code at https://www.github.com/mwalmsley/zoobot. Docs at https://zoobot.readthedocs.io/. Interactive viewer at https://share.streamlit.io/mwalmsley/galaxy-poster/gz_decals_mike_walmsley.py

  33. arXiv:2102.05082  [pdf, other]

    cs.LG

    Domain Invariant Representation Learning with Domain Density Transformations

    Authors: A. Tuan Nguyen, Toan Tran, Yarin Gal, Atılım Güneş Baydin

    Abstract: Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to targ…

    Submitted 15 February, 2022; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2021

  34. arXiv:2102.01447  [pdf, other]

    physics.geo-ph astro-ph.SR cs.LG physics.space-ph

    Global Earth Magnetic Field Modeling and Forecasting with Spherical Harmonics Decomposition

    Authors: Panagiotis Tigas, Téo Bloch, Vishal Upendran, Banafsheh Ferdoushi, Mark C. M. Cheung, Siddha Ganju, Ryan M. McGranaghan, Yarin Gal, Asti Bhatt

    Abstract: Modeling and forecasting the solar wind-driven global magnetic field perturbations is an open challenge. Current approaches depend on simulations of computationally demanding models like the Magnetohydrodynamics (MHD) model or sampling spatially and temporally through sparse ground-based stations (SuperMAG). In this paper, we develop a Deep Learning model that forecasts in Spherical Harmonics spac…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: Third Workshop on Machine Learning and the Physical Sciences (NeurIPS 2020), Vancouver, Canada

  35. arXiv:2101.11665  [pdf, other]

    stat.ML cs.LG

    On Statistical Bias In Active Learning: How and When To Fix It

    Authors: Sebastian Farquhar, Yarin Gal, Tom Rainforth

    Abstract: Active learning is a powerful tool when labelling data is expensive, but it introduces a bias because the training data no longer follows the population distribution. We formalize this bias and investigate the situations in which it can be harmful and sometimes even helpful. We further introduce novel corrective weights to remove bias when doing so is beneficial. Through this, our work not only pr…

    Submitted 31 May, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Published at ICLR 2021 (Spotlight)

  36. Technology Readiness Levels for Machine Learning Systems

    Authors: Alexander Lavin, Ciarán M. Gilligan-Lee, Alessya Visnjic, Siddha Ganju, Dava Newman, Atılım Güneş Baydin, Sujoy Ganguly, Danny Lange, Amit Sharma, Stephan Zheng, Eric P. Xing, Adam Gibson, James Parr, Chris Mattmann, Yarin Gal

    Abstract: The development and deployment of machine learning (ML) systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. The lack of diligence can lead to technical debt, scope creep and misaligned objectives, model misuse and failures, and expensive consequences. Engineering systems, on the other hand, follow well-defined processes and testing standards t…

    Submitted 29 November, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

  37. arXiv:2101.03552  [pdf, other]

    cs.LG math.OC

    PowerEvaluationBALD: Efficient Evaluation-Oriented Deep (Bayesian) Active Learning with Stochastic Acquisition Functions

    Authors: Andreas Kirsch, Yarin Gal

    Abstract: We develop BatchEvaluationBALD, a new acquisition function for deep Bayesian active learning, as an expansion of BatchBALD that takes into account an evaluation set of unlabeled data, for example, the pool set. We also develop a variant for the non-Bayesian setting, which we call Evaluation Information Gain. To reduce computational requirements and allow these methods to scale to larger acquisitio…

    Submitted 10 May, 2021; v1 submitted 10 January, 2021; originally announced January 2021.

  38. arXiv:2012.14023  [pdf, other]

    astro-ph.SR astro-ph.IM cs.LG physics.data-an physics.space-ph

    Multi-Channel Auto-Calibration for the Atmospheric Imaging Assembly using Machine Learning

    Authors: Luiz F. G. dos Santos, Souvik Bose, Valentina Salvatelli, Brad Neuberg, Mark C. M. Cheung, Miho Janvier, Meng Jin, Yarin Gal, Paul Boerner, Atılım Güneş Baydin

    Abstract: Solar activity plays a quintessential role in influencing the interplanetary medium and space-weather around the Earth. Remote sensing instruments onboard heliophysics space missions provide a pool of information about the Sun's activity via the measurement of its magnetic field and the emission of light from the multi-layered, multi-thermal, and dynamic solar atmosphere. Extreme UV (EUV) waveleng…

    Submitted 1 February, 2021; v1 submitted 27 December, 2020; originally announced December 2020.

    Comments: 12 pages, 7 figures, 8 tables. This is a pre-print of an article submitted and accepted by A&A Journal

    Journal ref: A&A 648, A53 (2021)

  39. arXiv:2012.13220  [pdf, other]

    cs.LG stat.ML

    On Batch Normalisation for Approximate Bayesian Inference

    Authors: Jishnu Mukhoti, Puneet K. Dokania, Philip H. S. Torr, Yarin Gal

    Abstract: We study batch normalisation in the context of variational inference methods in Bayesian neural networks, such as mean-field or MC Dropout. We show that batch-normalisation does not affect the optimum of the evidence lower bound (ELBO). Furthermore, we study the Monte Carlo Batch Normalisation (MCBN) algorithm, proposed as an approximate inference technique parallel to MC Dropout, and show that fo…

    Submitted 24 December, 2020; originally announced December 2020.

  40. arXiv:2011.08714  [pdf, other]

    stat.ML cs.LG

    Semi-supervised Learning of Galaxy Morphology using Equivariant Transformer Variational Autoencoders

    Authors: Mizu Nishikawa-Toomey, Lewis Smith, Yarin Gal

    Abstract: The growth in the number of galaxy images is much faster than the speed at which these galaxies can be labelled by humans. However, by leveraging the information present in the ever growing set of unlabelled images, semi-supervised learning could be an effective way of reducing the required labelling and increasing classification accuracy. We develop a Variational Autoencoder (VAE) with Equivarian…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: Accepted at the workshop for Machine Learning and the Physical Sciences, 34th Conference on Neural Information Processing Systems (NeurIPS) December 11, 2020

  41. arXiv:2011.00515  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes

    Authors: Tim G. J. Rudner, Oscar Key, Yarin Gal, Tom Rainforth

    Abstract: We show that the gradient estimates used in training Deep Gaussian Processes (DGPs) with importance-weighted variational inference are susceptible to signal-to-noise ratio (SNR) issues. Specifically, we show both theoretically and via an extensive empirical evaluation that the SNR of the gradient estimates for the latent variable's variational parameters decreases as the number of importance sampl…

    Submitted 21 July, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: Published in Proceedings of the 38th International Conference on Machine Learning (ICML 2021)

  42. arXiv:2011.00415  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Inter-domain Deep Gaussian Processes

    Authors: Tim G. J. Rudner, Dino Sejdinovic, Yarin Gal

    Abstract: Inter-domain Gaussian processes (GPs) allow for high flexibility and low computational cost when performing approximate inference in GP models. They are particularly suitable for modeling data exhibiting global structure but are limited to stationary covariance functions and thus fail to model non-stationary data effectively. We propose Inter-domain Deep Gaussian Processes, an extension of inter-d…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: Published in Proceedings of the 37th International Conference on Machine Learning (ICML 2020)

  43. arXiv:2010.14499  [pdf, other]

    cs.LG

    A Bayesian Perspective on Training Speed and Model Selection

    Authors: Clare Lyle, Lisa Schut, Binxin Ru, Yarin Gal, Mark van der Wilk

    Abstract: We take a Bayesian perspective to illustrate a connection between training speed and the marginal likelihood in linear models. This provides two major insights: first, that a measure of a model's training speed can be used to estimate its marginal likelihood. Second, that this measure, under certain conditions, predicts the relative weighting of models in linear model combinations trained to minim…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: To be presented at NeurIPS 2020

  44. arXiv:2010.08103  [pdf, other]

    cs.CV cs.HC cs.LG eess.IV

    Physics-informed GANs for Coastal Flood Visualization

    Authors: Björn Lütjens, Brandon Leshchinskiy, Christian Requena-Mesa, Farrukh Chishtie, Natalia Díaz-Rodriguez, Océane Boulais, Aaron Piña, Dava Newman, Alexander Lavin, Yarin Gal, Chedy Raïssi

    Abstract: As climate change increases the intensity of natural disasters, society needs better tools for adaptation. Floods, for example, are the most frequent natural disaster, but during hurricanes the area is largely covered by clouds and emergency managers must rely on nonintuitive flood visualizations for mission planning. To assist these emergency managers, we have created a deep learning pipeline tha…

    Submitted 12 February, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: Under Review

  45. arXiv:2010.04116  [pdf, other]

    cs.LG cs.AI

    Interlocking Backpropagation: Improving depthwise model-parallelism

    Authors: Aidan N. Gomez, Oscar Key, Kuba Perlin, Stephen Gou, Nick Frosst, Jeff Dean, Yarin Gal

    Abstract: The number of parameters in state of the art neural networks has drastically increased in recent years. This surge of interest in large scale neural networks has motivated the development of new distributed training strategies enabling such models. One such strategy is model-parallel distributed training. Unfortunately, model-parallelism can suffer from poor resource utilisation, which leads to wa…

    Submitted 7 July, 2022; v1 submitted 8 October, 2020; originally announced October 2020.

  46. arXiv:2007.13454  [pdf, other]

    stat.AP cs.LG q-bio.PE q-bio.QM stat.ML

    How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?

    Authors: Mrinank Sharma, Sören Mindermann, Jan Markus Brauner, Gavin Leech, Anna B. Stephenson, Tomáš Gavenčiak, Jan Kulveit, Yee Whye Teh, Leonid Chindelevitch, Yarin Gal

    Abstract: To what extent are effectiveness estimates of nonpharmaceutical interventions (NPIs) against COVID-19 influenced by the assumptions our models make? To answer this question, we investigate 2 state-of-the-art NPI effectiveness models and propose 6 variants that make different structural assumptions. In particular, we investigate how well NPI effectiveness estimates generalise to unseen countries, a…

    Submitted 20 December, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

    Journal ref: NeurIPS 2020, Advances in Neural Information Processing Systems 33

  47. arXiv:2007.10909  [pdf, other]

    cs.LG stat.ML

    Improving compute efficacy frontiers with SliceOut

    Authors: Pascal Notin, Aidan N. Gomez, Joanna Yoo, Yarin Gal

    Abstract: Pushing forward the compute efficacy frontier in deep learning is critical for tasks that require frequent model re-training or workloads that entail training a large number of models. We introduce SliceOut -- a dropout-inspired scheme designed to take advantage of GPU memory layout to train deep learning models faster without impacting final test accuracy. By dropping contiguous sets of units at…

    Submitted 31 March, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

  48. arXiv:2007.00389  [pdf, other]

    cs.LG stat.ML

    Single Shot Structured Pruning Before Training

    Authors: Joost van Amersfoort, Milad Alizadeh, Sebastian Farquhar, Nicholas Lane, Yarin Gal

    Abstract: We introduce a method to speed up training by 2x and inference by 3x in deep neural networks using structured pruning applied before training. Unlike previous works on pruning before training which prune individual weights, our work develops a methodology to remove entire channels and hidden units with the explicit aim of speeding up training and inference. We introduce a compute-aware scoring mec…

    Submitted 1 July, 2020; originally announced July 2020.

  49. arXiv:2007.00163  [pdf, other]

    cs.LG stat.ML

    Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models

    Authors: Andrew Jesson, Sören Mindermann, Uri Shalit, Yarin Gal

    Abstract: Recommending the best course of action for an individual is a major application of individual-level causal effect estimation. This application is often needed in safety-critical domains such as healthcare, where estimating and communicating uncertainty to decision-makers is crucial. We introduce a practical approach for integrating uncertainty estimation into a class of state-of-the-art neural net…

    Submitted 22 October, 2020; v1 submitted 30 June, 2020; originally announced July 2020.

  50. arXiv:2006.14911  [pdf, other]

    cs.LG cs.RO stat.ML

    Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?

    Authors: Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, Yarin Gal

    Abstract: Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment, typically leading to arbitrary deductions and poorly-informed decisions. In principle, detection of and adaptation to OOD scenes can mitigate their adverse effects. In this paper, we highlight the limitations of current approaches to novel driving scenes and propose an epistemic uncertainty-aware…

    Submitted 2 September, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: The first two authors contributed equally. Accepted at ICML 2020. Supplementary videos and code available at: https://sites.google.com/view/av-detect-recover-adapt