Skip to main content

Showing 1–11 of 11 results for author: Dusenberry, M W

Searching in archive cs. Search in all archives.
.
  1. arXiv:2307.00667  [pdf, other

    stat.ML cs.AI cs.LG

    Morse Neural Networks for Uncertainty Quantification

    Authors: Benoit Dherin, Huiyi Hu, Jie Ren, Michael W. Dusenberry, Balaji Lakshminarayanan

    Abstract: We introduce a new deep generative model useful for uncertainty quantification: the Morse neural network, which generalizes the unnormalized Gaussian densities to have modes of high-dimensional submanifolds instead of just discrete points. Fitting the Morse neural network via a KL-divergence loss yields 1) a (unnormalized) generative density, 2) an OOD detector, 3) a calibration temperature, 4) a… ▽ More

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Accepted to ICML workshop on Structured Probabilistic Inference & Generative Modeling 2023

  2. arXiv:2302.06235  [pdf, other

    cs.LG cs.CV stat.ML

    A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

    Authors: James Urquhart Allingham, Jie Ren, Michael W Dusenberry, Xiuye Gu, Yin Cui, Dustin Tran, Jeremiah Zhe Liu, Balaji Lakshminarayanan

    Abstract: Contrastively trained text-image models have the remarkable ability to perform zero-shot classification, that is, classifying previously unseen images into categories that the model has never been explicitly trained to identify. However, these zero-shot classifiers need prompt engineering to achieve high accuracy. Prompt engineering typically requires hand-crafting a set of prompts for individual… ▽ More

    Submitted 15 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023. 23 pages, 10 tables, 3 figures

  3. arXiv:2301.04857  [pdf, other

    cs.AI stat.ME

    Neural Spline Search for Quantile Probabilistic Modeling

    Authors: Ruoxi Sun, Chun-Liang Li, Sercan O. Arik, Michael W. Dusenberry, Chen-Yu Lee, Tomas Pfister

    Abstract: Accurate estimation of output quantiles is crucial in many use cases, where it is desired to model the range of possibility. Modeling target distribution at arbitrary quantile levels and at arbitrary input attribute levels are important to offer a comprehensive picture of the data, and requires the quantile function to be expressive enough. The quantile function describing the target distribution… ▽ More

    Submitted 12 January, 2023; originally announced January 2023.

  4. arXiv:2211.12717  [pdf, other

    stat.ML cs.AI cs.CV cs.LG

    Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks

    Authors: Neil Band, Tim G. J. Rudner, Qixuan Feng, Angelos Filos, Zachary Nado, Michael W. Dusenberry, Ghassen Jerfel, Dustin Tran, Yarin Gal

    Abstract: Bayesian deep learning seeks to equip deep neural networks with the ability to precisely quantify their predictive uncertainty, and has promised to make deep learning more reliable for safety-critical real-world applications. Yet, existing Bayesian deep learning methods fall short of this promise; new methods continue to be evaluated on unrealistic test beds that do not reflect the complexities of… ▽ More

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: Published in Neural Information Processing Systems (NeurIPS) 2021 Datasets and Benchmarks Track Proceedings. First two authors contributed equally. Code available at https://rebrand.ly/retina-benchmark

  5. arXiv:2207.07411  [pdf, other

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek , et al. (1 additional authors not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per… ▽ More

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  6. arXiv:2106.04015  [pdf, other

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal , et al. (1 additional authors not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu… ▽ More

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  7. arXiv:2010.09875  [pdf, other

    cs.LG stat.ML

    Combining Ensembles and Data Augmentation can Harm your Calibration

    Authors: Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W. Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness. Similarly, data augmentation techniques, which encode prior information in the form of invariant feature transformations, are effective for improving calibration and robustness. In this paper, we show a surprising pathology: combining ensembles and data aug… ▽ More

    Submitted 22 March, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

  8. arXiv:2005.07186  [pdf, other

    cs.LG stat.ML

    Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

    Authors: Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning. However, they generally struggle with underfitting at scale and parameter efficiency. On the other hand, deep ensembles have emerged as alternatives for uncertainty quantification that, while outperforming BNNs on certain problems, also suffer from effic… ▽ More

    Submitted 14 August, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: Published in the International Conference on Machine Learning (ICML) 2020. Code available at https://github.com/google/edward2

  9. arXiv:1906.04716  [pdf, other

    cs.LG stat.ML

    Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer

    Authors: Edward Choi, Zhen Xu, Yujia Li, Michael W. Dusenberry, Gerardo Flores, Yuan Xue, Andrew M. Dai

    Abstract: Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g. relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure informat… ▽ More

    Submitted 19 January, 2020; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: To be presented at AAAI 2020

  10. Analyzing the Role of Model Uncertainty for Electronic Health Records

    Authors: Michael W. Dusenberry, Dustin Tran, Edward Choi, Jonas Kemp, Jeremy Nixon, Ghassen Jerfel, Katherine Heller, Andrew M. Dai

    Abstract: In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertaint… ▽ More

    Submitted 25 March, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Published in the ACM Conference on Health, Inference, and Learning (CHIL) 2020. Code available at https://github.com/Google-Health/records-research

  11. arXiv:1812.03973  [pdf, other

    cs.LG cs.PL stat.ML

    Bayesian Layers: A Module for Neural Network Uncertainty

    Authors: Dustin Tran, Michael W. Dusenberry, Mark van der Wilk, Danijar Hafner

    Abstract: We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacements for common layers. This enables composition via a unified abstraction over deterministic and stochastic functions and allows for scalability via the underlying system. These layers capture uncertainty over weights (Bayesian neural ne… ▽ More

    Submitted 5 March, 2019; v1 submitted 10 December, 2018; originally announced December 2018.

    Comments: Code available at https://github.com/tensorflow/tensor2tensor