
Showing 1–12 of 12 results for author: Collier, M

Searching in archive stat.
  1. arXiv:2301.12860  [pdf, other]

    cs.LG stat.ML

    Massively Scaling Heteroscedastic Classifiers

    Authors: Mark Collier, Rodolphe Jenatton, Basil Mustafa, Neil Houlsby, Jesse Berent, Effrosyni Kokiopoulou

    Abstract: Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes. However, compared to standard classifiers, they introduce extra parameters that scale linearly with the number of classes. This makes them infeasible to apply to larger-scale problems. In additi…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to ICLR 2023

  2. arXiv:2207.07411  [pdf, other]

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, et al. (1 additional author not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  3. arXiv:2110.02609  [pdf, other]

    stat.ML cs.LG

    Deep Classifiers with Label Noise Modeling and Distance Awareness

    Authors: Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncert…

    Submitted 8 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR

  4. arXiv:2105.10305  [pdf, other]

    cs.LG cs.CV stat.ML

    Correlated Input-Dependent Label Noise in Large-Scale Image Classification

    Authors: Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

    Abstract: Large-scale image classification datasets often contain noisy labels. We take a principled probabilistic approach to modelling input-dependent, also known as heteroscedastic, label noise in these datasets. We place a multivariate Normal distributed latent variable on the final hidden layer of a neural network classifier. The covariance matrix of this latent variable models the aleatoric uncertain…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: Accepted as Oral at CVPR 2021

  5. arXiv:2009.04381  [pdf, other]

    cs.LG stat.ML

    Routing Networks with Co-training for Continual Learning

    Authors: Mark Collier, Efi Kokiopoulou, Andrea Gesmundo, Jesse Berent

    Abstract: The core challenge with continual learning is catastrophic forgetting, the phenomenon that when neural networks are trained on a sequence of tasks they rapidly forget previously learned tasks. It has been observed that catastrophic forgetting is most severe when tasks are dissimilar to each other. We propose the use of sparse routing networks for continual learning. For each input, these network a…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: Presented at ICML Workshop on Continual Learning 2020

  6. arXiv:2006.05301  [pdf, other]

    cs.LG stat.ML

    VAEs in the Presence of Missing Data

    Authors: Mark Collier, Alfredo Nazabal, Christopher K. I. Williams

    Abstract: Real-world datasets often contain entries with missing elements; e.g., in a medical dataset, a patient is unlikely to have taken all possible diagnostic tests. Variational Autoencoders (VAEs) are popular generative models often used for unsupervised learning. Despite their widespread use, it is unclear how best to apply VAEs to datasets with missing data. We develop a novel latent variable model of a…

    Submitted 21 March, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: Accepted to ICML Workshop on the Art of Learning with Missing Values (Artemiss), 17 July 2020

  7. arXiv:2003.06778  [pdf, other]

    cs.LG stat.ML

    A Simple Probabilistic Method for Deep Classification under Input-Dependent Label Noise

    Authors: Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

    Abstract: Datasets with noisy labels are a common occurrence in practical applications of classification methods. We propose a simple probabilistic method for training deep classifiers under input-dependent (heteroscedastic) label noise. We assume an underlying heteroscedastic generative process for noisy labels. To make gradient-based training feasible, we use a temperature-parameterized softmax as a smooth…

    Submitted 12 November, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

  8. arXiv:1909.08994  [pdf, ps, other]

    cs.LG stat.ML

    Scalable Deep Unsupervised Clustering with Concrete GMVAEs

    Authors: Mark Collier, Hector Urdiales

    Abstract: Discrete random variables are natural components of probabilistic clustering models. A number of VAE variants with discrete latent variables have been developed. Training such methods requires marginalizing over the discrete latent variables, causing training time complexity to be linear in the number of clusters. By applying a continuous relaxation to the discrete variables in these methods we can a…

    Submitted 18 September, 2019; originally announced September 2019.

  9. arXiv:1909.08314  [pdf, other]

    cs.LG cs.CL stat.ML

    Memory-Augmented Neural Networks for Machine Translation

    Authors: Mark Collier, Joeran Beel

    Abstract: Memory-augmented neural networks (MANNs) have been shown to outperform other recurrent neural network architectures on a series of artificial sequence learning tasks, yet they have had limited application to real-world tasks. We evaluate direct application of Neural Turing Machines (NTM) and Differentiable Neural Computers (DNC) to machine translation. We further propose and evaluate two models wh…

    Submitted 18 September, 2019; originally announced September 2019.

  10. arXiv:1809.10789  [pdf, other]

    cs.LG stat.ML

    An Empirical Comparison of Syllabuses for Curriculum Learning

    Authors: Mark Collier, Joeran Beel

    Abstract: Syllabuses for curriculum learning have been developed on an ad hoc, per-task basis, and little is known about the relative performance of different syllabuses. We identify a number of syllabuses used in the literature. We compare the identified syllabuses based on their effect on the speed of learning and generalization ability of an LSTM network on three sequential learning tasks. We find that the…

    Submitted 12 November, 2018; v1 submitted 27 September, 2018; originally announced September 2018.

  11. arXiv:1807.09809  [pdf, other]

    cs.LG stat.ML

    Deep Contextual Multi-armed Bandits

    Authors: Mark Collier, Hector Urdiales Llorens

    Abstract: Contextual multi-armed bandit problems arise frequently in important industrial applications. Existing solutions model the context either linearly, which enables uncertainty-driven (principled) exploration, or non-linearly, by using epsilon-greedy exploration policies. Here we present a deep learning framework for contextual multi-armed bandits that is both non-linear and enables principled explor…

    Submitted 25 July, 2018; originally announced July 2018.

  12. arXiv:1807.08518  [pdf, other]

    cs.LG stat.ML

    Implementing Neural Turing Machines

    Authors: Mark Collier, Joeran Beel

    Abstract: Neural Turing Machines (NTMs) are an instance of Memory Augmented Neural Networks, a new class of recurrent neural networks which decouple computation from memory by introducing an external memory unit. NTMs have demonstrated superior performance over Long Short-Term Memory Cells in several sequence learning tasks. A number of open-source implementations of NTMs exist but are unstable during train…

    Submitted 26 July, 2018; v1 submitted 23 July, 2018; originally announced July 2018.