Showing 151–162 of 162 results for author: Gal, Y

  1. arXiv:1703.02910  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Bayesian Active Learning with Image Data

    Authors: Yarin Gal, Riashat Islam, Zoubin Ghahramani

    Abstract: Even though active learning forms an important pillar of machine learning, deep learning tools are not prevalent within it. Deep learning poses several difficulties when used in an active learning setting. First, active learning (AL) methods generally rely on being able to learn and update models from small amounts of data. Recent advances in deep learning, on the other hand, are notorious for the…

    Submitted 8 March, 2017; originally announced March 2017.

  2. arXiv:1512.05287  [pdf, other]

    stat.ML

    A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

    Authors: Yarin Gal, Zoubin Ghahramani

    Abstract: Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This gr…

    Submitted 5 October, 2016; v1 submitted 16 December, 2015; originally announced December 2015.

    Comments: Added clarifications; Published in NIPS 2016

  3. arXiv:1509.04781  [pdf, other]

    stat.ML

    Dirichlet Fragmentation Processes

    Authors: Hong Ge, Yarin Gal, Zoubin Ghahramani

    Abstract: Tree structures are ubiquitous in data across many domains, and many datasets are naturally modelled by unobserved tree structures. In this paper, first we review the theory of random fragmentation processes [Bertoin, 2006], and a number of existing methods for modelling trees, including the popular nested Chinese restaurant process (nCRP). Then we define a general class of probability distributio…

    Submitted 15 September, 2015; originally announced September 2015.

  4. arXiv:1506.02158  [pdf, other]

    stat.ML cs.LG

    Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference

    Authors: Yarin Gal, Zoubin Ghahramani

    Abstract: Convolutional neural networks (CNNs) work well on large datasets. But labelled data is hard to collect, and in some applications larger amounts of data are not available. The problem then is how to use CNNs with small data -- as CNNs overfit quickly. We present an efficient Bayesian CNN, offering better robustness to over-fitting on small data than traditional approaches. This is by placing a prob…

    Submitted 18 January, 2016; v1 submitted 6 June, 2015; originally announced June 2015.

    Comments: 12 pages, 3 figures, ICLR format, updated with reviewer comments

  5. arXiv:1506.02157  [pdf, other]

    stat.ML

    Dropout as a Bayesian Approximation: Appendix

    Authors: Yarin Gal, Zoubin Ghahramani

    Abstract: We show that a neural network with arbitrary depth and non-linearities, with dropout applied before every weight layer, is mathematically equivalent to an approximation to a well known Bayesian model. This interpretation might offer an explanation to some of dropout's key properties, such as its robustness to over-fitting. Our interpretation allows us to reason about uncertainty in deep learning,…

    Submitted 25 May, 2016; v1 submitted 6 June, 2015; originally announced June 2015.

    Comments: 20 pages, 1 figure; ICML proceedings version

  6. arXiv:1506.02142  [pdf, other]

    stat.ML cs.LG

    Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

    Authors: Yarin Gal, Zoubin Ghahramani

    Abstract: Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty. In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. In this paper we develop a new theoretical framework casting dropou…

    Submitted 4 October, 2016; v1 submitted 6 June, 2015; originally announced June 2015.

    Comments: 12 pages, 6 figures; fixed a mistake with standard error and added a new table with updated results (marked "Update [October 2016]"); Published in ICML 2016

  7. arXiv:1503.02424  [pdf, other]

    stat.ML

    Improving the Gaussian Process Sparse Spectrum Approximation by Representing Uncertainty in Frequency Inputs

    Authors: Yarin Gal, Richard Turner

    Abstract: Standard sparse pseudo-input approximations to the Gaussian process (GP) cannot handle complex functions well. Sparse spectrum alternatives attempt to answer this but are known to over-fit. We suggest the use of variational inference for the sparse spectrum approximation to avoid both issues. We model the covariance function with a finite Fourier series approximation and treat it as a random varia…

    Submitted 20 March, 2015; v1 submitted 9 March, 2015; originally announced March 2015.

    Comments: 13 pages, 3 figures

  8. arXiv:1503.02182  [pdf, other]

    stat.ML

    Latent Gaussian Processes for Distribution Estimation of Multivariate Categorical Data

    Authors: Yarin Gal, Yutian Chen, Zoubin Ghahramani

    Abstract: Multivariate categorical data occur in many applications of machine learning. One of the main difficulties with these vectors of categorical variables is sparsity. The number of possible observations grows exponentially with vector length, but dataset diversity might be poor in comparison. Recent models have gained significant improvement in supervised tasks with this data. These models embed obse…

    Submitted 7 March, 2015; originally announced March 2015.

    Comments: 11 pages, 6 figures

  9. arXiv:1402.7265  [pdf, ps, other]

    cs.CL

    Semantics, Modelling, and the Problem of Representation of Meaning -- a Brief Survey of Recent Literature

    Authors: Yarin Gal

    Abstract: Over the past 50 years many have debated what representation should be used to capture the meaning of natural language utterances. Recently new needs of such representations have been raised in research. Here I survey some of the interesting representations suggested to answer for these new needs.

    Submitted 28 February, 2014; originally announced February 2014.

    Comments: 15 pages, no figures

  10. arXiv:1402.1412  [pdf, other]

    stat.ML

    Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models - a Gentle Tutorial

    Authors: Yarin Gal, Mark van der Wilk

    Abstract: In this tutorial we explain the inference procedures developed for the sparse Gaussian process (GP) regression and Gaussian process latent variable model (GPLVM). Due to page limit the derivation given in Titsias (2009) and Titsias & Lawrence (2010) is brief, hence getting a full picture of it requires collecting results from several different sources and a substantial amount of algebra to fill-in…

    Submitted 29 September, 2014; v1 submitted 6 February, 2014; originally announced February 2014.

    Comments: 20 pages, no figures

  11. arXiv:1402.1389  [pdf, other]

    stat.ML cs.LG

    Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models

    Authors: Yarin Gal, Mark van der Wilk, Carl E. Rasmussen

    Abstract: Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have been applied to both regression and non-linear dimensionality reduction, and offer desirable properties such as uncertainty estimates, robustness to over-fitting, and principled ways for tuning hyper-parameters. However the scalability of these models to big datasets remains an active topic of resear…

    Submitted 29 September, 2014; v1 submitted 6 February, 2014; originally announced February 2014.

    Comments: 9 pages, 8 figures

  12. arXiv:1401.3426  [pdf]

    cs.GT cs.AI

    Networks of Influence Diagrams: A Formalism for Representing Agents' Beliefs and Decision-Making Processes

    Authors: Yaakov Gal, Avi Pfeffer

    Abstract: This paper presents Networks of Influence Diagrams (NID), a compact, natural and highly expressive language for reasoning about agents' beliefs and decision-making processes. NIDs are graphical structures in which agents' mental models are represented as nodes in a network; a mental model for an agent may itself use descriptions of the mental models of other agents. NIDs are demonstrated by examples…

    Submitted 14 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 33, pages 109-147, 2008