Showing 51–100 of 107 results for author: Ghahramani, Z

  1. arXiv:1506.02157  [pdf, other]

    stat.ML

    Dropout as a Bayesian Approximation: Appendix

    Authors: Yarin Gal, Zoubin Ghahramani

    Abstract: We show that a neural network with arbitrary depth and non-linearities, with dropout applied before every weight layer, is mathematically equivalent to an approximation to a well known Bayesian model. This interpretation might offer an explanation to some of dropout's key properties, such as its robustness to over-fitting. Our interpretation allows us to reason about uncertainty in deep learning,…

    Submitted 25 May, 2016; v1 submitted 6 June, 2015; originally announced June 2015.

    Comments: 20 pages, 1 figure; ICML proceedings version

  2. arXiv:1506.02142  [pdf, other]

    stat.ML cs.LG

    Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

    Authors: Yarin Gal, Zoubin Ghahramani

    Abstract: Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty. In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. In this paper we develop a new theoretical framework casting dropou…

    Submitted 4 October, 2016; v1 submitted 6 June, 2015; originally announced June 2015.

    Comments: 12 pages, 6 figures; fixed a mistake with standard error and added a new table with updated results (marked "Update [October 2016]"); Published in ICML 2016

  3. arXiv:1505.03906  [pdf, other]

    stat.ML cs.LG

    Training generative neural networks via Maximum Mean Discrepancy optimization

    Authors: Gintare Karolina Dziugaite, Daniel M. Roy, Zoubin Ghahramani

    Abstract: We consider training a deep neural network to generate samples from an unknown distribution given i.i.d. data. We frame learning as an optimization minimizing a two-sample test statistic---informally speaking, a good generator network produces samples that cause a two-sample test to fail to reject the null hypothesis. As our two-sample test statistic, we use an unbiased estimate of the maximum mea…

    Submitted 14 May, 2015; originally announced May 2015.

    Comments: 10 pages, to appear in Uncertainty in Artificial Intelligence (UAI) 2015

  4. Bayesian cluster analysis: Point estimation and credible balls

    Authors: Sara Wade, Zoubin Ghahramani

    Abstract: Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to classical algorithms which return a single clustering solution, Bayesian nonparametric models provide a posterior over the entire space of partitions, allowing one to assess statistical properties, such as uncertainty on the number of clusters. However, an important problem is h…

    Submitted 8 February, 2019; v1 submitted 13 May, 2015; originally announced May 2015.

    Journal ref: Bayesian Anal., Volume 13, Number 2 (2018), 559-626

  5. arXiv:1505.00428  [pdf, other]

    stat.ML

    A Linear-Time Particle Gibbs Sampler for Infinite Hidden Markov Models

    Authors: Nilesh Tripuraneni, Shane Gu, Hong Ge, Zoubin Ghahramani

    Abstract: Infinite Hidden Markov Models (iHMM's) are an attractive, nonparametric generalization of the classical Hidden Markov Model which can automatically infer the number of hidden states in the system. However, due to the infinite-dimensional nature of transition dynamics performing inference in the iHMM is difficult. In this paper, we present an infinite-state Particle Gibbs (PG) algorithm to resample…

    Submitted 9 June, 2015; v1 submitted 3 May, 2015; originally announced May 2015.

  6. arXiv:1504.07027  [pdf, ps, other]

    stat.ML

    On Sparse variational methods and the Kullback-Leibler divergence between stochastic processes

    Authors: Alexander G. de G. Matthews, James Hensman, Richard E. Turner, Zoubin Ghahramani

    Abstract: The variational framework for learning inducing variables (Titsias, 2009a) has had a large impact on the Gaussian process literature. The framework may be interpreted as minimizing a rigorously defined Kullback-Leibler divergence between the approximating and posterior processes. To our knowledge this connection has thus far gone unremarked in the literature. In this paper we give a substantial ge…

    Submitted 4 December, 2015; v1 submitted 27 April, 2015; originally announced April 2015.

    Comments: 9 pages. No figures

  7. arXiv:1503.02182  [pdf, other]

    stat.ML

    Latent Gaussian Processes for Distribution Estimation of Multivariate Categorical Data

    Authors: Yarin Gal, Yutian Chen, Zoubin Ghahramani

    Abstract: Multivariate categorical data occur in many applications of machine learning. One of the main difficulties with these vectors of categorical variables is sparsity. The number of possible observations grows exponentially with vector length, but dataset diversity might be poor in comparison. Recent models have gained significant improvement in supervised tasks with this data. These models embed obse…

    Submitted 7 March, 2015; originally announced March 2015.

    Comments: 11 pages, 6 figures

  8. arXiv:1502.05312  [pdf, other]

    stat.ML

    Predictive Entropy Search for Bayesian Optimization with Unknown Constraints

    Authors: José Miguel Hernández-Lobato, Michael A. Gelbart, Matthew W. Hoffman, Ryan P. Adams, Zoubin Ghahramani

    Abstract: Unknown constraints arise in many types of expensive black-box optimization problems. Several methods have been proposed recently for performing Bayesian optimization with constraints, based on the expected improvement (EI) heuristic. However, EI can lead to pathologies when used with constraints. For example, in the case of decoupled constraints---i.e., when one can independently evaluate the obj…

    Submitted 15 July, 2015; v1 submitted 18 February, 2015; originally announced February 2015.

  9. arXiv:1501.04684  [pdf, other]

    cs.AI cs.PL

    Slice Sampling for Probabilistic Programming

    Authors: Razvan Ranca, Zoubin Ghahramani

    Abstract: We introduce the first, general purpose, slice sampling inference engine for probabilistic programs. This engine is released as part of StocPy, a new Turing-Complete probabilistic programming language, available as a Python library. We present a transdimensional generalisation of slice sampling which is necessary for the inference engine to work on traces with different numbers of random variables…

    Submitted 19 January, 2015; originally announced January 2015.

    Comments: 11 pages

  10. arXiv:1411.2005  [pdf, other]

    stat.ML

    Scalable Variational Gaussian Process Classification

    Authors: James Hensman, Alex Matthews, Zoubin Ghahramani

    Abstract: Gaussian process classification is a popular method with a number of appealing properties. We show how to scale the model within a variational inducing point framework, outperforming the state of the art on benchmark datasets. Importantly, the variational formulation can be exploited to allow classification in problems with millions of data points, as we demonstrate in experiments.

    Submitted 7 November, 2014; originally announced November 2014.

    Comments: 16 pages, 9 figures

  11. arXiv:1411.1690  [pdf, other]

    stat.ML

    Sublinear-Time Approximate MCMC Transitions for Probabilistic Programs

    Authors: Yutian Chen, Vikash Mansinghka, Zoubin Ghahramani

    Abstract: Probabilistic programming languages can simplify the development of machine learning techniques, but only if inference is sufficiently scalable. Unfortunately, Bayesian parameter estimation for highly coupled models such as regressions and state-space models still scales poorly; each MCMC transition takes linear time in the number of observations. This paper describes a sublinear-time algorithm fo…

    Submitted 9 March, 2015; v1 submitted 6 November, 2014; originally announced November 2014.

  12. arXiv:1408.3378  [pdf, other]

    stat.ML

    Beta diffusion trees and hierarchical feature allocations

    Authors: Creighton Heaukulani, David A. Knowles, Zoubin Ghahramani

    Abstract: We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. A generative process for the tree structure is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet diffusion tree (Neal, 2003), which defines a tree structure…

    Submitted 3 April, 2015; v1 submitted 14 August, 2014; originally announced August 2014.

    Comments: 43 pages, 13 figures. Major revision to the proof of Thm. 2. Large portions of Chs. 2 & 4 moved into the appendix. Added Fig. 4. Revisions throughout

  13. arXiv:1408.2061  [pdf]

    cs.LG stat.ML

    Warped Mixtures for Nonparametric Cluster Shapes

    Authors: Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

    Abstract: A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters. To produce more appropriate clusterings, we introduce a model which warps a latent mixture of Gaussians to produce nonparametric cluster shapes. The possibly low-dimensional latent mixture model allows us to summarize the properties of the high-dimensional clusters (or density ma…

    Submitted 9 August, 2014; originally announced August 2014.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-311-320

  14. arXiv:1406.2541  [pdf, other]

    stat.ML cs.LG

    Predictive Entropy Search for Efficient Global Optimization of Black-box Functions

    Authors: José Miguel Hernández-Lobato, Matthew W. Hoffman, Zoubin Ghahramani

    Abstract: We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution…

    Submitted 10 June, 2014; originally announced June 2014.

  15. arXiv:1406.0873  [pdf, other]

    stat.ML

    Linear Dimensionality Reduction: Survey, Insights, and Generalizations

    Authors: John P. Cunningham, Zoubin Ghahramani

    Abstract: Linear dimensionality reduction methods are a cornerstone of analyzing high dimensional data, due to their simple geometric interpretations and typically attractive computational properties. These methods capture many data features of interest, such as covariance, dynamical structure, correlation between data sets, input-output relationships, and margin between data classes. Methods have been deve…

    Submitted 18 March, 2016; v1 submitted 3 June, 2014; originally announced June 2014.

    Comments: 42 pages, 5 figures, 1 table

    Journal ref: Journal of Machine Learning Research. 16(Dec): 2859-2900, 2015

  16. arXiv:1405.4141  [pdf, other]

    stat.ML stat.CO stat.ME

    Classification using log Gaussian Cox processes

    Authors: Alexander G. de G. Matthews, Zoubin Ghahramani

    Abstract: McCullagh and Yang (2006) suggest a family of classification algorithms based on Cox processes. We further investigate the log Gaussian variant which has a number of appealing properties. Conditioned on the covariates, the distribution over labels is given by a type of conditional Markov random field. In the supervised case, computation of the predictive probability of a single test point scales l…

    Submitted 20 June, 2014; v1 submitted 16 May, 2014; originally announced May 2014.

    Comments: 17 pages, 6 figures

  17. arXiv:1403.4206  [pdf, other]

    stat.ML

    A reversible infinite HMM using normalised random measures

    Authors: Konstantina Palla, David A. Knowles, Zoubin Ghahramani

    Abstract: We present a nonparametric prior over reversible Markov chains. We use completely random measures, specifically gamma processes, to construct a countably infinite graph with weighted edges. By enforcing symmetry to make the edges undirected we define a prior over random walks on graphs that results in a reversible Markov chain. The resulting prior over infinite transition matrices is closely relat…

    Submitted 17 March, 2014; originally announced March 2014.

    Comments: 9 pages, 6 figures

  18. arXiv:1402.5836  [pdf, other]

    stat.ML cs.LG

    Avoiding pathologies in very deep networks

    Authors: David Duvenaud, Oren Rippel, Ryan P. Adams, Zoubin Ghahramani

    Abstract: Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance. To shed light on this problem, we analyze the analogous problem of constructing useful priors on compositions of functions. Specifically, we study the deep Gaussian process, a type of infinitely-wide, deep neural network. We show that in standard architectures, the represen…

    Submitted 8 July, 2016; v1 submitted 24 February, 2014; originally announced February 2014.

    Comments: Fixed a typo regarding number of layers

  19. arXiv:1402.4306  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Student-t Processes as Alternatives to Gaussian Processes

    Authors: Amar Shah, Andrew Gordon Wilson, Zoubin Ghahramani

    Abstract: We investigate the Student-t process as an alternative to the Gaussian process as a nonparametric prior over functions. We derive closed form expressions for the marginal likelihood and predictive distribution of a Student-t process, by integrating away an inverse Wishart process prior over the covariance kernel of a Gaussian process model. We show surprising equivalences between different hierarc…

    Submitted 19 February, 2014; v1 submitted 18 February, 2014; originally announced February 2014.

    Comments: 13 pages, 6 figures, 1 table. To appear in "The Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014."

  20. arXiv:1402.4304  [pdf, other]

    stat.ML cs.LG

    Automatic Construction and Natural-Language Description of Nonparametric Regression Models

    Authors: James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

    Abstract: This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural-language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important c…

    Submitted 24 April, 2014; v1 submitted 18 February, 2014; originally announced February 2014.

  21. arXiv:1402.4293  [pdf, other]

    stat.ML cs.LG

    The Random Forest Kernel and other kernels for big data from random partitions

    Authors: Alex Davies, Zoubin Ghahramani

    Abstract: We present Random Partition Kernels, a new class of kernels derived by demonstrating a natural connection between random partitions of objects and kernels between those objects. We show how the construction can be used to create kernels from methods that would not normally be viewed as random partitions, such as Random Forest. To demonstrate the potential of this method, we propose two new kernels…

    Submitted 18 February, 2014; originally announced February 2014.

  22. arXiv:1402.3085  [pdf, other]

    stat.ME stat.ML

    Gaussian Process Volatility Model

    Authors: Yue Wu, Jose Miguel Hernandez Lobato, Zoubin Ghahramani

    Abstract: The accurate prediction of time-changing variances is an important task in the modeling of financial data. Standard econometric models are often limited as they assume rigid functional relationships for the variances. Moreover, function parameters are usually learned using maximum likelihood, which can lead to overfitting. To address these problems we introduce a novel model for time-changing vari…

    Submitted 13 February, 2014; originally announced February 2014.

    MSC Class: 62P05

  23. arXiv:1402.0119  [pdf, other]

    stat.ML cs.LG

    Randomized Nonlinear Component Analysis

    Authors: David Lopez-Paz, Suvrit Sra, Alex Smola, Zoubin Ghahramani, Bernhard Schölkopf

    Abstract: Classical methods such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are ubiquitous in statistics. However, these techniques are only able to reveal linear relationships in data. Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale. In a separate strand of recent research, randomized methods have…

    Submitted 13 May, 2014; v1 submitted 1 February, 2014; originally announced February 2014.

    Comments: Appearing in ICML 2014

  24. arXiv:1309.6862  [pdf]

    cs.LG stat.ML

    Determinantal Clustering Processes - A Nonparametric Bayesian Approach to Kernel Based Semi-Supervised Clustering

    Authors: Amar Shah, Zoubin Ghahramani

    Abstract: Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high dimensional dat…

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-566-575

  25. arXiv:1309.6858  [pdf]

    cs.LG stat.ML

    The Supervised IBP: Neighbourhood Preserving Infinite Latent Feature Models

    Authors: Novi Quadrianto, Viktoriia Sharmanska, David A. Knowles, Zoubin Ghahramani

    Abstract: We propose a probabilistic model to infer supervised latent variables in the Hamming space from observed data. Our model allows simultaneous inference of the number of binary latent variables, and their values. The latent variables preserve neighbourhood structure of the data in a sense that objects in the same semantic concept have similar latent values, and objects in different concepts have dis…

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-527-536

  26. arXiv:1307.3846  [pdf, other]

    stat.ML cs.LG

    Bayesian Structured Prediction Using Gaussian Processes

    Authors: Sebastien Bratieres, Novi Quadrianto, Zoubin Ghahramani

    Abstract: We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M3N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference…

    Submitted 15 July, 2013; originally announced July 2013.

    Comments: 8 pages with figures

  27. arXiv:1305.4268  [pdf, other]

    stat.ME stat.ML

    Dynamic Covariance Models for Multivariate Financial Time Series

    Authors: Yue Wu, José Miguel Hernández-Lobato, Zoubin Ghahramani

    Abstract: The accurate prediction of time-changing covariances is an important problem in the modeling of multivariate financial data. However, some of the most popular models suffer from a) overfitting problems and multiple local optima, b) failure to capture shifts in market conditions and c) large computational costs. To address these problems we introduce a novel dynamic model for time-changing covarian…

    Submitted 2 June, 2013; v1 submitted 18 May, 2013; originally announced May 2013.

  28. arXiv:1304.3577  [pdf, other]

    q-bio.GN stat.ML

    Identifying cancer subtypes in glioblastoma by combining genomic, transcriptomic and epigenomic data

    Authors: Richard S. Savage, Zoubin Ghahramani, Jim E. Griffin, Paul Kirk, David L. Wild

    Abstract: We present a nonparametric Bayesian method for disease subtype discovery in multi-dimensional cancer data. Our method can simultaneously analyse a wide range of data types, allowing for both agreement and disagreement between their underlying clustering structure. It includes feature selection and infers the most likely number of disease subtypes, given the data. We apply the method to 277 gliob…

    Submitted 15 April, 2013; v1 submitted 12 April, 2013; originally announced April 2013.

    Journal ref: International Conference on Machine Learning (ICML) 2012: Workshop on Machine Learning in Genetics and Genomics

  29. arXiv:1304.3285  [pdf, other]

    stat.ML cs.LG

    Scaling the Indian Buffet Process via Submodular Maximization

    Authors: Colorado Reed, Zoubin Ghahramani

    Abstract: Inference for latent feature models is inherently difficult as the inference space grows exponentially with the size of the input data and number of latent features. In this work, we use Kurihara & Welling (2008)'s maximization-expectation framework to perform approximate MAP inference for linear-Gaussian latent feature models with an Indian Buffet Process (IBP) prior. This formulation yields a su…

    Submitted 24 July, 2013; v1 submitted 11 April, 2013; originally announced April 2013.

    Comments: 13 pages, 8 figures

    Journal ref: In ICML 2013: JMLR W&CP 28 (3): 1013-1021, 2013

  30. arXiv:1303.3265  [pdf, other]

    stat.ML

    A dependent partition-valued process for multitask clustering and time evolving network modelling

    Authors: Konstantina Palla, David A. Knowles, Zoubin Ghahramani

    Abstract: The fundamental aim of clustering algorithms is to partition data points. We consider tasks where the discovered partition is allowed to vary with some covariate such as space or time. One approach would be to use fragmentation-coagulation processes, but these, being Markov processes, are restricted to linear or tree structured covariate spaces. We define a partition-valued process on an arbitrary…

    Submitted 31 October, 2013; v1 submitted 13 March, 2013; originally announced March 2013.

    Comments: 9 pages, 7 figures, submitted for review

  31. arXiv:1302.4922  [pdf, other]

    stat.ML cs.LG stat.ME

    Structure Discovery in Nonparametric Regression through Compositional Kernel Search

    Authors: David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

    Abstract: Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art. We define a space of kernel structures which are built compositionally by adding and multiplying a small number of base kernels. We present a method for searching over this space of structures which mirrors the scientific discovery process. The learned structures can often decompose…

    Submitted 13 May, 2013; v1 submitted 20 February, 2013; originally announced February 2013.

    Comments: 9 pages, 7 figures, To appear in proceedings of the 2013 International Conference on Machine Learning

    ACM Class: G.3; I.2.6

  32. arXiv:1302.3979  [pdf, other]

    stat.ME stat.ML

    Gaussian Process Vine Copulas for Multivariate Dependence

    Authors: David Lopez-Paz, José Miguel Hernández-Lobato, Zoubin Ghahramani

    Abstract: Copulas allow to learn marginal distributions separately from the multivariate dependence structure (copula) that links them together into a density function. Vine factorizations ease the learning of high-dimensional copulas by constructing a hierarchy of conditional bivariate copulas. However, to simplify inference, it is common to assume that each of these conditional bivariate copulas is indepe…

    Submitted 16 February, 2013; originally announced February 2013.

    Comments: Accepted to International Conference in Machine Learning (ICML 2013)

  33. arXiv:1212.2490  [pdf]

    cs.LG stat.ML

    On the Convergence of Bound Optimization Algorithms

    Authors: Ruslan R Salakhutdinov, Sam T Roweis, Zoubin Ghahramani

    Abstract: Many practitioners who use the EM algorithm complain that it is sometimes slow. When does this happen, and what can be done about it? In this paper, we study the general class of bound optimization algorithms - including Expectation-Maximization, Iterative Scaling and CCCP - and their relationship to direct optimization algorithms such as gradient-based methods for parameter learning. We d…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-509-516

  34. arXiv:1209.1145  [pdf, other]

    stat.ME stat.ML

    Restricting exchangeable nonparametric distributions

    Authors: Sinead Williamson, Zoubin Ghahramani, Steven N. MacEachern, Eric P. Xing

    Abstract: Distributions over exchangeable matrices with infinitely many columns, such as the Indian buffet process, are useful in constructing nonparametric latent variable models. However, the distribution implied by such models over the number of features exhibited by each data point may be poorly suited for many modeling tasks. In this paper, we propose a class of exchangeable nonparametric priors obtai…

    Submitted 5 September, 2012; originally announced September 2012.

  35. arXiv:1207.4525  [pdf, other]

    cs.AI cs.DB cs.IR

    SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

    Authors: Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani

    Abstract: The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and answer complex queries. However, the efficient alignment of large-scale knowledge bases still poses a considerable challe…

    Submitted 18 July, 2012; originally announced July 2012.

    Comments: 10 pages + 2 pages appendix; 5 figures -- initial preprint

    ACM Class: I.2.4; H.3.4; D.2.12

  36. arXiv:1207.4134  [pdf]

    cs.LG stat.ML

    Bayesian Learning in Undirected Graphical Models: Approximate MCMC algorithms

    Authors: Iain Murray, Zoubin Ghahramani

    Abstract: Bayesian learning in undirected graphical models (computing posterior distributions over parameters and predictive quantities) is exceptionally difficult. We conjecture that for general undirected models, there are no tractable MCMC (Markov Chain Monte Carlo) schemes giving the correct equilibrium distribution over parameters. While this intractability, due to the partition function, is familiar to…

    Submitted 11 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-392-399

  37. arXiv:1206.6874  [pdf]

    stat.ME cs.AI

    Bayesian Inference for Gaussian Mixed Graph Models

    Authors: Ricardo Silva, Zoubin Ghahramani

    Abstract: We introduce priors and algorithms to perform Bayesian inference in Gaussian models defined by acyclic directed mixed graphs. Such a class of graphs, composed of directed and bi-directed edges, is a representation of conditional independencies that is closed under marginalization and arises naturally from causal models which allow for unmeasured confounding. Monte Carlo methods and a variational a…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-453-460

  38. arXiv:1206.6873  [pdf]

    cs.LG stat.ML

    Variable noise and dimensionality reduction for sparse Gaussian processes

    Authors: Edward Snelson, Zoubin Ghahramani

    Abstract: The sparse pseudo-input Gaussian process (SPGP) is a new approximation method for speeding up GP regression in the case of a large number of data points N. The approximation is controlled by the gradient optimization of a small set of M `pseudo-inputs', thereby reducing complexity from N^3 to NM^2. One limitation of the SPGP is that this optimization space becomes impractically big for high dimens…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-461-468

  39. arXiv:1206.6865  [pdf]

    cs.LG cs.AI stat.ML

    A Non-Parametric Bayesian Method for Inferring Hidden Causes

    Authors: Frank Wood, Thomas Griffiths, Zoubin Ghahramani

    Abstract: We present a non-parametric Bayesian approach to structure learning with hidden causes. Previous Bayesian treatments of this problem define a prior over the number of hidden causes and use algorithms such as reversible jump Markov chain Monte Carlo to move between solutions. In contrast, we assume that the number of hidden causes is unbounded, but only a finite number influence observable variable…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-536-543

  40. arXiv:1206.6848  [pdf]

    stat.CO stat.ME

    MCMC for doubly-intractable distributions

    Authors: Iain Murray, Zoubin Ghahramani, David MacKay

    Abstract: Markov Chain Monte Carlo (MCMC) algorithms are routinely used to draw samples from distributions with intractable normalization constants. However, standard MCMC algorithms do not apply to doubly-intractable distributions in which there are additional parameter-dependent normalization terms; for example, the posterior over parameters of an undirected graphical model. An ingenious auxiliary-variabl…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-359-366

  41. arXiv:1206.6416  [pdf]

    cs.LG stat.ML

    An Infinite Latent Attribute Model for Network Data

    Authors: Konstantina Palla, David Knowles, Zoubin Ghahramani

    Abstract: Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These m…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  42. arXiv:1206.4682  [pdf]

    cs.LG math.ST stat.ML

    Copula-based Kernel Dependency Measures

    Authors: Barnabas Poczos, Zoubin Ghahramani, Jeff Schneider

    Abstract: The paper presents a new copula-based method for measuring dependence between random variables. Our approach extends the Maximum Mean Discrepancy to the copula of the joint distribution. We prove that this approach has several advantageous properties. Similarly to Shannon mutual information, the proposed dependence measure is invariant to any strictly increasing transformation of the marginal vari…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML2012

  43. arXiv:1206.1846  [pdf, other]

    stat.ML

    Warped Mixtures for Nonparametric Cluster Shapes

    Authors: Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

    Abstract: A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters. To produce more appropriate clusterings, we introduce a model which warps a latent mixture of Gaussians to produce nonparametric cluster shapes. The possibly low-dimensional latent mixture model allows us to summarize the properties of the high-dimensional clusters (or density ma…

    Submitted 21 March, 2013; v1 submitted 8 June, 2012; originally announced June 2012.

    Comments: 10 pages, 6 figures, submitted for review

    ACM Class: I.5.3

  44. arXiv:1205.2650  [pdf]

    cs.LG stat.ML

    Correlated Non-Parametric Latent Feature Models

    Authors: Finale Doshi-Velez, Zoubin Ghahramani

    Abstract: We are often interested in explaining data through a set of hidden factors or features. When the number of hidden features is unknown, the Indian Buffet Process (IBP) is a nonparametric latent feature model that does not bound the number of active features in a dataset. However, the IBP assumes that all latent features are uncorrelated, making it inadequate for many real-world problems. We introduce…

    Submitted 9 May, 2012; originally announced May 2012.

    Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

    Report number: UAI-P-2009-PG-143-150

  45. arXiv:1112.5745  [pdf, other]

    stat.ML cs.LG

    Bayesian Active Learning for Classification and Preference Learning

    Authors: Neil Houlsby, Ferenc Huszár, Zoubin Ghahramani, Máté Lengyel

    Abstract: Information theoretic active learning has been widely studied for probabilistic models. For simple regression an optimal myopic policy is easily tractable. However, for other tasks and with more complex models, such as classification with nonparametric models, the optimal solution is harder to compute. Current approaches make approximations to achieve tractability. We propose an approach that expr…

    Submitted 24 December, 2011; originally announced December 2011.

  46. arXiv:1110.4411  [pdf, other]

    stat.ML q-fin.ST stat.ME

    Gaussian Process Regression Networks

    Authors: Andrew Gordon Wilson, David A. Knowles, Zoubin Ghahramani

    Abstract: We introduce a new regression framework, Gaussian process regression networks (GPRN), which combines the structural properties of Bayesian neural networks with the non-parametric flexibility of Gaussian processes. This model accommodates input dependent signal and noise correlations between multiple response variables, input dependent length-scales and amplitudes, and heavy-tailed predictive distr…

    Submitted 19 October, 2011; originally announced October 2011.

    Comments: 17 pages, 3 figures, 1 table. Submitted for publication

  47. arXiv:1106.2494  [pdf, other]

    stat.ML

    Pitman-Yor Diffusion Trees

    Authors: David A. Knowles, Zoubin Ghahramani

    Abstract: We introduce the Pitman Yor Diffusion Tree (PYDT) for hierarchical clustering, a generalization of the Dirichlet Diffusion Tree (Neal, 2001) which removes the restriction to binary branching structure. The generative process is described and shown to result in an exchangeable distribution over data points. We prove some theoretical properties of the model and then present two inference methods: a…

    Submitted 16 June, 2011; v1 submitted 13 June, 2011; originally announced June 2011.

    Comments: 8 pages, to be presented at UAI 2011

    MSC Class: 62G07; 62H30 ACM Class: G.3.7

  48. arXiv:1106.1157  [pdf, other]

    cs.LG cs.AI stat.ML

    Bayesian and L1 Approaches to Sparse Unsupervised Learning

    Authors: Shakir Mohamed, Katherine Heller, Zoubin Ghahramani

    Abstract: The use of L1 regularisation for sparse learning has generated immense research interest, with successful application in such diverse areas as signal acquisition, image coding, genomics and collaborative filtering. While existing work highlights the many advantages of L1 methods, in this paper we find that L1 regularisation often dramatically underperforms in terms of predictive performance when c…

    Submitted 17 August, 2012; v1 submitted 6 June, 2011; originally announced June 2011.

    Comments: In Proceedings of the 29th International Conference on Machine Learning (ICML), Edinburgh, Scotland, 2012

  49. arXiv:1101.0240  [pdf, other]

    stat.ME math.PR q-fin.CP q-fin.ST stat.ML

    Generalised Wishart Processes

    Authors: Andrew Gordon Wilson, Zoubin Ghahramani

    Abstract: We introduce a stochastic process with Wishart marginals: the generalised Wishart process (GWP). It is a collection of positive semi-definite random matrices indexed by any arbitrary dependent variable. We use it to model dynamic (e.g. time varying) covariance matrices. Unlike existing models, it can capture a diverse class of covariance structures, it can easily handle missing data, the dependent…

    Submitted 31 December, 2010; originally announced January 2011.

    Comments: 14 pages, 4 figures, 1 table. Submitted for publication

  50. arXiv:1011.6293  [pdf, ps, other]

    stat.AP cs.AI stat.ML

    Nonparametric Bayesian sparse factor models with application to gene expression modeling

    Authors: David Knowles, Zoubin Ghahramani

    Abstract: A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data $\mathbf{Y}$ is modeled as a linear superposition, $\mathbf{G}$, of a potentially infinite number of hidden factors, $\mathbf{X}$. The Indian Buffet Process (IBP) is used as a prior on $\mathbf{G}$ to incorporate sparsity and to allow the number of latent features to be inferred. The model's utility for mode…

    Submitted 28 July, 2011; v1 submitted 29 November, 2010; originally announced November 2010.

    Comments: Published at http://dx.doi.org/10.1214/10-AOAS435 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS435

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 2B, 1534-1552