Showing 1–50 of 54 results for author: de Freitas, N

  1. arXiv:2205.13320  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards Learning Universal Hyperparameter Optimizers with Transformers

    Authors: Yutian Chen, Xingyou Song, Chansoo Lee, Zi Wang, Qiuyi Zhang, David Dohan, Kazuya Kawakami, Greg Kochanski, Arnaud Doucet, Marc'aurelio Ranzato, Sagi Perel, Nando de Freitas

    Abstract: Meta-learning hyperparameter optimization (HPO) algorithms from prior experiments is a promising approach to improve optimization efficiency over objective functions from a similar distribution. However, existing methods are restricted to learning from experiments sharing the same set of hyperparameters. In this paper, we introduce the OptFormer, the first text-based Transformer HPO framework that…

    Submitted 13 October, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Published as a conference paper in Neural Information Processing Systems (NeurIPS) 2022. Code can be found in https://github.com/google-research/optformer and Google AI Blog can be found in https://ai.googleblog.com/2022/08/optformer-towards-universal.html

  2. arXiv:2106.10251  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Offline Policy Selection

    Authors: Ksenia Konyushkova, Yutian Chen, Tom Le Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas

    Abstract: This paper addresses the problem of policy selection in domains with abundant logged data, but with a restricted interaction budget. Solving this problem would enable safe evaluation and deployment of offline reinforcement learning policies in industry, robotics, and recommendation domains among others. Several off-policy evaluation (OPE) techniques have been proposed to assess the value of polici…

    Submitted 6 May, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: Presented at NeurIPS 2021

  3. arXiv:2105.10148  [pdf, other]

    cs.LG stat.ML

    On Instrumental Variable Regression for Deep Offline Policy Evaluation

    Authors: Yutian Chen, Liyuan Xu, Caglar Gulcehre, Tom Le Paine, Arthur Gretton, Nando de Freitas, Arnaud Doucet

    Abstract: We show that the popular reinforcement learning (RL) strategy of estimating the state-action value (Q-function) by minimizing the mean squared Bellman error leads to a regression problem with confounding, the inputs and output noise being correlated. Hence, direct minimization of the Bellman error can result in significantly biased Q-function estimates. We explain why fixing the target Q-network i…

    Submitted 23 November, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted by Journal of Machine Learning Research in 11/2022

    Journal ref: Journal of Machine Learning Research 23 (2022) 1-41

  4. arXiv:2011.13885  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Offline Learning from Demonstrations and Unlabeled Experience

    Authors: Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed

    Abstract: Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human…

    Submitted 27 November, 2020; originally announced November 2020.

    Comments: Accepted to Offline Reinforcement Learning Workshop at Neural Information Processing Systems (2020)

  5. arXiv:2010.07154  [pdf, other]

    cs.LG stat.ML

    Learning Deep Features in Instrumental Variable Regression

    Authors: Liyuan Xu, Yutian Chen, Siddarth Srinivasan, Nando de Freitas, Arnaud Doucet, Arthur Gretton

    Abstract: Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by utilizing an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment; and…

    Submitted 27 June, 2023; v1 submitted 14 October, 2020; originally announced October 2020.

  6. arXiv:2007.09055  [pdf, other]

    cs.LG cs.AI stat.ML

    Hyperparameter Selection for Offline Reinforcement Learning

    Authors: Tom Le Paine, Cosmin Paduraru, Andrea Michi, Caglar Gulcehre, Konrad Zolna, Alexander Novikov, Ziyu Wang, Nando de Freitas

    Abstract: Offline reinforcement learning (RL purely from logged data) is an important avenue for deploying RL techniques in real-world scenarios. However, existing hyperparameter selection methods for offline RL break the offline assumption by evaluating policies corresponding to each hyperparameter setting in the environment. This online execution is often infeasible and hence undermines the main aim of of…

    Submitted 17 July, 2020; originally announced July 2020.

  7. arXiv:2006.15134  [pdf, other]

    cs.LG cs.AI stat.ML

    Critic Regularized Regression

    Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

    Abstract: Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction. It addresses challenges with regard to the cost of data collection and safety, both of which are particularly pertinent to real-world applications of RL. Unfortunately, most off-policy algorithms perform poorly when learnin…

    Submitted 22 September, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: 24 pages; presented at NeurIPS 2020

  8. arXiv:2006.13888  [pdf, other]

    cs.LG stat.ML

    RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

    Authors: Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

    Abstract: Offline methods for reinforcement learning have a potential to help bridge the gap between reinforcement learning research and real-world applications. They make it possible to learn policies from offline datasets, thus overcoming concerns associated with online data collection in the real-world, including cost, safety, or ethical concerns. In this paper, we propose a benchmark called RL Unplugged…

    Submitted 12 February, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: NeurIPS paper. 21 pages including supplementary material, the github link for the datasets: https://github.com/deepmind/deepmind-research/rl_unplugged

  9. arXiv:1910.01077  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Task-Relevant Adversarial Imitation Learning

    Authors: Konrad Zolna, Scott Reed, Alexander Novikov, Sergio Gomez Colmenarejo, David Budden, Serkan Cabi, Misha Denil, Nando de Freitas, Ziyu Wang

    Abstract: We show that a critical vulnerability in adversarial imitation is the tendency of discriminator networks to learn spurious associations between visual features and expert labels. When the discriminator focuses on task-irrelevant features, it does not provide an informative reward signal, leading to poor task performance. We analyze this problem in detail and propose a solution that outperforms sta…

    Submitted 12 November, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: Accepted to CoRL 2020 (see presentation here: https://youtu.be/ZgQvFGuEgFU )

  10. arXiv:1909.05557  [pdf, other]

    cs.LG cs.AI stat.ML

    Modular Meta-Learning with Shrinkage

    Authors: Yutian Chen, Abram L. Friesen, Feryal Behbahani, Arnaud Doucet, David Budden, Matthew W. Hoffman, Nando de Freitas

    Abstract: Many real-world problems, including multi-speaker text-to-speech synthesis, can greatly benefit from the ability to meta-learn large models with only a few task-specific components. Updating only these task-specific modules then allows the model to be adapted to low-data tasks for as many steps as necessary without risking overfitting. Unfortunately, existing meta-learning methods either do not sc…

    Submitted 22 October, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

    Comments: Accepted by NeurIPS 2020

  11. arXiv:1905.03030  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-learning of Sequential Strategies

    Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

    Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal pred…

    Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

    Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

  12. arXiv:1812.06855  [pdf, other]

    cs.LG cs.AI stat.ML

    Bayesian Optimization in AlphaGo

    Authors: Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver, Nando de Freitas

    Abstract: During the development of AlphaGo, its many hyper-parameters were tuned with Bayesian optimization multiple times. This automatic tuning process resulted in substantial improvements in playing strength. For example, prior to the match with Lee Sedol, we tuned the latest AlphaGo agent and this improved its win-rate from 50% to 66.5% in self-play games. This tuned version was deployed in the final m…

    Submitted 17 December, 2018; originally announced December 2018.

  13. arXiv:1810.08647  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

    Authors: Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

    Abstract: We propose a unified mechanism for achieving coordination and communication in Multi-Agent Reinforcement Learning (MARL), through rewarding agents for having causal influence over other agents' actions. Causal influence is assessed using counterfactual reasoning. At each timestep, an agent simulates alternate actions that it could have taken, and computes their effect on the behavior of other agen…

    Submitted 18 June, 2019; v1 submitted 19 October, 2018; originally announced October 2018.

  14. arXiv:1809.10460  [pdf, other]

    cs.LG cs.SD stat.ML

    Sample Efficient Adaptive Text-to-Speech

    Authors: Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas

    Abstract: We present a meta-learning approach for adaptive text-to-speech (TTS) with few data. During training, we learn a multi-speaker model using a shared conditional WaveNet core and independent learned embeddings for each speaker. The aim of training is not to produce a neural network with fixed weights, which is then deployed as a TTS system. Instead, the aim is to produce a network that requires few…

    Submitted 16 January, 2019; v1 submitted 27 September, 2018; originally announced September 2018.

    Comments: Accepted by ICLR 2019

  15. arXiv:1805.11592  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Playing hard exploration games by watching YouTube

    Authors: Yusuf Aytar, Tobias Pfaff, David Budden, Tom Le Paine, Ziyu Wang, Nando de Freitas

    Abstract: Deep reinforcement learning methods traditionally struggle with tasks where environment rewards are particularly sparse. One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator. However, these demonstrations are typically collected under artificial conditions, i.e. with access to the agent's exact environment setup and the demonstra…

    Submitted 30 November, 2018; v1 submitted 29 May, 2018; originally announced May 2018.

  16. arXiv:1711.02448  [pdf, other]

    q-bio.NC cs.NE stat.ML

    Cortical microcircuits as gated-recurrent neural networks

    Authors: Rui Ponte Costa, Yannis M. Assael, Brendan Shillingford, Nando de Freitas, Tim P. Vogels

    Abstract: Cortical circuits exhibit intricate recurrent architectures that are remarkably similar across different brain areas. Such stereotyped structure suggests the existence of common computational principles. However, such principles have remained largely elusive. Inspired by gated-memory networks, namely long short-term memory networks (LSTMs), we introduce a recurrent neural network in which informat…

    Submitted 3 January, 2018; v1 submitted 7 November, 2017; originally announced November 2017.

    Comments: To appear in Advances in Neural Information Processing Systems 30 (NIPS 2017). 13 pages, 2 figures (and 1 supp. figure)

  17. arXiv:1706.06383  [pdf, other]

    cs.AI cs.NE stat.ML

    Programmable Agents

    Authors: Misha Denil, Sergio Gómez Colmenarejo, Serkan Cabi, David Saxton, Nando de Freitas

    Abstract: We build deep RL agents that execute declarative programs expressed in formal language. The agents learn to ground the terms in this language in their environment, and can generalize their behavior at test time to execute new programs that refer to objects that were not referenced during training. The agents develop disentangled interpretable representations that allow them to generalize to a wide…

    Submitted 20 June, 2017; originally announced June 2017.

  18. arXiv:1703.04813  [pdf, other]

    cs.LG cs.NE stat.ML

    Learned Optimizers that Scale and Generalize

    Authors: Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein

    Abstract: Learning to learn has emerged as an important direction for achieving artificial intelligence. Two of the primary barriers to its adoption are an inability to scale to larger problems and a limited ability to generalize to new tasks. We introduce a learned gradient descent optimizer that generalizes well to new tasks, and which has significantly reduced memory and computation overhead. We achieve…

    Submitted 7 September, 2017; v1 submitted 14 March, 2017; originally announced March 2017.

    Comments: Final ICML paper after reviewer suggestions

  19. arXiv:1611.03824  [pdf, other]

    stat.ML cs.LG

    Learning to Learn without Gradient Descent by Gradient Descent

    Authors: Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas

    Abstract: We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter t…

    Submitted 12 June, 2017; v1 submitted 11 November, 2016; originally announced November 2016.

    Comments: Accepted by ICML 2017. Previous version "Learning to Learn for Global Optimization of Black Box Functions" was published in the Deep Reinforcement Learning Workshop, NIPS 2016

  20. arXiv:1611.01843  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG cs.NE physics.soc-ph

    Learning to Perform Physics Experiments via Deep Reinforcement Learning

    Authors: Misha Denil, Pulkit Agrawal, Tejas D Kulkarni, Tom Erez, Peter Battaglia, Nando de Freitas

    Abstract: When encountering novel objects, humans are able to infer a wide range of physical properties such as mass, friction and deformability by interacting with them in a goal driven way. This process of active interaction is in the same spirit as a scientist performing experiments to discover hidden facts. Recent advances in artificial intelligence have yielded machines that can achieve superhuman perf…

    Submitted 17 August, 2017; v1 submitted 6 November, 2016; originally announced November 2016.

  21. arXiv:1508.03666  [pdf, other]

    stat.ML

    Unbounded Bayesian Optimization via Regularization

    Authors: Bobak Shahriari, Alexandre Bouchard-Côté, Nando de Freitas

    Abstract: Bayesian optimization has recently emerged as a popular and efficient tool for global optimization and hyperparameter tuning. Currently, the established Bayesian optimization practice requires a user-defined bounding box which is assumed to contain the optimizer. However, when little is known about the probed objective function, it can be difficult to prescribe such bounds. In this work we modify…

    Submitted 14 August, 2015; originally announced August 2015.

    Comments: 9 pages, 4 figures

  22. arXiv:1412.7149  [pdf, other]

    cs.LG cs.NE stat.ML

    Deep Fried Convnets

    Authors: Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, Ziyu Wang

    Abstract: The fully connected layers of a deep convolutional neural network typically contain over 90% of the network parameters, and consume the majority of the memory required to store the network parameters. Reducing the number of parameters while preserving essentially the same predictive performance is critically important for operating deep neural networks in memory constrained environments such as GP…

    Submitted 17 July, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

    Comments: svd experiments included

  23. arXiv:1411.3128  [pdf, other]

    cs.LG stat.ML

    Deep Multi-Instance Transfer Learning

    Authors: Dimitrios Kotzias, Misha Denil, Phil Blunsom, Nando de Freitas

    Abstract: We present a new approach for transferring knowledge from groups to individuals that comprise them. We evaluate our method in text, by inferring the ratings of individual sentences using full-review ratings. This approach, which combines ideas from transfer learning, deep learning and multi-instance learning, reduces the need for laborious human labelling of fine-grained data when abundant labels…

    Submitted 10 December, 2014; v1 submitted 12 November, 2014; originally announced November 2014.

  24. arXiv:1410.7172  [pdf, other]

    cs.LG math.OC stat.ML

    Heteroscedastic Treed Bayesian Optimisation

    Authors: John-Alexander M. Assael, Ziyu Wang, Bobak Shahriari, Nando de Freitas

    Abstract: Optimising black-box functions is important in many disciplines, such as tuning machine learning models, robotics, finance and mining exploration. Bayesian optimisation is a state-of-the-art technique for the global optimisation of black-box functions which are expensive to evaluate. At the core of this approach is a Gaussian process prior that captures our belief about the distribution over funct…

    Submitted 4 March, 2015; v1 submitted 27 October, 2014; originally announced October 2014.

  25. arXiv:1406.7758  [pdf, other]

    stat.ML cs.LG

    Theoretical Analysis of Bayesian Optimisation with Unknown Gaussian Process Hyper-Parameters

    Authors: Ziyu Wang, Nando de Freitas

    Abstract: Bayesian optimisation has gained great popularity as a tool for optimising the parameters of machine learning algorithms and models. Somewhat ironically, setting up the hyper-parameters of Bayesian optimisation methods is notoriously hard. While reasonable practical solutions have been advanced, they can often fail to find the best optima. Surprisingly, there is little theoretical analysis of this…

    Submitted 30 June, 2014; originally announced June 2014.

    Comments: 16 pages, 1 figure

  26. arXiv:1406.4625  [pdf, other]

    stat.ML cs.LG

    An Entropy Search Portfolio for Bayesian Optimization

    Authors: Bobak Shahriari, Ziyu Wang, Matthew W. Hoffman, Alexandre Bouchard-Côté, Nando de Freitas

    Abstract: Bayesian optimization is a sample-efficient method for black-box global optimization. However, the performance of a Bayesian optimization method very much depends on its exploration strategy, i.e. the choice of acquisition function, and it is not clear a priori which choice will result in superior performance. While portfolio methods provide an effective, principled way of combining a collection…

    Submitted 4 March, 2015; v1 submitted 18 June, 2014; originally announced June 2014.

    Comments: 10 pages, 5 figures

  27. arXiv:1406.3830  [pdf, other]

    cs.CL cs.LG stat.ML

    Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

    Authors: Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas

    Abstract: Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the meaning of documents by embedding them in a low dimensional vector space, while preserving distinctions of word and sentence order crucial for capturing nuanced se…

    Submitted 15 June, 2014; originally announced June 2014.

  28. arXiv:1406.3070  [pdf, other]

    stat.ML

    Distributed Parameter Estimation in Probabilistic Graphical Models

    Authors: Yariv Dror Mizrahi, Misha Denil, Nando de Freitas

    Abstract: This paper presents foundational theoretical results on distributed parameter estimation for undirected probabilistic graphical models. It introduces a general condition on composite likelihood decompositions of these models which guarantees the global consistency of distributed estimators, provided the local estimators are consistent.

    Submitted 11 June, 2014; originally announced June 2014.

  29. arXiv:1402.7005  [pdf, other]

    stat.ML cs.LG

    Bayesian Multi-Scale Optimistic Optimization

    Authors: Ziyu Wang, Babak Shakibi, Lin Jin, Nando de Freitas

    Abstract: Bayesian optimization is a powerful global optimization technique for expensive black-box functions. One of its shortcomings is that it requires auxiliary optimization of an acquisition function at each iteration. This auxiliary optimization can be costly and very hard to carry out in practice. Moreover, it creates serious theoretical concerns, as most of the convergence results assume that the ex…

    Submitted 27 February, 2014; originally announced February 2014.

    Comments: 15 pages

  30. arXiv:1310.1415  [pdf, other]

    stat.ML cs.LG

    Narrowing the Gap: Random Forests In Theory and In Practice

    Authors: Misha Denil, David Matheson, Nando de Freitas

    Abstract: Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of random regression forests and prove that our algorithm is consistent. We also provide an empirical evaluation, comparing our algorithm and other theoretically tra…

    Submitted 4 October, 2013; originally announced October 2013.

    Comments: Under review by the International Conference on Machine Learning (ICML) 2014

  31. arXiv:1308.6342  [pdf, other]

    stat.ML cs.LG

    Linear and Parallel Learning of Markov Random Fields

    Authors: Yariv Dror Mizrahi, Misha Denil, Nando de Freitas

    Abstract: We introduce a new embarrassingly parallel parameter learning algorithm for Markov random fields with untied parameters which is efficient for a large class of practical models. Our algorithm parallelizes naturally over cliques and, for graphs of bounded degree, its complexity is linear in the number of cliques. Unlike its competitors, our algorithm is fully parallel and for log-linear models it i…

    Submitted 5 February, 2014; v1 submitted 28 August, 2013; originally announced August 2013.

  32. arXiv:1306.0543  [pdf, other]

    cs.LG cs.NE stat.ML

    Predicting Parameters in Deep Learning

    Authors: Misha Denil, Babak Shakibi, Laurent Dinh, Marc'Aurelio Ranzato, Nando de Freitas

    Abstract: We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small nu…

    Submitted 27 October, 2014; v1 submitted 3 June, 2013; originally announced June 2013.

  33. arXiv:1303.6746  [pdf, other]

    stat.ML cs.LG

    Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization

    Authors: Matthew W. Hoffman, Bobak Shahriari, Nando de Freitas

    Abstract: We address the problem of finding the maximizer of a nonlinear smooth function, that can only be evaluated point-wise, subject to constraints on the number of permitted function evaluations. This problem is also known as fixed-budget best arm identification in the multi-armed bandit literature. We introduce a Bayesian approach for this problem and show that it empirically outperforms both the exis…

    Submitted 11 November, 2013; v1 submitted 27 March, 2013; originally announced March 2013.

  34. arXiv:1302.6182  [pdf, other]

    stat.CO

    Adaptive Hamiltonian and Riemann Manifold Monte Carlo Samplers

    Authors: Ziyu Wang, Shakir Mohamed, Nando de Freitas

    Abstract: In this paper we address the widely-experienced difficulty in tuning Hamiltonian-based Monte Carlo samplers. We develop an algorithm that allows for the adaptation of Hamiltonian and Riemann manifold Hamiltonian Monte Carlo samplers using Bayesian optimization that allows for infinite adaptation of the parameters of these samplers. We show that the resulting sampling algorithms are ergodic, and th…

    Submitted 25 February, 2013; originally announced February 2013.

    Comments: 10 pages, 4 figures

  35. arXiv:1302.4853  [pdf, other]

    stat.ML

    Consistency of Online Random Forests

    Authors: Misha Denil, David Matheson, Nando de Freitas

    Abstract: As a testament to their success, the theory of random forests has long been outpaced by their application in practice. In this paper, we take a step towards narrowing this gap by providing a consistency result for online random forests.

    Submitted 8 May, 2013; v1 submitted 20 February, 2013; originally announced February 2013.

    Comments: To appear in Proceedings of the 30th International Conference on Machine Learning, 2013

  36. arXiv:1301.4168  [pdf, other]

    cs.LG stat.CO stat.ML

    Herded Gibbs Sampling

    Authors: Luke Bornn, Yutian Chen, Nando de Freitas, Mareija Eskelin, Jing Fang, Max Welling

    Abstract: The Gibbs sampler is one of the most popular algorithms for inference in statistical models. In this paper, we introduce a herding variant of this algorithm, called herded Gibbs, that is entirely deterministic. We prove that herded Gibbs has an $O(1/T)$ convergence rate for models with independent variables and for fully connected probabilistic graphical models. Herded Gibbs is shown to outperform…

    Submitted 15 March, 2013; v1 submitted 17 January, 2013; originally announced January 2013.

    Comments: 19 pages, including the appendix. Submission for ICLR 2013

  37. arXiv:1301.3853  [pdf]

    cs.LG cs.AI stat.CO

    Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks

    Authors: Arnaud Doucet, Nando de Freitas, Kevin Murphy, Stuart Russell

    Abstract: Particle filters (PFs) are powerful sampling-based inference/learning algorithms for dynamic Bayesian networks (DBNs). They allow us to treat, in a principled way, any type of probability distribution, nonlinearity and non-stationarity. They have appeared in several fields under such names as "condensation", "sequential Monte Carlo" and "survival of the fittest". In this paper, we show how we can…

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-176-183

  38. arXiv:1301.3833  [pdf]

    cs.LG cs.NE stat.ML

    Reversible Jump MCMC Simulated Annealing for Neural Networks

    Authors: Christophe Andrieu, Nando de Freitas, Arnaud Doucet

    Abstract: We propose a novel reversible jump Markov chain Monte Carlo (MCMC) simulated annealing algorithm to optimize radial basis function (RBF) networks. This algorithm enables us to maximize the joint posterior distribution of the network parameters and the number of basis functions. It performs a global search in the joint space of the parameters and number of parameters, thereby surmounting the proble…

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-11-18

  39. arXiv:1301.2266  [pdf]

    cs.LG stat.CO stat.ML

    Variational MCMC

    Authors: Nando de Freitas, Pedro Hojen-Sorensen, Michael I. Jordan, Stuart Russell

    Abstract: We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algor…

    Submitted 10 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

    Report number: UAI-P-2001-PG-120-127

  40. arXiv:1301.1942  [pdf, other]

    stat.ML cs.LG

    Bayesian Optimization in a Billion Dimensions via Random Embeddings

    Authors: Ziyu Wang, Frank Hutter, Masrour Zoghi, David Matheson, Nando de Freitas

    Abstract: Bayesian optimization techniques have been successfully applied to robotics, planning, sensor placement, recommendation, advertising, intelligent user interfaces and automatic algorithm configuration. Despite these successes, the approach is restricted to problems of moderate dimension, and several workshops on Bayesian optimization have identified its scaling to high-dimensions as one of the holy…

    Submitted 10 January, 2016; v1 submitted 9 January, 2013; originally announced January 2013.

    Comments: 33 pages

  41. arXiv:1208.0959  [pdf, other]

    cs.LG cs.CV stat.ML

    Recklessly Approximate Sparse Coding

    Authors: Misha Denil, Nando de Freitas

    Abstract: It has recently been observed that certain extremely simple feature encoding techniques are able to achieve state of the art performance on several standard image classification benchmarks including deep belief networks, convolutional nets, factored RBMs, mcRBMs, convolutional RBMs, sparse autoencoders and several others. Moreover, these "triangle" or "soft threshold" encodings are extremely eff…

    Submitted 6 January, 2013; v1 submitted 4 August, 2012; originally announced August 2012.

  42. arXiv:1207.4149  [pdf]

    stat.CO cs.LG

    From Fields to Trees

    Authors: Firas Hamze, Nando de Freitas

    Abstract: We present new MCMC algorithms for computing the posterior distributions and expectations of the unknown variables in undirected graphical models with regular structure. For demonstration purposes, we focus on Markov Random Fields (MRFs). By partitioning the MRFs into non-overlapping trees, it is possible to compute the posterior distribution of a particular tree exactly by conditioning on the rem…

    Submitted 11 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-243-250

  43. arXiv:1207.1396  [pdf]

    stat.CO cs.LG stat.ML

    Toward Practical $N^2$ Monte Carlo: the Marginal Particle Filter

    Authors: Mike Klaas, Nando de Freitas, Arnaud Doucet

    Abstract: Sequential Monte Carlo techniques are useful for state estimation in non-linear, non-Gaussian dynamic models. These methods allow us to approximate the joint posterior distribution using sequential importance sampling. In this framework, the dimension of the target distribution grows with each time step, thus it is necessary to introduce some resampling steps to ensure that the estimates provided…

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

    Report number: UAI-P-2005-PG-308-315

  44. arXiv:1207.1393  [pdf]

    cs.LG stat.ML

    Learning about individuals from group statistics

    Authors: Hendrik Kuck, Nando de Freitas

    Abstract: We propose a new problem formulation which is similar to, but more informative than, the binary multiple-instance learning problem. In this setting, we are given groups of instances (described by feature vectors) along with estimates of the fraction of positively-labeled instances per group. The task is to learn an instance level classifier from this information. That is, we are trying to estimate…

    Submitted 4 July, 2012; originally announced July 2012.

    Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

    Report number: UAI-P-2005-PG-332-339

  45. arXiv:1206.6457  [pdf]

    cs.LG stat.ML

    Exponential Regret Bounds for Gaussian Process Bandits with Deterministic Observations

    Authors: Nando de Freitas, Alex Smola, Masrour Zoghi

    Abstract: This paper analyzes the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al, 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, Srinivas et al proved that the regret vanishes at the approximate rate of $O(1/\sqrt{t})$, where t is the nu…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012). arXiv admin note: substantial text overlap with arXiv:1203.2177

  46. arXiv:1206.5239  [pdf]

    stat.CO cs.AI

    Large-Flip Importance Sampling

    Authors: Firas Hamze, Nando de Freitas

    Abstract: We propose a new Monte Carlo algorithm for complex discrete distributions. The algorithm is motivated by the N-Fold Way, which is an ingenious event-driven MCMC sampler that avoids rejection moves at any specific state. The N-Fold Way can however get "trapped" in cycles. We surmount this problem by modifying the sampling process. This correction does introduce bias, but the bias is subsequently co…

    Submitted 20 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

    Report number: UAI-P-2007-PG-167-174

  47. arXiv:1205.2643  [pdf]

    cs.LG eess.SY math.OC stat.CO stat.ML

    New inference strategies for solving Markov Decision Processes using reversible jump MCMC

    Authors: Matthias Hoffman, Hendrik Kueck, Nando de Freitas, Arnaud Doucet

    Abstract: In this paper we build on previous work which uses inference techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampl…

    Submitted 9 May, 2012; originally announced May 2012.

    Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

    Report number: UAI-P-2009-PG-223-231

  48. arXiv:1203.3484  [pdf]

    stat.CO cs.AI

    Intracluster Moves for Constrained Discrete-Space MCMC

    Authors: Firas Hamze, Nando de Freitas

    Abstract: This paper addresses the problem of sampling from binary distributions with constraints. In particular, it proposes an MCMC method to draw samples from a distribution of the set of all states at a specified distance from some reference state. For example, when the reference state is the vector of zeros, the algorithm can draw samples from a binary distribution with a constraint on the number of ac…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-236-243

  49. arXiv:1203.2394  [pdf, other]

    stat.ML cs.LG stat.CO

    Decentralized, Adaptive, Look-Ahead Particle Filtering

    Authors: Mohamed Osama Ahmed, Pouyan T. Bibalan, Nando de Freitas, Simon Fauvel

    Abstract: The decentralized particle filter (DPF) was proposed recently to increase the level of parallelism of particle filtering. Given a decomposition of the state space into two nested sets of variables, the DPF uses a particle filter to sample the first set and then conditions on this sample to generate a set of samples for the second set of variables. The DPF can be understood as a variant of the popu…

    Submitted 11 March, 2012; originally announced March 2012.

    Comments: 16 pages, 11 figures, Authorship in alphabetical order

  50. arXiv:1203.2177  [pdf, other]

    cs.LG stat.ML

    Regret Bounds for Deterministic Gaussian Process Bandits

    Authors: Nando de Freitas, Alex Smola, Masrour Zoghi

    Abstract: This paper analyses the problem of Gaussian process (GP) bandits with deterministic observations. The analysis uses a branch and bound algorithm that is related to the UCB algorithm of (Srinivas et al., 2010). For GPs with Gaussian observation noise, with variance strictly greater than zero, (Srinivas et al., 2010) proved that the regret vanishes at the approximate rate of $O(\frac{1}{\sqrt{t}})$,…

    Submitted 9 March, 2012; originally announced March 2012.

    Comments: 17 pages, 5 figures