Showing 1–39 of 39 results for author: Rezende, D

  1. arXiv:2204.04875 [pdf, other]

    stat.ML cs.LG

    Learning to Induce Causal Structure

    Authors: Nan Rosemary Ke, Silvia Chiappa, Jane Wang, Anirudh Goyal, Jorg Bornschein, Melanie Rey, Theophane Weber, Matthew Botvinick, Michael Mozer, Danilo Jimenez Rezende

    Abstract: The fundamental challenge in causal induction is to infer the underlying graph structure given observational and/or interventional data. Most existing causal induction algorithms operate by generating candidate graphs and evaluating them using either score-based methods (including continuous optimization) or independence tests. In our work, we instead treat the inference process as a black box and…

    Submitted 7 October, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

  2. arXiv:2203.09250 [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE stat.ML

    Symmetry-Based Representations for Artificial and Biological General Intelligence

    Authors: Irina Higgins, Sébastien Racanière, Danilo Rezende

    Abstract: Biological intelligence is remarkable in its ability to produce complex behaviour in many diverse situations through data efficient, generalisable and transferable skill acquisition. It is believed that learning "good" sensory representations is important for enabling this, however there is little agreement as to what a good representation should look like. In this review article we are going to a…

    Submitted 17 March, 2022; originally announced March 2022.

  3. arXiv:2201.13117 [pdf, other]

    stat.ML cond-mat.stat-mech cs.LG hep-lat

    Continual Repeated Annealed Flow Transport Monte Carlo

    Authors: Alexander G. D. G. Matthews, Michael Arbel, Danilo J. Rezende, Arnaud Doucet

    Abstract: We propose Continual Repeated Annealed Flow Transport Monte Carlo (CRAFT), a method that combines a sequential Monte Carlo (SMC) sampler (itself a generalization of Annealed Importance Sampling) with variational inference using normalizing flows. The normalizing flows are directly trained to transport between annealing temperatures using a KL divergence for each transition. This optimization objec…

    Submitted 6 April, 2023; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: 21 pages, 6 figures. Published at International Conference on Machine Learning (ICML) 2022

  4. arXiv:2110.01288 [pdf, other]

    stat.ML cs.LG

    Implicit Riemannian Concave Potential Maps

    Authors: Danilo J. Rezende, Sébastien Racanière

    Abstract: We are interested in the challenging problem of modelling densities on Riemannian manifolds with a known symmetry group using normalising flows. This has many potential applications in physical sciences such as molecular dynamics and quantum simulations. In this work we combine ideas from implicit neural layers and optimal transport theory to propose a generalisation of existing work on exponentia…

    Submitted 4 October, 2021; originally announced October 2021.

  5. arXiv:2107.00848 [pdf, other]

    stat.ML cs.LG

    Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning

    Authors: Nan Rosemary Ke, Aniket Didolkar, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Rezende, Yoshua Bengio, Michael Mozer, Christopher Pal

    Abstract: Inducing causal relationships from observations is a classic problem in machine learning. Most work in causality starts from the premise that the causal variables themselves are observed. However, for AI agents such as robots trying to make sense of their environment, the only observables are low-level variables like pixels in images. To generalize well, an agent must induce high-level variables,…

    Submitted 2 July, 2021; originally announced July 2021.

  6. arXiv:2104.00587 [pdf, other]

    stat.ML cs.LG

    NeRF-VAE: A Geometry Aware 3D Scene Generative Model

    Authors: Adam R. Kosiorek, Heiko Strathmann, Daniel Zoran, Pol Moreno, Rosalia Schneider, Soňa Mokrá, Danilo J. Rezende

    Abstract: We propose NeRF-VAE, a 3D scene generative model that incorporates geometric structure via NeRF and differentiable volume rendering. In contrast to NeRF, our model takes into account shared structure across scenes, and is able to infer the structure of a novel scene -- without the need to re-train -- using amortized inference. NeRF-VAE's explicit 3D rendering process further contrasts previous gen…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: 17 pages, 15 figures, under review

  7. arXiv:2012.02035 [pdf, other]

    stat.ML cs.LG

    Integrable Nonparametric Flows

    Authors: David Pfau, Danilo Rezende

    Abstract: We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a (possibly unnormalized) probability distribution. This reverses the conventional task of normalizing flows -- rather than being given samples from an unknown target distribution and learning a flow that approximates the distribution, we are given a perturbation to an initial distributi…

    Submitted 3 December, 2020; originally announced December 2020.

    Comments: Accepted to 3rd NeurIPS Workshop on Machine Learning and Physical Sciences

  8. arXiv:2008.09301 [pdf, other]

    stat.ML cs.LG

    Amortized learning of neural causal representations

    Authors: Nan Rosemary Ke, Jane X. Wang, Jovana Mitrovic, Martin Szummer, Danilo J. Rezende

    Abstract: Causal models can compactly and efficiently encode the data-generating process under all interventions and hence may generalize better under changes in distribution. These models are often represented as Bayesian networks and learning them scales poorly with the number of variables. Moreover, these approaches cannot leverage previously learned knowledge to help with learning new causal models. In…

    Submitted 21 August, 2020; originally announced August 2020.

    Comments: ICLR 2020 causal learning for decision making workshop

  9. arXiv:2008.05456 [pdf, other]

    hep-lat cs.LG stat.ML

    Sampling using $SU(N)$ gauge equivariant flows

    Authors: Denis Boyda, Gurtej Kanwar, Sébastien Racanière, Danilo Jimenez Rezende, Michael S. Albergo, Kyle Cranmer, Daniel C. Hackett, Phiala E. Shanahan

    Abstract: We develop a flow-based sampling algorithm for $SU(N)$ lattice gauge theories that is gauge-invariant by construction. Our key contribution is constructing a class of flows on an $SU(N)$ variable (or on a $U(N)$ variable by a simple alternative) that respect matrix conjugation symmetry. We apply this technique to sample distributions of single $SU(N)$ variables and to construct flow-based samplers…

    Submitted 18 September, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

    Comments: 24 pages, 19 figures

    Report number: MIT-CTP/5228

    Journal ref: Phys. Rev. D 103, 074504 (2021)

  10. arXiv:2003.13367 [pdf, other]

    cs.LG cs.IT stat.ML

    Neural Communication Systems with Bandwidth-limited Channel

    Authors: Karen Ullrich, Fabio Viola, Danilo Jimenez Rezende

    Abstract: Reliably transmitting messages despite information loss due to a noisy channel is a core problem of information theory. One of the most important aspects of real world communication, e.g. via wifi, is that it may happen at varying levels of information transfer. The bandwidth-limited channel models this phenomenon. In this study we consider learning coding with the bandwidth-limited channel (BWLC)…

    Submitted 1 April, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

  11. arXiv:2002.04913 [pdf, other]

    physics.comp-ph physics.chem-ph stat.ML

    Targeted free energy estimation via learned mappings

    Authors: Peter Wirnsberger, Andrew J. Ballard, George Papamakarios, Stuart Abercrombie, Sébastien Racanière, Alexander Pritzel, Danilo Jimenez Rezende, Charles Blundell

    Abstract: Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences, and has since inspired a huge body of related methods that use it as an integral building block. Being an importance sampling based estimator, however, FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions. One strategy to mit…

    Submitted 18 August, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: Added figure 3, added data augmentation for octahedral symmetries, updated experimental results and revised text (11 pages, 6 figures)

  12. arXiv:2002.02836 [pdf, other]

    cs.LG cs.AI stat.ML

    Causally Correct Partial Models for Reinforcement Learning

    Authors: Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing

    Abstract: In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this pa…

    Submitted 7 February, 2020; originally announced February 2020.

  13. arXiv:2002.02428 [pdf, other]

    stat.ML cs.LG

    Normalizing Flows on Tori and Spheres

    Authors: Danilo Jimenez Rezende, George Papamakarios, Sébastien Racanière, Michael S. Albergo, Gurtej Kanwar, Phiala E. Shanahan, Kyle Cranmer

    Abstract: Normalizing flows are a powerful tool for building expressive distributions in high dimensions. So far, most of the literature has concentrated on learning flows on Euclidean spaces. Some problems however, such as those involving angles, are defined on spaces with more complex geometries, such as tori or spheres. In this paper, we propose and compare expressive and numerically stable flows on such…

    Submitted 1 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Accepted to the International Conference on Machine Learning (ICML) 2020

  14. arXiv:1912.02762 [pdf, other]

    stat.ML cs.LG

    Normalizing Flows for Probabilistic Modeling and Inference

    Authors: George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan

    Abstract: Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations. There has been much recent work on normalizing flows, ranging from improving their expressive power to expanding their application. We believe the field has now matured and is in need of…

    Submitted 8 April, 2021; v1 submitted 5 December, 2019; originally announced December 2019.

    Comments: Review article, 64 pages, 9 figures. Published in the Journal of Machine Learning Research (see https://jmlr.org/papers/v22/19-1028.html)

    Journal ref: Journal of Machine Learning Research, 22(57):1-64, 2021

  15. arXiv:1909.13789 [pdf, other]

    cs.LG stat.ML

    Hamiltonian Generative Networks

    Authors: Peter Toth, Danilo Jimenez Rezende, Andrew Jaegle, Sébastien Racanière, Aleksandar Botev, Irina Higgins

    Abstract: The Hamiltonian formalism plays a central role in classical and quantum physics. Hamiltonians are the main tool for modelling the continuous time evolution of systems with conserved quantities, and they come equipped with many useful properties, like time reversibility and smooth interpolation in time. These properties are important for many machine learning problems - from sequence prediction to…

    Submitted 14 February, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

  16. arXiv:1909.13739 [pdf, other]

    stat.ML cs.LG

    Equivariant Hamiltonian Flows

    Authors: Danilo Jimenez Rezende, Sébastien Racanière, Irina Higgins, Peter Toth

    Abstract: This paper introduces equivariant hamiltonian flows, a method for learning expressive densities that are invariant with respect to a known Lie-algebra of local symmetry transformations while providing an equivariant representation of the data. We provide proof of principle demonstrations of how such flows can be learnt, as well as how the addition of symmetry invariance constraints can improve dat…

    Submitted 30 September, 2019; originally announced September 2019.

  17. arXiv:1906.09237 [pdf, other]

    cs.LG cs.AI stat.ML

    Shaping Belief States with Generative Environment Models for RL

    Authors: Karol Gregor, Danilo Jimenez Rezende, Frederic Besse, Yan Wu, Hamza Merzic, Aaron van den Oord

    Abstract: When agents interact with a complex environment, they must form and maintain beliefs about the relevant aspects of that environment. We propose a way to efficiently train expressive generative models in complex environments. We show that a predictive algorithm with an expressive generative model can form stable belief-states in visually rich and dynamic 3D environments. More precisely, we show tha…

    Submitted 24 June, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

    Comments: pre-print

  18. arXiv:1906.02500 [pdf, other]

    cs.LG stat.ML

    Towards Interpretable Reinforcement Learning Using Attention Augmented Agents

    Authors: Alex Mott, Daniel Zoran, Mike Chrzanowski, Daan Wierstra, Danilo J. Rezende

    Abstract: Inspired by recent work in attention models for image captioning and question answering, we present a soft attention model for the reinforcement learning domain. This model uses a soft, top-down attention mechanism to create a bottleneck in the agent, forcing it to focus on task-relevant information by sequentially querying its view of the environment. The output of the attention mechanism allows…

    Submitted 6 June, 2019; originally announced June 2019.

  19. arXiv:1812.02230 [pdf, other]

    cs.LG stat.ML

    Towards a Definition of Disentangled Representations

    Authors: Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, Alexander Lerchner

    Abstract: How can intelligent agents solve a diverse set of tasks in a data-efficient manner? The disentangled representation learning approach posits that such an agent would benefit from separating out (disentangling) the underlying structure of the world into disjoint parts of its representation. However, there is no generally agreed-upon definition of disentangling, not least because it is unclear how t…

    Submitted 5 December, 2018; originally announced December 2018.

  20. arXiv:1810.00597 [pdf, other]

    stat.ML cs.LG

    Taming VAEs

    Authors: Danilo Jimenez Rezende, Fabio Viola

    Abstract: In spite of remarkable progress in deep latent variable generative modeling, training still remains a challenge due to a combination of optimization and generalization issues. In practice, a combination of heuristic algorithms (such as hand-crafted annealing of KL-terms) is often used in order to achieve the desired results, but such solutions are not robust to changes in model architecture or dat…

    Submitted 1 October, 2018; originally announced October 2018.

  21. arXiv:1807.03149 [pdf, other]

    cs.CV cs.LG stat.ML

    Learning models for visual 3D localization with implicit mapping

    Authors: Dan Rosenbaum, Frederic Besse, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami

    Abstract: We consider learning based methods for visual localization that do not require the construction of explicit maps in the form of point clouds or voxels. The goal is to learn an implicit representation of the environment at a higher, more abstract level. We propose to use a generative approach based on Generative Query Networks (GQNs, Eslami et al. 2018), asking the following questions: 1) Can GQN c…

    Submitted 12 December, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

  22. arXiv:1807.02033 [pdf, other]

    cs.CV cs.LG stat.ML

    Consistent Generative Query Networks

    Authors: Ananya Kumar, S. M. Ali Eslami, Danilo J. Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan

    Abstract: Stochastic video prediction models take in a sequence of image frames, and generate a sequence of consecutive future image frames. These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive. We introduce a model that overcomes these drawbacks by generating a latent representation from an arbitrary set of fram…

    Submitted 21 April, 2019; v1 submitted 5 July, 2018; originally announced July 2018.

  23. arXiv:1807.01622 [pdf, other]

    cs.LG stat.ML

    Neural Processes

    Authors: Marta Garnelo, Jonathan Schwarz, Dan Rosenbaum, Fabio Viola, Danilo J. Rezende, S. M. Ali Eslami, Yee Whye Teh

    Abstract: A neural network (NN) is a parameterised function that can be tuned via gradient descent to approximate a labelled collection of data with high precision. A Gaussian process (GP), on the other hand, is a probabilistic model that defines a distribution over possible functions, and is updated in light of data via the rules of probabilistic inference. GPs are probabilistic, data-efficient and flexibl…

    Submitted 4 July, 2018; originally announced July 2018.

  24. arXiv:1807.01613 [pdf, other]

    cs.LG stat.ML

    Conditional Neural Processes

    Authors: Marta Garnelo, Dan Rosenbaum, Chris J. Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo J. Rezende, S. M. Ali Eslami

    Abstract: Deep neural networks excel at function approximation, yet they are typically trained from scratch for each new function. On the other hand, Bayesian methods, such as Gaussian Processes (GPs), exploit prior knowledge to quickly infer the shape of a new function at test time. Yet GPs are computationally expensive, and it can be hard to design appropriate priors. In this paper we propose a family of…

    Submitted 4 July, 2018; originally announced July 2018.

  25. arXiv:1806.05034 [pdf, other]

    cs.CV cs.LG cs.NE stat.ML

    A Probabilistic U-Net for Segmentation of Ambiguous Images

    Authors: Simon A. A. Kohl, Bernardino Romera-Paredes, Clemens Meyer, Jeffrey De Fauw, Joseph R. Ledsam, Klaus H. Maier-Hein, S. M. Ali Eslami, Danilo Jimenez Rezende, Olaf Ronneberger

    Abstract: Many real-world vision problems suffer from inherent ambiguities. In clinical applications for example, it might not be clear from a CT scan alone which particular region is cancer tissue. Therefore a group of graders typically produces a set of diverse but plausible segmentations. We consider the task of learning a distribution over segmentations given an input. To this end we propose a generativ…

    Submitted 29 January, 2019; v1 submitted 13 June, 2018; originally announced June 2018.

    Comments: Last update: added further details about the LIDC experiment. 11 pages for the main paper, 28 pages including appendix. 5 figures in the main paper, 18 figures in total, Advances in Neural Information Processing Systems (NeurIPS), 2018

  26. arXiv:1804.09401 [pdf, other]

    stat.ML cs.LG

    Generative Temporal Models with Spatial Memory for Partially Observed Environments

    Authors: Marco Fraccaro, Danilo Jimenez Rezende, Yori Zwols, Alexander Pritzel, S. M. Ali Eslami, Fabio Viola

    Abstract: In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent's representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially p…

    Submitted 19 July, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

    Comments: ICML 2018

  27. arXiv:1803.10760 [pdf, other]

    cs.LG stat.ML

    Unsupervised Predictive Memory in a Goal-Directed Agent

    Authors: Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap

    Abstract: Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement l…

    Submitted 28 March, 2018; originally announced March 2018.

  28. arXiv:1803.01682 [pdf, other]

    stat.ML cs.LG

    Beyond Greedy Ranking: Slate Optimization via List-CVAE

    Authors: Ray Jiang, Sven Gowal, Timothy A. Mann, Danilo J. Rezende

    Abstract: The conventional solution to the recommendation problem greedily ranks individual document candidates by prediction scores. However, this method fails to optimize the slate as a whole, and hence, often struggles to capture biases caused by the page layout and document interdependencies. The slate recommendation problem aims to directly find the optimally ordered subset of documents (i.e. slates) th…

    Submitted 23 February, 2019; v1 submitted 5 March, 2018; originally announced March 2018.

  29. arXiv:1707.06203 [pdf, other]

    cs.LG cs.AI stat.ML

    Imagination-Augmented Agents for Deep Reinforcement Learning

    Authors: Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra

    Abstract: We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in…

    Submitted 14 February, 2018; v1 submitted 19 July, 2017; originally announced July 2017.

  30. arXiv:1702.04649 [pdf, other]

    cs.LG cs.NE stat.ML

    Generative Temporal Models with Memory

    Authors: Mevlana Gemici, Chia-Chun Hung, Adam Santoro, Greg Wayne, Shakir Mohamed, Danilo J. Rezende, David Amos, Timothy Lillicrap

    Abstract: We consider the general problem of modeling temporal data with long-range dependencies, wherein new observations are fully or partially predictable based on temporally-distant, past observations. A sufficiently powerful temporal model should separate predictable elements of the sequence from unpredictable elements, express uncertainty about those unpredictable elements, and rapidly identify novel…

    Submitted 21 February, 2017; v1 submitted 15 February, 2017; originally announced February 2017.

  31. arXiv:1611.02304 [pdf, other]

    stat.ML cs.AI math.ST

    Normalizing Flows on Riemannian Manifolds

    Authors: Mevlana C. Gemici, Danilo Rezende, Shakir Mohamed

    Abstract: We consider the problem of density estimation on Riemannian manifolds. Density estimation on manifolds has many applications in fluid-mechanics, optics and plasma physics and it appears often when dealing with angular variables (such as used in protein folding, robot limbs, gene-expression) and in general directional statistics. In spite of the multitude of algorithms available for density estimat…

    Submitted 9 November, 2016; v1 submitted 7 November, 2016; originally announced November 2016.

    Comments: 3 pages, 2 figures, Submitted to Workshop on Bayesian Deep Learning at NIPS 2016

  32. arXiv:1607.00662 [pdf, other]

    cs.CV cs.LG stat.ML

    Unsupervised Learning of 3D Structure from Images

    Authors: Danilo Jimenez Rezende, S. M. Ali Eslami, Shakir Mohamed, Peter Battaglia, Max Jaderberg, Nicolas Heess

    Abstract: A key goal of computer vision is to recover the underlying 3D structure from 2D observations of the world. In this paper we learn strong deep generative models of 3D structures, and recover these structures from 3D and 2D images via probabilistic inference. We demonstrate high-quality samples and report log-likelihoods on several datasets, including ShapeNet [2], and establish the first benchmarks…

    Submitted 19 June, 2018; v1 submitted 3 July, 2016; originally announced July 2016.

    Comments: Appears in Advances in Neural Information Processing Systems 29 (NIPS 2016)

  33. arXiv:1604.08772 [pdf, other]

    stat.ML cs.CV cs.LG

    Towards Conceptual Compression

    Authors: Karol Gregor, Frederic Besse, Danilo Jimenez Rezende, Ivo Danihelka, Daan Wierstra

    Abstract: We introduce a simple recurrent variational auto-encoder architecture that significantly improves image modeling. The system represents the state-of-the-art in latent variable models for both the ImageNet and Omniglot datasets. We show that it naturally separates global conceptual information from lower level details, thus addressing one of the fundamentally desired properties of unsupervised lear…

    Submitted 29 April, 2016; originally announced April 2016.

    Comments: 14 pages, 13 figures

  34. arXiv:1603.05106 [pdf, other]

    stat.ML cs.AI cs.LG

    One-Shot Generalization in Deep Generative Models

    Authors: Danilo Jimenez Rezende, Shakir Mohamed, Ivo Danihelka, Karol Gregor, Daan Wierstra

    Abstract: Humans have an impressive ability to reason about new concepts and experiences from just a single example. In particular, humans have an ability for one-shot generalization: an ability to encounter a new concept, understand its structure, and then be able to generate compelling alternative variations of the concept. We develop machine learning systems with this important capacity by developing new…

    Submitted 25 May, 2016; v1 submitted 16 March, 2016; originally announced March 2016.

    Comments: 8pgs, 1pg references, 1pg appendix, In Proceedings of the 33rd International Conference on Machine Learning, JMLR: W&CP volume 48, 2016

  35. arXiv:1602.06725 [pdf, other]

    cs.LG stat.ML

    Variational inference for Monte Carlo objectives

    Authors: Andriy Mnih, Danilo J. Rezende

    Abstract: Recent progress in deep latent variable models has largely been driven by the development of flexible and scalable variational inference methods. Variational training of this type involves maximizing a lower bound on the log-likelihood, using samples from the variational posterior to compute the required gradients. Recently, Burda et al. (2016) have derived a tighter lower bound using a multi-samp…

    Submitted 1 June, 2016; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: Appears in Proceedings of the 33rd International Conference on Machine Learning (ICML), New York, NY, USA, 2016. JMLR: W&CP volume 48

  36. arXiv:1509.08731 [pdf, other]

    stat.ML cs.AI cs.LG

    Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning

    Authors: Shakir Mohamed, Danilo Jimenez Rezende

    Abstract: The mutual information is a core statistical quantity that has applications in all areas of machine learning, whether this is in training of density models over multiple data modalities, in maximising the efficiency of noisy transmission channels, or when learning behaviour policies for exploration by artificial agents. Most learning algorithms that involve optimisation of the mutual information r…

    Submitted 29 September, 2015; originally announced September 2015.

    Comments: Proceedings of the 29th Conference on Neural Information Processing Systems (NIPS 2015)

  37. arXiv:1505.05770 [pdf, other]

    stat.ML cs.AI cs.LG stat.CO stat.ME

    Variational Inference with Normalizing Flows

    Authors: Danilo Jimenez Rezende, Shakir Mohamed

    Abstract: The choice of approximate posterior distribution is one of the core problems in variational inference. Most applications of variational inference employ simple families of posterior approximations in order to allow for efficient inference, focusing on mean-field or other simple structured approximations. This restriction has a significant impact on the quality of inferences made using variational…

    Submitted 14 June, 2016; v1 submitted 21 May, 2015; originally announced May 2015.

    Comments: Proceedings of the 32nd International Conference on Machine Learning

  38. arXiv:1406.5298 [pdf, other]

    cs.LG stat.ML

    Semi-Supervised Learning with Deep Generative Models

    Authors: Diederik P. Kingma, Danilo J. Rezende, Shakir Mohamed, Max Welling

    Abstract: The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unl…

    Submitted 31 October, 2014; v1 submitted 20 June, 2014; originally announced June 2014.

    Comments: To appear in the proceedings of Neural Information Processing Systems (NIPS) 2014

  39. arXiv:1401.4082 [pdf, other]

    stat.ML cs.AI cs.LG stat.CO stat.ME

    Stochastic Backpropagation and Approximate Inference in Deep Generative Models

    Authors: Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra

    Abstract: We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and learning. Our algorithm introduces a recognition model to represent approximate posterior distributions, and that acts as a stochastic encoder of the data. We develop stochastic back-propagation -- rul…

    Submitted 30 May, 2014; v1 submitted 16 January, 2014; originally announced January 2014.

    Comments: Appears in Proceedings of the 31st International Conference on Machine Learning (ICML), JMLR: W&CP volume 32, 2014