Showing 1–12 of 12 results for author: Henderson, P

  1. arXiv:2007.06705  [pdf, other]

    cs.CV cs.LG stat.ML

    Unsupervised object-centric video generation and decomposition in 3D

    Authors: Paul Henderson, Christoph H. Lampert

    Abstract: A natural approach to generative modeling of videos is to represent them as a composition of moving objects. Recent works model a set of 2D sprites over a slowly-varying background, but without considering the underlying 3D scene that gives rise to them. We instead propose to model a video as the view seen while moving through a scene with multiple 3D objects and a 3D background. Our model is trai…

    Submitted 24 March, 2021; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: Appeared at NeurIPS 2020. Project page: http://pmh47.net/o3v/

  2. arXiv:2007.02786  [pdf, other]

    cs.LG stat.ML

    TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

    Authors: Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau

    Abstract: We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers. Our method, TDprop, computes a per parameter learning rate based on the diagonal preconditioning of the TD update rule. We show how this can be used in both $n$-step returns and TD($λ$). Our theoretical findings demonstrate that i…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: Presented at the Theoretical Foundations of Reinforcement Learning workshop at ICML 2020

  3. arXiv:2004.00642  [pdf, other]

    cs.LG cs.CV stat.ML

    Object-Centric Image Generation with Factored Depths, Locations, and Appearances

    Authors: Titas Anciukevicius, Christoph H. Lampert, Paul Henderson

    Abstract: We present a generative model of images that explicitly reasons over the set of objects they show. Our model learns a structured latent representation that separates objects from each other and from the background; unlike prior works, it explicitly represents the 2D position and depth of each object, as well as an embedding of its segmentation mask and appearance. The model can be trained from ima…

    Submitted 1 April, 2020; originally announced April 2020.

  4. arXiv:1902.01883  [pdf, other]

    cs.LG cs.AI stat.ML

    Separating value functions across time-scales

    Authors: Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier

    Abstract: In many finite horizon episodic reinforcement learning (RL) settings, it is desirable to optimize for the undiscounted return - in settings like Atari, for instance, the goal is to collect the most points while staying alive in the long run. Yet, it may be difficult (or even intractable) mathematically to learn with this target. As such, temporal discounting is often applied to optimize over a sho…

    Submitted 24 May, 2019; v1 submitted 5 February, 2019; originally announced February 2019.

    Comments: Full version accepted to ICML 2019. Extended abstract also to be presented at RLDM 2019

  5. arXiv:1812.01074  [pdf, other]

    cs.DL cs.LG stat.ML

    Distilling Information from a Flood: A Possibility for the Use of Meta-Analysis and Systematic Review in Machine Learning Research

    Authors: Peter Henderson, Emma Brunskill

    Abstract: The current flood of information in all areas of machine learning research, from computer vision to reinforcement learning, has made it difficult to make aggregate scientific inferences. It can be challenging to distill a myriad of similar papers into a set of useful principles, to determine which new methodologies to use for a particular application, and to be confident that one has compared agai…

    Submitted 3 December, 2018; originally announced December 2018.

    Comments: Accepted to the Critiquing and Correcting Trends in Machine Learning Workshop (CRACT) at NeurIPS 2018

  6. arXiv:1811.12560  [pdf, other]

    cs.LG cs.AI stat.ML

    An Introduction to Deep Reinforcement Learning

    Authors: Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau

    Abstract: Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introductio…

    Submitted 3 December, 2018; v1 submitted 29 November, 2018; originally announced November 2018.

    Journal ref: Foundations and Trends in Machine Learning: Vol. 11, No. 3-4, 2018

  7. arXiv:1811.01302  [pdf, other]

    cs.LG cs.CL stat.ML

    Adversarial Gain

    Authors: Peter Henderson, Koustuv Sinha, Rosemary Nan Ke, Joelle Pineau

    Abstract: Adversarial examples can be defined as inputs to a model which induce a mistake - where the model output is different than that of an oracle, perhaps in surprising or malicious ways. Original models of adversarial attacks are primarily studied in the context of classification and computer vision tasks. While several attacks have been proposed in natural language processing (NLP) settings, they oft…

    Submitted 3 November, 2018; originally announced November 2018.

  8. arXiv:1810.02525  [pdf, other]

    cs.LG cs.AI stat.ML

    Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods

    Authors: Peter Henderson, Joshua Romoff, Joelle Pineau

    Abstract: Recent analyses of certain gradient descent optimization methods have shown that performance can degrade in some settings - such as with stochasticity or implicit momentum. In deep reinforcement learning (Deep RL), such optimization methods are often used for training neural networks via the temporal difference error or policy gradient. As an agent improves over time, the optimization target chang…

    Submitted 5 October, 2018; originally announced October 2018.

    Comments: Accepted at the European Workshop on Reinforcement Learning 2018 (EWRL14)

  9. arXiv:1805.03359  [pdf, other]

    cs.LG cs.AI stat.ML

    Reward Estimation for Variance Reduction in Deep Reinforcement Learning

    Authors: Joshua Romoff, Peter Henderson, Alexandre Piché, Vincent Francois-Lavet, Joelle Pineau

    Abstract: Reinforcement Learning (RL) agents require the specification of a reward signal for learning behaviours. However, introduction of corrupt or stochastic rewards can yield high variance in learning. Such corruption may be a direct result of goal misspecification, randomness in the reward signal, or correlation of the reward with external factors that are not known to the agent. Corruption or stochas…

    Submitted 7 November, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

    Comments: Version 1 as appears in the International Conference on Learning Representations (ICLR) 2018 Workshop Track; Version 2 as appears in the Proceedings of The 2nd Conference on Robot Learning

  10. arXiv:1709.06560  [pdf, other]

    cs.LG stat.ML

    Deep Reinforcement Learning that Matters

    Authors: Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger

    Abstract: In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL). Reproducing existing work and accurately judging the improvements offered by novel methods is vital to sustaining this progress. Unfortunately, reproducing results for state-of-the-art deep RL methods is seldom straightforward. In particular, non-determ…

    Submitted 29 January, 2019; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: Accepted to the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018

  11. arXiv:1512.05742  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG stat.ML

    A Survey of Available Corpora for Building Data-Driven Dialogue Systems

    Authors: Iulian Vlad Serban, Ryan Lowe, Peter Henderson, Laurent Charlin, Joelle Pineau

    Abstract: During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and q…

    Submitted 20 March, 2017; v1 submitted 17 December, 2015; originally announced December 2015.

    Comments: 56 pages including references and appendix, 5 tables and 1 figure; Under review for the Dialogue & Discourse journal. Update: paper has been rewritten and now includes several new datasets

    MSC Class: 68T01; 68T05; 68T35; 68T50

    ACM Class: I.2.6; I.2.7; I.2.1

  12. arXiv:1510.06356  [pdf]

    quant-ph cs.LG stat.ML

    Application of Quantum Annealing to Training of Deep Neural Networks

    Authors: Steven H. Adachi, Maxwell P. Henderson

    Abstract: In Deep Learning, a well-known approach for training a Deep Neural Network starts by training a generative Deep Belief Network model, typically using Contrastive Divergence (CD), then fine-tuning the weights using backpropagation or other discriminative techniques. However, the generative training can be time-consuming due to the slow mixing of Gibbs sampling. We investigated an alternative approa…

    Submitted 21 October, 2015; originally announced October 2015.

    Comments: 18 pages

    Report number: DIS201510002