
Showing 1–27 of 27 results for author: Bachman, P

Searching in archive cs.
  1. arXiv:2303.06121

    cs.LG cs.AI

    Ignorance is Bliss: Robust Control via Information Gating

    Authors: Manan Tomar, Riashat Islam, Matthew E. Taylor, Sergey Levine, Philip Bachman

    Abstract: Informational parsimony provides a useful inductive bias for learning representations that achieve better generalization by being robust to noise and spurious correlations. We propose \textit{information gating} as a way to learn parsimonious representations that identify the minimal information required for a task. When gating information, we can learn to reveal as little information as possible…

    Submitted 8 December, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  2. arXiv:2106.13401

    cs.LG cs.AI

    Decomposed Mutual Information Estimation for Contrastive Representation Learning

    Authors: Alessandro Sordoni, Nouha Dziri, Hannes Schulz, Geoff Gordon, Phil Bachman, Remi Tachet

    Abstract: Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong unde…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  3. arXiv:2106.04799

    cs.LG

    Pretraining Representations for Data-Efficient Reinforcement Learning

    Authors: Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville

    Abstract: Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited t…

    Submitted 9 June, 2021; originally announced June 2021.

  4. arXiv:2007.13278

    cs.CV cs.LG

    Representation Learning with Video Deep InfoMax

    Authors: R Devon Hjelm, Philip Bachman

    Abstract: Self-supervised learning has made unsupervised pretraining relevant again for difficult computer vision tasks. The most effective self-supervised methods involve prediction tasks based on features extracted from diverse views of the data. DeepInfoMax (DIM) is a self-supervised method which leverages the internal structure of deep networks to construct such views, forming prediction tasks between l…

    Submitted 27 July, 2020; v1 submitted 26 July, 2020; originally announced July 2020.

  5. arXiv:2007.05929

    cs.LG stat.ML

    Data-Efficient Reinforcement Learning with Self-Predictive Representations

    Authors: Max Schwarzer, Ankesh Anand, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

    Abstract: While deep reinforcement learning excels at solving tasks where large amounts of data can be collected through virtually unlimited interaction with the environment, learning from limited interaction remains a key challenge. We posit that an agent can learn more efficiently if we augment reward maximization with self-supervised objectives based on structure in its visual input and sequential intera…

    Submitted 20 May, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

    Comments: The first two authors contributed equally to this work. v4 includes new ablations and reformatting for ICLR camera ready

  6. arXiv:2006.07217

    cs.LG stat.ML

    Deep Reinforcement and InfoMax Learning

    Authors: Bogdan Mazoure, Remi Tachet des Combes, Thang Doan, Philip Bachman, R Devon Hjelm

    Abstract: We begin with the hypothesis that a model-free agent whose representations are predictive of properties of future states (beyond expected rewards) will be more capable of solving and adapting to new RL problems. To test that hypothesis, we introduce an objective based on Deep InfoMax (DIM) which trains the agent to predict the future by maximizing the mutual information between its internal repres…

    Submitted 16 November, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020

  7. arXiv:1906.00910

    cs.LG stat.ML

    Learning Representations by Maximizing Mutual Information Across Views

    Authors: Philip Bachman, R Devon Hjelm, William Buchwalter

    Abstract: We propose an approach to self-supervised representation learning based on maximizing mutual information between features extracted from multiple views of a shared context. For example, one could produce multiple views of a local spatio-temporal context by observing it from different locations (e.g., camera positions within a scene), and via different modalities (e.g., tactile, auditory, or visual…

    Submitted 8 July, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

  8. arXiv:1809.02591

    cs.LG cs.AI stat.ML

    Learning Invariances for Policy Generalization

    Authors: Remi Tachet, Philip Bachman, Harm van Seijen

    Abstract: While recent progress has spawned very powerful machine learning systems, those agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks. In this paper, we study a simple reinforcement learning problem and focus on learning policies that encode the proper invariances for generalization to different settings. We evaluate three potential methods fo…

    Submitted 12 December, 2020; v1 submitted 7 September, 2018; originally announced September 2018.

    Comments: 7 pages, 1 figure

  9. arXiv:1808.06670

    stat.ML cs.LG

    Learning deep representations by mutual information estimation and maximization

    Authors: R Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Phil Bachman, Adam Trischler, Yoshua Bengio

    Abstract: In this work, we perform unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality of the input to the objective can greatly influence a representation's suitability for downstream tasks. We further control characteristics of the repr…

    Submitted 22 February, 2019; v1 submitted 20 August, 2018; originally announced August 2018.

    Comments: Accepted as an oral presentation at the International Conference for Learning Representations (ICLR), 2019

  10. arXiv:1807.04106

    cs.LG stat.ML

    VFunc: a Deep Generative Model for Functions

    Authors: Philip Bachman, Riashat Islam, Alessandro Sordoni, Zafarali Ahmed

    Abstract: We introduce a deep generative model for functions. Our model provides a joint distribution p(f, z) over functions f and latent variables z which lets us efficiently sample from the marginal p(f) and maximize a variational lower bound on the entropy H(f). We can thus maximize objectives of the form E_{f~p(f)}[R(f)] + c*H(f), where R(f) denotes, e.g., a data log-likelihood term or an expected rewar…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: To be presented at the ICML 2018 workshop on Prediction and Generative Modeling in Reinforcement Learning

  11. arXiv:1802.10151

    cs.LG

    Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data

    Authors: Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville

    Abstract: Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data. CycleGAN was recently proposed for this problem, but critically assumes the underlying inter-domain mapping is approximately deterministic and one-to-one. This assumption renders the model ineffective for tasks requiring flexibl…

    Submitted 18 June, 2018; v1 submitted 27 February, 2018; originally announced February 2018.

    Comments: ICML 2018

  12. arXiv:1709.06560

    cs.LG stat.ML

    Deep Reinforcement Learning that Matters

    Authors: Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger

    Abstract: In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL). Reproducing existing work and accurately judging the improvements offered by novel methods is vital to sustaining this progress. Unfortunately, reproducing results for state-of-the-art deep RL methods is seldom straightforward. In particular, non-determ…

    Submitted 29 January, 2019; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: Accepted to the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 2018

  13. arXiv:1708.00805

    cs.LG

    Variational Generative Stochastic Networks with Collaborative Shaping

    Authors: Philip Bachman, Doina Precup

    Abstract: We develop an approach to training generative models based on unrolling a variational auto-encoder into a Markov chain, and shaping the chain's trajectories using a technique inspired by recent work in Approximate Bayesian computation. We show that the global minimizer of the resulting objective is achieved when the generative model reproduces the target distribution. To allow finer control over t…

    Submitted 2 August, 2017; originally announced August 2017.

    Comments: Old paper, from ICML 2015

  14. arXiv:1708.00088

    cs.LG

    Learning Algorithms for Active Learning

    Authors: Philip Bachman, Alessandro Sordoni, Adam Trischler

    Abstract: We introduce a model that learns active learning algorithms via metalearning. For a distribution of related tasks, our model jointly learns: a data representation, an item selection heuristic, and a method for constructing prediction functions from labeled training sets. Our model uses the item selection heuristic to gather labeled training sets from which to construct prediction functions. Using…

    Submitted 31 July, 2017; originally announced August 2017.

    Comments: Accepted for publication at ICML 2017

  15. arXiv:1705.02012

    cs.CL

    Machine Comprehension by Text-to-Text Neural Question Generation

    Authors: Xingdi Yuan, Tong Wang, Caglar Gulcehre, Alessandro Sordoni, Philip Bachman, Sandeep Subramanian, Saizheng Zhang, Adam Trischler

    Abstract: We propose a recurrent neural model that generates natural-language questions from documents, conditioned on answers. We show how to train the model using a combination of supervised and reinforcement learning. After teacher forcing for standard maximum likelihood training, we fine-tune the model using policy gradient techniques to maximize several rewards that measure question quality. Most notab…

    Submitted 15 May, 2017; v1 submitted 4 May, 2017; originally announced May 2017.

  16. arXiv:1702.01691

    cs.LG

    Calibrating Energy-based Generative Adversarial Networks

    Authors: Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville

    Abstract: In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimal. We derive th…

    Submitted 23 February, 2017; v1 submitted 6 February, 2017; originally announced February 2017.

    Comments: ICLR 2017 camera ready

  17. arXiv:1612.04739

    cs.LG

    An Architecture for Deep, Hierarchical Generative Models

    Authors: Philip Bachman

    Abstract: We present an architecture which lets us train deep, directed generative models with many layers of latent variables. We include deterministic paths between all latent variables and the generated output, and provide a richer set of connections between computations for inference and generation, which enables more effective communication of information throughout the model during training. To improv…

    Submitted 8 December, 2016; originally announced December 2016.

    Comments: Published in NIPS 2016

  18. arXiv:1612.02605

    cs.LG

    Towards Information-Seeking Agents

    Authors: Philip Bachman, Alessandro Sordoni, Adam Trischler

    Abstract: We develop a general problem setting for training and testing the ability of agents to gather information efficiently. Specifically, we present a collection of tasks in which success requires searching through a partially-observed environment, for fragments of information which can be pieced together to accomplish various goals. We combine deep architectures with techniques from reinforcement lear…

    Submitted 8 December, 2016; originally announced December 2016.

    Comments: Under review for ICLR 2017

  19. arXiv:1611.09830

    cs.CL cs.AI

    NewsQA: A Machine Comprehension Dataset

    Authors: Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, Kaheer Suleman

    Abstract: We present NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs. Crowdworkers supply questions and answers based on a set of over 10,000 news articles from CNN, with answers consisting of spans of text from the corresponding articles. We collect this dataset through a four-stage process designed to solicit exploratory questions that require reas…

    Submitted 7 February, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

  20. arXiv:1606.03632

    cs.CL

    Natural Language Generation in Dialogue using Lexicalized and Delexicalized Data

    Authors: Shikhar Sharma, Jing He, Kaheer Suleman, Hannes Schulz, Philip Bachman

    Abstract: Natural language generation plays a critical role in spoken dialogue systems. We present a new approach to natural language generation for task-oriented dialogue using recurrent neural networks in an encoder-decoder framework. In contrast to previous work, our model uses both lexicalized and delexicalized components, i.e., slot-value pairs for dialogue acts, with slots and corresponding values align…

    Submitted 21 April, 2017; v1 submitted 11 June, 2016; originally announced June 2016.

  21. arXiv:1606.02245

    cs.CL cs.NE

    Iterative Alternating Neural Attention for Machine Reading

    Authors: Alessandro Sordoni, Philip Bachman, Adam Trischler, Yoshua Bengio

    Abstract: We propose a novel neural attention architecture to tackle machine comprehension tasks, such as answering Cloze-style queries with respect to a document. Unlike previous models, we do not collapse the query into a single vector; instead, we deploy an iterative alternating attention mechanism that allows a fine-grained exploration of both the query and the document. Our model outperforms state-of-th…

    Submitted 9 November, 2016; v1 submitted 7 June, 2016; originally announced June 2016.

  22. arXiv:1603.08884

    cs.CL

    A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data

    Authors: Adam Trischler, Zheng Ye, Xingdi Yuan, Jing He, Phillip Bachman, Kaheer Suleman

    Abstract: Understanding unstructured text is a major goal within natural language processing. Comprehension tests pose questions based on short text passages to evaluate such understanding. In this work, we investigate machine comprehension on the challenging {\it MCTest} benchmark. Partly because of its limited size, prior work on {\it MCTest} has focused mainly on engineering better features. We tackle th…

    Submitted 29 March, 2016; originally announced March 2016.

    Comments: 9 pages, submitted to ACL

    MSC Class: I.2.7

  23. arXiv:1510.08949

    cs.LG

    Testing Visual Attention in Dynamic Environments

    Authors: Philip Bachman, David Krueger, Doina Precup

    Abstract: We investigate attention as the active pursuit of useful information. This contrasts with attention as a mechanism for the attenuation of irrelevant information. We also consider the role of short-term memory, whose use is critical to any model incapable of simultaneously perceiving all information on which its output depends. We present several simple synthetic tasks, which become considerably mo…

    Submitted 29 October, 2015; originally announced October 2015.

  24. arXiv:1506.03504

    cs.LG stat.ML

    Data Generation as Sequential Decision Making

    Authors: Philip Bachman, Doina Precup

    Abstract: We connect a broad class of generative models through their shared reliance on sequential decision making. Motivated by this view, we develop extensions to an existing model, and then explore the idea further in the context of data imputation -- perhaps the simplest setting in which to investigate the relation between unconditional and conditional generative modelling. We formulate data imputation…

    Submitted 2 November, 2015; v1 submitted 10 June, 2015; originally announced June 2015.

    Comments: Accepted for publication at Advances in Neural Information Processing Systems (NIPS) 2015

  25. arXiv:1412.4864

    stat.ML cs.LG cs.NE

    Learning with Pseudo-Ensembles

    Authors: Philip Bachman, Ouais Alsharif, Doina Precup

    Abstract: We formalize the notion of a pseudo-ensemble, a (possibly infinite) collection of child models spawned from a parent model by perturbing it according to some noise process. E.g., dropout (Hinton et al., 2012) in a deep neural network trains a pseudo-ensemble of child subnetworks generated by randomly masking nodes in the parent network. We present a novel regularizer based on making the behavior o…

    Submitted 15 December, 2014; originally announced December 2014.

    Comments: To appear in Advances in Neural Information Processing Systems 27 (NIPS 2014), Dec. 2014

  26. arXiv:1404.4108

    cs.LG

    Representation as a Service

    Authors: Ouais Alsharif, Philip Bachman, Joelle Pineau

    Abstract: Consider a Machine Learning Service Provider (MLSP) designed to rapidly create highly accurate learners for a never-ending stream of new tasks. The challenge is to produce task-specific learners that can be trained from few labeled samples, even if tasks are not uniquely identified, and the number of tasks and input dimensionality are large. In this paper, we argue that the MLSP should exploit kno…

    Submitted 9 July, 2014; v1 submitted 24 February, 2014; originally announced April 2014.

    Comments: 8 pages

  27. arXiv:1206.6385

    cs.LG stat.ME stat.ML

    Improved Estimation in Time Varying Models

    Authors: Doina Precup, Philip Bachman

    Abstract: Locally adapted parameterizations of a model (such as locally weighted regression) are expressive but often suffer from high variance. We describe an approach for reducing the variance, based on the idea of estimating simultaneously a transformed space for the model, as well as locally adapted parameterizations in this new space. We present a new problem formulation that captures this idea and ill…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)