Skip to main content

Showing 1–10 of 10 results for author: Mirza, M

Searching in archive stat. Search in all archives.
.
  1. arXiv:2010.01298  [pdf, other

    cs.LG cs.AI cs.RO stat.ML

    Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban

    Authors: Peter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber

    Abstract: Intelligent robots need to achieve abstract objectives using concrete, spatiotemporally complex sensory information and motor control. Tabula rasa deep reinforcement learning (RL) has tackled demanding tasks in terms of either visual, abstract, or physical reasoning, but solving these jointly remains a formidable challenge. One recent, unsolved benchmark task that integrates these challenges is Mu… ▽ More

    Submitted 3 October, 2020; originally announced October 2020.

  2. arXiv:1901.03559  [pdf, other

    cs.LG cs.AI stat.ML

    An investigation of model-free planning

    Authors: Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy Lillicrap

    Abstract: The field of reinforcement learning (RL) is facing increasingly challenging domains with combinatorial complexity. For an RL agent to address these challenges, it is essential that it can plan effectively. Prior work has typically utilized an explicit model of the environment, combined with a specific planning algorithm (such as tree search). More recently, a new family of methods have been propos… ▽ More

    Submitted 20 May, 2019; v1 submitted 11 January, 2019; originally announced January 2019.

  3. arXiv:1803.10760  [pdf, other

    cs.LG stat.ML

    Unsupervised Predictive Memory in a Goal-Directed Agent

    Authors: Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap

    Abstract: Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement l… ▽ More

    Submitted 28 March, 2018; originally announced March 2018.

  4. arXiv:1612.03809  [pdf, other

    stat.ML cs.CV cs.LG

    Generalizable Features From Unsupervised Learning

    Authors: Mehdi Mirza, Aaron Courville, Yoshua Bengio

    Abstract: Humans learn a predictive model of the world and use this model to reason about future events and the consequences of actions. In contrast to most machine predictors, we exhibit an impressive ability to generalize to unseen scenarios and reason intelligently in these settings. One important aspect of this ability is physical intuition(Lake et al., 2016). In this work, we explore the potential of u… ▽ More

    Submitted 12 December, 2016; originally announced December 2016.

  5. arXiv:1411.1784  [pdf, other

    cs.LG cs.AI cs.CV stat.ML

    Conditional Generative Adversarial Nets

    Authors: Mehdi Mirza, Simon Osindero

    Abstract: Generative Adversarial Nets [8] were recently introduced as a novel way to train generative models. In this work we introduce the conditional version of generative adversarial nets, which can be constructed by simply feeding the data, y, we wish to condition on to both the generator and discriminator. We show that this model can generate MNIST digits conditioned on class labels. We also illustrate… ▽ More

    Submitted 6 November, 2014; originally announced November 2014.

  6. arXiv:1406.2661  [pdf, other

    stat.ML cs.LG

    Generative Adversarial Networks

    Authors: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio

    Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This fram… ▽ More

    Submitted 10 June, 2014; originally announced June 2014.

  7. arXiv:1312.6211  [pdf, other

    stat.ML cs.LG cs.NE

    An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks

    Authors: Ian J. Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, Yoshua Bengio

    Abstract: Catastrophic forgetting is a problem faced by many machine learning models and algorithms. When trained on one task, then trained on a second task, many machine learning models "forget" how to perform the first task. This is widely believed to be a serious problem for neural networks. Here, we investigate the extent to which the catastrophic forgetting problem occurs for modern neural networks, co… ▽ More

    Submitted 3 March, 2015; v1 submitted 21 December, 2013; originally announced December 2013.

  8. arXiv:1308.4214  [pdf, ps, other

    stat.ML cs.LG cs.MS

    Pylearn2: a machine learning research library

    Authors: Ian J. Goodfellow, David Warde-Farley, Pascal Lamblin, Vincent Dumoulin, Mehdi Mirza, Razvan Pascanu, James Bergstra, Frédéric Bastien, Yoshua Bengio

    Abstract: Pylearn2 is a machine learning research library. This does not just mean that it is a collection of machine learning algorithms that share a common API; it means that it has been designed for flexibility and extensibility in order to facilitate research projects that involve new or unusual use cases. In this paper we give a brief history of the library, an overview of its basic philosophy, a summa… ▽ More

    Submitted 19 August, 2013; originally announced August 2013.

    Comments: 9 pages

  9. arXiv:1307.0414  [pdf, other

    stat.ML cs.LG

    Challenges in Representation Learning: A report on three machine learning contests

    Authors: Ian J. Goodfellow, Dumitru Erhan, Pierre Luc Carrier, Aaron Courville, Mehdi Mirza, Ben Hamner, Will Cukierski, Yichuan Tang, David Thaler, Dong-Hyun Lee, Yingbo Zhou, Chetan Ramaiah, Fangxiang Feng, Ruifan Li, Xiaojie Wang, Dimitris Athanasakis, John Shawe-Taylor, Maxim Milakov, John Park, Radu Ionescu, Marius Popescu, Cristian Grozea, James Bergstra, **g**g Xie, Lukasz Romaszko , et al. (3 additional authors not shown)

    Abstract: The ICML 2013 Workshop on Challenges in Representation Learning focused on three challenges: the black box learning challenge, the facial expression recognition challenge, and the multimodal learning challenge. We describe the datasets created for these challenges and summarize the results of the competitions. We provide suggestions for organizers of future challenges and some comments on what kin… ▽ More

    Submitted 1 July, 2013; originally announced July 2013.

    Comments: 8 pages, 2 figures

  10. arXiv:1302.4389  [pdf, other

    stat.ML cs.LG

    Maxout Networks

    Authors: Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, Yoshua Bengio

    Abstract: We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve the accuracy of dropout's fast approximate model av… ▽ More

    Submitted 20 September, 2013; v1 submitted 18 February, 2013; originally announced February 2013.

    Comments: This is the version of the paper that appears in ICML 2013

    Journal ref: JMLR WCP 28 (3): 1319-1327, 2013