Showing 1–13 of 13 results for author: Chevalier-Boisvert, M

Searching in archive cs.
  1. arXiv:2306.13831  [pdf, other]

    cs.LG

    Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

    Authors: Maxime Chevalier-Boisvert, Bolun Dai, Mark Towers, Rodrigo de Lazcano, Lucas Willems, Salem Lahlou, Suman Pal, Pablo Samuel Castro, Jordan Terry

    Abstract: We present the Minigrid and Miniworld libraries which provide a suite of goal-oriented 2D and 3D environments. The libraries were explicitly created with a minimalistic design paradigm to allow users to rapidly develop new environments for a wide range of research-specific needs. As a result, both have received widescale adoption by the RL community, facilitating research in a wide range of areas.…

    Submitted 23 June, 2023; originally announced June 2023.

  2. arXiv:2008.04391  [pdf, other]

    cs.LG stat.ML

    DeepDrummer : Generating Drum Loops using Deep Learning and a Human in the Loop

    Authors: Guillaume Alain, Maxime Chevalier-Boisvert, Frederic Osterrath, Remi Piche-Taillefer

    Abstract: DeepDrummer is a drum loop generation tool that uses active learning to learn the preferences (or current artistic intentions) of a human user from a small number of interactions. The principal goal of this tool is to enable an efficient exploration of new musical ideas. We train a deep neural network classifier on audio data and show how it can be used as the core component of a system that gener…

    Submitted 26 August, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

  3. arXiv:2007.12770  [pdf, other]

    cs.AI cs.CL cs.LG

    BabyAI 1.1

    Authors: David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

    Abstract: The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 presents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent's architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning…

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: 9 pages, 1 figure, technical report

  4. arXiv:2002.00412  [pdf, other]

    cs.LG cs.AI stat.ML

    Combating False Negatives in Adversarial Imitation Learning

    Authors: Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

    Abstract: In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Despite the fact that the task is successfully accomplished in some of the agent's t…

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: This is an extended version of the student abstract published at 34th AAAI Conference on Artificial Intelligence

  5. arXiv:2001.00271  [pdf, other]

    cs.LG cs.AI stat.ML

    Options of Interest: Temporal Abstraction with Interest Functions

    Authors: Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup

    Abstract: Temporal abstraction refers to the ability of an agent to use behaviours of controllers which act for a limited, variable amount of time. The options framework describes such behaviours as consisting of a subset of states in which they can initiate, an internal policy and a stochastic termination condition. However, much of the subsequent work on option discovery has ignored the initiation set, be…

    Submitted 1 January, 2020; originally announced January 2020.

    Comments: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

  6. arXiv:1912.00444  [pdf, other]

    cs.LG cs.AI stat.ML

    Automated curriculum generation for Policy Gradients from Demonstrations

    Authors: Anirudh Srinivasan, Dzmitry Bahdanau, Maxime Chevalier-Boisvert, Yoshua Bengio

    Abstract: In this paper, we present a technique that improves the process of training an agent (using RL) for instruction following. We develop a training curriculum that uses a nominal number of expert demonstrations and trains the agent in a manner that draws parallels from one of the ways in which humans learn to perform complex tasks, i.e., by starting from the goal and working backwards. We test our meth…

    Submitted 1 December, 2019; originally announced December 2019.

    Comments: Accepted to Deep RL Workshop at NeurIPS 2019

  7. arXiv:1911.12825  [pdf, other]

    cs.AI cs.MA eess.SY math.OC

    Option-Critic in Cooperative Multi-agent Systems

    Authors: Jhelum Chakravorty, Nadeem Ward, Julien Roy, Maxime Chevalier-Boisvert, Sumana Basu, Andrei Lupu, Doina Precup

    Abstract: In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the multi-agent system, by introducing a \emph{common information approach}. We use the notion of \emph{common beliefs} and broadcasting to solve an equivalent centralized…

    Submitted 19 March, 2020; v1 submitted 28 November, 2019; originally announced November 2019.

  8. arXiv:1911.03594  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Robo-PlaNet: Learning to Poke in a Day

    Authors: Maxime Chevalier-Boisvert, Guillaume Alain, Florian Golemo, Derek Nowrouzezahrai

    Abstract: Recently, the Deep Planning Network (PlaNet) approach was introduced as a model-based reinforcement learning method that learns environment dynamics directly from pixel observations. This architecture is useful for learning tasks in which either the agent does not have access to meaningful states (like position/velocity of robotic joints) or where the observed states significantly deviate from the…

    Submitted 19 November, 2019; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: 4 pages, 3 figures. Version 2: added reference and acknowledgement

  9. arXiv:1810.08272  [pdf, other]

    cs.AI cs.CL

    BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning

    Authors: Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio

    Abstract: Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded languag…

    Submitted 19 December, 2019; v1 submitted 18 October, 2018; originally announced October 2018.

    Comments: Accepted at ICLR 2019

  10. arXiv:1511.02956  [pdf, other]

    cs.PL

    Interprocedural Type Specialization of JavaScript Programs Without Type Analysis

    Authors: Maxime Chevalier-Boisvert, Marc Feeley

    Abstract: Dynamically typed programming languages such as Python and JavaScript defer type checking to run time. VM implementations can improve performance by eliminating redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly complex multi-tiered VM architectures.…

    Submitted 9 November, 2015; originally announced November 2015.

    Comments: 10 pages, 10 figures, submitted to CGO 2016

    ACM Class: D.3.4

  11. arXiv:1507.02437  [pdf, other]

    cs.PL

    Extending Basic Block Versioning with Typed Object Shapes

    Authors: Maxime Chevalier-Boisvert, Marc Feeley

    Abstract: Typical JavaScript (JS) programs feature a large number of object property accesses. Hence, fast property reads and writes are crucial for good performance. Unfortunately, many (often redundant) dynamic checks are implied in each property access and the semantic complexity of JS makes it difficult to optimize away these tests through program analysis. We introduce two techniques to effectively eli…

    Submitted 9 July, 2015; originally announced July 2015.

    ACM Class: D.3.4

  12. arXiv:1411.0352  [pdf, other]

    cs.PL

    Simple and Effective Type Check Removal through Lazy Basic Block Versioning

    Authors: Maxime Chevalier-Boisvert, Marc Feeley

    Abstract: Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly co…

    Submitted 29 May, 2015; v1 submitted 2 November, 2014; originally announced November 2014.

    ACM Class: D.3.4

  13. arXiv:1401.3041  [pdf, ps, other]

    cs.PL cs.PF

    Removing Dynamic Type Tests with Context-Driven Basic Block Versioning

    Authors: Maxime Chevalier-Boisvert, Marc Feeley

    Abstract: Dynamic typing is an important feature of dynamic programming languages. Primitive operators such as those for performing arithmetic and comparisons typically operate on a wide variety of input value types, and as such, must internally implement some form of dynamic type dispatch and type checking. Removing such type tests is important for an efficient implementation. In this paper, we examine…

    Submitted 13 January, 2014; originally announced January 2014.

    Comments: 22 pages, 10 figures