Showing 1–11 of 11 results for author: Harb, J

Searching in archive cs.
  1. arXiv:2310.08710  [pdf, other]

    cs.RO cs.LG

    Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

    Authors: Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John D. Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, Benjamin Sapp

    Abstract: Simulation is an essential tool to develop and benchmark autonomous vehicle planning software in a safe and cost-effective manner. However, realistic simulation requires accurate modeling of nuanced and complex multi-agent interactive behaviors. To address these challenges, we introduce Waymax, a new data-driven simulator for autonomous driving in multi-agent scenes, designed for large-scale simul…

    Submitted 12 October, 2023; originally announced October 2023.

  2. arXiv:2309.02812  [pdf]

    cs.CL

    Agent-based simulation of pedestrians' earthquake evacuation; application to Beirut, Lebanon

    Authors: Rouba Iskandar, Kamel Allaw, Julie Dugdale, Elise Beck, Jocelyne Adjizian-Gérard, Cécile Cornou, Jacques Harb, Pascal Lacroix, Nada Badaro-Saliba, Stéphane Cartier, Rita Zaarour

    Abstract: Most seismic risk assessment methods focus on estimating the damages to the built environment and the consequent socioeconomic losses without fully taking into account the social aspect of risk. Yet, human behaviour is a key element in predicting the human impact of an earthquake, therefore, it is important to include it in quantitative risk assessment studies. In this study, an interdisciplinary…

    Submitted 6 September, 2023; originally announced September 2023.

  3. arXiv:2207.01566  [pdf, other]

    cs.LG stat.ML

    General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States

    Authors: Francesco Faccio, Aditya Ramesh, Vincent Herrmann, Jean Harb, Jürgen Schmidhuber

    Abstract: Learning to evaluate and improve policies is a core problem of Reinforcement Learning (RL). Traditional RL algorithms learn a value function defined for a single policy. A recently explored competitive alternative is to learn a single value function for many policies. Here we combine the actor-critic architecture of Parameter-Based Value Functions and the policy embedding of Policy Evaluation Netw…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Preprint. Under review

  4. arXiv:2002.11833  [pdf, other]

    cs.LG cs.AI stat.ML

    Policy Evaluation Networks

    Authors: Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon

    Abstract: Many reinforcement learning algorithms use value functions to guide the search for better policies. These methods estimate the value of a single policy while generalizing across many states. The core idea of this paper is to flip this convention and estimate the value of many policies, for a single set of states. This approach opens up the possibility of performing direct gradient ascent in policy…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: 12 pages, 11 figures

  5. arXiv:2002.05665  [pdf, other]

    cs.CY cs.CV cs.LG

    FRSign: A Large-Scale Traffic Light Dataset for Autonomous Trains

    Authors: Jeanine Harb, Nicolas Rébéna, Raphaël Chosidow, Grégoire Roblin, Roman Potarusov, Hatem Hajri

    Abstract: In the realm of autonomous transportation, there have been many initiatives for open-sourcing self-driving cars datasets, but much less for alternative methods of transportation such as trains. In this paper, we aim to bridge the gap by introducing FRSign, a large-scale and accurate dataset for vision-based railway traffic light detection and recognition. Our recordings were made on selected runni…

    Submitted 5 February, 2020; originally announced February 2020.

  6. arXiv:1811.07004  [pdf, ps, other]

    cs.AI cs.LG

    The Barbados 2018 List of Open Issues in Continual Learning

    Authors: Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

    Abstract: We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments. The purpose of this report is to sketch a research outline, share some of the most important open issues we are facing, and stimulate further discussion in the community. The content is based on some of our discussions during a week-…

    Submitted 16 November, 2018; originally announced November 2018.

    Comments: NIPS Continual Learning Workshop 2018

  7. arXiv:1712.00004  [pdf, other]

    cs.LG cs.AI

    Learnings Options End-to-End for Continuous Action Tasks

    Authors: Martin Klissarov, Pierre-Luc Bacon, Jean Harb, Doina Precup

    Abstract: We present new results on learning temporally extended actions for continuous tasks, using the options framework (Sutton et al. [1999b], Precup [2000]). In order to achieve this goal we work with the option-critic architecture (Bacon et al. [2017]) using a deliberation cost and train it with proximal policy optimization (Schulman et al. [2017]) instead of vanilla policy gradient. Results on Mujoco domains…

    Submitted 29 November, 2017; originally announced December 2017.

  8. arXiv:1709.04571  [pdf, other]

    cs.AI

    When Waiting is not an Option : Learning Options with a Deliberation Cost

    Authors: Jean Harb, Pierre-Luc Bacon, Martin Klissarov, Doina Precup

    Abstract: Recent work has shown that temporally extended actions (options) can be learned fully end-to-end as opposed to being specified in advance. While the problem of "how" to learn options is increasingly well understood, the question of "what" good options should be has remained elusive. We formulate our answer to what "good" options should be in the bounded rationality framework (Simon, 1957) through…

    Submitted 13 September, 2017; originally announced September 2017.

  9. arXiv:1706.02275  [pdf, other]

    cs.LG cs.AI cs.NE

    Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

    Authors: Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch

    Abstract: We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers ac…

    Submitted 14 March, 2020; v1 submitted 7 June, 2017; originally announced June 2017.

  10. arXiv:1704.05495  [pdf, other]

    cs.AI cs.LG

    Investigating Recurrence and Eligibility Traces in Deep Q-Networks

    Authors: Jean Harb, Doina Precup

    Abstract: Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and highlight a…

    Submitted 18 April, 2017; originally announced April 2017.

    Comments: 8 pages, 3 figures, NIPS 2016 Deep Reinforcement Learning Workshop

  11. arXiv:1609.05140  [pdf, other]

    cs.AI

    The Option-Critic Architecture

    Authors: Pierre-Luc Bacon, Jean Harb, Doina Precup

    Abstract: Temporal abstraction is key to scaling up learning and planning in reinforcement learning. While planning with temporally extended actions is well understood, creating such abstractions autonomously from data has remained challenging. We tackle this problem in the framework of options [Sutton, Precup & Singh, 1999; Precup, 2000]. We derive policy gradient theorems for options and propose a new opt…

    Submitted 2 December, 2016; v1 submitted 16 September, 2016; originally announced September 2016.

    Comments: Accepted to the Thirty-first AAAI Conference On Artificial Intelligence (AAAI), 2017