Showing 1–6 of 6 results for author: Damani, M

  1. arXiv:2307.15217  [pdf, other]

    cs.AI cs.CL cs.LG

    Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

    Authors: Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, et al. (7 additional authors not shown)

    Abstract: Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and rel…

    Submitted 11 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

  2. arXiv:2305.16145  [pdf, other]

    cs.LG

    SocialLight: Distributed Cooperation Learning towards Network-Wide Traffic Signal Control

    Authors: Harsh Goel, Yifeng Zhang, Mehul Damani, Guillaume Sartoretti

    Abstract: Many recent works have turned to multi-agent reinforcement learning (MARL) for adaptive traffic signal control to optimize the travel time of vehicles over large urban networks. However, achieving effective and scalable cooperation among junctions (agents) remains an open challenge, as existing methods often rely on extensive, non-generalizable reward shaping or on non-scalable centralized learnin…

    Submitted 20 April, 2023; originally announced May 2023.

    Comments: To appear in the International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

  3. arXiv:2208.10469  [pdf, other]

    cs.AI cs.GT cs.MA econ.TH

    Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL

    Authors: Andreas A. Haupt, Phillip J. K. Christoffersen, Mehul Damani, Dylan Hadfield-Menell

    Abstract: Multi-agent Reinforcement Learning (MARL) is a powerful tool for training autonomous agents acting independently in a common environment. However, it can lead to sub-optimal behavior when individual incentives and group incentives diverge. Humans are remarkably capable at solving these social dilemmas. It is an open problem in MARL to replicate such cooperative behaviors in selfish agents. In this…

    Submitted 29 January, 2024; v1 submitted 22 August, 2022; originally announced August 2022.

  4. arXiv:2204.03516  [pdf, other]

    cs.RO cs.AI cs.LG cs.MA

    Distributed Reinforcement Learning for Robot Teams: A Review

    Authors: Yutong Wang, Mehul Damani, Pamela Wang, Yuhong Cao, Guillaume Sartoretti

    Abstract: Purpose of review: Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems consisting of hundreds/thousands of robots, with promising applications to automated manufacturing, disaster relief, harvesting, last-mile delivery, port/airport operations, or search and rescue. The community has leveraged model-free multi-agent reinforcement learning (MARL) to de…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: Preprint of the paper submitted to Springer's Current Robotics Reports

  5. arXiv:2103.16511  [pdf, other]

    cs.AI cs.LG

    Flatland Competition 2020: MAPF and MARL for Efficient Train Coordination on a Grid World

    Authors: Florian Laurent, Manuel Schneider, Christian Scheller, Jeremy Watson, Jiaoyang Li, Zhe Chen, Yi Zheng, Shao-Hung Chan, Konstantin Makhnev, Oleg Svidchenko, Vladimir Egorov, Dmitry Ivanov, Aleksei Shpilman, Evgenija Spirovska, Oliver Tanevski, Aleksandar Nikov, Ramon Grunder, David Galevski, Jakov Mitrovski, Guillaume Sartoretti, Zhiyao Luo, Mehul Damani, Nilabha Bhattacharya, Shivam Agarwal, Adrian Egli, et al. (2 additional authors not shown)

    Abstract: The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operations research (OR) for decades, the ever-growing com…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: 28 pages, 8 figures

  6. PRIMAL2: Pathfinding via Reinforcement and Imitation Multi-Agent Learning -- Lifelong

    Authors: Mehul Damani, Zhiyao Luo, Emerson Wenzel, Guillaume Sartoretti

    Abstract: Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) - an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one - in dense and highly structured environments, typical…

    Submitted 4 March, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: © 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.