
Showing 1–50 of 61 results for author: Taylor, M E

  1. arXiv:2406.06495  [pdf, other]

    cs.LG

    Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity

    Authors: Calarina Muslimani, Bram Grooten, Deepak Ranganatha Sastry Mamillapalli, Mykola Pechenizkiy, Decebal Constantin Mocanu, Matthew E. Taylor

    Abstract: For autonomous agents to successfully integrate into human-centered environments, agents should be able to learn from and adapt to humans in their native settings. Preference-based reinforcement learning (PbRL) is a promising approach that learns reward functions from human preferences. This enables RL agents to adapt their behavior based on human desires. However, humans live in a world full of d…

    Submitted 10 June, 2024; originally announced June 2024.

  2. arXiv:2405.00746  [pdf, other]

    cs.LG cs.AI cs.RO

    Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning

    Authors: Calarina Muslimani, Matthew E. Taylor

    Abstract: To create useful reinforcement learning (RL) agents, step zero is to design a suitable reward function that captures the nuances of the task. However, reward engineering can be a difficult and time-consuming process. Instead, human-in-the-loop (HitL) RL allows agents to learn reward functions from human feedback. Despite recent successes, many of the HitL RL methods still require numerous human in…

    Submitted 30 April, 2024; originally announced May 2024.

  3. arXiv:2404.13061  [pdf, other]

    cs.AR cs.AI cs.LG

    FPGA Divide-and-Conquer Placement using Deep Reinforcement Learning

    Authors: Shang Wang, Deepak Ranganatha Sastry Mamillapalli, Tianpei Yang, Matthew E. Taylor

    Abstract: This paper introduces the problem of learning to place logic blocks in Field-Programmable Gate Arrays (FPGAs) and a learning-based method. In contrast to previous search-based placement algorithms, we instead employ Reinforcement Learning (RL) with the goal of minimizing wirelength. In addition to our preliminary learning results, we also evaluated a novel decomposition to address the nature of la…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: accepted by ISEDA2024

  4. arXiv:2402.06819  [pdf, other]

    cs.LG

    Monitored Markov Decision Processes

    Authors: Simone Parisi, Montaser Mohammedalamen, Alireza Kazemipour, Matthew E. Taylor, Michael Bowling

    Abstract: In reinforcement learning (RL), an agent learns to perform a task by interacting with an environment and receiving feedback (a numerical reward) for its actions. However, the assumption that rewards are always observable is often not applicable in real-world problems. For example, the agent may need to ask a human to supervise its actions or activate a monitoring system to receive feedback. There…

    Submitted 13 February, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: AAMAS 2024, Main Track

  5. arXiv:2401.02991  [pdf, other]

    cs.CL cs.AI cs.LG

    GLIDE-RL: Grounded Language Instruction through DEmonstration in RL

    Authors: Chaitanya Kharyal, Sai Krishna Gottipati, Tanmay Kumar Sinha, Srijita Das, Matthew E. Taylor

    Abstract: One of the final frontiers in the development of complex human - AI collaborative systems is the ability of AI agents to comprehend the natural language and perform tasks accordingly. However, training efficient Reinforcement Learning (RL) agents grounded in natural language has been a long-standing challenge due to the complexity and ambiguity of the language and sparsity of the rewards, among ot…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 12 pages, 6 figures, to be presented at AAMAS 2024

  6. arXiv:2401.00907  [pdf, other]

    cs.LG cs.AI cs.CL

    LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models

    Authors: Qianxi Li, Yingyue Cao, Jikun Kang, Tianpei Yang, Xi Chen, Jun Jin, Matthew E. Taylor

    Abstract: Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance. Supervised Fine-Tuning (SFT) is a common approach, where an LLM is trained to produce desired answers. However, LLMs trained with SFT sometimes make simple mistakes and result in hallucinations on reasoning tasks such as question-answering. Without extern…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: Paper accepted in Human-Centric Representation Learning workshop at AAAI 2024 (https://hcrl-workshop.github.io/2024/)

  7. arXiv:2312.15339  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning

    Authors: Bram Grooten, Tristan Tomilin, Gautham Vasan, Matthew E. Taylor, A. Rupam Mahmood, Meng Fang, Mykola Pechenizkiy, Decebal Constantin Mocanu

    Abstract: The visual world provides an abundance of information, but many input pixels received by agents often contain distracting stimuli. Autonomous agents need the ability to distinguish useful information from task-irrelevant perceptions, enabling them to generalize to unseen environments with new distractions. Existing works approach this problem using data augmentation or large auxiliary networks wit…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted as full-paper (oral) at AAMAS 2024. Code is available at https://github.com/bramgrooten/mask-distractions and see our 40-second video at https://youtu.be/2oImF0h1k48

  8. arXiv:2312.11768  [pdf, other]

    cs.AI cs.LG cs.MA

    Curriculum Learning for Cooperation in Multi-Agent Reinforcement Learning

    Authors: Rupali Bhati, Sai Krishna Gottipati, Clodéric Mars, Matthew E. Taylor

    Abstract: While there has been significant progress in curriculum learning and continuous learning for training agents to generalize across a wide variety of environments in the context of single-agent reinforcement learning, it is unclear if these algorithms would still be valid in a multi-agent setting. In a competitive setting, a learning agent can be trained by making it compete with a curriculum of inc…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 9 pages, 5 figures. Presented at Agent Learning in Open-Endedness Workshop at Neural Information Processing Systems (NeurIPS 2023)

  9. arXiv:2312.11718  [pdf, other]

    cs.AI cs.HC cs.LG cs.MA stat.AP

    Human-Machine Teaming for UAVs: An Experimentation Platform

    Authors: Laila El Moujtahid, Sai Krishna Gottipati, Clodéric Mars, Matthew E. Taylor

    Abstract: Full automation is often not achievable or desirable in critical systems with high-stakes decisions. Instead, human-AI teams can achieve better results. To research, develop, evaluate, and validate algorithms suited for such teaming, lightweight experimentation platforms that enable interactions between humans and multiple AI agents are necessary. However, there are limited examples of such platfo…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 9 pages, 6 figures Presented at Conference on Artificial Intelligence for Defense (CAID) 2023

  10. arXiv:2311.00810  [pdf, other]

    cs.CY cs.CV cs.HC

    A Call to Arms: AI Should be Critical for Social Media Analysis of Conflict Zones

    Authors: Afia Abedin, Abdul Bais, Cody Buntain, Laura Courchesne, Brian McQuinn, Matthew E. Taylor, Muhib Ullah

    Abstract: The massive proliferation of social media data represents a transformative moment in conflict studies. This data can provide unique insights into the spread and use of weaponry, but the scale and types of data are problematic for traditional open-source intelligence. This paper presents preliminary, transdisciplinary work using computer vision to identify specific weapon systems and the insignias…

    Submitted 1 November, 2023; originally announced November 2023.

  11. arXiv:2307.05603  [pdf, other]

    cs.SE cs.LG cs.PL

    Can You Improve My Code? Optimizing Programs with Local Search

    Authors: Fatemeh Abdollahi, Saqib Ameen, Matthew E. Taylor, Levi H. S. Lelis

    Abstract: This paper introduces a local search method for improving an existing program with respect to a measurable objective. Program Optimization with Locally Improving Search (POLIS) exploits the structure of a program, defined by its lines. POLIS improves a single line of the program while keeping the remaining lines fixed, using existing brute-force synthesis algorithms, and continues iterating until…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI) 2023

  12. arXiv:2303.06121  [pdf, other]

    cs.LG cs.AI

    Ignorance is Bliss: Robust Control via Information Gating

    Authors: Manan Tomar, Riashat Islam, Matthew E. Taylor, Sergey Levine, Philip Bachman

    Abstract: Informational parsimony provides a useful inductive bias for learning representations that achieve better generalization by being robust to noise and spurious correlations. We propose \textit{information gating} as a way to learn parsimonious representations that identify the minimal information required for a task. When gating information, we can learn to reveal as little information as possible…

    Submitted 8 December, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  13. arXiv:2302.06548  [pdf, other]

    cs.LG cs.AI

    Automatic Noise Filtering with Dynamic Sparse Training in Deep Reinforcement Learning

    Authors: Bram Grooten, Ghada Sokar, Shibhansh Dohare, Elena Mocanu, Matthew E. Taylor, Mykola Pechenizkiy, Decebal Constantin Mocanu

    Abstract: Tomorrow's robots will need to distinguish useful information from noise when performing different tasks. A household robot for instance may continuously receive a plethora of information about the home, but needs to focus on just a small subset to successfully execute its current chore. Filtering distracting inputs that contain irrelevant data has received little attention in the reinforcement le…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: Accepted as full-paper at AAMAS 2023

  14. arXiv:2301.11153  [pdf, other]

    cs.LG cs.AI cs.MA

    Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Kate Larson, Mark Crowley

    Abstract: Multi-agent reinforcement learning typically suffers from the problem of sample inefficiency, where learning suitable policies involves the use of many data samples. Learning from external demonstrators is a possible solution that mitigates this problem. However, most prior approaches in this area assume the presence of a single demonstrator. Leveraging multiple knowledge sources (i.e., advisors)…

    Submitted 2 March, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: Paper to appear in AAMAS 2023, London, UK

  15. arXiv:2212.08302  [pdf, other]

    cs.LG cs.AI

    Safe Evaluation For Offline Learning: Are We Ready To Deploy?

    Authors: Hager Radi, Josiah P. Hanna, Peter Stone, Matthew E. Taylor

    Abstract: The world currently offers an abundance of data in multiple domains, from which we can learn reinforcement learning (RL) policies without further interaction with the environment. RL agents learning offline from such data is possible but deploying them while learning might be dangerous in domains where safety is critical. Therefore, it is essential to find a way to estimate how a newly-learned age…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2021 Workshop on Deployable Decision Making in Embodied Systems [Spotlight]

  16. arXiv:2211.07545  [pdf, ps, other]

    cs.RO cs.CV cs.LG

    NeurIPS 2022 Competition: Driving SMARTS

    Authors: Amir Rasouli, Randy Goebel, Matthew E. Taylor, Iuliia Kotseruba, Soheil Alizadeh, Tianpei Yang, Montgomery Alban, Florian Shkurti, Yuzheng Zhuang, Adam Scibior, Kasra Rezaee, Animesh Garg, David Meger, Jun Luo, Liam Paull, Weinan Zhang, Xinyu Wang, Xi Chen

    Abstract: Driving SMARTS is a regular competition designed to tackle problems caused by the distribution shift in dynamic interaction contexts that are prevalent in real-world autonomous driving (AD). The proposed competition supports methodologically diverse solutions, such as reinforcement learning (RL) and offline learning methods, trained on a combination of naturalistic AD data and open-source simulati…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: 10 pages, 8 figures

  17. arXiv:2210.06683  [pdf, other]

    cs.LG cs.HC

    Augmenting Flight Training with AI to Efficiently Train Pilots

    Authors: Michael Guevarra, Srijita Das, Christabel Wayllace, Carrie Demmans Epp, Matthew E. Taylor, Alan Tay

    Abstract: We propose an AI-based pilot trainer to help students learn how to fly aircraft. First, an AI agent uses behavioral cloning to learn flying maneuvers from qualified flight instructors. Later, the system uses the agent's decisions to detect errors made by students and provide feedback to help students correct their errors. This paper presents an instantiation of the pilot trainer. We focus on teach…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: 3 pages, 3 figures, submitted to AAAI-23 Demonstration Program

  18. arXiv:2205.04489  [pdf, ps, other]

    math.AP math.CA math.SP

    Product Manifolds with Improved Spectral Cluster and Weyl Remainder Estimates

    Authors: Xiaoqi Huang, Christopher D. Sogge, Michael E. Taylor

    Abstract: We show that if $Y$ is a compact Riemannian manifold with improved $L^q$ eigenfunction estimates then, at least for large enough exponents, one always obtains improved $L^q$ bounds on the product manifold $X\times Y$ if $X$ is another compact manifold. Similarly, improved Weyl remainder term bounds on the spectral counting function of $Y$ lead to corresponding improvements on $X\times Y$. The latt…

    Submitted 9 May, 2022; originally announced May 2022.

    Comments: 22 pages

    MSC Class: 58J50; 35P15

  19. arXiv:2204.11897  [pdf, other]

    cs.LG

    Reinforcement Teaching

    Authors: Alex Lewandowski, Calarina Muslimani, Dale Schuurmans, Matthew E. Taylor, Jun Luo

    Abstract: Meta-learning strives to learn about and improve a student's machine learning algorithm. However, existing meta-learning methods either only work with differentiable algorithms or are hand-crafted to improve one specific component of an algorithm. We develop a unifying meta-learning framework, called Reinforcement Teaching, to improve the learning process of any algorithm. Under Reinforcement Teac…

    Submitted 22 May, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: First two authors contributed equally

  20. arXiv:2204.07254  [pdf, other]

    cs.LG cs.AI cs.MA

    Methodical Advice Collection and Reuse in Deep Reinforcement Learning

    Authors: Sahir, Ercüment İlhan, Srijita Das, Matthew E. Taylor

    Abstract: Reinforcement learning (RL) has shown great success in solving many challenging tasks via use of deep neural networks. Although using deep learning for RL brings immense representational power, it also causes a well-known sample-inefficiency problem. This means that the algorithms are data-hungry and require millions of training samples to converge to an adequate policy. One way to combat this iss…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: To be published in ALA2022: Adaptive and Learning Agents Workshop 2022 at AAMAS

  21. arXiv:2203.08553  [pdf, other]

    cs.MA cs.AI

    PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

    Authors: Pengyi Li, Hongyao Tang, Tianpei Yang, Xiaotian Hao, Tong Sang, Yan Zheng, Jianye Hao, Matthew E. Taylor, Wenyuan Tao, Zhen Wang, Fazl Barez

    Abstract: Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal sub-optimal collaborative behaviors also emerge with strong correlations, and simply maximizing the MI can, surprisingly, hinder t…

    Submitted 21 February, 2023; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: The paper has been accepted by The Thirty-ninth International Conference on Machine Learning (ICML 2022) and the Cooperative AI Workshop at 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  22. arXiv:2112.09099  [pdf, other]

    cs.MA

    Decentralized Mean Field Games

    Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Mark Crowley, Pascal Poupart

    Abstract: Multiagent reinforcement learning algorithms have not been widely adopted in large scale environments with many agents as they often scale poorly with the number of agents. Using mean field theory to aggregate agents has been proposed as a solution to this problem. However, almost all previous methods in this area make a strong assumption of a centralized system where all the agents in the environ…

    Submitted 13 April, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: This work is to appear in AAAI-22. Recent version has minor formatting changes and some typos corrected

  23. arXiv:2111.07775  [pdf, other]

    cs.LG cs.AI cs.CV

    Learning Representations for Pixel-based Control: What Matters and Why?

    Authors: Manan Tomar, Utkarsh A. Mishra, Amy Zhang, Matthew E. Taylor

    Abstract: Learning representations for pixel-based control has garnered significant attention recently in reinforcement learning. A wide range of methods have been proposed to enable efficient learning, leading to sample complexities similar to those in the full state setting. However, moving beyond carefully curated pixel data sets (centered crop, appropriate lighting, clear background, etc.) remains chall…

    Submitted 15 November, 2021; originally announced November 2021.

  24. arXiv:2111.00345  [pdf, other]

    cs.AI cs.MA

    Multi-Agent Advisor Q-Learning

    Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Kate Larson, Mark Crowley

    Abstract: In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable policies, that need to be overcome before wide-spread deployment is possible. However, many real-world environments already, in practice, deploy sub-optimal or heuristic approaches for generating…

    Submitted 1 March, 2023; v1 submitted 25 October, 2021; originally announced November 2021.

    Comments: Paper has been accepted to Journal of Artificial Intelligence Research (JAIR). Please refer to https://jair.org/index.php/jair/article/view/13445 for JAIR version. The most recent version includes two illustrative figures that pictorially describes the settings of the two algorithms (i.e., ADMIRAL-DM and ADMIRAL-AE)

  25. arXiv:2104.04893  [pdf, other]

    cs.LG cs.AI eess.SY

    The Atari Data Scraper

    Authors: Brittany Davis Pierson, Justine Ventura, Matthew E. Taylor

    Abstract: Reinforcement learning has made great strides in recent years due to the success of methods using deep neural networks. However, such neural networks act as a black box, obscuring the inner workings. While reinforcement learning has the potential to solve unique problems, a lack of trust and understanding of reinforcement learning algorithms could prevent their widespread adoption. Here, we presen…

    Submitted 10 April, 2021; originally announced April 2021.

    Comments: 3 authors, nine pages, 6 figures, papers with code

  26. arXiv:2103.04416  [pdf, other]

    cs.LG

    The Effect of Q-function Reuse on the Total Regret of Tabular, Model-Free, Reinforcement Learning

    Authors: Volodymyr Tkachuk, Sriram Ganapathi Subramanian, Matthew E. Taylor

    Abstract: Some reinforcement learning methods suffer from high sample complexity causing them to not be practical in real-world situations. $Q$-function reuse, a transfer learning method, is one way to reduce the sample complexity of learning, potentially improving usefulness of existing algorithms. Prior work has shown the empirical effectiveness of $Q$-function reuse for various environments when applied…

    Submitted 7 March, 2021; originally announced March 2021.

    Comments: 7 pages, 2 figures, submitted to ALA 2021

  27. arXiv:2102.09850  [pdf, other]

    cs.LG cs.AI cs.RO

    Model-Invariant State Abstractions for Model-Based Reinforcement Learning

    Authors: Manan Tomar, Amy Zhang, Roberto Calandra, Matthew E. Taylor, Joelle Pineau

    Abstract: Accuracy and generalization of dynamics models is key to the success of model-based reinforcement learning (MBRL). As the complexity of tasks increases, so does the sample inefficiency of learning accurate dynamics models. However, many complex tasks also exhibit sparsity in the dynamics, i.e., actions have only a local effect on the system dynamics. In this paper, we exploit this property with a…

    Submitted 7 June, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

  28. arXiv:2102.07659  [pdf, other]

    cs.AI cs.MA

    Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems

    Authors: Yaodong Yang, Jun Luo, Ying Wen, Oliver Slumbers, Daniel Graves, Haitham Bou Ammar, Jun Wang, Matthew E. Taylor

    Abstract: Multiagent reinforcement learning (MARL) has achieved a remarkable amount of success in solving various types of video games. A cornerstone of this success is the auto-curriculum framework, which shapes the learning process by continually creating new challenging tasks for agents to adapt to, thereby facilitating the acquisition of new skills. In order to extend MARL methods to real-world domains…

    Submitted 16 February, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: AAMAS 2021

  29. arXiv:2102.02639  [pdf, other]

    cs.LG cs.AI cs.HC

    Improving Reinforcement Learning with Human Assistance: An Argument for Human Subject Studies with HIPPO Gym

    Authors: Matthew E. Taylor, Nicholas Nissen, Yuan Wang, Neda Navidi

    Abstract: Reinforcement learning (RL) is a popular machine learning paradigm for game playing, robotics control, and other sequential decision tasks. However, RL agents often have long learning times with high data requirements because they begin by acting randomly. In order to better learn in complex tasks, this article argues that an external teacher can often significantly help the RL agent learn. Open…

    Submitted 2 February, 2021; originally announced February 2021.

  30. arXiv:2102.00824  [pdf, other]

    cs.MA cs.AI cs.LG

    HAMMER: Multi-Level Coordination of Reinforcement Learning Agents via Learned Messaging

    Authors: Nikunj Gupta, G Srinivasaraghavan, Swarup Kumar Mohalik, Nishant Kumar, Matthew E. Taylor

    Abstract: Cooperative multi-agent reinforcement learning (MARL) has achieved significant results, most notably by leveraging the representation-learning abilities of deep neural networks. However, large centralized approaches quickly become infeasible as the number of agents scale, and fully decentralized approaches can miss important opportunities for information sharing and coordination. Furthermore, not…

    Submitted 2 December, 2022; v1 submitted 18 January, 2021; originally announced February 2021.

  31. arXiv:2012.15791  [pdf, other]

    cs.MA

    Partially Observable Mean Field Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Mark Crowley, Pascal Poupart

    Abstract: Traditional multi-agent reinforcement learning algorithms are not scalable to environments with more than a few agents, since these algorithms are exponential in the number of agents. Recent research has introduced successful methods to scale multi-agent reinforcement learning algorithms to many agent scenarios using mean field theory. Previous work in this field assumes that an agent has access t…

    Submitted 24 January, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Paper to be published in International Conference on Autonomous Agents and Multiagent Systems (AAMAS) - 2021. New version has some typos corrected

  32. arXiv:2011.01297  [pdf, other]

    cs.LG cs.AI

    Useful Policy Invariant Shaping from Arbitrary Advice

    Authors: Paniz Behboudian, Yash Satsangi, Matthew E. Taylor, Anna Harutyunyan, Michael Bowling

    Abstract: Reinforcement learning is a powerful learning paradigm in which agents can learn to maximize sparse and delayed reward signals. Although RL has had many impressive successes in complex domains, learning can take hours, days, or even years of training data. A major challenge of contemporary RL research is to discover how to learn with less data. Previous work has shown that domain information can b…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: 9 pages, 6 figures, Adaptive and Learning Agents (ALA) 2020 Workshop

  33. arXiv:2010.03744  [pdf, other]

    cs.LG cs.AI stat.ML

    Maximum Reward Formulation In Reinforcement Learning

    Authors: Sai Krishna Gottipati, Yashaswi Pathak, Rohan Nuttall, Sahir, Raviteja Chunduru, Ahmed Touati, Sriram Ganapathi Subramanian, Matthew E. Taylor, Sarath Chandar

    Abstract: Reinforcement learning (RL) algorithms typically deal with maximizing the expected cumulative return (discounted or undiscounted, finite or infinite horizon). However, several crucial applications in the real world, such as drug discovery, do not fit within this framework because an RL agent only needs to identify states (molecules) that achieve the highest reward within a trajectory and does not…

    Submitted 18 December, 2023; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: 14 pages, 5 figures Update based on reviewer feedback

  34. arXiv:2009.13736  [pdf, other]

    cs.LG cs.AI stat.ML

    Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy

    Authors: Yunshu Du, Garrett Warnell, Assefaw Gebremedhin, Peter Stone, Matthew E. Taylor

    Abstract: Experience replay (ER) improves the data efficiency of off-policy reinforcement learning (RL) algorithms by allowing an agent to store and reuse its past experiences in a replay buffer. While many techniques have been proposed to enhance ER by biasing how experiences are sampled from the buffer, thus far they have not considered strategies for refreshing experiences inside the buffer. In this work…

    Submitted 3 April, 2021; v1 submitted 28 September, 2020; originally announced September 2020.

    Comments: 29 pages (with appendices), 8 figures, preprint

  35. A Conceptual Framework for Externally-influenced Agents: An Assisted Reinforcement Learning Review

    Authors: Adam Bignold, Francisco Cruz, Matthew E. Taylor, Tim Brys, Richard Dazeley, Peter Vamplew, Cameron Foale

    Abstract: A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, while reviewing externally-influenced methods, we propose…

    Submitted 19 September, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: 33 pages, 9 figures

  36. arXiv:2004.00600  [pdf, other]

    cs.LG cs.AI

    Work in Progress: Temporally Extended Auxiliary Tasks

    Authors: Craig Sherstan, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

    Abstract: Predictive auxiliary tasks have been shown to improve performance in numerous reinforcement learning works, however, this effect is still not well understood. The primary purpose of the work presented here is to investigate the impact that an auxiliary task's prediction timescale has on the agent's policy performance. We consider auxiliary tasks which learn to make on-policy predictions using temp…

    Submitted 16 April, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

    Comments: Accepted for the Adaptive and Learning Agents (ALA) Workshop at AAMAS 2020

  37. arXiv:2003.04960  [pdf, other]

    cs.LG cs.AI stat.ML

    Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

    Authors: Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone

    Abstract: Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning ha…

    Submitted 17 September, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

    Journal ref: Journal of Machine Learning Research 21(181):1-50, 2020

  38. arXiv:2002.02513  [pdf, other]

    cs.MA cs.AI cs.LG

    Multi Type Mean Field Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Pascal Poupart, Matthew E. Taylor, Nidhi Hegde

    Abstract: Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents that can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field reinforcement learning, which is that all agents in the environment are pla…

    Submitted 21 June, 2022; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: The paper appears in the proceedings of International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2020. Revised version has some typos corrected

  39. arXiv:1907.11788  [pdf, other]

    cs.LG cs.AI stat.ML

    On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman

    Authors: Chao Gao, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

    Abstract: How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently-proposed multi-agent benchmark of Pommerman. This domain is very challenging for RL --- past work has shown that model-free RL algorithms fail to achieve significant learning without artificially reducing the env…

    Submitted 26 July, 2019; originally announced July 2019.

    Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2019

  40. arXiv:1907.11703  [pdf, other]

    cs.LG cs.MA stat.ML

    Action Guidance with MCTS for Deep Reinforcement Learning

    Authors: Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

    Abstract: Deep reinforcement learning has achieved great successes in recent years, however, one main challenge is the sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently-proposed multi-agent benchmark of Pommerman. We propose a new framewor…

    Submitted 25 July, 2019; originally announced July 2019.

    Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: substantial text overlap with arXiv:1904.05759, arXiv:1812.00045

  41. arXiv:1907.10827  [pdf, other]

    cs.LG cs.MA stat.ML

    Terminal Prediction as an Auxiliary Task for Deep Reinforcement Learning

    Authors: Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

    Abstract: Deep reinforcement learning has achieved great successes in recent years, but there are still open challenges, such as convergence to locally optimal policies and sample inefficiency. In this paper, we contribute a novel self-supervised auxiliary task, i.e., Terminal Prediction (TP), estimating temporal closeness to terminal states for episodic tasks. The intuition is to help representation learni…

    Submitted 24 July, 2019; originally announced July 2019.

    Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: text overlap with arXiv:1812.00045

  42. arXiv:1907.09597  [pdf, other]

    cs.MA cs.LG

    Agent Modeling as Auxiliary Task for Deep Reinforcement Learning

    Authors: Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

    Abstract: In this paper we explore how actor-critic methods in deep reinforcement learning, in particular Asynchronous Advantage Actor-Critic (A3C), can be extended with agent modeling. Inspired by recent works on representation learning and multiagent deep reinforcement learning, we propose two architectures to perform agent modeling: the first one based on parameter sharing, and the second one based on ag…

    Submitted 22 July, 2019; originally announced July 2019.

    Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19)

  43. arXiv:1907.08478  [pdf, other]

    cs.AI cs.HC

    Interactive Learning of Environment Dynamics for Sequential Tasks

    Authors: Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts

    Abstract: In order for robots and other artificial agents to efficiently learn to perform useful tasks defined by an end user, they must understand not only the goals of those tasks, but also the structure and dynamics of that user's environment. While existing work has looked at how the goals of a task can be inferred from a human teacher, the agent is often left to learn about the environment on its own.…

    Submitted 19 July, 2019; originally announced July 2019.

  44. arXiv:1905.01360  [pdf, other]

    cs.MA cs.AI cs.LG

    Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition

    Authors: Chao Gao, Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

    Abstract: The Pommerman Team Environment is a recently proposed benchmark which involves a multi-agent domain with challenges such as partial observability, decentralized execution (without communication), and very sparse and delayed rewards. The inaugural Pommerman Team Competition held at NeurIPS 2018 hosted 25 participants who submitted a team of 2 agents. Our submission nn_team_skynet955_skynet955 won 2…

    Submitted 20 April, 2019; originally announced May 2019.

    Comments: 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making

  45. arXiv:1904.05759  [pdf, other]

    cs.LG cs.AI cs.MA

    Safer Deep RL with Shallow MCTS: A Case Study in Pommerman

    Authors: Bilal Kartal, Pablo Hernandez-Leal, Chao Gao, Matthew E. Taylor

    Abstract: Safe reinforcement learning has many variants and it is still an open research problem. Here, we focus on how to use action guidance by means of a non-expert demonstrator to avoid catastrophic events in a domain with sparse, delayed, and deceptive rewards: the recently-proposed multi-agent benchmark of Pommerman. This domain is very challenging for reinforcement learning (RL) --- past work has sho…

    Submitted 10 April, 2019; originally announced April 2019.

    Comments: Adaptive Learning Agents (ALA) Workshop at AAMAS 2019. arXiv admin note: substantial text overlap with arXiv:1812.00045

  46. arXiv:1904.02206  [pdf, other]

    cs.LG stat.ML

    Jointly Pre-training with Supervised, Autoencoder, and Value Losses for Deep Reinforcement Learning

    Authors: Gabriel V. de la Cruz Jr., Yunshu Du, Matthew E. Taylor

    Abstract: Deep Reinforcement Learning (DRL) algorithms are known to be data inefficient. One reason is that a DRL agent learns both the feature and the policy tabula rasa. Integrating prior knowledge into DRL algorithms is one way to improve learning efficiency since it helps to build helpful representations. In this work, we consider incorporating human knowledge to accelerate the asynchronous advantage ac…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: Accepted in Adaptive and Learning Agents (ALA) Workshop at AAMAS

  47. arXiv:1812.08904  [pdf, other]

    cs.LG cs.AI stat.ML

    Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning

    Authors: Gabriel V. de la Cruz, Yunshu Du, Matthew E. Taylor

    Abstract: Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by using deep neural networks as function approximators to learn directly from raw input images. However, learning directly from raw images is data inefficient. The agent must learn feature representation of complex states in addition to learning a policy. As a result, deep RL typically suffers from…

    Submitted 20 December, 2018; originally announced December 2018.

  48. arXiv:1812.00045  [pdf, other]

    cs.LG cs.AI cs.NE

    Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL

    Authors: Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

    Abstract: Deep reinforcement learning (DRL) has achieved great successes in recent years with the help of novel methods and higher compute power. However, there are still several challenges to be addressed such as convergence to locally optimal policies and long training times. In this paper, firstly, we augment Asynchronous Advantage Actor-Critic (A3C) method with a novel self-supervised auxiliary task, i.…

    Submitted 30 November, 2018; originally announced December 2018.

    Comments: 9 pages, 6 figures, To appear at AAAI-19 Workshop on Reinforcement Learning in Games

  49. arXiv:1811.08275  [pdf, other]

    cs.AI

    Autonomous Extraction of a Hierarchical Structure of Tasks in Reinforcement Learning, A Sequential Associate Rule Mining Approach

    Authors: Behzad Ghazanfari, Fatemeh Afghah, Matthew E. Taylor

    Abstract: Reinforcement learning (RL) techniques, while often powerful, can suffer from slow learning speeds, particularly in high dimensional spaces. Decomposition of tasks into a hierarchical structure holds the potential to significantly speed up learning, generalization, and transfer learning. However, the current task decomposition techniques often rely on high-level knowledge provided by an expert (e.…

    Submitted 16 November, 2018; originally announced November 2018.

    Comments: arXiv admin note: text overlap with arXiv:1709.04579

  50. A Survey and Critique of Multiagent Deep Reinforcement Learning

    Authors: Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

    Abstract: Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) scenarios. Initial results report successes in complex multiagent domains, although there are several challenges to be address…

    Submitted 30 August, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: Under review since Oct 2018. Earlier versions of this work had the title: "Is multiagent deep reinforcement learning the answer or the question? A brief survey"