Showing 1–37 of 37 results for author: Waytowich, N

  1. arXiv:2405.01693  [pdf, other]

    cs.CR

    Adversarial Attacks on Reinforcement Learning Agents for Command and Control

    Authors: Ahaan Dabholkar, James Z. Hare, Mark Mittrick, John Richardson, Nicholas Waytowich, Priya Narayanan, Saurabh Bagchi

    Abstract: Given the recent impact of Deep Reinforcement Learning in training agents to win complex games like StarCraft and DoTA (Defense Of The Ancients) - there has been a surge in research for exploiting learning based techniques for professional wargaming, battlefield simulation and modeling. Real time strategy games and simulators have become a valuable resource for operational planning and military res…

    Submitted 1 July, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to appear in the Journal Of Defense Modeling and Simulation (JDMS)

  2. arXiv:2402.06501  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    Scalable Interactive Machine Learning for Future Command and Control

    Authors: Anna Madison, Ellen Novoseller, Vinicius G. Goecks, Benjamin T. Files, Nicholas Waytowich, Alfred Yu, Vernon J. Lawhern, Steven Thurman, Christopher Kelshaw, Kaleb McDowell

    Abstract: Future warfare will require Command and Control (C2) personnel to make decisions at shrinking timescales in complex and potentially ill-defined situations. Given the need for robust decision-making processes and decision-support tools, integration of artificial and human intelligence holds the potential to revolutionize the C2 operations process to ensure adaptability and efficiency in rapidly cha…

    Submitted 28 March, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted at the NATO Science and Technology Organization Symposium (ICMCIS) organized by the Information Systems Technology (IST) Panel, IST-205-RSY - the ICMCIS, held in Koblenz, Germany, 23-24 April 2024

    ACM Class: I.2.6; I.2.7; J.7

  3. arXiv:2402.01786  [pdf, other]

    cs.AI cs.CL cs.HC cs.LG

    COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations

    Authors: Vinicius G. Goecks, Nicholas Waytowich

    Abstract: The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process. Addressing this challenge, this study introduces COA-GPT, a novel algorithm employing Large Language Models (LLMs) for rapid and efficient generation of valid COAs. COA-GPT incorporates military doctrine and domain expertise to LLMs through in-context learning, allowing comma…

    Submitted 28 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted at the NATO Science and Technology Organization Symposium (ICMCIS) organized by the Information Systems Technology (IST) Panel, IST-205-RSY - the ICMCIS, held in Koblenz, Germany, 23-24 April 2024

    ACM Class: I.2.6; I.2.7; J.7

  4. arXiv:2401.04290  [pdf, other]

    cs.CV cs.AI cs.MA

    StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments

    Authors: Sean Kulinski, Nicholas R. Waytowich, James Z. Hare, David I. Inouye

    Abstract: Spatial reasoning tasks in multi-agent environments such as event prediction, agent type identification, or missing data imputation are important for multiple applications (e.g., autonomous surveillance over sensor networks and subtasks for reinforcement learning (RL)). StarCraft II game replays encode intelligent (and adversarial) multi-agent behavior and could provide a testbed for these tasks;…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: Published in CVPR '23

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023

  5. arXiv:2307.16348  [pdf, other]

    cs.LG cs.AI cs.RO

    Rating-based Reinforcement Learning

    Authors: Devin White, Mingkang Wu, Ellen Novoseller, Vernon J. Lawhern, Nicholas Waytowich, Yongcan Cao

    Abstract: This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individua…

    Submitted 29 January, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

    Comments: This is an extended version of the paper "Rating-based Reinforcement Learning" accepted to the 38th Annual AAAI Conference on Artificial Intelligence

  6. arXiv:2307.12158  [pdf, other]

    cs.LG cs.AI cs.HC

    DIP-RL: Demonstration-Inferred Preference Learning in Minecraft

    Authors: Ellen Novoseller, Vinicius G. Goecks, David Watkins, Josh Miller, Nicholas Waytowich

    Abstract: In machine learning for sequential decision-making, an algorithmic agent learns to interact with an environment while receiving feedback in the form of a reward signal. However, in many unstructured real-world settings, such a reward signal is unknown and humans cannot reliably craft a reward signal that correctly captures desired behavior. To solve tasks in such unstructured and open-ended enviro…

    Submitted 22 July, 2023; originally announced July 2023.

    Comments: Paper accepted at The Many Facets of Preference Learning Workshop at the International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA, 2023

    ACM Class: I.2.6; G.3

  7. arXiv:2306.17271  [pdf, other]

    cs.LG

    DisasterResponseGPT: Large Language Models for Accelerated Plan of Action Development in Disaster Response Scenarios

    Authors: Vinicius G. Goecks, Nicholas R. Waytowich

    Abstract: The development of plans of action in disaster response scenarios is a time-consuming process. Large Language Models (LLMs) offer a powerful solution to expedite this process through in-context learning. This study presents DisasterResponseGPT, an algorithm that leverages LLMs to generate valid plans of action quickly by incorporating disaster response and planning guidelines in the initial prompt…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: Accepted at the Workshop on Challenges in Deployable Generative AI at International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA. 2023

    ACM Class: I.2.7; J.7; K.4.0

  8. arXiv:2305.00929  [pdf, other]

    cs.RO

    Learning Flight Control Systems from Human Demonstrations and Real-Time Uncertainty-Informed Interventions

    Authors: Prashant Ganesh, J. Humberto Ramos, Vinicius G. Goecks, Jared Paquet, Matthew Longmire, Nicholas R. Waytowich, Kevin Brink

    Abstract: This paper describes a methodology for learning flight control systems from human demonstrations and interventions while considering the estimated uncertainty in the learned models. The proposed approach uses human demonstrations to train an initial model via imitation learning and then iteratively improves its performance by using real-time human interventions. The aim of the interventions is to…

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: IFAC 2023

  9. arXiv:2303.13512  [pdf, other]

    cs.AI

    Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition

    Authors: Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, et al. (5 additional authors not shown)

    Abstract: To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022. The BASALT challenge asks teams to compete to develop algorithms to solve tasks with hard-to-specify reward functions in Minecraft. Through this competition, we aimed to promote the development of algorithms that use…

    Submitted 23 March, 2023; originally announced March 2023.

  10. arXiv:2210.08412  [pdf, other]

    cs.LG cs.RO

    Towards an Interpretable Hierarchical Agent Framework using Semantic Goals

    Authors: Bharat Prakash, Nicholas Waytowich, Tim Oates, Tinoosh Mohsenin

    Abstract: Learning to solve long horizon temporally extended tasks with reinforcement learning has been a challenge for several years now. We believe that it is important to leverage both the hierarchical structure of complex tasks and to use expert supervision whenever possible to solve such tasks. This work introduces an interpretable hierarchical agent framework by combining planning and semantic goal di…

    Submitted 15 October, 2022; originally announced October 2022.

    Report number: AIHRI/2022/7590

  11. arXiv:2209.06291  [pdf, other]

    cs.CV cs.RO

    Multiple View Performers for Shape Completion

    Authors: David Watkins, Peter Allen, Krzysztof Choromanski, Jacob Varley, Nicholas Waytowich

    Abstract: We propose the Multiple View Performer (MVP) - a new architecture for 3D shape completion from a series of temporally sequential views. MVP accomplishes this task by using linear-attention Transformers called Performers. Our model allows the current observation of the scene to attend to the previous ones for more accurate infilling. The history of past observations is compressed via the compact as…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: 6 pages, 2 pages of references, 6 figures, 3 tables

  12. TagTeam: Towards Wearable-Assisted, Implicit Guidance for Human-Drone Teams

    Authors: Kasthuri Jayarajah, Aryya Gangopadhyay, Nicholas Waytowich

    Abstract: The availability of sensor-rich smart wearables and tiny, yet capable, unmanned vehicles such as nano quadcopters, opens up opportunities for a novel class of highly interactive, attention-shared human-machine teams. Reliable, lightweight, yet passive exchange of intent, data and inferences within such human-machine teams make them suitable for scenarios such as search-and-rescue with significan…

    Submitted 10 August, 2022; originally announced August 2022.

  13. arXiv:2205.05784  [pdf, other]

    cs.LG cs.MA

    Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II

    Authors: Nicholas Waytowich, James Hare, Vinicius G. Goecks, Mark Mittrick, John Richardson, Anjon Basak, Derrik E. Asher

    Abstract: Traditionally, learning from human demonstrations via direct behavior cloning can lead to high-performance policies given that the algorithm has access to large amounts of high-quality data covering the most likely scenarios to be encountered when the agent is operating. However, in real-world scenarios, expert data is limited and it is desired to train an agent that learns a behavior policy gener…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Submitted to the 2022 SPIE Defense + Commercial Sensing (DCS) Conference on "Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications IV"

    ACM Class: I.2.6; I.2.11

  14. arXiv:2204.07123  [pdf, other]

    cs.AI

    Retrospective on the 2021 BASALT Competition on Learning from Human Feedback

    Authors: Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Chan Jun Shern, Daniel del Castillo, Tom Lieberum

    Abstract: We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques,…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted to the PMLR NeurIPS 2021 Demo & Competition Track volume

  15. arXiv:2112.03482  [pdf, other]

    cs.LG cs.AI cs.HC

    Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft

    Authors: Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash

    Abstract: Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signals unless they are defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined task with performance metrics that drive the agent's learning. In this work, we present the solution that won first place and was awarde…

    Submitted 11 May, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: Submitted to the AAAI 2022 Spring Symposium on Machine Learning and Knowledge Engineering for Hybrid Intelligence (AAAI-MAKE 2022)

    ACM Class: I.2.1; I.2.6; I.2.10; I.2.0

  16. arXiv:2111.04120  [pdf, other]

    cs.AI cs.RO

    Automatic Goal Generation using Dynamical Distance Learning

    Authors: Bharat Prakash, Nicholas Waytowich, Tinoosh Mohsenin, Tim Oates

    Abstract: Reinforcement Learning (RL) agents can learn to solve complex sequential decision making tasks by interacting with the environment. However, sample efficiency remains a major challenge. In the field of multi-goal RL, where agents are required to reach multiple goals to solve complex tasks, improving sample efficiency can be especially challenging. On the other hand, humans or other biological agen…

    Submitted 7 November, 2021; originally announced November 2021.

  17. arXiv:2110.11305  [pdf, other]

    cs.LG cs.MA

    On games and simulators as a platform for development of artificial intelligence for command and control

    Authors: Vinicius G. Goecks, Nicholas Waytowich, Derrik E. Asher, Song Jun Park, Mark Mittrick, John Richardson, Manuel Vindiola, Anne Logie, Mark Dennison, Theron Trout, Priya Narayanan, Alexander Kott

    Abstract: Games and simulators can be a valuable platform to execute complex multi-agent, multiplayer, imperfect information scenarios with significant parallels to military applications: multiple participants manage resources and make decisions that command assets to secure specific areas of a map or neutralize opposing forces. These characteristics have attracted the artificial intelligence (AI) community…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: Preprint submitted to the Journal of Defense Modeling and Simulation (JDMS) for peer review

    ACM Class: I.2.6; I.6.3; A.1

  18. arXiv:2110.04649  [pdf, other]

    cs.AI

    Interactive Hierarchical Guidance using Language

    Authors: Bharat Prakash, Nicholas Waytowich, Tim Oates, Tinoosh Mohsenin

    Abstract: Reinforcement learning has been successful in many tasks, including robotic control, games, and energy management. In complex real world environments with sparse rewards and long task horizons, sample efficiency is still a major challenge. Most complex tasks can be easily decomposed into high-level planning and low level control. Therefore, it is important to enable agents to leverage the hierar…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: Presented at AI-HRI symposium as part of AAAI-FSS 2021 (arXiv:2109.10836)

    Report number: AIHRI/2021/45

  19. arXiv:2110.00717  [pdf, other]

    cs.RO

    Mobile Manipulation Leveraging Multiple Views

    Authors: David Watkins, Peter K Allen, Henrique Maia, Madhavan Seshadri, Jonathan Sanabria, Nicholas Waytowich, Jacob Varley

    Abstract: While both navigation and manipulation are challenging topics in isolation, many tasks require the ability to both navigate and manipulate in concert. To this end, we propose a mobile manipulation system that leverages novel navigation and shape completion methods to manipulate an object with a mobile robot. Our system utilizes uncertainty in the initial estimation of a manipulation target to calc…

    Submitted 7 March, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

    Comments: 6 pages, 2 pages of references, 5 figures, 5 tables

  20. arXiv:2102.13008  [pdf, other]

    cs.LG cs.HC cs.RO

    Imitation Learning with Human Eye Gaze via Multi-Objective Prediction

    Authors: Ravi Kumar Thakur, MD-Nazmus Samin Sunbeam, Vinicius G. Goecks, Ellen Novoseller, Ritwik Bera, Vernon J. Lawhern, Gregory M. Gremillion, John Valasek, Nicholas R. Waytowich

    Abstract: Approaches for teaching learning agents via human demonstrations have been widely studied and successfully applied to multiple domains. However, the majority of imitation learning work utilizes only behavioral information from the demonstrator, i.e. which actions were taken, and ignores other useful information. In particular, eye gaze information can give valuable insight towards where the demons…

    Submitted 22 July, 2023; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: Paper accepted and selected as an oral presentation at Interactive Learning with Implicit Human Feedback Workshop at ICML 2023

    ACM Class: I.2.6; I.2.9; I.2.10

  21. arXiv:1911.00497  [pdf, other]

    cs.AI cs.CL cs.LG

    A Narration-based Reward Shaping Approach using Grounded Natural Language Commands

    Authors: Nicholas Waytowich, Sean L. Barton, Vernon Lawhern, Garrett Warnell

    Abstract: While deep reinforcement learning techniques have led to agents that are successfully able to learn to perform a number of tasks that had been previously unlearnable, these techniques are still susceptible to the longstanding problem of reward sparsity. This is especially true for tasks such as training an agent to play StarCraft II, a real-time strategy game where reward is only given at the end…

    Submitted 31 October, 2019; originally announced November 2019.

    Comments: Presented at the Imitation, Intent and Interaction (I3) workshop, ICML 2019. arXiv admin note: substantial text overlap with arXiv:1906.02671

  22. arXiv:1911.00171  [pdf, other]

    cs.LG cs.AI stat.ML

    PODNet: A Neural Network for Discovery of Plannable Options

    Authors: Ritwik Bera, Vinicius G. Goecks, Gregory M. Gremillion, John Valasek, Nicholas R. Waytowich

    Abstract: Learning from demonstration has been widely studied in machine learning but becomes challenging when the demonstrated trajectories are unstructured and follow different objectives. This short-paper proposes PODNet, Plannable Option Discovery Network, addressing how to segment an unstructured set of demonstrated trajectories for option discovery. This enables learning from demonstration to perform…

    Submitted 28 February, 2020; v1 submitted 31 October, 2019; originally announced November 2019.

    ACM Class: I.2.0; I.2.6

  23. arXiv:1910.04281  [pdf, other]

    cs.LG cs.AI stat.ML

    Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments

    Authors: Vinicius G. Goecks, Gregory M. Gremillion, Vernon J. Lawhern, John Valasek, Nicholas R. Waytowich

    Abstract: This paper investigates how to efficiently transition and update policies, trained initially with demonstrations, using off-policy actor-critic reinforcement learning. It is well-known that techniques based on Learning from Demonstrations, for example behavior cloning, can lead to proficient policies given limited data. However, it is currently unclear how to efficiently update that policy using r…

    Submitted 3 April, 2020; v1 submitted 9 October, 2019; originally announced October 2019.

    Comments: 9 pages, 5 Figures. AAMAS 2020

  24. arXiv:1909.13392  [pdf, other]

    cs.LG stat.ML

    Learning from Observations Using a Single Video Demonstration and Human Feedback

    Authors: Sunil Gandhi, Tim Oates, Tinoosh Mohsenin, Nicholas Waytowich

    Abstract: In this paper, we present a method for learning from video demonstrations by using human feedback to construct a mapping between the standard representation of the agent and the visual representation of the demonstration. In this way, we leverage the advantages of both these representations, i.e., we learn the policy using standard state representations, but are able to specify the expected behavi…

    Submitted 29 September, 2019; originally announced September 2019.

  25. arXiv:1909.09295  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Learning Your Way Without Map or Compass: Panoramic Target Driven Visual Navigation

    Authors: David Watkins-Valls, Jingxi Xu, Nicholas Waytowich, Peter Allen

    Abstract: We present a robot navigation system that uses an imitation learning framework to successfully navigate in complex environments. Our framework takes a pre-built 3D scan of a real environment and trains an agent from pre-generated expert trajectories to navigate to any position given a panoramic view of the goal and the current visual input without relying on map, compass, odometry, or relative pos…

    Submitted 25 September, 2020; v1 submitted 19 September, 2019; originally announced September 2019.

  26. arXiv:1909.07887  [pdf, other]

    cs.RO

    Inferring and Learning Multi-Robot Policies by Observing an Expert

    Authors: Pietro Pierpaoli, Harish Ravichandar, Nicholas Waytowich, Anqi Li, Derrik Asher, Magnus Egerstedt

    Abstract: We present a technique for learning how to solve a multi-robot mission that requires interaction with an external environment by observing an expert system executing the same mission. We define the expert system as a team of robots equipped with a library of controllers, each designed to solve a specific task, supervised by an expert policy that appropriately selects controllers based on the state…

    Submitted 2 March, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: 8 pages, 7 figures

  27. arXiv:1909.05232  [pdf, other]

    cs.MA cs.AI

    On Memory Mechanism in Multi-Agent Reinforcement Learning

    Authors: Yilun Zhou, Derrik E. Asher, Nicholas R. Waytowich, Julie A. Shah

    Abstract: Multi-agent reinforcement learning (MARL) extends (single-agent) reinforcement learning (RL) by introducing additional agents and (potentially) partial observability of the environment. Consequently, algorithms for solving MARL problems incorporate various extensions beyond traditional RL methods, such as a learned communication protocol between cooperative agents that enables exchange of private…

    Submitted 11 September, 2019; originally announced September 2019.

  28. arXiv:1906.02671  [pdf, other]

    cs.MM cs.CL cs.LG cs.NE cs.RO

    Grounding Natural Language Commands to StarCraft II Game States for Narration-Guided Reinforcement Learning

    Authors: Nicholas Waytowich, Sean L. Barton, Vernon Lawhern, Ethan Stump, Garrett Warnell

    Abstract: While deep reinforcement learning techniques have led to agents that are successfully able to learn to perform a number of tasks that had been previously unlearnable, these techniques are still susceptible to the longstanding problem of reward sparsity. This is especially true for tasks such as training an agent to play StarCraft II, a real-time strategy game where reward is only given at th…

    Submitted 24 April, 2019; originally announced June 2019.

    Comments: 10 pages, 3 figures. Published at SPIE 2019

  29. arXiv:1903.10404  [pdf, other]

    cs.LG cs.AI

    On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

    Authors: Bharat Prakash, Mark Horton, Nicholas R. Waytowich, William David Hairston, Tim Oates, Tinoosh Mohsenin

    Abstract: In autonomous embedded systems, it is often vital to reduce the number of actions taken in the real world and the energy required to learn a policy. Training reinforcement learning agents from high dimensional image representations can be very expensive and time consuming. Autoencoders are deep neural networks used to compress high dimensional data such as pixelated images into small latent representat…

    Submitted 25 March, 2019; originally announced March 2019.

  30. arXiv:1903.09328  [pdf, other]

    cs.AI

    Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention

    Authors: Bharat Prakash, Mohit Khatwani, Nicholas Waytowich, Tinoosh Mohsenin

    Abstract: Recent progress in AI and Reinforcement learning has shown great success in solving complex problems with high dimensional state spaces. However, most of these successes have been primarily in simulated environments where failure is of little or no consequence. Most real-world applications, however, require training solutions that are safe to operate as catastrophic failures are inadmissible espec…

    Submitted 21 March, 2019; originally announced March 2019.

  31. arXiv:1810.11545  [pdf, other]

    cs.AI cs.HC cs.RO

    Efficiently Combining Human Demonstrations and Interventions for Safe Training of Autonomous Systems in Real-Time

    Authors: Vinicius G. Goecks, Gregory M. Gremillion, Vernon J. Lawhern, John Valasek, Nicholas R. Waytowich

    Abstract: This paper investigates how to utilize different forms of human interaction to safely train autonomous systems in real-time by learning from both human demonstrations and interventions. We implement two components of the Cycle-of-Learning for Autonomous Systems, which is our framework for combining multiple modalities of human interaction. The current effort employs human demonstrations to teach a…

    Submitted 28 November, 2018; v1 submitted 26 October, 2018; originally announced October 2018.

    Comments: 9 pages, 6 figures

  32. arXiv:1809.04918  [pdf, ps, other]

    cs.MA cs.AI

    Coordination-driven learning in multi-agent problem spaces

    Authors: Sean L. Barton, Nicholas R. Waytowich, Derrik E. Asher

    Abstract: We discuss the role of coordination as a direct learning objective in multi-agent reinforcement learning (MARL) domains. To this end, we present a novel means of quantifying coordination in multi-agent systems, and discuss the implications of using such a measure to optimize coordinated agent policies. This concept has important implications for adversary-aware RL, which we take to be a sub-domain…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: AAAI Fall Symposium 2018, Concept Paper

    Report number: Vol-2269 FSS-18

    Journal ref: Proceedings of the AAAI Fall 2018 Symposium on Adversary-Aware Learning Techniques and Trends in Cybersecurity, Arlington, VA, USA, 18-19 October, 2018, published at http://ceur-ws.org

  33. arXiv:1808.09572  [pdf, other]

    cs.AI cs.HC cs.RO

    Cycle-of-Learning for Autonomous Systems from Human Interaction

    Authors: Nicholas R. Waytowich, Vinicius G. Goecks, Vernon J. Lawhern

    Abstract: We discuss different types of human-robot interaction paradigms in the context of training end-to-end reinforcement learning algorithms. We provide a taxonomy to categorize the types of human interaction and present our Cycle-of-Learning framework for autonomous systems that combines different human-interaction modalities with reinforcement learning. Two key concepts provided by our Cycle-of-Learn…

    Submitted 9 October, 2018; v1 submitted 28 August, 2018; originally announced August 2018.

    Comments: Presented at AI-HRI AAAI-FSS, 2018 (arXiv:1809.06606)

    Report number: AI-HRI/2018/05

  34. arXiv:1807.08663  [pdf]

    cs.MA

    Measuring collaborative emergent behavior in multi-agent reinforcement learning

    Authors: Sean L. Barton, Nicholas R. Waytowich, Erin Zaroukian, Derrik E. Asher

    Abstract: Multi-agent reinforcement learning (RL) has important implications for the future of human-agent teaming. We show that improved performance with multi-agent RL is not a guarantee of the collaborative behavior thought to be important for solving multi-agent tasks. To address this, we present a novel approach for quantitatively assessing collaboration in continuous spatial tasks with multi-agent RL.…

    Submitted 23 July, 2018; originally announced July 2018.

    Comments: 1st International Conference on Human Systems Engineering and Design, 6 pages, 2 figures, 1 table

  35. arXiv:1803.04566  [pdf, other]

    cs.LG q-bio.NC stat.ML

    Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials

    Authors: Nicholas R. Waytowich, Vernon Lawhern, Javier O. Garcia, Jennifer Cummings, Josef Faller, Paul Sajda, Jean M. Vettel

    Abstract: Steady-State Visual Evoked Potentials (SSVEPs) are neural oscillations from the parietal and occipital regions of the brain that are evoked from flickering visual stimuli. SSVEPs are robust signals measurable in the electroencephalogram (EEG) and are commonly used in brain-computer interfaces (BCIs). However, methods for high-accuracy decoding of SSVEPs usually require hand-crafted approaches that…

    Submitted 9 October, 2018; v1 submitted 12 March, 2018; originally announced March 2018.

    Comments: Accepted for publication at the Journal of Neural Engineering

  36. arXiv:1709.10163  [pdf, other]

    cs.AI cs.LG

    Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces

    Authors: Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, Peter Stone

    Abstract: While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data. One way to increase the speed at which agents are able to learn to perform tasks is by leveraging the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedback i…

    Submitted 19 January, 2018; v1 submitted 28 September, 2017; originally announced September 2017.

    Comments: 9 pages, 6 figures

  37. arXiv:1611.08024  [pdf, other]

    cs.LG q-bio.NC stat.ML

    EEGNet: A Compact Convolutional Network for EEG-based Brain-Computer Interfaces

    Authors: Vernon J. Lawhern, Amelia J. Solon, Nicholas R. Waytowich, Stephen M. Gordon, Chou P. Hung, Brent J. Lance

    Abstract: Brain computer interfaces (BCI) enable direct communication with a computer, using neural activity as the control signal. This neural signal is generally chosen from a variety of well-studied electroencephalogram (EEG) signals. For a given BCI paradigm, feature extractors and classifiers are tailored to the distinct characteristics of its expected EEG control signal, limiting its application to th…

    Submitted 15 May, 2018; v1 submitted 23 November, 2016; originally announced November 2016.

    Comments: 30 pages, 10 figures. Added additional feature relevance analyses. Minor change to EEGNet architecture. Source code can be found at https://github.com/vlawhern/arl-eegmodels