Search | arXiv e-print repository

Creative Problem Solving in Large Language and Vision Models -- What Would it Take?

Authors: Lakshmi Nair, Evana Gizzi, Jivko Sinapov

Abstract: In this paper, we discuss approaches for integrating Computational Creativity (CC) with research in large language and vision models (LLVMs) to address a key limitation of these models, i.e., creative problem solving. We present preliminary experiments showing how CC principles can be applied to address this limitation through augmented prompting. With this work, we hope to foster discussions of C… ▽ More In this paper, we discuss approaches for integrating Computational Creativity (CC) with research in large language and vision models (LLVMs) to address a key limitation of these models, i.e., creative problem solving. We present preliminary experiments showing how CC principles can be applied to address this limitation through augmented prompting. With this work, we hope to foster discussions of Computational Creativity in the context of ML algorithms for creative problem solving in LLVMs. Our code is at: https://github.com/lnairGT/creative-problem-solving-LLMs △ Less

Submitted 2 May, 2024; originally announced May 2024.

Comments: 9 pages, 7 figures, 2 tables

arXiv:2404.01158 [pdf, other]

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community

Authors: Casey Kennington, Malihe Alikhani, Heather Pon-Barry, Katherine Atwell, Yonatan Bisk, Daniel Fried, Felix Gervits, Zhao Han, Mert Inan, Michael Johnston, Raj Korpan, Diane Litman, Matthew Marge, Cynthia Matuszek, Ross Mead, Shiwali Mohan, Raymond Mooney, Natalie Parde, Jivko Sinapov, Angela Stewart, Matthew Stone, Stefanie Tellex, Tom Williams

Abstract: The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces, but speech interfaces and not just with computers, but with all machines including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first… ▽ More The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces, but speech interfaces and not just with computers, but with all machines including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots. The three proposals should act as white papers for any researcher to take and build upon. △ Less

Submitted 1 April, 2024; originally announced April 2024.

Comments: NSF Report on the "Dialogue with Robots" Workshop held in Pittsburg, PA, April 2023

arXiv:2402.03678 [pdf, other]

Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents

Authors: Yash Shukla, Tanushree Burman, Abhishek Kulkarni, Robert Wright, Alvaro Velasquez, Jivko Sinapov

Abstract: Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors. However, learning an effective policy often requires a large number of environment interactions. To mitigate sample complexity issues, recent approaches have used high-level task specifications, such as Linear Temporal Logic (LTL$_f$) formulas or Reward Machines (RM), to guide the lea… ▽ More Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors. However, learning an effective policy often requires a large number of environment interactions. To mitigate sample complexity issues, recent approaches have used high-level task specifications, such as Linear Temporal Logic (LTL$_f$) formulas or Reward Machines (RM), to guide the learning progress of the agent. In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions. Unlike previous work, LSTS does not assume information about the environment dynamics or the Reward Machine, and dynamically samples promising tasks that lead to successful goal policies. We evaluate LSTS on a gridworld and show that it achieves improved time-to-threshold performance on complex sequential decision-making problems compared to state-of-the-art RM and Automaton-guided RL baselines, such as Q-Learning for Reward Machines and Compositional RL from logical Specifications (DIRL). Moreover, we demonstrate that our method outperforms RM and Automaton-guided RL baselines in terms of sample-efficiency, both in a partially observable robotic task and in a continuous control robotic manipulation task. △ Less

Submitted 2 April, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

arXiv:2401.03546 [pdf, other]

NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds

Authors: Shivam Goel, Yichen Wei, Panagiotis Lymperopoulos, Klara Chura, Matthias Scheutz, Jivko Sinapov

Abstract: As AI agents leave the lab and venture into the real world as autonomous vehicles, delivery robots, and cooking robots, it is increasingly necessary to design and comprehensively evaluate algorithms that tackle the ``open-world''. To this end, we introduce NovelGym, a flexible and adaptable ecosystem designed to simulate gridworld environments, serving as a robust platform for benchmarking reinfor… ▽ More As AI agents leave the lab and venture into the real world as autonomous vehicles, delivery robots, and cooking robots, it is increasingly necessary to design and comprehensively evaluate algorithms that tackle the ``open-world''. To this end, we introduce NovelGym, a flexible and adaptable ecosystem designed to simulate gridworld environments, serving as a robust platform for benchmarking reinforcement learning (RL) and hybrid planning and learning agents in open-world contexts. The modular architecture of NovelGym facilitates rapid creation and modification of task environments, including multi-agent scenarios, with multiple environment transformations, thus providing a dynamic testbed for researchers to develop open-world AI agents. △ Less

Submitted 7 January, 2024; originally announced January 2024.

Comments: Accepted at AAMAS-2024

arXiv:2310.09454 [pdf, other]

LgTS: Dynamic Task Sampling using LLM-generated sub-goals for Reinforcement Learning Agents

Authors: Yash Shukla, Wenchang Gao, Vasanth Sarathy, Alvaro Velasquez, Robert Wright, Jivko Sinapov

Abstract: Recent advancements in reasoning abilities of Large Language Models (LLM) has promoted their usage in problems that require high-level planning for robots and artificial agents. However, current techniques that utilize LLMs for such planning tasks make certain key assumptions such as, access to datasets that permit finetuning, meticulously engineered prompts that only provide relevant and essentia… ▽ More Recent advancements in reasoning abilities of Large Language Models (LLM) has promoted their usage in problems that require high-level planning for robots and artificial agents. However, current techniques that utilize LLMs for such planning tasks make certain key assumptions such as, access to datasets that permit finetuning, meticulously engineered prompts that only provide relevant and essential information to the LLM, and most importantly, a deterministic approach to allow execution of the LLM responses either in the form of existing policies or plan operators. In this work, we propose LgTS (LLM-guided Teacher-Student learning), a novel approach that explores the planning abilities of LLMs to provide a graphical representation of the sub-goals to a reinforcement learning (RL) agent that does not have access to the transition dynamics of the environment. The RL agent uses Teacher-Student learning algorithm to learn a set of successful policies for reaching the goal state from the start state while simultaneously minimizing the number of environmental interactions. Unlike previous methods that utilize LLMs, our approach does not assume access to a propreitary or a fine-tuned LLM, nor does it require pre-trained policies that achieve the sub-goals proposed by the LLM. Through experiments on a gridworld based DoorKey domain and a search-and-rescue inspired domain, we show that generating a graphical structure of sub-goals helps in learning policies for the LLM proposed sub-goals and the Teacher-Student learning algorithm minimizes the number of environment interactions when the transition dynamics are unknown. △ Less

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.08836 [pdf, other]

A Framework for Few-Shot Policy Transfer through Observation Map** and Behavior Cloning

Authors: Yash Shukla, Bharat Kesari, Shivam Goel, Robert Wright, Jivko Sinapov

Abstract: Despite recent progress in Reinforcement Learning for robotics applications, many tasks remain prohibitively difficult to solve because of the expensive interaction cost. Transfer learning helps reduce the training time in the target domain by transferring knowledge learned in a source domain. Sim2Real transfer helps transfer knowledge from a simulated robotic domain to a physical target domain. K… ▽ More Despite recent progress in Reinforcement Learning for robotics applications, many tasks remain prohibitively difficult to solve because of the expensive interaction cost. Transfer learning helps reduce the training time in the target domain by transferring knowledge learned in a source domain. Sim2Real transfer helps transfer knowledge from a simulated robotic domain to a physical target domain. Knowledge transfer reduces the time required to train a task in the physical world, where the cost of interactions is high. However, most existing approaches assume exact correspondence in the task structure and the physical properties of the two domains. This work proposes a framework for Few-Shot Policy Transfer between two domains through Observation Map** and Behavior Cloning. We use Generative Adversarial Networks (GANs) along with a cycle-consistency loss to map the observations between the source and target domains and later use this learned map** to clone the successful source task behavior policy to the target domain. We observe successful behavior policy transfer with limited target task interactions and in cases where the source and target task are semantically dissimilar. △ Less

Submitted 12 October, 2023; originally announced October 2023.

Comments: Paper accepted to the IROS 2023 Conference

arXiv:2309.08508 [pdf, other]

MOSAIC: Learning Unified Multi-Sensory Object Property Representations for Robot Learning via Interactive Perception

Authors: Gyan Tatiya, Jonathan Francis, Ho-Hsiang Wu, Yonatan Bisk, Jivko Sinapov

Abstract: A holistic understanding of object properties across diverse sensory modalities (e.g., visual, audio, and haptic) is essential for tasks ranging from object categorization to complex manipulation. Drawing inspiration from cognitive science studies that emphasize the significance of multi-sensory integration in human perception, we introduce MOSAIC (Multimodal Object property learning with Self-Att… ▽ More A holistic understanding of object properties across diverse sensory modalities (e.g., visual, audio, and haptic) is essential for tasks ranging from object categorization to complex manipulation. Drawing inspiration from cognitive science studies that emphasize the significance of multi-sensory integration in human perception, we introduce MOSAIC (Multimodal Object property learning with Self-Attention and Interactive Comprehension), a novel framework designed to facilitate the learning of unified multi-sensory object property representations. While it is undeniable that visual information plays a prominent role, we acknowledge that many fundamental object properties extend beyond the visual domain to encompass attributes like texture, mass distribution, or sounds, which significantly influence how we interact with objects. In MOSAIC, we leverage this profound insight by distilling knowledge from multimodal foundation models and aligning these representations not only across vision but also haptic and auditory sensory modalities. Through extensive experiments on a dataset where a humanoid robot interacts with 100 objects across 10 exploratory behaviors, we demonstrate the versatility of MOSAIC in two task families: object categorization and object-fetching tasks. Our results underscore the efficacy of MOSAIC's unified representations, showing competitive performance in category recognition through a simple linear probe setup and excelling in the fetch object task under zero-shot transfer conditions. This work pioneers the application of sensory grounding in foundation models for robotics, promising a significant leap in multi-sensory perception capabilities for autonomous systems. We have released the code, datasets, and additional results: https://github.com/gtatiya/MOSAIC. △ Less

Submitted 22 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: Accepted to the 2024 IEEE International Conference on Robotics and Automation (ICRA), May 13 to 17, 2024; Yokohama, Japan

arXiv:2304.05271 [pdf, other]

Automaton-Guided Curriculum Generation for Reinforcement Learning Agents

Authors: Yash Shukla, Abhishek Kulkarni, Robert Wright, Alvaro Velasquez, Jivko Sinapov

Abstract: Despite advances in Reinforcement Learning, many sequential decision making tasks remain prohibitively expensive and impractical to learn. Recently, approaches that automatically generate reward functions from logical task specifications have been proposed to mitigate this issue; however, they scale poorly on long-horizon tasks (i.e., tasks where the agent needs to perform a series of correct acti… ▽ More Despite advances in Reinforcement Learning, many sequential decision making tasks remain prohibitively expensive and impractical to learn. Recently, approaches that automatically generate reward functions from logical task specifications have been proposed to mitigate this issue; however, they scale poorly on long-horizon tasks (i.e., tasks where the agent needs to perform a series of correct actions to reach the goal state, considering future transitions while choosing an action). Employing a curriculum (a sequence of increasingly complex tasks) further improves the learning speed of the agent by sequencing intermediate tasks suited to the learning capacity of the agent. However, generating curricula from the logical specification still remains an unsolved problem. To this end, we propose AGCL, Automaton-guided Curriculum Learning, a novel method for automatically generating curricula for the target task in the form of Directed Acyclic Graphs (DAGs). AGCL encodes the specification in the form of a deterministic finite automaton (DFA), and then uses the DFA along with the Object-Oriented MDP (OOMDP) representation to generate a curriculum as a DAG, where the vertices correspond to tasks, and edges correspond to the direction of knowledge transfer. Experiments in gridworld and physics-based simulated robotics domains show that the curricula produced by AGCL achieve improved time-to-threshold performance on a complex sequential decision-making problem relative to state-of-the-art curriculum learning (e.g, teacher-student, self-play) and automaton-guided reinforcement learning baselines (e.g, Q-Learning for Reward Machines). Further, we demonstrate that AGCL performs well even in the presence of noise in the task's OOMDP description, and also when distractor objects are present that are not modeled in the logical specification of the tasks' objectives. △ Less

Submitted 11 April, 2023; originally announced April 2023.

Comments: To be presented at The International Conference on Automated Planning and Scheduling (ICAPS) 2023

arXiv:2303.04023 [pdf, other]

Cross-Tool and Cross-Behavior Perceptual Knowledge Transfer for Grounded Object Recognition

Authors: Gyan Tatiya, Jonathan Francis, Jivko Sinapov

Abstract: Humans learn about objects via interaction and using multiple perceptions, such as vision, sound, and touch. While vision can provide information about an object's appearance, non-visual sensors, such as audio and haptics, can provide information about its intrinsic properties, such as weight, temperature, hardness, and the object's sound. Using tools to interact with objects can reveal additional… ▽ More Humans learn about objects via interaction and using multiple perceptions, such as vision, sound, and touch. While vision can provide information about an object's appearance, non-visual sensors, such as audio and haptics, can provide information about its intrinsic properties, such as weight, temperature, hardness, and the object's sound. Using tools to interact with objects can reveal additional object properties that are otherwise hidden (e.g., knives and spoons can be used to examine the properties of food, including its texture and consistency). Robots can use tools to interact with objects and gather information about their implicit properties via non-visual sensors. However, a robot's model for recognizing objects using a tool-mediated behavior does not generalize to a new tool or behavior due to differing observed data distributions. To address this challenge, we propose a framework to enable robots to transfer implicit knowledge about granular objects across different tools and behaviors. The proposed approach learns a shared latent space from multiple robots' contexts produced by respective sensory data while interacting with objects using tools. We collected a dataset using a UR5 robot that performed 5,400 interactions using 6 tools and 6 behaviors on 15 granular objects and tested our method on cross-tool and cross-behavioral transfer tasks. Our results show the less experienced target robot can benefit from the experience gained from the source robot and perform recognition on a set of novel objects. We have released the code, datasets, and additional results: https://github.com/gtatiya/Tool-Knowledge-Transfer. △ Less

Submitted 15 September, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

Comments: Under review for 2024 IEEE International Conference on Robotics and Automation (ICRA), May 13 to 17, 2024, Yokohama, Japan

arXiv:2302.14208 [pdf, other]

Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments

Authors: Tung Thai, Ming Shen, Mayank Garg, Ayush Kalani, Nakul Vaidya, Utkarsh Soni, Mudit Verma, Sriram Gopalakrishnan, Neeraj Varshney, Chitta Baral, Subbarao Kambhampati, Jivko Sinapov, Matthias Scheutz

Abstract: Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectu… ▽ More Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectural mechanisms for detecting and characterizing different types of novelties, and for building an appropriate adaptive model to accommodate them utilizing logical representations and reasoning methods. We demonstrate the effectiveness of the proposed methods in evaluations performed by a third party in the adversarial multi-agent board game Monopoly. The results show high novelty detection and accommodation rates across a variety of novelty types, including changes to the rules of the game, as well as changes to the agent's action capabilities. △ Less

Submitted 5 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

arXiv:2212.11345 [pdf, other]

Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation

Authors: Gyan Tatiya, Jonathan Francis, Luca Bondi, Ingrid Navarro, Eric Nyberg, Jivko Sinapov, Jean Oh

Abstract: Generalisation to unseen contexts remains a challenge for embodied navigation agents. In the context of semantic audio-visual navigation (SAVi) tasks, the notion of generalisation should include both generalising to unseen indoor visual scenes as well as generalising to unheard sounding objects. However, previous SAVi task definitions do not include evaluation conditions on truly novel sounding ob… ▽ More Generalisation to unseen contexts remains a challenge for embodied navigation agents. In the context of semantic audio-visual navigation (SAVi) tasks, the notion of generalisation should include both generalising to unseen indoor visual scenes as well as generalising to unheard sounding objects. However, previous SAVi task definitions do not include evaluation conditions on truly novel sounding objects, resorting instead to evaluating agents on unheard sound clips of known objects; meanwhile, previous SAVi methods do not include explicit mechanisms for incorporating domain knowledge about object and region semantics. These weaknesses limit the development and assessment of models' abilities to generalise their learned experience. In this work, we introduce the use of knowledge-driven scene priors in the semantic audio-visual embodied navigation task: we combine semantic information from our novel knowledge graph that encodes object-region relations, spatial knowledge from dual Graph Encoder Networks, and background knowledge from a series of pre-training tasks -- all within a reinforcement learning framework for audio-visual navigation. We also define a new audio-visual navigation sub-task, where agents are evaluated on novel sounding objects, as opposed to unheard clips of known objects. We show improvements over strong baselines in generalisation to unseen regions and novel sounding objects, within the Habitat-Matterport3D simulation environment, under the SoundSpaces task. △ Less

Submitted 21 December, 2022; originally announced December 2022.

Comments: 19 pages, 8 figures, 9 tables

arXiv:2209.06890 [pdf, other]

doi 10.1109/icra48891.2023.10160811

Transferring Implicit Knowledge of Non-Visual Object Properties Across Heterogeneous Robot Morphologies

Authors: Gyan Tatiya, Jonathan Francis, Jivko Sinapov

Abstract: Humans leverage multiple sensor modalities when interacting with objects and discovering their intrinsic properties. Using the visual modality alone is insufficient for deriving intuition behind object properties (e.g., which of two boxes is heavier), making it essential to consider non-visual modalities as well, such as the tactile and auditory. Whereas robots may leverage various modalities to o… ▽ More Humans leverage multiple sensor modalities when interacting with objects and discovering their intrinsic properties. Using the visual modality alone is insufficient for deriving intuition behind object properties (e.g., which of two boxes is heavier), making it essential to consider non-visual modalities as well, such as the tactile and auditory. Whereas robots may leverage various modalities to obtain object property understanding via learned exploratory interactions with objects (e.g., gras**, lifting, and shaking behaviors), challenges remain: the implicit knowledge acquired by one robot via object exploration cannot be directly leveraged by another robot with different morphology, because the sensor models, observed data distributions, and interaction capabilities are different across these different robot configurations. To avoid the costly process of learning interactive object perception tasks from scratch, we propose a multi-stage projection framework for each new robot for transferring implicit knowledge of object properties across heterogeneous robot morphologies. We evaluate our approach on the object-property recognition and object-identity recognition tasks, using a dataset containing two heterogeneous robots that perform 7,600 object interactions. Results indicate that knowledge can be transferred across robots, such that a newly-deployed robot can bootstrap its recognition models without exhaustively exploring all objects. We also propose a data augmentation technique and show that this technique improves the generalization of models. We release our code and datasets, here: https://github.com/gtatiya/Implicit-Knowledge-Transfer. △ Less

Submitted 6 July, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

Comments: In proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 29 - June 2, 2023 , ExCeL London, UK

arXiv:2206.12493 [pdf, other]

RAPid-Learn: A Framework for Learning to Recover for Handling Novelties in Open-World Environments

Authors: Shivam Goel, Yash Shukla, Vasanth Sarathy, Matthias Scheutz, Jivko Sinapov

Abstract: We propose RAPid-Learn: Learning to Recover and Plan Again, a hybrid planning and learning method, to tackle the problem of adapting to sudden and unexpected changes in an agent's environment (i.e., novelties). RAPid-Learn is designed to formulate and solve modifications to a task's Markov Decision Process (MDPs) on-the-fly and is capable of exploiting domain knowledge to learn any new dynamics ca… ▽ More We propose RAPid-Learn: Learning to Recover and Plan Again, a hybrid planning and learning method, to tackle the problem of adapting to sudden and unexpected changes in an agent's environment (i.e., novelties). RAPid-Learn is designed to formulate and solve modifications to a task's Markov Decision Process (MDPs) on-the-fly and is capable of exploiting domain knowledge to learn any new dynamics caused by the environmental changes. It is capable of exploiting the domain knowledge to learn action executors which can be further used to resolve execution impasses, leading to a successful plan execution. This novelty information is reflected in its updated domain model. We demonstrate its efficacy by introducing a wide variety of novelties in a gridworld environment inspired by Minecraft, and compare our algorithm with transfer learning baselines from the literature. Our method is (1) effective even in the presence of multiple novelties, (2) more sample efficient than transfer learning RL baselines, and (3) robust to incomplete model information, as opposed to pure symbolic planning approaches. △ Less

Submitted 24 June, 2022; originally announced June 2022.

Comments: Proceedings of the IEEE Conference on Development and Learning (ICDL 2022)

arXiv:2204.10358 [pdf, other]

doi 10.1613/jair.1.13864

Creative Problem Solving in Artificially Intelligent Agents: A Survey and Framework

Authors: Evana Gizzi, Lakshmi Nair, Sonia Chernova, Jivko Sinapov

Abstract: Creative Problem Solving (CPS) is a sub-area within Artificial Intelligence (AI) that focuses on methods for solving off-nominal, or anomalous problems in autonomous systems. Despite many advancements in planning and learning, resolving novel problems or adapting existing knowledge to a new context, especially in cases where the environment may change in unpredictable ways post deployment, remains… ▽ More Creative Problem Solving (CPS) is a sub-area within Artificial Intelligence (AI) that focuses on methods for solving off-nominal, or anomalous problems in autonomous systems. Despite many advancements in planning and learning, resolving novel problems or adapting existing knowledge to a new context, especially in cases where the environment may change in unpredictable ways post deployment, remains a limiting factor in the safe and useful integration of intelligent systems. The emergence of increasingly autonomous systems dictates the necessity for AI agents to deal with environmental uncertainty through creativity. To stimulate further research in CPS, we present a definition and a framework of CPS, which we adopt to categorize existing AI methods in this field. Our framework consists of four main components of a CPS problem, namely, 1) problem formulation, 2) knowledge representation, 3) method of knowledge manipulation, and 4) method of evaluation. We conclude our survey with open research questions, and suggested directions for the future. △ Less

Submitted 21 April, 2022; originally announced April 2022.

Comments: 46 pages (including appendix), 17 figures, under submission at Journal of Artificial Intelligence Research (JAIR)

Report number: Vol. 75

Journal ref: Journal of Artificial Intelligence Research 2022

arXiv:2204.04823 [pdf, other]

ACuTE: Automatic Curriculum Transfer from Simple to Complex Environments

Authors: Yash Shukla, Christopher Thierauf, Ramtin Hosseini, Gyan Tatiya, Jivko Sinapov

Abstract: Despite recent advances in Reinforcement Learning (RL), many problems, especially real-world tasks, remain prohibitively expensive to learn. To address this issue, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum to learn a problem that may otherwise be too difficult to learn from scratch. However, generating and optimizing a curricu… ▽ More Despite recent advances in Reinforcement Learning (RL), many problems, especially real-world tasks, remain prohibitively expensive to learn. To address this issue, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum to learn a problem that may otherwise be too difficult to learn from scratch. However, generating and optimizing a curriculum in a realistic scenario still requires extensive interactions with the environment. To address this challenge, we formulate the curriculum transfer problem, in which the schema of a curriculum optimized in a simpler, easy-to-solve environment (e.g., a grid world) is transferred to a complex, realistic scenario (e.g., a physics-based robotics simulation or the real world). We present "ACuTE", Automatic Curriculum Transfer from Simple to Complex Environments, a novel framework to solve this problem, and evaluate our proposed method by comparing it to other baseline approaches (e.g., domain adaptation) designed to speed up learning. We observe that our approach produces improved jumpstart and time-to-threshold performance even when adding task elements that further increase the difficulty of the realistic scenario. Finally, we demonstrate that our approach is independent of the learning algorithm used for curriculum generation, and is Sim2Real transferable to a real world scenario using a physical robot. △ Less

Submitted 10 April, 2022; originally announced April 2022.

arXiv:2110.04697 [pdf, other]

An Augmented Reality Platform for Introducing Reinforcement Learning to K-12 Students with Robots

Authors: Ziyi Zhang, Samuel Micah Akai-Nettey, Adonai Addo, Chris Rogers, Jivko Sinapov

Abstract: Interactive reinforcement learning, where humans actively assist during an agent's learning process, has the promise to alleviate the sample complexity challenges of practical algorithms. However, the inner workings and state of the robot are typically hidden from the teacher when humans provide feedback. To create a common ground between the human and the learning robot, in this paper, we propose… ▽ More Interactive reinforcement learning, where humans actively assist during an agent's learning process, has the promise to alleviate the sample complexity challenges of practical algorithms. However, the inner workings and state of the robot are typically hidden from the teacher when humans provide feedback. To create a common ground between the human and the learning robot, in this paper, we propose an Augmented Reality (AR) system that reveals the hidden state of the learning to the human users. This paper describes our system's design and implementation and concludes with a discussion on two directions for future work which we are pursuing: 1) use of our system in AI education activities at the K-12 level; and 2) development of a framework for an AR-based human-in-the-loop reinforcement learning, where the human teacher can see sensory and cognitive representations of the robot overlaid in the real world. △ Less

Submitted 9 October, 2021; originally announced October 2021.

Comments: AAAI-FSS 2021 (arXiv:2109.10836)

Report number: AIHRI/2021/43

arXiv:2109.10836

AI-HRI 2021 Proceedings

Authors: Reuth Mirsky, Megan Zimmerman, Muneed Ahmad, Shelly Bagchi, Felix Gervits, Zhao Han, Justin Hart, Daniel Hernández García, Matteo Leonetti, Ross Mead, Emmanuel Senft, Jivko Sinapov, Jason Wilson

Abstract: The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration since 2014. During that time, these symposia provided a fertile ground for numerous collaborations and pioneered many discussions revolving trust in HRI, XAI for HRI, service robots, interactive learning, and more. This year, we aim to review the achievements o… ▽ More The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration since 2014. During that time, these symposia provided a fertile ground for numerous collaborations and pioneered many discussions revolving trust in HRI, XAI for HRI, service robots, interactive learning, and more. This year, we aim to review the achievements of the AI-HRI community in the last decade, identify the challenges facing ahead, and welcome new researchers who wish to take part in this growing community. Taking this wide perspective, this year there will be no single theme to lead the symposium and we encourage AI-HRI submissions from across disciplines and research interests. Moreover, with the rising interest in AR and VR as part of an interaction and following the difficulties in running physical experiments during the pandemic, this year we specifically encourage researchers to submit works that do not include a physical robot in their evaluation, but promote HRI research in general. In addition, acknowledging that ethics is an inherent part of the human-robot interaction, we encourage submissions of works on ethics for HRI. Over the course of the two-day meeting, we will host a collaborative forum for discussion of current efforts in AI-HRI, with additional talks focused on the topics of ethics in HRI and ubiquitous HRI. △ Less

Submitted 23 September, 2021; v1 submitted 22 September, 2021; originally announced September 2021.

Comments: Proceedings of the AI-HRI Symposium at AAAI-FSS 2021

Report number: AIHRI/2021/01

arXiv:2109.07561 [pdf, other]

A Framework for Multisensory Foresight for Embodied Agents

Authors: Xiaohui Chen, Ramtin Hosseini, Karen Panetta, Jivko Sinapov

Abstract: Predicting future sensory states is crucial for learning agents such as robots, drones, and autonomous vehicles. In this paper, we couple multiple sensory modalities with exploratory actions and propose a predictive neural network architecture to address this problem. Most existing approaches rely on large, manually annotated datasets, or only use visual data as a single modality. In contrast, the… ▽ More Predicting future sensory states is crucial for learning agents such as robots, drones, and autonomous vehicles. In this paper, we couple multiple sensory modalities with exploratory actions and propose a predictive neural network architecture to address this problem. Most existing approaches rely on large, manually annotated datasets, or only use visual data as a single modality. In contrast, the unsupervised method presented here uses multi-modal perceptions for predicting future visual frames. As a result, the proposed model is more comprehensive and can better capture the spatio-temporal dynamics of the environment, leading to more accurate visual frame prediction. The other novelty of our framework is the use of sub-networks dedicated to anticipating future haptic, audio, and tactile signals. The framework was tested and validated with a dataset containing 4 sensory modalities (vision, haptic, audio, and tactile) on a humanoid robot performing 9 behaviors multiple times on a large set of objects. While the visual information is the dominant modality, utilizing the additional non-visual modalities improves the accuracy of predictions. △ Less

Submitted 15 September, 2021; originally announced September 2021.

Comments: ICRA 2021

arXiv:2106.03029 [pdf, other]

Planning Multimodal Exploratory Actions for Online Robot Attribute Learning

Authors: Xiaohan Zhang, Jivko Sinapov, Shiqi Zhang

Abstract: Robots frequently need to perceive object attributes, such as "red," "heavy," and "empty," using multimodal exploratory actions, such as "look," "lift," and "shake." Robot attribute learning algorithms aim to learn an observation model for each perceivable attribute given an exploratory action. Once the attribute models are learned, they can be used to identify attributes of new objects, answering… ▽ More Robots frequently need to perceive object attributes, such as "red," "heavy," and "empty," using multimodal exploratory actions, such as "look," "lift," and "shake." Robot attribute learning algorithms aim to learn an observation model for each perceivable attribute given an exploratory action. Once the attribute models are learned, they can be used to identify attributes of new objects, answering questions, such as "Is this object red and empty?" Attribute learning and identification are being treated as two separate problems in the literature. In this paper, we first define a new problem called online robot attribute learning (On-RAL), where the robot works on attribute learning and attribute identification simultaneously. Then we develop an algorithm called information-theoretic reward sha** (ITRS) that actively addresses the trade-off between exploration and exploitation in On-RAL problems. ITRS was compared with competitive robot attribute learning baselines, and experimental results demonstrate ITRS' superiority in learning efficiency and identification accuracy. △ Less

Submitted 7 June, 2021; v1 submitted 6 June, 2021; originally announced June 2021.

Comments: To be published in Robotics: Science and Systems (RSS), July 12-16, 2021

arXiv:2012.13037 [pdf, other]

SPOTTER: Extending Symbolic Planning Operators through Targeted Reinforcement Learning

Authors: Vasanth Sarathy, Daniel Kasenberg, Shivam Goel, Jivko Sinapov, Matthias Scheutz

Abstract: Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains. However, they are typically handcrafted and tend to require precise formulations that are not robust to human error. Reinforcement learning (RL) approaches do not require such models, and instead learn domain dynamics by exploring the environment and collect… ▽ More Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains. However, they are typically handcrafted and tend to require precise formulations that are not robust to human error. Reinforcement learning (RL) approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards. However, RL approaches tend to require millions of episodes of experience and often learn policies that are not easily transferable to other tasks. In this paper, we address one aspect of the open problem of integrating these approaches: how can decision-making agents resolve discrepancies in their symbolic planning models while attempting to accomplish goals? We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed by the agent to accomplish goals that are initially unreachable for the agent. SPOTTER outperforms pure-RL approaches while also discovering transferable symbolic knowledge and does not require supervision, successful plan traces or any a priori knowledge about the missing planning operator. △ Less

Submitted 23 December, 2020; originally announced December 2020.

Comments: Accepted to AAMAS 2021

arXiv:2011.04515 [pdf, other]

SENSAR: A Visual Tool for Intelligent Robots for Collaborative Human-Robot Interaction

Authors: Andre Cleaver, Faizan Muhammad, Amel Hassan, Elaine Short, Jivko Sinapov

Abstract: Establishing common ground between an intelligent robot and a human requires communication of the robot's intention, behavior, and knowledge to the human to build trust and assure safety in a shared environment. This paper introduces SENSAR (Seeing Everything iN Situ with Augmented Reality), an augmented reality robotic system that enables robots to communicate their sensory and cognitive data in… ▽ More Establishing common ground between an intelligent robot and a human requires communication of the robot's intention, behavior, and knowledge to the human to build trust and assure safety in a shared environment. This paper introduces SENSAR (Seeing Everything iN Situ with Augmented Reality), an augmented reality robotic system that enables robots to communicate their sensory and cognitive data in context over the real-world with rendered graphics, allowing a user to understand, correct, and validate the robot's perception of the world. Our system aims to support human-robot interaction research by establishing common ground where the perceptions of the human and the robot align. △ Less

Submitted 9 November, 2020; originally announced November 2020.

arXiv:2011.03464 [pdf, other]

HAVEN: A Unity-based Virtual Robot Environment to Showcase HRI-based Augmented Reality

Authors: Andre Cleaver, Darren Tang, Victoria Chen, Jivko Sinapov

Abstract: Due to the COVID-19 pandemic, conducting Human-Robot Interaction (HRI) studies in person is not permissible due to social distancing practices to limit the spread of the virus. Therefore, a virtual reality (VR) simulation with a virtual robot may offer an alternative to real-life HRI studies. Like a real intelligent robot, a virtual robot can utilize the same advanced algorithms to behave autonomo… ▽ More Due to the COVID-19 pandemic, conducting Human-Robot Interaction (HRI) studies in person is not permissible due to social distancing practices to limit the spread of the virus. Therefore, a virtual reality (VR) simulation with a virtual robot may offer an alternative to real-life HRI studies. Like a real intelligent robot, a virtual robot can utilize the same advanced algorithms to behave autonomously. This paper introduces HAVEN (HRI-based Augmentation in a Virtual robot Environment using uNity), a VR simulation that enables users to interact with a virtual robot. The goal of this system design is to enable researchers to conduct HRI Augmented Reality studies using a virtual robot without being in a real environment. This framework also introduces two common HRI experiment designs: a hallway passing scenario and human-robot team object retrieval scenario. Both reflect HAVEN's potential as a tool for future AR-based HRI studies. △ Less

Submitted 6 November, 2020; originally announced November 2020.

arXiv:2010.13830

Proceedings of the AI-HRI Symposium at AAAI-FSS 2020

Authors: Shelly Bagchi, Jason R. Wilson, Muneeb I. Ahmad, Christian Dondrup, Zhao Han, Justin W. Hart, Matteo Leonetti, Katrin Lohan, Ross Mead, Emmanuel Senft, Jivko Sinapov, Megan L. Zimmerman

Abstract: The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration since 2014. In that time, the related topic of trust in robotics has been rapidly growing, with major research efforts at universities and laboratories across the world. Indeed, many of the past participants in AI-HRI have been or are now involved with research i… ▽ More The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration since 2014. In that time, the related topic of trust in robotics has been rapidly growing, with major research efforts at universities and laboratories across the world. Indeed, many of the past participants in AI-HRI have been or are now involved with research into trust in HRI. While trust has no consensus definition, it is regularly associated with predictability, reliability, inciting confidence, and meeting expectations. Furthermore, it is generally believed that trust is crucial for adoption of both AI and robotics, particularly when transitioning technologies from the lab to industrial, social, and consumer applications. However, how does trust apply to the specific situations we encounter in the AI-HRI sphere? Is the notion of trust in AI the same as that in HRI? We see a growing need for research that lives directly at the intersection of AI and HRI that is serviced by this symposium. Over the course of the two-day meeting, we propose to create a collaborative forum for discussion of current efforts in trust for AI-HRI, with a sub-session focused on the related topic of explainable AI (XAI) for HRI. △ Less

Submitted 14 December, 2020; v1 submitted 26 October, 2020; originally announced October 2020.

Comments: Symposium proceedings

arXiv:2003.04960 [pdf, other]

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Authors: Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone

Abstract: Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning ha… ▽ More Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task. More recently, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum for the purpose of learning a problem that may otherwise be too difficult to learn from scratch. In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals. Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research. △ Less

Submitted 17 September, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

Journal ref: Journal of Machine Learning Research 21(181):1-50, 2020

arXiv:1909.04812

Proceedings of the AI-HRI Symposium at AAAI-FSS 2019

Authors: Justin W. Hart, Nick DePalma, Richard G. Freedman, Luca Iocchi, Matteo Leonetti, Katrin Lohan, Ross Mead, Emmanuel Senft, Jivko Sinapov, Elin A. Topp, Tom Williams

Abstract: The past few years have seen rapid progress in the development of service robots. Universities and companies alike have launched major research efforts toward the deployment of ambitious systems designed to aid human operators performing a variety of tasks. These robots are intended to make those who may otherwise need to live in assisted care facilities more independent, to help workers perform t… ▽ More The past few years have seen rapid progress in the development of service robots. Universities and companies alike have launched major research efforts toward the deployment of ambitious systems designed to aid human operators performing a variety of tasks. These robots are intended to make those who may otherwise need to live in assisted care facilities more independent, to help workers perform their jobs, or simply to make life more convenient. Service robots provide a powerful platform on which to study Artificial Intelligence (AI) and Human-Robot Interaction (HRI) in the real world. Research sitting at the intersection of AI and HRI is crucial to the success of service robots if they are to fulfill their mission. This symposium seeks to highlight research enabling robots to effectively interact with people autonomously while modeling, planning, and reasoning about the environment that the robot operates in and the tasks that it must perform. AI-HRI deals with the challenge of interacting with humans in environments that are relatively unstructured or which are structured around people rather than machines, as well as the possibility that the robot may need to interact naturally with people rather than through teach pendants, programming, or similar interfaces. △ Less

Submitted 19 September, 2019; v1 submitted 10 September, 2019; originally announced September 2019.

Comments: HTML file with clickable links to papers - All papers have been reviewed by at least two reviewers in a single blind fashion - Symposium website: https://ai-hri.github.io/2019/

arXiv:1903.00122 [pdf, other]

doi 10.1109/ICRA.2019.8794287

Improving Grounded Natural Language Understanding through Human-Robot Dialog

Authors: Jesse Thomason, Aishwarya Padmakumar, Jivko Sinapov, Nick Walker, Yuqian Jiang, Harel Yedidsion, Justin Hart, Peter Stone, Raymond J. Mooney

Abstract: Natural language understanding for robotics can require substantial domain- and platform-specific engineering. For example, for mobile robots to pick-and-place objects in an environment to satisfy human commands, we can specify the language humans use to issue such commands, and connect concept words like red can to physical object properties. One way to alleviate this engineering for a new domain… ▽ More Natural language understanding for robotics can require substantial domain- and platform-specific engineering. For example, for mobile robots to pick-and-place objects in an environment to satisfy human commands, we can specify the language humans use to issue such commands, and connect concept words like red can to physical object properties. One way to alleviate this engineering for a new domain is to enable robots in human environments to adapt dynamically---continually learning new language constructions and perceptual concepts. In this work, we present an end-to-end pipeline for translating natural language commands to discrete robot actions, and use clarification dialogs to jointly improve language parsing and concept grounding. We train and evaluate this agent in a virtual setting on Amazon Mechanical Turk, and we transfer the learned agent to a physical robot platform to demonstrate it in the real world. △ Less

Submitted 28 February, 2019; originally announced March 2019.

arXiv:1810.02919 [pdf, other]

Interaction and Autonomy in RoboCup@Home and Building-Wide Intelligence

Authors: Justin Hart, Harel Yedidsion, Yuqian Jiang, Nick Walker, Rishi Shah, Jesse Thomason, Aishwarya Padmakumar, Rolando Fernandez, Jivko Sinapov, Raymond Mooney, Peter Stone

Abstract: Efforts are underway at UT Austin to build autonomous robot systems that address the challenges of long-term deployments in office environments and of the more prescribed domestic service tasks of the RoboCup@Home competition. We discuss the contrasts and synergies of these efforts, highlighting how our work to build a RoboCup@Home Domestic Standard Platform League entry led us to identify an inte… ▽ More Efforts are underway at UT Austin to build autonomous robot systems that address the challenges of long-term deployments in office environments and of the more prescribed domestic service tasks of the RoboCup@Home competition. We discuss the contrasts and synergies of these efforts, highlighting how our work to build a RoboCup@Home Domestic Standard Platform League entry led us to identify an integrated software architecture that could support both projects. Further, naturalistic deployments of our office robot platform as part of the Building-Wide Intelligence project have led us to identify and research new problems in a traditional laboratory setting. △ Less

Submitted 5 October, 2018; originally announced October 2018.

Comments: Presented at AI-HRI AAAI-FSS, 2018 (arXiv:1809.06606)

Report number: AI-HRI/2018/10

Showing 1–27 of 27 results for author: Sinapov, J