-
KOROL: Learning Visualizable Object Feature with Koopman Operator Rollout for Manipulation
Authors:
Hongyi Chen,
Abulikemu Abuduweili,
Aviral Agrawal,
Yunhai Han,
Harish Ravichandar,
Changliu Liu,
Jeffrey Ichnowski
Abstract:
Learning dexterous manipulation skills presents significant challenges due to complex nonlinear dynamics that underlie the interactions between objects and multi-fingered hands. Koopman operators have emerged as a robust method for modeling such nonlinear dynamics within a linear framework. However, current methods rely on runtime access to ground-truth (GT) object states, making them unsuitable f…
▽ More
Learning dexterous manipulation skills presents significant challenges due to complex nonlinear dynamics that underlie the interactions between objects and multi-fingered hands. Koopman operators have emerged as a robust method for modeling such nonlinear dynamics within a linear framework. However, current methods rely on runtime access to ground-truth (GT) object states, making them unsuitable for vision-based practical applications. Unlike image-to-action policies that implicitly learn visual features for control, we use a dynamics model, specifically the Koopman operator, to learn visually interpretable object features critical for robotic manipulation within a scene. We construct a Koopman operator using object features predicted by a feature extractor and utilize it to auto-regressively advance system states. We train the feature extractor to embed scene information into object features, thereby enabling the accurate propagation of robot trajectories. We evaluate our approach on simulated and real-world robot tasks, with results showing that it outperformed the model-based imitation learning NDP by 1.08$\times$ and the image-to-action Diffusion Policy by 1.16$\times$. The results suggest that our method maintains task success rates with learned features and extends applicability to real-world manipulation without GT object states.
△ Less
Submitted 29 June, 2024;
originally announced July 2024.
-
Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget
Authors:
Glen Neville,
Jiazhen Liu,
Sonia Chernova,
Harish Ravichandar
Abstract:
Complex multi-objective missions require the coordination of heterogeneous robots at multiple inter-connected levels, such as coalition formation, scheduling, and motion planning. The associated challenges are exacerbated when solutions to these interconnected problems need to both maximize task performance and respect practical constraints on time and resources. In this work, we formulate a new c…
▽ More
Complex multi-objective missions require the coordination of heterogeneous robots at multiple inter-connected levels, such as coalition formation, scheduling, and motion planning. The associated challenges are exacerbated when solutions to these interconnected problems need to both maximize task performance and respect practical constraints on time and resources. In this work, we formulate a new class of spatio-temporal heterogeneous task allocation problems that consider these complexities. We contribute a novel framework, named Quality-Optimized Incremental Task Allocation Graph Search (Q-ITAGS), to solve such problems. Q-ITAGS builds upon our prior work in trait-based coordination and offers a flexible interleaved framework that i) explicitly models and optimizes the effect of collective capabilities on task performance via learnable trait-quality maps, and ii) respects both resource constraints and spatio-temporal constraints, including a user-specified time budget (i.e., maximum makespan). In addition to algorithmic contributions, we derive theoretical suboptimality bounds in terms of task performance that varies as a function of a single hyperparameter. Our detailed experiments involving a simulated emergency response task and a real-world video game dataset reveal that i) Q-ITAGS results in superior team performance compared to a state-of-the-art method, while also respecting complex spatio-temporal and resource constraints, ii) Q-ITAGS efficiently learns trait-quality maps to enable effective trade-off between task performance and resource constraints, and iii) Q-ITAGS' suboptimality bounds consistently hold in practice.
△ Less
Submitted 11 April, 2024;
originally announced April 2024.
-
Learning Prehensile Dexterity by Imitating and Emulating State-only Observations
Authors:
Yunhai Han,
Zhenyang Chen,
Harish Ravichandar
Abstract:
When human acquire physical skills (e.g., tennis) from experts, we tend to first learn from merely observing the expert. But this is often insufficient. We then engage in practice, where we try to emulate the expert and ensure that our actions produce similar effects on our environment. Inspired by this observation, we introduce Combining IMitation and Emulation for Motion Refinement (CIMER) -- a…
▽ More
When human acquire physical skills (e.g., tennis) from experts, we tend to first learn from merely observing the expert. But this is often insufficient. We then engage in practice, where we try to emulate the expert and ensure that our actions produce similar effects on our environment. Inspired by this observation, we introduce Combining IMitation and Emulation for Motion Refinement (CIMER) -- a two-stage framework to learn dexterous prehensile manipulation skills from state-only observations. CIMER's first stage involves imitation: simultaneously encode the complex interdependent motions of the robot hand and the object in a structured dynamical system. This results in a reactive motion generation policy that provides a reasonable motion prior, but lacks the ability to reason about contact effects due to the lack of action labels. The second stage involves emulation: learn a motion refinement policy via reinforcement that adjusts the robot hand's motion prior such that the desired object motion is reenacted. CIMER is both task-agnostic (no task-specific reward design or sha**) and intervention-free (no additional teleoperated or labeled demonstrations). Detailed experiments with prehensile dexterity reveal that i) imitation alone is insufficient, but adding emulation drastically improves performance, ii) CIMER outperforms existing methods in terms of sample efficiency and the ability to generate realistic and stable motions, iii) CIMER can either zero-shot generalize or learn to adapt to novel objects from the YCB dataset, even outperforming expert policies trained with action labels in most cases. Source code and videos are available at https://sites.google.com/view/cimer-2024/.
△ Less
Submitted 12 April, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities
Authors:
Pierce Howell,
Max Rudolph,
Reza Torbati,
Kevin Fu,
Harish Ravichandar
Abstract:
Recent advances in multi-agent reinforcement learning (MARL) are enabling impressive coordination in heterogeneous multi-robot teams. However, existing approaches often overlook the challenge of generalizing learned policies to teams of new compositions, sizes, and robots. While such generalization might not be important in teams of virtual agents that can retrain policies on-demand, it is pivotal…
▽ More
Recent advances in multi-agent reinforcement learning (MARL) are enabling impressive coordination in heterogeneous multi-robot teams. However, existing approaches often overlook the challenge of generalizing learned policies to teams of new compositions, sizes, and robots. While such generalization might not be important in teams of virtual agents that can retrain policies on-demand, it is pivotal in multi-robot systems that are deployed in the real-world and must readily adapt to inevitable changes. As such, multi-robot policies must remain robust to team changes -- an ability we call adaptive teaming. In this work, we investigate if awareness and communication of robot capabilities can provide such generalization by conducting detailed experiments involving an established multi-robot test bed. We demonstrate that shared decentralized policies, that enable robots to be both aware of and communicate their capabilities, can achieve adaptive teaming by implicitly capturing the fundamental relationship between collective capabilities and effective coordination. Videos of trained policies can be viewed at: https://sites.google.com/view/cap-comm
△ Less
Submitted 23 January, 2024;
originally announced January 2024.
-
The Effects of Robot Motion on Comfort Dynamics of Novice Users in Close-Proximity Human-Robot Interaction
Authors:
Pierce Howell,
Jack Kolb,
Yifan Liu,
Harish Ravichandar
Abstract:
Effective and fluent close-proximity human-robot interaction requires understanding how humans get habituated to robots and how robot motion affects human comfort. While prior work has identified humans' preferences over robot motion characteristics and studied their influence on comfort, we are yet to understand how novice first-time robot users get habituated to robots and how robot motion impac…
▽ More
Effective and fluent close-proximity human-robot interaction requires understanding how humans get habituated to robots and how robot motion affects human comfort. While prior work has identified humans' preferences over robot motion characteristics and studied their influence on comfort, we are yet to understand how novice first-time robot users get habituated to robots and how robot motion impacts the dynamics of comfort over repeated interactions. To take the first step towards such understanding, we carry out a user study to investigate the connections between robot motion and user comfort and habituation. Specifically, we study the influence of workspace overlap, end-effector speed, and robot motion legibility on overall comfort and its evolution over repeated interactions. Our analyses reveal that workspace overlap, in contrast to speed and legibility, has a significant impact on users' perceived comfort and habituation. In particular, lower workspace overlap leads to users reporting significantly higher overall comfort, lower variations in comfort, and fewer fluctuations in comfort levels during habituation.
△ Less
Submitted 2 August, 2023;
originally announced August 2023.
-
MARBLER: An Open Platform for Standardized Evaluation of Multi-Robot Reinforcement Learning Algorithms
Authors:
Reza Torbati,
Shubham Lohiya,
Shivika Singh,
Meher Shashwat Nigam,
Harish Ravichandar
Abstract:
Multi-Agent Reinforcement Learning (MARL) has enjoyed significant recent progress thanks, in part, to the integration of deep learning techniques for modeling interactions in complex environments. This is naturally starting to benefit multi-robot systems (MRS) in the form of multi-robot RL (MRRL). However, existing infrastructure to train and evaluate policies predominantly focus on the challenges…
▽ More
Multi-Agent Reinforcement Learning (MARL) has enjoyed significant recent progress thanks, in part, to the integration of deep learning techniques for modeling interactions in complex environments. This is naturally starting to benefit multi-robot systems (MRS) in the form of multi-robot RL (MRRL). However, existing infrastructure to train and evaluate policies predominantly focus on the challenges of coordinating virtual agents, and ignore characteristics important to robotic systems. Few platforms support realistic robot dynamics, and fewer still can evaluate Sim2Real performance of learned behavior. To address these issues, we contribute MARBLER: Multi-Agent RL Benchmark and Learning Environment for the Robotarium. MARBLER offers a robust and comprehensive evaluation platform for MRRL by marrying Georgia Tech's Robotarium (which enables rapid deployment on physical MRS) and OpenAI's Gym interface (which facilitates standardized use of modern learning algorithms). MARBLER offers a highly controllable environment with realistic dynamics, including barrier certificate-based obstacle avoidance. It allows anyone across the world to train and deploy MRRL algorithms on a physical testbed with reproducibility. Further, we introduce five novel scenarios inspired by common challenges in MRS and provide support for new custom scenarios. Finally, we use MARBLER to evaluate popular MARL algorithms and provide insights into their suitability for MRRL. In summary, MARBLER can be a valuable tool to the MRS research community by facilitating comprehensive and standardized evaluation of learning algorithms on realistic simulations and physical hardware. Links to our open-source framework and videos of real-world experiments can be found at https://shubhlohiya.github.io/MARBLER/.
△ Less
Submitted 21 October, 2023; v1 submitted 7 July, 2023;
originally announced July 2023.
-
Concurrent Constrained Optimization of Unknown Rewards for Multi-Robot Task Allocation
Authors:
Sukriti Singh,
Anusha Srikanthan,
Vivek Mallampati,
Harish Ravichandar
Abstract:
Task allocation can enable effective coordination of multi-robot teams to accomplish tasks that are intractable for individual robots. However, existing approaches to task allocation often assume that task requirements or reward functions are known and explicitly specified by the user. In this work, we consider the challenge of forming effective coalitions for a given heterogeneous multi-robot tea…
▽ More
Task allocation can enable effective coordination of multi-robot teams to accomplish tasks that are intractable for individual robots. However, existing approaches to task allocation often assume that task requirements or reward functions are known and explicitly specified by the user. In this work, we consider the challenge of forming effective coalitions for a given heterogeneous multi-robot team when task reward functions are unknown. To this end, we first formulate a new class of problems, dubbed COncurrent Constrained Online optimization of Allocation (COCOA). The COCOA problem requires online optimization of coalitions such that the unknown rewards of all the tasks are simultaneously maximized using a given multi-robot team with constrained resources. To address the COCOA problem, we introduce an online optimization algorithm, named Concurrent Multi-Task Adaptive Bandits (CMTAB), that leverages and builds upon continuum-armed bandit algorithms. Experiments involving detailed numerical simulations and a simulated emergency response task reveal that CMTAB can effectively trade-off exploration and exploitation to simultaneously and efficiently optimize the unknown task rewards while respecting the team's resource constraints.
△ Less
Submitted 24 May, 2023;
originally announced May 2023.
-
Fast Anticipatory Motion Planning for Close-Proximity Human-Robot Interaction
Authors:
Sam Scheele,
Pierce Howell,
Harish Ravichandar
Abstract:
Effective close-proximity human-robot interaction (CP-HRI) requires robots to be able to both efficiently perform tasks as well as adapt to human behavior and preferences. However, this ability is mediated by many, sometimes competing, aspects of interaction. We propose a real-time motion-planning framework for robotic manipulators that can simultaneously optimize a set of both task- and human-cen…
▽ More
Effective close-proximity human-robot interaction (CP-HRI) requires robots to be able to both efficiently perform tasks as well as adapt to human behavior and preferences. However, this ability is mediated by many, sometimes competing, aspects of interaction. We propose a real-time motion-planning framework for robotic manipulators that can simultaneously optimize a set of both task- and human-centric cost functions. To this end, we formulate a Nonlinear Model-Predictive Control (NMPC) problem with kino-dynamic constraints and efficiently solve it by leveraging recent advances in nonlinear trajectory optimization. We employ stochastic predictions of the human partner's trajectories in order to adapt the robot's nominal behavior in anticipation of its human partner. Our framework explicitly models and allows balancing of different task- and human-centric cost functions. While previous approaches to trajectory optimization for CP-HRI take anywhere from several seconds to a full minute to compute a trajectory, our approach is capable of computing one in 318 ms on average, enabling real-time implementation. We illustrate the effectiveness of our framework by simultaneously optimizing for separation distance, end-effector visibility, legibility, smoothness, and deviation from nominal behavior. We also demonstrate that our approach performs comparably to prior work in terms of the chosen cost functions, while significantly improving computational efficiency.
△ Less
Submitted 19 May, 2023;
originally announced May 2023.
-
On the Utility of Koopman Operator Theory in Learning Dexterous Manipulation Skills
Authors:
Yunhai Han,
Mandy Xie,
Ye Zhao,
Harish Ravichandar
Abstract:
Despite impressive dexterous manipulation capabilities enabled by learning-based approaches, we are yet to witness widespread adoption beyond well-resourced laboratories. This is likely due to practical limitations, such as significant computational burden, inscrutable learned behaviors, sensitivity to initialization, and the considerable technical expertise required for implementation. In this wo…
▽ More
Despite impressive dexterous manipulation capabilities enabled by learning-based approaches, we are yet to witness widespread adoption beyond well-resourced laboratories. This is likely due to practical limitations, such as significant computational burden, inscrutable learned behaviors, sensitivity to initialization, and the considerable technical expertise required for implementation. In this work, we investigate the utility of Koopman operator theory in alleviating these limitations. Koopman operators are simple yet powerful control-theoretic structures to represent complex nonlinear dynamics as linear systems in higher dimensions. Motivated by the fact that complex nonlinear dynamics underlie dexterous manipulation, we develop a Koopman operator-based imitation learning framework to learn the desired motions of both the robotic hand and the object simultaneously. We show that Koopman operators are surprisingly effective for dexterous manipulation and offer a number of unique benefits. Notably, policies can be learned analytically, drastically reducing computation burden and eliminating sensitivity to initialization and the need for painstaking hyperparameter optimization. Our experiments reveal that a Koopman operator-based approach can perform comparably to state-of-the-art imitation learning algorithms in terms of success rate and sample efficiency, while being an order of magnitude faster. Policy videos can be viewed at https://sites.google.com/view/kodex-corl.
△ Less
Submitted 30 August, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Inferring Implicit Trait Preferences for Task Allocation in Heterogeneous Teams
Authors:
Vivek Mallampati,
Harish Ravichandar
Abstract:
Task allocation in heterogeneous multi-agent teams often requires reasoning about multi-dimensional agent traits (i.e., capabilities) and the demands placed on them by tasks. However, existing methods tend to ignore the fact that not all traits equally contribute to a given task. Ignoring such inherent preferences or relative importance can lead to unintended sub-optimal allocations of limited age…
▽ More
Task allocation in heterogeneous multi-agent teams often requires reasoning about multi-dimensional agent traits (i.e., capabilities) and the demands placed on them by tasks. However, existing methods tend to ignore the fact that not all traits equally contribute to a given task. Ignoring such inherent preferences or relative importance can lead to unintended sub-optimal allocations of limited agent resources that do not necessarily contribute to task success. Further, reasoning over a large number of traits can incur a hefty computational burden. To alleviate these concerns, we propose an algorithm to infer task-specific trait preferences implicit in expert demonstrations. We leverage the insight that the consistency with which an expert allocates a trait to a task across demonstrations reflects the trait's importance to that task. Inspired by findings in psychology, we account for the fact that the inherent diversity of a trait in the dataset influences the dataset's informativeness and, thereby, the extent of the inferred preference or the lack thereof. Through detailed numerical simulations and evaluations of a publicly-available soccer dataset (FIFA 20), we demonstrate that we can successfully infer implicit trait preferences and that accounting for the inferred preferences leads to more computationally efficient and effective task allocation, compared to a baseline approach that treats all traits equally.
△ Less
Submitted 21 February, 2023;
originally announced February 2023.
-
Constrained Reinforcement Learning for Dexterous Manipulation
Authors:
Abhineet Jain,
Jack Kolb,
Harish Ravichandar
Abstract:
Existing learning approaches to dexterous manipulation use demonstrations or interactions with the environment to train black-box neural networks that provide little control over how the robot learns the skills or how it would perform post training. These approaches pose significant challenges when implemented on physical platforms given that, during initial stages of training, the robot's behavio…
▽ More
Existing learning approaches to dexterous manipulation use demonstrations or interactions with the environment to train black-box neural networks that provide little control over how the robot learns the skills or how it would perform post training. These approaches pose significant challenges when implemented on physical platforms given that, during initial stages of training, the robot's behavior could be erratic and potentially harmful to its own hardware, the environment, or any humans in the vicinity. A potential way to address these limitations is to add constraints during learning that restrict and guide the robot's behavior during training as well as roll outs. Inspired by the success of constrained approaches in other domains, we investigate the effects of adding position-based constraints to a 24-DOF robot hand learning to perform object relocation using Constrained Policy Optimization. We find that a simple geometric constraint can ensure the robot learns to move towards the object sooner than without constraints. Further, training with this constraint requires a similar number of samples as its unconstrained counterpart to master the skill. These findings shed light on how simple constraints can help robots achieve sensible and safe behavior quickly and ease concerns surrounding hardware deployment. We also investigate the effects of the strictness of these constraints and report findings that provide insights into how different degrees of strictness affect learning outcomes. Our code is available at https://github.com/GT-STAR-Lab/constrained-rl-dexterous-manipulation.
△ Less
Submitted 23 January, 2023;
originally announced January 2023.
-
D-ITAGS: A Dynamic Interleaved Approach to Resilient Task Allocation, Scheduling, and Motion Planning
Authors:
Glen Neville,
Sonia Chernova,
Harish Ravichandar
Abstract:
Complex, multi-objective missions require the coordination of heterogeneous robots at multiple inter-connected levels, such as coalition formation, scheduling, and motion planning. This challenge is exacerbated by dynamic changes, such as sensor and actuator failures, communication loss, and unexpected delays.
We introduce Dynamic Iterative Task Allocation Graph Search (D-ITAGS) to \textit{simul…
▽ More
Complex, multi-objective missions require the coordination of heterogeneous robots at multiple inter-connected levels, such as coalition formation, scheduling, and motion planning. This challenge is exacerbated by dynamic changes, such as sensor and actuator failures, communication loss, and unexpected delays.
We introduce Dynamic Iterative Task Allocation Graph Search (D-ITAGS) to \textit{simultaneously} address coalition formation, scheduling, and motion planning in dynamic settings involving heterogeneous teams. D-ITAGS achieves resilience via two key characteristics: i) interleaved execution, and ii) targeted repair. \textit{Interleaved execution} enables an effective search for solutions at each layer while avoiding incompatibility with other layers. \textit{Targeted repair} identifies and repairs parts of the existing solution impacted by a given disruption, while conserving the rest. In addition to algorithmic contributions, we provide theoretical insights into the inherent trade-off between time and resource optimality in these settings and derive meaningful bounds on schedule suboptimality.
Our experiments reveal that i) D-ITAGS is significantly faster than recomputation from scratch in dynamic settings, with little to no loss in solution quality, and ii) the theoretical suboptimality bounds consistently hold in practice.
△ Less
Submitted 3 December, 2022; v1 submitted 26 September, 2022;
originally announced September 2022.
-
Evaluating the Effectiveness of Corrective Demonstrations and a Low-Cost Sensor for Dexterous Manipulation
Authors:
Abhineet Jain,
Jack Kolb,
J. M. Abbess IV,
Harish Ravichandar
Abstract:
Imitation learning is a promising approach to help robots acquire dexterous manipulation capabilities without the need for a carefully-designed reward or a significant computational effort. However, existing imitation learning approaches require sophisticated data collection infrastructure and struggle to generalize beyond the training distribution. One way to address this limitation is to gather…
▽ More
Imitation learning is a promising approach to help robots acquire dexterous manipulation capabilities without the need for a carefully-designed reward or a significant computational effort. However, existing imitation learning approaches require sophisticated data collection infrastructure and struggle to generalize beyond the training distribution. One way to address this limitation is to gather additional data that better represents the full operating conditions. In this work, we investigate characteristics of such additional demonstrations and their impact on performance. Specifically, we study the effects of corrective and randomly-sampled additional demonstrations on learning a policy that guides a five-fingered robot hand through a pick-and-place task. Our results suggest that corrective demonstrations considerably outperform randomly-sampled demonstrations, when the proportion of additional demonstrations sampled from the full task distribution is larger than the number of original demonstrations sampled from a restrictive training distribution. Conversely, when the number of original demonstrations are higher than that of additional demonstrations, we find no significant differences between corrective and randomly-sampled additional demonstrations. These results provide insights into the inherent trade-off between the effort required to collect corrective demonstrations and their relative benefits over randomly-sampled demonstrations. Additionally, we show that inexpensive vision-based sensors, such as LeapMotion, can be used to dramatically reduce the cost of providing demonstrations for dexterous manipulation tasks. Our code is available at https://github.com/GT-STAR-Lab/corrective-demos-dexterous-manipulation.
△ Less
Submitted 15 April, 2022;
originally announced April 2022.
-
An Interleaved Approach to Trait-Based Task Allocation and Scheduling
Authors:
Glen Neville,
Andrew Messing,
Harish Ravichandar,
Seth Hutchinson,
Sonia Chernova
Abstract:
To realize effective heterogeneous multi-robot teams, researchers must leverage individual robots' relative strengths and coordinate their individual behaviors. Specifically, heterogeneous multi-robot systems must answer three important questions: \textit{who} (task allocation), \textit{when} (scheduling), and \textit{how} (motion planning). While specific variants of each of these problems are kn…
▽ More
To realize effective heterogeneous multi-robot teams, researchers must leverage individual robots' relative strengths and coordinate their individual behaviors. Specifically, heterogeneous multi-robot systems must answer three important questions: \textit{who} (task allocation), \textit{when} (scheduling), and \textit{how} (motion planning). While specific variants of each of these problems are known to be NP-Hard, their interdependence only exacerbates the challenges involved in solving them together. In this paper, we present a novel framework that interleaves task allocation, scheduling, and motion planning. We introduce a search-based approach for trait-based time-extended task allocation named Incremental Task Allocation Graph Search (ITAGS). In contrast to approaches that solve the three problems in sequence, ITAGS's interleaved approach enables efficient search for allocations while simultaneously satisfying scheduling constraints and accounting for the time taken to execute motion plans. To enable effective interleaving, we develop a convex combination of two search heuristics that optimizes the satisfaction of task requirements as well as the makespan of the associated schedule. We demonstrate the efficacy of ITAGS using detailed ablation studies and comparisons against two state-of-the-art algorithms in a simulated emergency response domain.
△ Less
Submitted 5 August, 2021;
originally announced August 2021.
-
Resource-Aware Adaptation of Heterogeneous Strategies for Coalition Formation
Authors:
Anusha Srikanthan,
Harish Ravichandar
Abstract:
Existing approaches to coalition formation often assume that requirements associated with tasks are precisely specified by the human operator. However, prior work has demonstrated that humans, while extremely adept at solving complex problems, struggle to explicitly state their solution strategy. Further, existing approaches often ignore the fact that experts may utilize different, but equally-val…
▽ More
Existing approaches to coalition formation often assume that requirements associated with tasks are precisely specified by the human operator. However, prior work has demonstrated that humans, while extremely adept at solving complex problems, struggle to explicitly state their solution strategy. Further, existing approaches often ignore the fact that experts may utilize different, but equally-valid, solutions (i.e., heterogeneous strategies) to the same problem. In this work, we propose a two-part framework to address these challenges. First, we tackle the challenge of inferring implicit strategies directly from expert demonstrations of coalition formation. To this end, we model and infer such heterogeneous strategies as capability-based requirements associated with each task. Next, we propose a method capable of adaptively selecting one of the inferred strategies that best suits the target team without requiring additional training. Specifically, we formulate and solve a constrained optimization problem that simultaneously selects the most appropriate strategy given the target team's capabilities, and allocates its constituents into appropriate coalitions. We evaluate our approach against several baselines, including some that resemble existing approaches, using detailed numerical simulations, StarCraft II battles, and a multi-robot emergency-response scenario. Our results indicate that our framework consistently outperforms all baselines in terms of requirement satisfaction, resource utilization, and task success rates.
△ Less
Submitted 24 January, 2022; v1 submitted 5 August, 2021;
originally announced August 2021.
-
Desperate Times Call for Desperate Measures: Towards Risk-Adaptive Task Allocation
Authors:
Max Rudolph,
Sonia Chernova,
Harish Ravichandar
Abstract:
Multi-robot task allocation (MRTA) problems involve optimizing the allocation of robots to tasks. MRTA problems are known to be challenging when tasks require multiple robots and the team is composed of heterogeneous robots. These challenges are further exacerbated when we need to account for uncertainties encountered in the real-world. In this work, we address coalition formation in heterogeneous…
▽ More
Multi-robot task allocation (MRTA) problems involve optimizing the allocation of robots to tasks. MRTA problems are known to be challenging when tasks require multiple robots and the team is composed of heterogeneous robots. These challenges are further exacerbated when we need to account for uncertainties encountered in the real-world. In this work, we address coalition formation in heterogeneous multi-robot teams with uncertain capabilities. We specifically focus on tasks that require coalitions to collectively satisfy certain minimum requirements. Existing approaches to uncertainty-aware task allocation either maximize expected pay-off (risk-neutral approaches) or improve worst-case or near-worst-case outcomes (risk-averse approaches). Within the context of our problem, we demonstrate the inherent limitations of unilaterally ignoring or avoiding risk and show that these approaches can in fact reduce the probability of satisfying task requirements. Inspired by models that explain foraging behaviors in animals, we develop a risk-adaptive approach to task allocation. Our approach adaptively switches between risk-averse and risk-seeking behavior in order to maximize the probability of satisfying task requirements. Comprehensive numerical experiments conclusively demonstrate that our risk-adaptive approach outperforms risk-neutral and risk-averse approaches. We also demonstrate the effectiveness of our approach using a simulated multi-robot emergency response scenario.
△ Less
Submitted 7 August, 2021; v1 submitted 31 July, 2021;
originally announced August 2021.
-
Anticipatory Human-Robot Collaboration via Multi-Objective Trajectory Optimization
Authors:
Abhinav Jain,
Daphne Chen,
Dhruva Bansal,
Sam Scheele,
Mayank Kishore,
Hritik Sapra,
David Kent,
Harish Ravichandar,
Sonia Chernova
Abstract:
We address the problem of adapting robot trajectories to improve safety, comfort, and efficiency in human-robot collaborative tasks. To this end, we propose CoMOTO, a trajectory optimization framework that utilizes stochastic motion prediction models to anticipate the human's motion and adapt the robot's joint trajectory accordingly. We design a multi-objective cost function that simultaneously op…
▽ More
We address the problem of adapting robot trajectories to improve safety, comfort, and efficiency in human-robot collaborative tasks. To this end, we propose CoMOTO, a trajectory optimization framework that utilizes stochastic motion prediction models to anticipate the human's motion and adapt the robot's joint trajectory accordingly. We design a multi-objective cost function that simultaneously optimizes for i) separation distance, ii) visibility of the end-effector, iii) legibility, iv) efficiency, and v) smoothness. We evaluate CoMOTO against three existing methods for robot trajectory generation when in close proximity to humans. Our experimental results indicate that our approach consistently outperforms existing methods over a combined set of safety, comfort, and efficiency metrics.
△ Less
Submitted 30 July, 2020; v1 submitted 5 June, 2020;
originally announced June 2020.
-
Taking Recoveries to Task: Recovery-Driven Development for Recipe-based Robot Tasks
Authors:
Siddhartha Banerjee,
Angel Daruna,
David Kent,
Weiyu Liu,
Jonathan Balloch,
Abhinav Jain,
Akshay Krishnan,
Muhammad Asif Rana,
Harish Ravichandar,
Binit Shah,
Nithin Shrivatsav,
Sonia Chernova
Abstract:
Robot task execution when situated in real-world environments is fragile. As such, robot architectures must rely on robust error recovery, adding non-trivial complexity to highly-complex robot systems. To handle this complexity in development, we introduce Recovery-Driven Development (RDD), an iterative task scripting process that facilitates rapid task and recovery development by leveraging hiera…
▽ More
Robot task execution when situated in real-world environments is fragile. As such, robot architectures must rely on robust error recovery, adding non-trivial complexity to highly-complex robot systems. To handle this complexity in development, we introduce Recovery-Driven Development (RDD), an iterative task scripting process that facilitates rapid task and recovery development by leveraging hierarchical specification, separation of nominal task and recovery development, and situated testing. We validate our approach with our challenge-winning mobile manipulator software architecture developed using RDD for the FetchIt! Challenge at the IEEE 2019 International Conference on Robotics and Automation. We attribute the success of our system to the level of robustness achieved using RDD, and conclude with lessons learned for develo** such systems.
△ Less
Submitted 28 January, 2020;
originally announced January 2020.
-
Inferring and Learning Multi-Robot Policies by Observing an Expert
Authors:
Pietro Pierpaoli,
Harish Ravichandar,
Nicholas Waytowich,
Anqi Li,
Derrik Asher,
Magnus Egerstedt
Abstract:
We present a technique for learning how to solve a multi-robot mission that requires interaction with an external environment by observing an expert system executing the same mission. We define the expert system as a team of robots equipped with a library of controllers, each designed to solve a specific task, supervised by an expert policy that appropriately selects controllers based on the state…
▽ More
We present a technique for learning how to solve a multi-robot mission that requires interaction with an external environment by observing an expert system executing the same mission. We define the expert system as a team of robots equipped with a library of controllers, each designed to solve a specific task, supervised by an expert policy that appropriately selects controllers based on the states of robots and environment. The objective is for an un-trained team of robots (i.e., imitator system) equipped with the same library of controllers, but agnostic to the expert policy, to execute the mission, with performances comparable to those of the expert system. From un-annotated observations of the expert system, a multi-hypothesis filtering technique is used to estimate individual controllers executed by the expert policy. Then, the history of estimated controllers and environmental states is used to train a neural network policy for the imitator system. Considering a perimeter protection scenario on a team of differential-drive robots, we show that the learned policy endows the imitator system with performances comparable to those of the expert system.
△ Less
Submitted 2 March, 2020; v1 submitted 17 September, 2019;
originally announced September 2019.
-
Skill Acquisition via Automated Multi-Coordinate Cost Balancing
Authors:
Harish Ravichandar,
S. Reza Ahmadzadeh,
M. Asif Rana,
Sonia Chernova
Abstract:
We propose a learning framework, named Multi-Coordinate Cost Balancing (MCCB), to address the problem of acquiring point-to-point movement skills from demonstrations. MCCB encodes demonstrations simultaneously in multiple differential coordinates that specify local geometric properties. MCCB generates reproductions by solving a convex optimization problem with a multi-coordinate cost function and…
▽ More
We propose a learning framework, named Multi-Coordinate Cost Balancing (MCCB), to address the problem of acquiring point-to-point movement skills from demonstrations. MCCB encodes demonstrations simultaneously in multiple differential coordinates that specify local geometric properties. MCCB generates reproductions by solving a convex optimization problem with a multi-coordinate cost function and linear constraints on the reproductions, such as initial, target, and via points. Further, since the relative importance of each coordinate system in the cost function might be unknown for a given skill, MCCB learns optimal weighting factors that balance the cost function. We demonstrate the effectiveness of MCCB via detailed experiments conducted on one handwriting dataset and three complex skill datasets.
△ Less
Submitted 27 March, 2019;
originally announced March 2019.
-
STRATA: A Unified Framework for Task Assignments in Large Teams of Heterogeneous Agents
Authors:
Harish Ravichandar,
Kenneth Shaw,
Sonia Chernova
Abstract:
Large teams of heterogeneous agents have the potential to solve complex multi-task problems that are intractable for a single agent working independently. However, solving complex multi-task problems requires leveraging the relative strengths of the different kinds of agents in the team. We present Stochastic TRAit-based Task Assignment (STRATA), a unified framework that models large teams of hete…
▽ More
Large teams of heterogeneous agents have the potential to solve complex multi-task problems that are intractable for a single agent working independently. However, solving complex multi-task problems requires leveraging the relative strengths of the different kinds of agents in the team. We present Stochastic TRAit-based Task Assignment (STRATA), a unified framework that models large teams of heterogeneous agents and performs effective task assignments. Specifically, given information on which traits (capabilities) are required for various tasks, STRATA computes the assignments of agents to tasks such that the trait requirements are achieved. Inspired by prior work in robot swarms and biodiversity, we categorize agents into different species (groups) based on their traits. We model each trait as a continuous variable and differentiate between traits that can and cannot be aggregated from different agents. STRATA is capable of reasoning about both species-level and agent-level variability in traits. Further, we define measures of diversity for any given team based on the team's continuous-space trait model. We illustrate the necessity and effectiveness of STRATA using detailed experiments based in simulation and in a capture-the-flag game environment.
△ Less
Submitted 6 February, 2020; v1 submitted 12 March, 2019;
originally announced March 2019.