-
Alternative Interfaces for Human-initiated Natural Language Communication and Robot-initiated Haptic Feedback: Towards Better Situational Awareness in Human-Robot Collaboration
Authors:
Callum Bennie,
Bridget Casey,
Cecile Paris,
Dana Kulic,
Brendan Tidd,
Nicholas Lawrance,
Alex Pitt,
Fletcher Talbot,
Jason Williams,
David Howard,
Pavan Sikka,
Hashini Senaratne
Abstract:
This article presents an implementation of a natural-language speech interface and a haptic feedback interface that enable a human supervisor to provide guidance to, request information from, and receive status updates from a Spot robot. We provide insights gained during preliminary user testing of the interface in a realistic robot exploration scenario.
Submitted 24 January, 2024;
originally announced January 2024.
-
Robotic Vision for Human-Robot Interaction and Collaboration: A Survey and Systematic Review
Authors:
Nicole Robinson,
Brendan Tidd,
Dylan Campbell,
Dana Kulić,
Peter Corke
Abstract:
Robotic vision for human-robot interaction and collaboration is a critical process for robots to collect and interpret detailed information related to human actions, goals, and preferences, enabling robots to provide more useful services to people. This survey and systematic review presents a comprehensive analysis on robotic vision in human-robot interaction and collaboration over the last 10 years. From a detailed search of 3850 articles, systematic extraction and evaluation was used to identify and explore 310 papers in depth. These papers described robots with some level of autonomy using robotic vision for locomotion, manipulation and/or visual communication to collaborate or interact with people. This paper provides an in-depth analysis of current trends, common domains, methods and procedures, technical processes, data sets and models, experimental testing, sample populations, performance metrics and future challenges. This manuscript found that robotic vision was often used in action and gesture recognition, robot movement in human spaces, object handover and collaborative actions, social communication and learning from demonstration. Few high-impact and novel techniques from the computer vision field had been translated into human-robot interaction and collaboration. Overall, notable advancements have been made on how to develop and deploy robots to assist people.
Submitted 28 July, 2023;
originally announced July 2023.
-
Learning Visuo-Motor Behaviours for Robot Locomotion Over Difficult Terrain
Authors:
Brendan Tidd
Abstract:
As mobile robots become useful performing everyday tasks in complex real-world environments, they must be able to traverse a range of difficult terrain types such as stairs, stepping stones, gaps, jumps and narrow passages. This work investigated traversing these types of environments with a bipedal robot (simulation experiments), and a tracked robot (real world). Developing a traditional monolithic controller for traversing all terrain types is challenging, and for large physical robots realistic test facilities are required and safety must be ensured. An alternative is a suite of simple behaviour controllers that can be composed to achieve complex tasks. This work efficiently trained complex behaviours to enable mobile robots to traverse difficult terrain. By minimising retraining as new behaviours became available, robots were able to traverse increasingly complex terrain sets, leading toward the development of scalable behaviour libraries.
Submitted 2 March, 2023;
originally announced March 2023.
-
Heterogeneous robot teams with unified perception and autonomy: How Team CSIRO Data61 tied for the top score at the DARPA Subterranean Challenge
Authors:
Navinda Kottege,
Jason Williams,
Brendan Tidd,
Fletcher Talbot,
Ryan Steindl,
Mark Cox,
Dennis Frousheger,
Thomas Hines,
Alex Pitt,
Benjamin Tam,
Brett Wood,
Lauren Hanson,
Katrina Lo Surdo,
Thomas Molnar,
Matt Wildie,
Kazys Stepanas,
Gavin Catt,
Lachlan Tychsen-Smith,
Dean Penfold,
Leslie Overs,
Milad Ramezani,
Kasra Khosoussi,
Farid Kendoul,
Glenn Wagner,
Duncan Palmer
, et al. (5 additional authors not shown)
Abstract:
The DARPA Subterranean Challenge was designed for competitors to develop and deploy teams of autonomous robots to explore difficult unknown underground environments. Categorised into human-made tunnels, underground urban infrastructure and natural caves, each of these subdomains had many challenging elements for robot perception, locomotion, navigation and autonomy. These included degraded wireless communication, poor visibility due to smoke, narrow passages and doorways, clutter, uneven ground, slippery and loose terrain, stairs, ledges, overhangs, dripping water, and dynamic obstacles that move to block paths, among others. In the Final Event of this challenge held in September 2021, the course consisted of all three subdomains. The task was for the robot team to perform a scavenger hunt for a number of pre-defined artefacts within a limited time frame. Only one human supervisor was allowed to communicate with the robots once they were in the course. Points were scored when accurate detections and their locations were communicated back to the scoring server. A total of 8 teams competed in the finals held at the Mega Cavern in Louisville, KY, USA. This article describes the systems deployed by Team CSIRO Data61 that tied for the top score and won second place at the event.
Submitted 25 February, 2023;
originally announced February 2023.
-
Human-Robot Team Performance Compared to Full Robot Autonomy in 16 Real-World Search and Rescue Missions: Adaptation of the DARPA Subterranean Challenge
Authors:
Nicole Robinson,
Jason Williams,
David Howard,
Brendan Tidd,
Fletcher Talbot,
Brett Wood,
Alex Pitt,
Navinda Kottege,
Dana Kulić
Abstract:
Human operators in human-robot teams are commonly perceived to be critical for mission success. To explore the direct and perceived impact of operator input on task success and team performance, 16 real-world missions (10 hrs) were conducted based on the DARPA Subterranean Challenge. These missions were to deploy a heterogeneous team of robots for a search task to locate and identify artifacts such as climbing rope, drills and mannequins representing human survivors. Two conditions were evaluated: human operators that could control the robot team with state-of-the-art autonomy (Human-Robot Team) compared to autonomous missions without human operator input (Robot-Autonomy). Human-Robot Teams were often in directed autonomy mode (70% of mission time), found more items, traversed more distance, covered more unique ground, and had a higher time between safety-related events. Human-Robot Teams were faster at finding the first artifact, but slower to respond to information from the robot team. In routine conditions, scores were comparable for artifacts, distance, and coverage. Reasons for intervention included creating waypoints to prioritise high-yield areas, and to navigate through error-prone spaces. After observing robot autonomy, operators reported increases in robot competency and trust, but that robot behaviour was not always transparent and understandable, even after high mission performance.
Submitted 11 December, 2022;
originally announced December 2022.
-
Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics
Authors:
Krishan Rana,
Ming Xu,
Brendan Tidd,
Michael Milford,
Niko Sünderhauf
Abstract:
Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning. Skills are typically extracted from expert demonstrations and are embedded into a latent space from which they can be sampled as actions by a high-level RL agent. However, this skill space is expansive, and not all skills are relevant for a given robot state, making exploration difficult. Furthermore, the downstream RL agent is limited to learning structurally similar tasks to those used to construct the skill space. We firstly propose accelerating exploration in the skill space using state-conditioned generative models to directly bias the high-level agent towards only sampling skills relevant to a given state based on prior experience. Next, we propose a low-level residual policy for fine-grained skill adaptation enabling downstream RL agents to adapt to unseen task variations. Finally, we validate our approach across four challenging manipulation tasks that differ from those used to build the skill space, demonstrating our ability to learn across task variations while significantly accelerating exploration, outperforming prior works. Code and videos are available on our project website: https://krishanrana.github.io/reskill.
Submitted 3 November, 2022;
originally announced November 2022.
-
Heterogeneous Ground and Air Platforms, Homogeneous Sensing: Team CSIRO Data61's Approach to the DARPA Subterranean Challenge
Authors:
Nicolas Hudson,
Fletcher Talbot,
Mark Cox,
Jason Williams,
Thomas Hines,
Alex Pitt,
Brett Wood,
Dennis Frousheger,
Katrina Lo Surdo,
Thomas Molnar,
Ryan Steindl,
Matt Wildie,
Inkyu Sa,
Navinda Kottege,
Kazys Stepanas,
Emili Hernandez,
Gavin Catt,
William Docherty,
Brendan Tidd,
Benjamin Tam,
Simon Murrell,
Mitchell Bessell,
Lauren Hanson,
Lachlan Tychsen-Smith,
Hajime Suzuki
, et al. (9 additional authors not shown)
Abstract:
Heterogeneous teams of robots, leveraging a balance between autonomy and human interaction, bring powerful capabilities to the problem of exploring dangerous, unstructured subterranean environments. Here we describe the solution developed by Team CSIRO Data61, consisting of CSIRO, Emesent and Georgia Tech, during the DARPA Subterranean Challenge. These presented systems were fielded in the Tunnel Circuit in August 2019, the Urban Circuit in February 2020, and in our own Cave event, conducted in September 2020. A unique capability of the fielded team is the homogeneous sensing of the platforms utilised, which is leveraged to obtain a decentralised multi-agent SLAM solution on each platform (both ground agents and UAVs) using peer-to-peer communications. This enabled a shift in focus from constructing a pervasive communications network to relying on multi-agent autonomy, motivated by experiences in early circuit events. These experiences also showed the surprising capability of rugged tracked platforms for challenging terrain, which in turn led to the heterogeneous team structure based on a BIA5 OzBot Titan ground robot and an Emesent Hovermap UAV, supplemented by smaller tracked or legged ground robots. The ground agents use a common CatPack perception module, which allowed reuse of the perception and autonomy stack across all ground agents with minimal adaptation.
Submitted 19 April, 2021;
originally announced April 2021.
-
Passing Through Narrow Gaps with Deep Reinforcement Learning
Authors:
Brendan Tidd,
Akansel Cosgun,
Jurgen Leitner,
Nicolas Hudson
Abstract:
The U.S. Defense Advanced Research Projects Agency (DARPA) Subterranean Challenge requires teams of robots to traverse difficult and diverse underground environments. Traversing small gaps is one of the challenging scenarios that robots encounter. Imperfect sensor information makes it difficult for classical navigation methods, where behaviours require significant manual fine tuning. In this paper we present a deep reinforcement learning method for autonomously navigating through small gaps, where contact between the robot and the gap may be required. We first learn a gap behaviour policy to get through small gaps (only centimeters wider than the robot). We then learn a goal-conditioned behaviour selection policy that determines when to activate the gap behaviour policy. We train our policies in simulation and demonstrate their effectiveness with a large tracked robot in simulation and on the real platform. In simulation experiments, our approach achieves 93% success rate when the gap behaviour is activated manually by an operator, and 63% with autonomous activation using the behaviour selection policy. In real robot experiments, our approach achieves a success rate of 73% with manual activation, and 40% with autonomous behaviour selection. While we show the feasibility of our approach in simulation, the difference in performance between simulated and real world scenarios highlights the difficulty of direct sim-to-real transfer for deep reinforcement learning policies. In both the simulated and real world environments alternative methods were unable to traverse the gap.
Submitted 1 November, 2021; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Learning Setup Policies: Reliable Transition Between Locomotion Behaviours
Authors:
Brendan Tidd,
Nicolas Hudson,
Akansel Cosgun,
Jurgen Leitner
Abstract:
Dynamic platforms that operate over many unique terrain conditions typically require many behaviours. To transition safely, there must be an overlap of states between adjacent controllers. We develop a novel method for training setup policies that bridge the trajectories between pre-trained Deep Reinforcement Learning (DRL) policies. We demonstrate our method with a simulated biped traversing a difficult jump terrain, where a single policy fails to learn the task, and switching between pre-trained policies without setup policies also fails. We perform an ablation of key components of our system, and show that our method outperforms others that learn transition policies. We demonstrate our method with several difficult and diverse terrain types, and show that we can use setup policies as part of a modular control suite to successfully traverse a sequence of complex terrains. We show that using setup policies improves the success rate for traversing a single difficult jump terrain (from 51.3% success rate with the best comparative method to 82.2%), and traversing a random sequence of difficult obstacles (from 1.9% without setup policies to 71.2%).
Submitted 5 October, 2022; v1 submitted 22 January, 2021;
originally announced January 2021.
-
Learning When to Switch: Composing Controllers to Traverse a Sequence of Terrain Artifacts
Authors:
Brendan Tidd,
Nicolas Hudson,
Akansel Cosgun,
Jurgen Leitner
Abstract:
Legged robots often use separate control policies that are highly engineered for traversing difficult terrain such as stairs, gaps, and steps, where switching between policies is only possible when the robot is in a region that is common to adjacent controllers. Deep Reinforcement Learning (DRL) is a promising alternative to hand-crafted control design, though typically requires the full set of test conditions to be known before training. DRL policies can result in complex (often unrealistic) behaviours that have few or no overlapping regions between adjacent policies, making it difficult to switch behaviours. In this work we develop multiple DRL policies with Curriculum Learning (CL), each that can traverse a single respective terrain condition, while ensuring an overlap between policies. We then train a network for each destination policy that estimates the likelihood of successfully switching from any other policy. We evaluate our switching method on a previously unseen combination of terrain artifacts and show that it performs better than heuristic methods. While our method is trained on individual terrain types, it performs comparably to a Deep Q Network trained on the full set of terrain conditions. This approach allows the development of separate policies in constrained conditions with embedded prior knowledge about each behaviour, that is scalable to any number of behaviours, and prepares DRL methods for applications in the real world.
Submitted 29 September, 2021; v1 submitted 1 November, 2020;
originally announced November 2020.
-
Guided Curriculum Learning for Walking Over Complex Terrain
Authors:
Brendan Tidd,
Nicolas Hudson,
Akansel Cosgun
Abstract:
Reliable bipedal walking over complex terrain is a challenging problem; using a curriculum can help learning. Curriculum learning is the idea of starting with an achievable version of a task and increasing the difficulty as success criteria are met. We propose a 3-stage curriculum to train Deep Reinforcement Learning policies for bipedal walking over various challenging terrains. In the first stage, the agent starts on an easy terrain and the terrain difficulty is gradually increased, while forces derived from a target policy are applied to the robot joints and the base. In the second stage, the guiding forces are gradually reduced to zero. Finally, in the third stage, random perturbations with increasing magnitude are applied to the robot base, so the robustness of the policies is improved. In simulation experiments, we show that our approach is effective in learning walking policies, separate from each other, for five terrain types: flat, hurdles, gaps, stairs, and steps. Moreover, we demonstrate that in the absence of human demonstrations, a simple hand designed walking trajectory is a sufficient prior to learn to traverse complex terrain types. In ablation studies, we show that taking out any one of the three stages of the curriculum degrades the learning performance.
Submitted 1 February, 2021; v1 submitted 8 October, 2020;
originally announced October 2020.