-
Open Medical Gesture: An Open-Source Experiment in Naturalistic Physical Interactions for Mixed and Virtual Reality Simulations
Authors:
Thomas B Talbot,
Chinmay Chinara
Abstract:
Mixed Reality (MR) and Virtual Reality (VR) simulations are hampered by requirements for hand controllers or attempts to perseverate in use of two-dimensional computer interface paradigms from the 1980s. From our efforts to produce more naturalistic interactions for combat medic training for the military, USC has developed an open-source toolkit that enables direct hand controlled responsive interactions that is sensor independent and can function with depth sensing cameras, webcams or sensory gloves. Natural approaches we have examined include the ability to manipulate virtual smart objects in a similar manner to how they are used in the real world. From this research and review of current literature, we have discerned several best approaches for hand-based human computer interactions which provide intuitive, responsive, useful, and low frustration experiences for VR users.
Submitted 14 August, 2023;
originally announced August 2023.
-
Learning to Learn: How to Continuously Teach Humans and Machines
Authors:
Parantak Singh,
You Li,
Ankur Sikarwar,
Weixian Lei,
Daniel Gao,
Morgan Bruce Talbot,
Ying Sun,
Mike Zheng Shou,
Gabriel Kreiman,
Mengmi Zhang
Abstract:
Curriculum design is a fundamental component of education. For example, when we learn mathematics at school, we build upon our knowledge of addition to learn multiplication. These and other concepts must be mastered before our first algebra lesson, which also reinforces our addition and multiplication skills. Designing a curriculum for teaching either a human or a machine shares the underlying goal of maximizing knowledge transfer from earlier to later tasks, while also minimizing forgetting of learned tasks. Prior research on curriculum design for image classification focuses on the ordering of training examples during a single offline task. Here, we investigate the effect of the order in which multiple distinct tasks are learned in a sequence. We focus on the online class-incremental continual learning setting, where algorithms or humans must learn image classes one at a time during a single pass through a dataset. We find that curriculum consistently influences learning outcomes for humans and for multiple continual machine learning algorithms across several benchmark datasets. We introduce a novel-object recognition dataset for human curriculum learning experiments and observe that curricula that are effective for humans are highly correlated with those that are effective for machines. As an initial step towards automated curriculum design for online class-incremental learning, we propose a novel algorithm, dubbed Curriculum Designer (CD), that designs and ranks curricula based on inter-class feature similarities. We find significant overlap between curricula that are empirically highly effective and those that are highly ranked by our CD. Our study establishes a framework for further research on teaching humans and machines to learn continuously using optimized curricula.
Submitted 17 August, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Retrospectives on the Embodied AI Workshop
Authors:
Matt Deitke,
Dhruv Batra,
Yonatan Bisk,
Tommaso Campari,
Angel X. Chang,
Devendra Singh Chaplot,
Changan Chen,
Claudia Pérez D'Arpino,
Kiana Ehsani,
Ali Farhadi,
Li Fei-Fei,
Anthony Francis,
Chuang Gan,
Kristen Grauman,
David Hall,
Winson Han,
Unnat Jain,
Aniruddha Kembhavi,
Jacob Krantz,
Stefan Lee,
Chengshu Li,
Sagnik Majumder,
Oleksandr Maksymets,
Roberto Martín-Martín,
Roozbeh Mottaghi
, et al. (14 additional authors not shown)
Abstract:
We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) embodied vision-and-language. We discuss the dominant datasets within each theme, evaluation metrics for the challenges, and the performance of state-of-the-art models. We highlight commonalities between top approaches to the challenges and identify potential future directions for Embodied AI research.
Submitted 4 December, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Zero-Shot Uncertainty-Aware Deployment of Simulation Trained Policies on Real-World Robots
Authors:
Krishan Rana,
Vibhavari Dasagi,
Jesse Haviland,
Ben Talbot,
Michael Milford,
Niko Sünderhauf
Abstract:
While deep reinforcement learning (RL) agents have demonstrated incredible potential in attaining dexterous behaviours for robotics, they tend to make errors when deployed in the real world due to mismatches between the training and execution environments. In contrast, the classical robotics community have developed a range of controllers that can safely operate across most states in the real world given their explicit derivation. These controllers however lack the dexterity required for complex tasks given limitations in analytical modelling and approximations. In this paper, we propose Bayesian Controller Fusion (BCF), a novel uncertainty-aware deployment strategy that combines the strengths of deep RL policies and traditional handcrafted controllers. In this framework, we can perform zero-shot sim-to-real transfer, where our uncertainty based formulation allows the robot to reliably act within out-of-distribution states by leveraging the handcrafted controller while gaining the dexterity of the learned system otherwise. We show promising results on two real-world continuous control tasks, where BCF outperforms both the standalone policy and controller, surpassing what either can achieve independently. A supplementary video demonstrating our system is provided at https://bit.ly/bcf_deploy.
Submitted 9 December, 2021;
originally announced December 2021.
-
Evaluating the Impact of Semantic Segmentation and Pose Estimation on Dense Semantic SLAM
Authors:
Suman Raj Bista,
David Hall,
Ben Talbot,
Haoyang Zhang,
Feras Dayoub,
Niko Sünderhauf
Abstract:
Recent Semantic SLAM methods combine classical geometry-based estimation with deep learning-based object detection or semantic segmentation. In this paper we evaluate the quality of semantic maps generated by state-of-the-art class- and instance-aware dense semantic SLAM algorithms whose codes are publicly available and explore the impacts both semantic segmentation and pose estimation have on the quality of semantic maps. We obtain these results by providing algorithms with ground-truth pose and/or semantic segmentation data available from simulated environments. We establish that semantic segmentation is the largest source of error through our experiments, dropping mAP and OMQ performance by up to 74.3% and 71.3% respectively.
Submitted 16 September, 2021;
originally announced September 2021.
-
Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics
Authors:
Krishan Rana,
Vibhavari Dasagi,
Jesse Haviland,
Ben Talbot,
Michael Milford,
Niko Sünderhauf
Abstract:
We present Bayesian Controller Fusion (BCF): a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL). BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient. By fusing uncertainty-aware distributional outputs from each system, BCF arbitrates control between them, exploiting their respective strengths. We study BCF on two real-world robotics tasks involving navigation in a vast and long-horizon environment, and a complex reaching task that involves manipulability maximisation. For both these domains, simple handcrafted controllers exist that can solve the task at hand in a risk-averse manner but do not necessarily exhibit the optimal solution given limitations in analytical modelling, controller miscalibration and task variation. As exploration is naturally guided by the prior in the early stages of training, BCF accelerates learning, while substantially improving beyond the performance of the control prior, as the policy gains more experience. More importantly, given the risk-aversity of the control prior, BCF ensures safe exploration and deployment, where the control prior naturally dominates the action distribution in states unknown to the policy. We additionally show BCF's applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real world. BCF is a promising approach towards combining the complementary strengths of deep RL and traditional robotic control, surpassing what either can achieve independently. The code and supplementary video material are made publicly available at https://krishanrana.github.io/bcf.
Submitted 3 April, 2023; v1 submitted 20 July, 2021;
originally announced July 2021.
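Note: the abstract above describes fusing uncertainty-aware distributional outputs from a learned policy and a handcrafted controller. A minimal sketch of one way such a fusion could behave is given below. It assumes, purely for illustration, that each system outputs a univariate Gaussian over a single action dimension and that fusion is the normalized product of the two Gaussians; the abstract does not state this, and the names `Gaussian` and `fuse` are hypothetical, not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Univariate Gaussian over one action dimension (illustrative assumption)."""
    mean: float
    var: float  # variance, must be > 0


def fuse(policy: Gaussian, prior: Gaussian) -> Gaussian:
    """Normalized product of two Gaussians (precision-weighted average).

    This is an assumed fusion rule for illustration, not the paper's
    confirmed formulation. The output with the smaller variance
    contributes more to the fused mean.
    """
    precision = 1.0 / policy.var + 1.0 / prior.var
    var = 1.0 / precision
    mean = var * (policy.mean / policy.var + prior.mean / prior.var)
    return Gaussian(mean, var)


# Equal confidence: the fused mean lies halfway between the two means.
balanced = fuse(Gaussian(0.0, 1.0), Gaussian(2.0, 1.0))

# Highly uncertain policy (large variance): the fused action stays
# close to the control prior's mean.
uncertain = fuse(Gaussian(0.0, 100.0), Gaussian(2.0, 1.0))
```

Under this assumed rule, a policy that is very uncertain in a given state contributes little to the fused action, which matches the qualitative behaviour the abstract describes, where the control prior dominates in states unknown to the policy.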
-
Learning and Executing Re-usable Behaviour Trees from Natural Language Instruction
Authors:
Gavin Suddrey,
Ben Talbot,
Frederic Maire
Abstract:
Domestic and service robots have the potential to transform industries such as health care and small-scale manufacturing, as well as the homes in which we live. However, due to the overwhelming variety of tasks these robots will be expected to complete, providing generic out-of-the-box solutions that meet the needs of every possible user is clearly intractable. To address this problem, robots must therefore not only be capable of learning how to complete novel tasks at run-time, but the solutions to these tasks must also be informed by the needs of the user. In this paper we demonstrate how behaviour trees, a well established control architecture in the fields of gaming and robotics, can be used in conjunction with natural language instruction to provide a robust and modular control architecture for instructing autonomous agents to learn and perform novel complex tasks. We also show how behaviour trees generated using our approach can be generalised to novel scenarios, and can be re-used in future learning episodes to create increasingly complex behaviours. We validate this work against an existing corpus of natural language instructions, demonstrate the application of our approach on both a simulated robot solving a toy problem, as well as two distinct real-world robot platforms which, respectively, complete a block sorting scenario, and a patrol scenario.
Submitted 3 June, 2021;
originally announced June 2021.
-
Tuned Compositional Feature Replays for Efficient Stream Learning
Authors:
Morgan B. Talbot,
Rushikesh Zawar,
Rohil Badkundri,
Mengmi Zhang,
Gabriel Kreiman
Abstract:
Our brains extract durable, generalizable knowledge from transient experiences of the world. Artificial neural networks come nowhere close to this ability. When tasked with learning to classify objects by training on non-repeating video frames in temporal order (online stream learning), models that learn well from shuffled datasets catastrophically forget old knowledge upon learning new stimuli. We propose a new continual learning algorithm, Compositional Replay Using Memory Blocks (CRUMB), which mitigates forgetting by replaying feature maps reconstructed by combining generic parts. CRUMB concatenates trainable and re-usable "memory block" vectors to compositionally reconstruct feature map tensors in convolutional neural networks. Storing the indices of memory blocks used to reconstruct new stimuli enables memories of the stimuli to be replayed during later tasks. This reconstruction mechanism also primes the neural network to minimize catastrophic forgetting by biasing it towards attending to information about object shapes more than information about image textures, and stabilizes the network during stream learning by providing a shared feature-level basis for all training examples. These properties allow CRUMB to outperform an otherwise identical algorithm that stores and replays raw images, while occupying only 3.6% as much memory. We stress-tested CRUMB alongside 13 competing methods on 7 challenging datasets. To address the limited number of existing online stream learning datasets, we introduce 2 new benchmarks by adapting existing datasets for stream learning. With only 3.7-4.1% as much memory and 15-43% as much runtime, CRUMB mitigates catastrophic forgetting more effectively than the state-of-the-art. Our code is available at https://github.com/MorganBDT/crumb.git.
Submitted 2 January, 2024; v1 submitted 5 April, 2021;
originally announced April 2021.
-
The Robotic Vision Scene Understanding Challenge
Authors:
David Hall,
Ben Talbot,
Suman Raj Bista,
Haoyang Zhang,
Rohan Smith,
Feras Dayoub,
Niko Sünderhauf
Abstract:
Being able to explore an environment and understand the location and type of all objects therein is important for indoor robotic platforms that must interact closely with humans. However, it is difficult to evaluate progress in this area due to a lack of standardized testing which is limited due to the need for active robot agency and perfect object ground-truth. To help provide a standard for testing scene understanding systems, we present a new robot vision scene understanding challenge using simulation to enable repeatable experiments with active robot agency. We provide two challenging task types, three difficulty levels, five simulated environments and a new evaluation measure for evaluating 3D cuboid object maps. Our aim is to drive state-of-the-art research in scene understanding through enabling evaluation and comparison of active robotic vision systems.
Submitted 11 September, 2020;
originally announced September 2020.
-
BenchBot: Evaluating Robotics Research in Photorealistic 3D Simulation and on Real Robots
Authors:
Ben Talbot,
David Hall,
Haoyang Zhang,
Suman Raj Bista,
Rohan Smith,
Feras Dayoub,
Niko Sünderhauf
Abstract:
We introduce BenchBot, a novel software suite for benchmarking the performance of robotics research across both photorealistic 3D simulations and real robot platforms. BenchBot provides a simple interface to the sensorimotor capabilities of a robot when solving robotics research problems; an interface that is consistent regardless of whether the target platform is simulated or a real robot. In this paper we outline the BenchBot system architecture, and explore the parallels between its user-centric design and an ideal research development process devoid of tangential robot engineering challenges. The paper describes the research benefits of using the BenchBot system, including: enhanced capacity to focus solely on research problems, direct quantitative feedback to inform research development, tools for deriving comprehensive performance characteristics, and submission formats which promote sharability and repeatability of research outcomes. BenchBot is publicly available (http://benchbot.org), and we encourage its use in the research community for comprehensively evaluating the simulated and real world performance of novel robotic algorithms.
Submitted 2 August, 2020;
originally announced August 2020.
-
Multiplicative Controller Fusion: Leveraging Algorithmic Priors for Sample-efficient Reinforcement Learning and Safe Sim-To-Real Transfer
Authors:
Krishan Rana,
Vibhavari Dasagi,
Ben Talbot,
Michael Milford,
Niko Sünderhauf
Abstract:
Learning-based approaches often outperform hand-coded algorithmic solutions for many problems in robotics. However, learning long-horizon tasks on real robot hardware can be intractable, and transferring a learned policy from simulation to reality is still extremely challenging. We present a novel approach to model-free reinforcement learning that can leverage existing sub-optimal solutions as an algorithmic prior during training and deployment. During training, our gated fusion approach enables the prior to guide the initial stages of exploration, increasing sample-efficiency and enabling learning from sparse long-horizon reward signals. Importantly, the policy can learn to improve beyond the performance of the sub-optimal prior since the prior's influence is annealed gradually. During deployment, the policy's uncertainty provides a reliable strategy for transferring a simulation-trained policy to the real world by falling back to the prior controller in uncertain states. We show the efficacy of our Multiplicative Controller Fusion approach on the task of robot navigation and demonstrate safe transfer from simulation to the real world without any fine-tuning. The code for this project is made publicly available at https://sites.google.com/view/mcf-nav/home
Submitted 27 July, 2020; v1 submitted 11 March, 2020;
originally announced March 2020.
-
Robot Navigation in Unseen Spaces using an Abstract Map
Authors:
Ben Talbot,
Feras Dayoub,
Peter Corke,
Gordon Wyeth
Abstract:
Human navigation in built environments depends on symbolic spatial information which has unrealised potential to enhance robot navigation capabilities. Information sources such as labels, signs, maps, planners, spoken directions, and navigational gestures communicate a wealth of spatial information to the navigators of built environments; a wealth of information that robots typically ignore. We present a robot navigation system that uses the same symbolic spatial information employed by humans to purposefully navigate in unseen built environments with a level of performance comparable to humans. The navigation system uses a novel data structure called the abstract map to imagine malleable spatial models for unseen spaces from spatial symbols. Sensorimotor perceptions from a robot are then employed to provide purposeful navigation to symbolic goal locations in the unseen environment. We show how a dynamic system can be used to create malleable spatial models for the abstract map, and provide an open source implementation to encourage future work in the area of symbolic navigation. Symbolic navigation performance of humans and a robot is evaluated in a real-world built environment. The paper concludes with a qualitative analysis of human navigation strategies, providing further insights into how the symbolic navigation capabilities of robots in unseen built environments can be improved in the future.
Submitted 15 May, 2020; v1 submitted 31 January, 2020;
originally announced January 2020.
-
Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments
Authors:
Krishan Rana,
Ben Talbot,
Vibhavari Dasagi,
Michael Milford,
Niko Sünderhauf
Abstract:
In this work we focus on improving the efficiency and generalisation of learned navigation strategies when transferred from its training environment to previously unseen ones. We present an extension of the residual reinforcement learning framework from the robotic manipulation literature and adapt it to the vast and unstructured environments that mobile robots can operate in. The concept is based on learning a residual control effect to add to a typical sub-optimal classical controller in order to close the performance gap, whilst guiding the exploration process during training for improved data efficiency. We exploit this tight coupling and propose a novel deployment strategy, switching Residual Reactive Navigation (sRRN), which yields efficient trajectories whilst probabilistically switching to a classical controller in cases of high policy uncertainty. Our approach achieves improved performance over end-to-end alternatives and can be incorporated as part of a complete navigation stack for cluttered indoor navigation tasks in the real world. The code and training environment for this project is made publicly available at https://sites.google.com/view/srrn/home.
Submitted 11 March, 2020; v1 submitted 24 September, 2019;
originally announced September 2019.
-
OpenSeqSLAM2.0: An Open Source Toolbox for Visual Place Recognition Under Changing Conditions
Authors:
Ben Talbot,
Sourav Garg,
Michael Milford
Abstract:
Visually recognising a traversed route - regardless of whether seen during the day or night, in clear or inclement conditions, or in summer or winter - is an important capability for navigating robots. Since SeqSLAM was introduced in 2012, a large body of work has followed exploring how robotic systems can use the algorithm to meet the challenges posed by navigation in changing environmental conditions. The following paper describes OpenSeqSLAM2.0, a fully open source toolbox for visual place recognition under changing conditions. Beyond the benefits of open access to the source code, OpenSeqSLAM2.0 provides a number of tools to facilitate exploration of the visual place recognition problem and interactive parameter tuning. Using the new open source platform, it is shown for the first time how comprehensive parameter characterisations provide new insights into many of the system components previously presented in ad hoc ways and provide users with a guide to what system component options should be used under what circumstances and why.
Submitted 11 April, 2018; v1 submitted 6 April, 2018;
originally announced April 2018.
-
Place Categorization and Semantic Mapping on a Mobile Robot
Authors:
Niko Sünderhauf,
Feras Dayoub,
Sean McMahon,
Ben Talbot,
Ruth Schulz,
Peter Corke,
Gordon Wyeth,
Ben Upcroft,
Michael Milford
Abstract:
In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot's behaviour during navigation tasks. The system is made available to the community as a ROS module.
Submitted 9 July, 2015;
originally announced July 2015.