
Planning Coordinated Human-Robot Motions with Neural Network Full-Body Prediction Models

Authors: Philipp Kratzer, Marc Toussaint, Jim Mainprice

Abstract: Numerical optimization has become a popular approach to plan smooth motion trajectories for robots. However, when sharing space with humans, properly balancing safety, comfort and efficiency remains challenging. This is notably the case because humans adapt their behavior to that of the robot, raising the need for intricate planning and prediction. In this paper, we propose a novel optimization-based motion planning algorithm, which generates robot motions while simultaneously maximizing the human trajectory likelihood under a data-driven predictive model. Considering planning and prediction together allows us to formulate objective and constraint functions in the joint human-robot state space. Key to the approach are latent space modifiers added to a differentiable human predictive model based on a dedicated recurrent neural network. These modifiers allow the human prediction to be changed within motion optimization. We empirically evaluate our method using the publicly available MoGaze dataset. Our results indicate that the proposed framework outperforms current baselines for planning handover trajectories and avoiding collisions between a robot and a human. Our experiments demonstrate collaborative motion trajectories in which both the human prediction and the robot plan adapt to each other.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2108.10634 [pdf, other]

Learning to Arbitrate Human and Robot Control using Disagreement between Sub-Policies

Authors: Yoojin Oh, Marc Toussaint, Jim Mainprice

Abstract: In the context of teleoperation, arbitration refers to deciding how to blend between human and autonomous robot commands. We present a reinforcement learning solution that learns an optimal arbitration strategy that allocates more control authority to the human when the robot comes across a decision point in the task. A decision point is where the robot encounters multiple options (sub-policies), such as having multiple paths to get around an obstacle or deciding between two candidate goals. By expressing each directional sub-policy as a von Mises distribution, we identify the decision points by observing the modality of the mixture distribution. Our reward function reasons on this modality and prioritizes matching its learned policy to either the user or the robot accordingly. We report teleoperation experiments on reaching for and grasping objects using a robot manipulator arm with different simulated human controllers. Results indicate that our shared control agent outperforms direct control and improves the teleoperation performance among different users. Using our reward term enables flexible blending between human and robot commands while maintaining safe and accurate teleoperation.

Submitted 24 August, 2021; originally announced August 2021.
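
The decision-point idea in the abstract above (arXiv:2108.10634) can be illustrated with a minimal sketch. It assumes each sub-policy proposes a planar direction modeled as a von Mises density, and it flags a decision point when the mixture has more than one local maximum. The function names, the grid-based mode count, and the "more than one mode" rule are illustrative assumptions, not the authors' implementation.

```python
import math

def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def mixture_pdf(theta, components):
    """Mixture density; components is a list of (weight, mu, kappa) tuples."""
    return sum(w * von_mises_pdf(theta, mu, kappa) for w, mu, kappa in components)

def count_modes(components, grid_size=720):
    """Count local maxima of the mixture on a circular grid (hypothetical helper)."""
    values = [mixture_pdf(2.0 * math.pi * i / grid_size, components)
              for i in range(grid_size)]
    modes = 0
    for i in range(grid_size):
        prev_v = values[i - 1]                # index -1 wraps around the circle
        next_v = values[(i + 1) % grid_size]
        if values[i] > prev_v and values[i] >= next_v:
            modes += 1
    return modes

def is_decision_point(components):
    """Assumed rule: a multimodal mixture signals competing sub-policies."""
    return count_modes(components) > 1

# Two sub-policies pointing in opposite directions versus nearly the same direction.
opposite = [(0.5, 0.0, 4.0), (0.5, math.pi, 4.0)]
aligned = [(0.5, 0.0, 4.0), (0.5, 0.2, 4.0)]
print(is_decision_point(opposite), is_decision_point(aligned))
```

A grid search is used here only because it is easy to read; the grid resolution and the concentration values are arbitrary choices for the illustration.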

arXiv:2107.01836 [pdf, other]

GraspME -- Grasp Manifold Estimator

Authors: Janik Hager, Ruben Bauer, Marc Toussaint, Jim Mainprice

Abstract: In this paper, we introduce a Grasp Manifold Estimator (GraspME) to detect grasp affordances for objects directly in 2D camera images. To perform manipulation tasks autonomously, it is crucial for robots to have such graspability models of the surrounding objects. Grasp manifolds have the advantage of providing a continuous, infinite set of grasps, which is not the case when using other grasp representations such as predefined grasp points. For instance, this property can be leveraged in motion optimization to define goal sets as implicit surface constraints in the robot configuration space. In this work, we restrict ourselves to the case of estimating possible end-effector positions directly from 2D camera images. To this end, we define grasp manifolds via a set of key points and locate them in images using a Mask R-CNN backbone. Using learned features allows generalizing to different view angles, with potentially noisy images, and objects that were not part of the training set. We rely on simulation data only and perform experiments on simple and complex objects, including unseen ones. Our framework achieves an inference speed of 11.5 fps on a GPU, an average precision for keypoint estimation of 94.5% and a mean pixel distance of only 1.29. This shows that we can estimate the objects very well via bounding boxes and segmentation masks as well as approximate the correct grasp manifold's keypoint coordinates.

Submitted 5 July, 2021; originally announced July 2021.

Comments: Accepted to RoMan 2021

arXiv:2107.01829 [pdf, other]

A System for Traded Control Teleoperation of Manipulation Tasks using Intent Prediction from Hand Gestures

Authors: Yoojin Oh, Marc Toussaint, Jim Mainprice

Abstract: This paper presents a teleoperation system that includes robot perception and intent prediction from hand gestures. The perception module identifies the objects present in the robot workspace, and the intent prediction module infers which object the user likely wants to grasp. This architecture allows the approach to rely on traded control instead of direct control: we use hand gestures to specify the goal objects for a sequential manipulation task, and the robot then autonomously generates a grasping or a retrieving motion using trajectory optimization. The perception module relies on a model-based tracker to precisely track the 6D pose of the objects and makes use of a state-of-the-art learning-based object detection and segmentation method to initialize the tracker by automatically detecting objects in the scene. Goal objects are identified from user hand gestures using a trained multi-layer perceptron classifier. After presenting all the components of the system and their empirical evaluation, we present experimental results comparing our pipeline to a direct traded control approach (i.e., one that does not use prediction), which show that using intent prediction reduces the overall task execution time.

Submitted 5 July, 2021; originally announced July 2021.

Comments: Accepted to IEEE-RoMAN 2021

arXiv:2104.08137 [pdf, other]

Hierarchical Human-Motion Prediction and Logic-Geometric Programming for Minimal Interference Human-Robot Tasks

Authors: An T. Le, Philipp Kratzer, Simon Hagenmayer, Marc Toussaint, Jim Mainprice

Abstract: In this paper, we tackle the problem of human-robot coordination in sequences of manipulation tasks. Our approach integrates hierarchical human motion prediction with Task and Motion Planning (TAMP). We first devise a hierarchical motion prediction approach by combining Inverse Reinforcement Learning and short-term motion prediction using a Recurrent Neural Network. In a second step, we propose a dynamic version of the TAMP algorithm Logic-Geometric Programming (LGP). Our version of Dynamic LGP replans periodically to handle the mismatch between the human motion prediction and the actual human behavior. We assess the efficacy of the approach by training the prediction algorithms and testing the framework on the publicly available MoGaze dataset.

Submitted 5 July, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: 8 pages, accepted to IEEE-ROMAN 2021

arXiv:2011.11552 [pdf, other]

MoGaze: A Dataset of Full-Body Motions that Includes Workspace Geometry and Eye-Gaze

Authors: Philipp Kratzer, Simon Bihlmaier, Niteesh Balachandra Midlagajni, Rohit Prakash, Marc Toussaint, Jim Mainprice

Abstract: As robots become more present in open human environments, it will become crucial for robotic systems to understand and predict human motion. Such capabilities depend heavily on the quality and availability of motion capture data. However, existing datasets of full-body motion rarely include 1) long sequences of manipulation tasks, 2) the 3D model of the workspace geometry, and 3) eye-gaze, which are all important when a robot needs to predict the movements of humans in close proximity. Hence, in this paper, we present a novel dataset of full-body motion for everyday manipulation tasks, which includes the above. The motion data was captured using a traditional motion capture system based on reflective markers. We additionally captured eye-gaze using a wearable pupil-tracking device. As we show in experiments, the dataset can be used for the design and evaluation of full-body motion prediction algorithms. Furthermore, our experiments show eye-gaze as a powerful predictor of human intent. The dataset includes 180 min of motion capture data with 1627 pick and place actions being performed. It is available at https://humans-to-robots-motion.github.io/mogaze and is planned to be extended to collaborative tasks with two humans in the near future.

Submitted 23 November, 2020; originally announced November 2020.

arXiv:2007.15308 [pdf, other]

Natural Gradient Shared Control

Authors: Yoojin Oh, Shao-Wen Wu, Marc Toussaint, Jim Mainprice

Abstract: We propose a formalism for shared control, which is the problem of defining a policy that blends user control and autonomous control. The challenge posed by the shared autonomy system is to maintain user control authority while allowing the robot to support the user. This can be done by enforcing constraints or acting optimally when the intent is clear. Our proposed solution relies on natural gradients emerging from the divergence constraint between the robot and the shared policy. We approximate the Fisher information by sampling a learned robot policy and computing the local gradient to augment the user control when necessary. A user study performed on a manipulation task demonstrates that, compared with a number of baseline methods, our approach allows for more efficient task completion while keeping control authority.

Submitted 30 July, 2020; originally announced July 2020.
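
The sampling-based Fisher approximation mentioned in the abstract above (arXiv:2007.15308) can be sketched in one dimension. This is a minimal illustration under stated assumptions: the policy is taken to be a Gaussian over a scalar action with a known standard deviation, the Fisher information is estimated as the sample mean of the squared score, and the natural gradient is the ordinary gradient divided by that estimate. The function names and the damping term are hypothetical and do not come from the paper.

```python
import random

def score_gaussian_mean(action, mean, sigma):
    """Gradient of log N(action | mean, sigma^2) with respect to the mean."""
    return (action - mean) / sigma ** 2

def estimate_fisher(mean, sigma, num_samples=20000, seed=0):
    """Monte Carlo estimate of the Fisher information of the mean parameter.

    Samples actions from the assumed Gaussian policy and averages the squared score.
    The exact value for this policy is 1 / sigma^2.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        action = rng.gauss(mean, sigma)
        total += score_gaussian_mean(action, mean, sigma) ** 2
    return total / num_samples

def natural_gradient(gradient, fisher, damping=1e-8):
    """Scalar natural gradient: precondition the gradient by the inverse Fisher."""
    return gradient / (fisher + damping)

fisher = estimate_fisher(mean=0.0, sigma=0.5)
step = natural_gradient(gradient=2.0, fisher=fisher)
print(fisher, step)
```

In higher dimensions the squared score becomes an outer product and the division becomes a linear solve; the scalar case is used here only to keep the sketch short.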

arXiv:2007.10038 [pdf, other]

Anticipating Human Intention for Full-Body Motion Prediction in Object Grasping and Placing Tasks

Authors: Philipp Kratzer, Niteesh Balachandra Midlagajni, Marc Toussaint, Jim Mainprice

Abstract: Motion prediction in unstructured environments is a difficult problem and is essential for safe and efficient human-robot space sharing and collaboration. In this work, we focus on manipulation movements in environments such as homes, workplaces or restaurants, where the overall task and environment can be leveraged to produce accurate motion prediction. For these cases we propose an algorithmic framework that accounts explicitly for the environment geometry based on a model of affordances and a model of short-term human dynamics, both trained on motion capture data. We propose dedicated function networks for graspability and placeability affordances and we make use of a dedicated RNN for short-term motion prediction. The predicted grasp and placement probability densities are used by a constraint-based trajectory optimizer to produce a full-body motion prediction over the entire horizon. We show by comparing to ground truth data that we achieve similar performance for full-body motion predictions as using oracle grasp and place locations.

Submitted 20 July, 2020; originally announced July 2020.

arXiv:2007.04842 [pdf, other]

An Interior Point Method Solving Motion Planning Problems with Narrow Passages

Authors: Jim Mainprice, Nathan Ratliff, Marc Toussaint, Stefan Schaal

Abstract: Algorithmic solutions for the motion planning problem have been investigated for five decades. Since the development of A* in 1969 many approaches have been investigated, traditionally classified as either grid decomposition, potential fields or sampling-based. In this work, we focus on using numerical optimization, which is understudied for solving motion planning problems. This lack of interest in favor of sampling-based methods is largely due to the non-convexity introduced by narrow passages. We address this shortcoming by grounding the solution in differential geometry. We demonstrate through a series of experiments on 3-DoF and 6-DoF narrow passage problems how explicitly modeling the underlying Riemannian manifold leads to an efficient interior-point non-linear programming solution.

Submitted 24 July, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

Comments: IEEE RO-MAN 2020, 6 pages

arXiv:1910.01843 [pdf, other]

Prediction of Human Full-Body Movements with Motion Optimization and Recurrent Neural Networks

Authors: Philipp Kratzer, Marc Toussaint, Jim Mainprice

Abstract: Human movement prediction is difficult as humans naturally exhibit complex behaviors that can change drastically from one environment to the next. In order to alleviate this issue, we propose a prediction framework that decouples short-term prediction, linked to internal body dynamics, and long-term prediction, linked to the environment and task constraints. In this work we investigate encoding short-term dynamics in a recurrent neural network, while we account for environmental constraints, such as obstacle avoidance, using gradient-based trajectory optimization. Experiments on real motion data demonstrate that our framework improves the prediction with respect to state-of-the-art motion prediction methods, as it accounts for previously unseen environmental structures. Moreover, we demonstrate with an example how this framework can be used to plan robot trajectories that are optimized to coordinate with a human partner.

Submitted 18 March, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

Comments: International Conference on Robotics and Automation (ICRA) 2020

arXiv:1906.12280 [pdf, other]

Learning Arbitration for Shared Autonomy by Hindsight Data Aggregation

Authors: Yoojin Oh, Marc Toussaint, Jim Mainprice

Abstract: In this paper we present a framework for the teleoperation of pick-and-place tasks. We define a shared control policy that blends direct user control and autonomous control based on user intent inference. One of the main challenges in shared autonomy systems is to define the arbitration function, which decides when to let the autonomous agent take over. In this work, we propose a model and training method to learn the arbitration function. Our model is based on a recurrent neural network that takes as input the state, intent prediction scores and user command to produce an arbitration between user and robot commands. This work extends our previous work on differentiable policies for shared autonomy. Differentiability of the policy is desirable to further train the shared autonomy system end-to-end. In this work we propose training the arbitration function using data from users performing the task with shared control. We present initial results by teleoperating a gripper in a virtual environment using pre-trained motion generation and intent prediction. We compare our data aggregation training procedure to a handcrafted arbitration function. Our preliminary results show the efficacy of the approach and shed light on limitations that we believe demonstrate the need for user adaptability in shared autonomy systems.

Submitted 28 June, 2019; originally announced June 2019.

Journal ref: Workshop on AI and Its Alternatives in Assistive and Collaborative Robotics (RSS 2019), Robotics: Science and Systems Freiburg, Germany
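
The arbitration described in the abstract above (arXiv:1906.12280) blends a user command with a robot command. A common way to write such a blend is a convex combination weighted by an arbitration value; the sketch below assumes that form. The name `blend_commands` and the convention that `alpha = 1` means full robot control are illustrative assumptions, and the learned recurrent model that would produce `alpha` is not shown.

```python
def blend_commands(user_cmd, robot_cmd, alpha):
    """Convex combination of two command vectors.

    alpha is the arbitration value, clamped to [0, 1]:
    0 gives the user command unchanged, 1 gives the robot command unchanged.
    """
    if len(user_cmd) != len(robot_cmd):
        raise ValueError("commands must have the same dimension")
    alpha = min(1.0, max(0.0, alpha))
    return [(1.0 - alpha) * u + alpha * r for u, r in zip(user_cmd, robot_cmd)]

# The user pushes along x, the robot suggests y; the robot gets 25% authority.
blended = blend_commands([1.0, 0.0], [0.0, 1.0], alpha=0.25)
print(blended)  # [0.75, 0.25]
```

Clamping keeps the output inside the segment between the two commands even if an upstream model emits a value slightly outside the unit interval.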

arXiv:1906.12279 [pdf, other]

Motion Prediction with Recurrent Neural Network Dynamical Models and Trajectory Optimization

Authors: Philipp Kratzer, Marc Toussaint, Jim Mainprice

Abstract: Predicting human motion in unstructured and dynamic environments is difficult as humans naturally exhibit complex behaviors that can change drastically from one environment to the next. In order to alleviate this issue, we propose to encode the lower level aspects of human motion separately from the higher level geometrical aspects, which we believe will generalize better over environments. In contrast to our prior work [kratzer2018], we encode the short-term behavior by using a state-of-the-art recurrent neural network structure instead of a Gaussian process. In order to perform longer-term behavior predictions that account for variation in tasks and environments, we propose to make use of gradient-based trajectory optimization. Preliminary experiments on real motion data demonstrate the efficacy of the approach.

Submitted 28 June, 2019; originally announced June 2019.

Journal ref: Workshop on AI and Its Alternatives in Assistive and Collaborative Robotics (RSS 2019), Robotics: Science and Systems Freiburg, Germany

arXiv:1903.01352 [pdf, other]

Learning Sensory-Motor Associations from Demonstration

Authors: Vincent Berenz, Ahmed Bjelic, Lahiru Herath, Jim Mainprice

Abstract: We propose a method which generates reactive robot behavior learned from human demonstration. In order to do so, we use the Playful programming language, which is based on the reactive programming paradigm. This allows us to represent the learned behavior as a set of associations between sensor and motor primitives in a human-readable script. Distinguishing between sensor and motor primitives introduces a supplementary level of granularity and, more importantly, enforces feedback, increasing adaptability and robustness. As the experimental section shows, useful behaviors may be learned from a single demonstration covering a very limited portion of the task space.

Submitted 22 July, 2020; v1 submitted 4 March, 2019; originally announced March 2019.

Comments: 7 pages

arXiv:1703.03512 [pdf, other]

Real-time Perception meets Reactive Motion Generation

Authors: Daniel Kappler, Franziska Meier, Jan Issac, Jim Mainprice, Cristina Garcia Cifuentes, Manuel Wüthrich, Vincent Berenz, Stefan Schaal, Nathan Ratliff, Jeannette Bohg

Abstract: We address the challenging problem of robotic grasping and manipulation in the presence of uncertainty. This uncertainty is due to noisy sensing, inaccurate models and hard-to-predict environment dynamics. We quantify the importance of continuous, real-time perception and its tight integration with reactive motion generation methods in dynamic manipulation scenarios. We compare three different systems that are instantiations of the most common architectures in the field: (i) a traditional sense-plan-act approach that is still widely used, (ii) a myopic controller that only reacts to local environment dynamics and (iii) a reactive planner that integrates feedback control and motion optimization. All architectures rely on the same components for real-time perception and reactive motion generation to allow a quantitative evaluation. We extensively evaluate the systems on a real robotic platform in four scenarios that exhibit either a challenging workspace geometry or a dynamic environment. In 333 experiments, we quantify the robustness and accuracy that is due to integrating real-time feedback at different time scales in a reactive motion generation system. We also report on the lessons learned for system building.

Submitted 6 October, 2017; v1 submitted 9 March, 2017; originally announced March 2017.

arXiv:1606.02111 [pdf, other]

Goal Set Inverse Optimal Control and Iterative Re-planning for Predicting Human Reaching Motions in Shared Workspaces

Authors: Jim Mainprice, Rafi Hayne, Dmitry Berenson

Abstract: To enable safe and efficient human-robot collaboration in shared workspaces it is important for the robot to predict how a human will move when performing a task. While predicting human motion for tasks not known a priori is very challenging, we argue that single-arm reaching motions for known tasks in collaborative settings (which are especially relevant for manufacturing) are indeed predictable. Two hypotheses underlie our approach for predicting such motions: first, that the trajectory the human performs is optimal with respect to an unknown cost function, and second, that human adaptation to their partner's motion can be captured well through iterative re-planning with the above cost function. The key to our approach is thus to learn a cost function which "explains" the motion of the human. To do this, we gather example trajectories from pairs of participants performing a collaborative assembly task using motion capture. We then use Inverse Optimal Control to learn a cost function from these trajectories. Finally, we predict reaching motions from the human's current configuration to a task-space goal region by iteratively re-planning a trajectory using the learned cost function. Our planning algorithm is based on the trajectory optimizer STOMP; it plans for a 23 DoF human kinematic model and accounts for the presence of a moving collaborator and obstacles in the environment. Our results suggest that in most cases, our method outperforms baseline methods when predicting motions. We also show that our method outperforms baselines for predicting human motion when a human and a robot share the workspace.

Submitted 7 June, 2016; originally announced June 2016.

Comments: 12 pages, accepted for publication in IEEE Transactions on Robotics 2016

Showing 1–15 of 15 results for author: Mainprice, J