Search | arXiv e-print repository

arXiv:2404.13432 [pdf, other]

The Child Factor in Child-Robot Interaction: Discovering the Impact of Developmental Stage and Individual Characteristics

Authors: Irina Rudenko, Andrey Rudenko, Achim J. Lilienthal, Kai O. Arras, Barbara Bruno

Abstract: Social robots, owing to their embodied physical presence in human spaces and the ability to directly interact with the users and their environment, have a great potential to support children in various activities in education, healthcare and daily life. Child-Robot Interaction (CRI), as any domain involving children, inevitably faces the major challenge of designing generalized strategies to work… ▽ More Social robots, owing to their embodied physical presence in human spaces and the ability to directly interact with the users and their environment, have a great potential to support children in various activities in education, healthcare and daily life. Child-Robot Interaction (CRI), as any domain involving children, inevitably faces the major challenge of designing generalized strategies to work with unique, turbulent and very diverse individuals. Addressing this challenging endeavor requires to combine the standpoint of the robot-centered perspective, i.e. what robots technically can and are best positioned to do, with that of the child-centered perspective, i.e. what children may gain from the robot and how the robot should act to best support them in reaching the goals of the interaction. This article aims to help researchers bridge the two perspectives and proposes to address the development of CRI scenarios with insights from child psychology and child development theories. To that end, we review the outcomes of the CRI studies, outline common trends and challenges, and identify two key factors from child psychology that impact child-robot interactions, especially in a long-term perspective: developmental stage and individual characteristics. For both of them we discuss prospective experiment designs which support building naturally engaging and sustainable interactions. △ Less

Submitted 20 April, 2024; originally announced April 2024.

Comments: Pre-print submitted to the International Journal of Social Robotics, accepted March 2024

arXiv:2309.07066 [pdf, other]

CLiFF-LHMP: Using Spatial Dynamics Patterns for Long-Term Human Motion Prediction

Authors: Yufei Zhu, Andrey Rudenko, Tomasz P. Kucner, Luigi Palmieri, Kai O. Arras, Achim J. Lilienthal, Martin Magnusson

Abstract: Human motion prediction is important for mobile service robots and intelligent vehicles to operate safely and smoothly around people. The more accurate predictions are, particularly over extended periods of time, the better a system can, e.g., assess collision risks and plan ahead. In this paper, we propose to exploit maps of dynamics (MoDs, a class of general representations of place-dependent sp… ▽ More Human motion prediction is important for mobile service robots and intelligent vehicles to operate safely and smoothly around people. The more accurate predictions are, particularly over extended periods of time, the better a system can, e.g., assess collision risks and plan ahead. In this paper, we propose to exploit maps of dynamics (MoDs, a class of general representations of place-dependent spatial motion patterns, learned from prior observations) for long-term human motion prediction (LHMP). We present a new MoD-informed human motion prediction approach, named CLiFF-LHMP, which is data efficient, explainable, and insensitive to errors from an upstream tracking system. Our approach uses CLiFF-map, a specific MoD trained with human motion data recorded in the same environment. We bias a constant velocity prediction with samples from the CLiFF-map to generate multi-modal trajectory predictions. In two public datasets we show that this algorithm outperforms the state of the art for predictions over very extended periods of time, achieving 45% more accurate prediction performance at 50s compared to the baseline. △ Less

Submitted 13 September, 2023; originally announced September 2023.

Comments: Accepted to the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv:2307.00841 [pdf, other]

doi 10.1109/RO-MAN57019.2023.10309629

Advantages of Multimodal versus Verbal-Only Robot-to-Human Communication with an Anthropomorphic Robotic Mock Driver

Authors: Tim Schreiter, Lucas Morillo-Mendez, Ravi T. Chadalavada, Andrey Rudenko, Erik Billing, Martin Magnusson, Kai O. Arras, Achim J. Lilienthal

Abstract: Robots are increasingly used in shared environments with humans, making effective communication a necessity for successful human-robot interaction. In our work, we study a crucial component: active communication of robot intent. Here, we present an anthropomorphic solution where a humanoid robot communicates the intent of its host robot acting as an "Anthropomorphic Robotic Mock Driver" (ARMoD). W… ▽ More Robots are increasingly used in shared environments with humans, making effective communication a necessity for successful human-robot interaction. In our work, we study a crucial component: active communication of robot intent. Here, we present an anthropomorphic solution where a humanoid robot communicates the intent of its host robot acting as an "Anthropomorphic Robotic Mock Driver" (ARMoD). We evaluate this approach in two experiments in which participants work alongside a mobile robot on various tasks, while the ARMoD communicates a need for human attention, when required, or gives instructions to collaborate on a joint task. The experiments feature two interaction styles of the ARMoD: a verbal-only mode using only speech and a multimodal mode, additionally including robotic gaze and pointing gestures to support communication and register intent in space. Our results show that the multimodal interaction style, including head movements and eye gaze as well as pointing gestures, leads to more natural fixation behavior. Participants naturally identified and fixated longer on the areas relevant for intent communication, and reacted faster to instructions in collaborative tasks. Our research further indicates that the ARMoD intent communication improves engagement and social interaction with mobile robots in workplace settings. △ Less

Submitted 3 July, 2023; originally announced July 2023.

Comments: This paper has been accepted to the 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), which will be held in Busan, South Korea on August 28-31, 2023. For more information, please visit: https://ro-man2023.org/main

Journal ref: 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)

arXiv:2304.10595 [pdf, other]

The e-Bike Motor Assembly: Towards Advanced Robotic Manipulation for Flexible Manufacturing

Authors: Leonel Rozo, Andras G. Kupcsik, Philipp Schillinger, Meng Guo, Robert Krug, Niels van Duijkeren, Markus Spies, Patrick Kesper, Sabrina Hoppe, Hanna Ziesche, Mathias Bürger, Kai O. Arras

Abstract: Robotic manipulation is currently undergoing a profound paradigm shift due to the increasing needs for flexible manufacturing systems, and at the same time, because of the advances in enabling technologies such as sensing, learning, optimization, and hardware. This demands for robots that can observe and reason about their workspace, and that are skillfull enough to complete various assembly proce… ▽ More Robotic manipulation is currently undergoing a profound paradigm shift due to the increasing needs for flexible manufacturing systems, and at the same time, because of the advances in enabling technologies such as sensing, learning, optimization, and hardware. This demands for robots that can observe and reason about their workspace, and that are skillfull enough to complete various assembly processes in weakly-structured settings. Moreover, it remains a great challenge to enable operators for teaching robots on-site, while managing the inherent complexity of perception, control, motion planning and reaction to unexpected situations. Motivated by real-world industrial applications, this paper demonstrates the potential of such a paradigm shift in robotics on the industrial case of an e-Bike motor assembly. The paper presents a concept for teaching and programming adaptive robots on-site and demonstrates their potential for the named applications. The framework includes: (i) a method to teach perception systems onsite in a self-supervised manner, (ii) a general representation of object-centric motion skills and force-sensitive assembly skills, both learned from demonstration, (iii) a sequencing approach that exploits a human-designed plan to perform complex tasks, and (iv) a system solution for adapting and optimizing skills online. The aforementioned components are interfaced through a four-layer software architecture that makes our framework a tangible industrial technology. To demonstrate the generality of the proposed framework, we provide, in addition to the motivating e-Bike motor assembly, a further case study on dense box packing for logistics automation. △ Less

Submitted 20 April, 2023; originally announced April 2023.

arXiv:2208.14925 [pdf, other]

The Magni Human Motion Dataset: Accurate, Complex, Multi-Modal, Natural, Semantically-Rich and Contextualized

Authors: Tim Schreiter, Tiago Rodrigues de Almeida, Yufei Zhu, Eduardo Gutierrez Maestro, Lucas Morillo-Mendez, Andrey Rudenko, Tomasz P. Kucner, Oscar Martinez Mozos, Martin Magnusson, Luigi Palmieri, Kai O. Arras, Achim J. Lilienthal

Abstract: Rapid development of social robots stimulates active research in human motion modeling, interpretation and prediction, proactive collision avoidance, human-robot interaction and co-habitation in shared spaces. Modern approaches to this end require high quality datasets for training and evaluation. However, the majority of available datasets suffers from either inaccurate tracking data or unnatural… ▽ More Rapid development of social robots stimulates active research in human motion modeling, interpretation and prediction, proactive collision avoidance, human-robot interaction and co-habitation in shared spaces. Modern approaches to this end require high quality datasets for training and evaluation. However, the majority of available datasets suffers from either inaccurate tracking data or unnatural, scripted behavior of the tracked people. This paper attempts to fill this gap by providing high quality tracking information from motion capture, eye-gaze trackers and on-board robot sensors in a semantically-rich environment. To induce natural behavior of the recorded participants, we utilise loosely scripted task assignment, which induces the participants navigate through the dynamic laboratory environment in a natural and purposeful way. The motion dataset, presented in this paper, sets a high quality standard, as the realistic and accurate data is enhanced with semantic information, enabling development of new algorithms which rely not only on the tracking information but also on contextual cues of the moving agents, static and dynamic environment. △ Less

Submitted 31 August, 2022; originally announced August 2022.

Comments: in SIRRW Workshop held in conjunction with 31st IEEE International Conference on Robot & Human Interactive Communication, 29/08 - 02/09 2022, Naples (Italy)

arXiv:2207.09830 [pdf, other]

The Atlas Benchmark: an Automated Evaluation Framework for Human Motion Prediction

Authors: Andrey Rudenko, Luigi Palmieri, Wanting Huang, Achim J. Lilienthal, Kai O. Arras

Abstract: Human motion trajectory prediction, an essential task for autonomous systems in many domains, has been on the rise in recent years. With a multitude of new methods proposed by different communities, the lack of standardized benchmarks and objective comparisons is increasingly becoming a major limitation to assess progress and guide further research. Existing benchmarks are limited in their scope a… ▽ More Human motion trajectory prediction, an essential task for autonomous systems in many domains, has been on the rise in recent years. With a multitude of new methods proposed by different communities, the lack of standardized benchmarks and objective comparisons is increasingly becoming a major limitation to assess progress and guide further research. Existing benchmarks are limited in their scope and flexibility to conduct relevant experiments and to account for contextual cues of agents and environments. In this paper we present Atlas, a benchmark to systematically evaluate human motion trajectory prediction algorithms in a unified framework. Atlas offers data preprocessing functions, hyperparameter optimization, comes with popular datasets and has the flexibility to setup and conduct underexplored yet relevant experiments to analyze a method's accuracy and robustness. In an example application of Atlas, we compare five popular model- and learning-based predictors and find that, when properly applied, early physics-based approaches are still remarkably competitive. Such results confirm the necessity of benchmarks like Atlas. △ Less

Submitted 20 July, 2022; originally announced July 2022.

Comments: Accepted to and will be presented at the IEEE RO-MAN 2022 conference

arXiv:2108.01495 [pdf, other]

Cross-Modal Analysis of Human Detection for Robotics: An Industrial Case Study

Authors: Timm Linder, Narunas Vaskevicius, Robert Schirmer, Kai O. Arras

Abstract: Advances in sensing and learning algorithms have led to increasingly mature solutions for human detection by robots, particularly in selected use-cases such as pedestrian detection for self-driving cars or close-range person detection in consumer settings. Despite this progress, the simple question "which sensor-algorithm combination is best suited for a person detection task at hand?" remains har… ▽ More Advances in sensing and learning algorithms have led to increasingly mature solutions for human detection by robots, particularly in selected use-cases such as pedestrian detection for self-driving cars or close-range person detection in consumer settings. Despite this progress, the simple question "which sensor-algorithm combination is best suited for a person detection task at hand?" remains hard to answer. In this paper, we tackle this issue by conducting a systematic cross-modal analysis of sensor-algorithm combinations typically used in robotics. We compare the performance of state-of-the-art person detectors for 2D range data, 3D lidar, and RGB-D data as well as selected combinations thereof in a challenging industrial use-case. We further address the related problems of data scarcity in the industrial target domain, and that recent research on human detection in 3D point clouds has mostly focused on autonomous driving scenarios. To leverage these methodological advances for robotics applications, we utilize a simple, yet effective multi-sensor transfer learning strategy by extending a strong image-based RGB-D detector to provide cross-modal supervision for lidar detectors in the form of weak 3D bounding box labels. Our results show a large variance among the different approaches in terms of detection performance, generalization, frame rates and computational requirements. As our use-case contains difficulties representative for a wide range of service robot applications, we believe that these results point to relevant open challenges for further research and provide valuable support to practitioners for the design of their robot system. △ Less

Submitted 3 August, 2021; originally announced August 2021.

Comments: Accepted for publication at 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

arXiv:2102.08745 [pdf, other]

Learning Occupancy Priors of Human Motion from Semantic Maps of Urban Environments

Authors: Andrey Rudenko, Luigi Palmieri, Johannes Doellinger, Achim J. Lilienthal, Kai O. Arras

Abstract: Understanding and anticipating human activity is an important capability for intelligent systems in mobile robotics, autonomous driving, and video surveillance. While learning from demonstrations with on-site collected trajectory data is a powerful approach to discover recurrent motion patterns, generalization to new environments, where sufficient motion data are not readily available, remains a c… ▽ More Understanding and anticipating human activity is an important capability for intelligent systems in mobile robotics, autonomous driving, and video surveillance. While learning from demonstrations with on-site collected trajectory data is a powerful approach to discover recurrent motion patterns, generalization to new environments, where sufficient motion data are not readily available, remains a challenge. In many cases, however, semantic information about the environment is a highly informative cue for the prediction of pedestrian motion or the estimation of collision risks. In this work, we infer occupancy priors of human motion using only semantic environment information as input. To this end we apply and discuss a traditional Inverse Optimal Control approach, and propose a novel one based on Convolutional Neural Networks (CNN) to predict future occupancy maps. Our CNN method produces flexible context-aware occupancy estimations for semantically uniform map regions and generalizes well already with small amounts of training data. Evaluated on synthetic and real-world data, it shows superior results compared to several baselines, marking a qualitative step-up in semantic environment assessment. △ Less

Submitted 17 February, 2021; originally announced February 2021.

Comments: 8 pages, final pre-print version of the IEEE RA-L letter

arXiv:2007.07227 [pdf, other]

doi 10.1109/TBIOM.2020.3037257

MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation

Authors: István Sárándi, Timm Linder, Kai O. Arras, Bastian Leibe

Abstract: Heatmap representations have formed the basis of human pose estimation systems for many years, and their extension to 3D has been a fruitful line of recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and Z to metric depth around the subject. To obtain metric-scale predictions, 2.5D methods need a separate post-processing step to resolve scale ambi… ▽ More Heatmap representations have formed the basis of human pose estimation systems for many years, and their extension to 3D has been a fruitful line of recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and Z to metric depth around the subject. To obtain metric-scale predictions, 2.5D methods need a separate post-processing step to resolve scale ambiguity. Further, they cannot localize body joints outside the image boundaries, leading to incomplete estimates for truncated images. To address these limitations, we propose metric-scale truncation-robust (MeTRo) volumetric heatmaps, whose dimensions are all defined in metric 3D space, instead of being aligned with image space. This reinterpretation of heatmap dimensions allows us to directly estimate complete, metric-scale poses without test-time knowledge of distance or relying on anthropometric heuristics, such as bone lengths. To further demonstrate the utility our representation, we present a differentiable combination of our 3D metric-scale heatmaps with 2D image-space ones to estimate absolute 3D pose (our MeTRAbs architecture). We find that supervision via absolute pose loss is crucial for accurate non-root-relative localization. Using a ResNet-50 backbone without further learned layers, we obtain state-of-the-art results on Human3.6M, MPI-INF-3DHP and MuPoTS-3D. Our code will be made publicly available to facilitate further research. △ Less

Submitted 14 November, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

Comments: See project page at https://vision.rwth-aachen.de/metrabs . Accepted for publication in the IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM), Special Issue "Selected Best Works From Automated Face and Gesture Recognition 2020". Extended version of FG paper arXiv:2003.02953

ACM Class: I.2.10; I.4.8

Journal ref: IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 3, no. 1, pp. 16-30, Jan. 2021

arXiv:2003.03543 [pdf, other]

Experimental Comparison of Global Motion Planning Algorithms for Wheeled Mobile Robots

Authors: Eric Heiden, Luigi Palmieri, Kai O. Arras, Gaurav S. Sukhatme, Sven Koenig

Abstract: Planning smooth and energy-efficient motions for wheeled mobile robots is a central task for applications ranging from autonomous driving to service and intralogistic robotics. Over the past decades, a wide variety of motion planners, steer functions and path-improvement techniques have been proposed for such non-holonomic systems. With the objective of comparing this large assortment of state-of-… ▽ More Planning smooth and energy-efficient motions for wheeled mobile robots is a central task for applications ranging from autonomous driving to service and intralogistic robotics. Over the past decades, a wide variety of motion planners, steer functions and path-improvement techniques have been proposed for such non-holonomic systems. With the objective of comparing this large assortment of state-of-the-art motion-planning techniques, we introduce a novel open-source motion-planning benchmark for wheeled mobile robots, whose scenarios resemble real-world applications (such as navigating warehouses, moving in cluttered cities or parking), and propose metrics for planning efficiency and path quality. Our benchmark is easy to use and extend, and thus allows practitioners and researchers to evaluate new motion-planning algorithms, scenarios and metrics easily. We use our benchmark to highlight the strengths and weaknesses of several common state-of-the-art motion planners and provide recommendations on when they should be used. △ Less

Submitted 7 March, 2020; originally announced March 2020.

Comments: Extended version of manuscript under review

arXiv:2003.02953 [pdf, other]

doi 10.1109/FG47880.2020.00108

Metric-Scale Truncation-Robust Heatmaps for 3D Human Pose Estimation

Authors: István Sárándi, Timm Linder, Kai O. Arras, Bastian Leibe

Abstract: Heatmap representations have formed the basis of 2D human pose estimation systems for many years, but their generalizations for 3D pose have only recently been considered. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and the Z axis to metric depth around the subject. To obtain metric-scale predictions, these methods must include a separate, explicit post-pro… ▽ More Heatmap representations have formed the basis of 2D human pose estimation systems for many years, but their generalizations for 3D pose have only recently been considered. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and the Z axis to metric depth around the subject. To obtain metric-scale predictions, these methods must include a separate, explicit post-processing step to resolve scale ambiguity. Further, they cannot encode body joint positions outside of the image boundaries, leading to incomplete pose estimates in case of image truncation. We address these limitations by proposing metric-scale truncation-robust (MeTRo) volumetric heatmaps, whose dimensions are defined in metric 3D space near the subject, instead of being aligned with image space. We train a fully-convolutional network to estimate such heatmaps from monocular RGB in an end-to-end manner. This reinterpretation of the heatmap dimensions allows us to estimate complete metric-scale poses without test-time knowledge of the focal length or person distance and without relying on anthropometric heuristics in post-processing. Furthermore, as the image space is decoupled from the heatmap space, the network can learn to reason about joints beyond the image boundary. Using ResNet-50 without any additional learned layers, we obtain state-of-the-art results on the Human3.6M and MPI-INF-3DHP benchmarks. As our method is simple and fast, it can become a useful component for real-time top-down multi-person pose estimation systems. We make our code publicly available to facilitate further research (see https://vision.rwth-aachen.de/metro-pose3d). △ Less

Submitted 5 March, 2020; originally announced March 2020.

Comments: Accepted for publication at the 2020 IEEE Conference on Automatic Face and Gesture Recognition (FG)

arXiv:2003.00754 [pdf, other]

Plug-and-Play SLAM: A Unified SLAM Architecture for Modularity and Ease of Use

Authors: Mirco Colosi, Irvin Aloise, Tiziano Guadagnino, Dominik Schlegel, Bartolomeo Della Corte, Kai O. Arras, Giorgio Grisetti

Abstract: Nowadays, SLAM (Simultaneous Localization and Map**) is considered by the Robotics community to be a mature field. Currently, there are many open-source systems that are able to deliver fast and accurate estimation in typical real-world scenarios. Still, all these systems often provide an ad-hoc implementation that entailed to predefined sensor configurations. In this work, we tackle this issue,… ▽ More Nowadays, SLAM (Simultaneous Localization and Map**) is considered by the Robotics community to be a mature field. Currently, there are many open-source systems that are able to deliver fast and accurate estimation in typical real-world scenarios. Still, all these systems often provide an ad-hoc implementation that entailed to predefined sensor configurations. In this work, we tackle this issue, proposing a novel SLAM architecture specifically designed to address heterogeneous sensors' configuration and to standardize SLAM solutions. Thanks to its modularity and to specific design patterns, the presented architecture is easy to extend, enhancing code reuse and efficiency. Finally, adopting our solution, we conducted comparative experiments for a variety of sensor configurations, showing competitive results that confirm state-of-the-art performance. △ Less

Submitted 14 April, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

arXiv:2001.05449 [pdf, ps, other]

CIAO$^\star$: MPC-based Safe Motion Planning in Predictable Dynamic Environments

Authors: Tobias Schoels, Per Rutquist, Luigi Palmieri, Andrea Zanelli, Kai O. Arras, Moritz Diehl

Abstract: Robots have been operating in dynamic environments and shared workspaces for decades. Most optimization based motion planning methods, however, do not consider the movement of other agents, e.g. humans or other robots, and therefore do not guarantee collision avoidance in such scenarios. This paper builds upon the Convex Inner ApprOximation (CIAO) method and proposes a motion planning algorithm th… ▽ More Robots have been operating in dynamic environments and shared workspaces for decades. Most optimization based motion planning methods, however, do not consider the movement of other agents, e.g. humans or other robots, and therefore do not guarantee collision avoidance in such scenarios. This paper builds upon the Convex Inner ApprOximation (CIAO) method and proposes a motion planning algorithm that guarantees collision avoidance in predictable dynamic environments. Furthermore, it generalizes CIAO's free region concept to arbitrary norms and proposes a cost function to approximate time optimal motion planning. The proposed method, CIAO$^\star$, finds kinodynamically feasible and collision free trajectories for constrained single body robots using model predictive control (MPC). It optimizes the motion of one agent and accounts for the predicted movement of surrounding agents and obstacles. The experimental evaluation shows that CIAO$^\star$ reaches close to time optimal behavior. △ Less

Submitted 25 May, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: accepted to 21st IFAC World Congress

arXiv:1909.13552 [pdf, other]

doi 10.1109/LRA.2019.2958525

Dispertio: Optimal Sampling for Safe Deterministic Sampling-Based Motion Planning

Authors: Luigi Palmieri, Leonard Bruns, Michael Meurer, Kai Oliver Arras

Abstract: A key challenge in robotics is the efficient generation of optimal robot motion with safety guarantees in cluttered environments. Recently, deterministic optimal sampling-based motion planners have been shown to achieve good performance towards this end, in particular in terms of planning efficiency, final solution cost, quality guarantees as well as non-probabilistic completeness. Yet their appli… ▽ More A key challenge in robotics is the efficient generation of optimal robot motion with safety guarantees in cluttered environments. Recently, deterministic optimal sampling-based motion planners have been shown to achieve good performance towards this end, in particular in terms of planning efficiency, final solution cost, quality guarantees as well as non-probabilistic completeness. Yet their application is still limited to relatively simple systems (i.e., linear, holonomic, Euclidean state spaces). In this work, we extend this technique to the class of symmetric and optimal driftless systems by presenting Dispertio, an offline dispersion optimization technique for computing sampling sets, aware of differential constraints, for sampling-based robot motion planning. We prove that the approach, when combined with PRM*, is deterministically complete and retains asymptotic optimality. Furthermore, in our experiments we show that the proposed deterministic sampling technique outperforms several baselines and alternative methods in terms of planning efficiency and solution cost. △ Less

Submitted 30 September, 2019; originally announced September 2019.

arXiv:1909.08267 [pdf, other]

An NMPC Approach using Convex Inner Approximations for Online Motion Planning with Guaranteed Collision Avoidance

Authors: Tobias Schoels, Luigi Palmieri, Kai O. Arras, Moritz Diehl

Abstract: Even though mobile robots have been around for decades, trajectory optimization and continuous time collision avoidance remain subject of active research. Existing methods trade off between path quality, computational complexity, and kinodynamic feasibility. This work approaches the problem using a nonlinear model predictive control (NMPC) framework, that is based on a novel convex inner approxima… ▽ More Even though mobile robots have been around for decades, trajectory optimization and continuous time collision avoidance remain subject of active research. Existing methods trade off between path quality, computational complexity, and kinodynamic feasibility. This work approaches the problem using a nonlinear model predictive control (NMPC) framework, that is based on a novel convex inner approximation of the collision avoidance constraint. The proposed Convex Inner ApprOximation (CIAO) method finds kinodynamically feasible and continuous time collision free trajectories, in few iterations, typically one. For a feasible initialization, the approach is guaranteed to find a feasible solution, i.e. it preserves feasibility. Our experimental evaluation shows that CIAO outperforms state of the art baselines in terms of planning efficiency and path quality. Experiments on a robot with 12 states show that it also scales to high-dimensional systems. Furthermore real-world experiments demonstrate its capability of unifying trajectory optimization and tracking for safe motion planning in dynamic environments. △ Less

Submitted 29 February, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

Comments: Accepted by ICRA2020. This version is extended for readability and completeness. Additional content would have exceeded page limit

arXiv:1909.04403 [pdf, other]

doi 10.1109/LRA.2020.2965416

THÖR: Human-Robot Navigation Data Collection and Accurate Motion Trajectories Dataset

Authors: Andrey Rudenko, Tomasz P. Kucner, Chittaranjan S. Swaminathan, Ravi T. Chadalavada, Kai O. Arras, Achim J. Lilienthal

Abstract: Understanding human behavior is key for robots and intelligent systems that share a space with people. Accordingly, research that enables such systems to perceive, track, learn and predict human behavior as well as to plan and interact with humans has received increasing attention over the last years. The availability of large human motion datasets that contain relevant levels of difficulty is fun… ▽ More Understanding human behavior is key for robots and intelligent systems that share a space with people. Accordingly, research that enables such systems to perceive, track, learn and predict human behavior as well as to plan and interact with humans has received increasing attention over the last years. The availability of large human motion datasets that contain relevant levels of difficulty is fundamental to this research. Existing datasets are often limited in terms of information content, annotation quality or variability of human behavior. In this paper, we present THÖR, a new dataset with human motion trajectory and eye gaze data collected in an indoor environment with accurate ground truth for position, head orientation, gaze direction, social grou**, obstacles map and goal coordinates. THÖR also contains sensor data collected by a 3D lidar and involves a mobile robot navigating the space. We propose a set of metrics to quantitatively analyze motion trajectory datasets such as the average tracking duration, ground truth noise, curvature and speed variation of the trajectories. In comparison to prior art, our dataset has a larger variety in human motion behavior, is less noisy, and contains annotations at higher frequencies. △ Less

Submitted 11 December, 2019; v1 submitted 10 September, 2019; originally announced September 2019.

Comments: 7 pages, to appear in RA-L, awaiting decision for the ICRA 2020 conference option

arXiv:1908.00151 [pdf, other]

Multi-path Learning for Object Pose Estimation Across Domains

Authors: Martin Sundermeyer, Maximilian Durner, En Yen Puang, Zoltan-Csaba Marton, Narunas Vaskevicius, Kai O. Arras, Rudolph Triebel

Abstract: We introduce a scalable approach for object pose estimation trained on simulated RGB views of multiple 3D models together. We learn an encoding of object views that does not only describe an implicit orientation of all objects seen during training, but can also relate views of untrained objects. Our single-encoder-multi-decoder network is trained using a technique we denote "multi-path learning":… ▽ More We introduce a scalable approach for object pose estimation trained on simulated RGB views of multiple 3D models together. We learn an encoding of object views that does not only describe an implicit orientation of all objects seen during training, but can also relate views of untrained objects. Our single-encoder-multi-decoder network is trained using a technique we denote "multi-path learning": While the encoder is shared by all objects, each decoder only reconstructs views of a single object. Consequently, views of different instances do not have to be separated in the latent space and can share common features. The resulting encoder generalizes well from synthetic to real data and across various instances, categories, model types and datasets. We systematically investigate the learned encodings, their generalization, and iterative refinement strategies on the ModelNet40 and T-LESS dataset. Despite training jointly on multiple objects, our 6D Object Detection pipeline achieves state-of-the-art results on T-LESS at much lower runtimes than competing approaches. △ Less

Submitted 3 April, 2020; v1 submitted 31 July, 2019; originally announced August 2019.

Comments: To appear at CVPR 2020; Code will be available here: https://github.com/DLR-RM/AugmentedAutoencoder/tree/multipath

arXiv:1905.06113 [pdf, other]

doi 10.1177/0278364920917446

Human Motion Trajectory Prediction: A Survey

Authors: Andrey Rudenko, Luigi Palmieri, Michael Herman, Kris M. Kitani, Dariu M. Gavrila, Kai O. Arras

Abstract: With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper prov… ▽ More With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research. △ Less

Submitted 17 December, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

Comments: Submitted to the International Journal of Robotics Research (IJRR), 37 pages

arXiv:1809.04987 [pdf, other]

Synthetic Occlusion Augmentation with Volumetric Heatmaps for the 2018 ECCV PoseTrack Challenge on 3D Human Pose Estimation

Authors: István Sárándi, Timm Linder, Kai O. Arras, Bastian Leibe

Abstract: In this paper we present our winning entry at the 2018 ECCV PoseTrack Challenge on 3D human pose estimation. Using a fully-convolutional backbone architecture, we obtain volumetric heatmaps per body joint, which we convert to coordinates using soft-argmax. Absolute person center depth is estimated by a 1D heatmap prediction head. The coordinates are back-projected to 3D camera space, where we mini… ▽ More In this paper we present our winning entry at the 2018 ECCV PoseTrack Challenge on 3D human pose estimation. Using a fully-convolutional backbone architecture, we obtain volumetric heatmaps per body joint, which we convert to coordinates using soft-argmax. Absolute person center depth is estimated by a 1D heatmap prediction head. The coordinates are back-projected to 3D camera space, where we minimize the L1 loss. Key to our good results is the training data augmentation with randomly placed occluders from the Pascal VOC dataset. In addition to reaching first place in the Challenge, our method also surpasses the state-of-the-art on the full Human3.6M benchmark among methods that use no additional pose datasets in training. Code for applying synthetic occlusions is availabe at https://github.com/isarandi/synthetic-occlusion. △ Less

Submitted 6 November, 2018; v1 submitted 13 September, 2018; originally announced September 2018.

Comments: Extended abstract for the 2018 ECCV PoseTrack Workshop, updated with full result tables

arXiv:1808.09316 [pdf, other]

How Robust is 3D Human Pose Estimation to Occlusion?

Authors: István Sárándi, Timm Linder, Kai O. Arras, Bastian Leibe

Abstract: Occlusion is commonplace in realistic human-robot shared environments, yet its effects are not considered in standard 3D human pose estimation benchmarks. This leaves the question open: how robust are state-of-the-art 3D pose estimation methods against partial occlusions? We study several types of synthetic occlusions over the Human3.6M dataset and find a method with state-of-the-art benchmark per… ▽ More Occlusion is commonplace in realistic human-robot shared environments, yet its effects are not considered in standard 3D human pose estimation benchmarks. This leaves the question open: how robust are state-of-the-art 3D pose estimation methods against partial occlusions? We study several types of synthetic occlusions over the Human3.6M dataset and find a method with state-of-the-art benchmark performance to be sensitive even to low amounts of occlusion. Addressing this issue is key to progress in applications such as collaborative and service robotics. We take a first step in this direction by improving occlusion-robustness through training data augmentation with synthetic occlusions. This also turns out to be an effective regularizer that is beneficial even for non-occluded test cases. △ Less

Submitted 29 August, 2018; v1 submitted 28 August, 2018; originally announced August 2018.

Comments: Accepted for IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'18) - Workshop on Robotic Co-workers 4.0: Human Safety and Comfort in Human-Robot Interactive Social Environments

arXiv:1804.02463 [pdf, other]

Deep Person Detection in 2D Range Data

Authors: Lucas Beyer, Alexander Hermans, Timm Linder, Kai O. Arras, Bastian Leibe

Abstract: Detecting humans is a key skill for mobile robots and intelligent vehicles in a large variety of applications. While the problem is well studied for certain sensory modalities such as image data, few works exist that address this detection task using 2D range data. However, a widespread sensory setup for many mobile robots in service and domestic applications contains a horizontally mounted 2D las… ▽ More Detecting humans is a key skill for mobile robots and intelligent vehicles in a large variety of applications. While the problem is well studied for certain sensory modalities such as image data, few works exist that address this detection task using 2D range data. However, a widespread sensory setup for many mobile robots in service and domestic applications contains a horizontally mounted 2D laser scanner. Detecting people from 2D range data is challenging due to the speed and dynamics of human leg motion and the high levels of occlusion and self-occlusion particularly in crowds of people. While previous approaches mostly relied on handcrafted features, we recently developed the deep learning based wheelchair and walker detector DROW. In this paper, we show the generalization to people, including small modifications that significantly boost DROW's performance. Additionally, by providing a small, fully online temporal window in our network, we further boost our score. We extend the DROW dataset with person annotations, making this the largest dataset of person annotations in 2D range data, recorded during several days in a real-world environment with high diversity. Extensive experiments with three current baseline methods indicate it is a challenging dataset, on which our improved DROW detector beats the current state-of-the-art. △ Less

Submitted 6 April, 2018; originally announced April 2018.

arXiv:1510.08233 [pdf, other]

A Fast Randomized Method to Find Homotopy Classes for Socially-Aware Navigation

Authors: Luigi Palmieri, Andrey Rudenko, Kai O. Arras

Abstract: We introduce and show preliminary results of a fast randomized method that finds a set of K paths lying in distinct homotopy classes. We frame the path planning task as a graph search problem, where the navigation graph is based on a Voronoi diagram. The search is biased by a cost function derived from the social force model that is used to generate and select the paths. We compare our method to Y… ▽ More We introduce and show preliminary results of a fast randomized method that finds a set of K paths lying in distinct homotopy classes. We frame the path planning task as a graph search problem, where the navigation graph is based on a Voronoi diagram. The search is biased by a cost function derived from the social force model that is used to generate and select the paths. We compare our method to Yen's algorithm, and empirically show that our approach is faster to find a subset of homotopy classes. Furthermore our approach computes a set of more diverse paths with respect to the baseline while obtaining a negligible loss in path quality. △ Less

Submitted 28 October, 2015; originally announced October 2015.

Comments: In Proceedings of the IROS 2015 Workshop on Assistance and Service Robotics in a Human Environment Workshop, Hamburg, Germany, 2015

Showing 1–22 of 22 results for author: Arras, K O