-
Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification
Authors:
Xiaohan Xing,
Yuenan Hou,
Hang Li,
Yixuan Yuan,
Hongsheng Li,
Max Q. -H. Meng
Abstract:
The amount of medical images for training deep classification models is typically very scarce, making these deep models prone to overfit the training data. Studies showed that knowledge distillation (KD), especially the mean-teacher framework which is more robust to perturbations, can help mitigate the over-fitting effect. However, directly transferring KD from computer vision to medical image classification yields inferior performance as medical images suffer from higher intra-class variance and class imbalance. To address these issues, we propose a novel Categorical Relation-preserving Contrastive Knowledge Distillation (CRCKD) algorithm, which takes the commonly used mean-teacher model as the supervisor. Specifically, we propose a novel Class-guided Contrastive Distillation (CCD) module to pull closer positive image pairs from the same class in the teacher and student models, while pushing apart negative image pairs from different classes. With this regularization, the feature distribution of the student model shows higher intra-class similarity and inter-class variance. Besides, we propose a Categorical Relation Preserving (CRP) loss to distill the teacher's relational knowledge in a robust and class-balanced manner. With the contribution of the CCD and CRP, our CRCKD algorithm can distill the relational knowledge more comprehensively. Extensive experiments on the HAM10000 and APTOS datasets demonstrate the superiority of the proposed CRCKD method.
Submitted 7 July, 2021;
originally announced July 2021.
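The abstract above combines a mean-teacher supervisor with a class-guided contrastive objective. A minimal sketch of both ideas follows, in plain Python. It is an illustration under stated assumptions, not the authors' CRCKD implementation: the function names, the cosine similarity, the temperature value, and the InfoNCE-style form of the loss are all choices made for this example.

```python
import math


def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Mean-teacher update: teacher = alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_guided_contrastive_loss(student_feats, teacher_feats, labels,
                                  temperature=0.1):
    """InfoNCE-style loss (an assumed form, not taken from the paper).

    Each student feature is an anchor. Teacher features with the same
    label are positives; teacher features with other labels are negatives.
    """
    total, count = 0.0, 0
    for i, anchor in enumerate(student_feats):
        logits = [cosine(anchor, t) / temperature for t in teacher_feats]
        log_denom = math.log(sum(math.exp(z) for z in logits))
        for j, label in enumerate(labels):
            if label == labels[i]:
                # Negative log-softmax of the positive pair.
                total += log_denom - logits[j]
                count += 1
    return total / count
```

The loss is small when student and teacher features of the same class point in similar directions and large when they align with another class, which matches the pull-together and push-apart behaviour the abstract describes.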
-
Learning Robot Exploration Strategy with 4D Point-Clouds-like Information as Observations
Authors:
Zhaoting Li,
Tingguang Li,
Jiankun Wang,
Max Q. -H. Meng
Abstract:
Being able to explore unknown environments is a requirement for fully autonomous robots. Many learning-based methods have been proposed to learn an exploration strategy. In frontier-based exploration, learning algorithms tend to learn the optimal or near-optimal frontier to explore. Most of these methods represent the environments as fixed-size images and take these as inputs to neural networks. However, the size of environments is usually unknown, which makes these methods fail to generalize to real-world scenarios. To address this issue, we present a novel state representation method based on 4D point-clouds-like information, including the locations, frontier, and distance information. We also design a neural network that can process this 4D point-clouds-like information and generate the estimated value for each frontier. Then this neural network is trained using the typical reinforcement learning framework. We test the performance of our proposed method by comparing it with five other methods and test its scalability on a map that is much larger than maps in the training set. The experiment results demonstrate that our proposed method needs shorter average traveling distances to explore whole environments and can be adopted in maps of arbitrary sizes.
Submitted 17 June, 2021;
originally announced June 2021.
-
Curiosity-based Robot Navigation under Uncertainty in Crowded Environments
Authors:
Kuanqi Cai,
Weinan Chen,
Chaoqun Wang,
Hong Zhang,
Max Q. -H. Meng
Abstract:
Mobile robots have become more and more popular in large-scale and crowded environments, such as airports, shopping malls, etc. However, due to sparse landmarks and crowd noise, localization in this environment is a great challenge. Furthermore, it is unreliable for the robot to navigate safely in crowds while considering human comfort. Thus, how to navigate safely with localization precision in that environment is a critical problem. To solve this problem, we proposed a curiosity-based framework that can find an effective path with the consideration of human comfort and crowds, localization uncertainty, and the cost-to-go to the target. Three parts are involved in the proposed framework: the distance assessment module, the Curiosity for Positive Content (CPC), namely information-rich areas, and the Curiosity for Negative Content (CNC), namely crowded areas. CPC is introduced when the real-time localization uncertainty evaluation is not satisfied. This factor is predicted through the propagation of uncertainty along the candidate trajectory to provoke the robot to approach localization-referenced landmarks. The Human Comfort and Crowd Density Map (HCCDM) based on the Gaussian Mixture Model (GMM) is established to calculate CNC, which drives the robot to bypass the crowd and consider human comfort. The evaluation is conducted in a series of large-scale and crowded environments. The results show that our method can find a feasible path that can consider the localization uncertainty while simultaneously avoiding the crowded area.
Submitted 20 March, 2023; v1 submitted 3 June, 2021;
originally announced June 2021.
-
VDB-EDT: An Efficient Euclidean Distance Transform Algorithm Based on VDB Data Structure
Authors:
Delong Zhu,
Chaoqun Wang,
Wenshan Wang,
Rohit Garg,
Sebastian Scherer,
Max Q. -H. Meng
Abstract:
This paper presents a fundamental algorithm, called VDB-EDT, for Euclidean distance transform (EDT) based on the VDB data structure. The algorithm executes on grid maps and generates the corresponding distance field for recording distance information against obstacles, which forms the basis of numerous motion planning algorithms. The contributions of this work are threefold. Firstly, we propose a novel algorithm that can facilitate distance transform procedures by optimizing the scheduling priorities of transform functions, which significantly improves the running speed of conventional EDT algorithms. Secondly, we for the first time introduce the memory-efficient VDB data structure, a customized B+ tree, to represent the distance field hierarchically. Benefiting from the special index and caching mechanism, VDB shows a fast (average O(1)) random access speed, and thus is very suitable for the frequent neighbor-searching operations in EDT. Moreover, regarding the small scale of existing datasets, we release a large-scale dataset captured from subterranean environments to benchmark EDT algorithms. Extensive experiments on the released dataset and publicly available datasets show that VDB-EDT can reduce memory consumption by about 30%-85%, depending on the sparsity of the environment, while maintaining a competitive running speed with the fastest array-based implementation. The experiments also show that VDB-EDT can significantly outperform the state-of-the-art EDT algorithm in both runtime and memory efficiency, which strongly demonstrates the advantages of our proposed method. The released dataset and source code are available on https://github.com/zhudelong/VDB-EDT.
Submitted 10 May, 2021;
originally announced May 2021.
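To make the notion of a distance field concrete, the sketch below computes a brute-force Euclidean distance transform on a small 2D grid and stores the result sparsely. It is a toy illustration only: a Python dict stands in for the hierarchical VDB structure, the brute-force search is not the scheduling algorithm proposed in the paper, and the `max_dist` cutoff is an assumption introduced here to show how sparsity can reduce what is stored.

```python
import math


def distance_field(obstacles, width, height, max_dist=None):
    """Map each grid cell to its Euclidean distance to the nearest obstacle.

    obstacles: non-empty iterable of (x, y) obstacle cells.
    max_dist:  optional cutoff; cells farther than this are not stored,
               so the dict stays sparse (a stand-in for VDB, not VDB itself).
    """
    obstacles = list(obstacles)
    field = {}
    for x in range(width):
        for y in range(height):
            # Brute force: check every obstacle for every cell.
            d = min(math.hypot(x - ox, y - oy) for ox, oy in obstacles)
            if max_dist is None or d <= max_dist:
                field[(x, y)] = d
    return field
```

A planner would query such a field for clearance at a candidate position; cells missing from a truncated field can be treated as being at least `max_dist` away from any obstacle.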
-
No Need for Interactions: Robust Model-Based Imitation Learning using Neural ODE
Authors:
HaoChih Lin,
Baopu Li,
Xin Zhou,
Jiankun Wang,
Max Q. -H. Meng
Abstract:
Interactions with either environments or expert policies during training are needed for most of the current imitation learning (IL) algorithms. For IL problems with no interactions, a typical approach is Behavior Cloning (BC). However, BC-like methods tend to be affected by distribution shift. To mitigate this problem, we come up with a Robust Model-Based Imitation Learning (RMBIL) framework that casts imitation learning as an end-to-end differentiable nonlinear closed-loop tracking problem. RMBIL applies Neural ODE to learn a precise multi-step dynamics and a robust tracking controller via Nonlinear Dynamics Inversion (NDI) algorithm. Then, the learned NDI controller will be combined with a trajectory generator, a conditional VAE, to imitate an expert's behavior. Theoretical derivation shows that the controller network can approximate an NDI when minimizing the training loss of Neural ODE. Experiments on Mujoco tasks also demonstrate that RMBIL is competitive to the state-of-the-art generative adversarial method (GAIL) and achieves at least 30% performance gain over BC in uneven surfaces.
Submitted 3 April, 2021;
originally announced April 2021.
-
A Large-Scale Dataset for Benchmarking Elevator Button Segmentation and Character Recognition
Authors:
Jianbang Liu,
Yuqi Fang,
Delong Zhu,
Nachuan Ma,
** Pan,
Max Q. -H. Meng
Abstract:
Human activities are hugely restricted by COVID-19, recently. Robots that can conduct inter-floor navigation attract much public attention, since they can substitute human workers to conduct the service work. However, current robots either depend on human assistance or elevator retrofitting, and fully autonomous inter-floor navigation is still not available. As the very first step of inter-floor navigation, elevator button segmentation and recognition hold an important position. Therefore, we release the first large-scale publicly available elevator panel dataset in this work, containing 3,718 panel images with 35,100 button labels, to facilitate more powerful algorithms on autonomous elevator operation. Together with the dataset, a number of deep learning based implementations for button segmentation and recognition are also released to benchmark future methods in the community. The dataset will be available at https://github.com/zhudelong/elevator_button_recognition
Submitted 22 March, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Autonomous Navigation of an Ultrasound Probe Towards Standard Scan Planes with Deep Reinforcement Learning
Authors:
Keyu Li,
Jian Wang,
Yangxin Xu,
Hao Qin,
Dongsheng Liu,
Li Liu,
Max Q. -H. Meng
Abstract:
Autonomous ultrasound (US) acquisition is an important yet challenging task, as it involves interpretation of the highly complex and variable images and their spatial relationships. In this work, we propose a deep reinforcement learning framework to autonomously control the 6-D pose of a virtual US probe based on real-time image feedback to navigate towards the standard scan planes under the restrictions in real-world US scans. Furthermore, we propose a confidence-based approach to encode the optimization of image quality in the learning process. We validate our method in a simulation environment built with real-world data collected in the US imaging of the spine. Experimental results demonstrate that our method can perform reproducible US probe navigation towards the standard scan plane with an accuracy of 4.91 mm/4.65° in the intra-patient setting, and accomplish the task in the intra- and inter-patient settings with a success rate of 92% and 46%, respectively. The results also show that the introduction of image quality optimization in our method can effectively improve the navigation performance.
Submitted 26 August, 2021; v1 submitted 28 February, 2021;
originally announced March 2021.
-
Generative Adversarial Network based Heuristics for Sampling-based Path Planning
Authors:
Tianyi Zhang,
Jiankun Wang,
Max Q. -H. Meng
Abstract:
Sampling-based path planning is a popular methodology for robot path planning. With a uniform sampling strategy to explore the state space, a feasible path can be found without the complex geometric modeling of the configuration space. However, the quality of initial solution is not guaranteed and the convergence speed to the optimal solution is slow. In this paper, we present a novel image-based path planning algorithm to overcome these limitations. Specifically, a generative adversarial network (GAN) is designed to take the environment map (denoted as RGB image) as the input without other preprocessing works. The output is also an RGB image where the promising region (where a feasible path probably exists) is segmented. This promising region is utilized as a heuristic to achieve nonuniform sampling for the path planner. We conduct a number of simulation experiments to validate the effectiveness of the proposed method, and the results demonstrate that our method performs much better in terms of the quality of initial solution and the convergence speed to the optimal solution. Furthermore, apart from the environments similar to the training set, our method also works well on the environments which are very different from the training set.
Submitted 7 December, 2020;
originally announced December 2020.
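The abstract above uses a predicted promising region as a heuristic for non-uniform sampling. One common way to do this can be sketched as follows, assuming the region has already been extracted as a list of grid cells. The `bias` probability, the function name, and the fallback to uniform sampling are assumptions made for this example rather than details taken from the paper.

```python
import random


def sample_point(promising_cells, width, height, bias=0.8, rng=random):
    """Draw one sample for a sampling-based planner.

    With probability `bias`, pick a cell from the promising region;
    otherwise sample uniformly over the whole map, so the planner can
    still reach areas outside the predicted region.
    """
    if promising_cells and rng.random() < bias:
        return rng.choice(promising_cells)
    return (rng.randrange(width), rng.randrange(height))
```

Keeping a uniform component is a design choice: if the predicted region misses the true corridor, the planner still explores the rest of the map, only more slowly.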
-
Efficient Heuristic Generation for Robot Path Planning with Recurrent Generative Model
Authors:
Zhaoting Li,
Jiankun Wang,
Max Q. -H. Meng
Abstract:
Robot path planning is difficult to solve due to the contradiction between optimality of results and complexity of algorithms, even in 2D environments. To find an optimal path, the algorithm needs to search all the state space, which costs a lot of computation resource. To address this issue, we present a novel recurrent generative model (RGM) which generates efficient heuristic to reduce the search efforts of path planning algorithm. This RGM model adopts the framework of general generative adversarial networks (GAN), which consists of a novel generator that can generate heuristic by refining the outputs recurrently and two discriminators that check the connectivity and safety properties of heuristic. We test the proposed RGM module in various 2D environments to demonstrate its effectiveness and efficiency. The results show that the RGM successfully generates appropriate heuristic in both seen and new unseen maps with a high accuracy, demonstrating the good generalization ability of this model. We also compare the rapidly-exploring random tree star (RRT*) with generated heuristic and the conventional RRT* in four different maps, showing that the generated heuristic can guide the algorithm to find both initial and optimal solution in a faster and more efficient way.
Submitted 7 December, 2020;
originally announced December 2020.
-
Conditional Generative Adversarial Networks for Optimal Path Planning
Authors:
Nachuan Ma,
Jiankun Wang,
Max Q. -H. Meng
Abstract:
Path planning plays an important role in autonomous robot systems. Effective understanding of the surrounding environment and efficient generation of an optimal collision-free path are both critical parts of solving the path planning problem. Although conventional sampling-based algorithms, such as the rapidly-exploring random tree (RRT) and its improved optimal version (RRT*), have been widely used in path planning problems because of their ability to find a feasible path in even complex environments, they fail to find an optimal path efficiently. To solve this problem and satisfy the two aforementioned requirements, we propose a novel learning-based path planning algorithm which consists of a novel generative model based on the conditional generative adversarial networks (CGAN) and a modified RRT* algorithm (denoted by CGAN-RRT*). Given the map information, our CGAN model can generate an efficient possibility distribution of feasible paths, which can be utilized by the CGAN-RRT* algorithm to find the optimal path with a non-uniform sampling strategy. The CGAN model is trained by learning from ground truth maps, each of which is generated by putting all the results of executing the RRT algorithm 50 times on one raw map. We demonstrate the efficient performance of this CGAN model by testing it on two groups of maps and comparing the CGAN-RRT* algorithm with the conventional RRT* algorithm.
Submitted 5 December, 2020;
originally announced December 2020.
-
Search-Based Online Trajectory Planning for Car-like Robots in Highly Dynamic Environments
Authors:
Jiahui Lin,
Tong Zhou,
Delong Zhu,
Jianbang Liu,
Max Q. -H. Meng
Abstract:
This paper presents a search-based partial motion planner to generate dynamically feasible trajectories for car-like robots in highly dynamic environments. The planner searches for smooth, safe, and near-time-optimal trajectories by exploring a state graph built on motion primitives, which are generated by discretizing the time dimension and the control space. To enable fast online planning, we first propose an efficient path searching algorithm based on the aggregation and pruning of motion primitives. We then propose a fast collision checking algorithm that takes into account the motions of moving obstacles. The algorithm linearizes relative motions between the robot and obstacles and then checks collisions by comparing a point-line distance. Benefiting from the fast searching and collision checking algorithms, the planner can effectively and safely explore the state-time space to generate near-time-optimal solutions. The results through extensive experiments show that the proposed method can generate feasible trajectories within milliseconds while maintaining a higher success rate than up-to-date methods, which significantly demonstrates its advantages.
Submitted 6 November, 2020;
originally announced November 2020.
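The collision check described above linearizes the relative motion between robot and obstacle and compares a point-line distance. A minimal 2D sketch of that idea follows, assuming both bodies move at constant velocity over one time step and are approximated by discs. The function names and the `safe_radius` parameter are hypothetical, and the paper's actual formulation may differ.

```python
import math


def min_relative_distance(p_robot, v_robot, p_obs, v_obs, dt):
    """Smallest robot-obstacle distance over [0, dt] at constant velocities.

    In the robot's frame the obstacle moves along a straight segment, so
    the problem reduces to the distance from the origin to that segment.
    """
    # Relative position and velocity of the obstacle w.r.t. the robot.
    rx, ry = p_obs[0] - p_robot[0], p_obs[1] - p_robot[1]
    vx, vy = v_obs[0] - v_robot[0], v_obs[1] - v_robot[1]
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return math.hypot(rx, ry)  # no relative motion
    # Time of closest approach, clamped to the step [0, dt].
    t = max(0.0, min(dt, -(rx * vx + ry * vy) / speed_sq))
    return math.hypot(rx + vx * t, ry + vy * t)


def collides(p_robot, v_robot, p_obs, v_obs, dt, safe_radius):
    """True if the two discs come closer than safe_radius during the step."""
    return min_relative_distance(p_robot, v_robot, p_obs, v_obs, dt) < safe_radius
```

Because the check is a closed-form expression per obstacle and per motion primitive, it avoids stepping through the trajectory at fine time resolution.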
-
Online State-Time Trajectory Planning Using Timed-ESDF in Highly Dynamic Environments
Authors:
Delong Zhu,
Tong Zhou,
Jiahui Lin,
Yuqi Fang,
Max Q. -H. Meng
Abstract:
Online state-time trajectory planning in highly dynamic environments remains an unsolved problem due to the unpredictable motions of moving obstacles and the curse of dimensionality from the state-time space. Existing state-time planners are typically implemented based on randomized sampling approaches or path searching on discretized state graph. The smoothness, path clearance, and planning efficiency of these planners are usually not satisfying. In this work, we propose a gradient-based planner over the state-time space for online trajectory generation in highly dynamic environments. To enable the gradient-based optimization, we propose a Timed-ESDT that supports distance and gradient queries with state-time keys. Based on the Timed-ESDT, we also define a smooth prior and an obstacle likelihood function that is compatible with the state-time space. The trajectory planning is then formulated to a MAP problem and solved by an efficient numerical optimizer. Moreover, to improve the optimality of the planner, we also define a state-time graph and then conduct path searching on it to find a better initialization for the optimizer. By integrating the graph searching, the planning quality is significantly improved. Experiment results on simulated and benchmark datasets show that our planner can outperform the state-of-the-art methods, demonstrating its significant advantages over the traditional ones.
Submitted 29 October, 2020;
originally announced October 2020.
-
Pedestrian Motion Tracking by Using Inertial Sensors on the Smartphone
Authors:
Yingying Wang,
Hu Cheng,
Max Q. H. Meng
Abstract:
Inertial Measurement Unit (IMU) has long been a dream for stable and reliable motion estimation, especially in indoor environments where GPS strength limits. In this paper, we propose a novel method for position and orientation estimation of a moving object only from a sequence of IMU signals collected from the phone. Our main observation is that human motion is monotonous and periodic. We adopt the Extended Kalman Filter and use the learning-based method to dynamically update the measurement noise of the filter. Our pedestrian motion tracking system intends to accurately estimate planar position, velocity, heading direction without restricting the phone's daily use. The method is not only tested on the self-collected signals, but also provides accurate position and velocity estimations on the public RIDI dataset, i.e., the absolute transmit error is 1.28m for a 59-second sequence.
Submitted 18 September, 2020;
originally announced September 2020.
-
Mobile Robot Path Planning in Dynamic Environments: A Survey
Authors:
Kuanqi Cai,
Chaoqun Wang,
Jiyu Cheng,
Clarence W De Silva,
Max Q. -H. Meng
Abstract:
There are many challenges for robot navigation in densely populated dynamic environments. This paper presents a survey of the path planning methods for robot navigation in dense environments. Particularly, the path planning in the navigation framework of mobile robots is composed of global path planning and local path planning, with regard to the planning scope and the executability. Within this framework, the recent progress of the path planning methods is presented in the paper, while examining their strengths and weaknesses. Notably, the recently developed Velocity Obstacle method and its variants that serve as the local planner are analyzed comprehensively. Moreover, as a model-free method that is widely used in current robot applications, the reinforcement learning-based path planning algorithms are detailed in this paper.
Submitted 22 March, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Autonomous Removal of Perspective Distortion for Robotic Elevator Button Recognition
Authors:
Delong Zhu,
Jianbang Liu,
Nachuan Ma,
Zhe Min,
Max Q. -H. Meng
Abstract:
Elevator button recognition is considered an indispensable function for enabling the autonomous elevator operation of mobile robots. However, due to unfavorable image conditions and various image distortions, the recognition accuracy remains to be improved. In this paper, we present a novel algorithm that can autonomously correct perspective distortions of elevator panel images. The algorithm first leverages the Gaussian Mixture Model (GMM) to conduct a grid fitting process based on button recognition results, then utilizes the estimated grid centers as reference features to estimate camera motions for correcting perspective distortions. The algorithm performs on a single image autonomously and does not need explicit feature detection or feature matching procedure, which is much more robust to noises and outliers than traditional feature-based geometric approaches. To verify the effectiveness of the algorithm, we collect an elevator panel dataset of 50 images captured from different angles of view. Experimental results show that the proposed algorithm can accurately estimate camera motions and effectively remove perspective distortions.
Submitted 25 December, 2019;
originally announced December 2019.
-
Learning Hierarchical Control for Robust In-Hand Manipulation
Authors:
Tingguang Li,
Krishnan Srinivasan,
Max Qing-Hu Meng,
Wenzhen Yuan,
Jeannette Bohg
Abstract:
Robotic in-hand manipulation has been a long-standing challenge due to the complexity of modelling hand and object in contact and of coordinating finger motion for complex manipulation sequences. To address these challenges, the majority of prior work has either focused on model-based, low-level controllers or on model-free deep reinforcement learning that each have their own limitations. We propose a hierarchical method that relies on traditional, model-based controllers on the low-level and learned policies on the mid-level. The low-level controllers can robustly execute different manipulation primitives (reposing, sliding, flipping). The mid-level policy orchestrates these primitives. We extensively evaluate our approach in simulation with a 3-fingered hand that controls three degrees of freedom of elongated objects. We show that our approach can move objects between almost all the possible poses in the workspace while keeping them firmly grasped. We also show that our approach is robust to inaccuracies in the object models and to observation noise. Finally, we show how our approach generalizes to objects of other shapes.
Submitted 24 October, 2019;
originally announced October 2019.
-
Learning to Solve a Rubik's Cube with a Dexterous Hand
Authors:
Tingguang Li,
Weitao Xi,
Meng Fang,
Jia Xu,
Max Qing-Hu Meng
Abstract:
We present a learning-based approach to solving a Rubik's cube with a multi-fingered dexterous hand. Despite the promising performance of dexterous in-hand manipulation, solving complex tasks which involve multiple steps and diverse internal object structure has remained an important, yet challenging task. In this paper, we tackle this challenge with a hierarchical deep reinforcement learning method, which separates planning and manipulation. A model-based cube solver finds an optimal move sequence for restoring the cube and a model-free cube operator controls all five fingers to execute each move step by step. To train our models, we build a high-fidelity simulator which manipulates a Rubik's Cube, an object containing high-dimensional state space, with a 24-DoF robot hand. Extensive experiments on 1400 randomly scrambled Rubik's cubes demonstrate the effectiveness of our method, achieving an average success rate of 90.3%.
Submitted 26 July, 2019;
originally announced July 2019.
-
Coverage Sampling Planner for UAV-enabled Environmental Exploration and Field Mapping
Authors:
Teng Li,
Chaoqun Wang,
Max Q. -H. Meng,
Clarence W. de Silva
Abstract:
Unmanned Aerial Vehicles (UAVs) have been used for environmental monitoring thanks to their capabilities for mobile sensing, autonomous navigation, and remote operation. However, in real-world applications, the limited on-board resources (e.g., power supply) of UAVs constrain the coverage of the monitored area and the number of acquired samples, which hinders the performance of field estimation and mapping. Therefore, the issue of constrained resources calls for an efficient sampling planner to schedule UAV-based sensing tasks in environmental monitoring. This paper presents a mission planner of coverage sampling and path planning for a UAV-enabled mobile sensor to effectively explore and map an unknown environment that is modeled as a random field. The proposed planner can generate a coverage path with an optimal coverage density for exploratory sampling, and the associated energy cost is subject to a power supply constraint. The performance of the developed framework is evaluated and compared with existing state-of-the-art algorithms, using a real-world dataset collected from an environmental monitoring program as well as physical field experiments. The experimental results illustrate the reliability and accuracy of the presented coverage sampling planner in a prior survey for environmental exploration and field mapping.
Submitted 12 July, 2019;
originally announced July 2019.
-
HouseExpo: A Large-scale 2D Indoor Layout Dataset for Learning-based Algorithms on Mobile Robots
Authors:
Tingguang Li,
Danny Ho,
Chenming Li,
Delong Zhu,
Chaoqun Wang,
Max Q. -H. Meng
Abstract:
As one of the most promising research areas, mobile robots have drawn much attention in recent years. Current work in this field is often evaluated in a few manually designed scenarios, due to the lack of a common experimental platform. Meanwhile, with the recent development of deep learning techniques, some researchers have attempted to apply learning-based methods to mobile robot tasks, which requires a substantial amount of data. To satisfy this demand, in this paper we build HouseExpo, a large-scale indoor layout dataset containing 35,126 2D floor plans with 252,550 rooms in total. We also develop Pseudo-SLAM, a lightweight and efficient simulation platform that accelerates the data generation procedure, thereby speeding up the training process. In our experiments, we build models to tackle obstacle avoidance and autonomous exploration from a learning perspective, in simulation as well as in real-world experiments, to verify the effectiveness of our simulator and dataset. All data and code are available online, and we hope HouseExpo and Pseudo-SLAM can meet the need for data and benefit the whole community.
Submitted 30 July, 2020; v1 submitted 23 March, 2019;
originally announced March 2019.
-
SRM: An Efficient Framework for Autonomous Robotic Exploration in Indoor Environments
Authors:
Chaoqun Wang,
Delong Zhu,
Teng Li,
Max Q. -H. Meng,
Clarence De. Silva
Abstract:
In this paper, we propose an integrated framework for autonomous robotic exploration in indoor environments. Specifically, we present a hybrid map, named Semantic Road Map (SRM), to represent the topological structure of the explored environment and facilitate decision-making during exploration. The SRM is built incrementally along with the exploration process. It is a graph structure with collision-free nodes and edges that are generated within the sensor coverage. Moreover, each node has a semantic label and the expected information gain at that location. Based on the concise SRM, we present a novel and effective decision-making model to determine the next-best-target (NBT) during the exploration. The model considers the semantic information, the information gain, and the path cost to the target location. We use the nodes of the SRM to represent the candidate targets, which enables the target evaluation to be performed directly on the SRM. With the SRM, both the information gain of a node and the path cost to the node can be obtained efficiently. In addition, we adopt the cross-entropy method to optimize the path to make it more informative. We conduct experimental studies in both simulated and real-world environments, which demonstrate the effectiveness of the proposed method.
Submitted 24 December, 2018;
originally announced December 2018.
-
Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration
Authors:
Tingguang Li,
** Pan,
Delong Zhu,
Max Q. -H. Meng
Abstract:
To achieve scenario intelligence, humans must transfer knowledge to robots by developing goal-oriented algorithms, which are sometimes insensitive to dynamically changing environments. While deep reinforcement learning has achieved significant success recently, it remains extremely difficult to deploy directly on real robots. In this paper, we propose a hybrid structure named Option-Interruption in which human knowledge is embedded into a hierarchical reinforcement learning framework. Our architecture has two key components: options, represented by existing human-designed methods, which can significantly speed up the training process; and an interruption mechanism, based on learnable termination functions, which enables our system to respond quickly to the external environment. To implement this architecture, we derive a set of update rules based on policy gradient methods and present a complete training process. In the experiments, our method is evaluated on Four-room navigation and exploration tasks, and the results show the efficiency and flexibility of our framework.
Submitted 29 July, 2018;
originally announced July 2018.
-
Autonomous Mobile Robot Navigation in Uneven and Unstructured Indoor Environments
Authors:
Chaoqun Wang,
Lili Meng,
Sizhen She,
Ian M. Mitchell,
Teng Li,
Frederick Tung,
Weiwei Wan,
Max Q. -H. Meng,
Clarence W. de Silva
Abstract:
Robots are increasingly operating in indoor environments designed for and shared with people. However, robots working safely and autonomously in uneven and unstructured environments still face great challenges. Many modern indoor environments are designed with wheelchair accessibility in mind. This presents an opportunity for wheeled robots to navigate through sloped areas while avoiding staircases. In this paper, we present an integrated software and hardware system for autonomous mobile robot navigation in uneven and unstructured indoor environments. This modular and reusable software framework incorporates perception and navigation capabilities. Our robot first builds a 3D OctoMap representation of the uneven environment through 3D mapping using wheel odometry, 2D laser, and RGB-D data. Then we project multilayer 2D occupancy maps from the OctoMap to generate the traversable map based on layer differences. The safe traversable map serves as the input for efficient autonomous navigation. Furthermore, we employ variable-step-size Rapidly Exploring Random Trees that adjust the step size automatically, eliminating the need to tune step sizes for each environment. We conduct extensive experiments in simulation and in the real world, demonstrating the efficacy and efficiency of our system.
Submitted 28 October, 2017;
originally announced October 2017.
-
Inverse Reinforcement Learning with Multi-Relational Chains for Robot-Centered Smart Home
Authors:
Kun Li,
Max Q. -H. Meng
Abstract:
In a robot-centered smart home, the robot observes the home states with its own sensors, and then it can change certain object states according to an operator's commands for remote operations, or imitate the operator's behaviors in the house for autonomous operations. To model the robot's imitation of the operator's behaviors in a dynamic indoor environment, we use multi-relational chains to describe the changes of environment states, and apply inverse reinforcement learning to encode the operator's behaviors with a learned reward function. We implement this approach on a mobile robot and conduct five experiments with increasing numbers of training days, objects, and action types. In addition, a baseline method that directly records the operator's behaviors is implemented, and the two are compared on the accuracy of home state evaluation and the accuracy of robot action selection. The results show that the proposed approach handles dynamic environments well and guides the robot's actions in the house more accurately.
Submitted 17 April, 2015; v1 submitted 16 August, 2014;
originally announced August 2014.
-
Object Structure from Manipulation via Particle Filter and Robot-based Active Learning
Authors:
Kun Li,
Max Q. -H. Meng
Abstract:
To learn object models for robotic manipulation, unsupervised methods cannot provide accurate object structural information and supervised methods require a large amount of manually labeled training samples; thus, interactive object segmentation is developed to automate object modeling. In this article, we formulate a novel dynamic process for interactive object segmentation, and develop a solution based on a particle filter and active learning so that a robot can manipulate and learn object structures incrementally and automatically. We demonstrate our method with a humanoid robot on different types of objects, and compare its segmentation performance with established methods on selected objects. The results show that our approach allows more accurate object modeling and reveals richer object structural information.
Submitted 17 April, 2015; v1 submitted 16 August, 2014;
originally announced August 2014.