Online Time-Informed Kinodynamic Motion Planning of Nonlinear Systems

Authors: Fei Meng, Jianbang Liu, Haojie Shi, Han Ma, Hongliang Ren, Max Q. -H. Meng

Abstract: Sampling-based kinodynamic motion planners (SKMPs) are powerful in finding collision-free trajectories for high-dimensional systems under differential constraints. Time-informed set (TIS) can provide the heuristic search domain to accelerate their convergence to the time-optimal solution. However, existing TIS approximation methods suffer from the curse of dimensionality, computational burden, and limited system applicable scope, e.g., linear and polynomial nonlinear systems. To overcome these problems, we propose a method by leveraging deep learning technology, Koopman operator theory, and random set theory. Specifically, we propose a Deep Invertible Koopman operator with control U model named DIKU to predict states forward and backward over a long horizon by modifying the auxiliary network with an invertible neural network. A sampling-based approach, ASKU, performing reachability analysis for the DIKU is developed to approximate the TIS of nonlinear control systems online. Furthermore, we design an online time-informed SKMP using a direct sampling technique to draw uniform random samples in the TIS. Simulation experiment results demonstrate that our method outperforms other existing works, approximating TIS in near real-time and achieving superior planning performance in several time-optimal kinodynamic motion planning problems.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2406.00707 [pdf, other]

QUADFormer: Learning-based Detection of Cyber Attacks in Quadrotor UAVs

Authors: Pengyu Wang, Zhaohua Yang, Nachuan Yang, Zikai Wang, Jialu Li, Fan Zhang, Chaoqun Wang, Jiankun Wang, Max Q. -H. Meng, Ling Shi

Abstract: Safety-critical intelligent cyber-physical systems, such as quadrotor unmanned aerial vehicles (UAVs), are vulnerable to different types of cyber attacks, and the absence of timely and accurate attack detection can lead to severe consequences. When UAVs are engaged in large outdoor maneuvering flights, their system constitutes highly nonlinear dynamics that include non-Gaussian noises. Therefore, the commonly employed traditional statistics-based and emerging learning-based attack detection methods do not yield satisfactory results. In response to the above challenges, we propose QUADFormer, a novel Quadrotor UAV Attack Detection framework with transFormer-based architecture. This framework includes a residue generator designed to generate a residue sequence sensitive to anomalies. Subsequently, this sequence is fed into a transformer structure with disparity in correlation to specifically learn its statistical characteristics for the purpose of classification and attack detection. Finally, we design an alert module to ensure the safe execution of tasks by UAVs under attack conditions. We conduct extensive simulations and real-world experiments, and the results show that our method has achieved superior detection performance compared with many state-of-the-art methods.

Submitted 14 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00706 [pdf, other]

MINER-RRT*: A Hierarchical and Fast Trajectory Planning Framework in 3D Cluttered Environments

Authors: Pengyu Wang, Jiawei Tang, Hin Wang Lin, Fan Zhang, Chaoqun Wang, Jiankun Wang, Ling Shi, Max Q. -H. Meng

Abstract: Trajectory planning for quadrotors in cluttered environments has been challenging in recent years. While many trajectory planning frameworks have been successful, there still exists potential for improvements, particularly in enhancing the speed of generating efficient trajectories. In this paper, we present a novel hierarchical trajectory planning framework to reduce computational time and memory usage called MINER-RRT*, which consists of two main components. First, we propose a sampling-based path planning method boosted by neural networks, where the predicted heuristic region accelerates the convergence of rapidly-exploring random trees. Second, we utilize the optimal conditions derived from the quadrotor's differential flatness properties to construct polynomial trajectories that minimize control effort in multiple stages. Extensive simulation and real-world experimental results demonstrate that, compared to several state-of-the-art (SOTA) approaches, our method can generate high-quality trajectories with better performance in 3D cluttered environments.

Submitted 14 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.19645 [pdf, other]

A Landmark-aware Network for Automated Cobb Angle Estimation Using X-ray Images

Authors: Jie Yang, Jiankun Wang, Max Q. -H. Meng

Abstract: Automated Cobb angle estimation based on X-ray images plays an important role in scoliosis diagnosis, treatment, and progression surveillance. The inadequate feature extraction and the noise in X-ray images are the main difficulties of automated Cobb angle estimation, and it is challenging to ensure that the calculated Cobb angle meets clinical requirements. To address these problems, we propose a Landmark-aware Network named LaNet with three components, Feature Robustness Enhancement Module (FREM), Landmark-aware Objective Function (LOF), and Cobb Angle Calculation Method (CACM), for automated Cobb angle estimation in this paper. To enhance feature extraction, FREM is designed to explore geometric and semantic constraints among landmarks, thus geometric and semantic correlations between landmarks are globally modeled, and robust landmark-based features are extracted. Furthermore, to mitigate the effect of background noise on landmark localization, LOF is proposed to focus more on the foreground near the landmarks and ignore irrelevant background pixels by exploiting category prior information of landmarks. In addition, we also advance CACM to locate the bending segments first and then calculate the Cobb angle within the bending segment, which facilitates the calculation of the clinical standardized Cobb angle. The experiment results on the AASCE dataset demonstrate that our proposed LaNet can significantly improve the Cobb angle estimation performance and outperform other state-of-the-art methods.

Submitted 29 May, 2024; originally announced May 2024.
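
The final step mentioned in the abstract above, computing a Cobb angle once the relevant vertebral endplates are known, can be illustrated with a minimal sketch. This is not the paper's CACM: it assumes the two endplate lines of a bending segment have already been located (for example from predicted landmarks) and only measures the angle between them. The function name `cobb_angle_deg` and the point-pair input format are hypothetical choices made for this illustration.

```python
import math


def cobb_angle_deg(upper_endplate, lower_endplate):
    """Angle in degrees between two endplate lines.

    Each line is a pair of image points ((x1, y1), (x2, y2)).
    Hypothetical helper for illustration, not the paper's CACM.
    """
    def direction(line):
        (x1, y1), (x2, y2) = line
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            raise ValueError("endplate line needs two distinct points")
        return dx, dy

    ux, uy = direction(upper_endplate)
    lx, ly = direction(lower_endplate)
    dot = ux * lx + uy * ly
    cross = ux * ly - uy * lx
    # atan2 on (|cross|, dot) gives the angle between the vectors in [0, 180].
    angle = math.degrees(math.atan2(abs(cross), dot))
    # Endplate lines have no orientation, so fold the result into [0, 90].
    return min(angle, 180.0 - angle)


if __name__ == "__main__":
    # Made-up endplates: one horizontal, one tilted by about 45 degrees.
    upper = ((0.0, 0.0), (1.0, 0.0))
    lower = ((0.0, 0.0), (1.0, 1.0))
    print(cobb_angle_deg(upper, lower))
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` shows for nearly parallel lines, which matters for small curvatures.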

arXiv:2404.00578 [pdf, other]

M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models

Authors: Fan Bai, Yuxin Du, Tiejun Huang, Max Q. -H. Meng, Bo Zhao

Abstract: Medical image analysis is essential to clinical diagnosis and treatment, which is increasingly supported by multi-modal large language models (MLLMs). However, previous research has primarily focused on 2D medical images, leaving 3D images under-explored, despite their richer spatial information. This paper aims to advance 3D medical image analysis with MLLMs. To this end, we present a large-scale 3D multi-modal medical dataset, M3D-Data, comprising 120K image-text pairs and 662K instruction-response pairs specifically tailored for various 3D medical tasks, such as image-text retrieval, report generation, visual question answering, positioning, and segmentation. Additionally, we propose M3D-LaMed, a versatile multi-modal large language model for 3D medical image analysis. Furthermore, we introduce a new 3D multi-modal medical benchmark, M3D-Bench, which facilitates automatic evaluation across eight tasks. Through comprehensive evaluation, our method proves to be a robust model for 3D medical image analysis, outperforming existing solutions. All code, data, and models are publicly available at: https://github.com/BAAI-DCAI/M3D.

Submitted 31 March, 2024; originally announced April 2024.

Comments: MLLM, 3D medical image analysis

arXiv:2403.01962 [pdf, other]

An Efficient Model-Based Approach on Learning Agile Motor Skills without Reinforcement

Authors: Haojie Shi, Tingguang Li, Qingxu Zhu, Jiapeng Sheng, Lei Han, Max Q. -H. Meng

Abstract: Learning-based methods have improved locomotion skills of quadruped robots through deep reinforcement learning. However, the sim-to-real gap and low sample efficiency still limit the skill transfer. To address this issue, we propose an efficient model-based learning framework that combines a world model with a policy network. We train a differentiable world model to predict future states and use it to directly supervise a Variational Autoencoder (VAE)-based policy network to imitate real animal behaviors. This significantly reduces the need for real interaction data and allows for rapid policy updates. We also develop a high-level network to track diverse commands and trajectories. Our simulated results show a tenfold sample efficiency increase compared to reinforcement learning methods such as PPO. In real-world testing, our policy achieves proficient command-following performance with only a two-minute data collection period and generalizes well to new speeds and paths.

Submitted 18 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: Accepted by ICRA2024

arXiv:2401.08433 [pdf, other]

Autonomous Multiple-Trolley Collection System with Nonholonomic Robots: Design, Control, and Implementation

Authors: Peijia Xie, Bingyi Xia, Anjun Hu, Ziqi Zhao, Lingxiao Meng, Zhirui Sun, Xuheng Gao, Jiankun Wang, Max Q. -H. Meng

Abstract: The intricate and multi-stage task in dynamic public spaces like luggage trolley collection in airports presents both a promising opportunity and an ongoing challenge for automated service robots. Previous research has primarily focused on handling a single trolley or individual functional components, creating a gap in providing cost-effective and efficient solutions for practical scenarios. In this paper, we propose a mobile manipulation robot incorporated with an autonomy framework for the collection and transportation of multiple trolleys that can significantly enhance operational efficiency. We address the key challenges in the trolley collection problem through the novel design of the mechanical system and the vision-based control strategy. We design a lightweight manipulator and docking mechanism, optimized for the sequential stacking and transportation of multiple trolleys. Additionally, based on the Control Lyapunov Function and Control Barrier Function, we propose a novel vision-based control with the online Quadratic Programming which significantly improves the accuracy and efficiency of the collection process. The practical application of our system is demonstrated in real world scenarios, where it successfully executes multiple-trolley collection tasks.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2312.17076 [pdf, other]

Minimally-intrusive Navigation in Dense Crowds with Integrated Macro and Micro-level Dynamics

Authors: Tong Zhou, Senmao Qi, Guangdu Cen, Ziqi Zha, Erli Lyu, Jiaole Wang, Max Q. -H. Meng

Abstract: In mobile robot navigation, despite advancements, the generation of optimal paths often disrupts pedestrian areas. To tackle this, we propose three key contributions to improve human-robot coexistence in shared spaces. Firstly, we have established a comprehensive framework to understand disturbances at individual and flow levels. Our framework provides specialized computational strategies for in-depth studies of human-robot interactions from both micro and macro perspectives. By employing novel penalty terms, namely Flow Disturbance Penalty (FDP) and Individual Disturbance Penalty (IDP), our framework facilitates a more nuanced assessment and analysis of the robot navigation's impact on pedestrians. Secondly, we introduce an innovative sampling-based navigation system that adeptly integrates a suite of safety measures with the predictability of robotic movements. This system not only accounts for traditional factors such as trajectory length and travel time but also actively incorporates pedestrian awareness. Our navigation system aims to minimize disturbances and promote harmonious coexistence by considering safety protocols, trajectory clarity, and pedestrian engagement. Lastly, we validate our algorithm's effectiveness and real-time performance through simulations and real-world tests, demonstrating its ability to navigate with minimal pedestrian disturbance in various environments.

Submitted 28 December, 2023; originally announced December 2023.

Comments: 23 pages, 13 figures

MSC Class: 68T40 ACM Class: I.2.9

arXiv:2310.09937 [pdf, other]

Joint Sparse Representations and Coupled Dictionary Learning in Multi-Source Heterogeneous Image Pseudo-color Fusion

Authors: Long Bai, Shilong Yao, Kun Gao, Yanjun Huang, Ruijie Tang, Hong Yan, Max Q. -H. Meng, Hongliang Ren

Abstract: Considering that Coupled Dictionary Learning (CDL) method can obtain a reasonable linear mathematical relationship between resource images, we propose a novel CDL-based Synthetic Aperture Radar (SAR) and multispectral pseudo-color fusion method. Firstly, the traditional Brovey transform is employed as a pre-processing method on the paired SAR and multispectral images. Then, CDL is used to capture the correlation between the pre-processed image pairs based on the dictionaries generated from the source images via enforced joint sparse coding. Afterward, the joint sparse representation in the pair of dictionaries is utilized to construct an image mask via calculating the reconstruction errors, and therefore generate the final fusion image. The experimental verification results of the SAR images from the Sentinel-1 satellite and the multispectral images from the Landsat-8 satellite show that the proposed method can achieve superior visual effects, and excellent quantitative performance in terms of spectral distortion, correlation coefficient, MSE, NIQE, BRISQUE, and PIQE.

Submitted 15 October, 2023; originally announced October 2023.

Comments: To appear in IEEE Sensors Journal

arXiv:2310.04675 [pdf, other]

Terrain-Aware Quadrupedal Locomotion via Reinforcement Learning

Authors: Haojie Shi, Qingxu Zhu, Lei Han, Wanchao Chi, Tingguang Li, Max Q. -H. Meng

Abstract: In nature, legged animals have developed the ability to adapt to challenging terrains through perception, allowing them to plan safe body and foot trajectories in advance, which leads to safe and energy-efficient locomotion. Inspired by this observation, we present a novel approach to train a Deep Neural Network (DNN) policy that integrates proprioceptive and exteroceptive states with a parameterized trajectory generator for quadruped robots to traverse rough terrains. Our key idea is to use a DNN policy that can modify the parameters of the trajectory generator, such as foot height and frequency, to adapt to different terrains. To encourage the robot to step on safe regions and save energy consumption, we propose foot terrain reward and lifting foot height reward, respectively. By incorporating these rewards, our method can learn a safer and more efficient terrain-aware locomotion policy that can move a quadruped robot flexibly in any direction. To evaluate the effectiveness of our approach, we conduct simulation experiments on challenging terrains, including stairs, stepping stones, and poles. The simulation results demonstrate that our approach can successfully direct the robot to traverse such tough terrains in any direction. Furthermore, we validate our method on a real legged robot, which learns to traverse stepping stones with gaps over 25.5cm.

Submitted 10 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2309.15079 [pdf, other]

Towards High Efficient Long-horizon Planning with Expert-guided Motion-encoding Tree Search

Authors: Tong Zhou, Erli Lyu, Jiaole Wang, Guangdu Cen, Ziqi Zha, Senmao Qi, Max Q. -H. Meng

Abstract: Autonomous driving holds promise for increased safety, optimized traffic management, and a new level of convenience in transportation. While model-based reinforcement learning approaches such as MuZero enable long-term planning, the exponential increase in the number of search nodes as the tree goes deeper significantly affects the searching efficiency. To deal with this problem, in this paper we propose the expert-guided motion-encoding tree search (EMTS) algorithm. EMTS extends the MuZero algorithm by representing possible motions with a comprehensive motion primitives latent space and incorporating expert policies to improve the searching efficiency. The comprehensive motion primitives latent space enables EMTS to sample arbitrary trajectories instead of raw actions to reduce the depth of the search tree. The incorporation of expert policies guides the search and training phases of the EMTS algorithm to enable early convergence. In the experiment section, the EMTS algorithm is compared with four other algorithms in three challenging scenarios. The experimental results verify the effectiveness and the searching efficiency of the proposed EMTS algorithm.

Submitted 30 September, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: 7 pages, 5 figures

MSC Class: 68T40 ACM Class: I.2.9

arXiv:2309.13813 [pdf, other]

Efficient RRT*-based Safety-Constrained Motion Planning for Continuum Robots in Dynamic Environments

Authors: Peiyu Luo, Shilong Yao, Yiyao Yue, Jiankun Wang, Hong Yan, Max Q. -H. Meng

Abstract: Continuum robots, characterized by their high flexibility and infinite degrees of freedom (DoFs), have gained prominence in applications such as minimally invasive surgery and hazardous environment exploration. However, the intrinsic complexity of continuum robots requires a significant amount of time for their motion planning, posing a hurdle to their practical implementation. To tackle these challenges, efficient motion planning methods such as Rapidly Exploring Random Trees (RRT) and its variant, RRT*, have been employed. This paper introduces a unique RRT*-based motion control method tailored for continuum robots. Our approach embeds safety constraints derived from the robots' posture states, facilitating autonomous navigation and obstacle avoidance in rapidly changing environments. Simulation results show efficient trajectory planning amidst multiple dynamic obstacles and provide a robust performance evaluation based on the generated postures. Finally, preliminary tests were conducted on a two-segment cable-driven continuum robot prototype, confirming the effectiveness of the proposed planning approach. This method is versatile and can be adapted and deployed for various types of continuum robots through parameter adjustments.

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.12660 [pdf, ps, other]

Disturbance Rejection Control for Autonomous Trolley Collection Robots with Prescribed Performance

Authors: Rui-Dong Xi, Liang Lu, Xue Zhang, Xiao Xiao, Bingyi Xia, Jiankun Wang, Max Q. -H. Meng

Abstract: Trajectory tracking control of autonomous trolley collection robots (ATCR) is an ambitious work due to the complex environment, serious noise and external disturbances. This work investigates a control scheme for ATCR subject to severe environmental interference. A kinematics model based adaptive sliding mode disturbance observer with fast convergence is first proposed to estimate the lumped disturbances. On this basis, a robust controller with prescribed performance is proposed using a backstepping technique, which improves the transient performance and guarantees fast convergence. Simulation outcomes have been provided to illustrate the effectiveness of the proposed control scheme.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.11107 [pdf, other]

Indoor Exploration and Simultaneous Trolley Collection Through Task-Oriented Environment Partitioning

Authors: Junjie Gao, Peijia Xie, Xuheng Gao, Zhirui Sun, Jiankun Wang, Max Q. -H. Meng

Abstract: In this paper, we present a simultaneous exploration and object search framework for the application of autonomous trolley collection. For environment representation, a task-oriented environment partitioning algorithm is presented to extract diverse information for each sub-task. First, LiDAR data is classified as potential objects, walls, and obstacles after outlier removal. Segmented point clouds are then transformed into a hybrid map with the following functional components: object proposals to avoid missing trolleys during exploration; room layouts for semantic space segmentation; and polygonal obstacles containing geometry information for efficient motion planning. For exploration and simultaneous trolley collection, we propose an efficient exploration-based object search method. First, a traveling salesman problem with precedence constraints (TSP-PC) is formulated by grouping frontiers and object proposals. The next target is selected by prioritizing object search while avoiding excessive robot backtracking. Then, feasible trajectories with adequate obstacle clearance are generated by topological graph search. We validate the proposed framework through simulations and demonstrate the system with real-world autonomous trolley collection tasks.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2308.14667 [pdf]

Neural Network-Based Histologic Remission Prediction In Ulcerative Colitis

Authors: Yemin li, Zhongcheng Liu, Xiaoying Lou, Mirigual Kurban, Miao Li, Jie Yang, Kaiwei Che, Jiankun Wang, Max Q. -H Meng, Yan Huang, Qin Guo, Pinjin Hu

Abstract: BACKGROUND & AIMS: Histological remission (HR) is advocated and considered as a new therapeutic target in ulcerative colitis (UC). Diagnosis of histologic remission currently relies on biopsy; during this process, patients are at risk for bleeding, infection, and post-biopsy fibrosis. In addition, histologic response scoring is complex and time-consuming, and there is heterogeneity among pathologists. Endocytoscopy (EC) is a novel ultra-high magnification endoscopic technique that can provide excellent in vivo assessment of glands. Based on the EC technique, we propose a neural network model that can assess histological disease activity in UC using EC images to address the above issues. The experiment results demonstrate that the proposed method can assist patients in precise treatment and prognostic assessment. METHODS: We construct a neural network model for UC evaluation. A total of 5105 images of 154 intestinal segments from 87 patients undergoing EC treatment at a center in China between March 2022 and March 2023 are scored according to the Geboes score. Subsequently, 103 intestinal segments are used as the training set, 16 intestinal segments are used as the validation set for neural network training, and the remaining 35 intestinal segments are used as the test set to measure the model performance together with the validation set. RESULTS: By treating HR as a negative category and histologic activity as a positive category, the proposed neural network model can achieve an accuracy of 0.9, a specificity of 0.95, a sensitivity of 0.75, and an area under the curve (AUC) of 0.81. CONCLUSION: We develop a specific neural network model that can distinguish histologic remission/activity in EC images of UC, which helps to accelerate clinical histological diagnosis. keywords: ulcerative colitis; Endocytoscopy; Geboes score; neural network.

Submitted 28 August, 2023; originally announced August 2023.
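
The accuracy, specificity, and sensitivity reported in the abstract above follow the standard confusion-matrix definitions, with histologic activity as the positive class and HR as the negative class. The sketch below shows those definitions only. The counts in the usage example are hypothetical numbers chosen for illustration, not data from the paper, and the function name `binary_metrics` is an assumption of this sketch. AUC is omitted because it needs per-sample scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    Positive class: histologic activity. Negative class: histologic remission.
    """
    positives = tp + fn  # samples that are truly positive
    negatives = tn + fp  # samples that are truly negative
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the paper).
    print(binary_metrics(tp=15, fp=4, tn=76, fn=5))
```

With imbalanced classes, accuracy is dominated by the larger class, which is why a sensitivity well below the accuracy is possible, as in the figures quoted in the abstract.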

arXiv:2308.05137 [pdf, other]

Discrepancy-based Active Learning for Weakly Supervised Bleeding Segmentation in Wireless Capsule Endoscopy Images

Authors: Fan Bai, Xiaohan Xing, Yutian Shen, Han Ma, Max Q. -H. Meng

Abstract: Weakly supervised methods, such as class activation maps (CAM) based, have been applied to achieve bleeding segmentation with low annotation efforts in Wireless Capsule Endoscopy (WCE) images. However, the CAM labels tend to be extremely noisy, and there is an irreparable gap between CAM labels and ground truths for medical images. This paper proposes a new Discrepancy-basEd Active Learning (DEAL) approach to bridge the gap between CAMs and ground truths with a few annotations. Specifically, to liberate labor, we design a novel discrepancy decoder model and a CAMPUS (CAM, Pseudo-label and groUnd-truth Selection) criterion to replace the noisy CAMs with accurate model predictions and a few human labels. The discrepancy decoder model is trained with a unique scheme to generate standard, coarse and fine predictions. And the CAMPUS criterion is proposed to predict the gaps between CAMs and ground truths based on model divergence and CAM divergence. We evaluate our method on the WCE dataset and results show that our method outperforms the state-of-the-art active learning methods and reaches comparable performance to those trained with full annotated datasets with only 10% of the training data labeled.

Submitted 9 August, 2023; originally announced August 2023.

Comments: accepted by MICCAI 2022

arXiv:2308.04911 [pdf, other]

SLPT: Selective Labeling Meets Prompt Tuning on Label-Limited Lesion Segmentation

Authors: Fan Bai, Ke Yan, Xiaoyu Bai, Xinyu Mao, Xiaoli Yin, Jingren Zhou, Yu Shi, Le Lu, Max Q. -H. Meng

Abstract: Medical image analysis using deep learning is often challenged by limited labeled data and high annotation costs. Fine-tuning the entire network in label-limited scenarios can lead to overfitting and suboptimal performance. Recently, prompt tuning has emerged as a more promising technique that introduces a few additional tunable parameters as prompts to a task-agnostic pre-trained model, and updates only these parameters using supervision from limited labeled data while keeping the pre-trained model unchanged. However, previous work has overlooked the importance of selective labeling in downstream tasks, which aims to select the most valuable downstream samples for annotation to achieve the best performance with minimum annotation cost. To address this, we propose a framework that combines selective labeling with prompt tuning (SLPT) to boost performance in limited labels. Specifically, we introduce a feature-aware prompt updater to guide prompt tuning and a TandEm Selective LAbeling (TESLA) strategy. TESLA includes unsupervised diversity selection and supervised selection using prompt-based uncertainty. In addition, we propose a diversified visual prompt tuning strategy to provide multi-prompt-based discrepant predictions for TESLA. We evaluate our method on liver tumor segmentation and achieve state-of-the-art performance, outperforming traditional fine-tuning with only 6% of tunable parameters, also achieving 94% of full-data performance by labeling only 5% of the data.

Submitted 9 August, 2023; originally announced August 2023.

Comments: accepted by MICCAI 2023

arXiv:2308.01533 [pdf, other]

Multi-robot Path Planning with Rapidly-exploring Random Disjointed-Trees

Authors: Biru Zhang, Jiankun Wang, Max Q. -H. Meng

Abstract: Multi-robot path planning is a computational process involving finding paths for each robot from its start to the goal while ensuring collision-free operation. It is widely used in robots and autonomous driving. However, the computational time of multi-robot path planning algorithms is enormous, resulting in low efficiency in practical applications. To address this problem, this article proposes a novel multi-robot path planning algorithm (Multi-Agent Rapidly-exploring Random Disjointed-Trees*, MA-RRdT*) based on multi-tree random sampling. The proposed algorithm is based on a single-robot path planning algorithm (Rapidly-exploring Random disjointed-Trees*, RRdT*). The novel MA-RRdT* algorithm has the advantages of fast speed, high space exploration efficiency, and suitability for complex maps. Comparative experiments are completed to evaluate the effectiveness of MA-RRdT*. The final experimental results validate the superior performance of the MA-RRdT* algorithm in terms of time cost and space exploration efficiency.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2308.01164 [pdf, other]

Virtual Reality Based Robot Teleoperation via Human-Scene Interaction

Authors: Lingxiao Meng, Jiangshan Liu, Wei Chai, Jiankun Wang, Max Q. -H. Meng

Abstract: Robot teleoperation gains great success in various situations, including chemical pollution rescue, disaster relief, and long-distance manipulation. In this article, we propose a virtual reality (VR) based robot teleoperation system to achieve more efficient and natural interaction with humans in different scenes. A user-friendly VR interface is designed to help users interact with a desktop scene using their hands efficiently and intuitively. To improve user experience and reduce workload, we simulate the process in the physics engine to help build a preview of the scene after manipulation in the virtual scene before execution. We conduct experiments with different users and compare our system with a direct control method across several teleoperation tasks. The user study demonstrates that the proposed system enables users to perform operations more instinctively with a lighter mental workload. Users can perform pick-and-place and object-stacking tasks in a considerably short time, even for beginners. Our code is available at https://github.com/lingxiaomeng/VR_Teleoperation_Gen3.

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2305.10955 [pdf, other]

Deep Reinforcement Learning-Based Control for Stomach Coverage Scanning of Wireless Capsule Endoscopy

Authors: Yameng Zhang, Long Bai, Li Liu, Hongliang Ren, Max Q. -H. Meng

Abstract: Due to its non-invasive and painless characteristics, wireless capsule endoscopy has become the new gold standard for assessing gastrointestinal disorders. Omissions, however, could occur throughout the examination since controlling capsule endoscope can be challenging. In this work, we control the magnetic capsule endoscope for the coverage scanning task in the stomach based on reinforcement learning so that the capsule can comprehensively scan every corner of the stomach. We apply a well-made virtual platform named VR-Caps to simulate the process of stomach coverage scanning with a capsule endoscope model. We utilize and compare two deep reinforcement learning algorithms, the Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) algorithms, to train the permanent magnetic agent, which actuates the capsule endoscope directly via magnetic fields and then optimizes the scanning efficiency of stomach coverage. We analyze the pros and cons of the two algorithms with different hyperparameters and achieve a coverage rate of 98.04% of the stomach area within 150.37 seconds.

Submitted 18 May, 2023; originally announced May 2023.

Comments: IEEE ROBIO 2022

arXiv:2305.09169 [pdf, other]

Style Transfer Enabled Sim2Real Framework for Efficient Learning of Robotic Ultrasound Image Analysis Using Simulated Data

Authors: Keyu Li, Xinyu Mao, Chengwei Ye, Ang Li, Yangxin Xu, Max Q. -H. Meng

Abstract: Robotic ultrasound (US) systems have shown great potential to make US examinations easier and more accurate. Recently, various machine learning techniques have been proposed to realize automatic US image interpretation for robotic US acquisition tasks. However, obtaining large amounts of real US imaging data for training is usually expensive or even unfeasible in some clinical applications. An alternative is to build a simulator to generate synthetic US data for training, but the differences between simulated and real US images may result in poor model performance. This work presents a Sim2Real framework to efficiently learn robotic US image analysis tasks based only on simulated data for real-world deployment. A style transfer module is proposed based on unsupervised contrastive learning and used as a preprocessing step to convert the real US images into the simulation style. Thereafter, a task-relevant model is designed to combine CNNs with vision transformers to generate the task-dependent prediction with improved generalization ability. We demonstrate the effectiveness of our method in an image regression task to predict the probe position based on US images in robotic transesophageal echocardiography (TEE). Our results show that using only simulated US data and a small amount of unlabelled real data for training, our method can achieve comparable performance to semi-supervised and fully supervised learning methods. Moreover, the effectiveness of our previously proposed CT-based US image simulation method is also indirectly confirmed.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2304.14012 [pdf, other]

doi 10.1109/TRO.2024.3360954

Direct Visual Servoing Based on Discrete Orthogonal Moments

Authors: Yuhan Chen, Max Q. -H. Meng, Li Liu

Abstract: This paper proposes a new approach to achieve direct visual servoing (DVS) based on discrete orthogonal moments (DOMs). DVS is performed in such a way that the extraction of geometric primitives, matching, and tracking steps in the conventional feature-based visual servoing pipeline can be bypassed. Although DVS enables highly precise positioning, it suffers from a limited convergence domain and poor robustness due to the extreme nonlinearity of the cost function to be minimized and the presence of redundant data between visual features. To tackle these issues, we propose a generic and augmented framework that considers DOMs as visual features. By using the Tchebichef, Krawtchouk, and Hahn moments as examples, we not only present the strategies for adaptively tuning the parameters and order of the visual features but also exhibit an analytical formulation of the associated interaction matrix. Simulations demonstrate the robustness and accuracy of our approach, as well as its advantages over the state-of-the-art. Real-world experiments have also been performed to validate the effectiveness of our approach.

Submitted 10 November, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

arXiv:2303.06624 [pdf, other]

Collaborative Trolley Transportation System with Autonomous Nonholonomic Robots

Authors: Bingyi Xia, Hao Luan, Ziqi Zhao, Xuheng Gao, Peijia Xie, Anxing Xiao, Jiankun Wang, Max Q. -H. Meng

Abstract: Cooperative object transportation using multiple robots has been intensively studied in the control and robotics literature, but most approaches are either only applicable to omnidirectional robots or lack a complete navigation and decision-making framework that operates in real time. This paper presents an autonomous nonholonomic multi-robot system and an end-to-end hierarchical autonomy framework for collaborative luggage trolley transportation. This framework finds kinematic-feasible paths, computes online motion plans, and provides feedback that enables the multi-robot system to handle long lines of luggage trolleys and navigate obstacles and pedestrians while dealing with multiple inherently complex and coupled constraints. We demonstrate the designed collaborative trolley transportation system through practical transportation tasks, and the experiment results reveal their effectiveness and reliability in complex and dynamic environments.

Submitted 21 July, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

arXiv:2303.06587 [pdf, other]

FabricFolding: Learning Efficient Fabric Folding without Expert Demonstrations

Authors: Can He, Lingxiao Meng, Zhirui Sun, Jiankun Wang, Max Q. -H. Meng

Abstract: Autonomous fabric manipulation is a challenging task due to complex dynamics and potential self-occlusion during fabric handling. An intuitive method of fabric folding manipulation first involves obtaining a smooth and unfolded fabric configuration before the folding process begins. However, the combination of quasi-static actions such as pick & place and dynamic action like fling proves inadequate in effectively unfolding long-sleeved T-shirts with sleeves mostly tucked inside the garment. To address this limitation, this paper introduces an improved quasi-static action called pick & drag, specifically designed to handle this type of fabric configuration. Additionally, an efficient dual-arm manipulation system is designed in this paper, which combines quasi-static (including pick & place and pick & drag) and dynamic fling actions to flexibly manipulate fabrics into unfolded and smooth configurations. Subsequently, keypoints of the fabric are detected, enabling autonomous folding. To address the scarcity of publicly available keypoint detection datasets for real fabric, we gathered images of various fabric configurations and types in real scenes to create a comprehensive keypoint dataset for fabric folding. This dataset aims to enhance the success rate of keypoint detection. Moreover, we evaluate the effectiveness of our proposed system in real-world settings, where it consistently and reliably unfolds and folds various types of fabrics, including challenging situations such as long-sleeved T-shirts with most parts of sleeves tucked inside the garment. Specifically, our method achieves a coverage rate of 0.822 and a success rate of 0.88 for long-sleeved T-shirts folding.

Submitted 11 September, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

arXiv:2303.06551 [pdf, other]

A Systematic Evaluation of Different Indoor Localization Methods in Robotic Autonomous Luggage Trolley Collection at Airports

Authors: Zhirui Sun, Weinan Chen, Jiankun Wang, Max Q. -H. Meng

Abstract: This article addresses the localization problem in robotic autonomous luggage trolley collection at airports and provides a systematic evaluation of different methods to solve it. The robotic autonomous luggage trolley collection is a complex system that involves object detection, localization, motion planning and control, manipulation, etc. Among these components, effective localization is essential for the robot to employ subsequent motion planning and end-effector manipulation because it can provide a correct goal position. In this article, we survey four popular and representative localization methods to achieve object localization in the luggage collection process, including radio frequency identification (RFID), Keypoints, ultrawideband (UWB), and Reflectors. To test their performance, we construct a qualitative evaluation framework with Localization Accuracy, Mobile Power Supplies, Coverage Area, Cost, and Scalability. Besides, we conduct a series of quantitative experiments regarding Localization Accuracy and Success Rate on a real-world robotic autonomous luggage trolley collection system. We further analyze the performance of different localization methods based on experiment results, revealing that the Keypoints method is most suitable for indoor environments to achieve the luggage trolley collection.

Submitted 11 March, 2023; originally announced March 2023.

arXiv:2301.06388 [pdf, other]

doi 10.1109/TRO.2023.3281477

Closed-Loop Magnetic Manipulation for Robotic Transesophageal Echocardiography

Authors: Keyu Li, Yangxin Xu, Ziqi Zhao, Ang Li, Max Q. -H. Meng

Abstract: This paper presents a closed-loop magnetic manipulation framework for robotic transesophageal echocardiography (TEE) acquisitions. Different from previous work on intracorporeal robotic ultrasound acquisitions that focus on continuum robot control, we first investigate the use of magnetic control methods for more direct, intuitive, and accurate manipulation of the distal tip of the probe. We modify a standard TEE probe by attaching a permanent magnet and an inertial measurement unit sensor to the probe tip and replacing the flexible gastroscope with a soft tether containing only wires for transmitting ultrasound signals, and show that 6-DOF localization and 5-DOF closed-loop control of the probe can be achieved with an external permanent magnet based on the fusion of internal inertial measurement and external magnetic field sensing data. The proposed method does not require complex structures or motions of the actuator and the probe compared with existing magnetic manipulation methods. We have conducted extensive experiments to validate the effectiveness of the framework in terms of localization accuracy, update rate, workspace size, and tracking accuracy. In addition, our results obtained on a realistic cardiac tissue-mimicking phantom show that the proposed framework is applicable in real conditions and can generally meet the requirements for tele-operated TEE acquisitions.

Submitted 28 May, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

Comments: Accepted by IEEE Transactions on Robotics. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2210.05349 [pdf, other]

Extrinsic Manipulation on a Support Plane by Learning Regrasping

Authors: Peng Xu, Zhiyuan Chen, Jiankun Wang, Max Q. -H. Meng

Abstract: Extrinsic manipulation, a technique that enables robots to leverage extrinsic resources for object manipulation, presents practical yet challenging scenarios. Particularly in the context of extrinsic manipulation on a supporting plane, regrasping becomes essential for achieving the desired final object poses. This process involves sequential operation steps and stable placements of objects, which provide grasp space for the robot. To address this challenge, we focus on predicting diverse placements of objects on the plane using deep neural networks. A framework that comprises orientation generation, placement refinement, and placement discrimination stages is proposed, leveraging point clouds to obtain precise and diverse stable placements. To facilitate training, a large-scale dataset is constructed, encompassing stable object placements and contact information between objects. Through extensive experiments, our approach is demonstrated to outperform the state-of-the-art, achieving an accuracy rate of 90.4% and a diversity rate of 81.3% in predicted placements. Furthermore, we validate the effectiveness of our approach through real-robot experiments, demonstrating its capability to compute sequential pick-and-place steps based on the predicted placements for regrasping objects to goal poses that are not readily attainable within a single step. Videos and dataset are available at https://sites.google.com/view/pmvlr2022/.

Submitted 11 July, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

arXiv:2209.01319 [pdf, other]

Kinova Gemini: Interactive Robot Grasping with Visual Reasoning and Conversational AI

Authors: Hanxiao Chen, Jiankun Wang, Max Q. -H. Meng

Abstract: To facilitate recent advances in robotics and AI for delicate collaboration between humans and machines, we propose the Kinova Gemini, an original robotic system that integrates conversational AI dialogue and visual reasoning to make the Kinova Gen3 lite robot help people retrieve objects or complete perception-based pick-and-place tasks. When a person walks up to Kinova Gen3 lite, our Kinova Gemini is able to fulfill the user's requests in three different applications: (1) It can start a natural dialogue with people to interact and assist humans to retrieve objects and hand them to the user one by one. (2) It detects diverse objects with YOLO v3 and recognizes color attributes of the item to ask people if they want to grasp it via the dialogue or enable the user to choose which specific one is required. (3) It applies YOLO v3 to recognize multiple objects and lets the user choose two items for perception-based pick-and-place tasks such as "Put the banana into the bowl" with visual reasoning and conversational interaction.

Submitted 2 September, 2022; originally announced September 2022.

arXiv:2208.05201 [pdf, other]

Quadrotor Autonomous Landing on Moving Platform

Authors: Pengyu Wang, Chaoqun Wang, Jiankun Wang, Max Q. -H. Meng

Abstract: This paper introduces a quadrotor's autonomous take-off and landing system on a moving platform. The designed system addresses three challenging problems: fast pose estimation, restricted external localization, and effective obstacle avoidance. Specifically, first, we design a landing recognition and positioning system based on the ArUco marker to help the quadrotor quickly calculate the relative pose; second, we leverage a gradient-based local motion planner to generate collision-free reference trajectories rapidly for the quadrotor; third, we build an autonomous state machine that enables the quadrotor to complete its take-off, tracking and landing tasks in full autonomy; finally, we conduct experiments in simulated, real-world indoor and outdoor environments to verify the system's effectiveness and demonstrate its potential.

Submitted 10 August, 2022; originally announced August 2022.

arXiv:2205.06970 [pdf, other]

Learning to Reorient Objects with Stable Placements Afforded by Extrinsic Supports

Authors: Peng Xu, Hu Cheng, Jiankun Wang, Max Q. -H. Meng

Abstract: Reorienting objects by using supports is a practical yet challenging manipulation task. Owing to the intricate geometry of objects and the constrained feasible motions of the robot, multiple manipulation steps are required for object reorientation. In this work, we propose a pipeline for predicting various object placements from point clouds. This pipeline comprises three stages: a pose generation stage, followed by a pose refinement stage, and culminating in a placement classification stage. We also propose an algorithm to construct manipulation graphs based on point clouds. Feasible manipulation sequences are determined for the robot to transfer object placements. Both simulated and real-world experiments demonstrate that our approach is effective. The simulation results underscore our pipeline's capacity to generalize to novel objects in random start poses. Our predicted placements exhibit a 20% enhancement in accuracy compared to the state-of-the-art baseline. Furthermore, the robot finds feasible sequential steps in the manipulation graphs constructed by our algorithm to accomplish object reorientation manipulation.

Submitted 29 August, 2023; v1 submitted 14 May, 2022; originally announced May 2022.

arXiv:2205.06951 [pdf, other]

doi 10.1109/TASE.2022.3215562

NR-RRT: Neural Risk-Aware Near-Optimal Path Planning in Uncertain Nonconvex Environments

Authors: Fei Meng, Liangliang Chen, Han Ma, Jiankun Wang, Max Q. -H. Meng

Abstract: Balancing the trade-off between safety and efficiency is of significant importance for path planning under uncertainty. Many risk-aware path planners have been developed to explicitly limit the probability of collision to an acceptable bound in uncertain environments. However, convex obstacles or Gaussian uncertainties are usually assumed to make the problem tractable in the existing method. These assumptions limit the generalization and application of path planners in real-world implementations. In this article, we propose to apply deep learning methods to the sampling-based planner, developing a novel risk bounded near-optimal path planning algorithm named neural risk-aware RRT (NR-RRT). Specifically, a deterministic risk contours map is maintained by perceiving the probabilistic nonconvex obstacles, and a neural network sampler is proposed to predict the next most-promising safe state. Furthermore, the recursive divide-and-conquer planning and bidirectional search strategies are used to accelerate the convergence to a near-optimal solution with guaranteed bounded risk. Worst-case theoretical guarantees can also be proven owing to a standby safety guaranteed planner utilizing a uniform sampling distribution. Simulation experiments demonstrate that the proposed algorithm outperforms the state-of-the-art remarkably for finding risk bounded low-cost paths in seen and unseen environments with uncertainty and nonconvex constraints.

Submitted 13 May, 2022; originally announced May 2022.

Journal ref: IEEE Transactions on Automation Science and Engineering, 2022

arXiv:2205.06940 [pdf, other]

BiAIT*: Symmetrical Bidirectional Optimal Path Planning with Adaptive Heuristic

Authors: Chenming Li, Han Ma, Peng Xu, Jiankun Wang, Max Q. -H. Meng

Abstract: Adaptively Informed Trees (AIT*) is an algorithm that uses the problem-specific heuristic to avoid unnecessary searches, which significantly improves its performance, especially when collision checking is expensive. However, the heuristic estimation in AIT* consumes lots of computational resources, and its asymmetric bidirectional searching strategy cannot fully exploit the potential of the bidirectional method. In this article, we propose an extension of AIT* called BiAIT*. Unlike AIT*, BiAIT* uses symmetrical bidirectional search for both the heuristic and space searching. The proposed method allows BiAIT* to find the initial solution faster than AIT*, and update the heuristic with less computation when a collision occurs. We evaluated the performance of BiAIT* through simulations and experiments, and the results show that BiAIT* can find the solution faster than state-of-the-art methods. We also analyze the reasons for the different performances between BiAIT* and AIT*. Furthermore, we discuss two simple but effective modifications to fully exploit the potential of the adaptively heuristic method.

Submitted 25 May, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

arXiv:2205.04847 [pdf, other]

Multi-Tree Guided Efficient Robot Motion Planning

Authors: Zhirui Sun, Jiankun Wang, Max Q. -H. Meng

Abstract: Motion Planning is necessary for robots to complete different tasks. Rapidly-exploring Random Tree (RRT) and its variants have been widely used in robot motion planning due to their fast search in state space. However, they do not perform well in many complex environments since the motion planning needs to simultaneously consider the geometry constraints and differential constraints. In this article, we propose a novel robot motion planning algorithm that utilizes multi-tree to guide the exploration and exploitation. The proposed algorithm maintains more than two trees to search the state space at first. Each tree will explore the local environment. The tree that starts from the root will gradually collect information from other trees and grow towards the goal state. This simultaneous exploration and exploitation method can quickly find a feasible trajectory. We compare the proposed algorithm with other popular motion planning algorithms. The experiment results demonstrate that our algorithm achieves the best performance on different evaluation metrics.

Submitted 17 May, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

arXiv:2202.08004 [pdf, other]

Deep Koopman Operator with Control for Nonlinear Systems

Authors: Haojie Shi, Max Q. -H. Meng

Abstract: Recently Koopman operator has become a promising data-driven tool to facilitate real-time control for unknown nonlinear systems. It maps nonlinear systems into equivalent linear systems in embedding space, ready for real-time linear control methods. However, designing an appropriate Koopman embedding function remains a challenging task. Furthermore, most Koopman-based algorithms only consider nonlinear systems with linear control input, resulting in lousy prediction and control performance when the system is fully nonlinear with the control input. In this work, we propose an end-to-end deep learning framework to learn the Koopman embedding function and Koopman Operator together to alleviate such difficulties. We first parameterize the embedding function and Koopman Operator with the neural network and train them end-to-end with the K-steps loss function. Then, an auxiliary control network is augmented to encode the nonlinear state-dependent control term to model the nonlinearity in the control input. This encoded term is considered the new control variable instead to ensure linearity of the modeled system in the embedding system. We next deploy Linear Quadratic Regulator (LQR) on the linear embedding space to derive the optimal control policy and decode the actual control input from the control net. Experimental results demonstrate that our approach outperforms other existing methods, reducing the prediction error by an order of magnitude and achieving superior control performance in several nonlinear dynamic systems like damping pendulum, CartPole, and the seven DOF robotic manipulator.

Submitted 15 June, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

arXiv:2112.08106 [pdf, other]

doi 10.1109/TASE.2022.3191519

Enhance Connectivity of Promising Regions for Sampling-based Path Planning

Authors: Han Ma, Chenming Li, Jianbang Liu, Jiankun Wang, Max Q. -H. Meng

Abstract: Sampling-based path planning algorithms usually implement uniform sampling methods to search the state space. However, uniform sampling may lead to unnecessary exploration in many scenarios, such as the environment with a few dead ends. Our previous work proposes to use the promising region to guide the sampling process to address the issue. However, the predicted promising regions are often disconnected, which means they cannot connect the start and goal state, resulting in a lack of probabilistic completeness. This work focuses on enhancing the connectivity of predicted promising regions. Our proposed method regresses the connectivity probability of the edges in the x and y directions. In addition, it calculates the weight of the promising edges in loss to guide the neural network to pay more attention to the connectivity of the promising regions. We conduct a series of simulation experiments, and the results show that the connectivity of promising regions improves significantly. Furthermore, we analyze the effect of connectivity on sampling-based path planning algorithms and conclude that connectivity plays an essential role in maintaining algorithm performance.

Submitted 22 July, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

Comments: Accepted in Transactions on Automation Science and Engineering, 2022

arXiv:2111.03235 [pdf]

RASEC: Rescaling Acquisition Strategy with Energy Constraints under SE-OU Fusion Kernel for Active Trachea Palpation and Incision Recommendation in Laryngeal Region

Authors: Wenchao Yue, Fan Bai, Jianbang Liu, Feng Ju, Max Q-H Meng, Chwee Ming Lim, Hongliang Ren

Abstract: A novel palpation-based incision detection strategy in the laryngeal region, potentially for robotic tracheotomy, is proposed in this letter. A tactile sensor is introduced to measure tissue hardness in the specific laryngeal region by gentle contact. The kernel fusion method is proposed to combine the Squared Exponential (SE) kernel with Ornstein-Uhlenbeck (OU) kernel to address the drawback that the existing kernel functions are not sufficiently optimal in this scenario. Moreover, we further regularize exploration factor and greed factor, and the tactile sensor's moving distance and the robotic base link's rotation angle during the incision localization process are considered as new factors in the acquisition strategy. We conducted simulation and physical experiments to compare the newly proposed algorithm - Rescaling Acquisition Strategy with Energy Constraints (RASEC) in trachea detection with current palpation-based acquisition strategies. The result indicates that the proposed acquisition strategy with fusion kernel can successfully localize the incision with the highest algorithm performance (Average Precision 0.932, Average Recall 0.973, Average F1 score 0.952). During the robotic palpation process, the cumulative moving distance is reduced by 50%, and the cumulative rotation angle is reduced by 71.4% with no sacrifice in the comprehensive performance capabilities. Therefore, it proves that RASEC can efficiently suggest the incision zone in the laryngeal region and greatly reduce the energy loss.

Submitted 19 March, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: Submitted to RA-L

arXiv:2111.02167 [pdf, other]

doi 10.1109/TMRB.2021.3127015

Image-Guided Navigation of a Robotic Ultrasound Probe for Autonomous Spinal Sonography Using a Shadow-aware Dual-Agent Framework

Authors: Keyu Li, Yangxin Xu, Jian Wang, Dong Ni, Li Liu, Max Q. -H. Meng

Abstract: Ultrasound (US) imaging is commonly used to assist in the diagnosis and interventions of spine diseases, while the standardized US acquisitions performed by manually operating the probe require substantial experience and training of sonographers. In this work, we propose a novel dual-agent framework that integrates a reinforcement learning (RL) agent and a deep learning (DL) agent to jointly determine the movement of the US probe based on the real-time US images, in order to mimic the decision-making process of an expert sonographer to achieve autonomous standard view acquisitions in spinal sonography. Moreover, inspired by the nature of US propagation and the characteristics of the spinal anatomy, we introduce a view-specific acoustic shadow reward to utilize the shadow information to implicitly guide the navigation of the probe toward different standard views of the spine. Our method is validated in both quantitative and qualitative experiments in a simulation environment built with US data acquired from 17 volunteers. The average navigation accuracy toward different standard views achieves 5.18mm/5.25deg and 12.87mm/17.49deg in the intra- and inter-subject settings, respectively. The results demonstrate that our method can effectively interpret the US images and navigate the probe to acquire multiple standard views of the spine.

Submitted 10 November, 2021; v1 submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted by IEEE Transactions on Medical Robotics and Bionics. Copyright may be transferred without notice, after which this version may no longer be accessible

Journal ref: IEEE Transactions on Medical Robotics and Bionics (2021)

arXiv:2111.01977 [pdf, other]

Autonomous Magnetic Navigation Framework for Active Wireless Capsule Endoscopy Inspired by Conventional Colonoscopy Procedures

Authors: Yangxin Xu, Keyu Li, Ziqi Zhao, Max Q. -H. Meng

Abstract: In recent years, simultaneous magnetic actuation and localization (SMAL) for active wireless capsule endoscopy (WCE) has been intensively studied to improve the efficiency and accuracy of the examination. In this paper, we propose an autonomous magnetic navigation framework for active WCE that mimics the "insertion" and "withdrawal" procedures performed by an expert physician in conventional colonoscopy, thereby enabling efficient and accurate navigation of a robotic capsule endoscope in the intestine with minimal user effort. First, the capsule is automatically propelled through the unknown intestinal environment and generates a viable path to represent the environment. Then, the capsule is autonomously navigated towards any point selected on the intestinal trajectory to allow accurate and repeated inspections of suspicious lesions. Moreover, we implement the navigation framework on a robotic system incorporated with advanced SMAL algorithms, and validate it in the navigation in various tubular environments using phantoms and an ex-vivo pig colon. Our results demonstrate that the proposed autonomous navigation framework can effectively navigate the capsule in unknown, complex tubular environments with a satisfactory accuracy, repeatability and efficiency compared with manual operation.

Submitted 2 November, 2021; originally announced November 2021.

arXiv:2111.00383 [pdf, other]

Relevant Region Sampling Strategy with Adaptive Heuristic for Asymptotically Optimal Path Planning

Authors: Chenming Li, Fei Meng, Han Ma, Jiankun Wang, Max Q. -H. Meng

Abstract: Sampling-based planning algorithm is a powerful tool for solving planning problems in high-dimensional state spaces. In this article, we present a novel approach to sampling in the most promising regions, which significantly reduces planning time-consumption. The RRT# algorithm defines the Relevant Region based on the cost-to-come provided by the optimal forward-searching tree. However, it uses the cumulative cost of a direct connection between the current state and the goal state as the cost-to-go. To improve the path planning efficiency, we propose a batch sampling method that samples in a refined Relevant Region with a direct sampling strategy, which is defined according to the optimal cost-to-come and the adaptive cost-to-go, taking advantage of various sources of heuristic information. The proposed sampling approach allows the algorithm to build the search tree in the direction of the most promising area, resulting in a superior initial solution quality and reducing the overall computation time compared to related work. To validate the effectiveness of our method, we conducted several simulations in both $SE(2)$ and $SE(3)$ state spaces. The simulation results demonstrate the superiority of the proposed algorithm.

Submitted 25 May, 2023; v1 submitted 30 October, 2021; originally announced November 2021.

arXiv:2110.10436 [pdf, other]

A Survey on Deep-Learning Approaches for Vehicle Trajectory Prediction in Autonomous Driving

Authors: Jianbang Liu, Xinyu Mao, Yuqi Fang, Delong Zhu, Max Q. -H. Meng

Abstract: With the rapid development of machine learning, autonomous driving has become a hot issue, making urgent demands for more intelligent perception and planning systems. Self-driving cars can avoid traffic crashes with precisely predicted future trajectories of surrounding vehicles. In this work, we review and categorize existing learning-based trajectory forecasting methods from perspectives of representation, modeling, and learning. Moreover, we make our implementation of Target-driveN Trajectory Prediction publicly available at https://github.com/Henry1iu/TNT-Trajectory-Predition, demonstrating its outstanding performance whereas its original codes are withheld. Enlightenment is expected for researchers seeking to improve trajectory prediction performance based on the achievement we have made.

Submitted 28 October, 2021; v1 submitted 20 October, 2021; originally announced October 2021.

Comments: Accepted by ROBIO2021

arXiv:2110.10041 [pdf, other]

Learning-based Fast Path Planning in Complex Environments

Authors: Jianbang Liu, Baopu Li, Tingguang Li, Wenzheng Chi, Jiankun Wang, Max Q. -H. Meng

Abstract: In this paper, we present a novel path planning algorithm to achieve fast path planning in complex environments. Most existing path planning algorithms have difficulty quickly finding a feasible path in complex environments or even fail. However, our proposed framework can overcome this difficulty by using a learning-based prediction module and a sampling-based path planning module. The prediction module utilizes an auto-encoder-decoder-like convolutional neural network (CNN) to output a promising region where the feasible path probably lies. In this process, the environment is treated as an RGB image to feed in our designed CNN module, and the output is also an RGB image. No extra computation is required so that we can maintain a high processing speed of 60 frames-per-second (FPS). Incorporated with a sampling-based path planner, we can extract a feasible path from the output image so that the robot can track it from start to goal. To demonstrate the advantage of the proposed algorithm, we compare it with conventional path planning algorithms in a series of simulation experiments. The results reveal that the proposed algorithm can achieve much better performance in terms of planning time, success rate, and path length.

Submitted 19 October, 2021; originally announced October 2021.

Comments: Accepted by ROBIO2021

arXiv:2110.06648 [pdf, other]

Robotic Autonomous Trolley Collection with Progressive Perception and Nonlinear Model Predictive Control

Authors: Anxing Xiao, Hao Luan, Ziqi Zhao, Yue Hong, Jieting Zhao, Weinan Chen, Jiankun Wang, Max Q. -H. Meng

Abstract: Autonomous mobile manipulation robots that can collect trolleys are widely used to liberate human resources and fight epidemics. Most prior robotic trolley collection solutions only detect trolleys with 2D poses or are merely based on specific marks and lack the formal design of planning algorithms. In this paper, we present a novel mobile manipulation system with applications in luggage trolley collection. The proposed system integrates a compact hardware design and a progressive perception and planning framework, enabling the system to efficiently and robustly collect trolleys in dynamic and complex environments. For the perception, we first develop a 3D trolley detection method that combines object detection and keypoint estimation. Then, a docking process in a short distance is achieved with an accurate point cloud plane detection method and a novel manipulator design. On the planning side, we formulate the robot's motion planning under a nonlinear model predictive control framework with control barrier functions to improve obstacle avoidance capabilities while maintaining the target in the sensors' field of view at close distances. We demonstrate our design and framework by deploying the system on actual trolley collection tasks, and their effectiveness and robustness are experimentally validated.

Submitted 1 March, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: Accepted to the 2022 International Conference on Robotics and Automation (ICRA 2022)

arXiv:2110.04564 [pdf, other]

Human-Aware Robot Navigation via Reinforcement Learning with Hindsight Experience Replay and Curriculum Learning

Authors: Keyu Li, Ye Lu, Max Q. -H. Meng

Abstract: In recent years, the growing demand for more intelligent service robots is pushing the development of mobile robot navigation algorithms to allow safe and efficient operation in a dense crowd. Reinforcement learning (RL) approaches have shown superior ability in solving sequential decision making problems, and recent work has explored its potential to learn navigation policies in a socially compliant manner. However, the expert demonstration data used in existing methods is usually expensive and difficult to obtain. In this work, we consider the task of training an RL agent without employing the demonstration data, to achieve efficient and collision-free navigation in a crowded environment. To address the sparse reward navigation problem, we propose to incorporate the hindsight experience replay (HER) and curriculum learning (CL) techniques with RL to efficiently learn the optimal navigation policy in the dense crowd. The effectiveness of our method is validated in a simulated crowd-robot coexisting environment. The results demonstrate that our method can effectively learn human-aware navigation without requiring additional demonstration data.

Submitted 9 October, 2021; originally announced October 2021.

Comments: Accepted at ROBIO 2021. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2110.04563 [pdf, other]

Automatic Recognition of Abdominal Organs in Ultrasound Images based on Deep Neural Networks and K-Nearest-Neighbor Classification

Authors: Keyu Li, Yangxin Xu, Max Q. -H. Meng

Abstract: Abdominal ultrasound imaging has been widely used to assist in the diagnosis and treatment of various abdominal organs. In order to shorten the examination time and reduce the cognitive burden on the sonographers, we present a classification method that combines the deep learning techniques and k-Nearest-Neighbor (k-NN) classification to automatically recognize various abdominal organs in the ultrasound images in real time. Fine-tuned deep neural networks are used in combination with PCA dimension reduction to extract high-level features from raw ultrasound images, and a k-NN classifier is employed to predict the abdominal organ in the image. We demonstrate the effectiveness of our method in the task of ultrasound image classification to automatically recognize six abdominal organs. A comprehensive comparison of different configurations is conducted to study the influence of different feature extractors and classifiers on the classification accuracy. Both quantitative and qualitative results show that with minimal training effort, our method can "lazily" recognize the abdominal organs in the ultrasound images in real time with an accuracy of 96.67%. Our implementation code is publicly available at: https://github.com/LeeKeyu/abdominal_ultrasound_classification.

Submitted 9 October, 2021; originally announced October 2021.

Comments: Accepted at ROBIO 2021. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2109.08973 [pdf, other]

Hierarchical Policy for Non-prehensile Multi-object Rearrangement with Deep Reinforcement Learning and Monte Carlo Tree Search

Authors: Fan Bai, Fei Meng, Jianbang Liu, Jiankun Wang, Max Q. -H. Meng

Abstract: Non-prehensile multi-object rearrangement is a robotic task of planning feasible paths and transferring multiple objects to their predefined target poses without grasping. It needs to consider how each object reaches the target and the order of object movement, which significantly deepens the complexity of the problem. To address these challenges, we propose a hierarchical policy to divide and conquer for non-prehensile multi-object rearrangement. In the high-level policy, guided by a designed policy network, the Monte Carlo Tree Search efficiently searches for the optimal rearrangement sequence among multiple objects, which benefits from imitation and reinforcement. In the low-level policy, the robot plans the paths according to the order of path primitives and manipulates the objects to approach the goal poses one by one. We verify through experiments that the proposed method can achieve a higher success rate, fewer steps, and shorter path length compared with the state-of-the-art.

Submitted 18 September, 2021; originally announced September 2021.

arXiv:2109.06409 [pdf, other]

Reinforcement Learning with Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion

Authors: Haojie Shi, Bo Zhou, Hongsheng Zeng, Fan Wang, Yueqiang Dong, Jiangyong Li, Kang Wang, Hao Tian, Max Q. -H. Meng

Abstract: Recently reinforcement learning (RL) has emerged as a promising approach for quadrupedal locomotion, which can save the manual effort in conventional approaches such as designing skill-specific controllers. However, due to the complex nonlinear dynamics in quadrupedal robots and reward sparsity, it is still difficult for RL to learn effective gaits from scratch, especially in challenging tasks such as walking over the balance beam. To alleviate such difficulty, we propose a novel RL-based approach that contains an evolutionary foot trajectory generator. Unlike prior methods that use a fixed trajectory generator, the generator continually optimizes the shape of the output trajectory for the given task, providing diversified motion priors to guide the policy learning. The policy is trained with reinforcement learning to output residual control signals that fit different gaits. We then optimize the trajectory generator and policy network alternately to stabilize the training and share the exploratory data to improve sample efficiency. As a result, our approach can solve a range of challenging tasks in simulation by learning from scratch, including walking on a balance beam and crawling through the cave. To further verify the effectiveness of our approach, we deploy the controller learned in the simulation on a 12-DoF quadrupedal robot, and it can successfully traverse challenging scenarios with efficient gaits.

Submitted 16 September, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

arXiv:2108.11620 [pdf, other]

Trajectory Following Strategies for Wireless Capsule Endoscopy under Reciprocally Rotating Magnetic Actuation in a Tubular Environment

Authors: Yangxin Xu, Keyu Li, Ziqi Zhao, Max Q. -H. Meng

Abstract: Currently used wireless capsule endoscopy (WCE) is limited in terms of inspection time and flexibility since the capsule is passively moved by peristalsis and cannot be accurately positioned. Different methods have been proposed to facilitate active locomotion of WCE based on simultaneous magnetic actuation and localization technologies. In this work, we investigate the trajectory following problem of a robotic capsule under rotating magnetic actuation in a tubular environment, in order to realize safe, efficient and accurate inspection of the intestine at given points using wireless capsule endoscopes. Specifically, four trajectory following strategies are developed based on the PD controller, adaptive controller, model predictive controller and robust multi-stage model predictive controller. Moreover, our method takes into account the uncertainty in the intestinal environment by modeling the intestinal peristalsis and friction during the controller design. We validate our methods in simulation as well as in real-world experiments in various tubular environments, including plastic phantoms with different shapes and an ex-vivo pig colon. The results show that our approach can effectively actuate a reciprocally rotating capsule to follow a desired trajectory in complex tubular environments, thereby having the potential to enable accurate and repeatable inspection of the intestine for high-quality diagnosis.

Submitted 25 December, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

arXiv:2108.11256 [pdf, other]

doi 10.1109/TRO.2022.3161766

Adaptive Simultaneous Magnetic Actuation and Localization for WCE in a Tubular Environment

Authors: Yangxin Xu, Keyu Li, Ziqi Zhao, Max Q. -H. Meng

Abstract: Simultaneous Magnetic Actuation and Localization (SMAL) is a promising technology for active wireless capsule endoscopy (WCE). In this paper, an adaptive SMAL system is presented to efficiently propel and precisely locate a capsule in a tubular environment with complex shapes. In order to track the capsule with high localization accuracy and update frequency in a large workspace, we propose a mechanism that can automatically activate a sub-array of sensors with the optimal layout during the capsule movement. The improved multiple objects tracking (IMOT) method is simplified and adapted to our system to estimate the 6-D pose of the capsule in real time. Also, we study the locomotion of a magnetically actuated capsule in a tubular environment, and formulate a method to adaptively adjust the pose of the actuator to improve the propulsion efficiency. Our presented methods are applicable to other permanent magnet-based SMAL systems, and help to improve the actuation efficiency of active WCE. We verify the effectiveness of our proposed system in extensive experiments on phantoms and ex-vivo animal organs. The results demonstrate that our system can achieve convincing performance compared with the state-of-the-art ones in terms of actuation efficiency, workspace size, robustness, localization accuracy and update frequency.

Submitted 25 August, 2021; originally announced August 2021.

Journal ref: IEEE Transactions on Robotics (2022)

arXiv:2108.11253 [pdf, other]

doi 10.1109/TMRB.2021.3123407

On Reciprocally Rotating Magnetic Actuation of a Robotic Capsule in Unknown Tubular Environments

Authors: Yangxin Xu, Keyu Li, Ziqi Zhao, Max Q. -H. Meng

Abstract: Active wireless capsule endoscopy (WCE) based on simultaneous magnetic actuation and localization (SMAL) techniques holds great promise for improving diagnostic accuracy, reducing examination time and relieving operator burden. To date, the rotating magnetic actuation methods have been constrained to use a continuously rotating permanent magnet. In this paper, we first propose the reciprocally rotating magnetic actuation (RRMA) approach for active WCE to enhance patient safety. We first show how to generate a desired reciprocally rotating magnetic field for capsule actuation, and provide a theoretical analysis of the potential risk of causing volvulus due to the capsule motion. Then, an RRMA-based SMAL workflow is presented to automatically propel a capsule in an unknown tubular environment. We validate the effectiveness of our method in real-world experiments to automatically propel a robotic capsule in an ex-vivo pig colon. The experiment results show that our approach can achieve efficient and robust propulsion of the capsule with an average moving speed of $2.48 mm/s$ in the pig colon, and demonstrate the potential of using RRMA to enhance patient safety, reduce the inspection time, and improve the clinical acceptance of this technology.

Submitted 25 August, 2021; originally announced August 2021.

Journal ref: IEEE Transactions on Medical Robotics and Bionics (2021)

arXiv:2108.02948 [pdf, other]

Deep Learning-based Biological Anatomical Landmark Detection in Colonoscopy Videos

Authors: Kaiwei Che, Chengwei Ye, Yibing Yao, Nachuan Ma, Ruo Zhang, Jiankun Wang, Max Q. -H. Meng

Abstract: Colonoscopy is a standard imaging tool for visualizing the entire gastrointestinal (GI) tract of patients to capture lesion areas. However, it takes the clinicians excessive time to review a large number of images extracted from colonoscopy videos. Thus, automatic detection of biological anatomical landmarks within the colon is highly demanded, which can help reduce the burden of clinicians by providing guidance information for the locations of lesion areas. In this article, we propose a novel deep learning-based approach to detect biological anatomical landmarks in colonoscopy videos. First, raw colonoscopy video sequences are pre-processed to reject interference frames. Second, a ResNet-101 based network is used to detect three biological anatomical landmarks separately to obtain the intermediate detection results. Third, to achieve more reliable localization of the landmark periods within the whole video period, we propose to post-process the intermediate detection results by identifying the incorrectly predicted frames based on their temporal distribution and reassigning them back to the correct class. Finally, the average detection accuracy reaches 99.75%. Meanwhile, the average IoU of 0.91 shows a high degree of similarity between our predicted landmark periods and ground truth. The experimental results demonstrate that our proposed model is capable of accurately detecting and localizing biological anatomical landmarks from colonoscopy videos.

Submitted 6 August, 2021; originally announced August 2021.

Comments: 9 pages, 7 figures

Showing 1–50 of 74 results for author: Meng, M Q