Search | arXiv e-print repository

Online Time-Informed Kinodynamic Motion Planning of Nonlinear Systems

Authors: Fei Meng, Jianbang Liu, Haojie Shi, Han Ma, Hongliang Ren, Max Q. -H. Meng

Abstract: Sampling-based kinodynamic motion planners (SKMPs) are powerful in finding collision-free trajectories for high-dimensional systems under differential constraints. Time-informed set (TIS) can provide the heuristic search domain to accelerate their convergence to the time-optimal solution. However, existing TIS approximation methods suffer from the curse of dimensionality, computational burden, and limited system applicable scope, e.g., linear and polynomial nonlinear systems. To overcome these problems, we propose a method by leveraging deep learning technology, Koopman operator theory, and random set theory. Specifically, we propose a Deep Invertible Koopman operator with control U model named DIKU to predict states forward and backward over a long horizon by modifying the auxiliary network with an invertible neural network. A sampling-based approach, ASKU, performing reachability analysis for the DIKU is developed to approximate the TIS of nonlinear control systems online. Furthermore, we design an online time-informed SKMP using a direct sampling technique to draw uniform random samples in the TIS. Simulation experiment results demonstrate that our method outperforms other existing works, approximating TIS in near real-time and achieving superior planning performance in several time-optimal kinodynamic motion planning problems.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02788 [pdf, other]

Generalized Gouy Rotation of Electron Vortex beams in uniform magnetic fields

Authors: Qi Meng, Xuan Liu, Wei Ma, Zhen Yang, Liang Lu, Alexander J. Silenko, Pengming Zhang, Liping Zou

Abstract: The rotation of electron vortex beams (EVBs) presents a complex interplay of the Gouy phase characterizing free-space behavior and Landau states or Larmor rotation observed in magnetic fields. Despite being studied separately, these phenomena manifest within a single beam during its propagation in magnetic fields, lacking a comprehensive description. We address this by utilizing exact solutions of the relativistic paraxial equation in magnetic fields, termed "paraxial Landau modes". The paraxial Landau modes describe the quantum states of EVBs in magnetic fields. Our study of rotation angles demonstrates consistency with experimental data, supporting the practical presence of these modes. We provide a unified description of different regimes under generalized Gouy rotation, linking the Gouy phase to EVB rotation angles. This connection enhances our understanding of the Gouy phase and can be extended to nonuniform magnetic fields. Our theoretical analysis is validated through numerical simulations using the Chebyshev method. This work offers new insights into the dynamics of EVBs in magnetic fields and suggests practical applications in beam manipulation and beam optics of vortex particles.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01862 [pdf, other]

Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The 3rd BARN Challenge at ICRA 2024

Authors: Xuesu Xiao, Zifan Xu, Aniket Datar, Garrett Warnell, Peter Stone, Joshua Julian Damanik, Jaewon Jung, Chala Adane Deresa, Than Duc Huy, Chen Jinyu, Chen Yichen, Joshua Adrian Cahyono, Jingda Wu, Longfei Mo, Mingyang Lv, Bowen Lan, Qingyang Meng, Weizhi Tao, Li Cheng

Abstract: The 3rd BARN (Benchmark Autonomous Robot Navigation) Challenge took place at the 2024 IEEE International Conference on Robotics and Automation (ICRA 2024) in Yokohama, Japan and continued to evaluate the performance of state-of-the-art autonomous ground navigation systems in highly constrained environments. Similar to the trend in The 1st and 2nd BARN Challenge at ICRA 2022 and 2023 in Philadelphia (North America) and London (Europe), The 3rd BARN Challenge in Yokohama (Asia) became more regional, i.e., mostly Asian teams participated. The size of the competition has slightly shrunk (six simulation teams, four of which were invited to the physical competition). The competition results, compared to the last two years, suggest that the field has adopted new machine learning approaches while at the same time slightly converged to a few common practices. However, the regional nature of the physical participants suggests a challenge to promote wider participation all over the world and provide more resources to travel to the venue. In this article, we discuss the challenge, the approaches used by the three winning teams, and lessons learned to direct future research and competitions.

Submitted 1 July, 2024; originally announced July 2024.

Comments: arXiv admin note: text overlap with arXiv:2308.03205

arXiv:2407.00562 [pdf, other]

Automated Robot Recovery from Assumption Violations of High-Level Specifications

Authors: Qian Meng, Hadas Kress-Gazit

Abstract: This paper presents a framework that enables robots to automatically recover from assumption violations of high-level specifications during task execution. In contrast to previous methods relying on user intervention to impose additional assumptions for failure recovery, our approach leverages synthesis-based repair to suggest new robot skills that, when implemented, repair the task. Our approach detects violations of environment safety assumptions during the task execution, relaxes the assumptions to admit observed environment behaviors, and acquires new robot skills for task completion. We demonstrate our approach with a Hello Robot Stretch in a factory-like scenario.

Submitted 1 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

Comments: To appear in the Proceedings of the 2024 IEEE 20th International Conference on Automation Science and Engineering (CASE 2024)

MSC Class: 68T40

arXiv:2406.18962 [pdf, other]

Multi-modal Food Recommendation using Clustering and Self-supervised Learning

Authors: Yixin Zhang, Xin Zhou, Qianwen Meng, Fanglin Zhu, Yonghui Xu, Zhiqi Shen, Lizhen Cui

Abstract: Food recommendation systems serve as pivotal components in the realm of digital lifestyle services, designed to assist users in discovering recipes and food items that resonate with their unique dietary predilections. Typically, multi-modal descriptions offer an exhaustive profile for each recipe, thereby ensuring recommendations that are both personalized and accurate. Our preliminary investigation of two datasets indicates that pre-trained multi-modal dense representations might precipitate a deterioration in performance compared to ID features when encapsulating interactive relationships. This observation implies that ID features possess a relative superiority in modeling interactive collaborative signals. Consequently, contemporary cutting-edge methodologies augment ID features with multi-modal information as supplementary features, overlooking the latent semantic relations between recipes. To rectify this, we present CLUSSL, a novel food recommendation framework that employs clustering and self-supervised learning. Specifically, CLUSSL formulates a modality-specific graph tailored to each modality with discrete/continuous features, thereby transforming semantic features into structural representation. Furthermore, CLUSSL procures recipe representations pertinent to different modalities via graph convolutional operations. A self-supervised learning objective is proposed to foster independence between recipe representations derived from different unimodal graphs. Comprehensive experiments on real-world datasets substantiate that CLUSSL consistently surpasses state-of-the-art recommendation benchmarks in performance.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Working paper

arXiv:2406.04984 [pdf, other]

MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter

Authors: Jitai Hao, WeiWei Sun, Xin Xin, Qi Meng, Zhumin Chen, Pengjie Ren, Zhaochun Ren

Abstract: Parameter-Efficient Fine-tuning (PEFT) facilitates the fine-tuning of Large Language Models (LLMs) under limited resources. However, the fine-tuning performance with PEFT on complex, knowledge-intensive tasks is limited due to the constrained model capacity, which originates from the limited number of additional trainable parameters. To overcome this limitation, we introduce a novel mechanism that fine-tunes LLMs with adapters that are larger yet memory-efficient. This is achieved by leveraging the inherent activation sparsity in the Feed-Forward Networks (FFNs) of LLMs and utilizing the larger capacity of Central Processing Unit (CPU) memory compared to Graphics Processing Unit (GPU) memory. We store and update the parameters of larger adapters on the CPU. Moreover, we employ a Mixture of Experts (MoE)-like architecture to mitigate unnecessary CPU computations and reduce the communication volume between the GPU and CPU. This is particularly beneficial over the limited bandwidth of PCI Express (PCIe). Our method can achieve fine-tuning results comparable to those obtained with larger memory capacities, even when operating under more limited resources such as a single GPU with 24GB of memory, with acceptable loss in training efficiency. Our code is available at https://github.com/CURRENTF/MEFT.

Submitted 7 June, 2024; originally announced June 2024.

Comments: ACL 24

arXiv:2406.00808 [pdf, other]

EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing

Authors: Hadrien Reynaud, Qingjie Meng, Mischa Dombrowski, Arijit Ghosh, Thomas Day, Alberto Gomez, Paul Leeson, Bernhard Kainz

Abstract: To make medical datasets accessible without sharing sensitive patient information, we introduce a novel end-to-end approach for generative de-identification of dynamic medical imaging data. Until now, generative methods have faced constraints in terms of fidelity, spatio-temporal coherence, and the length of generation, failing to capture the complete details of dataset distributions. We present a model designed to produce high-fidelity, long and complete data samples with near-real-time efficiency and explore our approach on a challenging task: generating echocardiogram videos. We develop our generation method based on diffusion models and introduce a protocol for medical video dataset anonymization. As an exemplar, we present EchoNet-Synthetic, a fully synthetic, privacy-compliant echocardiogram dataset with paired ejection fraction labels. As part of our de-identification protocol, we evaluate the quality of the generated dataset and propose to use clinical downstream tasks as a measurement on top of widely used but potentially biased image quality metrics. Experimental outcomes demonstrate that EchoNet-Synthetic achieves comparable dataset fidelity to the actual dataset, effectively supporting the ejection fraction regression task. Code, weights and dataset are available at https://github.com/HReynaud/EchoNet-Synthetic.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Accepted at MICCAI 2024

arXiv:2406.00707 [pdf, other]

QUADFormer: Learning-based Detection of Cyber Attacks in Quadrotor UAVs

Authors: Pengyu Wang, Zhaohua Yang, Nachuan Yang, Zikai Wang, Jialu Li, Fan Zhang, Chaoqun Wang, Jiankun Wang, Max Q. -H. Meng, Ling Shi

Abstract: Safety-critical intelligent cyber-physical systems, such as quadrotor unmanned aerial vehicles (UAVs), are vulnerable to different types of cyber attacks, and the absence of timely and accurate attack detection can lead to severe consequences. When UAVs are engaged in large outdoor maneuvering flights, their system constitutes highly nonlinear dynamics that include non-Gaussian noises. Therefore, the commonly employed traditional statistics-based and emerging learning-based attack detection methods do not yield satisfactory results. In response to the above challenges, we propose QUADFormer, a novel Quadrotor UAV Attack Detection framework with transFormer-based architecture. This framework includes a residue generator designed to generate a residue sequence sensitive to anomalies. Subsequently, this sequence is fed into a transformer structure with disparity in correlation to specifically learn its statistical characteristics for the purpose of classification and attack detection. Finally, we design an alert module to ensure the safe execution of tasks by UAVs under attack conditions. We conduct extensive simulations and real-world experiments, and the results show that our method has achieved superior detection performance compared with many state-of-the-art methods.

Submitted 14 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00706 [pdf, other]

MINER-RRT*: A Hierarchical and Fast Trajectory Planning Framework in 3D Cluttered Environments

Authors: Pengyu Wang, Jiawei Tang, Hin Wang Lin, Fan Zhang, Chaoqun Wang, Jiankun Wang, Ling Shi, Max Q. -H. Meng

Abstract: Trajectory planning for quadrotors in cluttered environments has been challenging in recent years. While many trajectory planning frameworks have been successful, there still exists potential for improvements, particularly in enhancing the speed of generating efficient trajectories. In this paper, we present a novel hierarchical trajectory planning framework to reduce computational time and memory usage called MINER-RRT*, which consists of two main components. First, we propose a sampling-based path planning method boosted by neural networks, where the predicted heuristic region accelerates the convergence of rapidly-exploring random trees. Second, we utilize the optimal conditions derived from the quadrotor's differential flatness properties to construct polynomial trajectories that minimize control effort in multiple stages. Extensive simulation and real-world experimental results demonstrate that, compared to several state-of-the-art (SOTA) approaches, our method can generate high-quality trajectories with better performance in 3D cluttered environments.

Submitted 14 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.19804 [pdf]

Exploring Key Factors for Long-Term Vessel Incident Risk Prediction

Authors: Tianyi Chen, Hua Wang, Yutong Cai, Maohan Liang, Qiang Meng

Abstract: Factor analysis plays a pivotal role in enhancing maritime safety. Most previous studies conduct factor analysis within the framework of incident-related label prediction, where the developed models can be categorized into short-term and long-term prediction models. The long-term models offer a more strategic approach, enabling more proactive risk management, compared to the short-term ones. Nevertheless, few studies have been devoted to rigorously identifying the key factors for long-term prediction and undertaking comprehensive factor analysis. Hence, this study aims to delve into the key factors for predicting the incident risk levels in the subsequent year given a specific datestamp. The majority of candidate factors potentially contributing to the incident risk are collected from vessels' historical safety performance data spanning up to five years. An improved embedded feature selection method, which integrates a Random Forest classifier with a feature filtering process, is proposed to identify key risk-contributing factors from the candidate pool. The results demonstrate superior performance of the proposed method in incident prediction and factor interpretability. Comprehensive analysis is conducted upon the key factors, which could help maritime stakeholders formulate management strategies for incident prevention.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19645 [pdf, other]

A Landmark-aware Network for Automated Cobb Angle Estimation Using X-ray Images

Authors: Jie Yang, Jiankun Wang, Max Q. -H. Meng

Abstract: Automated Cobb angle estimation based on X-ray images plays an important role in scoliosis diagnosis, treatment, and progression surveillance. The inadequate feature extraction and the noise in X-ray images are the main difficulties of automated Cobb angle estimation, and it is challenging to ensure that the calculated Cobb angle meets clinical requirements. To address these problems, we propose a Landmark-aware Network named LaNet with three components, Feature Robustness Enhancement Module (FREM), Landmark-aware Objective Function (LOF), and Cobb Angle Calculation Method (CACM), for automated Cobb angle estimation in this paper. To enhance feature extraction, FREM is designed to explore geometric and semantic constraints among landmarks, thus geometric and semantic correlations between landmarks are globally modeled, and robust landmark-based features are extracted. Furthermore, to mitigate the effect of background noise on landmark localization, LOF is proposed to focus more on the foreground near the landmarks and ignore irrelevant background pixels by exploiting category prior information of landmarks. In addition, we also advance CACM to locate the bending segments first and then calculate the Cobb angle within the bending segment, which facilitates the calculation of the clinical standardized Cobb angle. The experiment results on the AASCE dataset demonstrate that our proposed LaNet can significantly improve the Cobb angle estimation performance and outperform other state-of-the-art methods.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.08553 [pdf, other]

Improving Transformers with Dynamically Composable Multi-Head Attention

Authors: Da Xiao, Qingye Meng, Shengping Li, Xingyuan Yuan

Abstract: Multi-Head Attention (MHA) is a key component of Transformer. In MHA, attention heads work independently, causing problems such as low-rank bottleneck of attention score matrices and head redundancy. We propose Dynamically Composable Multi-Head Attention (DCMHA), a parameter and computation efficient attention architecture that tackles the shortcomings of MHA and increases the expressive power of the model by dynamically composing attention heads. At the core of DCMHA is a $\it{Compose}$ function that transforms the attention score and weight matrices in an input-dependent way. DCMHA can be used as a drop-in replacement of MHA in any transformer architecture to obtain the corresponding DCFormer. DCFormer significantly outperforms Transformer on different architectures and model scales in language modeling, matching the performance of models with ~1.7x-2.0x compute. For example, DCPythia-6.9B outperforms open source Pythia-12B on both pretraining perplexity and downstream task evaluation. The code and models are available at https://github.com/Caiyun-AI/DCFormer.

Submitted 4 June, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted to the 41st International Conference on Machine Learning (ICML'24 oral)

arXiv:2405.05131 [pdf, other]

DenserRadar: A 4D millimeter-wave radar point cloud detector based on dense LiDAR point clouds

Authors: Zeyu Han, Junkai Jiang, Xiaokang Ding, Qingwen Meng, Shaobing Xu, Lei He, Jianqiang Wang

Abstract: The 4D millimeter-wave (mmWave) radar, with its robustness in extreme environments, extensive detection range, and capabilities for measuring velocity and elevation, has demonstrated significant potential for enhancing the perception abilities of autonomous driving systems in corner-case scenarios. Nevertheless, the inherent sparsity and noise of 4D mmWave radar point clouds restrict its further development and practical application. In this paper, we introduce a novel 4D mmWave radar point cloud detector, which leverages high-resolution dense LiDAR point clouds. Our approach constructs dense 3D occupancy ground truth from stitched LiDAR point clouds, and employs a specially designed network named DenserRadar. The proposed method surpasses existing probability-based and learning-based radar point cloud detectors in terms of both point cloud density and accuracy on the K-Radar dataset.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2404.18112 [pdf, other]

Garbage Segmentation and Attribute Analysis by Robotic Dogs

Authors: Nuo Xu, Jianfeng Liao, Qiwei Meng, Wei Song

Abstract: Efficient waste management and recycling heavily rely on garbage exploration and identification. In this study, we propose GSA2Seg (Garbage Segmentation and Attribute Analysis), a novel visual approach that utilizes quadruped robotic dogs as autonomous agents to address waste management and recycling challenges in diverse indoor and outdoor environments. Equipped with an advanced visual perception system, including visual sensors and instance segmentators, the robotic dogs adeptly navigate their surroundings, diligently searching for common garbage items. Inspired by open-vocabulary algorithms, we introduce an innovative method for object attribute analysis. By combining garbage segmentation and attribute analysis techniques, the robotic dogs accurately determine the state of the trash, including its position and placement properties. This information enhances the robotic arm's grasping capabilities, facilitating successful garbage retrieval. Additionally, we contribute an image dataset, named GSA2D, to support evaluation. Through extensive experiments on GSA2D, this paper provides a comprehensive analysis of GSA2Seg's effectiveness. Dataset available: https://www.kaggle.com/datasets/hellob/gsa2d-2024.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.14877 [pdf, other]

Combining Retrieval and Classification: Balancing Efficiency and Accuracy in Duplicate Bug Report Detection

Authors: Qianru Meng, Xiao Zhang, Guus Ramackers, Visser Joost

Abstract: In the realm of Duplicate Bug Report Detection (DBRD), conventional methods primarily focus on statically analyzing bug databases, often disregarding the running time of the model. In this context, complex models, despite their high accuracy potential, can be time-consuming, while more efficient models may compromise on accuracy. To address this issue, we propose a transformer-based system designed to strike a balance between time efficiency and accuracy performance. The existing methods primarily address it as either a retrieval or classification task. However, our hybrid approach leverages the strengths of both models. By utilizing the retrieval model, we can perform initial sorting to reduce the candidate set, while the classification model allows for more precise and accurate classification. In our assessment of commonly used models for retrieval and classification tasks, sentence BERT and RoBERTa outperform other baseline models in retrieval and classification, respectively. To provide a comprehensive evaluation of performance and efficiency, we conduct rigorous experimentation on five public datasets. The results reveal that our system maintains accuracy comparable to a classification model, significantly outperforming it in time efficiency and only slightly behind a retrieval model in time, thereby achieving an effective trade-off between accuracy and efficiency.

Submitted 23 April, 2024; originally announced April 2024.

Comments: In Proceedings of the Eighteenth International Conference on Software Engineering Advances (ICSEA 2023) (pp. 75-84). IARIA. ISBN: 978-1-68558-098-8. Valencia, Spain, November 13-17, 2023

arXiv:2404.14085 [pdf]

High efficient sunlight-driven CO2 hydrogenation to methanol over NiZn intermetallic catalysts under atmospheric pressure

Authors: Linjia Han, Fanqi Meng, Xianhua Bai, Qixuan Wu, Yanhong Luo, Jiangjian Shi, Yaguang Li, Dongmei Li, Qingbo Meng

Abstract: The synthesis of solar methanol through direct CO2 hydrogenation using solar energy is of great importance in advancing a sustainable energy economy. In this study, a non-precious NiZn intermetallic/ZnO catalyst is reported to catalyze the hydrogenation of CO2 to methanol using sunlight irradiation (1 sun). The NiZn-ZnO interface is identified as the active site to stabilize the key intermediates of HxCO*. At ambient pressure, the NiZn-ZnO catalyst demonstrates a methanol production rate of 127.5 μmol g⁻¹ h⁻¹ from solar-driven CO2 hydrogenation, with a remarkable 100% selectivity towards methanol in the total organic products. Notably, this production rate stands as the highest record for photothermic CO2 hydrogenation to methanol in continuous-flow reactors with sunlight as the only requisite energy input. This discovery not only paves the way for the development of novel catalysts for CO2 hydrogenation to methanol but also marks a significant stride towards fully solar-driven chemical energy storage.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.06901 [pdf]

Multi-interface engineering to realize all-solution processed highly efficient Kesterite solar cells

Authors: Licheng Lou, Kang Yin, Jinlin Wang, Yuan Li, Xiao Xu, Bowen Zhang, Menghan Jiao, Shudan Chen, Tan Guo, Jiangjian Shi, Huijue Wu, Yanhong Luo, Dongmei Li, Qingbo Meng

Abstract: With the rapid development of Kesterite Cu2ZnSn(S, Se)4 solar cells in the past few years, how to achieve a higher cost-performance ratio has become an important topic in the future development and industrialization of this technology. Herein, we demonstrate an all-solution route for the cell fabrication, in particular targeting the solution-processed window layer comprised of ZnO nanoparticles/Ag nanowires. A multi-interface engineering strategy assisted by organic polymers and molecules is explored to synergistically improve the film deposition, passivate the surface defects and facilitate the charge transfer. These efforts help us achieve high-performance and robust Kesterite solar cells at extremely low time and energy costs, with efficiency records of 14.37% and 13.12% being realized in rigid and flexible Kesterite solar cells, respectively. Our strategy here is also promising to be transplanted into other solar cells with similar geometric and energy band structures, helping reduce production costs and shorten the production cycle (i.e. increasing production capacity) of these photovoltaic industries.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.05974 [pdf]

Vacancy enhanced cation ordering enables >15% efficiency in Kesterite solar cells

Authors: **lin Wang, Licheng Lou, Kang Yin, Fanqi Meng, Xiao Xu, Menghan Jiao, Bowen Zhang, Jiangjian Shi, Huijue Wu, Yanhong Luo, Dongmei Li, Qingbo Meng

Abstract: Atomic disorder, a widespread problem in compound crystalline materials, is a critical factor affecting the performance of multi-chalcogenide Cu2ZnSn(S, Se)4 (CZTSSe) photovoltaic devices, which are known for their low cost and environmental friendliness. Cu-Zn disorder is particularly abundant in CZTSSe due to its extraordinarily low formation energy, inducing high-concentration deep defects and severe charge loss, while its regulation remains challenging due to the contradiction between disorder-order phase transition thermodynamics and atom-interchange kinetics. Herein, by introducing more vacancies in the CZTSSe surface, we explored a vacancy-assisted strategy to reduce the atom-interchange barrier limit and facilitate the Cu-Zn ordering kinetic process. The improvement in the Cu-Zn order degree has significantly reduced the charge loss in the device and helped us realize 15.4% (certified at 14.9%) and 13.5% efficiency (certified at 13.3%) in 0.27 cm2 and 1.1 cm2-area CZTSSe solar cells, respectively, thus bringing substantial advancement for emerging inorganic thin-film photovoltaics.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.01238 [pdf, other]

Mass Spectra of Full-Heavy and Double-Heavy Tetraquark States in the Conventional Quark Model

Authors: Qi Meng, Guang-Juan Wang, Makoto Oka

Abstract: The $S$-wave heavy tetraquark states with identical quarks and antiquarks, specifically $QQ{\bar Q'}\bar Q'$ ($Q, Q'=c,b$), $QQ\bar s\bar s$/$\bar Q\bar Q ss$, and $QQ\bar q\bar q$/$\bar Q\bar Q qq$ ($q=u,d$), are studied comprehensively in a unified constituent quark model. This model contains the one-gluon exchange and confinement potentials. The latter is modeled as the sum of all two-body linear potentials. We employ the Gaussian expansion method to solve the full four-body Schrödinger equations, and search for bound and resonant states using the complex-scaling method. We then identify $3$ bound and $62$ resonant states. The bound states are all $QQ\bar q\bar q$ states with the isospin and spin-parity quantum numbers $I(J^P)=0(1^+)$: two bound $bb\bar{q}\bar{q}$ states with binding energies of 153 MeV and 4 MeV below the $BB^*$ threshold, and a shallow $cc\bar{q}\bar{q}$ state at $-15$ MeV from the $DD^*$ threshold. The deeper $bb\bar q \bar q$ bound state aligns with the lattice QCD predictions, while the $cc\bar q\bar q$ bound state still has a much larger binding energy than the $T^+_{cc}$ recently observed by the LHCb collaboration. No bound states are identified for the $QQ\bar Q'\bar Q'$, $QQ\bar s\bar s$ and $QQ\bar q\bar q$ with $I=1$. Our analysis shows that the bound $QQ\bar Q'\bar Q'$ states are more probable with a larger mass ratio, $m_Q/m_{Q'}$. Experimental investigation of these states is desired, which will enrich our understanding of hadron spectroscopy and provide insights into the confinement mechanisms within tetraquarks.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 38 pages, 18 figures

Report number: KEK-TH-2611

arXiv:2404.00578 [pdf, other]

M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models

Authors: Fan Bai, Yuxin Du, Tiejun Huang, Max Q. -H. Meng, Bo Zhao

Abstract: Medical image analysis is essential to clinical diagnosis and treatment, which is increasingly supported by multi-modal large language models (MLLMs). However, previous research has primarily focused on 2D medical images, leaving 3D images under-explored, despite their richer spatial information. This paper aims to advance 3D medical image analysis with MLLMs. To this end, we present a large-scale 3D multi-modal medical dataset, M3D-Data, comprising 120K image-text pairs and 662K instruction-response pairs specifically tailored for various 3D medical tasks, such as image-text retrieval, report generation, visual question answering, positioning, and segmentation. Additionally, we propose M3D-LaMed, a versatile multi-modal large language model for 3D medical image analysis. Furthermore, we introduce a new 3D multi-modal medical benchmark, M3D-Bench, which facilitates automatic evaluation across eight tasks. Through comprehensive evaluation, our method proves to be a robust model for 3D medical image analysis, outperforming existing solutions. All code, data, and models are publicly available at: https://github.com/BAAI-DCAI/M3D.

Submitted 31 March, 2024; originally announced April 2024.

Comments: MLLM, 3D medical image analysis

arXiv:2404.00291 [pdf]

Gradient bandgap enables >13% efficiency sulfide Kesterite solar cells with open-circuit voltage over 800 mV

Authors: Kang Yin, **lin Wang, Licheng Lou, Xiao Xu, Bowen Zhang, Menghan Jiao, Jiangjian Shi, Dongmei Li, Huijue Wu, Yanhong Luo, Qingbo Meng

Abstract: Sulfide Kesterite Cu2ZnSnS4 (CZTS), a nontoxic and low-cost photovoltaic material, has long faced severe charge recombination and poor carrier transport, resulting in the cell efficiency record stagnating around 11% for years. Gradient bandgap is a promising approach to relieve these issues, but it has not been effectively realized in Kesterite solar cells due to the challenges in controlling the gradient distribution of alloying elements at high temperatures. Herein, targeting the Cd-alloyed CZTS, we propose a pre-crystallization strategy to reduce the intense vertical mass transport and rapid Cd diffusion in the film growth process, thereby realizing a front Cd-gradient CZTS absorber. The Cd-gradient CZTS absorber, exhibiting a downward-bending conduction band structure, has significantly enhanced the minority carrier transport and additionally improved the band alignment and interface property of the CZTS/CdS heterojunction. Ultimately, we have achieved a champion total-area efficiency of 13.5% (active-area efficiency: 14.1%) in the cell and in particular a high open-circuit voltage of >800 mV. We have also achieved a certified total-area cell efficiency of 13.16%, realizing a substantial step forward for the pure sulfide Kesterite solar cell.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2403.15146 [pdf, ps, other]

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond

Authors: Bohan Wang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Wei Chen

Abstract: This paper aims to clearly distinguish between Stochastic Gradient Descent with Momentum (SGDM) and Adam in terms of their convergence rates. We demonstrate that Adam achieves a faster convergence compared to SGDM under the condition of non-uniformly bounded smoothness. Our findings reveal that: (1) in deterministic environments, Adam can attain the known lower bound for the convergence rate of deterministic first-order optimizers, whereas the convergence rate of Gradient Descent with Momentum (GDM) has higher order dependence on the initial function value; (2) in the stochastic setting, Adam's convergence rate upper bound matches the lower bounds of stochastic first-order optimizers, considering both the initial function value and the final error, whereas there are instances where SGDM fails to converge with any learning rate. These insights distinctly differentiate Adam and SGDM regarding their convergence rates. Additionally, by introducing a novel stopping-time based technique, we further prove that if we consider the minimum gradient norm during iterations, the corresponding convergence rate can match the lower bounds across all problem hyperparameters. The technique can also help prove that Adam with a specific hyperparameter scheduler is parameter-agnostic, and hence can be of independent interest.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.13306 [pdf, other]

Thermal Hall effect driven by phonon-magnon hybridization in a honeycomb antiferromagnet

Authors: Qingkai Meng, Xiaokang Li, Lingxiao Zhao, Chao Dong, Zengwei Zhu, Kamran Behnia

Abstract: The underlying mechanism of the thermal Hall effect (THE) generated by phonons in a variety of insulators is yet to be identified. Here, we report on a sizeable thermal Hall conductivity in NiPS$_3$, a van der Waals stack of honeycomb layers with a zigzag antiferromagnetic order below $T_N$ = 155 K. The longitudinal ($κ_{aa}$) and the transverse ($κ_{ab}$) thermal conductivities peak at the same temperature and the thermal Hall angle ($κ_{ab}/κ_{aa}/B$) respects a previously identified bound. The amplitude of $κ_{ab}$ is extremely sensitive to the amplitude of magnetization along the $b$-axis, in contrast to the phonon mean free path, which is not at all. We show that the magnon and acoustic phonon bands cross each other along the $b^\ast$ orientation in the momentum space. The exponential temperature dependence of $κ_{ab}$ above its peak reveals an energy scale on the order of magnitude of the gap expected to be opened by magnon-phonon hybridization. This points to an intrinsic scenario for THE with possible relevance to other magnetic insulators.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 8 pages, 4 figures, Supplemental Materials included

arXiv:2403.08216 [pdf, other]

PaddingFlow: Improving Normalizing Flows with Padding-Dimensional Noise

Authors: Qinglong Meng, Chongkun Xia, Xueqian Wang

Abstract: Normalizing flow is a generative modeling approach with efficient sampling. However, flow-based models suffer from two issues: 1) If the target distribution is a manifold, flow-based models might perform badly due to the mismatch between the dimensions of the latent target distribution and the data distribution. 2) Discrete data might make flow-based models collapse into a degenerate mixture of point masses. To sidestep these two issues, we propose PaddingFlow, a novel dequantization method, which improves normalizing flows with padding-dimensional noise. To implement PaddingFlow, only the dimension of normalizing flows needs to be modified. Thus, our method is easy to implement and computationally cheap. Moreover, the padding-dimensional noise is only added to the padding dimension, which means PaddingFlow can dequantize without changing data distributions. Implementing existing dequantization methods requires changing data distributions, which might degrade performance. We validate our method on the main benchmarks of unconditional density estimation, including five tabular datasets and four image datasets for Variational Autoencoder (VAE) models, and on the Inverse Kinematics (IK) experiments, which are conditional density estimation. The results show that PaddingFlow can perform better in all experiments in this paper, which means PaddingFlow is widely suitable for various tasks. The code is available at: https://github.com/AdamQLMeng/PaddingFlow.

Submitted 23 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.01962 [pdf, other]

An Efficient Model-Based Approach on Learning Agile Motor Skills without Reinforcement

Authors: Haojie Shi, Tingguang Li, Qingxu Zhu, Jiapeng Sheng, Lei Han, Max Q. -H. Meng

Abstract: Learning-based methods have improved locomotion skills of quadruped robots through deep reinforcement learning. However, the sim-to-real gap and low sample efficiency still limit the skill transfer. To address this issue, we propose an efficient model-based learning framework that combines a world model with a policy network. We train a differentiable world model to predict future states and use it to directly supervise a Variational Autoencoder (VAE)-based policy network to imitate real animal behaviors. This significantly reduces the need for real interaction data and allows for rapid policy updates. We also develop a high-level network to track diverse commands and trajectories. Our simulated results show a tenfold sample efficiency increase compared to reinforcement learning methods such as PPO. In real-world testing, our policy achieves proficient command-following performance with only a two-minute data collection period and generalizes well to new speeds and paths.

Submitted 18 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: Accepted by ICRA2024

arXiv:2403.01510 [pdf, other]

doi 10.1109/TCSVT.2023.3306400

End-to-End Human Instance Matting

Authors: Qinglin Liu, Sheng** Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao

Abstract: Human instance matting aims to estimate an alpha matte for each human instance in an image, which is extremely challenging and has rarely been studied so far. Despite some efforts to use instance segmentation to generate a trimap for each instance and apply trimap-based matting methods, the resulting alpha mattes are often inaccurate due to inaccurate segmentation. In addition, this approach is computationally inefficient due to multiple executions of the matting method. To address these problems, this paper proposes a novel End-to-End Human Instance Matting (E2E-HIM) framework for simultaneous multiple instance matting in a more efficient manner. Specifically, a general perception network first extracts image features and decodes instance contexts into latent codes. Then, a united guidance network exploits spatial attention and semantics embedding to generate united semantics guidance, which encodes the locations and semantic correspondences of all instances. Finally, an instance matting network decodes the image features and united semantics guidance to predict all instance-level alpha mattes. In addition, we construct a large-scale human instance matting dataset (HIM-100K) comprising over 100,000 human images with instance alpha matte labels. Experiments on HIM-100K demonstrate the proposed E2E-HIM outperforms the existing methods on human instance matting with 50% lower errors and 5X faster speed (6 instances in a 640X640 image). Experiments on the PPM-100, RWP-636, and P3M datasets demonstrate that E2E-HIM also achieves competitive performance on traditional human matting.

Submitted 3 March, 2024; originally announced March 2024.

Journal ref: IEEE T-CSVT 2023

arXiv:2403.00325 [pdf, other]

Small, Versatile and Mighty: A Range-View Perception Framework

Authors: Qiang Meng, Xiao Wang, JiaBao Wang, Liujiang Yan, Ke Wang

Abstract: Despite its compactness and information integrity, the range view representation of LiDAR data is rarely the first choice for 3D perception tasks. In this work, we further push the envelope of the range-view representation with a novel multi-task framework, achieving unprecedented 3D detection performance. Our proposed Small, Versatile, and Mighty (SVM) network utilizes a pure convolutional architecture to fully unleash the efficiency and multi-tasking potential of the range view representation. To boost detection performance, we first propose a range-view specific Perspective Centric Label Assignment (PCLA) strategy, and a novel View Adaptive Regression (VAR) module to further refine hard-to-predict box properties. In addition, our framework seamlessly integrates semantic segmentation and panoptic segmentation tasks for the LiDAR point cloud, without extra modules. Among range-view-based methods, our model achieves new state-of-the-art detection performance on the Waymo Open Dataset. In particular, an improvement of over 10 mAP over convolutional counterparts can be obtained on the vehicle class. Our presented results for other tasks further reveal the multi-task capabilities of the proposed small but mighty framework.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.11984 [pdf, other]

Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks

Authors: Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Di He, Zhouchen Lin

Abstract: Neuromorphic computing with spiking neural networks is promising for energy-efficient artificial intelligence (AI) applications. However, unlike humans, who continually learn different tasks in a lifetime, neural network models suffer from catastrophic forgetting. How neuronal operations could solve this problem is an important question for AI and neuroscience. Many previous studies draw inspiration from observed neuroscience phenomena and propose episodic replay or synaptic metaplasticity, but they are not guaranteed to explicitly preserve knowledge for neuron populations. Other works focus on machine learning methods with more mathematical grounding, e.g., orthogonal projection on high dimensional spaces, but there is no neural correspondence for neuromorphic computing. In this work, we develop a new method with neuronal operations based on lateral connections and Hebbian learning, which can protect knowledge by projecting activity traces of neurons into an orthogonal subspace so that synaptic weight updates will not interfere with old tasks. We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities and enable orthogonal projection. This provides new insights into how neural circuits and Hebbian learning can help continual learning, and also how the concept of orthogonal projection can be realized in neuronal systems. Our method is also flexible enough to utilize arbitrary training methods based on presynaptic activities/traces. Experiments show that our method consistently solves forgetting for spiking neural networks with nearly zero forgetting under various supervised training methods with different error propagation approaches, and outperforms previous approaches under various settings. Our method can pave a solid path for building continual neuromorphic computing systems.

Submitted 19 February, 2024; originally announced February 2024.

Comments: Accepted by ICLR 2024

arXiv:2402.11212 [pdf, ps, other]

Equivariant (co)module nuclearity of $C^*$-crossed products

Authors: Massoud Amini, Qing Meng

Abstract: We define equivariant and equicovariant versions of the notion of module nuclearity. More precisely, for a discrete group $Γ$ and operator $\mathcal A$-$Γ$-(co)module $\mathcal B$, $\mathcal E$ over a $Γ$-C$^*$-algebra $\mathcal A$, we define $\mathcal E$-$Γ$-nuclearity of $\mathcal B$, as an equivariant version of the notion of $\mathcal E$-nuclearity, in which the identity map on $\mathcal B$ is required to be approximately factored through matrix algebras on $\mathcal E$ with module structures coming both from the original module structure of $\mathcal E$ and the $Γ$-action on $\mathcal E$. For trivial actions of $Γ$, this is shown to reduce to the notion of module nuclearity, introduced and studied by the first author. As a concrete example, for a discrete group $Γ$ acting amenably on a unital C$^*$-algebra $\mathcal A$, we show that the reduced crossed product $\mathcal A\rtimes_{r} Γ$ is $\mathcal A$-$Γ$-nuclear. Conversely, if $\mathcal A$ is a nuclear C$^*$-algebra with a $Γ$-invariant state $ρ$ and $\mathcal A\rtimes_{r} Γ$ is $\mathcal A$-$Γ$-nuclear, then we deduce that $Γ$ is amenable. We show that when $\mathcal A\rtimes_{r} Γ$ is $\mathcal A$-$Γ$-nuclear and $\mathcal A$ has the completely bounded approximation property (resp., is exact), then so is $\mathcal A\rtimes_{r} Γ$. We prove similar results for $\mathcal A\rtimes_{r} Γ$, regarded as an $\mathcal A$-$Γ$-comodule.

Submitted 17 February, 2024; originally announced February 2024.

MSC Class: 46L05; 46L55

arXiv:2401.09819 [pdf, other]

PPNet: A Two-Stage Neural Network for End-to-end Path Planning

Authors: Qinglong Meng, Chongkun Xia, Xueqian Wang, Song** Mai, Bin Liang

Abstract: Classical path planners, such as sampling-based path planners, can provide probabilistic completeness guarantees in the sense that the probability that the planner fails to return a solution, if one exists, decays to zero as the number of samples approaches infinity. However, finding a near-optimal feasible solution in a given period is challenging in many applications such as autonomous vehicles. To achieve an end-to-end near-optimal path planner, we first divide the path planning problem into two subproblems, which are path space segmentation and waypoint generation in the given path's space. We further propose a two-stage neural network named Path Planning Network (PPNet), in which each stage solves one of the aforementioned subproblems. Moreover, we propose a novel efficient data generation method for path planning named EDaGe-PP. EDaGe-PP can generate data with continuous-curvature paths with analytical expression while satisfying the clearance requirement. The results show the total computation time of generating random 2D path planning data is less than 1/33 and the success rate of PPNet trained by the dataset that is generated by EDaGe-PP is about 2 times compared to other methods. We validate PPNet against state-of-the-art path planning methods. The results show that PPNet can find a near-optimal solution in 15.3ms, which is much shorter than the state-of-the-art path planners.

Submitted 23 April, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.08433 [pdf, other]

Autonomous Multiple-Trolley Collection System with Nonholonomic Robots: Design, Control, and Implementation

Authors: Peijia Xie, Bingyi Xia, Anjun Hu, Ziqi Zhao, Lingxiao Meng, Zhirui Sun, Xuheng Gao, Jiankun Wang, Max Q. -H. Meng

Abstract: The intricate and multi-stage task in dynamic public spaces like luggage trolley collection in airports presents both a promising opportunity and an ongoing challenge for automated service robots. Previous research has primarily focused on handling a single trolley or individual functional components, creating a gap in providing cost-effective and efficient solutions for practical scenarios. In this paper, we propose a mobile manipulation robot incorporated with an autonomy framework for the collection and transportation of multiple trolleys that can significantly enhance operational efficiency. We address the key challenges in the trolley collection problem through the novel design of the mechanical system and the vision-based control strategy. We design a lightweight manipulator and docking mechanism, optimized for the sequential stacking and transportation of multiple trolleys. Additionally, based on the Control Lyapunov Function and Control Barrier Function, we propose a novel vision-based control with the online Quadratic Programming which significantly improves the accuracy and efficiency of the collection process. The practical application of our system is demonstrated in real world scenarios, where it successfully executes multiple-trolley collection tasks.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2312.17076 [pdf, other]

Minimally-intrusive Navigation in Dense Crowds with Integrated Macro and Micro-level Dynamics

Authors: Tong Zhou, Senmao Qi, Guangdu Cen, Ziqi Zha, Erli Lyu, Jiaole Wang, Max Q. -H. Meng

Abstract: In mobile robot navigation, despite advancements, the generation of optimal paths often disrupts pedestrian areas. To tackle this, we propose three key contributions to improve human-robot coexistence in shared spaces. Firstly, we have established a comprehensive framework to understand disturbances at individual and flow levels. Our framework provides specialized computational strategies for in-depth studies of human-robot interactions from both micro and macro perspectives. By employing novel penalty terms, namely Flow Disturbance Penalty (FDP) and Individual Disturbance Penalty (IDP), our framework facilitates a more nuanced assessment and analysis of the robot navigation's impact on pedestrians. Secondly, we introduce an innovative sampling-based navigation system that adeptly integrates a suite of safety measures with the predictability of robotic movements. This system not only accounts for traditional factors such as trajectory length and travel time but also actively incorporates pedestrian awareness. Our navigation system aims to minimize disturbances and promote harmonious coexistence by considering safety protocols, trajectory clarity, and pedestrian engagement. Lastly, we validate our algorithm's effectiveness and real-time performance through simulations and real-world tests, demonstrating its ability to navigate with minimal pedestrian disturbance in various environments.

Submitted 28 December, 2023; originally announced December 2023.

Comments: 23 pages, 13 figures

MSC Class: 68T40 ACM Class: I.2.9

arXiv:2312.06957 [pdf, other]

Online Saddle Point Problem and Online Convex-Concave Optimization

Authors: Qing-xin Meng, Jian-wei Liu

Abstract: Centered around solving the Online Saddle Point problem, this paper introduces the Online Convex-Concave Optimization (OCCO) framework, which involves a sequence of two-player time-varying convex-concave games. We propose the generalized duality gap (Dual-Gap) as the performance metric and establish the parallel relationship between OCCO with Dual-Gap and Online Convex Optimization (OCO) with regret. To demonstrate the natural extension of OCCO from OCO, we develop two algorithms, the implicit online mirror descent-ascent and its optimistic variant. Analysis reveals that their duality gaps share similar expression forms with the corresponding dynamic regrets arising from implicit updates in OCO. Empirical results further substantiate the effectiveness of our algorithms. Simultaneously, we unveil that the dynamic Nash equilibrium regret, which was initially introduced in a recent paper, has inherent defects.

Submitted 15 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: Add Remark 8 and Section 6

arXiv:2311.14361 [pdf]

doi 10.1093/nsr/nwad336

Deciphering and integrating invariants for neural operator learning with various physical mechanisms

Authors: Rui Zhang, Qi Meng, Zhi-Ming Ma

Abstract: Neural operators have been explored as surrogate models for simulating physical systems to overcome the limitations of traditional partial differential equation (PDE) solvers. However, most existing operator learning methods assume that the data originate from a single physical mechanism, limiting their applicability and performance in more realistic scenarios. To this end, we propose Physical Invariant Attention Neural Operator (PIANO) to decipher and integrate the physical invariants (PI) for operator learning from the PDE series with various physical mechanisms. PIANO employs self-supervised learning to extract physical knowledge and attention mechanisms to integrate them into dynamic convolutional layers. Compared to existing techniques, PIANO can reduce the relative error by 13.6\%-82.2\% on PDE forecasting tasks across varying coefficients, forces, or boundary conditions. Additionally, varied downstream tasks reveal that the PI embeddings deciphered by PIANO align well with the underlying invariants in the PDE systems, verifying the physical significance of PIANO. The source code will be publicly available at: https://github.com/optray/PIANO.

Submitted 12 February, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

arXiv:2311.03954 [pdf]

doi 10.1038/s41467-024-48850-9

Defect Regulation by Palladium Incorporation towards Grain Boundaries of Kesterite solar cells

Authors: Jinlin Wang, Jiangjian Shi, Kang Yin, Fanqi Meng, Shanshan Wang, Licheng Lou, Jiazheng Zhou, Xiao Xu, Huijue Wu, Yanhong Luo, Dongmei Li, Shiyou Chen, Qingbo Meng

Abstract: Kesterite Cu2ZnSn(S, Se)4 (CZTSSe) solar cell has emerged as one of the most promising candidates for thin-film photovoltaics. However, severe charge losses occurring at the grain boundaries (GBs) of Kesterite polycrystalline absorbers has hindered the improvement of cell performance. Herein, we report a redox reaction strategy involving palladium (Pd) to eliminate atomic vacancy defects such as VSn and VSe in GBs of the Kesterite absorbers. We demonstrate that PdSex compounds could form during the selenization process and distribute at the GBs and the absorber surfaces; thereby aid in the suppression of Sn and Se volatilization loss and inhibiting the formation of VSn and VSe defects. Furthermore, Pd(II)/Pd(IV) serves as a redox shuttle, i.e., on one hand, Pd(II) captures Se vapor from the reaction environment to produce PdSe2, on the other hand, PdSe2 provides Se atoms to the Kesterite absorber by being reduced to PdSe, thus contributing to the elimination of pre-existing VSe defects within GBs. These effects collectively reduce defects and enhance the p-type characteristics of the Kesterite absorber, leading to a significant reduction in charge recombination loss within the cell. As a result, high-performance Kesterite solar cells with a total-area efficiency of 14.5% have been achieved. This remarkable efficiency increase benefited from the redox reaction strategy offers a promising avenue for the precise regulation of defects in Kesterite solar cells and holds generally significant implications for the exploration of various other photovoltaic devices.

Submitted 7 November, 2023; originally announced November 2023.

Journal ref: Nature Communications 2024, 15, 4344

arXiv:2310.13195 [pdf, ps, other]

A Class of Forward-Backward Stochastic Differential Equations Driven by Lévy Processes and Application to LQ Problems

Authors: Maozhong Xu, Maoning Tang, Qingxin Meng

Abstract: In this paper, our primary focus lies in the thorough investigation of a specific category of nonlinear fully coupled forward-backward stochastic differential equations involving time delays and advancements with the incorporation of Lévy processes, which we shall abbreviate as FBSDELDAs. Drawing inspiration from diverse examples of linear-quadratic (LQ) optimal control problems featuring delays and Lévy processes, we proceed to employ a set of domination-monotonicity conditions tailored to this class of FBSDELDAs. Through the application of the continuation method, we achieve the pivotal results of unique solvability and the derivation of a pair of estimates for the solutions of these FBSDELDAs. These findings, in turn, carry significant implications for a range of LQ problems. Specifically, they are relevant when stochastic Hamiltonian systems perfectly align with the FBSDELDAs that fulfill the domination-monotonicity conditions. Consequently, we are able to establish explicit expressions for the unique optimal controls by utilizing the solutions of the corresponding stochastic Hamiltonian systems.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.09937 [pdf, other]

Joint Sparse Representations and Coupled Dictionary Learning in Multi-Source Heterogeneous Image Pseudo-color Fusion

Authors: Long Bai, Shilong Yao, Kun Gao, Yanjun Huang, Ruijie Tang, Hong Yan, Max Q. -H. Meng, Hongliang Ren

Abstract: Considering that Coupled Dictionary Learning (CDL) method can obtain a reasonable linear mathematical relationship between resource images, we propose a novel CDL-based Synthetic Aperture Radar (SAR) and multispectral pseudo-color fusion method. Firstly, the traditional Brovey transform is employed as a pre-processing method on the paired SAR and multispectral images. Then, CDL is used to capture the correlation between the pre-processed image pairs based on the dictionaries generated from the source images via enforced joint sparse coding. Afterward, the joint sparse representation in the pair of dictionaries is utilized to construct an image mask via calculating the reconstruction errors, and therefore generate the final fusion image. The experimental verification results of the SAR images from the Sentinel-1 satellite and the multispectral images from the Landsat-8 satellite show that the proposed method can achieve superior visual effects, and excellent quantitative performance in terms of spectral distortion, correlation coefficient, MSE, NIQE, BRISQUE, and PIQE.

Submitted 15 October, 2023; originally announced October 2023.

Comments: To appear in IEEE Sensors Journal

arXiv:2310.04675 [pdf, other]

Terrain-Aware Quadrupedal Locomotion via Reinforcement Learning

Authors: Haojie Shi, Qingxu Zhu, Lei Han, Wanchao Chi, Tingguang Li, Max Q. -H. Meng

Abstract: In nature, legged animals have developed the ability to adapt to challenging terrains through perception, allowing them to plan safe body and foot trajectories in advance, which leads to safe and energy-efficient locomotion. Inspired by this observation, we present a novel approach to train a Deep Neural Network (DNN) policy that integrates proprioceptive and exteroceptive states with a parameterized trajectory generator for quadruped robots to traverse rough terrains. Our key idea is to use a DNN policy that can modify the parameters of the trajectory generator, such as foot height and frequency, to adapt to different terrains. To encourage the robot to step on safe regions and save energy consumption, we propose foot terrain reward and lifting foot height reward, respectively. By incorporating these rewards, our method can learn a safer and more efficient terrain-aware locomotion policy that can move a quadruped robot flexibly in any direction. To evaluate the effectiveness of our approach, we conduct simulation experiments on challenging terrains, including stairs, stepping stones, and poles. The simulation results demonstrate that our approach can successfully direct the robot to traverse such tough terrains in any direction. Furthermore, we validate our method on a real legged robot, which learns to traverse stepping stones with gaps over 25.5cm.

Submitted 10 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2309.15079 [pdf, other]

Towards High Efficient Long-horizon Planning with Expert-guided Motion-encoding Tree Search

Authors: Tong Zhou, Erli Lyu, Jiaole Wang, Guangdu Cen, Ziqi Zha, Senmao Qi, Max Q. -H. Meng

Abstract: Autonomous driving holds promise for increased safety, optimized traffic management, and a new level of convenience in transportation. While model-based reinforcement learning approaches such as MuZero enables long-term planning, the exponentially increase of the number of search nodes as the tree goes deeper significantly effect the searching efficiency. To deal with this problem, in this paper we proposed the expert-guided motion-encoding tree search (EMTS) algorithm. EMTS extends the MuZero algorithm by representing possible motions with a comprehensive motion primitives latent space and incorporating expert policies to improve the searching efficiency. The comprehensive motion primitives latent space enables EMTS to sample arbitrary trajectories instead of raw action to reduce the depth of the search tree. And the incorporation of expert policies guided the search and training phases the EMTS algorithm to enable early convergence. In the experiment section, the EMTS algorithm is compared with other four algorithms in three challenging scenarios. The experiment result verifies the effectiveness and the searching efficiency of the proposed EMTS algorithm.

Submitted 30 September, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: 7 pages, 5 figures

MSC Class: 68T40 ACM Class: I.2.9

arXiv:2309.14306 [pdf, other]

DeepMesh: Mesh-based Cardiac Motion Tracking using Deep Learning

Authors: Qingjie Meng, Wenjia Bai, Declan P O'Regan, and Daniel Rueckert

Abstract: 3D motion estimation from cine cardiac magnetic resonance (CMR) images is important for the assessment of cardiac function and the diagnosis of cardiovascular diseases. Current state-of-the art methods focus on estimating dense pixel-/voxel-wise motion fields in image space, which ignores the fact that motion estimation is only relevant and useful within the anatomical objects of interest, e.g., the heart. In this work, we model the heart as a 3D mesh consisting of epi- and endocardial surfaces. We propose a novel learning framework, DeepMesh, which propagates a template heart mesh to a subject space and estimates the 3D motion of the heart mesh from CMR images for individual subjects. In DeepMesh, the heart mesh of the end-diastolic frame of an individual subject is first reconstructed from the template mesh. Mesh-based 3D motion fields with respect to the end-diastolic frame are then estimated from 2D short- and long-axis CMR images. By developing a differentiable mesh-to-image rasterizer, DeepMesh is able to leverage 2D shape information from multiple anatomical views for 3D mesh reconstruction and mesh motion estimation. The proposed method estimates vertex-wise displacement and thus maintains vertex correspondences between time frames, which is important for the quantitative assessment of cardiac function across different subjects and populations. We evaluate DeepMesh on CMR images acquired from the UK Biobank. We focus on 3D motion estimation of the left ventricle in this work. Experimental results show that the proposed method quantitatively and qualitatively outperforms other image-based and mesh-based cardiac motion tracking methods.

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.13813 [pdf, other]

Efficient RRT*-based Safety-Constrained Motion Planning for Continuum Robots in Dynamic Environments

Authors: Peiyu Luo, Shilong Yao, Yiyao Yue, Jiankun Wang, Hong Yan, Max Q. -H. Meng

Abstract: Continuum robots, characterized by their high flexibility and infinite degrees of freedom (DoFs), have gained prominence in applications such as minimally invasive surgery and hazardous environment exploration. However, the intrinsic complexity of continuum robots requires a significant amount of time for their motion planning, posing a hurdle to their practical implementation. To tackle these challenges, efficient motion planning methods such as Rapidly Exploring Random Trees (RRT) and its variant, RRT*, have been employed. This paper introduces a unique RRT*-based motion control method tailored for continuum robots. Our approach embeds safety constraints derived from the robots' posture states, facilitating autonomous navigation and obstacle avoidance in rapidly changing environments. Simulation results show efficient trajectory planning amidst multiple dynamic obstacles and provide a robust performance evaluation based on the generated postures. Finally, preliminary tests were conducted on a two-segment cable-driven continuum robot prototype, confirming the effectiveness of the proposed planning approach. This method is versatile and can be adapted and deployed for various types of continuum robots through parameter adjustments.

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.12660 [pdf, ps, other]

Disturbance Rejection Control for Autonomous Trolley Collection Robots with Prescribed Performance

Authors: Rui-Dong Xi, Liang Lu, Xue Zhang, Xiao Xiao, Bingyi Xia, Jiankun Wang, Max Q. -H. Meng

Abstract: Trajectory tracking control of autonomous trolley collection robots (ATCR) is an ambitious work due to the complex environment, serious noise and external disturbances. This work investigates a control scheme for ATCR subjecting to severe environmental interference. A kinematics model based adaptive sliding mode disturbance observer with fast convergence is first proposed to estimate the lumped disturbances. On this basis, a robust controller with prescribed performance is proposed using a backstepping technique, which improves the transient performance and guarantees fast convergence. Simulation outcomes have been provided to illustrate the effectiveness of the proposed control scheme.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.12587 [pdf, ps, other]

Coverage Dependent H$_2$ Desorption Energy: a Quantitative Explanation Based on Encounter Desorption Mechanism

Authors: Qingkuan Meng, Qiang Chang, Gang Zhao, Donghui Quan, Masashi Tsuge, Xia Zhang, Yong Zhang, Xiao-Hu Li

Abstract: Recent experiments show that the desorption energy of H$_2$ on a diamond-like carbon (DLC) surface depends on the H$_2$ coverage of the surface. We aim to quantitatively explain the coverage dependent H$_2$ desorption energy measured by the experiments. We derive a math formula to calculate an effective H$_2$ desorption energy based on the encounter desorption mechanism. The effective H$_2$ desorption energy depends on two key parameters, the desorption energy of H$_2$ on H$_2$ substrate and the ratio of H$_2$ diffusion barrier to its desorption energy. The calculated effective H$_2$ desorption energy qualitatively agrees with the coverage dependent H$_2$ desorption energy measured by the experiments if the values of these two parameters in literature are used in the calculations. We argue that the difference between the effective H$_2$ desorption energy and the experimental results is due to the lacking of knowledge about these two parameters. So, we recalculate these two parameters based on experimental data. Good agreement between theoretical and experimental results can be achieved if these two updated parameters are used in the calculations.

Submitted 21 September, 2023; originally announced September 2023.

Comments: 6 pages,6 figures,2 tables, accepted for publication in MNRAS

arXiv:2309.11107 [pdf, other]

Indoor Exploration and Simultaneous Trolley Collection Through Task-Oriented Environment Partitioning

Authors: Junjie Gao, Peijia Xie, Xuheng Gao, Zhirui Sun, Jiankun Wang, Max Q. -H. Meng

Abstract: In this paper, we present a simultaneous exploration and object search framework for the application of autonomous trolley collection. For environment representation, a task-oriented environment partitioning algorithm is presented to extract diverse information for each sub-task. First, LiDAR data is classified as potential objects, walls, and obstacles after outlier removal. Segmented point clouds are then transformed into a hybrid map with the following functional components: object proposals to avoid missing trolleys during exploration; room layouts for semantic space segmentation; and polygonal obstacles containing geometry information for efficient motion planning. For exploration and simultaneous trolley collection, we propose an efficient exploration-based object search method. First, a traveling salesman problem with precedence constraints (TSP-PC) is formulated by grouping frontiers and object proposals. The next target is selected by prioritizing object search while avoiding excessive robot backtracking. Then, feasible trajectories with adequate obstacle clearance are generated by topological graph search. We validate the proposed framework through simulations and demonstrate the system with real-world autonomous trolley collection tasks.

Submitted 20 September, 2023; originally announced September 2023.

arXiv:2309.09706 [pdf, other]

Dislocations with corners in an elastic body with applications to fault detection

Authors: Huaian Diao, Hongyu Liu, Qingle Meng

Abstract: This paper focuses on an elastic dislocation problem that is motivated by applications in the geophysical and seismological communities. In our model, the displacement satisfies the Lamé system in a bounded domain with a mixed homogeneous boundary condition. We also allow the occurrence of discontinuities in both the displacement and traction fields on the fault curve/surface. By the variational approach, we first prove the well-posedness of the direct dislocation problem in a rather general setting with the Lamé parameters being real-valued $L^\infty$ functions and satisfy the strong convexity condition. Next, by considering the scenario that the Lamé parameters are constant and the fault curve/surface possesses certain corner singularities, we establish a local characterisation of the slip vectors at the corner points over the dislocation curve/surface. In our study the dislocation is geometrically rather general and may be open or closed. For both cases, we establish the uniqueness results for the inverse problem of determining the dislocation curve/surface and the slips.

Submitted 8 November, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

arXiv:2308.14667 [pdf]

Neural Network-Based Histologic Remission Prediction In Ulcerative Colitis

Authors: Yemin Li, Zhongcheng Liu, Xiaoying Lou, Mirigual Kurban, Miao Li, Jie Yang, Kaiwei Che, Jiankun Wang, Max Q. -H. Meng, Yan Huang, Qin Guo, Pinjin Hu

Abstract: BACKGROUND & AIMS: Histological remission (HR) is advocated and considered as a new therapeutic target in ulcerative colitis (UC). Diagnosis of histologic remission currently relies on biopsy; during this process, patients are at risk for bleeding, infection, and post-biopsy fibrosis. In addition, histologic response scoring is complex and time-consuming, and there is heterogeneity among pathologists. Endocytoscopy (EC) is a novel ultra-high magnification endoscopic technique that can provide excellent in vivo assessment of glands. Based on the EC technique, we propose a neural network model that can assess histological disease activity in UC using EC images to address the above issues. The experiment results demonstrate that the proposed method can assist patients in precise treatment and prognostic assessment. METHODS: We construct a neural network model for UC evaluation. A total of 5105 images of 154 intestinal segments from 87 patients undergoing EC treatment at a center in China between March 2022 and March 2023 are scored according to the Geboes score. Subsequently, 103 intestinal segments are used as the training set, 16 intestinal segments are used as the validation set for neural network training, and the remaining 35 intestinal segments are used as the test set to measure the model performance together with the validation set. RESULTS: By treating HR as a negative category and histologic activity as a positive category, the proposed neural network model can achieve an accuracy of 0.9, a specificity of 0.95, a sensitivity of 0.75, and an area under the curve (AUC) of 0.81. CONCLUSION: We develop a specific neural network model that can distinguish histologic remission/activity in EC images of UC, which helps to accelerate clinical histological diagnosis. keywords: ulcerative colitis; Endocytoscopy; Geboes score; neural network.

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2308.10683 [pdf, other]

doi 10.1093/mnras/stad2595

Variability, polarimetry, and timing properties of single pulses from PSR J2222-0137 using FAST

Authors: X. L. Miao, W. W. Zhu, M. Kramer, P. C. C. Freire, L. Shao, M. Yuan, L. Q. Meng, Z. W. Wu, C. C. Miao, Y. J. Guo, D. J. Champion, E. Fonseca, J. M. Yao, M. Y. Xue, J. R. Niu, H. Hu, C. M. Zhang

Abstract: In our work, we analyse $5\times10^{4}$ single pulses from the recycled pulsar PSR J2222$-$0137 in one of its scintillation maxima observed by the Five-hundred-meter Aperture Spherical radio Telescope (FAST). PSR J2222$-$0137 is one of the nearest and best studies of binary pulsars and a unique laboratory for testing gravitational theories. We report single pulses' energy distribution and polarization from the pulsar's main-pulse region. The single pulse energy follows the log-normal distribution. We resolve a steep polarization swing, but at the current time resolution ($64\,μ{\rm s}$), we find no evidence for the orthogonal jump in the main-pulse region, as has been suspected. We find a potential sub-pulse drifting period of $P_{3} \sim 3.5\,P$. We analyse the jitter noise from different integrated numbers of pulses and find that its $σ_{j}$ is $270\pm{9}\,{\rm ns}$ for 1-hr integration at 1.25 GHz. This result is useful for optimizing future timing campaigns with FAST or other radio telescopes.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 11 pages, 14 figures, accepted by Monthly Notices of the Royal Astronomical Society

Journal ref: MNRAS 526 (2023) 2156

arXiv:2308.05466 [pdf, other]

Doubly heavy tetraquarks including one-pion exchange potential

Authors: Qi Meng, Emiko Hiyama, Makoto Oka, Atsushi Hosaka, Chang Xu

Abstract: Spectrum of the doubly heavy tetraquarks is studied in a constituent quark model including one-pion exchange (OPE) potential. Central and tensor forces induced by OPE between two light quarks are considered. Our results show that $I(J^P)=0(1^+)$ compact bound states are shifted up because of the repulsive central force between $\bar{q}\bar{q}$. This effect possibly leads to the small binding energy in $T_{cc}$. In addition, a $I(J^P)=1(1^+)$ resonant state is reported with $ E=10641 \ \rm{MeV},Γ=15 \ \rm{MeV}$ and $ E=10640 \ \rm{MeV},Γ=15 \ \rm{MeV}$, without and with including OPE potential, respectively. The repulsive central force and attractive tensor force almost cancel with each other and leave a small energy difference when OPE potential is included.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 7 pages, 10 figures, submitted to Physics Letters B

arXiv:2308.05137 [pdf, other]

Discrepancy-based Active Learning for Weakly Supervised Bleeding Segmentation in Wireless Capsule Endoscopy Images

Authors: Fan Bai, Xiaohan Xing, Yutian Shen, Han Ma, Max Q. -H. Meng

Abstract: Weakly supervised methods, such as class activation maps (CAM) based, have been applied to achieve bleeding segmentation with low annotation efforts in Wireless Capsule Endoscopy (WCE) images. However, the CAM labels tend to be extremely noisy, and there is an irreparable gap between CAM labels and ground truths for medical images. This paper proposes a new Discrepancy-basEd Active Learning (DEAL) approach to bridge the gap between CAMs and ground truths with a few annotations. Specifically, to liberate labor, we design a novel discrepancy decoder model and a CAMPUS (CAM, Pseudo-label and groUnd-truth Selection) criterion to replace the noisy CAMs with accurate model predictions and a few human labels. The discrepancy decoder model is trained with a unique scheme to generate standard, coarse and fine predictions. And the CAMPUS criterion is proposed to predict the gaps between CAMs and ground truths based on model divergence and CAM divergence. We evaluate our method on the WCE dataset and results show that our method outperforms the state-of-the-art active learning methods and reaches comparable performance to those trained with full annotated datasets with only 10% of the training data labeled.

Submitted 9 August, 2023; originally announced August 2023.

Comments: accepted by MICCAI 2022

arXiv:2308.04911 [pdf, other]

SLPT: Selective Labeling Meets Prompt Tuning on Label-Limited Lesion Segmentation

Authors: Fan Bai, Ke Yan, Xiaoyu Bai, Xinyu Mao, Xiaoli Yin, Jingren Zhou, Yu Shi, Le Lu, Max Q. -H. Meng

Abstract: Medical image analysis using deep learning is often challenged by limited labeled data and high annotation costs. Fine-tuning the entire network in label-limited scenarios can lead to overfitting and suboptimal performance. Recently, prompt tuning has emerged as a more promising technique that introduces a few additional tunable parameters as prompts to a task-agnostic pre-trained model, and updates only these parameters using supervision from limited labeled data while keeping the pre-trained model unchanged. However, previous work has overlooked the importance of selective labeling in downstream tasks, which aims to select the most valuable downstream samples for annotation to achieve the best performance with minimum annotation cost. To address this, we propose a framework that combines selective labeling with prompt tuning (SLPT) to boost performance in limited labels. Specifically, we introduce a feature-aware prompt updater to guide prompt tuning and a TandEm Selective LAbeling (TESLA) strategy. TESLA includes unsupervised diversity selection and supervised selection using prompt-based uncertainty. In addition, we propose a diversified visual prompt tuning strategy to provide multi-prompt-based discrepant predictions for TESLA. We evaluate our method on liver tumor segmentation and achieve state-of-the-art performance, outperforming traditional fine-tuning with only 6% of tunable parameters, also achieving 94% of full-data performance by labeling only 5% of the data.

Submitted 9 August, 2023; originally announced August 2023.

Comments: accepted by MICCAI 2023

Showing 1–50 of 306 results for author: Meng, Q