Search | arXiv e-print repository

doi 10.1109/TGRS.2024.3412286

A Deep Learning-Augmented Stand-off Radar Scheme for Rapidly Detecting Tree Defects

Authors: Jiwei Qian, Yee Hui Lee, Kaixuan Cheng, Qiqi Dai, Mohamed Lokman Mohd Yusof, Daryl Lee, Abdulkadir C. Yucel

Abstract: Tree defect detection is crucial for the structural health screening of trees. Existing nondestructive testing (NDT) techniques for tree defect detection require time-consuming and labor-intensive measurement campaigns. This discourages their application for the routine structural health screening of whole populations of managed urban trees. To address this issue, this study proposes a deep-learning-augmented stand-off radar scheme for contactless scanning of tree trunks and rapid detection of tree defects. In this scheme, the antenna is moved along a straight trajectory at a distance from the tree trunk to obtain the trunk's B-scan. The obtained raw B-scan is then processed by a signal-processing framework specifically developed for revealing the scattering signatures of defects in the B-scan, which achieves a 30 dB and 22 dB increase in the signal-to-clutter and noise ratio of the measurement data of tree trunk samples and living trees, respectively. Finally, the processed B-scan is input into a multilevel feature fusion neural network designed to extract the signature of the defect in the processed B-scan in real time. The developed scheme's applications to the detection of defects in real fresh-cut tree trunks show that the stand-off radar scheme can detect tree defects with 96% accuracy. This stand-off radar scheme is the first contactless NDT technique for tree defect detection that operates on a straight trajectory, and it can potentially be integrated into the routine tree inspection workflow that is part of urban tree management.

Submitted 8 June, 2024; originally announced June 2024.

Comments: Accepted and to be published in IEEE Transactions on Geoscience and Remote Sensing

arXiv:2403.00381 [pdf, other]

Structured Deep Neural Networks-Based Backstepping Trajectory Tracking Control for Lagrangian Systems

Authors: Jiajun Qian, Liang Xu, Xiaoqiang Ren, Xiaofan Wang

Abstract: Deep neural networks (DNN) are increasingly being used to learn controllers due to their excellent approximation capabilities. However, their black-box nature poses significant challenges to closed-loop stability guarantees and performance analysis. In this paper, we introduce a structured DNN-based controller for the trajectory tracking control of Lagrangian systems using backstepping techniques. By properly designing neural network structures, the proposed controller can ensure closed-loop stability for any compatible neural network parameters. In addition, improved control performance can be achieved by further optimizing neural network parameters. We also provide explicit upper bounds on tracking errors in terms of controller parameters, which allows us to achieve the desired tracking performance by properly selecting the controller parameters. Furthermore, when system models are unknown, we propose an improved Lagrangian neural network (LNN) structure to learn the system dynamics and design the controller. We show that in the presence of model approximation errors and external disturbances, the closed-loop stability and tracking control performance can still be guaranteed. The effectiveness of the proposed approach is demonstrated through simulations.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2401.16141 [pdf, ps, other]

Reconfigurable AI Modules Aided Channel Estimation and MIMO Detection

Authors: Xiangzhao Qin, Sha Hu, Jiankun Zhang, Jing Qian, Hao Wang

Abstract: Deep learning (DL) based channel estimation (CE) and multiple input and multiple output detection (MIMODet), as two separate research topics, have provided convincing evidence of the effectiveness and robustness of artificial intelligence (AI) for receiver design. However, the problem remains of how to unify CE and MIMODet by optimizing the AI's structure to achieve near-optimal detection performance, such as that of the widely considered QR with M-algorithm (QRM), which can perform close to the maximum likelihood (ML) detector. In this paper, we propose an AI receiver that connects CE and MIMODet as a unified architecture. As a merit, CE and MIMODet only adopt structural input features and conventional neural networks (NN) to perform end-to-end (E2E) training offline. Numerical results show that, by adopting a simple super-resolution based convolutional neural network (SRCNN) as channel estimator and domain knowledge enhanced graphical neural network (GNN) as detector, the proposed QRM enhanced GNN receiver (QRMNet) achieves comparable block error rate (BLER) performance to near-optimal baseline detectors.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2308.13072 [pdf]

Full-dose Whole-body PET Synthesis from Low-dose PET Using High-efficiency Denoising Diffusion Probabilistic Model: PET Consistency Model

Authors: Shaoyan Pan, Elham Abouei, Junbo Peng, Joshua Qian, Jacob F Wynne, Tonghe Wang, Chih-Wei Chang, Justin Roper, Jonathon A Nye, Hui Mao, Xiaofeng Yang

Abstract: Objective: Positron Emission Tomography (PET) has been a commonly used imaging modality in broad clinical applications. One of the most important tradeoffs in PET imaging is between image quality and radiation dose: high image quality comes with high radiation exposure. Improving image quality is desirable for all clinical applications while minimizing radiation exposure is needed to reduce risk to patients. Approach: We introduce PET Consistency Model (PET-CM), an efficient diffusion-based method for generating high-quality full-dose PET images from low-dose PET images. It employs a two-step process, adding Gaussian noise to full-dose PET images in the forward diffusion, and then denoising them using a PET Shifted-window Vision Transformer (PET-VIT) network in the reverse diffusion. The PET-VIT network learns a consistency function that enables direct denoising of Gaussian noise into clean full-dose PET images. PET-CM achieves state-of-the-art image quality while requiring significantly less computation time than other methods. Results: In experiments comparing eighth-dose to full-dose images, PET-CM demonstrated impressive performance with NMAE of 1.278+/-0.122%, PSNR of 33.783+/-0.824dB, SSIM of 0.964+/-0.009, NCC of 0.968+/-0.011, HRS of 4.543, and SUV Error of 0.255+/-0.318%, with an average generation time of 62 seconds per patient. This is a significant improvement compared to the state-of-the-art diffusion-based model, with PET-CM reaching this result 12x faster. Similarly, in the quarter-dose to full-dose image experiments, PET-CM delivered competitive outcomes, achieving an NMAE of 0.973+/-0.066%, PSNR of 36.172+/-0.801dB, SSIM of 0.984+/-0.004, NCC of 0.990+/-0.005, HRS of 4.428, and SUV Error of 0.151+/-0.192% using the same generation process, underlining its high quantitative and clinical precision in both denoising scenarios.

Submitted 16 April, 2024; v1 submitted 24 August, 2023; originally announced August 2023.
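
For readers unfamiliar with two of the image-quality metrics quoted in the abstract above (arXiv:2308.13072), the following is a minimal sketch of how NMAE and PSNR are commonly computed. The normalization (reference intensity range for NMAE, reference maximum as the PSNR peak) is an assumption and may differ from the paper's definitions. The function names and the toy intensity values are hypothetical and are not taken from the paper.

```python
import math

def nmae(pred, ref):
    """Mean absolute error divided by the reference intensity range (assumed normalization)."""
    mae = sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)
    return mae / (max(ref) - min(ref))

def psnr(pred, ref):
    """Peak signal-to-noise ratio in dB, using max(ref) as the peak value (assumed convention)."""
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    return 10.0 * math.log10(max(ref) ** 2 / mse)

# Toy flattened "images": hypothetical voxel intensities, not real PET data.
full_dose = [0.0, 1.0, 2.0, 4.0]
synthetic = [0.0, 1.0, 2.0, 3.0]

print(f"NMAE = {100 * nmae(synthetic, full_dose):.2f}%")
print(f"PSNR = {psnr(synthetic, full_dose):.2f} dB")
```

Lower NMAE and higher PSNR both indicate a synthetic image closer to the full-dose reference, which is the direction of improvement the abstract reports between the eighth-dose and quarter-dose experiments.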

arXiv:2308.08792 [pdf, other]

Federated Reinforcement Learning for Electric Vehicles Charging Control on Distribution Networks

Authors: Junkai Qian, Yuning Jiang, Xin Liu, Qing Wang, Ting Wang, Yuanming Shi, Wei Chen

Abstract: With the growing popularity of electric vehicles (EVs), maintaining power grid stability has become a significant challenge. To address this issue, EV charging control strategies have been developed to manage the switch between vehicle-to-grid (V2G) and grid-to-vehicle (G2V) modes for EVs. In this context, multi-agent deep reinforcement learning (MADRL) has proven its effectiveness in EV charging control. However, existing MADRL-based approaches fail to consider the natural power flow of EV charging/discharging in the distribution network and ignore driver privacy. To deal with these problems, this paper proposes a novel approach that combines multi-EV charging/discharging with a radial distribution network (RDN) operating under optimal power flow (OPF) to distribute power flow in real time. A mathematical model is developed to describe the RDN load. The EV charging control problem is formulated as a Markov Decision Process (MDP) to find an optimal charging control strategy that balances V2G profits, RDN load, and driver anxiety. To effectively learn the optimal EV charging control strategy, a federated deep reinforcement learning algorithm named FedSAC is further proposed. Comprehensive simulation results demonstrate the effectiveness and superiority of our proposed algorithm in terms of the diversity of the charging control strategy, the power fluctuations on RDN, the convergence efficiency, and the generalization ability.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2306.16197 [pdf, other]

Multi-IMU with Online Self-Consistency for Freehand 3D Ultrasound Reconstruction

Authors: Mingyuan Luo, Xin Yang, Zhongnuo Yan, Junyu Li, Yuanji Zhang, Jiongquan Chen, Xindi Hu, Jikuan Qian, Jun Cheng, Dong Ni

Abstract: Ultrasound (US) imaging is a popular tool in clinical diagnosis, offering safety, repeatability, and real-time capabilities. Freehand 3D US is a technique that provides a deeper understanding of scanned regions without increasing complexity. However, estimating elevation displacement and accumulation error remains challenging, making it difficult to infer the relative position using images alone. The addition of external lightweight sensors has been proposed to enhance reconstruction performance without adding complexity, which has been shown to be beneficial. We propose a novel online self-consistency network (OSCNet) using multiple inertial measurement units (IMUs) to improve reconstruction performance. OSCNet utilizes a modal-level self-supervised strategy to fuse multiple IMU information and reduce differences between reconstruction results obtained from each IMU data. Additionally, a sequence-level self-consistency strategy is proposed to improve the hierarchical consistency of prediction results among the scanning sequence and its sub-sequences. Experiments on large-scale arm and carotid datasets with multiple scanning tactics demonstrate that our OSCNet outperforms previous methods, achieving state-of-the-art reconstruction performance.

Submitted 18 July, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: Accepted by MICCAI-2023

arXiv:2306.11332 [pdf, ps, other]

Minimum Eigenvalue Based Covariance Matrix Estimation with Limited Samples

Authors: Jing Qian, Juening Jin, Hao Wang

Abstract: In this paper, we consider the interference rejection combining (IRC) receiver, which improves the cell-edge user throughput via suppressing inter-cell interference and requires estimating the covariance matrix including the inter-cell interference with high accuracy. In order to solve the problem of sample covariance matrix estimation with limited samples, a regularization parameter optimization based on the minimum eigenvalue criterion is developed. Unlike traditional methods that aim at minimizing the mean squared error, it directly targets the final performance of the IRC receiver. A lower bound of the minimum eigenvalue that is easier to calculate is also derived. Simulation results demonstrate that the proposed approach is effective and can approach the performance of the oracle estimator in terms of the mutual information metric.

Submitted 20 June, 2023; originally announced June 2023.
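
To make the limited-sample problem in the abstract above (arXiv:2306.11332) concrete, here is a minimal sketch of why a sample covariance matrix built from too few samples needs regularization. It uses plain diagonal loading with a hand-picked parameter `alpha` and a real-valued 2x2 case. This is an illustrative assumption and not the paper's minimum-eigenvalue-based parameter optimization. All function names and values are hypothetical.

```python
import math

def sample_covariance(samples):
    """2x2 sample covariance of (x, y) pairs, assuming zero-mean data."""
    n = len(samples)
    sxx = sum(x * x for x, _ in samples) / n
    sxy = sum(x * y for x, y in samples) / n
    syy = sum(y * y for _, y in samples) / n
    return [[sxx, sxy], [sxy, syy]]

def min_eigenvalue_2x2(m):
    """Smallest eigenvalue of a symmetric 2x2 matrix via the closed form."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    return (a + d) / 2.0 - math.hypot((a - d) / 2.0, b)

def diagonal_loading(m, alpha):
    """Regularize by adding alpha to the diagonal (alpha is hand-picked here)."""
    return [[m[0][0] + alpha, m[0][1]], [m[1][0], m[1][1] + alpha]]

# One snapshot for a 2-dimensional signal: fewer samples than dimensions,
# so the sample covariance is rank-deficient and cannot be inverted.
cov = sample_covariance([(1.0, 2.0)])
loaded = diagonal_loading(cov, alpha=0.1)

print("min eigenvalue, raw:   ", min_eigenvalue_2x2(cov))
print("min eigenvalue, loaded:", min_eigenvalue_2x2(loaded))
```

Diagonal loading shifts every eigenvalue up by `alpha`, so the loaded matrix is invertible. How to choose that parameter is the question the paper addresses with its minimum eigenvalue criterion.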

arXiv:2303.08575 [pdf, other]

Observation of Periodic Systems: Bridge Centralized Kalman Filtering and Consensus-Based Distributed Filtering

Authors: Jiachen Qian, Zhisheng Duan, Peihu Duan, Zhongkui Li

Abstract: Compared with linear time-invariant systems, linear periodic systems can describe the periodic processes arising from nature and engineering more precisely. However, the time-varying system parameters increase the difficulty of research on periodic systems, such as stabilization and observation. This paper aims to consider the observation problem of periodic systems by bridging two fundamental filtering algorithms for periodic systems with a sensor network: consensus-on-measurement-based distributed filtering (CMDF) and centralized Kalman filtering (CKF). Firstly, one mild convergence condition based on uniformly collective observability is established for CMDF, under which the filtering performance of CMDF can be formulated as a symmetric periodic positive semidefinite (SPPS) solution to a discrete-time periodic Lyapunov equation. Then, the closed form of the performance gap between CMDF and CKF is presented in terms of the information fusion steps and the consensus weights of the network. Moreover, it is pointed out that the estimation error covariance of CMDF exponentially converges to the centralized one with the fusion steps tending to infinity. Altogether, these new results establish a concise and specific relationship between distributed and centralized filtering, and formulate the trade-off between the communication cost and distributed filtering performance on periodic systems. Finally, the theoretical results are verified with numerical experiments.

Submitted 15 March, 2023; originally announced March 2023.

Comments: arXiv admin note: text overlap with arXiv:2112.06395

arXiv:2211.11247 [pdf, ps, other]

Harmonic-Coupled Riccati Equations and its Applications in Distributed Filtering

Authors: Jiachen Qian, Peihu Duan, Zhisheng Duan, Ling Shi

Abstract: The coupled Riccati equations consist of multiple Riccati-like equations with solutions coupled with each other, which can be applied to depict the properties of more complex systems such as Markovian systems or multi-agent systems. This paper formulates and investigates a new kind of coupled Riccati equations, called harmonic-coupled Riccati equations (HCRE), from the matrix iterative law of the consensus on information-based distributed filtering (CIDF) algorithm proposed in [1], where the solutions of the equations are coupled with harmonic means. Firstly, mild conditions for the existence and uniqueness of the solution to HCRE are derived with collective observability and primitivity of the weighting matrix. Then, it is proved that the matrix iterative law of CIDF will converge to the unique solution of the corresponding HCRE, and hence can be used to obtain the solution to HCRE. Moreover, through applying the novel theory of HCRE, it is pointed out that the real estimation error covariance of CIDF will also reach a steady state, and the convergent value is simplified as the solution to a discrete-time Lyapunov equation (DLE). Altogether, these new results develop the theory of the coupled Riccati equations, and provide a novel perspective on the performance analysis of the CIDF algorithm, which sufficiently reduces the conservativeness of the evaluation techniques in the literature. Finally, the theoretical results are verified with numerical experiments.

Submitted 12 July, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: 14 pages, 4 figures

arXiv:2208.08045 [pdf, other]

Soft MIMO Detection Using Marginal Posterior Probability Statistics

Authors: Jiankun Zhang, Hao Wang, Jing Qian, Zhenxing Gao

Abstract: Soft demodulation of received symbols into bit log-likelihood ratios (LLRs) is at the very heart of multiple-input-multiple-output (MIMO) detection. However, the optimal maximum a posteriori (MAP) detector is too complex to be used in a practical system. In this paper, we propose a soft MIMO detection algorithm based on marginal posterior probability statistics (MPPS). With the help of optimal transport theory and order statistics theory, we transform the posterior probability distribution of each layer into a Gaussian distribution. Then the full sampling paths can be implicitly restored from the first- and second-order moment statistics of the transformed distribution. A lightweight network is designed to learn to recover the log-MAP LLRs from the moment statistics with low complexity. Simulation results show that the proposed algorithm can improve the performance significantly with reduced samples under fading and correlated channels.

Submitted 16 August, 2022; originally announced August 2022.

arXiv:2207.06527 [pdf]

doi 10.1109/LGRS.2022.3192003

A Deep Learning-Based GPR Forward Solver for Predicting B-Scans of Subsurface Objects

Authors: Qiqi Dai, Yee Hui Lee, Hai-Han Sun, Jiwei Qian, Genevieve Ow, Mohamed Lokman Mohd Yusof, Abdulkadir C. Yucel

Abstract: The forward full-wave modeling of ground-penetrating radar (GPR) facilitates the understanding and interpretation of GPR data. Traditional forward solvers require excessive computational resources, especially when their repetitive executions are needed in signal processing and/or machine learning algorithms for GPR data inversion. To alleviate the computational burden, a deep learning-based 2D GPR forward solver is proposed to predict the GPR B-scans of subsurface objects buried in heterogeneous soil. The proposed solver is constructed as a bimodal encoder-decoder neural network. Two encoders followed by an adaptive feature fusion module are designed to extract informative features from the subsurface permittivity and conductivity maps. The decoder subsequently constructs the B-scans from the fused feature representations. To enhance the network's generalization capability, transfer learning is employed to fine-tune the network for new scenarios vastly different from those in the training set. Numerical results show that the proposed solver achieves a mean relative error of 1.28%. For predicting the B-scan of one subsurface object, the proposed solver requires 12 milliseconds, which is 22,500x less than the time required by a classical physics-based solver.

Submitted 13 July, 2022; originally announced July 2022.
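
As a reading aid for the runtime claim in the abstract above (arXiv:2207.06527), the implied per-B-scan runtime of the classical solver can be worked out from the two quoted figures. This assumes the 22,500x factor is a direct ratio of per-B-scan runtimes; the abstract does not state the classical runtime itself, and the symbol below is introduced only for this illustration.

```latex
t_{\text{classical}} \approx 22{,}500 \times 12\,\text{ms}
  = 270{,}000\,\text{ms}
  = 270\,\text{s}
  = 4.5\,\text{min}
```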

arXiv:2207.00475 [pdf, other]

Agent with Tangent-based Formulation and Anatomical Perception for Standard Plane Localization in 3D Ultrasound

Authors: Yuxin Zou, Haoran Dou, Yuhao Huang, Xin Yang, Jikuan Qian, Chaojiong Zhen, Xiaodan Ji, Nishant Ravikumar, Guoqiang Chen, Weijun Huang, Alejandro F. Frangi, Dong Ni

Abstract: Standard plane (SP) localization is essential in routine clinical ultrasound (US) diagnosis. Compared to 2D US, 3D US can acquire multiple view planes in one scan and provide complete anatomy with the addition of the coronal plane. However, manually navigating SPs in 3D US is laborious and biased due to the orientation variability and huge search space. In this study, we introduce a novel reinforcement learning (RL) framework for automatic SP localization in 3D US. Our contribution is three-fold. First, we formulate SP localization in 3D US as a tangent-point-based problem in RL to restructure the action space and significantly reduce the search space. Second, we design an auxiliary task learning strategy to enhance the model's ability to recognize subtle differences between non-SPs and SPs in plane search. Finally, we propose a spatial-anatomical reward to effectively guide learning trajectories by exploiting spatial and anatomical information simultaneously. We explore the efficacy of our approach on localizing four SPs on uterus and fetal brain datasets. The experiments indicate that our approach achieves a high localization accuracy as well as robust performance.

Submitted 1 July, 2022; originally announced July 2022.

Comments: Accepted by MICCAI 2022

arXiv:2112.06395 [pdf, other]

Consensus-Based Distributed Filtering with Fusion Step Analysis

Authors: Jiachen Qian, Peihu Duan, Zhisheng Duan, Guanrong Chen, Ling Shi

Abstract: For consensus on measurement-based distributed filtering (CMDF), through infinite consensus fusion operations during each sampling interval, each node in the sensor network can achieve optimal filtering performance with centralized filtering. However, due to the limited communication resources in physical systems, the number of fusion steps cannot be infinite. To deal with this issue, the present paper analyzes the performance of CMDF with finite consensus fusion operations. First, by introducing a modified discrete-time algebraic Riccati equation and several novel techniques, the convergence of the estimation error covariance matrix of each sensor is guaranteed under a collective observability condition. In particular, the steady-state covariance matrix can be simplified as the solution to a discrete-time Lyapunov equation. Moreover, the performance degradation induced by reduced fusion frequency is obtained in closed form, which establishes an analytical relation between the performance of the CMDF with finite fusion steps and that of centralized filtering. Meanwhile, it provides a trade-off between the filtering performance and the communication cost. Furthermore, it is shown that the steady-state estimation error covariance matrix exponentially converges to the centralized optimal steady-state matrix with fusion operations tending to infinity during each sampling interval. Finally, the theoretical results are verified with illustrative numerical experiments.

Submitted 23 May, 2022; v1 submitted 12 December, 2021; originally announced December 2021.
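
The trade-off between fusion steps and performance described in the abstract above (arXiv:2112.06395) can be illustrated with a toy scalar consensus iteration. This is a minimal sketch and not the paper's CMDF filter: it only shows that repeated averaging with neighbors drives each node's value toward the network-wide average, and that the remaining gap shrinks as the number of fusion steps grows. The 4-node ring, the weight matrix `W`, and the measurement values are hypothetical.

```python
def consensus_steps(values, weights, steps):
    """Run `steps` rounds of linear consensus x <- W x and return the result."""
    x = list(values)
    n = len(x)
    for _ in range(steps):
        x = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x

def max_gap_to_average(values, weights, steps):
    """Largest deviation of any node from the true average after fusion."""
    avg = sum(values) / len(values)
    fused = consensus_steps(values, weights, steps)
    return max(abs(v - avg) for v in fused)

# Hypothetical 4-node ring: each node weights itself 0.5 and each neighbor 0.25.
# The matrix is symmetric and doubly stochastic, so the average is preserved.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
measurements = [1.0, 3.0, 5.0, 7.0]

# Gap to the centralized average for an increasing number of fusion steps.
gaps = [max_gap_to_average(measurements, W, steps) for steps in (1, 2, 4, 8)]
print(gaps)
```

Each extra fusion step costs one more round of communication per sampling interval, which is the communication cost side of the trade-off the abstract refers to.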

arXiv:2110.08424 [pdf]

Deep learning-based detection of intravenous contrast in computed tomography scans

Authors: Zezhong Ye, Jack M. Qian, Ahmed Hosny, Roman Zeleznik, Deborah Plana, Jirapat Likitlersuang, Zhongyi Zhang, Raymond H. Mak, Hugo J. W. L. Aerts, Benjamin H. Kann

Abstract: Purpose: Identifying intravenous (IV) contrast use within CT scans is a key component of data curation for model development and testing. Currently, IV contrast is poorly documented in imaging metadata and necessitates manual correction and annotation by clinician experts, presenting a major barrier to imaging analyses and algorithm deployment. We sought to develop and validate a convolutional neural network (CNN)-based deep learning (DL) platform to identify IV contrast within CT scans. Methods: For model development and evaluation, we used independent datasets of CT scans of head and neck (HN) and lung cancer patients, totaling 133,480 axial 2D scan slices from 1,979 CT scans manually annotated for contrast presence by clinical experts. Five different DL models were adopted and trained on HN training datasets for slice-level contrast detection. Model performances were evaluated on a hold-out set and on an independent validation set from another institution. The DL models were then fine-tuned on chest CT data and externally validated on a separate chest CT dataset. Results: Initial DICOM metadata tags for IV contrast were missing or erroneous in 1,496 scans (75.6%). The EfficientNetB4-based model showed the best overall detection performance. For HN scans, AUC was 0.996 in the internal validation set (n = 216) and 1.0 in the external validation set (n = 595). The fine-tuned model on chest CTs yielded an AUC of 1.0 for the internal validation set (n = 53) and an AUC of 0.980 for the external validation set (n = 402). Conclusion: The DL model could accurately detect IV contrast in both HN and chest CT scans with near-perfect performance.

Submitted 19 October, 2021; v1 submitted 15 October, 2021; originally announced October 2021.

arXiv:2105.10626 [pdf, other]

Searching Collaborative Agents for Multi-plane Localization in 3D Ultrasound

Authors: Xin Yang, Yuhao Huang, Ruobing Huang, Haoran Dou, Rui Li, Jikuan Qian, Xiaoqiong Huang, Wenlong Shi, Chaoyu Chen, Yuanji Zhang, Haixia Wang, Yi Xiong, Dong Ni

Abstract: 3D ultrasound (US) has become prevalent due to its rich spatial and diagnostic information not contained in 2D US. Moreover, 3D US can contain multiple standard planes (SPs) in one shot. Thus, automatically localizing SPs in 3D US has the potential to improve user-independence and scanning-efficiency. However, manual SP localization in 3D US is challenging because of the low image quality, huge search space and large anatomical variability. In this work, we propose a novel multi-agent reinforcement learning (MARL) framework to simultaneously localize multiple SPs in 3D US. Our contribution is four-fold. First, our proposed method is general and it can accurately localize multiple SPs in different challenging US datasets. Second, we equip the MARL system with a recurrent neural network (RNN) based collaborative module, which can strengthen the communication among agents and learn the spatial relationship among planes effectively. Third, we explore adopting neural architecture search (NAS) to automatically design the network architecture of both the agents and the collaborative module. Last, we believe we are the first to realize automatic SP localization in pelvic US volumes, and note that our approach can handle both normal and abnormal uterus cases. Extensively validated on two challenging datasets of the uterus and fetal brain, our proposed method achieves the average localization accuracy of 7.03 degrees/1.59mm and 9.75 degrees/1.19mm. Experimental results show that our light-weight MARL model has higher accuracy than state-of-the-art methods.

Submitted 21 May, 2021; originally announced May 2021.

Comments: Accepted by Medical Image Analysis (10 figures, 8 tables)

arXiv:2103.14502 [pdf, other]

Agent with Warm Start and Adaptive Dynamic Termination for Plane Localization in 3D Ultrasound

Authors: Xin Yang, Haoran Dou, Ruobing Huang, Wufeng Xue, Yuhao Huang, Jikuan Qian, Yuanji Zhang, Huanjia Luo, Huizhi Guo, Tianfu Wang, Yi Xiong, Dong Ni

Abstract: Accurate standard plane (SP) localization is the fundamental step for prenatal ultrasound (US) diagnosis. Typically, dozens of US SPs are collected to determine the clinical diagnosis. 2D US has to perform scanning for each SP, which is time-consuming and operator-dependent. In contrast, 3D US, which contains multiple SPs in one shot, has the inherent advantages of less user-dependency and more efficiency. Automatically locating SP in 3D US is very challenging due to the huge search space and large fetal posture variations. Our previous study proposed a deep reinforcement learning (RL) framework with an alignment module and active termination to localize SPs in 3D US automatically. However, termination of agent search in RL is important and affects the practical deployment. In this study, we enhance our previous RL framework with a newly designed adaptive dynamic termination to enable an early stop for the agent searching, saving at most 67% inference time, thus boosting the accuracy and efficiency of the RL framework at the same time. Besides, we validate the effectiveness and generalizability of our algorithm extensively on our in-house multi-organ datasets containing 433 fetal brain volumes, 519 fetal abdomen volumes, and 683 uterus volumes. Our approach achieves localization error of 2.52mm/10.26 degrees, 2.48mm/10.39 degrees, 2.02mm/10.48 degrees, 2.00mm/14.57 degrees, 2.61mm/9.71 degrees, 3.09mm/9.58 degrees, 1.49mm/7.54 degrees for the transcerebellar, transventricular, transthalamic planes in fetal brain, abdominal plane in fetal abdomen, and mid-sagittal, transverse and coronal planes in uterus, respectively. Experimental results show that our method is general and has the potential to improve the efficiency and standardization of US scanning.

Submitted 26 March, 2021; originally announced March 2021.

Comments: Accepted by IEEE Transactions on Medical Imaging (12 pages, 8 figures, 11 tables)

arXiv:2102.10236 [pdf, other]

Singer Identification Using Deep Timbre Feature Learning with KNN-Net

Authors: Xulong Zhang, Jiale Qian, Yi Yu, Yifu Sun, Wei Li

Abstract: In this paper, we study the issue of automatic singer identification (SID) in popular music recordings, which aims to recognize who sang a given song. The main challenge for this investigation lies in the fact that a singer's singing voice changes and intertwines with the signal of background accompaniment in the time domain. To handle this challenge, we propose the KNN-Net for SID, which is a deep neural network model with the goal of learning local timbre feature representation from the mixture of singer voice and background music. Unlike other deep neural networks using the softmax layer as the output layer, we instead utilize the KNN as a more interpretable layer to output target singer labels. Moreover, attention mechanism is first introduced to highlight crucial timbre features for SID. Experiments on the existing artist20 dataset show that the proposed approach outperforms the state-of-the-art method by 4%. We also create singer32 and singer60 datasets consisting of Chinese pop music to evaluate the reliability of the proposed method. The more extensive experiments additionally indicate that our proposed model achieves a significant performance improvement compared to the state-of-the-art methods.

Submitted 19 February, 2021; originally announced February 2021.

Comments: Published as a conference paper at ICASSP 2021

arXiv:2010.15718 [pdf, other]

Minimal Model Structure Analysis for Input Reconstruction in Federated Learning

Authors: Jia Qian, Hiba Nassar, Lars Kai Hansen

Abstract: Federated learning (FL) proposed a distributed machine learning (ML) framework where every distributed worker owns a complete copy of the global model and their own data. Training occurs locally, which assures no direct transmission of training data. However, recent work (Zhu et al., 2019) demonstrated that input data from a neural network may be reconstructed using only knowledge of the gradients of that network, which completely breached the promise of FL and sabotaged user privacy. In this work, we aim to further explore the theoretical limits of reconstruction, and to speed up and stabilize the reconstruction procedure. We show that a single input may be reconstructed with the analytical form, regardless of network depth using a fully-connected neural network with one hidden node. Then we generalize this result to a gradient averaged over batches of size $B$. In this case, the full batch can be reconstructed if the number of hidden units exceeds $B$. For a convolutional neural network (CNN), the number of required kernels in convolutional layers is decided by multiple factors, e.g., padding, kernel and stride size, etc. We require the number of kernels $h\geq (\frac{d}{d^{\prime}})^2C$, where we define $d$ as input width, $d^{\prime}$ as output width after convolutional layer, and $C$ as channel number of input. We validate our observation and demonstrate the improvements using bio-medical (fMRI, WBC) and benchmark data (MNIST, Kuzushiji-MNIST, CIFAR100, ImageNet and face images).

Submitted 5 November, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

arXiv:2008.02519 [pdf]

Spectral-change enhancement with prior SNR for the hearing impaired

Authors: Xiang Li, Xin Tian, Henry Luo, Jinyu Qian, Xihong Wu, Dingsheng Luo, Jing Chen

Abstract: A previous signal processing algorithm that aimed to enhance spectral changes (SCE) over time showed benefit for hearing-impaired (HI) listeners to recognize speech in background noise. In this work, the previous SCE was manipulated to perform on target-dominant segments, rather than treating all frames equally. Instantaneous signal-to-noise ratios (SNRs) were calculated to determine whether the segments should be processed. Initially, the ideal SNR calculated by the knowledge of premixed signals was introduced to the previous SCE algorithm (SCE-iSNR). Speech intelligibility (SI) and clarity preference were measured for 12 HI listeners in steady speech-spectrum noise (SSN) and six-talk speech (STS) maskers, respectively. The results showed the SCE-iSNR algorithm improved SI significantly for both maskers at high signal-to-masker ratios (SMRs) and for STS masker at low SMRs, while processing effect on speech quality was small. Secondly, the estimated SNR obtained from real mixtures was used, resulting in another SCE-eSNR. SI and subjective rating on naturalness and speech quality were tested for 7 HI subjects. The SCE-eSNR algorithm showed improved SI for SSN masker at high SMRs and for STS masker at low SMRs, as well as better naturalness and speech quality for STS masker. The limitations of applying the algorithms are discussed.

Submitted 6 August, 2020; originally announced August 2020.

Comments: Accepted by 23rd International Congress on Acoustics (ICA 2019), see http://pub.dega-akustik.de/ICA2019/data/articles/000051.pdf

arXiv:2007.15273 [pdf, other]

Searching Collaborative Agents for Multi-plane Localization in 3D Ultrasound

Authors: Yuhao Huang, Xin Yang, Rui Li, Jikuan Qian, Xiaoqiong Huang, Wenlong Shi, Haoran Dou, Chaoyu Chen, Yuanji Zhang, Huanjia Luo, Alejandro Frangi, Yi Xiong, Dong Ni

Abstract: 3D ultrasound (US) is widely used due to its rich diagnostic information, portability and low cost. Automated standard plane (SP) localization in US volume not only improves efficiency and reduces user-dependence, but also boosts 3D US interpretation. In this study, we propose a novel Multi-Agent Reinforcement Learning (MARL) framework to localize multiple uterine SPs in 3D US simultaneously. Our contribution is two-fold. First, we equip the MARL with a one-shot neural architecture search (NAS) module to obtain the optimal agent for each plane. Specifically, Gradient-based search using Differentiable Architecture Sampler (GDAS) is employed to accelerate and stabilize the training process. Second, we propose a novel collaborative strategy to strengthen agents' communication. Our strategy uses recurrent neural network (RNN) to learn the spatial relationship among SPs effectively. Extensively validated on a large dataset, our approach achieves the accuracy of 7.05 degree/2.21mm, 8.62 degree/2.36mm and 5.93 degree/0.89mm for the mid-sagittal, transverse and coronal plane localization, respectively. The proposed MARL framework can significantly increase the plane localization accuracy and reduce the computational cost and model size.

Submitted 30 July, 2020; originally announced July 2020.

Comments: Early accepted by MICCAI 2020

arXiv:2005.01305 [pdf, other]

Energy Model for UAV Communications: Experimental Validation and Model Generalization

Authors: Ning Gao, Yong Zeng, Jian Wang, Di Wu, Chaoyue Zhang, Qingheng Song, Jiachen Qian, Shi Jin

Abstract: Wireless communication involving unmanned aerial vehicles (UAVs) is expected to play an important role in future wireless networks. However, different from conventional terrestrial communication systems, UAVs typically have rather limited onboard energy on one hand, and require additional flying energy consumption on the other hand, which renders energy-efficient UAV communication with smart energy expenditure of paramount importance. In this paper, via extensive flight experiments, we aim to firstly validate the recently derived theoretical energy model for rotary-wing UAVs, and then develop a general model for those complicated flight scenarios where rigorous theoretical model derivation is quite challenging, if not impossible. Specifically, we first investigate how UAV power consumption varies with its flying speed for the simplest straight-and-level flight. With about 12,000 valid power-speed data points collected, we first apply the model-based curve fitting to obtain the modelling parameters based on the theoretical closed-form energy model in the existing literature. In addition, in order to exclude the potential bias caused by the theoretical energy model, the obtained measurement data is also used to train a model-free deep neural network. It is found that the obtained curve from both methods can match quite well with the theoretical energy model. Next, we further extend the study to arbitrary 2-dimensional (2-D) flight, where, to the best of our knowledge, no rigorous theoretical derivation is available for the closed-form energy model as a function of its flying speed, direction, and acceleration. To fill the gap, we first propose a heuristic energy model for these more complicated cases, and then provide experimental validation based on the measurement results for circular level flight.

Submitted 4 May, 2020; originally announced May 2020.

arXiv:2004.07576 [pdf, other]

Deep Learning based Denoise Network for CSI Feedback in FDD Massive MIMO Systems

Authors: Hongyuan Ye, Feifei Gao, Jing Qian, Hao Wang, Geoffrey Ye Li

Abstract: Channel state information (CSI) feedback is critical for frequency division duplex (FDD) massive multi-input multi-output (MIMO) systems. Most conventional algorithms are based on compressive sensing (CS) and are highly dependent on the level of channel sparsity. To address the issue, a recent approach adopts deep learning (DL) to compress CSI into a codeword with low dimensionality, which has shown much better performance than the CS algorithms when the feedback link is perfect. In practical scenarios, however, there exist various interference and non-linear effects. In this article, we design a DL-based denoise network, called DNNet, to improve the performance of channel feedback. Numerical results show that the DL-based feedback algorithm with the proposed DNNet has superior performance over the existing algorithms, especially at low signal-to-noise ratio (SNR).

Submitted 16 April, 2020; originally announced April 2020.

arXiv:2001.01439 [pdf, other]

doi 10.1063/5.0003217

Deep-learning-enabled geometric constraints and phase unwrapping for single-shot absolute 3D shape measurement

Authors: Jiaming Qian, Shijie Feng, Tianyang Tao, Yan Hu, Yixuan Li, Qian Chen, Chao Zuo

Abstract: Fringe projection profilometry (FPP) is one of the most popular three-dimensional (3D) shape measurement techniques, and has become more prevalently adopted in intelligent manufacturing, defect detection and some other important applications. In FPP, how to efficiently recover the absolute phase has always been a great challenge. The stereo phase unwrapping (SPU) technologies based on geometric constraints can eliminate phase ambiguity without projecting any additional fringe patterns, which maximizes the efficiency of the retrieval of absolute phase. Inspired by the recent success of deep learning technologies for phase analysis, we demonstrate that deep learning can be an effective tool that organically unifies the phase retrieval, geometric constraints, and phase unwrapping steps into a comprehensive framework. Driven by an extensive training dataset, the neural network can gradually "learn" how to transfer one high-frequency fringe pattern into the "physically meaningful", and "most likely" absolute phase, instead of "step by step" as in conventional approaches. Based on the properly trained framework, high-quality phase retrieval and robust phase ambiguity removal can be achieved based on only single-frame projection. Experimental results demonstrate that compared with traditional SPU, our method can more efficiently and stably unwrap the phase of dense fringe images in a larger measurement volume with fewer camera views. Limitations of the proposed approach are also discussed. We believe the proposed approach represents an important step forward in high-speed, high-accuracy, motion-artifacts-free absolute 3D shape measurement for complicated objects from a single fringe pattern.

Submitted 22 April, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

Journal ref: APL Photonics 5, 046105 (2020)

arXiv:1910.04935 [pdf, other]

FetusMap: Fetal Pose Estimation in 3D Ultrasound

Authors: Xin Yang, Wenlong Shi, Haoran Dou, Jikuan Qian, Yi Wang, Wufeng Xue, Shengli Li, Dong Ni, Pheng-Ann Heng

Abstract: The 3D ultrasound (US) entrance inspires a multitude of automated prenatal examinations. However, studies about the structuralized description of the whole fetus in 3D US are still rare. In this paper, we propose to estimate the 3D pose of fetus in US volumes to facilitate its quantitative analyses in global and local scales. Given the great challenges in 3D US, including the high volume dimension, poor image quality, symmetric ambiguity in anatomical structures and large variations of fetal pose, our contribution is three-fold. (i) This is the first work about 3D pose estimation of fetus in the literature. We aim to extract the skeleton of whole fetus and assign different segments/joints with correct torso/limb labels. (ii) We propose a self-supervised learning (SSL) framework to finetune the deep network to form visually plausible pose predictions. Specifically, we leverage the landmark-based registration to effectively encode case-adaptive anatomical priors and generate evolving label proxy for supervision. (iii) To enable our 3D network to perceive better contextual cues with higher resolution input under limited computing resources, we further adopt the gradient check-pointing (GCP) strategy to save GPU memory and improve the prediction. Extensively validated on a large 3D US dataset, our method tackles varying fetal poses and achieves promising results. 3D pose estimation of fetus has the potential to serve as a map to provide navigation for many advanced studies.

Submitted 3 March, 2024; v1 submitted 10 October, 2019; originally announced October 2019.

Comments: 9 pages, 6 figures, 2 tables. Accepted by MICCAI 2019

arXiv:1910.04331 [pdf, other]

Agent with Warm Start and Active Termination for Plane Localization in 3D Ultrasound

Authors: Haoran Dou, Xin Yang, Jikuan Qian, Wufeng Xue, Hao Qin, Xu Wang, Lequan Yu, Shujun Wang, Yi Xiong, Pheng-Ann Heng, Dong Ni

Abstract: Standard plane localization is crucial for ultrasound (US) diagnosis. In prenatal US, dozens of standard planes are manually acquired with a 2D probe. It is time-consuming and operator-dependent. In comparison, 3D US containing multiple standard planes in one shot has the inherent advantages of less user-dependency and more efficiency. However, manual plane localization in US volume is challenging due to the huge search space and large fetal posture variation. In this study, we propose a novel reinforcement learning (RL) framework to automatically localize fetal brain standard planes in 3D US. Our contribution is two-fold. First, we equip the RL framework with a landmark-aware alignment module to provide warm start and strong spatial bounds for the agent actions, thus ensuring its effectiveness. Second, instead of passively and empirically terminating the agent inference, we propose a recurrent neural network based strategy for active termination of the agent's interaction procedure. This improves both the accuracy and efficiency of the localization system. Extensively validated on our in-house large dataset, our approach achieves the accuracy of 3.4mm/9.6° and 2.7mm/9.1° for the transcerebellar and transthalamic plane localization, respectively. Our proposed RL framework is general and has the potential to improve the efficiency and standardization of US scanning.

Submitted 3 March, 2024; v1 submitted 9 October, 2019; originally announced October 2019.

Comments: 9 pages, 5 figures, 1 table. Accepted by MICCAI 2019 (oral)

arXiv:1810.11705 [pdf]

Wi-Motion: A Robust Human Activity Recognition Using WiFi Signals

Authors: Heju Li, Xukai Chen, Haohua Du, Xin He, Jianwei Qian, Peng-Jun Wan, Panlong Yang

Abstract: Recent research has shown that human motions and positions can be recognized through WiFi signals. The key intuition is that different motions and positions introduce different multi-path distortions in WiFi signals and generate different patterns in the time-series of channel state information (CSI). In this paper, we propose Wi-Motion, a WiFi-based human activity recognition system. Unlike existing systems, Wi-Motion adopts the amplitude and phase information extracted from the CSI sequence to construct the classifiers respectively, and combines the results using a combination strategy based on posterior probability. As the simulation results show, Wi-Motion can recognize six human activities with a mean accuracy of 98.4%.

Submitted 27 October, 2018; originally announced October 2018.

arXiv:1610.06283 [pdf, other]

Deep Neural Networks for Improved, Impromptu Trajectory Tracking of Quadrotors

Authors: Qiyang Li, Jingxing Qian, Zining Zhu, Xuchan Bao, Mohamed K. Helwa, Angela P. Schoellig

Abstract: Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive "fly-as-you-draw" application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method's potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs' capability of generalizing knowledge.

Submitted 19 July, 2017; v1 submitted 20 October, 2016; originally announced October 2016.

Comments: 7 pages, 8 figures. Accepted final version. To appear in the proc. of the 2017 IEEE International Conference on Robotics and Automation

Showing 1–27 of 27 results for author: Qian, J