
RF-Diffusion: Radio Signal Generation via Time-Frequency Diffusion

Authors: Guoxuan Chi, Zheng Yang, Chenshu Wu, Jingao Xu, Yuchong Gao, Yunhao Liu, Tony Xiao Han

Abstract: Along with AIGC shines in CV and NLP, its potential in the wireless domain has also emerged in recent years. Yet, existing RF-oriented generative solutions are ill-suited for generating high-quality, time-series RF data due to limited representation capabilities. In this work, inspired by the stellar achievements of the diffusion model in CV and NLP, we adapt it to the RF domain and propose RF-Diffusion. To accommodate the unique characteristics of RF signals, we first introduce a novel Time-Frequency Diffusion theory to enhance the original diffusion model, enabling it to tap into the information within the time, frequency, and complex-valued domains of RF signals. On this basis, we propose a Hierarchical Diffusion Transformer to translate the theory into a practical generative DNN through elaborated design spanning network architecture, functional block, and complex-valued operator, making RF-Diffusion a versatile solution to generate diverse, high-quality, and time-series RF data. Performance comparison with three prevalent generative models demonstrates the RF-Diffusion's superior performance in synthesizing Wi-Fi and FMCW signals. We also showcase the versatility of RF-Diffusion in boosting Wi-Fi sensing systems and performing channel estimation in 5G networks.

Submitted 14 April, 2024; originally announced April 2024.

Comments: Accepted by MobiCom 2024

ACM Class: I.2.0

arXiv:2404.03440 [pdf, ps, other]

doi 10.1109/VTC2023-Fall60731.2023.10333715

Design and Optimization of Cooperative Sensing With Limited Backhaul Capacity

Authors: Wenrui Li, Min Li, An Liu, Tony Xiao Han

Abstract: This paper introduces a cooperative sensing framework designed for integrated sensing and communication cellular networks. The framework comprises one base station (BS) functioning as the sensing transmitter, while several nearby BSs act as sensing receivers. The primary objective is to facilitate cooperative target localization by enabling each receiver to share specific information with a fusion center (FC) over a limited capacity backhaul link. To achieve this goal, we propose an advanced cooperative sensing design that enhances the communication process between the receivers and the FC. Each receiver independently estimates the time delay and the reflecting coefficient associated with the reflected path from the target. Subsequently, each receiver transmits the estimated values and the received signal samples centered around the estimated time delay to the FC. To efficiently quantize the signal samples, a Karhunen-Loève Transform coding scheme is employed. Furthermore, an optimization problem is formulated to allocate backhaul resources for quantizing different samples, improving target localization. Numerical results validate the effectiveness of our proposed advanced design and demonstrate its superiority over a baseline design, where only the locally estimated values are transmitted from each receiver to the FC.

Submitted 4 April, 2024; originally announced April 2024.

Comments: This paper has been published in 2023 IEEE 98th Vehicular Technology Conference (VTC2023-Fall)

arXiv:2403.16438 [pdf, other]

doi 10.1109/BIBM58861.2023.10385929

Real-time Neuron Segmentation for Voltage Imaging

Authors: Yosuke Bando, Ramdas Pillai, Atsushi Kajita, Farhan Abdul Hakeem, Yves Quemener, Hua-an Tseng, Kiryl D. Piatkevich, Changyang Linghu, Xue Han, Edward S. Boyden

Abstract: In voltage imaging, where the membrane potentials of individual neurons are recorded at from hundreds to thousand frames per second using fluorescence microscopy, data processing presents a challenge. Even a fraction of a minute of recording with a limited image size yields gigabytes of video data consisting of tens of thousands of frames, which can be time-consuming to process. Moreover, millisecond-level short exposures lead to noisy video frames, obscuring neuron footprints especially in deep-brain samples where noisy signals are buried in background fluorescence. To address this challenge, we propose a fast neuron segmentation method able to detect multiple, potentially overlapping, spiking neurons from noisy video frames, and implement a data processing pipeline incorporating the proposed segmentation method along with GPU-accelerated motion correction. By testing on existing datasets as well as on new datasets we introduce, we show that our pipeline extracts neuron footprints that agree well with human annotation even from cluttered datasets, and demonstrate real-time processing of voltage imaging data on a single desktop computer for the first time.

Submitted 25 March, 2024; originally announced March 2024.

Journal ref: IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 813-818, 2023

arXiv:2401.05725 [pdf, ps, other]

Energy-Efficient STAR-RIS Enhanced UAV-Enabled MEC Networks with Bi-Directional Task Offloading

Authors: Han Xiao, Xiaoyan Hu, Weile Zhang, Wenjie Wang, Kai-Kit Wong, Kun Yang

Abstract: This paper introduces a novel multi-user mobile edge computing (MEC) scheme facilitated by the simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) and the unmanned aerial vehicle (UAV). Unlike existing MEC approaches, the proposed scheme enables bidirectional offloading, allowing users to concurrently offload tasks to the MEC servers located at the ground base station (BS) and UAV with STAR-RIS support. Specifically, we formulate an optimization problem aiming at maximizing the energy efficiency of the system while ensuring the quality of service (QoS) constraints by jointly optimizing the resource allocation, user scheduling, passive beamforming of the STAR-RIS, and the UAV trajectory. A block coordinate descent (BCD) iterative algorithm designed with the Dinkelbach's algorithm and the successive convex approximation (SCA) technique is proposed to effectively handle the formulated non-convex optimization problem with significant coupling among variables. Simulation results indicate that the proposed STAR-RIS enhanced UAV-enabled MEC scheme possesses significant advantages in enhancing the system energy efficiency over other baseline schemes including the conventional RIS-aided scheme.

Submitted 9 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.
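
Illustration (not from the paper): the abstract of arXiv:2401.05725 above names Dinkelbach's algorithm as one ingredient for maximizing energy efficiency, which is a ratio objective. The sketch below shows only the generic Dinkelbach iteration on a toy one-variable problem, assuming a hypothetical rate function log2(1 + p), a hypothetical circuit power of 0.5, and a finite grid of candidate transmit powers. It does not reproduce the paper's joint BCD/SCA design.

```python
import math


def dinkelbach(candidates, numerator, denominator, tol=1e-12, max_iter=100):
    """Maximize numerator(x) / denominator(x) over a finite candidate list.

    Assumes denominator(x) > 0 for every candidate.
    """
    lam = 0.0  # current estimate of the optimal ratio
    best = candidates[0]
    for _ in range(max_iter):
        # Inner step: solve the parametric problem max_x f(x) - lam * g(x).
        best = max(candidates, key=lambda x: numerator(x) - lam * denominator(x))
        gap = numerator(best) - lam * denominator(best)
        # Outer step: update the ratio estimate with the new maximizer.
        lam = numerator(best) / denominator(best)
        if gap < tol:  # the parametric optimum is (numerically) zero: done
            break
    return best, lam


def rate(p):
    """Hypothetical achievable rate for transmit power p."""
    return math.log2(1.0 + p)


def total_power(p):
    """Transmit power plus a hypothetical constant circuit power."""
    return p + 0.5


powers = [i / 100 for i in range(1, 501)]  # candidate transmit powers
p_star, ee_star = dinkelbach(powers, rate, total_power)
print(f"power = {p_star:.2f}, energy efficiency = {ee_star:.4f}")
```

On a finite candidate set the ratio estimate increases at every iteration until the parametric optimum reaches zero, so the loop ends at the best ratio on the grid.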

arXiv:2312.15941 [pdf, other]

Reshaping the ISAC Tradeoff Under OFDM Signaling: A Probabilistic Constellation Shaping Approach

Authors: Zhen Du, Fan Liu, Yifeng Xiong, Tony Xiao Han, Yonina C. Eldar, Shi Jin

Abstract: Integrated sensing and communications is regarded as a key enabling technology in the sixth generation networks, where a unified waveform, such as orthogonal frequency division multiplexing (OFDM) signal, is adopted to facilitate both sensing and communications (S&C). However, the random communication data embedded in the OFDM signal results in severe variability in the sidelobes of its ambiguity function (AF), which leads to missed detection of weak targets and false detection of ghost targets, thereby impairing the sensing performance. Therefore, balancing between preserving communication capability (i.e., the randomness) while improving sensing performance remains a challenging task. To cope with this issue, we characterize the random AF of OFDM communication signals, and demonstrate that the AF variance is determined by the fourth-moment of the constellation amplitudes. Subsequently, we propose an optimal probabilistic constellation shaping (PCS) approach by maximizing the achievable information rate (AIR) under the fourth-moment, power and probability constraints, where the optimal input distribution may be numerically specified through a modified Blahut-Arimoto algorithm. To reduce the computational overheads, we further propose a heuristic PCS approach by actively controlling the value of the fourth-moment, without involving the communication metric in the optimization model, despite that the AIR is passively scaled with the variation of the input distribution. Numerical results show that both approaches strike a scalable performance tradeoff between S&C, where the superiority of the PCS-enabled constellations over conventional uniform constellations is also verified. Notably, the heuristic approach achieves very close performance to the optimal counterpart, at a much lower computational complexity.

Submitted 26 December, 2023; originally announced December 2023.
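
Illustration (not from the paper): the abstract of arXiv:2312.15941 above ties the ambiguity-function variance to the fourth moment of the constellation amplitudes. The sketch below only shows how that quantity can be computed for a unit-power constellation and how non-uniform symbol probabilities change it. The helper name fourth_moment and the exponential weighting with parameter 0.1 are assumptions chosen for illustration. They are not the paper's optimized distribution.

```python
import itertools
import math


def fourth_moment(points, probs=None):
    """Return E|x|^4 after scaling the constellation so that E|x|^2 = 1."""
    if probs is None:
        probs = [1.0 / len(points)] * len(points)  # uniform signaling
    power = sum(p * abs(x) ** 2 for x, p in zip(points, probs))
    fourth = sum(p * abs(x) ** 4 for x, p in zip(points, probs))
    return fourth / power ** 2


# Square constellations built from their in-phase/quadrature levels.
qpsk = [complex(i, q) for i, q in itertools.product((-1, 1), repeat=2)]
qam16 = [complex(i, q) for i, q in itertools.product((-3, -1, 1, 3), repeat=2)]

# Hypothetical non-uniform probabilities that favor low-energy points.
weights = [math.exp(-0.1 * abs(x) ** 2) for x in qam16]
shaped_probs = [w / sum(weights) for w in weights]

print("QPSK, uniform:   ", fourth_moment(qpsk))
print("16-QAM, uniform: ", fourth_moment(qam16))
print("16-QAM, weighted:", fourth_moment(qam16, shaped_probs))
```

Constant-modulus QPSK gives a normalized fourth moment of 1, and uniform 16-QAM gives 1.32. Changing the symbol probabilities moves the value, which is the degree of freedom a shaping scheme can act on.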

arXiv:2312.15873 [pdf, other]

Investigating Inter-Satellite Link Spanning Patterns on Networking Performance in Mega-constellations

Authors: Xiangtong Wang, Xiaodong Han, Menglong Yang, Chuan Xing, Yuqi Wang, Songchen Han, Wei Li

Abstract: Low Earth orbit (LEO) mega-constellations rely on inter-satellite links (ISLs) to provide global connectivity. We note that in addition to the general constellation parameters, the ISL spanning patterns are also greatly influence the final network structure and thus the network performance. In this work, we formulate the ISL spanning patterns, apply different patterns to mega-constellation and generate multiple structures. Then, we delve into the performance estimation of these networks, specifically evaluating network capacity, throughput, latency, and routing path stretch. The experimental findings provide insights into the optimal network structure under diverse conditions, showcasing superior performance when compared to alternative network configurations.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 5 pages

arXiv:2311.06002 [pdf, other]

Fully-Passive versus Semi-Passive IRS-Enabled Sensing: SNR and CRB Comparison

Authors: Xianxin Song, Xinmin Li, Xiaoqi Qin, Jie Xu, Tony Xiao Han, Derrick Wing Kwan Ng

Abstract: This paper investigates the sensing performance of two intelligent reflecting surface (IRS)-enabled non-line-of-sight (NLoS) sensing systems with fully-passive and semi-passive IRSs, respectively. In particular, we consider a fundamental setup with one base station (BS), one uniform linear array (ULA) IRS, and one point target in the NLoS region of the BS. Accordingly, we analyze the sensing signal-to-noise ratio (SNR) performance for a target detection scenario and the estimation Cramér-Rao bound (CRB) performance for a target's direction-of-arrival (DoA) estimation scenario, in cases where the transmit beamforming at the BS and the reflective beamforming at the IRS are jointly optimized. First, for the target detection scenario, we characterize the maximum sensing SNR when the BS-IRS channels are line-of-sight (LoS) and Rayleigh fading, respectively. It is revealed that when the number of reflecting elements $N$ equipped at the IRS becomes sufficiently large, the maximum sensing SNR increases proportionally to $N^2$ for the semi-passive-IRS sensing system, but proportionally to $N^4$ for the fully-passive-IRS counterpart. Then, for the target's DoA estimation scenario, we analyze the minimum CRB performance when the BS-IRS channel follows Rayleigh fading. Specifically, when $N$ grows, the minimum CRB decreases inversely proportionally to $N^4$ and $N^6$ for the semi-passive and fully-passive-IRS sensing systems, respectively. Finally, numerical results are presented to corroborate our analysis across various transmit and reflective beamforming design schemes under general channel setups. It is shown that the fully-passive-IRS sensing system outperforms the semi-passive counterpart when $N$ exceeds a certain threshold. This advantage is attributed to the additional reflective beamforming gain in the IRS-BS path, which efficiently compensates for the path loss for a large $N$.

Submitted 10 November, 2023; originally announced November 2023.

Comments: 13 pages, 7 figures

arXiv:2310.18090 [pdf, ps, other]

Probabilistic Constellation Shaping for OFDM-Based ISAC Signaling

Authors: Zhen Du, Fan Liu, Yifeng Xiong, Tony Xiao Han, Weijie Yuan, Yuanhao Cui, Changhua Yao, Yonina C. Eldar

Abstract: Integrated Sensing and Communications (ISAC) has garnered significant attention as a promising technology for the upcoming sixth-generation wireless communication systems (6G). In pursuit of this goal, a common strategy is that a unified waveform, such as Orthogonal Frequency Division Multiplexing (OFDM), should serve dual-functional roles by enabling simultaneous sensing and communications (S&C) operations. However, the sensing performance of an OFDM communication signal is substantially affected by the randomness of the data symbols mapped from bit streams. Therefore, achieving a balance between preserving communication capability (i.e., the randomness) while improving sensing performance remains a challenging task. To cope with this issue, in this paper we analyze the ambiguity function of the OFDM communication signal modulated by random data. Subsequently, a probabilistic constellation shaping (PCS) method is proposed to devise the probability distributions of constellation points, which is able to strike a scalable S&C tradeoff of the random transmitted signal. Finally, the superiority of the proposed PCS method over conventional uniformly distributed constellations is validated through numerical simulations.

Submitted 27 October, 2023; originally announced October 2023.

arXiv:2310.17661 [pdf, other]

An Overview on IEEE 802.11bf: WLAN Sensing

Authors: Rui Du, Haocheng Hua, Hailiang Xie, Xianxin Song, Zhonghao Lyu, Mengshi Hu, Narengerile, Yan Xin, Stephen McCann, Michael Montemurro, Tony Xiao Han, Jie Xu

Abstract: With recent advancements, the wireless local area network (WLAN) or wireless fidelity (Wi-Fi) technology has been successfully utilized to realize sensing functionalities such as detection, localization, and recognition. However, the WLANs standards are developed mainly for the purpose of communication, and thus may not be able to meet the stringent requirements for emerging sensing applications. To resolve this issue, a new Task Group (TG), namely IEEE 802.11bf, has been established by the IEEE 802.11 working group, with the objective of creating a new amendment to the WLAN standard to meet advanced sensing requirements while minimizing the effect on communications. This paper provides a comprehensive overview on the up-to-date efforts in the IEEE 802.11bf TG. First, we introduce the definition of the 802.11bf amendment and its formation and standardization timeline. Next, we discuss the WLAN sensing use cases with the corresponding key performance indicator (KPI) requirements. After reviewing previous WLAN sensing research based on communication-oriented WLAN standards, we identify their limitations and underscore the practical need for the new sensing-oriented amendment in 802.11bf. Furthermore, we discuss the WLAN sensing framework and procedure used for measurement acquisition, by considering both sensing at sub-7GHz and directional multi-gigabit (DMG) sensing at 60 GHz, respectively, and address their shared features, similarities, and differences. In addition, we present various candidate technical features for IEEE 802.11bf, including waveform/sequence design, feedback types, as well as quantization and compression techniques. We also describe the methodologies and the channel modeling used by the IEEE 802.11bf TG for evaluation. Finally, we discuss the challenges and future research directions to motivate more research endeavors towards this field in details.

Submitted 20 October, 2023; originally announced October 2023.

Comments: 31 pages, 25 figures, this is a significant updated version of arXiv:2207.04859

arXiv:2309.13292 [pdf, other]

Beyond Fairness: Age-Harmless Parkinson's Detection via Voice

Authors: Yicheng Wang, Xiaotian Han, Leisheng Yu, Na Zou

Abstract: Parkinson's disease (PD), a neurodegenerative disorder, often manifests as speech and voice dysfunction. While utilizing voice data for PD detection has great potential in clinical applications, the widely used deep learning models currently have fairness issues regarding different ages of onset. These deep models perform well for the elderly group (age $>$ 55) but are less accurate for the young group (age $\leq$ 55). Through our investigation, the discrepancy between the elderly and the young arises due to 1) an imbalanced dataset and 2) the milder symptoms often seen in early-onset patients. However, traditional debiasing methods are impractical as they typically impair the prediction accuracy for the majority group while minimizing the discrepancy. To address this issue, we present a new debiasing method using GradCAM-based feature masking combined with ensemble models, ensuring that neither fairness nor accuracy is compromised. Specifically, the GradCAM-based feature masking selectively obscures age-related features in the input voice data while preserving essential information for PD detection. The ensemble models further improve the prediction accuracy for the minority (young group). Our approach effectively improves detection accuracy for early-onset patients without sacrificing performance for the elderly group. Additionally, we propose a two-step detection strategy for the young group, offering a practical risk assessment for potential early-onset PD patients.

Submitted 23 September, 2023; originally announced September 2023.

arXiv:2309.11276 [pdf, other]

doi 10.1145/3581783.3611955

Towards Real-Time Neural Video Codec for Cross-Platform Application Using Calibration Information

Authors: Kuan Tian, Yonghang Guan, Jinxi Xiang, Jun Zhang, Xiao Han, Wei Yang

Abstract: The state-of-the-art neural video codecs have outperformed the most sophisticated traditional codecs in terms of RD performance in certain cases. However, utilizing them for practical applications is still challenging for two major reasons. 1) Cross-platform computational errors resulting from floating point operations can lead to inaccurate decoding of the bitstream. 2) The high computational complexity of the encoding and decoding process poses a challenge in achieving real-time performance. In this paper, we propose a real-time cross-platform neural video codec, which is capable of efficiently decoding of 720P video bitstream from other encoding platforms on a consumer-grade GPU. First, to solve the problem of inconsistency of codec caused by the uncertainty of floating point calculations across platforms, we design a calibration transmitting system to guarantee the consistent quantization of entropy parameters between the encoding and decoding stages. The parameters that may have transboundary quantization between encoding and decoding are identified in the encoding stage, and their coordinates will be delivered by auxiliary transmitted bitstream. By doing so, these inconsistent parameters can be processed properly in the decoding stage. Furthermore, to reduce the bitrate of the auxiliary bitstream, we rectify the distribution of entropy parameters using a piecewise Gaussian constraint. Second, to match the computational limitations on the decoding side for real-time video codec, we design a lightweight model. A series of efficiency techniques enable our model to achieve 25 FPS decoding speed on NVIDIA RTX 2080 GPU. Experimental results demonstrate that our model can achieve real-time decoding of 720P videos while encoding on another platform. Furthermore, the real-time model brings up to a maximum of 24.2\% BD-rate improvement from the perspective of PSNR with the anchor H.265.

Submitted 20 September, 2023; originally announced September 2023.

Comments: 14 pages

arXiv:2309.00960 [pdf, other]

Network Topology Inference with Sparsity and Laplacian Constraints

Authors: Jiaxi Ying, Xi Han, Rui Zhou, Xiwen Wang, Hing Cheung So

Abstract: We tackle the network topology inference problem by utilizing Laplacian constrained Gaussian graphical models, which recast the task as estimating a precision matrix in the form of a graph Laplacian. Recent research \cite{ying2020nonconvex} has uncovered the limitations of the widely used $\ell_1$-norm in learning sparse graphs under this model: empirically, the number of nonzero entries in the solution grows with the regularization parameter of the $\ell_1$-norm; theoretically, a large regularization parameter leads to a fully connected (densest) graph. To overcome these challenges, we propose a graph Laplacian estimation method incorporating the $\ell_0$-norm constraint. An efficient gradient projection algorithm is developed to solve the resulting optimization problem, characterized by sparsity and Laplacian constraints. Through numerical experiments with synthetic and financial time-series datasets, we demonstrate the effectiveness of the proposed method in network topology inference.

Submitted 2 September, 2023; originally announced September 2023.

arXiv:2308.07733 [pdf, other]

doi 10.1145/3581783.3612187

Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression

Authors: Yue Lv, Jinxi Xiang, Jun Zhang, Wenming Yang, Xiao Han, Wei Yang

Abstract: The latest advancements in neural image compression show great potential in surpassing the rate-distortion performance of conventional standard codecs. Nevertheless, there exists an indelible domain gap between the datasets utilized for training (i.e., natural images) and those utilized for inference (e.g., artistic images). Our proposal involves a low-rank adaptation approach aimed at addressing the rate-distortion drop observed in out-of-domain datasets. Specifically, we perform low-rank matrix decomposition to update certain adaptation parameters of the client's decoder. These updated parameters, along with image latents, are encoded into a bitstream and transmitted to the decoder in practical scenarios. Due to the low-rank constraint imposed on the adaptation parameters, the resulting bit rate overhead is small. Furthermore, the bit rate allocation of low-rank adaptation is \emph{non-trivial}, considering the diverse inputs require varying adaptation bitstreams. We thus introduce a dynamic gating network on top of the low-rank adaptation method, in order to decide which decoder layer should employ adaptation. The dynamic adaptation network is optimized end-to-end using rate-distortion loss. Our proposed method exhibits universality across diverse image datasets. Extensive results demonstrate that this paradigm significantly mitigates the domain gap, surpassing non-adaptive methods with an average BD-rate improvement of approximately $19\%$ across out-of-domain images. Furthermore, it outperforms the most advanced instance adaptive methods by roughly $5\%$ BD-rate. Ablation studies confirm our method's ability to universally enhance various image compression architectures.

Submitted 15 August, 2023; originally announced August 2023.

Comments: Accepted by ACM MM 2023, 13 pages, 12 figures

ACM Class: I.4.2; E.4

arXiv:2308.00187 [pdf, ps, other]

Detecting the Anomalies in LiDAR Pointcloud

Authors: Chiyu Zhang, Ji Han, Yao Zou, Kexin Dong, Yujia Li, Junchun Ding, Xiaoling Han

Abstract: LiDAR sensors play an important role in the perception stack of modern autonomous driving systems. Adverse weather conditions such as rain, fog and dust, as well as some (occasional) LiDAR hardware fault may cause the LiDAR to produce pointcloud with abnormal patterns such as scattered noise points and uncommon intensity values. In this paper, we propose a novel approach to detect whether a LiDAR is generating anomalous pointcloud by analyzing the pointcloud characteristics. Specifically, we develop a pointcloud quality metric based on the LiDAR points' spatial and intensity distribution to characterize the noise level of the pointcloud, which relies on pure mathematical analysis and does not require any labeling or training as learning-based methods do. Therefore, the method is scalable and can be quickly deployed either online to improve the autonomy safety by monitoring anomalies in the LiDAR data or offline to perform in-depth study of the LiDAR behavior over large amount of data. The proposed approach is studied with extensive real public road data collected by LiDARs with different scanning mechanisms and laser spectrums, and is proven to be able to effectively handle various known and unknown sources of pointcloud anomaly.

Submitted 31 July, 2023; originally announced August 2023.

arXiv:2307.00174 [pdf, other]

Multiscale Progressive Text Prompt Network for Medical Image Segmentation

Authors: Xianjun Han, Qianqian Chen, Zhaoyang Xie, Xuejun Li, Hongyu Yang

Abstract: The accurate segmentation of medical images is a crucial step in obtaining reliable morphological statistics. However, training a deep neural network for this task requires a large amount of labeled data to ensure high-accuracy results. To address this issue, we propose using progressive text prompts as prior knowledge to guide the segmentation process. Our model consists of two stages. In the first stage, we perform contrastive learning on natural images to pretrain a powerful prior prompt encoder (PPE). This PPE leverages text prior prompts to generate multimodality features. In the second stage, medical image and text prior prompts are sent into the PPE inherited from the first stage to achieve the downstream medical image segmentation task. A multiscale feature fusion block (MSFF) combines the features from the PPE to produce multiscale multimodality features. These two progressive features not only bridge the semantic gap but also improve prediction accuracy. Finally, an UpAttention block refines the predicted results by merging the image and text features. This design provides a simple and accurate way to leverage multiscale progressive text prior prompts for medical image segmentation. Compared with using only images, our model achieves high-quality results with low data annotation costs. Moreover, our model not only has excellent reliability and validity on medical images but also performs well on natural images. The experimental results on different image datasets demonstrate that our model is effective and robust for image segmentation.

Submitted 30 June, 2023; originally announced July 2023.

arXiv:2306.09025 [pdf, other]

CoverHunter: Cover Song Identification with Refined Attention and Alignments

Authors: Feng Liu, Deyi Tuo, Yinan Xu, Xintong Han

Abstract: Cover song identification (CSI) focuses on finding the same music with different versions in reference anchors given a query track. In this paper, we propose a novel system named CoverHunter that overcomes the shortcomings of existing detection schemes by exploring richer features with refined attention and alignments. CoverHunter contains three key modules: 1) A convolution-augmented transformer (i.e., Conformer) structure that captures both local and global feature interactions in contrast to previous methods mainly relying on convolutional neural networks; 2) An attention-based time pooling module that further exploits the attention in the time dimension; 3) A novel coarse-to-fine training scheme that first trains a network to roughly align the song chunks and then refines the network by training on the aligned chunks. At the same time, we also summarize some important training tricks used in our system that help achieve better results. Experiments on several standard CSI datasets show that our method significantly improves over state-of-the-art methods with an embedding size of 128 (2.3% on SHS100K-TEST and 17.7% on DaTacos).

Submitted 15 June, 2023; originally announced June 2023.

Comments: 6 pages, 3 figures

arXiv:2306.07105 [pdf, ps, other]

STAR-RIS Assisted Covert Communications in NOMA Systems

Authors: Han Xiao, Xiaoyan Hu, Tong-Xing Zheng, Kai-Kit Wong

Abstract: Covert communications assisted by simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) in non-orthogonal multiple access (NOMA) systems have been explored in this paper. In particular, the access point (AP) transmitter adopts NOMA to serve a downlink covert user and a public user. The minimum detection error probability (DEP) at the warden is derived considering the uncertainty of its background noise, which is used as a covertness constraint. We aim at maximizing the covert rate of the system by jointly optimizing the AP's transmit power and the passive beamforming of the STAR-RIS, under the covertness and quality of service (QoS) constraints. An iterative algorithm is proposed to effectively solve the non-convex optimization problem. Simulation results show that the proposed scheme significantly outperforms the conventional RIS-based scheme in ensuring system covert performance.

Submitted 12 June, 2023; originally announced June 2023.

Comments: arXiv admin note: text overlap with arXiv:2305.04930, arXiv:2305.03991

arXiv:2306.03835 [pdf, other]

Atrial Septal Defect Detection in Children Based on Ultrasound Video Using Multiple Instances Learning

Authors: Yiman Liu, Qiming Huang, Xiaoxiang Han, Tongtong Liang, Zhifang Zhang, Lijun Chen, **feng Wang, Angelos Stefanidis, Jionglong Su, Jiangang Chen, Qingli Li, Yuqi Zhang

Abstract: Purpose: Congenital heart defect (CHD) is the most common birth defect. Thoracic echocardiography (TTE) can provide sufficient cardiac structure information, evaluate hemodynamics and cardiac function, and is an effective method for atrial septal defect (ASD) examination. This paper aims to study a deep learning method based on cardiac ultrasound video to assist in ASD diagnosis. Materials and methods: We select two standard views, the atrial septum (subAS) view and the low parasternal four-compartment (LPS4C) view, to identify ASD. We enlist data from 300 pediatric patients as part of a double-blind experiment for five-fold cross-validation to verify the performance of our model. In addition, data from 30 pediatric patients (15 positives and 15 negatives) are collected for clinician testing and compared to our model test results (these 30 samples do not participate in model training). We propose an echocardiography video-based atrial septal defect diagnosis system. In our model, we present a block random selection, maximal agreement decision and frame sampling strategy for training and testing, respectively. ResNet18 and r3D networks are used to extract the frame features and aggregate them to build a rich video-level representation. Results: We validate our model on our private dataset by five-fold cross-validation. For ASD detection, we achieve 89.33 AUC, 84.95 accuracy, 85.70 sensitivity, 81.51 specificity and 81.99 F1 score. Conclusion: The proposed model is a multiple-instance-learning-based deep learning model for video atrial septal defect detection, which effectively improves ASD detection accuracy when compared to the performance of previous networks and clinical doctors.

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2305.15890 [pdf, ps, other]

doi 10.1109/MCOMSTD.0004.2200068

Flexible Spectrum Orchestration of Carrier Aggregation for 5G-Advanced

Authors: Xianghui Han, Chunli Liang, Ruiqi Liu, Xingguang Wei, Mengzhu Chen, Yu-Ngok Ruyue Li, Shi **

Abstract: With increasing availability of spectrum in the market due to new spectrum allocation and re-farming bands from previous cellular generation networks, a more flexible, efficient and green usage of the spectrum becomes an important topic in 5G-Advanced. In this article, we provide an overview on the 3rd Generation Partnership Project (3GPP) work on flexible spectrum orchestration for carrier aggregation (CA). The configuration settings, requirements and potential specification impacts are analyzed. Some involved Release 18 techniques, such as multi-cell scheduling, transmitter switching and network energy saving, are also presented. Evaluation results show that clear performance gain can be achieved by these techniques.

Submitted 25 May, 2023; originally announced May 2023.

Comments: Accepted by the IEEE Communications Standards Magazine. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material, creating new works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal ref: IEEE Communications Standards Magazine ( Volume: 7, Issue: 4, December 2023)

arXiv:2305.03991 [pdf, ps, other]

STAR-RIS Aided Covert Communication

Authors: Han Xiao, Xiaoyan Hu, Pengcheng Mu, Wenjie Wang, Tong-Xing Zheng, Kai-Kit Wong, Kun Yang

Abstract: This paper investigates the multi-antenna covert communications assisted by a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS). In particular, to shelter the existence of communications between transmitter and receiver from a warden, a friendly full-duplex receiver with two antennas is leveraged to make contributions to confuse the warden. Considering the worst case, the closed-form expression of the minimum detection error probability (DEP) at the warden is derived and utilized as a covert constraint. Then, we formulate an optimization problem maximizing the covert rate of the system under the covertness constraint and quality of service (QoS) constraint with communication outage analysis. To jointly design the active and passive beamforming of the transmitter and STAR-RIS, an iterative algorithm based on globally convergent version of method of moving asymptotes (GCMMA) is proposed to effectively solve the non-convex optimization problem. Simulation results show that the proposed STAR-RIS-assisted scheme highly outperforms the case with conventional RIS.

Submitted 30 August, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

arXiv:2303.12406 [pdf]

Phase regeneration of QPSK signals based on Kerr soliton combs in a highly nonlinear optical fiber

Authors: Xinjie Han, Yong Geng, Haocheng Ke, Kun Qiu

Abstract: We demonstrate an all-optical phase regeneration technique based on Kerr soliton combs, which can realize degraded quaternary phase shift keying (QPSK) signal regeneration through phase-sensitive amplification. A Kerr soliton comb is generated at the receiver side of optical communication systems based on a carrier recovery scheme and is used as coherent dual pumps to achieve phase regeneration. Our study will enhance the relay and reception performance of all-optical communication systems.

Submitted 26 March, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: 8 pages, 5 figures

arXiv:2303.06340 [pdf, other]

Intelligent diagnostic scheme for lung cancer screening with Raman spectra data by tensor network machine learning

Authors: Yu-Jia An, Sheng-Chen Bai, Lin Cheng, Xiao-Guang Li, Cheng-en Wang, Xiao-Dong Han, Gang Su, Shi-Ju Ran, Cong Wang

Abstract: Artificial intelligence (AI) has brought tremendous impacts on biomedical sciences from academic research to clinical applications, such as in biomarker detection and diagnosis, optimization of treatment, and identification of new therapeutic targets in drug discovery. However, contemporary AI technologies, particularly deep machine learning (ML), severely suffer from non-interpretability, which might uncontrollably lead to incorrect predictions. Interpretability is particularly crucial to ML for clinical diagnosis, as the consumers must gain the necessary sense of security and trust from firm grounds or convincing interpretations. In this work, we propose a tensor-network (TN)-ML method to reliably predict lung cancer patients and their stages via screening Raman spectra data of volatile organic compounds (VOCs) in exhaled breath, which are generally suitable as biomarkers and are considered to be an ideal way for non-invasive lung cancer screening. The prediction of TN-ML is based on the mutual distances of the breath samples mapped to the quantum Hilbert space. Thanks to the quantum probabilistic interpretation, the certainty of the predictions can be quantitatively characterized. The accuracy of the samples with high certainty is almost 100%. The incorrectly classified samples exhibit obviously lower certainty, and thus can be identified as anomalies, which will be handled by human experts to guarantee high reliability. Our work sheds light on shifting "AI for biomedical sciences" from the conventional non-interpretable ML schemes to interpretable human-ML interactive approaches, for the purpose of high accuracy and reliability.

Submitted 11 March, 2023; originally announced March 2023.

Comments: 10 pages, 7 figures

arXiv:2303.04968 [pdf]

Reconstruction of Cardiac Cine MRI under Free-breathing using Motion-guided Deformable Alignment and Multi-resolution Fusion

Authors: Xiaoxiang Han, Qiaohong Liu, Yiman Liu, Keyan Chen, Yuanjie Lin, Weikun Zhang

Abstract: Objective: Cardiac cine magnetic resonance imaging (MRI) is one of the important means to assess cardiac functions and vascular abnormalities. However, due to cardiac beat, blood flow, or the patient's involuntary movement during the long acquisition, the reconstructed images are prone to motion artifacts that affect the clinical diagnosis. Therefore, accelerated cardiac cine MRI acquisition to achieve high-quality images is necessary for clinical practice. Approach: A novel end-to-end deep learning network is developed to improve cardiac cine MRI reconstruction under free breathing conditions. First, a U-Net is adopted to obtain the initial reconstructed images in k-space. Further to remove the motion artifacts, the Motion-Guided Deformable Alignment (MGDA) method with second-order bidirectional propagation is introduced to align the adjacent cine MRI frames by maximizing spatial-temporal information to alleviate motion artifacts. Finally, the Multi-Resolution Fusion (MRF) module is designed to correct the blur and artifacts generated from alignment operation and obtain the last high-quality reconstructed cardiac images. Main results: At an 8× acceleration rate, the numerical measurements on the ACDC dataset are SSIM of 78.40% ± 4.57%, PSNR of 30.46 ± 1.22 dB, and NMSE of 0.0468 ± 0.0075. On the ACMRI dataset, the results are SSIM of 87.65% ± 4.20%, PSNR of 30.04 ± 1.18 dB, and NMSE of 0.0473 ± 0.0072. Significance: The proposed method exhibits high-quality results with richer details and fewer artifacts for cardiac cine MRI reconstruction on different accelerations under free breathing conditions.

Submitted 24 September, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

Comments: 28 pages, 5 tables, 11 figures

arXiv:2302.13869 [pdf, other]

doi 10.1016/j.bspc.2023.105280

EDMAE: An Efficient Decoupled Masked Autoencoder for Standard View Identification in Pediatric Echocardiography

Authors: Yiman Liu, Xiaoxiang Han, Tongtong Liang, Bin Dong, Jiajun Yuan, Menghan Hu, Qiaohong Liu, Jiangang Chen, Qingli Li, Yuqi Zhang

Abstract: This paper introduces the Efficient Decoupled Masked Autoencoder (EDMAE), a novel self-supervised method for recognizing standard views in pediatric echocardiography. EDMAE introduces a new proxy task based on the encoder-decoder structure. The EDMAE encoder is composed of a teacher and a student encoder. The teacher encoder extracts the potential representation of the masked image blocks, while the student encoder extracts the potential representation of the visible image blocks. The loss is calculated between the feature maps output by the two encoders to ensure consistency in the latent representations they extract. EDMAE uses pure convolution operations instead of the ViT structure in the MAE encoder. This improves training efficiency and convergence speed. EDMAE is pre-trained on a large-scale private dataset of pediatric echocardiography using self-supervised learning, and then fine-tuned for standard view recognition. The proposed method achieves high classification accuracy in 27 standard views of pediatric echocardiography. To further verify the effectiveness of the proposed method, the authors perform another downstream task of cardiac ultrasound segmentation on the public dataset CAMUS. The experimental results demonstrate that the proposed method outperforms some popular supervised and recent self-supervised methods, and is more competitive on different downstream tasks.

Submitted 3 August, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: 15 pages, 5 figures, 8 tables, Published in Biomedical Signal Processing and Control

Journal ref: Biomedical Signal Processing and Control 86 (2023) 105280

arXiv:2301.04889 [pdf]

Artificial intelligence for diagnosing and predicting survival of patients with renal cell carcinoma: Retrospective multi-center study

Authors: Siteng Chen, Xiyue Wang, Jun Zhang, Liren Jiang, Ning Zhang, Feng Gao, Wei Yang, **xi Xiang, Sen Yang, Junhua Zheng, Xiao Han

Abstract: Background: Clear cell renal cell carcinoma (ccRCC) is the most common renal-related tumor with high heterogeneity. There is still an urgent need for novel diagnostic and prognostic biomarkers for ccRCC. Methods: We proposed a weakly-supervised deep learning strategy using conventional histology of 1752 whole slide images from multiple centers. Our study was demonstrated through internal cross-validation and external validations for the deep learning-based models. Results: Automatic diagnosis for ccRCC through intelligent subtyping of renal cell carcinoma was proved in this study. Our graderisk achieved area under the curve (AUC) of 0.840 (95% confidence interval: 0.805-0.871) in the TCGA cohort, 0.840 (0.805-0.871) in the General cohort, and 0.840 (0.805-0.871) in the CPTAC cohort for the recognition of high-grade tumor. The OSrisk for the prediction of 5-year survival status achieved AUC of 0.784 (0.746-0.819) in the TCGA cohort, which was further verified in the independent General cohort and the CPTAC cohort, with AUC of 0.774 (0.723-0.820) and 0.702 (0.632-0.765), respectively. Cox regression analysis indicated that graderisk, OSrisk, tumor grade, and tumor stage were independent prognostic factors, which were further incorporated into the competing-risk nomogram (CRN). Kaplan-Meier survival analyses further illustrated that our CRN could significantly distinguish patients with high survival risk, with hazard ratio of 5.664 (3.893-8.239, p < 0.0001) in the TCGA cohort, 35.740 (5.889-216.900, p < 0.0001) in the General cohort and 6.107 (1.815-20.540, p < 0.0001) in the CPTAC cohort. Comparison analyses confirmed that our CRN outperformed current prognosis indicators in the prediction of survival status, with higher concordance index for clinical prognosis.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2212.10897 [pdf, other]

Deterministic-Random Tradeoff of Integrated Sensing and Communications in Gaussian Channels: A Rate-Distortion Perspective

Authors: Fan Liu, Yifeng Xiong, Kai Wan, Tony Xiao Han, Giuseppe Caire

Abstract: Integrated sensing and communications (ISAC) is recognized as a key enabling technology for future wireless networks. To shed light on the fundamental performance limits of ISAC systems, this paper studies the deterministic-random tradeoff between sensing and communications (S&C) from a rate-distortion perspective under vector Gaussian channels. We model the ISAC signal as a random matrix that carries information, whose realization is perfectly known to the sensing receiver, but is unknown to the communication receiver. We characterize the sensing mutual information conditioned on the random ISAC signal, and show that it provides a universal lower bound for distortion metrics of sensing. Furthermore, we prove that the distortion lower bound is minimized if the sample covariance matrix of the ISAC signal is deterministic. We then offer our understanding of the main results by interpreting wireless sensing as non-cooperative source-channel coding, and reveal the deterministic-random tradeoff of S&C for ISAC systems. Finally, we provide sufficient conditions for the achievability of the distortion bound by analyzing a specific example of target response matrix estimation.

Submitted 4 January, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

Comments: 8 pages, 3 figures, submitted to IEEE ISIT 2023

arXiv:2212.01825 [pdf, other]

doi 10.1109/TMI.2022.3225528

MouseGAN++: Unsupervised Disentanglement and Contrastive Representation for Multiple MRI Modalities Synthesis and Structural Segmentation of Mouse Brain

Authors: Ziqi Yu, Xiaoyang Han, Shengjie Zhang, Jianfeng Feng, Tingying Peng, Xiao-Yong Zhang

Abstract: Segmenting the fine structure of the mouse brain on magnetic resonance (MR) images is critical for delineating morphological regions, analyzing brain function, and understanding their relationships. Compared to a single MRI modality, multimodal MRI data provide complementary tissue features that can be exploited by deep learning models, resulting in better segmentation results. However, multimodal mouse brain MRI data is often lacking, making automatic segmentation of mouse brain fine structure a very challenging task. To address this issue, it is necessary to fuse multimodal MRI data to produce distinguished contrasts in different brain structures. Hence, we propose a novel disentangled and contrastive GAN-based framework, named MouseGAN++, to synthesize multiple MR modalities from single ones in a structure-preserving manner, thus improving the segmentation performance by imputing missing modalities and multi-modality fusion. Our results demonstrate that the translation performance of our method outperforms the state-of-the-art methods. Using the subsequently learned modality-invariant information as well as the modality-translated images, MouseGAN++ can segment fine brain structures with averaged dice coefficients of 90.0% (T2w) and 87.9% (T1w), respectively, achieving around +10% performance improvement compared to the state-of-the-art algorithms. Our results demonstrate that MouseGAN++, as a simultaneous image synthesis and segmentation method, can be used to fuse cross-modality information in an unpaired manner and yield more robust performance in the absence of multimodal data. We release our method as a mouse brain structural segmentation tool for free academic usage at https://github.com/yu02019.

Submitted 4 December, 2022; originally announced December 2022.

Comments: IEEE Transactions on Medical Imaging (IEEE-TMI) 2022

arXiv:2212.01144 [pdf]

Resolution enhancement of NMR by decoupling with low-rank Hankel model

Authors: Tianyu Qiu, Amir Jahangiri, Xiao Han, Dmitry Lesovoy, Tatiana Agback, Peter Agback, Adnane Achour, Xiaobo Qu, Vladislav Orekhov

Abstract: Nuclear magnetic resonance (NMR) spectroscopy has become a formidable tool for biochemistry and medicine. Although J-coupling carries essential structural information, it may also limit the spectral resolution. Homonuclear decoupling remains a challenging problem. In this work, we introduce a new approach that uses a specific coupling value as prior knowledge, and the Hankel property of the exponential NMR signal, to achieve the broadband heteronuclear decoupling using the low-rank method. Our results on synthetic and realistic HMQC spectra demonstrate that the proposed method not only effectively enhances resolution by decoupling, but also maintains sensitivity and suppresses spectral artefacts. The approach can be combined with non-uniform sampling, which means that the resolution can be further improved without any extra acquisition time.

Submitted 2 December, 2022; originally announced December 2022.

Comments: 8 pages, 4 figures

arXiv:2210.16592 [pdf, other]

Cramér-Rao Bound Minimization for IRS-Enabled Multiuser Integrated Sensing and Communication with Extended Target

Authors: Xianxin Song, Tony Xiao Han, Jie Xu

Abstract: This paper investigates an intelligent reflecting surface (IRS) enabled multiuser integrated sensing and communication (ISAC) system, which consists of one multi-antenna base station (BS), one IRS, multiple single-antenna communication users (CUs), and one extended target at the non-line-of-sight (NLoS) region of the BS. The IRS is deployed to not only assist the communication from the BS to the CUs, but also enable the BS's NLoS target sensing based on the echo signals from the BS-IRS-target-IRS-BS link. To provide full degrees of freedom for sensing, we suppose that the BS sends additional dedicated sensing signals combined with the information signals. Accordingly, we consider two types of CU receivers, namely Type-I and Type-II receivers, which do not have and have the capability of cancelling the interference from the sensing signals, respectively. Under this setup, we jointly optimize the transmit beamforming at the BS and the reflective beamforming at the IRS to minimize the Cramér-Rao bound (CRB) for estimating the target response matrix with respect to the IRS, subject to the minimum signal-to-interference-plus-noise ratio (SINR) constraints at the CUs and the maximum transmit power constraint at the BS. We present efficient algorithms to solve the highly non-convex SINR-constrained CRB minimization problems, by using the techniques of alternating optimization and semi-definite relaxation. Numerical results show that the proposed design achieves lower estimation CRB than other benchmark schemes, and the sensing signal interference pre-cancellation is beneficial when the number of CUs is greater than one.

Submitted 29 October, 2022; originally announced October 2022.

Comments: 6 pages, 3 figures

arXiv:2210.14645 [pdf, other]

Super-Resolution Based Patch-Free 3D Image Segmentation with High-Frequency Guidance

Authors: Hongyi Wang, Lanfen Lin, Hongjie Hu, Qingqing Chen, Yinhao Li, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

Abstract: High resolution (HR) 3D images are widely used nowadays, such as medical images like Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). However, segmentation of these 3D images remains a challenge due to their high spatial resolution and dimensionality in contrast to currently limited GPU memory. Therefore, most existing 3D image segmentation methods use patch-based models, which have low inference efficiency and ignore global contextual information. To address these problems, we propose a super-resolution (SR) based patch-free 3D image segmentation framework that can realize HR segmentation from a global-wise low-resolution (LR) input. The framework contains two sub-tasks, of which semantic segmentation is the main task and super resolution is an auxiliary task aiding in rebuilding the high frequency information from the LR input. To further balance the information loss with the LR input, we propose a High-Frequency Guidance Module (HGM), and design an efficient selective cropping algorithm to crop an HR patch from the original image as restoration guidance for it. In addition, we also propose a Task-Fusion Module (TFM) to exploit the interconnections between the segmentation and SR tasks, realizing joint optimization of the two tasks. When predicting, only the main segmentation task is needed, while other modules can be removed for acceleration. The experimental results on two different datasets show that our framework has a four times higher inference speed compared to traditional patch-based methods, while its performance also surpasses other patch-based and patch-free models.

Submitted 10 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: Version #2 uploaded in Jul 10, 2023

arXiv:2209.13818 [pdf, other]

doi 10.1007/978-3-031-16437-8_28

Denoising of 3D MR images using a voxel-wise hybrid residual MLP-CNN model to improve small lesion diagnostic confidence

Authors: Haibo Yang, Shengjie Zhang, Xiaoyang Han, Botao Zhao, Yan Ren, Yaru Sheng, Xiao-Yong Zhang

Abstract: Small lesions in magnetic resonance imaging (MRI) images are crucial for clinical diagnosis of many kinds of diseases. However, the MRI quality can be easily degraded by various types of noise, which can greatly affect the accuracy of diagnosis of small lesions. Although some methods for denoising MR images have been proposed, task-specific denoising methods for improving the diagnosis confidence of small lesions are lacking. In this work, we propose a voxel-wise hybrid residual MLP-CNN model to denoise three-dimensional (3D) MR images with small lesions. We combine two basic deep learning architectures, MLP and CNN, to obtain an appropriate inherent bias for the image denoising and integrate the output layers in MLP and CNN by adding residual connections to leverage long-range information. We evaluate the proposed method on 720 T2-FLAIR brain images with small lesions at different noise levels. The results show the superiority of our method in both quantitative and visual evaluations on the testing dataset compared to state-of-the-art methods. Moreover, two experienced radiologists agreed that at moderate and high noise levels, our method outperforms other methods in terms of recovery of small lesions and overall image denoising quality. The implementation of our method is available at https://github.com/laowangbobo/Residual_MLP_CNN_Mixer.

Submitted 27 September, 2022; originally announced September 2022.

Comments: accepted by MICCAI 2022

arXiv:2209.05772 [pdf, other]

DeepNoise: Signal and Noise Disentanglement based on Classifying Fluorescent Microscopy Images via Deep Learning

Authors: Sen Yang, Tao Shen, Yuqi Fang, Xiyue Wang, Jun Zhang, Wei Yang, Junzhou Huang, Xiao Han

Abstract: The high-content image-based assay is commonly leveraged for identifying the phenotypic impact of genetic perturbations in the biology field. However, a persistent issue remains unsolved during experiments: the interfering technical noise caused by systematic errors (e.g., temperature, reagent concentration, and well location) is always mixed up with the real biological signals, leading to misinterpretation of any conclusion drawn. Here, we show a mean teacher based deep learning model (DeepNoise) that can disentangle biological signals from the experimental noise. Specifically, we aim to classify the phenotypic impact of 1,108 different genetic perturbations screened from 125,510 fluorescent microscopy images, which are totally unrecognizable by the human eye. We validate our model by participating in the Recursion Cellular Image Classification Challenge, and our proposed method achieves an extremely high classification score (Acc: 99.596%), ranking 2nd among 866 participating groups. This promising result indicates the successful separation of biological and technical factors, which might help decrease the cost of treatment development and expedite the drug discovery process.

Submitted 7 December, 2022; v1 submitted 13 September, 2022; originally announced September 2022.

arXiv:2207.10427 [pdf, other]

A Two-stage Multiband WiFi Sensing Scheme via Stochastic Particle-Based Variational Bayesian Inference

Authors: Zhixiang Hu, An Liu, Yubo Wan, Tony Xiao Han, Minjian Zhao

Abstract: Multiband fusion enhances WiFi sensing by jointly utilizing signals from multiple non-contiguous frequency bands. However, in the multi-band WiFi sensing signal model, there are many local optima in the associated likelihood function due to the existence of high-frequency components and phase distortion factors, posing challenges for high-accuracy parameter estimation. To address this, we propose a two-stage scheme equipped with different signal models derived from the original model, where the first-stage coarse estimation is performed using a weighted root MUSIC algorithm to narrow down the search range for the subsequent stage, and the second-stage refined estimation utilizes a Bayesian approach to avoid convergence to bad suboptimal solutions. Specifically, we apply the block stochastic successive convex approximation (SSCA) approach to derive a novel stochastic particle-based variational Bayesian inference (SPVBI) algorithm in the refined stage. Unlike conventional particle-based VBI (PVBI) that optimizes only particle probability and incurs exponential per-iteration complexity with particle count, our more flexible SPVBI algorithm optimizes both the position and probability of each particle. Additionally, it utilizes block SSCA to significantly improve sampling efficiency by averaging over iterations, making it suitable for high-dimensional problems. Extensive simulations demonstrate the superiority of our proposed algorithm over various baseline methods.

Submitted 9 October, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

arXiv:2207.10306 [pdf, ps, other]

Fundamental Limits and Optimization of Multiband Sensing

Authors: Yubo Wan, An Liu, Rui Du, Tony Xiao Han

Abstract: Multiband sensing is a promising technology that utilizes multiple non-contiguous frequency bands to achieve high-resolution target sensing. In this paper, we investigate the fundamental limits and optimization of multiband sensing, focusing on the fundamental limits associated with time delay. We first derive a Fisher information matrix (FIM) with a compact form using the Dirichlet kernel and then derive a closed-form expression of the Cramer-Rao bound (CRB) for the delay separation in a simplified case to reveal useful insights. Then, a metric called the statistical resolution limit (SRL) that provides a resolution limit is employed to investigate the fundamental limits of delay resolution. The fundamental limits of delay estimation are also investigated based on the CRB and Ziv-Zakai bound (ZZB). Based on the above derived fundamental limits, numerical results are presented to analyze the effect of frequency band apertures and phase distortions on the performance limits of the multiband sensing systems. We formulate an optimization problem to find the optimal system configuration in multiband sensing systems with the objective of minimizing the delay SRL. To solve this non-convex constrained problem, we propose an efficient alternating optimization (AO) algorithm which iteratively optimizes the variables using successive convex approximation (SCA) and one-dimensional search. Simulation results demonstrate the effectiveness of the proposed algorithm.

Submitted 31 January, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

arXiv:2207.05611 [pdf, other]

doi 10.1109/TSP.2023.3280715

Intelligent Reflecting Surface Enabled Sensing: Cramér-Rao Bound Optimization

Authors: Xianxin Song, Jie Xu, Fan Liu, Tony Xiao Han, Yonina C. Eldar

Abstract: This paper investigates intelligent reflecting surface (IRS) enabled non-line-of-sight (NLoS) wireless sensing, in which an IRS is deployed specifically to assist an access point (AP) in sensing a target in its NLoS region. It is assumed that the AP is equipped with multiple antennas and the IRS is equipped with a uniform linear array. We consider two types of target models, namely the point and extended targets, for which the AP aims to estimate the target's direction-of-arrival (DoA) and the target response matrix with respect to the IRS, respectively, based on the echo signals from the AP-IRS-target-IRS-AP link. Under this setup, we jointly design the transmit beamforming at the AP and the reflective beamforming at the IRS to minimize the Cramér-Rao bound (CRB) on the estimation error. Towards this end, we first obtain the CRB expressions for the two target models in closed form. It is shown that in the point target case, the CRB for estimating the DoA depends on both the transmit and reflective beamformers, while in the extended target case, the CRB for estimating the target response matrix depends only on the transmit beamformers. Next, for the point target case, we optimize the joint beamforming design to minimize the CRB, via alternating optimization, semi-definite relaxation, and successive convex approximation. For the extended target case, we obtain the optimal transmit beamforming solution to minimize the CRB in closed form. Finally, numerical results show that for both cases, the proposed designs based on CRB minimization achieve improved sensing performance in terms of mean squared error, as compared to other traditional schemes.

Submitted 12 July, 2022; originally announced July 2022.

Comments: 14 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:2204.11071

arXiv:2207.04859 [pdf, ps, other]

An Overview on IEEE 802.11bf: WLAN Sensing

Authors: Rui Du, Hailiang Xie, Mengshi Hu, Narengerile, Yan Xin, Stephen McCann, Michael Montemurro, Tony Xiao Han, Jie Xu

Abstract: With recent advancements, the wireless local area network (WLAN) or wireless fidelity (Wi-Fi) technology has been successfully utilized to realize sensing functionalities such as detection, localization, and recognition. However, the WLAN standards are developed mainly for the purpose of communication, and thus may not be able to meet the stringent sensing requirements in emerging applications. To resolve this issue, a new Task Group (TG), namely IEEE 802.11bf, has been established by the IEEE 802.11 working group, with the objective of creating a new amendment to the WLAN standard to meet advanced sensing requirements while minimizing the effect on communications. This paper provides a comprehensive overview of the up-to-date efforts in the IEEE 802.11bf TG. First, we introduce the definition of the 802.11bf amendment and its standardization timeline. Then, we discuss the WLAN sensing procedure and framework used for measurement acquisition, by considering both conventional sensing at sub-7 GHz and directional multi-gigabit (DMG) sensing at 60 GHz. Next, we present various candidate technical features for IEEE 802.11bf, including waveform/sequence design, feedback types, quantization, as well as security and privacy. Finally, we describe the methodologies used by the IEEE 802.11bf TG to evaluate the alternative performance. We hope that this overview paper provides useful insights on IEEE 802.11 WLAN sensing to interested readers and promotes wide deployment of the IEEE 802.11bf standard.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2206.02996 [pdf, other]

doi 10.23919/JCIN.2022.10005216

An Indoor Environment Sensing and Localization System via mmWave Phased Array

Authors: Yifei Sun, Jie Li, Tong Zhang, Rui Wang, Xiaohui Peng, Tony Xiao Han, Haisheng Tan

Abstract: An indoor layout sensing and localization system in the 60 GHz millimeter wave (mmWave) band, named mmReality, is elaborated in this paper. The mmReality system consists of one transmitter and one mobile receiver, each with a phased array and a single radio frequency (RF) chain. To reconstruct the room layout, the pilot signal is delivered from the transmitter to the receiver via different pairs of transmission and receiving beams, so that the signals at all antenna elements can be resolved. Then, the spatial smoothing and two-dimensional multiple signal classification (MUSIC) algorithm is applied to detect the angles-of-arrival (AoAs) and angles-of-departure (AoDs) of the rays from the transmitter to the receiver. Moreover, the technique of multi-carrier ranging is adopted to measure the distance of each propagation path. Synthesizing the above geometrical parameters, the location of the receiver relative to the transmitter can be pinpointed, and both line-of-sight (LoS) and non-line-of-sight (NLoS) paths can be determined. Therefore, the room layout can be reconstructed by moving the receiver and repeating the above measurement in different locations of the room. Finally, we show that the reconstructed room layout can be utilized to locate a mobile device according to its AoA spectrum, even with a single access point.

Submitted 9 January, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

Comments: Paper accepted for publication in Journal of Communications and Information Networks, 2022

arXiv:2206.00493 [pdf, ps, other]

Networked Sensing in 6G Cellular Networks: Opportunities and Challenges

Authors: Liang Liu, Shuowen Zhang, Rui Du, Tong Xiao Han, Shuguang Cui

Abstract: Radar and wireless communication are widely acknowledged as the two most successful applications of radio technology over the past decades. Recently, there has been a trend in both academia and industry to achieve integrated sensing and communication (ISAC) in one system by utilizing a common radio spectrum and the same hardware platform. This article discusses the possibility of exploiting the future sixth-generation (6G) cellular network to realize ISAC. Our vision is that the cellular base stations (BSs) deployed all over the world can be transformed into a powerful sensor to provide high-resolution localization services. Specifically, motivated by the joint encoding/decoding gain in multi-cell coordinated communication, we advocate the adoption of the networked sensing technique in 6G networks to achieve the above goal, where the BSs can share the sensing information with each other for jointly estimating the locations and velocities of the targets. Several opportunities and challenges to realize networked sensing in the 6G era will be revealed in this article. Moreover, the future research directions for this promising trend will be outlined as well.

Submitted 1 June, 2022; originally announced June 2022.

arXiv:2205.04230 [pdf]

RCMNet: A deep learning model assists CAR-T therapy for leukemia

Authors: Ruitao Zhang, Xueying Han, Ijaz Gul, Shiyao Zhai, Ying Liu, Yongbing Zhang, Yuhan Dong, Lan Ma, Dongmei Yu, ** Zhou, Peiwu Qin

Abstract: Acute leukemia is a type of blood cancer with a high mortality rate. Current therapeutic methods include bone marrow transplantation, supportive therapy, and chemotherapy. Although a satisfactory remission of the disease can be achieved, the risk of recurrence is still high. Therefore, novel treatments are in demand. Chimeric antigen receptor-T (CAR-T) therapy has emerged as a promising approach to treat and cure acute leukemia. To harness the therapeutic potential of CAR-T cell therapy for blood diseases, reliable cell morphological identification is crucial. Nevertheless, the identification of CAR-T cells is a big challenge posed by their phenotypic similarity with other blood cells. To address this substantial clinical challenge, herein we first construct a CAR-T dataset with 500 original microscopy images after staining. Following that, we create a novel integrated model called RCMNet (ResNet18 with CBAM and MHSA) that combines the convolutional neural network (CNN) and Transformer. The model shows 99.63% top-1 accuracy on the public dataset. Compared with previous reports, our model obtains satisfactory results for image classification. Although testing on the CAR-T cells dataset, a decent performance is observed, which is attributed to the limited size of the dataset. Transfer learning is adapted for RCMNet and a maximum of 83.36% accuracy has been achieved, which is higher than other SOTA models. The study evaluates the effectiveness of RCMNet on a big public dataset and translates it to a clinical dataset for diagnostic applications.

Submitted 6 May, 2022; originally announced May 2022.

arXiv:2205.04034 [pdf, other]

AI Based Digital Twin Model for Cattle Caring

Authors: Xue Han, Zihuai Lin

Abstract: In this paper, we developed innovative digital twins of cattle status that are powered by artificial intelligence (AI). The work was built on a farm IoT system that remotely monitors and tracks the state of cattle. A digital twin model of cattle health based on Deep Learning (DL) was generated using the sensor data acquired from the farm IoT system. The health and physiological cycle of cattle can be monitored in real time, and the state of the next physiological cycle of cattle can be anticipated using this model. The basis of this work is the vast amount of data which is required to validate the legitimacy of the digital twins model. In terms of behavioural state, it was found that the cattle treated with a combination of topical anaesthetic and meloxicam exhibit the least pain reaction. The digital twins model developed in this work can be used to monitor the health of cattle.

Submitted 9 May, 2022; originally announced May 2022.

arXiv:2204.11071 [pdf, other]

Intelligent Reflecting Surface Enabled Sensing: Cramér-Rao Lower Bound Optimization

Authors: Xianxin Song, Jie Xu, Fan Liu, Tony Xiao Han, Yonina C. Eldar

Abstract: This paper investigates intelligent reflecting surface (IRS) enabled non-line-of-sight (NLoS) wireless sensing, in which an IRS is deployed to assist an access point (AP) to sense a target in its NLoS region. It is assumed that the AP is equipped with multiple antennas and the IRS is equipped with a uniform linear array. The AP aims to estimate the target's direction-of-arrival (DoA) with respect to the IRS, based on the echo signals from the AP-IRS-target-IRS-AP link. Under this setup, we jointly design the transmit beamforming at the AP and the reflective beamforming at the IRS to minimize the Cramér-Rao lower bound (CRLB) on estimation error. Towards this end, we first obtain the CRLB expression for estimating the DoA in closed form. Next, we optimize the joint beamforming design to minimize the CRLB, via alternating optimization, semi-definite relaxation, and successive convex approximation. Numerical results show that the proposed design based on CRLB minimization achieves improved sensing performance in terms of mean squared error, as compared to the traditional schemes with signal-to-noise ratio maximization and separate beamforming.

Submitted 12 October, 2022; v1 submitted 23 April, 2022; originally announced April 2022.

Comments: To appear in 2022 IEEE Globecom Workshop; 6 pages, 3 figures

arXiv:2201.12358 [pdf, other]

EVBattery: A Large-Scale Electric Vehicle Dataset for Battery Health and Capacity Estimation

Authors: Haowei He, **gzhao Zhang, Yanan Wang, Benben Jiang, Shaobo Huang, Chen Wang, Yang Zhang, Gengang Xiong, Xuebing Han, Dongxu Guo, Guannan He, Minggao Ouyang

Abstract: Electric vehicles (EVs) play an important role in reducing carbon emissions. As EV adoption accelerates, safety issues caused by EV batteries have become an important research topic. In order to benchmark and develop data-driven methods for this task, we introduce a large and comprehensive dataset of EV batteries. Our dataset includes charging records collected from hundreds of EVs from three manufacturers over several years. Our dataset is the first large-scale public dataset on real-world battery data, as existing datasets either include only several vehicles or are collected in a lab environment. Meanwhile, our dataset features two types of labels, corresponding to two key tasks: battery health estimation and battery capacity estimation. In addition to demonstrating how existing deep learning algorithms can be applied to this task, we further develop an algorithm that exploits the data structure of battery systems. Our algorithm achieves better results and shows that a customized method can improve model performance. We hope that this public dataset provides valuable resources for researchers, policymakers, and industry professionals to better understand the dynamics of EV battery aging and support the transition toward a sustainable transportation system.

Submitted 1 November, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

Comments: 15 pages, 8 figures

arXiv:2201.10827 [pdf, other]

Incorporate Day-ahead Robustness and Real-time Incentives for Electricity Market Design

Authors: Yi Guo, Xuejiao Han, Xinyang Zhou, Gabriela Hug

Abstract: In this paper, we propose a two-stage electricity market framework to explore the participation of distributed energy resources (DERs) in a day-ahead (DA) market and a real-time (RT) market. The objective is to determine the optimal bidding strategies of the aggregated DERs in the DA market and generate online incentive signals for DER-owners to optimize the social welfare, taking into account network operational constraints. Distributionally robust optimization is used to explicitly incorporate data-based statistical information of renewable forecasts into the supply/demand decisions in the DA market. We evaluate the conservativeness of bidding strategies distinguished by different risk aversion settings. In the RT market, a bi-level time-varying optimization problem is proposed to design the online incentive signals to trade off the RT imbalance penalty for distribution system operators (DSOs) and the costs of individual DER-owners. This enables tracking their optimal dispatch to provide fast balancing services, in the presence of time-varying network states while satisfying the voltage regulation requirement. Simulation results on both the DA wholesale market and the RT balancing market demonstrate the necessity of this two-stage design, as well as its robustness to uncertainties, convergence performance, tracking ability, and the feasibility of the resulting network operations.

Submitted 28 August, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

arXiv:2111.14458 [pdf, ps, other]

Decoupled Low-light Image Enhancement

Authors: Shijie Hao, Xu Han, Yanrong Guo, Meng Wang

Abstract: The visual quality of photographs taken under imperfect lightness conditions can be degraded by multiple factors, e.g., low lightness, imaging noise, color distortion and so on. Current low-light image enhancement models focus on the improvement of low lightness only, or simply deal with all the degradation factors as a whole, therefore leading to a sub-optimal performance. In this paper, we propose to decouple the enhancement model into two sequential stages. The first stage focuses on improving the scene visibility based on a pixel-wise non-linear mapping. The second stage focuses on improving the appearance fidelity by suppressing the remaining degradation factors. The decoupled model facilitates the enhancement in two aspects. On the one hand, the whole low-light enhancement can be divided into two easier subtasks. The first one only aims to enhance the visibility. It also helps to bridge the large intensity gap between the low-light and normal-light images. In this way, the second subtask can be shaped as the local appearance adjustment. On the other hand, since the parameter matrix learned from the first stage is aware of the lightness distribution and the scene structure, it can be incorporated into the second stage as the complementary information. In the experiments, our model demonstrates the state-of-the-art performance in both qualitative and quantitative comparisons, compared with other low-light image enhancement models. In addition, the ablation studies also validate the effectiveness of our model in multiple aspects, such as model structure and loss function. The trained model is available at https://github.com/hanxuhfut/Decoupled-Low-light-Image-Enhancement.

Submitted 29 November, 2021; originally announced November 2021.

Comments: This paper has been accepted in the ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)

arXiv:2111.13511 [pdf, other]

Joint transmit and reflective beamforming for IRS-assisted integrated sensing and communication

Authors: Xianxin Song, Ding Zhao, Haocheng Hua, Tony Xiao Han, Xun Yang, Jie Xu

Abstract: This paper studies an intelligent reflecting surface (IRS)-assisted integrated sensing and communication (ISAC) system, in which one IRS is deployed to not only assist the wireless communication from a multi-antenna base station (BS) to a single-antenna communication user (CU), but also create virtual line-of-sight (LoS) links for sensing targets in areas where LoS links are blocked. We consider that the BS transmits combined information and sensing signals for ISAC. Under this setup, we jointly optimize the transmit information and sensing beamforming at the BS and the reflective beamforming at the IRS, to maximize the IRS's minimum beampattern gain towards the desired sensing angles, subject to the minimum signal-to-noise ratio (SNR) requirement at the CU and the maximum transmit power constraint at the BS. Although the formulated SNR-constrained beampattern gain maximization problem is non-convex and difficult to solve, we present an efficient algorithm to obtain a high-quality solution using alternating optimization and semi-definite relaxation (SDR). Numerical results show that the proposed joint beamforming design achieves improved sensing performance while ensuring the communication requirement as compared to benchmarks without such joint optimization. It is also shown that the use of dedicated sensing beams is beneficial in enhancing the performance for IRS-assisted ISAC.

Submitted 12 February, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

Comments: 6 pages

arXiv:2111.04734 [pdf, other]

Mixed Transformer U-Net For Medical Image Segmentation

Authors: Hongyi Wang, Shiao Xie, Lanfen Lin, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

Abstract: Though U-Net has achieved tremendous success in medical image segmentation tasks, it lacks the ability to explicitly model long-range dependencies. Therefore, Vision Transformers have emerged as alternative segmentation structures recently, for their innate ability of capturing long-range correlations through Self-Attention (SA). However, Transformers usually rely on large-scale pre-training and have high computational complexity. Furthermore, SA can only model self-affinities within a single sample, ignoring the potential correlations of the overall dataset. To address these problems, we propose a novel Transformer module named Mixed Transformer Module (MTM) for simultaneous inter- and intra-affinity learning. MTM first calculates self-affinities efficiently through our well-designed Local-Global Gaussian-Weighted Self-Attention (LGG-SA). Then, it mines inter-connections between data samples through External Attention (EA). By using MTM, we construct a U-shaped model named Mixed Transformer U-Net (MT-UNet) for accurate medical image segmentation. We test our method on two different public datasets, and the experimental results show that the proposed method achieves better performance than other state-of-the-art methods. The code is available at: https://github.com/Dootmaan/MT-UNet.

Submitted 11 November, 2021; v1 submitted 8 November, 2021; originally announced November 2021.

arXiv:2108.07165 [pdf, other]

Integrated Sensing and Communications: Towards Dual-functional Wireless Networks for 6G and Beyond

Authors: Fan Liu, Yuanhao Cui, Christos Masouros, Jie Xu, Tony Xiao Han, Yonina C. Eldar, Stefano Buzzi

Abstract: As the standardization of 5G is being solidified, researchers are speculating what 6G will be. Integrating sensing functionality is emerging as a key feature of the 6G Radio Access Network (RAN), allowing to exploit the dense cell infrastructure of 5G for constructing a perceptive network. In this paper, we provide a comprehensive overview on the background, range of key applications and state-of-the-art approaches of Integrated Sensing and Communications (ISAC). We commence by discussing the interplay between sensing and communications (S&C) from a historical point of view, and then consider multiple facets of ISAC and its performance gains. By introducing both ongoing and potential use cases, we shed light on industrial progress and standardization activities related to ISAC. We analyze a number of performance tradeoffs between S&C, spanning from information theoretical limits, tradeoffs in physical layer performance, to the tradeoff in cross-layer designs. Next, we discuss signal processing aspects of ISAC, namely ISAC waveform design and receive signal processing. As a step further, we provide our vision on the deeper integration between S&C within the framework of perceptive networks, where the two functionalities are expected to mutually assist each other, i.e., communication-assisted sensing and sensing-assisted communications. Finally, we summarize the paper by identifying the potential integration between ISAC and other emerging communication technologies, and their positive impact on the future of wireless networks.

Submitted 16 August, 2021; originally announced August 2021.

Comments: 36 pages, 20 figures, 4 tables, submitted to IEEE for possible publication

arXiv:2107.09621 [pdf, ps, other]

Integrated Sensing and Communication from Learning Perspective: An SDP3 Approach

Authors: Guoliang Li, Shuai Wang, Jie Li, Rui Wang, Fan Liu, Xiaohui Peng, Tony Xiao Han, Chengzhong Xu

Abstract: Characterizing the sensing and communication performance tradeoff in integrated sensing and communication (ISAC) systems is challenging in the applications of learning-based human motion recognition. This is because of the large experimental datasets and the black-box nature of deep neural networks. This paper presents SDP3, a Simulation-Driven Performance Predictor and oPtimizer, which consists of SDP3 data simulator, SDP3 performance predictor and SDP3 performance optimizer. Specifically, the SDP3 data simulator generates vivid wireless sensing datasets in a virtual environment, the SDP3 performance predictor predicts the sensing performance based on the function regression method, and the SDP3 performance optimizer investigates the sensing and communication performance tradeoff analytically. It is shown that the simulated sensing dataset matches the experimental dataset very well in the motion recognition accuracy. By leveraging SDP3, it is found that the achievable region of recognition accuracy and communication throughput consists of a communication saturation zone, a sensing saturation zone, and a communication-sensing adversarial zone, of which the desired balanced performance for ISAC systems lies in the third one.

Submitted 14 February, 2023; v1 submitted 20 July, 2021; originally announced July 2021.

Comments: 13 pages, 9 figures, 3 tables, submitted to IEEE for possible publication

arXiv:2107.06538 [pdf, other]

doi 10.1016/j.neucom.2022.04.037

Transformer with Peak Suppression and Knowledge Guidance for Fine-grained Image Recognition

Authors: Xinda Liu, Lili Wang, Xiaoguang Han

Abstract: Fine-grained image recognition is challenging because discriminative clues are usually fragmented, whether from a single image or multiple images. Despite their significant improvements, most existing methods still focus on the most discriminative parts from a single image, ignoring informative details in other regions and lacking consideration of clues from other associated images. In this paper, we analyze the difficulties of fine-grained image recognition from a new perspective and propose a transformer architecture with the peak suppression module and knowledge guidance module, which respects the diversification of discriminative features in a single image and the aggregation of discriminative clues among multiple images. Specifically, the peak suppression module first utilizes a linear projection to convert the input image into sequential tokens. It then blocks the token based on the attention response generated by the transformer encoder. This module penalizes the attention to the most discriminative parts in the feature learning process, therefore, enhancing the information exploitation of the neglected regions. The knowledge guidance module compares the image-based representation generated from the peak suppression module with the learnable knowledge embedding set to obtain the knowledge response coefficients. Afterwards, it formalizes the knowledge learning as a classification problem using response coefficients as the classification scores. Knowledge embeddings and image-based representations are updated during training so that the knowledge embedding includes discriminative clues for different images. Finally, we incorporate the acquired knowledge embeddings into the image-based representations as comprehensive representations, leading to significantly higher performance. Extensive evaluations on the six popular datasets demonstrate the advantage of the proposed method.

Submitted 10 December, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

arXiv:2107.04306 [pdf, other]

Hepatocellular Carcinoma Segmentation from Digital Subtraction Angiography Videos using Learnable Temporal Difference

Authors: Wenting Jiang, Yicheng Jiang, Lu Zhang, Changmiao Wang, Xiaoguang Han, Shuixing Zhang, Xiang Wan, Shuguang Cui

Abstract: Automatic segmentation of hepatocellular carcinoma (HCC) in Digital Subtraction Angiography (DSA) videos can assist radiologists in efficient diagnosis of HCC and accurate evaluation of tumors in clinical practice. Few studies have investigated HCC segmentation from DSA videos. The task is highly challenging due to motion artifacts in filming, ambiguous boundaries of tumor regions, and high similarity in imaging to other anatomical tissues. In this paper, we raise the problem of HCC segmentation in DSA videos, and build our own DSA dataset. We also propose a novel segmentation network called DSA-LTDNet, including a segmentation sub-network, a temporal difference learning (TDL) module and a liver region segmentation (LRS) sub-network for providing additional guidance. DSA-LTDNet is preferable for learning the latent motion information from DSA videos proactively and boosting segmentation performance. All experiments are conducted on our self-collected dataset. Experimental results show that DSA-LTDNet increases the DICE score by nearly 4% compared to the U-Net baseline.

Submitted 16 September, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

Comments: 10 pages; accepted to MICCAI 2021

Showing 1–50 of 76 results for author: Han, X