Search | arXiv e-print repository

Revolutionizing Wireless Networks with Self-Supervised Learning: A Pathway to Intelligent Communications

Authors: Zhixiang Yang, Hongyang Du, Dusit Niyato, Xudong Wang, Yu Zhou, Lei Feng, Fanqin Zhou, Wenjing Li, Xuesong Qiu

Abstract: With the rapid proliferation of mobile devices and data, next-generation wireless communication systems face stringent requirements for ultra-low latency, ultra-high reliability, and massive connectivity. Traditional AI-driven wireless network designs, while promising, often suffer from limitations such as dependency on labeled data and poor generalization. To address these challenges, we present an integration of self-supervised learning (SSL) into wireless networks. SSL leverages large volumes of unlabeled data to train models, enhancing scalability, adaptability, and generalization. This paper offers a comprehensive overview of SSL, categorizing its application scenarios in wireless network optimization and presenting a case study on its impact on semantic communication. Our findings highlight the potential of SSL to significantly improve wireless network performance without extensive labeled data, paving the way for more intelligent and efficient communication systems.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2405.16258 [pdf, other]

USD: Unsupervised Soft Contrastive Learning for Fault Detection in Multivariate Time Series

Authors: Hong Liu, Xiuxiu Qiu, Yiming Shi, Zelin Zang

Abstract: Unsupervised fault detection in multivariate time series is critical for maintaining the integrity and efficiency of complex systems, with current methodologies largely focusing on statistical and machine learning techniques. However, these approaches often rest on the assumption that data distributions conform to Gaussian models, overlooking the diversity of patterns that can manifest in both normal and abnormal states, thereby diminishing discriminative performance. Our innovation addresses this limitation by introducing a combination of data augmentation and soft contrastive learning, specifically designed to capture the multifaceted nature of state behaviors more accurately. The data augmentation process enriches the dataset with varied representations of normal states, while soft contrastive learning fine-tunes the model's sensitivity to the subtle differences between normal and abnormal patterns, enabling it to recognize a broader spectrum of anomalies. This dual strategy significantly boosts the model's ability to distinguish between normal and abnormal states, leading to a marked improvement in fault detection performance across multiple datasets and settings, thereby setting a new benchmark for unsupervised fault detection in complex systems. The code of our method is available at https://github.com/zangzelin/code_USD.git.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 19 pages, 7 figures, under review

arXiv:2404.16484 [pdf, other]

Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF codec, instead of JPEG. All the proposed methods improve PSNR fidelity over Lanczos interpolation, and process images under 10ms. Out of the 160 participants, 25 teams submitted their code and models. The solutions present novel designs tailored for memory-efficiency and runtime on edge devices. This survey describes the best solutions for real-time SR of compressed high-resolution images.

Submitted 25 April, 2024; originally announced April 2024.

Comments: CVPR 2024, AI for Streaming (AIS) Workshop

arXiv:2404.10388 [pdf, other]

Worst-Case Riemannian Optimization with Uncertain Target Steering Vector for Slow-Time Transmit Sequence of Cognitive Radar

Authors: Xinyu Zhang, Weidong Jiang, Xiangfeng Qiu, Yongxiang Liu

Abstract: Optimization of slow-time transmit sequence endows cognitive radar with the ability to suppress strong clutter in the range-Doppler domain. However, in practice, inaccurate target velocity information or random phase error would induce uncertainty about the actual target steering vector, which would in turn severely deteriorate the performance of the slow-time matched filter. In order to solve this problem, we propose a new optimization method for slow-time transmit sequence design. The proposed method transforms the original non-convex optimization with an uncertain target steering vector into a two-step worst-case optimization problem. For each sub-problem, we develop a corresponding trust-region Riemannian optimization algorithm. By iteratively solving the two sub-problems, a sub-optimal solution can be reached without accurate information about the target steering vector. Furthermore, the convergence property of the proposed algorithms has been analyzed and detailed proof of the convergence is given. Unlike the traditional waveform optimization method, the proposed method is designed to work with an uncertain target steering vector and is therefore more robust in practical radar systems. Numerical simulation results in different scenarios verify the effectiveness of the proposed method in suppressing the clutter and show its advantages in terms of the output signal-to-clutter plus noise ratio (SCNR) over traditional methods.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.05600 [pdf, other]

SpeechAlign: Aligning Speech Generation to Human Preferences

Authors: Dong Zhang, Zhaowei Li, Shimin Li, Xin Zhang, Pengyu Wang, Yaqian Zhou, Xipeng Qiu

Abstract: Speech language models have significantly advanced in generating realistic speech, with neural codec language models standing out. However, the integration of human feedback to align speech outputs to human preferences is often neglected. This paper addresses this gap by first analyzing the distribution gap in codec language models, highlighting how it leads to discrepancies between the training and inference phases, which negatively affects performance. Then we explore leveraging learning from human feedback to bridge the distribution gap. We introduce SpeechAlign, an iterative self-improvement strategy that aligns speech language models to human preferences. SpeechAlign involves constructing a preference codec dataset contrasting golden codec tokens against synthetic tokens, followed by preference optimization to improve the codec language model. This cycle of improvement is carried out iteratively to steadily convert weak models to strong ones. Through both subjective and objective evaluations, we show that SpeechAlign can bridge the distribution gap and facilitate continuous self-improvement of the speech language model. Moreover, SpeechAlign exhibits robust generalization capabilities and works for smaller models. Code and models will be available at https://github.com/0nutation/SpeechGPT.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Work in progress

arXiv:2402.01194 [pdf, other]

A Robust Super-resolution Gridless Imaging Framework for UAV-borne SAR Tomography

Authors: Silin Gao, Wenlong Wang, Muhan Wang, Zhe Zhang, Zai Yang, Xiaolan Qiu, Bingchen Zhang, Yirong Wu

Abstract: Synthetic aperture radar (SAR) tomography (TomoSAR) retrieves three-dimensional (3-D) information from multiple SAR images, effectively addresses the layover problem, and has become pivotal in urban mapping. Unmanned aerial vehicle (UAV) has gained popularity as a TomoSAR platform, offering distinct advantages such as the ability to achieve 3-D imaging in a single flight, cost-effectiveness, rapid deployment, and flexible trajectory planning. The evolution of compressed sensing (CS) has led to the widespread adoption of sparse reconstruction techniques in TomoSAR signal processing, with a focus on $\ell_1$ norm regularization and other grid-based CS methods. However, the discretization of illuminated scene along elevation introduces modeling errors, resulting in reduced reconstruction accuracy, known as the "off-grid" effect. Recent advancements have introduced gridless CS algorithms to mitigate this issue. This paper presents an innovative gridless 3-D imaging framework tailored for UAV-borne TomoSAR. Capitalizing on the pulse repetition frequency (PRF) redundancy inherent in slow UAV platforms, a multiple measurement vectors (MMV) model is constructed to enhance noise immunity without compromising azimuth-range resolution. Given the sparsely placed array elements due to mounting platform constraints, an atomic norm soft thresholding algorithm is proposed for partially observed MMV, offering gridless reconstruction capability and super-resolution. An efficient alternative optimization algorithm is also employed to enhance computational efficiency. Validation of the proposed framework is achieved through computer simulations and flight experiments, affirming its efficacy in UAV-borne TomoSAR applications.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.13527 [pdf, other]

SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation

Authors: Dong Zhang, Xin Zhang, Jun Zhan, Shimin Li, Yaqian Zhou, Xipeng Qiu

Abstract: Benefiting from effective speech modeling, current Speech Large Language Models (SLLMs) have demonstrated exceptional capabilities in in-context speech generation and efficient generalization to unseen speakers. However, the prevailing information modeling process is encumbered by certain redundancies, leading to inefficiencies in speech generation. We propose Chain-of-Information Generation (CoIG), a method for decoupling semantic and perceptual information in large-scale speech generation. Building on this, we develop SpeechGPT-Gen, an 8-billion-parameter SLLM efficient in semantic and perceptual information modeling. It comprises an autoregressive model based on LLM for semantic information modeling and a non-autoregressive model employing flow matching for perceptual information modeling. Additionally, we introduce the novel approach of infusing semantic information into the prior distribution to enhance the efficiency of flow matching. Extensive experimental results demonstrate that SpeechGPT-Gen markedly excels in zero-shot text-to-speech, zero-shot voice conversion, and speech-to-speech dialogue, underscoring CoIG's remarkable proficiency in capturing and modeling speech's semantic and perceptual dimensions. Code and models are available at https://github.com/0nutation/SpeechGPT.

Submitted 25 January, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: work in progress

arXiv:2308.16692 [pdf, other]

SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models

Authors: Xin Zhang, Dong Zhang, Shimin Li, Yaqian Zhou, Xipeng Qiu

Abstract: Current speech large language models build upon discrete speech representations, which can be categorized into semantic tokens and acoustic tokens. However, existing speech tokens are not specifically designed for speech language modeling. To assess the suitability of speech tokens for building speech language models, we established the first benchmark, SLMTokBench. Our results indicate that neither semantic nor acoustic tokens are ideal for this purpose. Therefore, we propose SpeechTokenizer, a unified speech tokenizer for speech large language models. SpeechTokenizer adopts the Encoder-Decoder architecture with residual vector quantization (RVQ). Unifying semantic and acoustic tokens, SpeechTokenizer disentangles different aspects of speech information hierarchically across different RVQ layers. Furthermore, we construct a Unified Speech Language Model (USLM) leveraging SpeechTokenizer. Experiments show that SpeechTokenizer performs comparably to EnCodec in speech reconstruction and demonstrates strong performance on the SLMTokBench benchmark. Also, USLM outperforms VALL-E in zero-shot Text-to-Speech tasks. Code and models are available at https://github.com/ZhangXInFD/SpeechTokenizer/.

Submitted 22 January, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

Comments: Accepted by ICLR 2024. Project page is at https://0nutation.github.io/SpeechTokenizer.github.io/

arXiv:2307.16219 [pdf, other]

Unsupervised Decomposition Networks for Bias Field Correction in MR Image

Authors: Dong Liang, Xingyu Qiu, Kuanquan Wang, Gongning Luo, Wei Wang, Yashu Liu

Abstract: Bias field, which is caused by imperfect MR devices or imaged objects, introduces intensity inhomogeneity into MR images and degrades the performance of MR image analysis methods. Many retrospective algorithms have been developed to facilitate bias correction, among which deep learning-based methods perform best. However, in the training phase, the supervised deep learning-based methods heavily rely on the synthesized bias field. As the formation of the bias field is extremely complex, it is difficult to mimic the true physical property of MR images by synthesized data. Bias field correction and image segmentation are strongly related: the segmentation map is precisely obtained by decoupling the bias field from the original MR image, and the bias value is in turn indicated by the segmentation map. Thus, we propose novel unsupervised decomposition networks that are trained only with biased data to obtain the bias-free MR images. The networks are made up of a segmentation part to predict the probability of every pixel belonging to each class, and an estimation part to calculate the bias field, which are optimized alternately. Furthermore, loss functions based on the combination of fuzzy clustering and the multiplicative bias field are also devised. The proposed loss functions introduce the smoothness of bias field and construct the soft relationships among different classes under intra-consistency constraints. Extensive experiments demonstrate that the proposed method can accurately estimate bias fields and produce better bias correction results. The code is available at https://github.com/LeongDong/Bias-Decomposition-Networks.

Submitted 30 July, 2023; originally announced July 2023.

Comments: Version 1.0

arXiv:2306.10246 [pdf, other]

Conceptual Study and Performance Analysis of Tandem Dual-Antenna Spaceborne SAR Interferometry

Authors: Fengming Hu, Feng Xu, Xiaolan Qiu, Chibiao Ding, Yaqiu Jin

Abstract: Multi-baseline synthetic aperture radar interferometry (MB-InSAR), capable of mapping 3D surface model with high precision, is able to overcome the ill-posed problem in the single-baseline InSAR by use of the baseline diversity. Single pass MB acquisition with the advantages of high coherence and simple phase components has a more practical capability in 3D reconstruction than conventional repeat-pass MB acquisition. Using an asymptotic 3D phase unwrapping (PU), it is possible to get a reliable 3D reconstruction using very sparse acquisitions but the interferograms should follow the optimal baseline design. However, current spaceborne SAR system doesn't satisfy this principle, inducing more difficulties in practical application. In this article, a new concept of Tandem Dual-Antenna SAR Interferometry (TDA-InSAR) system for single-pass reliable 3D surface mapping using the asymptotic 3D PU is proposed. Its optimal MB acquisition is analyzed to achieve both good relative height precision and flexible baseline design. Two indicators, i.e., expected relative height precision and successful phase unwrapping rate, are selected to optimize the system parameters and evaluate the performance of various baseline configurations. Additionally, simulation-based demonstrations are conducted to evaluate the performance in typical scenarios and investigate the impact of various error sources. The results indicate that the proposed TDA-InSAR is able to get the specified MB acquisition for the asymptotic 3D PU, which offers a feasible solution for single-pass 3D SAR imaging.

Submitted 16 June, 2023; originally announced June 2023.

Comments: 16 pages, 20 figures

arXiv:2305.07918 [pdf]

CVGG-Net: Ship Recognition for SAR Images Based on Complex-Valued Convolutional Neural Network

Authors: Dandan Zhao, Zhe Zhang, Dongdong Lu, Jian Kang, Xiaolan Qiu, Yirong Wu

Abstract: Ship target recognition is a vital task in synthetic aperture radar (SAR) imaging applications. Although convolutional neural networks have been successfully employed for SAR image target recognition, surpassing traditional algorithms, most existing research concentrates on the amplitude domain and neglects the essential phase information. Furthermore, several complex-valued neural networks utilize average pooling to achieve full complex values, resulting in suboptimal performance. To address these concerns, this paper introduces a Complex-valued Convolutional Neural Network (CVGG-Net) specifically designed for SAR image ship recognition. CVGG-Net effectively leverages both the amplitude and phase information in complex-valued SAR data. Additionally, this study examines the impact of various widely-used complex activation functions on network performance and presents a novel complex max-pooling method, called Complex Area Max-Pooling. Experimental results from two measured SAR datasets demonstrate that the proposed algorithm outperforms conventional real-valued convolutional neural networks. The proposed framework is validated on several SAR datasets.

Submitted 13 May, 2023; originally announced May 2023.

arXiv:2304.04428 [pdf, other]

SPHR-SAR-Net: Superpixel High-resolution SAR Imaging Network Based on Nonlocal Total Variation

Authors: Guoru Zhou, Zhongqiu Xu, Yizhe Fan, Zhe Zhang, Xiaolan Qiu, Bingchen Zhang, Kun Fu, Yirong Wu

Abstract: High-resolution is a key trend in the development of synthetic aperture radar (SAR), which enables the capture of fine details and accurate representation of backscattering properties. However, traditional high-resolution SAR imaging algorithms face several challenges. Firstly, these algorithms tend to focus on local information, neglecting non-local information between different pixel patches. Secondly, speckle is more pronounced and difficult to filter out in high-resolution SAR images. Thirdly, the process of high-resolution SAR imaging generally involves high time and computational complexity, making real-time imaging difficult to achieve. To address these issues, we propose a Superpixel High-Resolution SAR Imaging Network (SPHR-SAR-Net) for rapid despeckling in high-resolution SAR mode. Based on the concept of superpixel techniques, we initially combine non-convex and non-local total variation as compound regularization. This approach more effectively despeckles and manages the relationship between pixels while reducing bias effects caused by convex constraints. Subsequently, we solve the compound regularization model using the Alternating Direction Method of Multipliers (ADMM) algorithm and unfold it into a Deep Unfolded Network (DUN). The network's parameters are adaptively learned in a data-driven manner, and the learned network significantly increases imaging speed. Additionally, the Deep Unfolded Network is compatible with high-resolution imaging modes such as spotlight, staring spotlight, and sliding spotlight. In this paper, we demonstrate the superiority of SPHR-SAR-Net through experiments in both simulated and real SAR scenarios. The results indicate that SPHR-SAR-Net can rapidly perform high-resolution SAR imaging from raw echo data, producing accurate imaging results.

Submitted 10 April, 2023; originally announced April 2023.

arXiv:2303.10823 [pdf, other]

MF-JMoDL-Net: A Deep Network for Azimuth Undersampling Pattern Design and Ambiguity Suppression for Sparse SAR Imaging

Authors: Yuwei Wu, Zhe Zhang, Xiaolan Qiu, Yao Zhao, Weidong Yu

Abstract: repetition frequency (PRF). Given the system complexity and resource constraints, it is often difficult to achieve high imaging performance and low ambiguity without compromising the swath. In this paper, we propose a joint optimization framework for sparse strip SAR imaging algorithms and azimuth undersampling patterns based on a deep convolutional neural network, combined with matched filter (MF) approximate measurement operators and inverse MF operators, referred to as MF-JMoDL-Net, for sparse SAR imaging methods. Compared with conventional sparse SAR imaging, MF-JMoDL-Net enables us to alleviate the limitations imposed by PRF. In the proposed scheme, joint and continuous optimization of azimuth undersampling patterns and convolutional neural network parameters are implemented to suppress azimuth ambiguity and enhance sparse SAR imaging quality. Experiments and comparisons under various conditions demonstrate the effectiveness and superiority of the proposed framework in imaging results.

Submitted 19 March, 2023; originally announced March 2023.

arXiv:2212.09247 [pdf, other]

ColoristaNet for Photorealistic Video Style Transfer

Authors: Xiaowen Qiu, Ruize Xu, Boan He, Yingtao Zhang, Wenqiang Zhang, Weifeng Ge

Abstract: Photorealistic style transfer aims to transfer the artistic style of an image onto an input image or video while keeping photorealism. In this paper, we think it's the summary statistics matching scheme in existing algorithms that leads to unrealistic stylization. To avoid employing the popular Gram loss, we propose a self-supervised style transfer framework, which contains a style removal part and a style restoration part. The style removal network removes the original image styles, and the style restoration network recovers image styles in a supervised manner. Meanwhile, to address the problems in current feature transformation methods, we propose decoupled instance normalization to decompose feature transformation into style whitening and restylization. It works quite well in ColoristaNet and can transfer image styles efficiently while keeping photorealism. To ensure temporal coherency, we also incorporate optical flow methods and ConvLSTM to embed contextual information. Experiments demonstrate that ColoristaNet can achieve better stylization effects when compared with state-of-the-art algorithms.

Submitted 21 December, 2022; v1 submitted 18 December, 2022; originally announced December 2022.

Comments: 30 pages, 29 figures

arXiv:2211.16855 [pdf, other]

doi 10.1109/TGRS.2023.3268132

ATASI-Net: An Efficient Sparse Reconstruction Network for Tomographic SAR Imaging with Adaptive Threshold

Authors: Muhan Wang, Zhe Zhang, Xiaolan Qiu, Silin Gao, Yue Wang

Abstract: Tomographic SAR technique has attracted remarkable interest for its ability of three-dimensional resolving along the elevation direction via a stack of SAR images collected from different cross-track angles. The emerged compressed sensing (CS)-based algorithms have been introduced into TomoSAR considering its super-resolution ability with limited samples. However, the conventional CS-based methods suffer from several drawbacks, including weak noise resistance, high computational complexity, and complex parameter fine-tuning. Aiming at efficient TomoSAR imaging, this paper proposes a novel efficient sparse unfolding network based on the analytic learned iterative shrinkage thresholding algorithm (ALISTA) architecture with adaptive threshold, named Adaptive Threshold ALISTA-based Sparse Imaging Network (ATASI-Net). The weight matrix in each layer of ATASI-Net is pre-computed as the solution of an off-line optimization problem, leaving only two scalar parameters to be learned from data, which significantly simplifies the training stage. In addition, adaptive threshold is introduced for each azimuth-range pixel, enabling the threshold shrinkage to be not only layer-varied but also element-wise. Moreover, the final learned thresholds can be visualized and combined with the SAR image semantics for mutual feedback. Finally, extensive experiments on simulated and real data are carried out to demonstrate the effectiveness and efficiency of the proposed method.

Submitted 30 November, 2022; originally announced November 2022.

arXiv:2205.02445 [pdf, other]

TomoSAR-ALISTA: Efficient TomoSAR Imaging via Deep Unfolded Network

Authors: Muhan Wang, Zhe Zhang, Yue Wang, Silin Gao, Xiaolan Qiu

Abstract: Synthetic aperture radar (SAR) tomography (TomoSAR) has attracted remarkable interest for its ability in achieving three-dimensional reconstruction along the elevation direction from multiple observations. In recent years, the compressed sensing (CS) technique has been introduced into TomoSAR for its super-resolution ability with limited samples. However, the CS-based methods suffer from several drawbacks, including weak noise resistance, high computational complexity and complex parameter fine-tuning. Among the different CS algorithms, the iterative soft-thresholding algorithm (ISTA) is widely used as a robust reconstruction approach; however, the parameters in the ISTA algorithm are manually chosen, which usually requires a time-consuming fine-tuning process to achieve the best performance. Aiming at efficient TomoSAR imaging, a novel sparse unfolding network named analytic learned ISTA (ALISTA) is proposed towards the TomoSAR imaging problem in this paper, and the key parameters of ISTA are learned from training data via deep learning, which avoids complex parameter fine-tuning and significantly relieves the training burden. In addition, experiments verify that it is feasible to use traditional CS algorithms as training labels, which provides a tangible supervised training method to achieve better 3D reconstruction performance even in the absence of labeled data in real applications.

Submitted 5 May, 2022; originally announced May 2022.

arXiv:2203.08574 [pdf, other]

doi 10.1109/TGRS.2023.3273568

A Novel Gradient Descent Least Squares (GDLS) Algorithm for Efficient SMV Gridless Line Spectrum Estimation with Applications in Tomographic SAR Imaging

Authors: Ruizhe Shi, Zhe Zhang, Xiaolan Qiu, Chibiao Ding

Abstract: This paper presents a novel efficient method for the gridless line spectrum estimation problem with a single snapshot, namely the gradient descent least squares (GDLS) method. Conventional single snapshot (a.k.a. single measure vector or SMV) line spectrum estimation methods either rely on smoothing techniques that sacrifice the array aperture, or adopt the sparsity constraint and utilize compressed sensing (CS) methods by defining prior grids, resulting in the off-grid problem. Recently emerged atomic norm minimization (ANM) methods achieve gridless SMV line spectrum estimation, but their computational complexity is extremely high; thus they are practically infeasible in real applications with large problem scales. Our proposed GDLS method reformulates the line spectrum estimation problem into a least squares (LS) estimation problem and solves the corresponding objective function efficiently via an iterative gradient descent algorithm. The convergence guarantee, computational complexity, as well as performance analysis are discussed in this paper. Numerical simulations and real data experiments show that the proposed GDLS algorithm outperforms state-of-the-art methods, e.g., CS and ANM, in terms of estimation performance. It can completely avoid the off-grid problem, and its computational complexity is significantly lower than that of ANM. Our method has been tested in tomographic SAR (TomoSAR) imaging applications via simulated and real experiment data. Results show great potential of the proposed method in terms of better cloud point performance and eliminating the gridding effect.

Submitted 27 April, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

arXiv:2203.03883 [pdf]

Online Dynamic Parameter Estimation of an Alkaline Electrolysis System Based on Bayesian Inference

Authors: Xiaoyan Qiu, Hang Zhang, Yiwei Qiu, Buxiang Zhou, Tianlei Zang, Ruomei Qi, ** Lin, Jiepeng Wang

Abstract: When directly coupled with fluctuating energy sources such as wind and photovoltaic power, the alkaline electrolysis (AEL) in a power-to-hydrogen (P2H) system is required to operate flexibly by dynamically adjusting its hydrogen production rate. The flexibility characteristics, e.g., loading range and ramping rate, of an AEL system are significantly influenced by some parameters related to the dynamic processes of the AEL system. These parameters are usually difficult to measure directly and may even change with time. To accurately evaluate the flexibility of an AEL system in online operation, this paper presents a Bayesian Inference-based Markov Chain Monte Carlo (MCMC) method to estimate these parameters. Meanwhile, posterior joint probability distributions of the estimated parameters are obtained as a byproduct, which provides valuable physical insight into the AEL systems. Experiments on a 25 kW electrolyzer validate the proposed parameter estimation method.

Submitted 8 March, 2022; originally announced March 2022.

Comments: Accepted by 2022 IEEE 5th International Electrical and Energy Conference

arXiv:2202.03433 [pdf, other]

A Coarse-to-fine Morphological Approach With Knowledge-based Rules and Self-adapting Correction for Lung Nodules Segmentation

Authors: Xinliang Fu, Jiayin Zheng, Juanyun Mai, Yanbo Shao, Minghao Wang, Linyu Li, Zhaoqi Diao, Yulong Chen, Jianyu Xiao, Jian You, Airu Yin, Yang Yang, Xiangcheng Qiu, **sheng Tao, Bo Wang, Hua Ji

Abstract: The segmentation module which precisely outlines the nodules is a crucial step in a computer-aided diagnosis (CAD) system. The most challenging part of such a module is how to achieve high accuracy of the segmentation, especially for the juxtapleural, non-solid and small nodules. In this research, we present a coarse-to-fine methodology that greatly improves the thresholding method performance with a novel self-adapting correction algorithm and effectively removes noisy pixels with well-defined knowledge-based principles. Compared with recent strong morphological baselines, our algorithm, by combining dataset features, achieves state-of-the-art performance on both the public LIDC-IDRI dataset (DSC 0.699) and our private LC015 dataset (DSC 0.760), which closely approaches the performance of SOTA deep learning-based models. Furthermore, unlike most available morphological methods that can only segment the isolated and well-circumscribed nodules accurately, the precision of our method is totally independent of the nodule type or diameter, proving its applicability and generality.

Submitted 7 February, 2022; originally announced February 2022.

arXiv:2201.13392

MHSnet: Multi-head and Spatial Attention Network with False-Positive Reduction for Pulmonary Nodules Detection

Authors: Juanyun Mai, Minghao Wang, Jiayin Zheng, Yanbo Shao, Zhaoqi Diao, Xinliang Fu, Yulong Chen, Jianyu Xiao, Jian You, Airu Yin, Yang Yang, Xiangcheng Qiu, **sheng Tao, Bo Wang, Hua Ji

Abstract: The mortality of lung cancer has ranked high among cancers for many years. Early detection of lung cancer is critical for disease prevention, cure, and mortality rate reduction. However, existing detection methods on pulmonary nodules introduce an excessive number of false positive proposals in order to achieve high sensitivity, which is not practical in clinical situations. In this paper, we propose the multi-head detection and spatial squeeze-and-attention network, MHSnet, to detect pulmonary nodules, in order to aid doctors in the early diagnosis of lung cancers. Specifically, we first introduce multi-head detectors and skip connections to customize for the variety of nodules in sizes, shapes and types and capture multi-scale features. Then, we implement a spatial attention module to enable the network to focus on different regions differently inspired by how experienced clinicians screen CT images, which results in fewer false positive proposals. Lastly, we present a lightweight but effective false positive reduction module with the Linear Regression model to cut down the number of false positive proposals, without any constraints on the front network. Extensive experimental results compared with the state-of-the-art models have shown the superiority of the MHSnet in terms of the average FROC, sensitivity and especially false discovery rate (2.98% and 2.18% improvement in terms of average FROC and sensitivity, 5.62% and 28.33% decrease in terms of false discovery rate and average candidates per scan). The false positive reduction module significantly decreases the average number of candidates generated per scan by 68.11% and the false discovery rate by 13.48%, which is promising to reduce distracted proposals for the downstream tasks based on the detection results.

Submitted 12 May, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

Comments: We have to revise the experiment results and conclusions

arXiv:2112.05900 [pdf]

Automated assessment of disease severity of COVID-19 using artificial intelligence with synthetic chest CT

Authors: Mengqiu Liu, Ying Liu, Yidong Yang, Ai** Liu, Shana Li, Changbing Qu, Xiaohui Qiu, Yang Li, Weifu Lv, Peng Zhang, Jie Wen

Abstract: Background: Triage of patients is important to control the pandemic of coronavirus disease 2019 (COVID-19), especially during the peak of the pandemic when clinical resources become extremely limited. Purpose: To develop a method that automatically segments and quantifies lung and pneumonia lesions with synthetic chest CT and assess disease severity in COVID-19 patients. Materials and Methods: In this study, we incorporated data augmentation to generate synthetic chest CT images using publicly available datasets (285 datasets from "Lung Nodule Analysis 2016"). The synthetic images and masks were used to train a 2D U-net neural network and tested on 203 COVID-19 datasets to generate lung and lesion segmentations. Disease severity scores (DL: damage load; DS: damage score) were calculated based on the segmentations. Correlations between DL/DS and clinical lab tests were evaluated using Pearson's method. A p-value < 0.05 was considered statistically significant. Results: Automatic lung and lesion segmentations were compared with manual annotations. For lung segmentation, the median values of dice similarity coefficient, Jaccard index and average surface distance were 98.56%, 97.15% and 0.49 mm, respectively. The same metrics for lesion segmentation were 76.95%, 62.54% and 2.36 mm, respectively. Significant (p << 0.05) correlations were found between DL/DS and percentage lymphocytes tests, with r-values of -0.561 and -0.501, respectively. Conclusion: An AI system based on thoracic radiography and data augmentation was proposed to segment lung and lesions in COVID-19 patients. Correlations between imaging findings and clinical lab tests suggested the value of this system as a potential tool to assess disease severity of COVID-19.
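As a reading aid for the overlap metrics reported in this abstract, the sketch below computes the Dice similarity coefficient and the Jaccard index for two binary masks given as flat sequences of 0/1 values. It uses the standard textbook definitions; the function name and the convention of returning 1.0 for two empty masks are assumptions of this sketch, not details taken from the paper.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    n_pred = sum(1 for p in pred if p)
    n_truth = sum(1 for t in truth if t)
    union = n_pred + n_truth - intersection
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0
    dice = 2.0 * intersection / (n_pred + n_truth)
    jaccard = intersection / union
    return dice, jaccard
```

For any pair of masks the two scores are linked by Dice = 2J / (1 + J), so a Dice value is always at least as large as the corresponding Jaccard value.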

Submitted 10 December, 2021; originally announced December 2021.

arXiv:2111.11602 [pdf]

Unsupervised COVID-19 Lesion Segmentation in CT Using Cycle Consistent Generative Adversarial Network

Authors: Chengyijue Fang, Yingao Liu, Mengqiu Liu, Xiaohui Qiu, Ying Liu, Yang Li, Jie Wen, Yidong Yang

Abstract: COVID-19 has become a global pandemic and is still posing a severe health risk to the public. Accurate and efficient segmentation of pneumonia lesions in CT scans is vital for treatment decision-making. We proposed a novel unsupervised approach using cycle consistent generative adversarial network (cycle-GAN) which automates and accelerates the process of lesion delineation. The workflow includes lung volume segmentation, "synthetic" healthy lung generation, infected and healthy image subtraction, and binary lesion mask creation. The lung volume was first delineated using a pre-trained U-net and served as the input for the later network. The cycle-GAN was developed to generate synthetic "healthy" lung CT images from infected lung images. After that, the pneumonia lesions were extracted by subtracting the synthetic "healthy" lung CT images from the "infected" lung CT images. A median filter and K-means clustering were then applied to contour the lesions. The auto segmentation approach was validated on two public datasets (Coronacases and Radiopedia). The Dice coefficients reached 0.748 and 0.730, respectively, for the Coronacases and Radiopedia datasets. Meanwhile, the precision and sensitivity for lesion segmentation/detection are 0.813 and 0.735 for the Coronacases dataset, and 0.773 and 0.726 for the Radiopedia dataset. The performance is comparable to existing supervised segmentation networks and outperforms previous unsupervised ones. The proposed unsupervised segmentation method achieved high accuracy and efficiency in automatic COVID-19 lesion delineation. The segmentation result can serve as a baseline for further manual modification and a quality assurance tool for lesion diagnosis. Furthermore, due to its unsupervised nature, the result is not influenced by physicians' experience which otherwise is crucial for supervised methods.

Submitted 22 November, 2021; originally announced November 2021.

Comments: It has been submitted to Medical Physics for peer-review on July 26, 2021

arXiv:2111.07334 [pdf, other]

Relative Distributed Formation and Obstacle Avoidance with Multi-agent Reinforcement Learning

Authors: Yuzi Yan, Xiaoxiang Li, Xinyou Qiu, Jiantao Qiu, Jian Wang, Yu Wang, Yuan Shen

Abstract: Multi-agent formation as well as obstacle avoidance is one of the most actively studied topics in the field of multi-agent systems. Although some classic controllers like model predictive control (MPC) and fuzzy control achieve a certain measure of success, most of them require precise global information which is not accessible in harsh environments. On the other hand, some reinforcement learning (RL) based approaches adopt the leader-follower structure to organize different agents' behaviors, which sacrifices the collaboration between agents and thus suffers from bottlenecks in maneuverability and robustness. In this paper, we propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL). Agents in our system only utilize local and relative information to make decisions and control themselves distributively. Agents in the multi-agent system reorganize themselves into a new topology quickly if any of them is disconnected. Our method achieves better performance regarding formation error and formation convergence rate, and an on-par success rate of obstacle avoidance, compared with baselines (both classic control methods and another RL-based method). The feasibility of our method is verified by both simulation and hardware implementation with Ackermann-steering vehicles.

Submitted 14 November, 2021; originally announced November 2021.

arXiv:2109.03037 [pdf, other]

doi 10.1109/ICAS49788.2021.9551123

A drl based distributed formation control scheme with stream based collision avoidance

Authors: Xinyou Qiu, Xiaoxiang Li, Jian Wang, Yu Wang, Yuan Shen

Abstract: Formation and collision avoidance abilities are essential for multi-agent systems. Conventional methods usually require a central controller and global information to achieve collaboration, which is impractical in an unknown environment. In this paper, we propose a deep reinforcement learning (DRL) based distributed formation control scheme for autonomous vehicles. A modified stream-based obstacle avoidance method is applied to smoothen the optimal trajectory, and onboard sensors such as Lidar and antenna arrays are used to obtain local relative distance and angle information. The proposed scheme obtains a scalable distributed control policy which jointly optimizes formation tracking error and average collision rate with local observations. Simulation results demonstrate that our method outperforms two other state-of-the-art algorithms on maintaining formation and collision avoidance.

Submitted 5 September, 2021; originally announced September 2021.

Comments: 5 pages, 5 figures, been accepted and to be published in IEEE International Conference on Autonomous Systems 2021

arXiv:2104.10385 [pdf, other]

Wide-Beam Array Antenna Power Gain Maximization via ADMM Framework

Authors: Shiwen Lei, **g Tian, Zhipeng Lin, Haoquan Hu, Bo Chen, Wei Yang, Pu Tang, Xiangdong Qiu

Abstract: This paper proposes two algorithms to maximize the minimum array power gain in a wide-beam mainlobe by solving the power gain pattern synthesis (PGPS) problem with and without sidelobe constraints. Firstly, the nonconvex PGPS problem is transformed into a nonconvex linear inequality optimization problem and then converted to an augmented Lagrangian problem by introducing auxiliary variables via the Alternating Direction Method of Multipliers (ADMM) framework. Next, the original intractable problem is converted into a series of nonconvex and convex subproblems. The nonconvex subproblems are solved by dividing their solution space into a finite set of smaller ones, in which the solution would be obtained pseudoanalytically. In such a way, the proposed algorithms are superior to the existing PGPS-based ones as their convergence can be theoretically guaranteed with a lower computational burden. Numerical examples with both isotropic element pattern (IEP) and active element pattern (AEP) arrays are simulated to show the effectiveness and superiority of the proposed algorithms by comparing with the related existing algorithms.

Submitted 21 April, 2021; originally announced April 2021.

arXiv:2008.03201 [pdf]

Convolutional neural network based deep-learning architecture for intraprostatic tumour contouring on PSMA PET images in patients with primary prostate cancer

Authors: Dejan Kostyszyn, Tobias Fechter, Nico Bartl, Anca L. Grosu, Christian Gratzke, August Sigle, Michael Mix, Juri Ruf, Thomas F. Fassbender, Selina Kiefer, Alisa S. Bettermann, Nils H. Nicolay, Simon Spohn, Maria U. Kramer, Peter Bronsert, Hongqian Guo, Xuefeng Qiu, Feng Wang, Christoph Henkenberens, Rudolf A. Werner, Dimos Baltas, Philipp T. Meyer, Thorsten Derlin, Mengxia Chen, Constantinos Zamboglou

Abstract: Accurate delineation of the intraprostatic gross tumour volume (GTV) is a prerequisite for treatment approaches in patients with primary prostate cancer (PCa). Prostate-specific membrane antigen positron emission tomography (PSMA-PET) may outperform MRI in GTV detection. However, visual GTV delineation underlies interobserver heterogeneity and is time consuming. The aim of this study was to develop a convolutional neural network (CNN) for automated segmentation of intraprostatic tumour (GTV-CNN) in PSMA-PET. Methods: The CNN (3D U-Net) was trained on [68Ga]PSMA-PET images of 152 patients from two different institutions and the training labels were generated manually using a validated technique. The CNN was tested on two independent internal (cohort 1: [68Ga]PSMA-PET, n=18 and cohort 2: [18F]PSMA-PET, n=19) and one external (cohort 3: [68Ga]PSMA-PET, n=20) test-datasets. Accordance between manual contours and GTV-CNN was assessed with Dice-Sørensen coefficient (DSC). Sensitivity and specificity were calculated for the two internal test-datasets by using whole-mount histology. Results: Median DSCs for cohorts 1-3 were 0.84 (range: 0.32-0.95), 0.81 (range: 0.28-0.93) and 0.83 (range: 0.32-0.93), respectively. Sensitivities and specificities for GTV-CNN were comparable with manual expert contours: 0.98 and 0.76 (cohort 1) and 1 and 0.57 (cohort 2), respectively. Computation time was around 6 seconds for a standard dataset. Conclusion: The application of a CNN for automated contouring of intraprostatic GTV in [68Ga]PSMA- and [18F]PSMA-PET images resulted in a high concordance with expert contours and in high sensitivities and specificities in comparison with histology reference. This robust, accurate and fast technique may be implemented for treatment concepts in primary PCa. The trained model and the study's source code are available in an open source repository.

Submitted 7 August, 2020; originally announced August 2020.

arXiv:2007.00894 [pdf, other]

Decentralized Blockchain for Privacy-Preserving Large-Scale Contact Tracing

Authors: Wenzhe Lv, Sheng Wu, Chunxiao Jiang, Yuanhao Cui, Xuesong Qiu, Yan Zhang

Abstract: Activity-tracking applications and location-based services using short-range communication (SRC) techniques have been abruptly demanded in the COVID-19 pandemic, especially for automated contact tracing. The attention from both public and policy keeps rising on related practical problems, including 1) how to protect data security and location privacy? 2) how to efficiently and dynamically deploy SRC Internet of Things (IoT) witnesses to monitor large areas? To answer these questions, in this paper, we propose a decentralized and permissionless blockchain protocol, named Bychain. Specifically, 1) a privacy-preserving SRC protocol for activity-tracking and corresponding generalized block structure is developed, by connecting an interactive zero-knowledge proof protocol and the key escrow mechanism. As a result, connections between personal identity and the ownership of on-chain location information are decoupled. Meanwhile, the owner of the on-chain location data can still claim its ownership without revealing the private key to anyone else. 2) An artificial potential field-based incentive allocation mechanism is proposed to incentivize IoT witnesses to pursue the maximum monitoring coverage deployment. We implemented and evaluated the proposed blockchain protocol in the real world using Bluetooth 5.0. The storage, CPU utilization, power consumption, time delay, and security of each procedure and performance of activities are analyzed. The experiment and security analysis is shown to provide a real-world performance evaluation.

Submitted 2 July, 2020; originally announced July 2020.

Comments: 16 pages, 22 figures

arXiv:2005.08566 [pdf, other]

doi 10.13140/RG.2.2.17061.52969

Quaternion Neural Networks for Multi-channel Distant Speech Recognition

Authors: Xinchi Qiu, Titouan Parcollet, Mirco Ravanelli, Nicholas Lane, Mohamed Morchid

Abstract: Despite the significant progress in automatic speech recognition (ASR), distant ASR remains challenging due to noise and reverberation. A common approach to mitigate this issue consists of equipping the recording devices with multiple microphones that capture the acoustic scene from different perspectives. These multi-channel audio recordings contain specific internal relations between each signal. In this paper, we propose to capture these inter- and intra- structural dependencies with quaternion neural networks, which can jointly process multiple signals as whole quaternion entities. The quaternion algebra replaces the standard dot product with the Hamilton one, thus offering a simple and elegant way to model dependencies between elements. The quaternion layers are then coupled with a recurrent neural network, which can learn long-term dependencies in the time domain. We show that a quaternion long-short term memory neural network (QLSTM), trained on the concatenated multi-channel speech signals, outperforms equivalent real-valued LSTM on two different tasks of multi-channel distant speech recognition.
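To make the "Hamilton product" mentioned in this abstract concrete, here is a minimal sketch of the product of two quaternions written as 4-tuples (real part first, then the i, j, k components). It illustrates the algebraic operation only and is not the paper's network layer; the tuple layout and the function name are assumptions of this sketch.

```python
def hamilton_product(p, q):
    """Multiply quaternions p = a1 + b1*i + c1*j + d1*k and q likewise."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )
```

Every output component mixes all four input components, which is how a quaternion-valued weight couples several channels at once. The product is not commutative: `hamilton_product((0, 1, 0, 0), (0, 0, 1, 0))` gives `(0, 0, 0, 1)`, while swapping the arguments gives `(0, 0, 0, -1)`.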

Submitted 19 May, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: 4 pages

arXiv:1912.11775 [pdf, ps, other]

Stabilization with Closed-loop DOA Enlargement: An Interval Analysis Approach

Authors: Xiang Qiu, Zijun Feng, Chaolun Lu, Yongqiang Li

Abstract: In this paper, the stabilization problem with closed-loop domain of attraction (DOA) enlargement for discrete-time general nonlinear plants is solved. First, a sufficient condition for asymptotic stabilization and estimation of the closed-loop DOA is given. It shows that, for a given Lyapunov function, the negative-definite and invariant set in the state-control space is a stabilizing controller set and its projection along the control space to the state space can be an estimate of the closed-loop DOA. Then, an algorithm is proposed to approximate the negative-definite and invariant set for the given Lyapunov function, in which an interval analysis algorithm is used to find an inner approximation of sets as precise as desired. Finally, a solvable optimization problem is formulated to enlarge the estimate of the closed-loop DOA by selecting an appropriate Lyapunov function from a positive-definite function set. The proposed method tries to find an unstructured controller set (namely, the negative-definite and invariant set) in the state-control space rather than design parameters of a structured controller as in traditional synthesis methods.

Submitted 14 March, 2021; v1 submitted 25 December, 2019; originally announced December 2019.

Comments: 16 pages, 2 figures

arXiv:1911.02399 [pdf]

Dynamic Energy Beacon: An Adaptive and Cost-effective Energy Harvesting and Power Management System for A Better Life

Authors: Nan Xu, Xiao Qiu, Bo Xu, Junyuan Shu, Ka Ho Wan

Abstract: In this proposal, a cost-effective energy harvesting and management system is proposed. The regular power stays around 200 W, while the peak power can reach 300 W. The cost of this system satisfies the requirements and budget of residents who live in rural areas and off-grid. It could be a potential solution to the global energy crisis, particularly for the billions of people living in severe energy poverty. It is also an important renewable alternative to conventional fossil-fuel electricity generation: not only is its manufacturing cost low and its efficiency high, but it is also safe and eco-friendly.

Submitted 5 November, 2019; originally announced November 2019.

Comments: Entered the Pacific-Asia regional session of IEEE "Empower A Billion Lives" Contest in 2018 PEAC

arXiv:1909.03377 [pdf]

doi 10.1038/s41598-020-77614-w

Ultra-broadband local active noise control with remote acoustic sensing

Authors: Tong Xiao, Xiaojun Qiu, Benjamin Halkon

Abstract: One enduring challenge for controlling high frequency sound in local active noise control (ANC) systems is to obtain the acoustic signal at the specific location to be controlled. In some applications such as in ANC headrest systems, it is not practical to install error microphones in a person's ears to provide the user a quiet or optimally acoustically controlled environment. Many virtual error sensing approaches have been proposed to estimate the acoustic signal remotely, with the current state-of-the-art method using an array of four microphones and a head tracking system to yield sound reduction up to 1 kHz for a single sound source. In the work reported in this paper, a novel approach of incorporating remote acoustic sensing using a laser Doppler vibrometer into an ANC headrest system is investigated. In this 'virtual ANC headphone' system, a lightweight retro-reflective membrane pick-up is mounted in each synthetic ear of a head and torso simulator to determine the sound in the ear in real-time with minimal invasiveness. The membrane design and the effects of its location on the system performance are explored, the noise spectra in the ears without and with ANC for a variety of relevant primary sound fields are reported, and the performance of the system during head movements is demonstrated. The test results show that at least 10 dB sound attenuation can be realised in the ears over an extended frequency range from 500 Hz to 6 kHz under a complex sound field and for several common types of synthesised environmental noise, even in the presence of head motion.

Submitted 27 November, 2020; v1 submitted 7 September, 2019; originally announced September 2019.

Report number: 20784

Journal ref: Sci. Rep. 10 (2020)

arXiv:1907.01169 [pdf, other]

Can a Robot Hear the Shape and Dimensions of a Room?

Authors: Linh Nguyen, Jaime Valls Miro, Xiaojun Qiu

Abstract: Knowing the geometry of a space is desirable for many applications, e.g. sound source localization, sound field reproduction or auralization. In circumstances where only acoustic signals can be obtained, estimating the geometry of a room is a challenging proposition. Existing methods have been proposed to reconstruct a room from the room impulse responses (RIRs). However, the sound source and microphones must be deployed in a feasible region of the room for these methods to work, which is impractical when the room is unknown. This work proposes to employ a robot equipped with a sound source and four acoustic sensors, which follows a proposed path planning strategy to move around the room and collect first image sources for room geometry estimation. The strategy can effectively drive the robot from a random initial location through the room so that the room geometry is guaranteed to be revealed. The effectiveness of the proposed approach is extensively validated in a synthetic environment, where the results obtained are highly promising.

Submitted 2 July, 2019; originally announced July 2019.

Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2019)

arXiv:1905.04144 [pdf]

doi 10.1063/1.5115448

Optical synthetic sampling imaging: concept and an example of microscopy

Authors: Junzheng Peng, Manhong Yao, Zixin Cai, Xue Qiu, Zibang Zhang, Shi** Li, **gang Zhong

Abstract: Digital two-dimensional (2D) spatial sampling devices (such as charge-coupled devices) have been widely used in various imaging systems, especially in computational imaging systems. However, the undersampling of digital sampling devices is a problem that limits the resolution of the acquired images. In this study, we present a synthetic sampling imaging (SSI) concept to solve the undersampling problem. It combines a structured illumination system and a conventional 2D image detection system to simultaneously sample the specimen from the illumination and the detection sides. Then, we synthesize the illumination sampling rate and the detection sampling rate to reconstruct a high sampling rate image. The concept of the proposed SSI is demonstrated by an example of microscopy. Experimental results confirm that the proposed method can double the sampling resolution of the microscope. The synthetic sampling scheme, where the sampling task is shared by the illumination and detection sides, provides insight for resolving the undersampling problem of digital imaging systems.

Submitted 13 May, 2019; v1 submitted 9 May, 2019; originally announced May 2019.

Showing 1–33 of 33 results for author: Qiu, X