Search | arXiv e-print repository

arXiv:2406.19749 [pdf, other]

SPIRONet: Spatial-Frequency Learning and Topological Channel Interaction Network for Vessel Segmentation

Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou

Abstract: Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems. However, current approaches suffer from suboptimal segmentation performance due to significant challenges in intraoperative images (i.e., low signal-to-noise ratio, small or slender vessels, and strong interference). In this paper, a novel spatial-frequency learning and topological channel interaction network (SPIRONet) is proposed to address the above issues. Specifically, dual encoders are utilized to comprehensively capture local spatial and global frequency vessel features. Then, a cross-attention fusion module is introduced to effectively fuse spatial and frequency features, thereby enhancing feature discriminability. Furthermore, a topological channel interaction module is designed to filter out task-irrelevant responses based on graph neural networks. Extensive experimental results on several challenging datasets (CADSA, CAXF, DCA1, and XCAD) demonstrate the state-of-the-art performance of our method. Moreover, the inference speed of SPIRONet is 21 FPS with a 512x512 input size, surpassing clinical real-time requirements (6~12 FPS). These promising outcomes indicate SPIRONet's potential for integration into vascular interventional navigation systems. Code is available at https://github.com/Dxhuang-CASIA/SPIRONet.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.16317 [pdf]

SNR-Progressive Model with Harmonic Compensation for Low-SNR Speech Enhancement

Authors: Zhongshu Hou, Qinwen Hu, Zhanzhong Cao, Ming Tang, **g Lu

Abstract: Despite significant progress made in the last decade, deep neural network (DNN) based speech enhancement (SE) still faces the challenge of notable degradation in the quality of recovered speech under low signal-to-noise ratio (SNR) conditions. In this letter, we propose an SNR-progressive speech enhancement model with harmonic compensation for low-SNR SE. Reliable pitch estimation is obtained from the intermediate output, which has the benefit of retaining more speech components than the coarse estimate while possessing a significantly higher SNR than the input noisy speech. An effective harmonic compensation mechanism is introduced for better harmonic recovery. Extensive experiments demonstrate the advantage of our proposed model. A multi-modal speech extraction system based on the proposed backbone model ranks first in the ICASSP 2024 MISP Challenge: https://mispchallenge.github.io/mispchallenge2023/index.html.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2405.08317 [pdf, other]

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

Authors: Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff

Abstract: Integrated Speech and Large Language Models (SLMs) that can follow speech instructions and generate relevant text responses have gained popularity lately. However, the safety and robustness of these models remain largely unclear. In this work, we investigate the potential vulnerabilities of such instruction-following speech-language models to adversarial attacks and jailbreaking. Specifically, we design algorithms that can generate adversarial examples to jailbreak SLMs in both white-box and black-box attack settings without human involvement. Additionally, we propose countermeasures to thwart such jailbreaking attacks. Our models, trained on dialog data with speech instructions, achieve state-of-the-art performance on a spoken question-answering task, scoring over 80% on both safety and helpfulness metrics. Despite safety guardrails, experiments on jailbreaking demonstrate the vulnerability of SLMs to adversarial perturbations and transfer attacks, with average attack success rates of 90% and 10% respectively when evaluated on a dataset of carefully designed harmful questions spanning 12 different toxic categories. However, we demonstrate that our proposed countermeasures reduce the attack success significantly.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 9+6 pages, Submitted to ACL 2024

arXiv:2404.16852 [pdf, other]

A Disease Labeler for Chinese Chest X-Ray Report Generation

Authors: Mengwei Wang, Ruixin Yan, Zeyi Hou, Ning Lang, Xiuzhuang Zhou

Abstract: In the field of medical image analysis, the scarcity of Chinese chest X-ray report datasets has hindered the development of technology for generating Chinese chest X-ray reports. On one hand, the construction of a Chinese chest X-ray report dataset is limited by the time-consuming and costly process of accurate expert disease annotation. On the other hand, a single natural language generation metric is commonly used to evaluate the similarity between generated and ground-truth reports, while the clinical accuracy and effectiveness of the generated reports rely on an accurate disease labeler (classifier). To address these issues, this study proposes a disease labeler tailored for the generation of Chinese chest X-ray reports. This labeler leverages a dual BERT architecture to handle diagnostic reports and clinical information separately and constructs a hierarchical label learning algorithm based on the affiliation between diseases and body parts to enhance text classification performance. Utilizing this disease labeler, a Chinese chest X-ray report dataset comprising 51,262 report samples was established. Finally, experiments and analyses were conducted on a subset of expert-annotated Chinese chest X-ray reports, validating the effectiveness of the proposed disease labeler.

Submitted 18 March, 2024; originally announced April 2024.

arXiv:2404.15366 [pdf, other]

A Weight-aware-based Multi-source Unsupervised Domain Adaptation Method for Human Motion Intention Recognition

Authors: Xiao-Yin Liu, Guotao Li, Xiao-Hu Zhou, Xu Liang, Zeng-Guang Hou

Abstract: Accurate recognition of human motion intention (HMI) is beneficial for exoskeleton robots to improve the wearing comfort level and achieve natural human-robot interaction. A classifier trained on labeled source subjects (domains) performs poorly on an unlabeled target subject because of differences in individual motor characteristics. The unsupervised domain adaptation (UDA) method has become an effective way to address this problem. However, the labeled data are collected from multiple source subjects that might be different not only from the target subject but also from each other. The current UDA methods for HMI recognition ignore the differences between source subjects, which reduces the classification accuracy. Therefore, this paper considers the differences between source subjects and develops a novel theory and algorithm for UDA to recognize HMI, where the margin disparity discrepancy (MDD) is extended to multi-source UDA theory and a novel weight-aware-based multi-source UDA algorithm (WMDD) is proposed. The source domain weight, which can be adjusted adaptively by the MDD between each source subject and the target subject, is incorporated into UDA to measure the differences between source subjects. The developed multi-source UDA theory is theoretical, and the generalization error on the target subject is guaranteed. The theory can be transformed into an optimization problem for UDA, successfully bridging the gap between theory and algorithm. Moreover, a lightweight network is employed to guarantee real-time classification, and adversarial learning between the feature generator and ensemble classifiers is utilized to further improve the generalization ability. Extensive experiments verify the theoretical analysis and show that WMDD outperforms previous UDA methods on HMI recognition tasks.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 13 pages, 5 figures

arXiv:2403.19126 [pdf, other]

Harnessing Data for Accelerating Model Predictive Control by Constraint Removal

Authors: Zhinan Hou, Feiran Zhao, Keyou You

Abstract: Model predictive control (MPC) solves a receding-horizon optimization problem in real time, which can be computationally demanding when there are thousands of constraints. To accelerate online computation of MPC, we utilize data to adaptively remove the constraints while maintaining the MPC policy unchanged. Specifically, we design the removal rule based on the Lipschitz continuity of the MPC policy. This removal rule can use the information of historical data according to the Lipschitz constant and the distance between the current state and historical states. In particular, we provide an explicit expression for calculating the Lipschitz constant from the model parameters. Finally, simulations are performed to validate the effectiveness of the proposed method.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2402.15635 [pdf, other]

Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise

Authors: Xi Chen, Zhewen Hou, Christopher A. Metzler, Arian Maleki, Shirin Jalali

Abstract: We investigate both the theoretical and algorithmic aspects of likelihood-based methods for recovering a complex-valued signal from multiple sets of measurements, referred to as looks, affected by speckle (multiplicative) noise. Our theoretical contributions include establishing the first theoretical upper bound on the Mean Squared Error (MSE) of the maximum likelihood estimator under the deep image prior hypothesis. Our theoretical results capture the dependence of MSE upon the number of parameters in the deep image prior, the number of looks, the signal dimension, and the number of measurements per look. On the algorithmic side, we introduce the concept of bagged Deep Image Priors (Bagged-DIP) and integrate them with projected gradient descent (PGD). Furthermore, we show how employing the Newton-Schulz algorithm for calculating matrix inverses within the iterations of PGD reduces the computational complexity of the algorithm. We show that this method achieves state-of-the-art performance.

Submitted 23 February, 2024; originally announced February 2024.
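
The Newton-Schulz step mentioned in the abstract above can be illustrated with the textbook iteration X <- X(2I - AX). The sketch below is a generic pure-Python version for small dense real matrices, not the paper's implementation; the function names, the starting guess, and the iteration count are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def newton_schulz_inverse(a, iterations=30):
    """Approximate the inverse of a square, non-singular real matrix `a`."""
    n = len(a)
    # Classical starting guess X0 = A^T / (||A||_1 * ||A||_inf); with this
    # choice the residual I - A X0 has spectral radius below 1.
    norm_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in a)
    scale = norm_1 * norm_inf
    x = [[a[j][i] / scale for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        ax = matmul(a, x)
        # r = 2I - A X
        r = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
             for i in range(n)]
        # X <- X (2I - A X); the residual I - A X is squared at every step.
        x = matmul(x, r)
    return x


A = [[4.0, 1.0], [2.0, 3.0]]
A_inv = newton_schulz_inverse(A)
```

Each step uses only matrix products, which is why the iteration is attractive when an explicit factorization is too expensive; a practical version would stop on a residual tolerance rather than a fixed iteration count.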

arXiv:2401.11856 [pdf, other]

MOSformer: Momentum encoder-based inter-slice fusion transformer for medical image segmentation

Authors: De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Xiu-Ling Liu, Zeng-Guang Hou

Abstract: Medical image segmentation takes an important position in various clinical applications. Deep learning has emerged as the predominant solution for automated segmentation of volumetric medical images. 2.5D-based segmentation models bridge computational efficiency of 2D-based models and spatial perception capabilities of 3D-based models. However, prevailing 2.5D-based models often treat each slice equally, failing to effectively learn and exploit inter-slice information, resulting in suboptimal segmentation performances. In this paper, a novel Momentum encoder-based inter-slice fusion transformer (MOSformer) is proposed to overcome this issue by leveraging inter-slice information at multi-scale feature maps extracted by different encoders. Specifically, dual encoders are employed to enhance feature distinguishability among different slices. One of the encoders is moving-averaged to maintain the consistency of slice representations. Moreover, an IF-Swin transformer module is developed to fuse inter-slice multi-scale features. The MOSformer is evaluated on three benchmark datasets (Synapse, ACDC, and AMOS), establishing a new state-of-the-art with 85.63%, 92.19%, and 85.43% of DSC, respectively. These promising results indicate its competitiveness in medical image segmentation. Codes and models of MOSformer will be made publicly available upon acceptance.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Under Review
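
The abstract above states that one of the two encoders is moving-averaged. A common way to realize this is an exponential moving average of the other encoder's weights; the sketch below assumes that form. The function name, the flat weight lists, and the momentum value are hypothetical and are not taken from the paper.

```python
def momentum_update(momentum_weights, online_weights, m=0.99):
    """Return updated momentum-encoder weights.

    Assumed rule: new = m * old_momentum + (1 - m) * online.
    Weights are plain lists of floats purely for illustration.
    """
    return [m * k + (1.0 - m) * q
            for k, q in zip(momentum_weights, online_weights)]


# One update step: the momentum encoder moves 10% of the way
# toward the online encoder when m = 0.9.
updated = momentum_update([0.0, 0.0], [1.0, 2.0], m=0.9)
```

A momentum close to 1 makes the averaged encoder change slowly, which is the usual motivation for keeping its representations consistent across training steps.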

arXiv:2311.07758 [pdf, other]

Synchrophasor Data Anomaly Detection on Grid Edge by 5G Communication and Adjacent Compute

Authors: Chuan Qin, Dexin Wang, Kishan Prudhvi Guddanti, Xiaoyuan Fan, Zhangshuan Hou

Abstract: The fifth-generation mobile communication (5G) technology offers opportunities to enhance the real-time monitoring of grids. The 5G-enabled phasor measurement units (PMUs) feature flexible positioning and cost-effective long-term maintenance without the constraints of fixing wires. This paper is the first to demonstrate the applicability of 5G in PMU communication, and the experiment was carried out at the Verizon non-standalone test-bed at the Pacific Northwest National Laboratory (PNNL) Advanced Wireless Communication lab. The performance of the 5G-enabled PMU communication setup is reviewed and discussed in this paper, and a generalized dynamic linear model (GDLM) based real-time synchrophasor data anomaly detection use-case is presented. Last but not least, the practicability of implementing 5G for wide-area protection strategies is explored and discussed by analyzing the experimental results.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 5 pages, 4 figures

arXiv:2310.08418 [pdf, ps, other]

doi 10.1109/TSG.2024.3420743

Privacy-Preserved Aggregate Thermal Dynamic Model of Buildings

Authors: Zeyin Hou, Shuai Lu, Yijun Xu, Haifeng Qiu, Wei Gu, Zhaoyang Dong, Shixing Ding

Abstract: The thermal inertia of buildings brings considerable flexibility to the heating and cooling load, which is known to be a promising demand response resource. The aggregate model that can describe the thermal dynamics of the building cluster is an important interface for energy systems to exploit its intrinsic thermal inertia. However, the private information of users, such as the indoor temperature and heating/cooling power, needs to be collected in the parameter estimation procedure to obtain the aggregate model, causing severe privacy concerns. In light of this, we propose a novel privacy-preserved parameter estimation approach to infer the aggregate model for the thermal dynamics of the building cluster for the first time. Using it, the parameters of the aggregate thermal dynamic model (ATDM) can be obtained by the load aggregator without accessing individuals' private information. More specifically, this method not only exploits the block coordinate descent (BCD) method to resolve the non-convexity in the estimation but also investigates transformation-based encryption (TE) together with its secure aggregation protocol (SAP) techniques to realize privacy-preserved computation. Its capability of preserving privacy is also theoretically proven. Finally, simulation results using real-world data demonstrate the accuracy and privacy-preserved performance of our proposed method.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2302.05693 [pdf]

Local spectral attention for full-band speech enhancement

Authors: Zhongshu Hou, Qinwen Hu, Kai Chen, **g Lu

Abstract: The attention mechanism has been widely utilized in speech enhancement (SE) because theoretically it can effectively model the inherent connection of the signal in both the time domain and the spectrum domain. Usually, the span of attention is limited in the time domain while the attention in the frequency domain spans the whole frequency range. In this paper, we notice that attention over the whole frequency range hampers the inference for full-band SE and possibly leads to excessive residual noise. To alleviate this problem, we introduce local spectral attention (LSA) into the full-band SE model by limiting the span of attention. The ablation test on the state-of-the-art (SOTA) full-band SE model reveals that the local frequency attention can effectively improve overall performance. The improved model achieves the best objective score on the full-band VoiceBank+DEMAND set.

Submitted 11 February, 2023; originally announced February 2023.

arXiv:2302.05690 [pdf]

Attention does not guarantee best performance in speech enhancement

Authors: Zhongshu Hou, Qinwen Hu, Kai Chen, **g Lu

Abstract: The attention mechanism has been widely utilized in speech enhancement (SE) because theoretically it can effectively model the long-term inherent connection of the signal in both the time domain and the spectrum domain. However, the generally used global attention mechanism might not be the best choice since adjacent information naturally imposes more influence than far-apart information in speech enhancement. In this paper, we validate this conjecture by replacing attention with RNN in two typical state-of-the-art (SOTA) models, multi-scale temporal frequency convolutional network (MTFAA) with axial attention and conformer-based metric-GAN network (CMGAN).

Submitted 11 February, 2023; originally announced February 2023.

arXiv:2211.09959 [pdf]

Potential Auto-driving Threat: Universal Rain-removal Attack

Authors: **chegn Hu, Jihao Li, Zhuoran Hou, **g**g Jiang, Cunjia Liu, Yuanjian Zhang

Abstract: The problem of robustness in adverse weather conditions is considered a significant challenge for computer vision algorithms in applications of autonomous driving. Image rain removal algorithms are a general solution to this problem. They find a deep connection between raindrops/rain-streaks and images by mining the hidden features and restoring information about the rain-free environment based on the powerful representation capabilities of neural networks. However, previous research has focused on architecture innovations and has yet to consider the vulnerability issues that already exist in neural networks. This research gap hints at a potential security threat geared toward the intelligent perception of autonomous driving in the rain. In this paper, we propose a universal rain-removal attack (URA) on the vulnerability of image rain-removal algorithms by generating a non-additive spatial perturbation that significantly reduces the similarity and image quality of scene restoration. Notably, this perturbation is difficult for humans to recognise and is also the same for different target images. Thus, URA could be considered a critical tool for the vulnerability detection of image rain-removal algorithms. It could also be developed as a real-world artificial intelligence attack method. Experimental results show that URA can reduce the scene repair capability by 39.5% and the image generation quality by 26.4%, targeting the state-of-the-art (SOTA) single-image rain-removal algorithms currently available.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2211.04001 [pdf]

Progress and summary of reinforcement learning on energy management of MPS-EV

Authors: **cheng Hu, Yang Lin, Liang Chu, Zhuoran Hou, Jihan Li, **g**g Jiang, Yuanjian Zhang

Abstract: The high emission and low energy efficiency caused by internal combustion engines (ICE) have become unacceptable under environmental regulations and the energy crisis. As a promising alternative solution, multi-power source electric vehicles (MPS-EVs) introduce different clean energy systems to improve powertrain efficiency. The energy management strategy (EMS) is a critical technology for MPS-EVs to maximize efficiency, fuel economy, and range. Reinforcement learning (RL) has become an effective methodology for the development of EMS. RL has received continuous attention and research, but there is still a lack of systematic analysis of the design elements of RL-based EMS. To this end, this paper presents an in-depth analysis of the current research on RL-based EMS (RL-EMS) and summarizes the design elements of RL-based EMS. This paper first summarizes the previous applications of RL in EMS from five aspects: algorithm, perception scheme, decision scheme, reward function, and innovative training method. The contribution of advanced algorithms to the training effect is shown, the perception and control schemes in the literature are analyzed in detail, different reward function settings are classified, and innovative training methods with their roles are elaborated. Then, by comparing the development routes of RL and RL-EMS, this paper identifies the gap between advanced RL solutions and existing RL-EMS. Finally, this paper suggests potential development directions for implementing advanced artificial intelligence (AI) solutions in EMS.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2209.04995 [pdf, other]

A novel learning-based robust model predictive control energy management strategy for fuel cell electric vehicles

Authors: Shibo Li, Zhuoran Hou, Liang Chu, **g**g Jiang, Yuanjian Zhang

Abstract: The multi-source electromechanical coupling makes the energy management of fuel cell electric vehicles (FCEVs) relatively nonlinear and complex, especially for 4-wheel-drive (4WD) FCEVs. Accurate state observation for complicated nonlinear systems is the basis for effective energy management in FCEVs. Aiming at releasing the energy-saving potential of FCEVs, a novel learning-based robust model predictive control (LRMPC) strategy is proposed for a 4WD FCEV, contributing to suitable power distribution among multiple energy sources. The well-designed strategy based on machine learning (ML) translates the knowledge of the nonlinear system into an explicit control scheme with superior robust performance. To start with, ML methods with high regression accuracy and superior generalization ability are trained offline to establish a precise state observer for SOC. Then, explicit data tables for SOC generated by the state observer are used to capture accurate state changes, with input features that include the vehicle status and the states of vehicle components. Specifically, the vehicle velocity estimation that provides the future speed reference is constructed by deep forest. Next, the components, including the explicit data tables and the vehicle velocity estimation, are combined with model predictive control (MPC) to release the state-of-the-art energy-saving ability for the multi-freedom system in FCEVs; the resulting strategy is named LRMPC. At last, a detailed assessment is performed in a simulation test to validate the advanced performance of LRMPC. The corresponding results highlight the optimal control effect in energy-saving potential and the strong real-time application ability of LRMPC.

Submitted 11 September, 2022; originally announced September 2022.

Comments: 16 pages, 16 figures

arXiv:2206.14524 [pdf]

A light-weight full-band speech enhancement model

Authors: Qinwen Hu, Zhongshu Hou, Xiaohuai Le, **g Lu

Abstract: Deep neural network based full-band speech enhancement systems face challenges of high demand of computational resources and imbalanced frequency distribution. In this paper, a light-weight full-band model is proposed with two dedicated strategies, i.e., a learnable spectral compression mapping for more effective high-band spectral information compression, and the utilization of the multi-head attention mechanism for more effective modeling of the global spectral pattern. Experiments validate the efficacy of the proposed strategies and show that the proposed model achieves competitive performance with only 0.89M parameters.

Submitted 3 July, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

arXiv:2206.13136 [pdf]

A two-stage full-band speech enhancement model with effective spectral compression mapping

Authors: Zhongshu Hou, Qinwen Hu, Kai Chen, **g Lu

Abstract: The direct expansion of deep neural network (DNN) based wide-band speech enhancement (SE) to full-band processing faces the challenge of low frequency resolution in the low frequency range, which would highly likely lead to deteriorated performance of the model. In this paper, we propose a learnable spectral compression mapping (SCM) to effectively compress the high frequency components so that they can be processed in a more efficient manner. By doing so, the model can pay more attention to the low and middle frequency range, where most of the speech power is concentrated. Instead of suppressing noise in a single network structure, we first estimate a spectral magnitude mask, converting the speech to a high signal-to-noise ratio (SNR) state, and then utilize a subsequent model to further optimize the real and imaginary mask of the pre-enhanced signal. We conduct comprehensive experiments to validate the efficacy of the proposed method.

Submitted 27 June, 2022; originally announced June 2022.

arXiv:2204.07999 [pdf, other]

Introduction of Integrated Image Deep Learning Solution and how it brought laboratorial level heart rate and blood oxygen results to everyone

Authors: Zhuang Hou, Xiaolei Cao

Abstract: The general public and medical professionals recognized the importance of accurately measuring and storing blood oxygen levels and heart rate during the COVID-19 pandemic. The demand for accurate contact-less devices was motivated by the need for cross-infection reduction and the shortage of conventional oximeters, partially due to the global supply chain issue. This paper evaluated a contact-less mini-program HealthyPai's heart rate (HR) and oxygen saturation (SpO2) measurements compared with other wearable devices. In the HR study of 185 samples (81 in the laboratory environment, 104 in the real-life environment), the mean absolute error (MAE) $\pm$ standard deviation was $1.4827 \pm 1.7452$ in the lab and $6.9231 \pm 5.6426$ in the real-life setting. In the SpO2 study of 24 samples, the mean absolute error (MAE) $\pm$ standard deviation of the measurement was $1.0375 \pm 0.7745$. Our results validated that HealthyPai utilizing the Integrated Image Deep Learning Solution (IIDLS) framework can accurately measure HR and SpO2, providing test quality at least comparable to other FDA-approved wearable devices in the market and surpassing the consumer-grade and research-grade wearable standards.

Submitted 13 April, 2022; originally announced April 2022.

arXiv:2202.11889 [pdf, other]

A spectral-spatial fusion anomaly detection method for hyperspectral imagery

Authors: Zengfu Hou, Siyuan Cheng, Ting Hu

Abstract: In hyperspectral imagery, high-quality spectral signals convey subtle spectral differences to distinguish similar materials, thereby providing a unique advantage for anomaly detection. Hence fine spectra of anomalous pixels can be effectively screened out from heterogeneous background pixels. Since the same materials have similar characteristics in the spatial and spectral dimensions, detection performance can be significantly enhanced by combining spatial and spectral information. In this paper, a spectral-spatial fusion anomaly detection (SSFAD) method is proposed for hyperspectral imagery. First, original spectral signals are mapped to a local linear background space composed of median and mean with high confidence, where saliency weight and feature enhancement strategies are implemented to obtain an initial detection map in the spectral domain. Furthermore, to make full use of the similarity information of the local background around the testing pixel, a new detector is designed to extract the local similarity spatial features of patch images in the spatial domain. Finally, anomalies are detected by adaptively combining the spectral and spatial detection maps. The experimental results demonstrate that our proposed method achieves better detection performance than traditional methods.

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2110.10965 [pdf, other]

2020 CATARACTS Semantic Segmentation Challenge

Authors: Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heonjin Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio, et al. (15 additional authors not shown)

Abstract: Surgical scene segmentation is essential for anatomy and instrument localization which can be further used to assess tissue-instrument interactions during a surgical procedure. In 2017, the Challenge on Automatic Tool Annotation for cataRACT Surgery (CATARACTS) released 50 cataract surgery videos accompanied by instrument usage annotations. These annotations included frame-level instrument presence information. In 2020, we released pixel-wise semantic annotations for anatomy and instruments for 4670 images sampled from 25 videos of the CATARACTS training set. The 2020 CATARACTS Semantic Segmentation Challenge, which was a sub-challenge of the 2020 MICCAI Endoscopic Vision (EndoVis) Challenge, presented three sub-tasks to assess participating solutions on anatomical structure and instrument segmentation. Their performance was assessed on a hidden test set of 531 images from 10 videos of the CATARACTS test set.

Submitted 24 February, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

arXiv:2105.08364 [pdf, other]

Deep Learning-based Physical-Layer Secret Key Generation for FDD Systems

Authors: Xinwei Zhang, Guyue Li, Junqing Zhang, Aiqun Hu, Zongyue Hou, Bin Xiao

Abstract: Physical-layer key generation (PKG), which establishes cryptographic keys from highly correlated measurements of wireless channels and relies on reciprocal channel characteristics between uplink and downlink, is a promising wireless security technique for the Internet of Things (IoT). However, it is challenging to extract common features in frequency division duplexing (FDD) systems, as uplink and downlink transmissions operate at different frequency bands whose channel frequency responses are no longer reciprocal. Existing PKG methods for FDD systems have many limitations, i.e., high overhead and security problems. This paper proposes a novel PKG scheme that uses the feature mapping function between different frequency bands, obtained by deep learning, to make two users generate highly similar channel features in FDD systems. In particular, this is the first time deep learning has been applied to PKG in FDD systems. We first prove the existence of the band feature mapping function for a given environment and show that a feedforward network with a single hidden layer can approximate the mapping function. Then a Key Generation neural Network (KGNet) is proposed for reciprocal channel feature construction, and a key generation scheme based on the KGNet is also proposed. Numerical results verify the excellent performance of the KGNet-based key generation scheme in terms of randomness, key generation ratio, and key error rate. In addition, the overhead analysis shows that the method proposed in this paper can be used for resource-constrained IoT devices in FDD systems.

Submitted 30 August, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

Comments: Accepted for publication in IEEE Internet of Things Journal

arXiv:2105.06270 [pdf, other]

Group Feature Learning and Domain Adversarial Neural Network for aMCI Diagnosis System Based on EEG

Authors: Chen-Chen Fan, Haiqun Xie, Liang Peng, Hongjun Yang, Zhen-Liang Ni, Guan'an Wang, Yan-Jie Zhou, Sheng Chen, Zhijie Fang, Shuyun Huang, Zeng-Guang Hou

Abstract: Medical diagnostic robot systems have received increasing attention due to their objectivity and accuracy. The diagnosis of mild cognitive impairment (MCI) is considered an effective means to prevent Alzheimer's disease (AD). Doctors diagnose MCI based on various clinical examinations, which are expensive, and the diagnosis results rely on the knowledge of doctors. Therefore, it is necessary to develop a robot diagnostic system to eliminate the influence of human factors and obtain a higher accuracy rate. In this paper, we propose a novel Group Feature Domain Adversarial Neural Network (GF-DANN) for amnestic MCI (aMCI) diagnosis, which involves two important modules. A Group Feature Extraction (GFE) module is proposed to reduce individual differences by learning group-level features through adversarial learning. A Dual Branch Domain Adaptation (DBDA) module is carefully designed to reduce the distribution difference between the source and target domains through domain adaptation. On three types of data set, GF-DANN achieves the best accuracy compared with classic machine learning and deep learning methods. On the DMS data set, GF-DANN obtains an accuracy rate of 89.47%, with sensitivity and specificity of 90% and 89%, respectively. In addition, by comparing three EEG data collection paradigms, our results demonstrate that the DMS paradigm has the potential to support an aMCI diagnostic robot system.

Submitted 28 April, 2021; originally announced May 2021.

Comments: This paper has been accepted by 2021 International Conference on Robotics and Automation (ICRA 2021)

arXiv:2012.00951 [pdf, ps, other]

Interval-driven discrete-time general nonlinear robust control: stabilization with closed-loop robust DOA enlargement

Authors: Chaolun Lu, Yongqiang Li, Zijun Feng, Zhongsheng Hou, Yu Feng, Yuanjing Feng

Abstract: This paper presents new results that allow one to address the discrete-time general nonlinear robust control problem. The uncertain system is described by a general nonlinear function set characterized by the nominal model and the corresponding modeling error bound. Traditional synthesis methods design parameters of a structured robust controller. The key aim of this paper is to find an unstructured robust controller set in the state-control space, which enlarges the estimate of the closed-loop robust domain of attraction (RDOA). Based on the interval analysis arithmetic, a numerical method to estimate the unstructured robust controller set is proposed and the rigorous convergence analysis is given. The existing RDOA results are constrained by the level-set of the Lyapunov function, whereas the results in this paper remove this limitation. Furthermore, a solvable optimization problem is formulated so the estimate of RDOA is enlarged by selecting a Lyapunov function from a Lyapunov function set of sum-of-squares polynomials. The method is then validated by a specific case simulation study and results show more extensive RDOA than the previous methods.

Submitted 1 December, 2020; originally announced December 2020.

Comments: 15 pages, 6 figures. arXiv admin note: substantial text overlap with arXiv:1912.11775

arXiv:2005.13749 [pdf]

IoT-based Remote Control Study of a Robotic Trans-esophageal Ultrasound Probe via LAN and 5G

Authors: Shuangyi Wang, Xilong Hou, Richard Housden, Zengguang Hou, Davinder Singh, Kawal Rhode

Abstract: A robotic trans-esophageal echocardiography (TEE) probe has been recently developed to address the problems with manual control in the X-ray environment when a conventional probe is used for interventional procedure guidance. However, the robot was exclusively to be used in local areas and the effectiveness of remote control has not been scientifically tested. In this study, we implemented an Internet-of-things (IoT)-based configuration to the TEE robot so the system can set up a local area network (LAN) or be configured to connect to an internet cloud over 5G. To investigate the remote control, backlash hysteresis effects were measured and analysed. A joystick-based device and a button-based gamepad were then employed and compared with the manual control in a target reaching experiment for the two steering axes. The results indicated different hysteresis curves for the left-right and up-down steering axes with the input wheel's deadbands found to be 15 deg and deg, respectively. Similar magnitudes of positioning errors at approximately 0.5 deg and maximum overshoots at around 2.5 deg were found when manually and robotically controlling the TEE probe. The amount of time to finish the task indicated a better performance using the button-based gamepad over joystick-based device, although both were worse than the manual control. It is concluded that the IoT-based remote control of the TEE probe is feasible and a trained user can accurately manipulate the probe. The main identified problem was the backlash hysteresis in the steering axes, which can result in continuous oscillations and overshoots.

Submitted 27 May, 2020; originally announced May 2020.

Comments: 9 pages, 5 figures, to be submitted to MICCAI ASMUS 2020 workshop

arXiv:2005.12679 [pdf]

doi 10.1109/TMRB.2020.3036461

Design of a Low-cost Miniature Robot to Assist the COVID-19 Nasopharyngeal Swab Sampling

Authors: Shuangyi Wang, Kehao Wang, Hongbin Liu, Zengguang Hou

Abstract: Nasopharyngeal (NP) swab sampling is an effective approach for the diagnosis of coronavirus disease 2019 (COVID-19). Medical staff carrying out the task of collecting NP specimens are in close contact with the suspected patient, thereby posing a high risk of cross-infection. We propose a low-cost miniature robot that can be easily assembled and remotely controlled. The system includes an active end-effector, a passive positioning arm, and a detachable swab gripper with integrated force sensing capability. The cost of the materials for building this robot is 55 USD and the total weight of the functional part is 0.23 kg. The design of the force sensing swab gripper was justified using Finite Element (FE) modeling and the performance of the robot was validated with a simulation phantom and three pig noses. FE analysis indicated a 0.5 mm magnitude displacement of the gripper's sensing beam, which meets the ideal detecting range of the optoelectronic sensor. Studies on both the phantom and the pig nose demonstrated the successful operation of the robot during the collection task. The average forces were found to be 0.35 N and 0.85 N, respectively. It is concluded that the proposed robot is promising and could be further developed to be used in vivo.

Submitted 26 May, 2020; originally announced May 2020.

Comments: 5 pages, 8 figures

Journal ref: IEEE Transactions on Medical Robotics and Bionics (Volume: 3, Issue: 1, Feb. 2021)

arXiv:2002.11045 [pdf, ps, other]

Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks

Authors: Changyang She, Rui Dong, Zhouyou Gu, Zhanwei Hou, Yonghui Li, Wibowo Hardjawana, Chenyang Yang, Lingyang Song, Branka Vucetic

Abstract: In the future 6th generation networks, ultra-reliable and low-latency communications (URLLC) will lay the foundation for emerging mission-critical applications that have stringent requirements on end-to-end delay and reliability. Existing works on URLLC are mainly based on theoretical models and assumptions. The model-based solutions provide useful insights, but cannot be directly implemented in practice. In this article, we first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC, and discuss some open problems of these methods. To address these open problems, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC. The basic idea is to merge theoretical models and real-world data in analyzing the latency and reliability and training deep neural networks (DNNs). Deep transfer learning is adopted in the architecture to fine-tune the pre-trained DNNs in non-stationary networks. Further considering that the computing capacity at each user and each mobile edge computing server is limited, federated learning is applied to improve the learning efficiency. Finally, we provide some experimental and simulation results and discuss some future directions.

Submitted 22 February, 2020; originally announced February 2020.

Comments: The manuscript contains 4 figures 2 tables. It has been submitted to IEEE Network (in the second round of revision)

arXiv:2001.07093 [pdf, other]

BARNet: Bilinear Attention Network with Adaptive Receptive Fields for Surgical Instrument Segmentation

Authors: Zhen-Liang Ni, Gui-Bin Bian, Guan-An Wang, Xiao-Hu Zhou, Zeng-Guang Hou, Xiao-Liang Xie, Zhen Li, Yu-Han Wang

Abstract: Surgical instrument segmentation is extremely important for computer-assisted surgery. Different from common object segmentation, it is more challenging due to the large illumination and scale variation caused by the special surgical scenes. In this paper, we propose a novel bilinear attention network with adaptive receptive field to solve these two challenges. For the illumination variation, the bilinear attention module can capture second-order statistics to encode global contexts and semantic dependencies between local pixels. With them, semantic features in challenging areas can be inferred from their neighbors and the distinction of various semantics can be boosted. For the scale variation, our adaptive receptive field module aggregates multi-scale features and automatically fuses them with different weights. Specifically, it encodes the semantic relationship between channels to emphasize feature maps with appropriate scales, changing the receptive field of subsequent convolutions. The proposed network achieves the best performance, 97.47% mean IOU, on Cata7 and takes first place on EndoVis 2017, surpassing the second-ranked method by 10.10% IOU.

Submitted 21 May, 2020; v1 submitted 20 January, 2020; originally announced January 2020.

arXiv:1912.11480 [pdf, ps, other]

Data-Driven Robust Stabilization with Robust DOA Enlargement for Nonlinear Systems

Authors: Chaolun Lu, Yongqiang Li, Zhongsheng Hou, Yuanjing Feng, Yu Feng, Ronghu Chi, Xuhui Bu

Abstract: Most nonlinear robust control methods consider only the affine nonlinear nominal model. When the nominal model is assumed to be affine nonlinear, available information about existing non-affine nonlinearities is ignored. For non-affine nonlinear systems, Li et al. (2019) propose a new nonlinear control method to solve the robust stabilization problem with estimation of the robust closed-loop DOA (domain of attraction). However, Li et al. (2019) assume that the Lyapunov function is given and do not consider the problem of finding a good Lyapunov function to enlarge the estimate of the robust closed-loop DOA. The motivation of this paper is to enlarge the estimate of the closed-loop DOA by selecting an appropriate Lyapunov function. To achieve this goal, a solvable optimization problem is formulated to select an appropriate Lyapunov function from a parameterized positive-definite function set. The effectiveness of the proposed method is verified by numerical results.

Submitted 24 December, 2019; originally announced December 2019.

Comments: 6 pages, 6 figures, preprint submitted to IFAC World Congress (2020). arXiv admin note: text overlap with arXiv:1909.12561

arXiv:1910.10934 [pdf]

Optimal Future Sub-Transmission Volt-Var Planning Tool to Enable High PV Penetration

Authors: Quan Nguyen, Xinda Ke, Nader Samaan, Jesse Holzer, Marcelo Elizondo, Huifen Zhou, Zhangshuan Hou, Renke Huang, Mallikarjuna Vallem, Bharat Vyakaranam, Malini Ghosal, Yuri V. Makarov

Abstract: This paper proposes a reactive power planning tool for sub-transmission systems to mitigate voltage violations and fluctuations caused by high photovoltaic (PV) penetration and intermittency with a minimum investment cost. The tool considers all existing volt-ampere reactive (var) assets in both sub-transmission and distribution systems to reduce the need for new equipment. The planning tool coordinates with an operational volt-var optimization tool to determine all scenarios with voltage violations and verify the planning results. The planning result of each scenario is the solution of a proposed optimal power-flow framework with efficient techniques to handle a high number of discrete variables. The final planning decision is obtained from the planning results of all selected violated scenarios by using two different approaches: direct combination of all single-step solutions, and a final investment decision based only on the scenarios that are representative of the power-flow voltage violations at most time steps. The final planning decision is verified using a realistic large-scale sub-transmission system and 5-minute PV and load data. The results show a significant voltage violation reduction with a lower investment cost for additional var equipment compared to conventional approaches.

Submitted 24 October, 2019; originally announced October 2019.

arXiv:1909.12561 [pdf, ps, other]

Data-Driven Robust Stabilization with Robust Domain of Attraction Estimate for Nonlinear Discrete-Time Systems

Authors: Yongqiang Li, Chaolun Lu, Zhongsheng Hou, Yuanjing Feng

Abstract: Nonlinear robust control is pursued by overcoming the drawback of linear robust control that it ignores available information about existing nonlinearities and the resulting controllers may be too conservative, especially when the nonlinearities are significant. However, most existing nonlinear robust control approaches just consider the affine nonlinear nominal model and thereby ignore available information about existing non-affine nonlinearities. When the general nonlinear nominal model is considered, the robust domain of attraction (RDOA) of closed-loops requires extensive investigation because it is hard to achieve the global stabilization. In this paper, we propose a new nonlinear robust control method based on Lyapunov function to stabilize a discrete-time uncertain system and to estimate the RDOA of closed-loops. First, a sufficient condition for robust stabilization of all plants in a plant set and estimation of the RDOA of all closed-loops is proposed. Then, to tackle the non-affine nonlinearities, a data-driven method of estimating the robust negative-definite domains (RNDD) is presented, and based on it the estimation of the RDOA of closed-loops and the resulting controller design are also given.

Submitted 25 December, 2019; v1 submitted 27 September, 2019; originally announced September 2019.

Comments: 6 pages, 3 figures, preprint submitted to Automatica (Accept provisionally)

arXiv:1909.05787 [pdf, ps, other]

Prediction and Communication Co-design for Ultra-Reliable and Low-Latency Communications

Authors: Zhanwei Hou, Changyang She, Yonghui Li, Zhuo Li, Branka Vucetic

Abstract: Ultra-reliable and low-latency communications (URLLC) are considered as one of three new application scenarios in the fifth generation cellular networks. In this work, we aim to reduce the user experienced delay through prediction and communication co-design, where each mobile device predicts its future states and sends them to a data center in advance. Since predictions are not error-free, we consider prediction errors and packet losses in communications when evaluating the reliability of the system. Then, we formulate an optimization problem that maximizes the number of URLLC services supported by the system by optimizing time and frequency resources and the prediction horizon. Simulation results verify the effectiveness of the proposed method, and show that the tradeoff between user experienced delay and reliability can be improved significantly via prediction and communication co-design. Furthermore, we carried out an experiment on the remote control in a virtual factory, and validated our concept on prediction and communication co-design with the practical mobility data generated by a real tactile device.

Submitted 5 September, 2019; originally announced September 2019.

Comments: This paper has been submitted to IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. Part of this work was presented in IEEE ICC 2019

arXiv:1903.03913 [pdf, ps, other]

Towards Ultra-Reliable Low-Latency Communications: Typical Scenarios, Possible Solutions, and Open Issues

Authors: Daquan Feng, Changyang She, Kai Ying, Lifeng Lai, Zhanwei Hou, Tony Q. S. Quek, Yonghui Li, Branka Vucetic

Abstract: Ultra-reliable low-latency communications (URLLC) has been considered as one of the three new application scenarios in the 5th Generation (5G) New Radio (NR), where the physical layer design aspects have been specified. With the 5G NR, we can guarantee the reliability and latency in radio access networks. However, for communication scenarios where the transmission involves both radio access and wide area core networks, the delay in radio access networks only contributes to part of the end-to-end (E2E) delay. In this paper, we outline the delay components and packet loss probabilities in typical communication scenarios of URLLC, and formulate the constraints on E2E delay and overall packet loss probability. Then, we summarize possible solutions in the physical layer, the link layer, the network layer, and the cross-layer design, respectively. Finally, we discuss the open issues in prediction and communication co-design for URLLC in wide area large scale networks.

Submitted 9 March, 2019; originally announced March 2019.

Comments: 8 pages, 7 figures. Accepted by IEEE Vehicular Technology Magazine

Journal ref: IEEE Vehicular Technology Magazine, 2019

arXiv:1707.06039 [pdf, ps, other]

doi 10.1016/j.automatica.2018.12.011

Quantum gate identification: error analysis, numerical results and optical experiment

Authors: Yuanlong Wang, Qi Yin, Daoyi Dong, Bo Qi, Ian R. Petersen, Zhibo Hou, Hidehiro Yonezawa, Guo-Yong Xiang

Abstract: The identification of an unknown quantum gate is a significant issue in quantum technology. In this paper, we propose a quantum gate identification method within the framework of quantum process tomography. In this method, a series of pure states are input to the gate, fast state tomography is performed on the output states, and the data are used to reconstruct the quantum gate. Our algorithm has computational complexity $O(d^3)$ with the system dimension $d$. The algorithm is compared with the maximum likelihood estimation method in terms of running time, which shows the efficiency advantage of our method. An error upper bound is established for the identification algorithm and the robustness of the algorithm against the purity of input states is also tested. We perform a quantum optical experiment on a single-qubit Hadamard gate to verify the effectiveness of the identification algorithm.

Submitted 19 July, 2017; originally announced July 2017.

Comments: 19 pages, 5 figures

Journal ref: Automatica, 2019, Vol. 101, pp. 269-279

arXiv:1508.06927 [pdf, ps, other]

On Convergence Rate of Leader-Following Consensus of Linear Multi-Agent Systems with Communication Noises

Authors: Long Cheng, Yunpeng Wang, Wei Ren, Zeng-Guang Hou, Min Tan

Abstract: This note further studies the previously proposed consensus protocol for linear multi-agent systems with communication noises in [15], [16]. Each agent is allowed to have its own time-varying gain to attenuate the effect of communication noises. Therefore, the common assumption in most references that all agents have the same noise-attenuation gain is not necessary. It has been proved that if all noise-attenuation gains are infinitesimal of the same order, then the mean square leader-following consensus can be reached. Furthermore, the convergence rate of the multi-agent system has been investigated. If the noise-attenuation gains belong to a class of functions which are bounded above and below by $t^{-β}$ $(β\in(0,1))$ asymptotically, then the states of all follower agents are convergent in mean square to the leader's state with the rate characterized by a function bounded above by $t^{-β}$ asymptotically.
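The idea of agent-specific gains that all decay like $t^{-β}$ can be illustrated with a toy simulation. This is a heavily simplified sketch, not the protocol from the paper: it assumes scalar, discrete-time, single-integrator followers arranged in a chain behind a static leader, additive Gaussian measurement noise, and hypothetical gain scales; none of these choices come from the abstract.

```python
import random


def simulate(num_followers=4, steps=20000, beta=0.7, noise_std=0.1, seed=0):
    """Toy leader-following consensus with noisy neighbour measurements.

    Follower i uses its own gain scales[i] * (k + 1) ** (-beta), so the gains
    differ between agents but decay at the same order (an assumed setup).
    Returns (initial, final) mean square tracking error.
    """
    rng = random.Random(seed)
    leader = 1.0  # static leader state (simplifying assumption)
    x = [5.0 * (i + 1) for i in range(num_followers)]
    scales = [0.2 + 0.05 * i for i in range(num_followers)]  # hypothetical

    def mse(states):
        return sum((s - leader) ** 2 for s in states) / len(states)

    initial = mse(x)
    for k in range(steps):
        decay = (k + 1) ** (-beta)
        prev = list(x)  # synchronous update from the previous step's states
        for i in range(num_followers):
            # Chain topology: follower 0 hears the leader, others hear i - 1.
            neighbour = leader if i == 0 else prev[i - 1]
            measured = neighbour + rng.gauss(0.0, noise_std)
            x[i] = prev[i] + scales[i] * decay * (measured - prev[i])
    return initial, mse(x)


initial_mse, final_mse = simulate()
print(initial_mse, final_mse)
```

The decaying gain is what trades convergence speed against noise rejection: a constant gain would keep injecting measurement noise into the states, while a gain that decays too quickly would stop correcting before the followers reach the leader.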

Submitted 27 August, 2015; originally announced August 2015.

arXiv:1411.4346 [pdf, other]

Containment Control of Multi-Agent Systems with Dynamic Leaders Based on a $PI^n$-Type Approach

Authors: Yunpeng Wang, Long Cheng, Wei Ren, Zeng-Guang Hou, Min Tan

Abstract: This paper studies the containment control problem of multi-agent systems with multiple dynamic leaders in both the discrete-time domain and the continuous-time domain. The leaders' motions are described by $(n-1)$-order polynomial trajectories. This setting makes practical sense because given some critical points, the leaders' trajectories are usually planned by the polynomial interpolations. In order to drive all followers into the convex hull spanned by the leaders, a $PI^n$-type ($P$ and $I$ are short for {\it Proportion} and {\it Integration}, respectively; $I^n$ implies that the algorithm includes high-order integral terms) containment algorithm is proposed. It is theoretically proved that the $PI^n$-type containment algorithm is able to solve the containment problem of multi-agent systems where the followers are described by any order integral dynamics. Compared with the previous results on the multi-agent systems with dynamic leaders, the distinguished features of this paper are that: (1) the containment problem is studied not only in the continuous-time domain but also in the discrete-time domain while most existing results only work in the continuous-time domain; (2) to deal with the leaders with the $(n-1)$-order polynomial trajectories, existing results require the follower's dynamics to be $n$-order integral while the followers considered in this paper can be described by any-order integral; and (3) the "sign" function is not employed in the proposed algorithm, which avoids the chattering phenomenon. Furthermore, in order to illustrate the practical value of the proposed approach, an application, the containment control of multiple mobile robots is studied. Finally, two simulation examples are given to demonstrate the effectiveness of the proposed algorithm.

Submitted 27 August, 2015; v1 submitted 16 November, 2014; originally announced November 2014.

arXiv:1304.3972 [pdf, ps, other]

Reaching a Consensus in Networks of High-Order Integral Agents under Switching Directed Topology

Authors: Long Cheng, Zeng-Guang Hou, Min Tan

Abstract: Consensus problem of high-order integral multi-agent systems under switching directed topology is considered in this study. Depending on whether the agent's full state is available or not, two distributed protocols are proposed to ensure that states of all agents can be convergent to a same stationary value. In the proposed protocols, the gain vector associated with the agent's (estimated) state and the gain vector associated with the relative (estimated) states between agents are designed in a sophisticated way. By this particular design, the high-order integral multi-agent system can be transformed into a first-order integral multi-agent system. And the convergence of the transformed first-order integral agent's state indicates the convergence of the original high-order integral agent's state if and only if all roots of the polynomial, whose coefficients are the entries of the gain vector associated with the relative (estimated) states between agents, are in the open left-half complex plane. Therefore, many analysis techniques in the first-order integral multi-agent system can be directly borrowed to solve the problems in the high-order integral multi-agent system. Due to this property, it is proved that to reach a consensus, the switching directed topology of multi-agent system is only required to be "uniformly jointly quasi-strongly connected", which seems the mildest connectivity condition in the literature. In addition, the consensus problem of discrete-time high-order integral multi-agent systems is studied. The corresponding consensus protocol and performance analysis are presented. Finally, three simulation examples are provided to show the effectiveness of the proposed approach.
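The root condition in this abstract (all roots of a polynomial in the open left-half complex plane) is the classical Hurwitz property, which can be checked without computing roots by using the Routh array. The sketch below is a generic Routh test, not code from the paper; how the gain vector maps onto the coefficient list is not specified in the abstract, so the function simply assumes a full coefficient list ordered from the highest degree down.

```python
from fractions import Fraction


def is_hurwitz(coeffs):
    """Routh test: True iff all roots have strictly negative real part.

    coeffs: real coefficients, highest degree first (assumed ordering).
    """
    c = [Fraction(x) for x in coeffs]
    if not c or c[0] == 0:
        raise ValueError("leading coefficient must be non-zero")
    if c[0] < 0:
        c = [-x for x in c]  # normalise sign; the roots are unchanged
    degree = len(c) - 1
    if degree == 0:
        return True  # a non-zero constant has no roots
    prev = c[0::2]  # first Routh row
    width = len(prev)
    curr = c[1::2] + [Fraction(0)] * (width - len(c[1::2]))  # second row
    for _ in range(degree):
        # Any non-positive entry in the first column rules out Hurwitz.
        if curr[0] <= 0:
            return False
        nxt = [(curr[0] * prev[j + 1] - prev[0] * curr[j + 1]) / curr[0]
               for j in range(width - 1)]
        nxt.append(Fraction(0))
        prev, curr = curr, nxt
    return True


# (s + 1)(s + 2)(s + 3) = s^3 + 6 s^2 + 11 s + 6 has only left-half-plane roots.
print(is_hurwitz([1, 6, 11, 6]))
# (s + 1)(s^2 + 1) = s^3 + s^2 + s + 1 has roots on the imaginary axis.
print(is_hurwitz([1, 1, 1, 1]))
```

Exact rational arithmetic via `Fraction` avoids the case where a first-column entry that should be exactly zero (roots on the imaginary axis) shows up as a tiny positive floating-point value.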

Submitted 14 April, 2013; originally announced April 2013.

Showing 1–36 of 36 results for author: Hou, Z