Search | arXiv e-print repository

Channel Measurements and Modeling for Dynamic Vehicular ISAC Scenarios at 28 GHz

Authors: Zhengyu Zhang, Ruisi He, Bo Ai, Mi Yang, Xuejian Zhang, Ziyi Qi, Yuan Yuan

Abstract: Integrated sensing and communication (ISAC) is a promising technology for 6G, with the goal of providing end-to-end information processing and inherent perception capabilities for future communication systems. Within ISAC emerging application scenarios, vehicular ISAC technologies have the potential to enhance traffic efficiency and safety through integration of communication and synchronized perception abilities. To establish a foundational theoretical support for vehicular ISAC system design and standardization, it is necessary to conduct channel measurements, and modeling to obtain a deep understanding of the radio propagation. In this paper, a dynamic statistical channel model is proposed for vehicular ISAC scenarios, incorporating Sensing Multipath Components (S-MPCs) and Clutter Multipath Components (C-MPCs), which are identified by the proposed tracking algorithm. Based on actual vehicular ISAC channel measurements at 28 GHz, time-varying sensing characteristics in front, left, and right directions are investigated. To model the dynamic evolution process of channel, number of new S-MPCs, lifetimes, initial power and delay positions, dynamic variations within their lifetimes, clustering, power decay, and fading of C-MPCs are statistically characterized. Finally, the paper provides implementation of dynamic vehicular ISAC model and validates it by comparing key simulation statistics between measurements and simulations.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2403.00569 [pdf, other]

Characterization of Wireless Channel Semantics: A New Paradigm

Authors: Zhengyu Zhang, Ruisi He, Mi Yang, Xuejian Zhang, Ziyi Qi, Yuan Yuan, Bo Ai

Abstract: Recently, deep learning enabled semantic communications have been developed to understand transmission content from semantic level, which realize effective and accurate information transfer. Aiming to the vision of sixth generation (6G) networks, wireless devices are expected to have native perception and intelligent capabilities, which associate wireless channel with surrounding environments from physical propagation dimension to semantic information dimension. Inspired by these, we aim to provide a new paradigm on wireless channel from semantic level. A channel semantic model and its characterization framework are proposed in this paper. Specifically, a channel semantic model composes of status semantics, behavior semantics and event semantics. Based on actual channel measurement at 28 GHz, as well as multi-mode data, example results of channel semantic characterization are provided and analyzed, which exhibits reasonable and interpretable semantic information.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2403.00557 [pdf, other]

Non-stationarity Characteristics in Dynamic Vehicular ISAC Channels at 28 GHz

Authors: Zhengyu Zhang, Ruisi He, Mi Yang, Xuejian Zhang, Ziyi Qi, Hang Mi, Guiqi Sun, Jingya Yang, Bo Ai

Abstract: Integrated sensing and communications (ISAC) is a potential technology of 6G, aiming to enable end-to-end information processing ability and native perception capability for future communication systems. As an important part of the ISAC application scenarios, ISAC aided vehicle-to-everything (V2X) can improve the traffic efficiency and safety through intercommunication and synchronous perception. It is necessary to carry out measurement, characterization, and modeling for vehicular ISAC channels as the basic theoretical support for system design. In this paper, dynamic vehicular ISAC channel measurements at 28 GHz are carried out and provide data for the characterization of non-stationarity characteristics. Based on the actual measurements, this paper analyzes the time-varying PDPs, RMSDS and non-stationarity characteristics of front, lower front, left and right perception directions in a complicated V2X scenarios. The research in this paper can enrich the investigation of vehicular ISAC channels and enable the analysis and design of vehicular ISAC systems.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2312.12317 [pdf, other]

doi 10.1109/PCS60826.2024.10566416

Full-reference Video Quality Assessment for User Generated Content Transcoding

Authors: Zihao Qi, Chen Feng, Duolikun Danier, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

Abstract: Unlike video coding for professional content, the delivery pipeline of User Generated Content (UGC) involves transcoding where unpristine reference content needs to be compressed repeatedly. In this work, we observe that existing full-/no-reference quality metrics fail to accurately predict the perceptual quality difference between transcoded UGC content and the corresponding unpristine references. Therefore, they are unsuited for guiding the rate-distortion optimisation process in the transcoding process. In this context, we propose a bespoke full-reference deep video quality metric for UGC transcoding. The proposed method features a transcoding-specific weakly supervised training strategy employing a quality ranking-based Siamese structure. The proposed method is evaluated on the YouTube-UGC VP9 subset and the LIVE-Wild database, demonstrating state-of-the-art performance compared to existing VQA methods.

Submitted 19 December, 2023; originally announced December 2023.

Comments: 5 pages, 4 figures

arXiv:2311.16024 [pdf, other]

MadRadar: A Black-Box Physical Layer Attack Framework on mmWave Automotive FMCW Radars

Authors: David Hunt, Kristen Angell, Zhenzhou Qi, Tingjun Chen, Miroslav Pajic

Abstract: Frequency modulated continuous wave (FMCW) millimeter-wave (mmWave) radars play a critical role in many of the advanced driver assistance systems (ADAS) featured on today's vehicles. While previous works have demonstrated (only) successful false-positive spoofing attacks against these sensors, all but one assumed that an attacker had the runtime knowledge of the victim radar's configuration. In this work, we introduce MadRadar, a general black-box radar attack framework for automotive mmWave FMCW radars capable of estimating the victim radar's configuration in real-time, and then executing an attack based on the estimates. We evaluate the impact of such attacks maliciously manipulating a victim radar's point cloud, and show the novel ability to effectively `add' (i.e., false positive attacks), `remove' (i.e., false negative attacks), or `move' (i.e., translation attacks) object detections from a victim vehicle's scene. Finally, we experimentally demonstrate the feasibility of our attacks on real-world case studies performed using a real-time physical prototype on a software-defined radio platform.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.10656 [pdf, other]

LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement

Authors: Zili Qi, Xinhui Hu, Wangjin Zhou, Sheng Li, Hao Wu, Jian Lu, Xinkang Xu

Abstract: Recently, researchers have shown an increasing interest in automatically predicting the subjective evaluation for speech synthesis systems. This prediction is a challenging task, especially on the out-of-domain test set. In this paper, we proposed a novel fusion model for MOS prediction that combines supervised and unsupervised approaches. In the supervised aspect, we developed an SSL-based predictor called LE-SSL-MOS. The LE-SSL-MOS utilizes pre-trained self-supervised learning models and further improves prediction accuracy by utilizing the opinion scores of each utterance in the listener enhancement branch. In the unsupervised aspect, two steps are contained: we fine-tuned the unit language model (ULM) using highly intelligible domain data to improve the correlation of an unsupervised metric - SpeechLMScore. Another is that we utilized ASR confidence as a new metric with the help of ensemble learning. To our knowledge, this is the first architecture that fuses supervised and unsupervised methods for MOS prediction. With these approaches, our experimental results on the VoiceMOS Challenge 2023 show that LE-SSL-MOS performs better than the baseline. Our fusion system achieved an absolute improvement of 13% over LE-SSL-MOS on the noisy and enhanced speech track. Our system ranked 1st and 2nd, respectively, in the French speech synthesis track and the challenge's noisy and enhanced speech track.

Submitted 17 November, 2023; originally announced November 2023.

Comments: accepted in IEEE-ASRU2023

arXiv:2311.08225 [pdf, other]

Uni-COAL: A Unified Framework for Cross-Modality Synthesis and Super-Resolution of MR Images

Authors: Zhiyun Song, Zengxin Qi, Xin Wang, Xiangyu Zhao, Zhenrong Shen, Sheng Wang, Manman Fei, Zhe Wang, Di Zang, Dongdong Chen, Linlin Yao, Qian Wang, Xuehai Wu, Lichi Zhang

Abstract: Cross-modality synthesis (CMS), super-resolution (SR), and their combination (CMSR) have been extensively studied for magnetic resonance imaging (MRI). Their primary goals are to enhance the imaging quality by synthesizing the desired modality and reducing the slice thickness. Despite the promising synthetic results, these techniques are often tailored to specific tasks, thereby limiting their adaptability to complex clinical scenarios. Therefore, it is crucial to build a unified network that can handle various image synthesis tasks with arbitrary requirements of modality and resolution settings, so that the resources for training and deploying the models can be greatly reduced. However, none of the previous works is capable of performing CMS, SR, and CMSR using a unified network. Moreover, these MRI reconstruction methods often treat alias frequencies improperly, resulting in suboptimal detail restoration. In this paper, we propose a Unified Co-Modulated Alias-free framework (Uni-COAL) to accomplish the aforementioned tasks with a single network. The co-modulation design of the image-conditioned and stochastic attribute representations ensures the consistency between CMS and SR, while simultaneously accommodating arbitrary combinations of input/output modalities and thickness. The generator of Uni-COAL is also designed to be alias-free based on the Shannon-Nyquist signal processing framework, ensuring effective suppression of alias frequencies. Additionally, we leverage the semantic prior of Segment Anything Model (SAM) to guide Uni-COAL, ensuring a more authentic preservation of anatomical structures during synthesis. Experiments on three datasets demonstrate that Uni-COAL outperforms the alternatives in CMS, SR, and CMSR tasks for MR images, which highlights its generalizability to wide-range applications.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2210.02189 [pdf]

A Generalizable Artificial Intelligence Model for COVID-19 Classification Task Using Chest X-ray Radiographs: Evaluated Over Four Clinical Datasets with 15,097 Patients

Authors: Ran Zhang, Xin Tie, John W. Garrett, Dalton Griner, Zhihua Qi, Nicholas B. Bevins, Scott B. Reeder, Guang-Hong Chen

Abstract: Purpose: To answer the long-standing question of whether a model trained from a single clinical site can be generalized to external sites. Materials and Methods: 17,537 chest x-ray radiographs (CXRs) from 3,264 COVID-19-positive patients and 4,802 COVID-19-negative patients were collected from a single site for AI model development. The generalizability of the trained model was retrospectively evaluated using four different real-world clinical datasets with a total of 26,633 CXRs from 15,097 patients (3,277 COVID-19-positive patients). The area under the receiver operating characteristic curve (AUC) was used to assess diagnostic performance. Results: The AI model trained using a single-source clinical dataset achieved an AUC of 0.82 (95% CI: 0.80, 0.84) when applied to the internal temporal test set. When applied to datasets from two external clinical sites, an AUC of 0.81 (95% CI: 0.80, 0.82) and 0.82 (95% CI: 0.80, 0.84) were achieved. An AUC of 0.79 (95% CI: 0.77, 0.81) was achieved when applied to a multi-institutional COVID-19 dataset collected by the Medical Imaging and Data Resource Center (MIDRC). A power-law dependence, N^k (k is empirically found to be -0.21 to -0.25), indicates a relatively weak performance dependence on the training data sizes. Conclusion: COVID-19 classification AI model trained using well-curated data from a single clinical site is generalizable to external clinical sites without a significant drop in performance.

Submitted 4 October, 2022; originally announced October 2022.

arXiv:2208.06099 [pdf, other]

TBI-GAN: An Adversarial Learning Approach for Data Synthesis on Traumatic Brain Segmentation

Authors: Xiangyu Zhao, Di Zang, Sheng Wang, Zhenrong Shen, Kai Xuan, Zeyu Wei, Zhe Wang, Ruizhe Zheng, Xuehai Wu, Zheren Li, Qian Wang, Zengxin Qi, Lichi Zhang

Abstract: Brain network analysis for traumatic brain injury (TBI) patients is critical for its consciousness level assessment and prognosis evaluation, which requires the segmentation of certain consciousness-related brain regions. However, it is difficult to construct a TBI segmentation model as manually annotated MR scans of TBI patients are hard to collect. Data augmentation techniques can be applied to alleviate the issue of data scarcity. However, conventional data augmentation strategies such as spatial and intensity transformation are unable to mimic the deformation and lesions in traumatic brains, which limits the performance of the subsequent segmentation task. To address these issues, we propose a novel medical image inpainting model named TBI-GAN to synthesize TBI MR scans with paired brain label maps. The main strength of our TBI-GAN method is that it can generate TBI images and corresponding label maps simultaneously, which has not been achieved in the previous inpainting methods for medical images. We first generate the inpainted image under the guidance of edge information following a coarse-to-fine manner, and then the synthesized intensity image is used as the prior for label inpainting. Furthermore, we introduce a registration-based template augmentation pipeline to increase the diversity of the synthesized image pairs and enhance the capacity of data augmentation. Experimental results show that the proposed TBI-GAN method can produce sufficient synthesized TBI images with high quality and valid label maps, which can greatly improve the 2D and 3D traumatic brain segmentation performance compared with the alternatives.

Submitted 11 August, 2022; originally announced August 2022.

arXiv:2207.08634 [pdf, other]

Enhancing HDR Video Compression through CNN-based Effective Bit Depth Adaptation

Authors: Chen Feng, Zihao Qi, Duolikun Danier, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

Abstract: It is well known that high dynamic range (HDR) video can provide more immersive visual experiences compared to conventional standard dynamic range content. However, HDR content is typically more challenging to encode due to the increased detail associated with the wider dynamic range. In this paper, we improve HDR compression performance using the effective bit depth adaptation approach (EBDA). This method reduces the effective bit depth of the original video content before encoding and reconstructs the full bit depth using a CNN-based up-sampling method at the decoder. In this work, we modify the MFRNet network architecture to enable multiple frame processing, and the new network, multi-frame MFRNet, has been integrated into the EBDA framework using two Versatile Video Coding (VVC) host codecs: VTM 16.2 and the Fraunhofer Versatile Video Encoder (VVenC 1.4.0). The proposed approach was evaluated under the JVET HDR Common Test Conditions using the Random Access configuration. The results show coding gains over both the original VVC VTM 16.2 and VVenC 1.4.0 (w/o EBDA) on JVET HDR tested sequences, with average bitrate savings of 2.9% (over VTM) and 4.8% (against VVenC) based on the Bjontegaard Delta measurement. The source code of multi-frame MFRNet has been released at https://github.com/fan-aaron-zhang/MF-MFRNet.

Submitted 18 July, 2022; originally announced July 2022.

Comments: 5 pages, 3 figures

arXiv:2202.12943 [pdf, other]

Arrhythmia Classifier Using Convolutional Neural Network with Adaptive Loss-aware Multi-bit Networks Quantization

Authors: Hanshi Sun, Ao Wang, Ninghao Pu, Zhiqing Li, Junguang Huang, Hao Liu, Zhi Qi

Abstract: Cardiovascular disease (CVDs) is one of the universal deadly diseases, and the detection of it in the early stage is a challenging task to tackle. Recently, deep learning and convolutional neural networks have been employed widely for the classification of objects. Moreover, it is promising that lots of networks can be deployed on wearable devices. An increasing number of methods can be used to realize ECG signal classification for the sake of arrhythmia detection. However, the existing neural networks proposed for arrhythmia detection are not hardware-friendly enough due to a remarkable quantity of parameters resulting in memory and power consumption. In this paper, we present a 1-D adaptive loss-aware quantization, achieving a high compression rate that reduces memory consumption by 23.36 times. In order to adapt to our compression method, we need a smaller and simpler network. We propose a 17 layer end-to-end neural network classifier to classify 17 different rhythm classes trained on the MIT-BIH dataset, realizing a classification accuracy of 93.5%, which is higher than most existing methods. Due to the adaptive bitwidth method making important layers get more attention and offered a chance to prune useless parameters, the proposed quantization method avoids accuracy degradation. It even improves the accuracy rate, which is 95.84%, 2.34% higher than before. Our study achieves a 1-D convolutional neural network with high performance and low resources consumption, which is hardware-friendly and illustrates the possibility of deployment on wearable devices to realize a real-time arrhythmia diagnosis.

Submitted 27 February, 2022; originally announced February 2022.

Comments: 7 pages, 7 figures

arXiv:2111.14346 [pdf, other]

Pessimistic Model Selection for Offline Deep Reinforcement Learning

Authors: Chao-Han Huck Yang, Zhengling Qi, Yifan Cui, Pin-Yu Chen

Abstract: Deep Reinforcement Learning (DRL) has demonstrated great potentials in solving sequential decision making problems in many applications. Despite its promising performance, practical gaps exist when deploying DRL in real-world scenarios. One main barrier is the over-fitting issue that leads to poor generalizability of the policy learned by DRL. In particular, for offline DRL with observational data, model selection is a challenging task as there is no ground truth available for performance demonstration, in contrast with the online setting with simulated environments. In this work, we propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee, which features a provably effective framework for finding the best policy among a set of candidate models. Two refined approaches are also proposed to address the potential bias of DRL model in identifying the optimal policy. Numerical studies demonstrated the superior performance of our approach over existing methods.

Submitted 29 November, 2021; originally announced November 2021.

Comments: Preprint. A non-archival and preliminary venue was presented at NeurIPS 2021 Offline Reinforcement Learning Workshop

arXiv:2109.00228 [pdf]

Deployable Networks for Public Safety in 5G and Beyond: A Coverage and Interference Study

Authors: Zhiqiang Qi, Adrián Lahuerta-Lavieja, Jingya Li, Keerthi Kumar Nagalapur

Abstract: Deployable networks are foreseen to be one of the key technologies for public safety in fifth generation (5G) mobile communications and beyond. They can be used to complement the existing public cellular networks to provide temporary and on-demand connectivity in emergency situations. However, operating deployable networks in coexistence with public cellular networks can be challenging from an interference perspective. To gain insights on the deployment strategy for deployable networks, in this article, we present an extensive numerical study of coverage and interference analysis, considering four different co-existence scenarios and different types of deployable base stations (BSs), i.e., BS on a truck and BS on an Unmanned Aerial Vehicle (UAV). Our simulation results show that deploying deployable BSs in rural scenarios can provide good coverage to meet the service requirement for mission critical (MC) users. In addition, the interference impact is only substantial when the deployable and public networks are close to each other. Finally, allowing the MC users to access the public network can be of vital importance to guarantee their service when the interference level between public and deployable network is very high.

Submitted 20 September, 2021; v1 submitted 1 September, 2021; originally announced September 2021.

arXiv:2107.04721 [pdf, other]

doi 10.1007/978-3-030-87000-3_7

U-Net with Hierarchical Bottleneck Attention for Landmark Detection in Fundus Images of the Degenerated Retina

Authors: Shuyun Tang, Ziming Qi, Jacob Granley, Michael Beyeler

Abstract: Fundus photography has routinely been used to document the presence and severity of retinal degenerative diseases such as age-related macular degeneration (AMD), glaucoma, and diabetic retinopathy (DR) in clinical practice, for which the fovea and optic disc (OD) are important retinal landmarks. However, the occurrence of lesions, drusen, and other retinal abnormalities during retinal degeneration severely complicates automatic landmark detection and segmentation. Here we propose HBA-U-Net: a U-Net backbone enriched with hierarchical bottleneck attention. The network consists of a novel bottleneck attention block that combines and refines self-attention, channel attention, and relative-position attention to highlight retinal abnormalities that may be important for fovea and OD segmentation in the degenerated retina. HBA-U-Net achieved state-of-the-art results on fovea detection across datasets and eye conditions (ADAM: Euclidean Distance (ED) of 25.4 pixels, REFUGE: 32.5 pixels, IDRiD: 32.1 pixels), on OD segmentation for AMD (ADAM: Dice Coefficient (DC) of 0.947), and on OD detection for DR (IDRiD: ED of 20.5 pixels). Our results suggest that HBA-U-Net may be well suited for landmark detection in the presence of a variety of retinal degenerative diseases.

Submitted 9 July, 2021; originally announced July 2021.

Journal ref: Ophthalmic Medical Image Analysis 2021

arXiv:2008.01221 [pdf, ps, other]

Configuration Learning in Underwater Optical Links

Authors: Xueyuan Zhao, Zhuoran Qi, Dario Pompili

Abstract: A new research problem named configuration learning is described in this work. A novel algorithm is proposed to address the configuration learning problem. The configuration learning problem is defined to be the optimization of the Machine Learning (ML) classifier to maximize the ML performance metric optimizing the transmitter configuration in the signal processing/communication systems. Specifically, this configuration learning problem is investigated in an underwater optical communication system with signal processing performance metric of the physical-layer communication throughput. A novel algorithm is proposed to perform the configuration learning by alternating optimization of key design parameters and switching between several Recurrent Neural Network (RNN) classifiers dependant on the learning objective. The proposed ML algorithm is validated with the datasets of an underwater optical communication system and is compared with competing ML algorithms. Performance results indicate that the proposal outperforms the competing algorithms for binary and multi-class configuration learning in underwater optical communication datasets. The proposed configuration learning framework can be further investigated and applied to a broad range of topics in signal processing and communications.

Submitted 3 August, 2020; originally announced August 2020.

arXiv:2004.03416 [pdf, other]

A SARS-CoV-2 Microscopic Image Dataset with Ground Truth Images and Visual Features

Authors: Chen Li, Jiawei Zhang, Frank Kulwa, Shouliang Qi, Ziyu Qi

Abstract: SARS-CoV-2 has characteristics of wide contagion and quick propagation velocity. To analyse the visual information of it, we build a SARS-CoV-2 Microscopic Image Dataset (SC2-MID) with 48 electron microscopic images and also prepare their ground truth images. Furthermore, we extract multiple classical features and novel deep learning features to describe the visual information of SARS-CoV-2. Finally, it is proved that the visual features of the SARS-CoV-2 images which are observed under the electron microscopic can be extracted and analysed.

Submitted 5 March, 2021; v1 submitted 7 April, 2020; originally announced April 2020.

arXiv:2002.04170 [pdf, other]

Learning to Incorporate Structure Knowledge for Image Inpainting

Authors: Jie Yang, Zhiquan Qi, Yong Shi

Abstract: This paper develops a multi-task learning framework that attempts to incorporate the image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and corresponding structures --- edge and gradient, thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thus to provide possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding besides with attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.

Submitted 11 February, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: Accepted by AAAI 2020

arXiv:1909.10774 [pdf, other]

doi 10.1109/TIP.2020.3014953

s-LWSR: Super Lightweight Super-Resolution Network

Authors: Biao Li, Jiabin Liu, Bo Wang, Zhiquan Qi, Yong Shi

Abstract: Deep learning (DL) architectures for superresolution (SR) normally contain tremendous parameters, which has been regarded as the crucial advantage for obtaining satisfying performance. However, with the widespread use of mobile phones for taking and retouching photos, this character greatly hampers the deployment of DL-SR models on the mobile devices. To address this problem, in this paper, we propose a super lightweight SR network: s-LWSR. There are mainly three contributions in our work. Firstly, in order to efficiently abstract features from the low resolution image, we build an information pool to mix multi-level information from the first half part of the pipeline. Accordingly, the information pool feeds the second half part with the combination of hierarchical features from the previous layers. Secondly, we employ a compression module to further decrease the size of parameters. Intensive analysis confirms its capacity of trade-off between model complexity and accuracy. Thirdly, by revealing the specific role of activation in deep models, we remove several activation layers in our SR model to retain more information for performance improvement. Extensive experiments show that our s-LWSR, with limited parameters and operations, can achieve similar performance to other cumbersome DL-SR methods.

Submitted 24 September, 2019; originally announced September 2019.

Showing 1–18 of 18 results for author: Qi, Z