-
Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing
Authors:
Jiarun Ding,
Peiwen Jiang,
Chao-Kai Wen,
Shi Jin
Abstract:
Semantic communication has undergone considerable evolution due to the recent rapid development of artificial intelligence (AI), significantly enhancing both communication robustness and efficiency. Despite these advancements, most current semantic communication methods for image transmission pay little attention to the differing importance of objects and backgrounds in images. To address this issue, we propose a novel scheme named ASCViT-JSCC, which utilizes vision transformers (ViTs) integrated with an orthogonal frequency division multiplexing (OFDM) system. This scheme adaptively allocates bandwidth for objects and backgrounds in images according to the importance order of different parts determined by object detection of you only look once version 5 (YOLOv5) and feature points detection of scale invariant feature transform (SIFT). Furthermore, the proposed scheme adheres to digital modulation standards by incorporating quantization modules. We validate this approach through an over-the-air (OTA) testbed named intelligent communication prototype validation platform (ICP) based on a software-defined radio (SDR) and NVIDIA embedded kits. Our findings from both simulations and practical measurements show that ASCViT-JSCC significantly preserves objects in images and enhances reconstruction quality compared to existing methods.
Submitted 22 May, 2024;
originally announced May 2024.
-
EEGDiR: Electroencephalogram denoising network for temporal information storage and global modeling through Retentive Network
Authors:
Bin Wang,
Fei Deng,
Peifan Jiang
Abstract:
Electroencephalogram (EEG) signals play a pivotal role in clinical medicine, brain research, and neurological disease studies. However, susceptibility to various physiological and environmental artifacts introduces noise in recorded EEG data, impeding accurate analysis of underlying brain activity. Denoising techniques are crucial to mitigate this challenge. Recent advancements in deep learning-based approaches exhibit substantial potential for enhancing the signal-to-noise ratio of EEG data compared to traditional methods. In the realm of large-scale language models (LLMs), the Retentive Network (Retnet) infrastructure, prevalent for some models, demonstrates robust feature extraction and global modeling capabilities. Recognizing the temporal similarities between EEG signals and natural language, we introduce the Retnet from natural language processing to EEG denoising. This integration presents a novel approach to EEG denoising, opening avenues for a profound understanding of brain activities and accurate diagnosis of neurological diseases. Nonetheless, direct application of Retnet to EEG denoising is unfeasible due to the one-dimensional nature of EEG signals, while natural language processing deals with two-dimensional data. To facilitate Retnet application to EEG denoising, we propose the signal embedding method, transforming one-dimensional EEG signals into two dimensions for use as network inputs. Experimental results validate the substantial improvement in denoising effectiveness achieved by the proposed method.
Submitted 20 May, 2024; v1 submitted 20 March, 2024;
originally announced April 2024.
-
Semantic Satellite Communications Based on Generative Foundation Model
Authors:
Peiwen Jiang,
Chao-Kai Wen,
Xiao Li,
Shi Jin,
Geoffrey Ye Li
Abstract:
Satellite communications can provide massive connections and seamless coverage, but they also face several challenges, such as rain attenuation, long propagation delays, and co-channel interference. To improve transmission efficiency and address severe scenarios, semantic communication has become a popular choice, particularly when equipped with foundation models (FMs). In this study, we introduce an FM-based semantic satellite communication framework, termed FMSAT. This framework leverages FM-based segmentation and reconstruction to significantly reduce bandwidth requirements and accurately recover semantic features under high noise and interference. Considering the high speed of satellites, an adaptive encoder-decoder is proposed to protect important features and avoid frequent retransmissions. Meanwhile, a well-received image can provide a reference for repairing damaged images under sudden attenuation. Since acknowledgment feedback is subject to long propagation delays when retransmission is unavoidable, a novel error detection method is proposed to roughly detect semantic errors at the regenerative satellite. With the proposed detectors at both the satellite and the gateway, the quality of the received images can be ensured. The simulation results demonstrate that the proposed method can significantly reduce bandwidth requirements, adapt to complex satellite scenarios, and protect semantic information with an acceptable transmission delay.
Submitted 18 April, 2024;
originally announced April 2024.
-
Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation
Authors:
Kasi Viswanath,
Peng Jiang,
Srikanth Saripalli
Abstract:
LiDAR semantic segmentation frameworks predominantly leverage geometry-based features to differentiate objects within a scan. While these methods excel in scenarios with clear boundaries and distinct shapes, their performance declines in environments where boundaries are blurred, particularly in off-road contexts. To address this, recent strides in 3D segmentation algorithms have focused on harnessing raw LiDAR intensity measurements to improve prediction accuracy. Despite these efforts, current learning-based models struggle to correlate the intricate connections between raw intensity and factors such as distance, incidence angle, material reflectivity, and atmospheric conditions. Building upon our prior work, this paper delves into the advantages of employing calibrated intensity (also referred to as reflectivity) within learning-based LiDAR semantic segmentation frameworks. We initially establish that incorporating reflectivity as an input enhances the existing LiDAR semantic segmentation model. Furthermore, we present findings that enabling the model to learn to calibrate intensity can boost its performance. Through extensive experimentation on the off-road dataset Rellis-3D, we demonstrate notable improvements. Specifically, converting intensity to reflectivity results in a 4% increase in mean Intersection over Union (mIoU) when compared to using raw intensity in off-road scenarios. We also investigate the possible benefits of using calibrated intensity in semantic segmentation in urban environments (SemanticKITTI) and cross-sensor domain adaptation.
Submitted 19 March, 2024;
originally announced March 2024.
-
Semantic Communications using Foundation Models: Design Approaches and Open Issues
Authors:
Peiwen Jiang,
Chao-Kai Wen,
Xinping Yi,
Xiao Li,
Shi Jin,
Jun Zhang
Abstract:
Foundation models (FMs), including large language models, have become increasingly popular due to their wide-ranging applicability and ability to understand human-like semantics. While previous research has explored the use of FMs in semantic communications to improve semantic extraction and reconstruction, the impact of these models on different system levels, considering computation and memory complexity, requires further analysis. This study focuses on integrating FMs at the effectiveness, semantic, and physical levels, using universal knowledge to profoundly transform system design. Additionally, it examines the use of compact models to balance performance and complexity, comparing three separate approaches that employ FMs. Ultimately, the study highlights unresolved issues in the field that need addressing.
Submitted 23 September, 2023;
originally announced September 2023.
-
RIS-Enhanced Semantic Communications Adaptive to User Requirements
Authors:
Peiwen Jiang,
Chao-Kai Wen,
Shi Jin,
Geoffrey Ye Li
Abstract:
Semantic communication significantly reduces required bandwidth by understanding the semantic meaning of the transmitted content. However, current deep learning-based semantic communication methods rely on joint source-channel coding design and end-to-end training, which limits their adaptability to new physical channels and user requirements. Reconfigurable intelligent surfaces (RIS) offer a solution by customizing channels in different environments. In this study, we propose the RIS-SC framework, which allocates semantic contents with varying levels of RIS assistance to satisfy the changing user requirements. It takes into account user movement and line-of-sight obstructions, enabling the RIS resource to protect important semantics in challenging channel conditions. The simulation results indicate reasonable task performance, but some semantic parts that have no effect on task performance are abandoned under severe channel conditions. To address this issue, a reconstruction method is also introduced to improve visual acceptance by inferring those missing semantic parts. Furthermore, the framework can adjust RIS resources in friendly channel conditions to save and allocate them efficiently among multiple users. Simulation results demonstrate the adaptability and efficiency of the RIS-SC framework across diverse channel conditions and user requirements.
Submitted 5 August, 2023; v1 submitted 29 July, 2023;
originally announced July 2023.
-
Grouping Method for mmWave Massive MIMO System: Exploitation of Angular Multiplexing Gain
Authors:
Peng Jiang,
Pengcheng Zhu,
Jiamin Li,
Dongming Wang
Abstract:
A future millimeter-wave (mmWave) massive multiple-input and multiple-output (MIMO) system may serve hundreds or thousands of users at the same time; thus, research on multiple access technology is particularly important. Moreover, due to the short-wavelength nature of a mmWave, large-scale arrays are easier to implement than microwaves, while their directivity and sparseness make the physical beamforming effect of precoding more prominent. In consideration of the mmWave angle division multiple access (ADMA) system based on precoding, this paper investigates the influence of the angle distribution on system performance, which is denoted as the angular multiplexing gain. Furthermore, inspired by the above research, we transform the ADMA user grouping problem to maximize the system sum-rate into the inter-user angular spacing equalization problem. Then, the form of the optimal solution for the approximate problem is derived, and the corresponding grouping algorithm is proposed. The simulation results demonstrate that the proposed algorithm performs better than the comparison methods. Finally, a complexity analysis also shows that the proposed algorithm has extremely low complexity.
Submitted 25 May, 2023;
originally announced May 2023.
-
Speech Intelligibility Classifiers from 550k Disordered Speech Samples
Authors:
Subhashini Venugopalan,
Jimmy Tobin,
Samuel J. Yang,
Katie Seaver,
Richard J. N. Cave,
Pan-Pan Jiang,
Neil Zeghidour,
Rus Heywood,
Jordan Green,
Michael P. Brenner
Abstract:
We developed dysarthric speech intelligibility classifiers on 551,176 disordered speech samples contributed by a diverse set of 468 speakers, with a range of self-reported speaking disorders and rated for their overall intelligibility on a five-point scale. We trained three models following different deep learning approaches and evaluated them on ~94K utterances from 100 speakers. We further found the models to generalize well (without further training) on the TORGO database (100% accuracy), UASpeech (0.93 correlation), ALS-TDI PMP (0.81 AUC) datasets as well as on a dataset of realistic unprompted speech we gathered (106 dysarthric and 76 control speakers, ~2300 samples).
Submitted 15 March, 2023; v1 submitted 13 March, 2023;
originally announced March 2023.
-
A novel TomoSAR imaging method with few observations based on nested array
Authors:
Pengyu Jiang,
Zhe Zhang,
Bingchen Zhang,
Zhongqiu Xu
Abstract:
Synthetic aperture radar tomography (TomoSAR) baseline optimization is capable of reducing system complexity and improving the temporal coherence of data, and has become an important research topic in the field of TomoSAR. In this paper, we propose a nested TomoSAR technique, which introduces the nested array into TomoSAR as the baseline configuration. This technique obtains a uniform and continuous difference co-array through the nested array to increase the degrees of freedom (DoF) of the system and expand the virtual aperture along the elevation direction. In order to make full use of the difference co-array, the covariance matrix of the echo needs to be obtained. Therefore, we propose a TomoSAR sparse reconstruction algorithm based on the nested array, which uses adaptive covariance matrix estimation to improve the estimation performance in complex scenes. We demonstrate the effectiveness of the proposed method through simulated and real data experiments. Compared with traditional TomoSAR and coprime TomoSAR, the imaging results of our proposed method have better anti-noise performance and retain more image information.
Submitted 30 November, 2022;
originally announced December 2022.
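Illustrative note on the entry above: the "uniform and continuous difference co-array" of a nested array can be checked numerically. The sketch below assumes the textbook two-level nested array (a dense inner subarray plus a sparse outer subarray, positions in units of a base spacing); the function names and the sizes 3 and 3 are hypothetical choices for illustration, not the baseline configuration used in the paper.

```python
def nested_array_positions(n_inner, n_outer):
    """Positions of a two-level nested array, in units of the base spacing.

    Assumed textbook layout: inner elements at 1..n_inner,
    outer elements at multiples of (n_inner + 1).
    """
    inner = list(range(1, n_inner + 1))
    outer = [j * (n_inner + 1) for j in range(1, n_outer + 1)]
    return inner + outer


def difference_coarray(positions):
    """Sorted set of all pairwise position differences (the co-array lags)."""
    return sorted({p - q for p in positions for q in positions})


positions = nested_array_positions(3, 3)   # 6 physical elements
lags = difference_coarray(positions)       # virtual lags seen by the covariance matrix
max_lag = 3 * (3 + 1) - 1                  # n_outer * (n_inner + 1) - 1

print(positions)
print(len(lags), "contiguous lags from", lags[0], "to", lags[-1])
```

With these sizes, six physical elements yield every integer lag between -max_lag and +max_lag, which is the sense in which the virtual aperture is larger than the physical one.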
-
An analysis of degenerating speech due to progressive dysarthria on ASR performance
Authors:
Katrin Tomanek,
Katie Seaver,
Pan-Pan Jiang,
Richard Cave,
Lauren Harrel,
Jordan R. Green
Abstract:
Although personalized automatic speech recognition (ASR) models have recently been designed to recognize even severely impaired speech, model performance may degrade over time for persons with degenerating speech. The aims of this study were to (1) analyze the change of performance of ASR over time in individuals with degrading speech, and (2) explore mitigation strategies to optimize recognition throughout disease progression. Speech was recorded by four individuals with degrading speech due to amyotrophic lateral sclerosis (ALS). Word error rates (WER) across recording sessions were computed for three ASR models: Unadapted Speaker Independent (U-SI), Adapted Speaker Independent (A-SI), and Adapted Speaker Dependent (A-SD or personalized). The performance of all three models degraded significantly over time as speech became more impaired, but the performance of the A-SD model improved markedly when it was updated with recordings from the severe stages of speech progression. Recording additional utterances early in the disease before speech degraded significantly did not improve the performance of A-SD models. Overall, our findings emphasize the importance of continuous recording (and model retraining) when providing personalized models for individuals with progressive speech impairments.
Submitted 31 October, 2022;
originally announced November 2022.
-
Wireless Semantic Transmission via Revising Modules in Conventional Communications
Authors:
Peiwen Jiang,
Chao-Kai Wen,
Shi Jin,
Geoffrey Ye Li
Abstract:
Semantic communication has become a popular research area due to its high spectrum efficiency and error-correction performance. Some studies use deep learning to extract semantic features, which usually form end-to-end semantic communication systems that struggle to adapt to varying wireless environments. Therefore, novel semantic-based coding methods and performance metrics have been investigated, and the designed semantic systems consist of various modules as in conventional communications but with improved functions. This article discusses recent achievements in state-of-the-art semantic communications exploiting the conventional modules in wireless systems. We demonstrate through two examples that the traditional hybrid automatic repeat request and modulation methods can be redesigned for novel semantic coding and metrics to further improve the performance of wireless semantic communications. At the end of this article, some open issues are identified.
Submitted 2 October, 2022;
originally announced October 2022.
-
A SSIM Guided cGAN Architecture For Clinically Driven Generative Image Synthesis of Multiplexed Spatial Proteomics Channels
Authors:
Jillur Rahman Saurav,
Mohammad Sadegh Nasr,
Paul Koomey,
Michael Robben,
Manfred Huber,
Jon Weidanz,
Bríd Ryan,
Eytan Ruppin,
Peng Jiang,
Jacob M. Luber
Abstract:
Here we present a structural similarity index measure (SSIM) guided conditional Generative Adversarial Network (cGAN) that generatively performs image-to-image (i2i) synthesis to generate photo-accurate protein channels in multiplexed spatial proteomics images. This approach can be utilized to accurately generate missing spatial proteomics channels that were not included during experimental data collection either at the bench or the clinic. Experimental spatial proteomic data from the Human BioMolecular Atlas Program (HuBMAP) was used to generate spatial representations of missing proteins through a U-Net based image synthesis pipeline. HuBMAP channels were hierarchically clustered by the (SSIM) as a heuristic to obtain the minimal set needed to recapitulate the underlying biology represented by the spatial landscape of proteins. We subsequently prove that our SSIM based architecture allows for scaling of generative image synthesis to slides with up to 100 channels, which is better than current state of the art algorithms which are limited to data with 11 channels. We validate these claims by generating a new experimental spatial proteomics data set from human lung adenocarcinoma tissue sections and show that a model trained on HuBMAP can accurately synthesize channels from our new data set. The ability to recapitulate experimental data from sparsely stained multiplexed histological slides containing spatial proteomic will have tremendous impact on medical diagnostics and drug development, and also raises important questions on the medical ethics of utilizing data produced by generative image synthesis in the clinical setting. The algorithm that we present in this paper will allow researchers and clinicians to save time and costs in proteomics based histological staining while also increasing the amount of data that they can generate through their experiments.
Submitted 11 June, 2023; v1 submitted 20 May, 2022;
originally announced May 2022.
-
Wireless Semantic Communications for Video Conferencing
Authors:
Peiwen Jiang,
Chao-Kai Wen,
Shi Jin,
Geoffrey Ye Li
Abstract:
Video conferencing has become a popular mode of meeting even if it consumes considerable communication resources. Conventional video compression causes resolution reduction under limited bandwidth. Semantic video conferencing maintains high resolution by transmitting some keypoints to represent motions because the background is almost static, and the speakers do not change often. However, the study on the impact of the transmission errors on keypoints is limited. In this paper, we initially establish a basal semantic video conferencing (SVC) network, which dramatically reduces transmission resources while only losing detailed expressions. The transmission errors in SVC only lead to a changed expression, whereas those in the conventional methods destroy pixels directly. However, the conventional error detector, such as the cyclic redundancy check, cannot reflect the degree of expression changes. To overcome this issue, we develop an incremental redundancy hybrid automatic repeat-request (IR-HARQ) framework for the varying channels (SVC-HARQ) incorporating a novel semantic error detector. The SVC-HARQ has flexibility in bit consumption and achieves good performance. In addition, SVC-CSI is designed for channel state information (CSI) feedback to allocate the keypoint transmission and enhance the performance dramatically. Simulation shows that the proposed wireless semantic communication system can significantly improve the transmission efficiency. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Submitted 16 April, 2022;
originally announced April 2022.
-
Deep Learning based Intelligent Coin-tap Test for Defect Recognition
Authors:
Hongyu Li,
Peng Jiang,
Tiejun Wang
Abstract:
The coin-tap test is a convenient and primary method for non-destructive testing, while its manual on-site operation is tough and costly. With the help of the latest intelligent signal processing method, convolutional neural networks (CNN), we achieve an intelligent coin-tap test which exhibits superior performance in recognizing defects. However, this success of CNNs relies on plenty of well-labeled data from the identical scenario, which could be difficult to get for many real industrial practices. This paper further develops transfer learning strategies for this issue, that is, to transfer the model trained on data of one scenario to another. In experiments, the result presents a notable improvement by using domain adaptation and pseudo label learning strategies. Hence, it becomes possible to apply the model to scenarios with no or little (less than 10%) labeled data by adopting the transfer learning strategies proposed herein. In addition, we used a benchmark dataset that we constructed ourselves throughout this study. This benchmark dataset for the coin-tap test, containing around 100,000 sound signals, is published at https://github.com/PPhub-hy/torch-tapnet.
Submitted 20 March, 2022;
originally announced March 2022.
-
Deep Source-Channel Coding for Sentence Semantic Transmission with HARQ
Authors:
Peiwen Jiang,
Chao-Kai Wen,
Shi Jin,
Geoffrey Ye Li
Abstract:
Recently, semantic communication has been brought to the forefront because of its great success in deep learning (DL), especially Transformer. Even if semantic communication has been successfully applied in the sentence transmission to reduce semantic errors, existing architecture is usually fixed in the codeword length and is inefficient and inflexible for the varying sentence length. In this paper, we exploit hybrid automatic repeat request (HARQ) to reduce semantic transmission error further. We first combine semantic coding (SC) with Reed Solomon (RS) channel coding and HARQ, called SC-RS-HARQ, which exploits the superiority of the SC and the reliability of the conventional methods successfully. Although the SC-RS-HARQ is easily applied in the existing HARQ systems, we also develop an end-to-end architecture, called SCHARQ, to pursue the performance further. Numerical results demonstrate that SCHARQ significantly reduces the required number of bits for sentence semantic transmission and sentence error rate. Finally, we attempt to replace error detection from cyclic redundancy check to a similarity detection network called Sim32 to allow the receiver to reserve the wrong sentences with similar semantic information and to save transmission resources.
Submitted 5 June, 2021;
originally announced June 2021.
-
Single-photon imaging over 200 km
Authors:
Zheng-Ping Li,
Jun-Tian Ye,
Xin Huang,
Peng-Yu Jiang,
Yuan Cao,
Yu Hong,
Chao Yu,
Jun Zhang,
Qiang Zhang,
Cheng-Zhi Peng,
Feihu Xu,
Jian-Wei Pan
Abstract:
Long-range active imaging has widespread applications in remote sensing and target recognition. Single-photon light detection and ranging (lidar) has been shown to have high sensitivity and temporal resolution. On the application front, however, the operating range of practical single-photon lidar systems is limited to about tens of kilometers over the Earth's atmosphere, mainly due to the weak echo signal mixed with high background noise. Here, we present a compact coaxial single-photon lidar system capable of realizing 3D imaging at up to 201.5 km. It is achieved by using high-efficiency optical devices for collection and detection, and what we believe is a new noise-suppression technique that is efficient for long-range applications. We show that photon-efficient computational algorithms enable accurate 3D imaging over hundreds of kilometers with as few as 0.44 signal photons per pixel. The results represent a significant step toward practical, low-power lidar over extra-long ranges.
Submitted 9 March, 2021;
originally announced March 2021.
-
Defect segmentation: Mapping tunnel lining internal defects with ground penetrating radar data using a convolutional neural network
Authors:
Senlin Yang,
Zhengfang Wang,
Jing Wang,
Anthony G. Cohn,
Jiaqi Zhang,
Peng Jiang,
Peng Jiang,
Qingmei Sui
Abstract:
This research proposes a Ground Penetrating Radar (GPR) data processing method for non-destructive detection of tunnel lining internal defects, called defect segmentation. To perform this critical step of automatic tunnel lining detection, the method uses a CNN called Segnet combined with the Lovász softmax loss function to map the internal defect structure with GPR synthetic data, which improves…
▽ More
This research proposes a Ground Penetrating Radar (GPR) data processing method for non-destructive detection of tunnel lining internal defects, called defect segmentation. To perform this critical step of automatic tunnel lining detection, the method uses a CNN called Segnet combined with the Lovász softmax loss function to map the internal defect structure with GPR synthetic data, which improves the accuracy, automation and efficiency of defects detection. The novel method we present overcomes several difficulties of traditional GPR data interpretation as demonstrated by an evaluation on both synthetic and real datas -- to verify the method on real data, a test model containing a known defect was designed and built and GPR data was obtained and analyzed.
Submitted 29 March, 2020;
originally announced March 2020.
-
Range-Doppler Sidelobe Suppression for Pulsed Radar Based on Golay Complementary Codes
Authors:
Zhong-Jie Wu,
Chen-Xu Wang,
Pei-He Jiang,
Zhi-Quan Zhou
Abstract:
To relieve the interference caused by range-Doppler sidelobes in pulsed radars, we propose a new method to construct Doppler resilient complementary waveforms based on Golay codes. We design both the transmit pulse train and the receive pulse weights, so that the similarity between the pulse weights and a given window function is maximized and the constraints on Doppler null points and energy are met. The design is formulated as a two-way partitioning problem and then solved by semidefinite programming and randomization techniques. The resulting waveform has its range sidelobes suppressed outright in multiple, flexibly adjustable Doppler zones, and performs well in Doppler sidelobe suppression, Doppler resolution and SNR. It shows great promise in detecting slightly moving weak targets in the presence of dense interference.
Submitted 25 March, 2020;
originally announced March 2020.
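The complementary property that Golay-code waveforms rely on can be checked numerically. The sketch below is not the paper's Doppler-resilient design: it only illustrates the zero-Doppler case, where the aperiodic autocorrelations of the two sequences in a Golay pair sum to zero at every nonzero lag. The helper names `golay_pair` and `autocorr` are illustrative assumptions, and the pair is built with the standard concatenation recursion.

```python
def autocorr(seq):
    """Aperiodic autocorrelation of a real sequence for lags 0..N-1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + k] for i in range(n - k)) for k in range(n)]


def golay_pair(m):
    """Length-2**m Golay complementary pair (illustrative helper).

    Uses the standard recursion a' = a|b, b' = a|(-b), starting from ([1], [1]).
    """
    a, b = [1], [1]
    for _ in range(m):
        a, b = a + b, a + [-x for x in b]
    return a, b


a, b = golay_pair(3)  # two sequences of length 8
# Sum of the two autocorrelations: 2N at lag 0, zero at every other lag.
combined = [x + y for x, y in zip(autocorr(a), autocorr(b))]
print(combined)
```

Each sequence alone has nonzero range sidelobes; only the sum cancels them, and only when there is no Doppler shift between the pulses, which is the limitation that Doppler-resilient constructions address.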
-
LiDARNet: A Boundary-Aware Domain Adaptation Model for Point Cloud Semantic Segmentation
Authors:
Peng Jiang,
Srikanth Saripalli
Abstract:
We present a boundary-aware domain adaptation model for LiDAR scan full-scene semantic segmentation (LiDARNet). Our model can extract both the domain private features and the domain shared features with a two-branch structure. We embedded Gated-SCNN into the segmentor component of LiDARNet to learn boundary information while learning to predict full-scene semantic segmentation labels. Moreover, we further reduce the domain gap by inducing the model to learn a mapping between two domains using the domain shared and private features. Additionally, we introduce a new dataset, SemanticUSL (available at https://unmannedlab.github.io/research/SemanticUSL), for domain adaptation for LiDAR point cloud semantic segmentation. The dataset has the same data format and ontology as SemanticKITTI. We conducted experiments on the real-world datasets SemanticKITTI, SemanticPOSS, and SemanticUSL, which differ in channel distributions, reflectivity distributions, diversity of scenes, and sensor setup. Using our approach, we can get a single projection-based LiDAR full-scene semantic segmentation model working on both domains. Our model can keep almost the same performance on the source domain after adaptation and get an 8%-22% mIoU performance increase in the target domain.
Submitted 24 April, 2021; v1 submitted 2 March, 2020;
originally announced March 2020.
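The mIoU figure quoted in the abstract above is the mean intersection-over-union across classes. As a minimal sketch of how that metric is commonly computed from a confusion matrix (the function name `mean_iou` and the toy 3-class matrix are assumptions for illustration, not data from the paper):

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] = number of points whose true class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        if denom:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical 3-class confusion matrix, made up for this example.
toy = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 1, 9],
]
print(round(mean_iou(toy), 4))
```

Because each class contributes equally to the mean, rare classes weigh as much as frequent ones, which is why mIoU is a common choice for comparing segmentation models across domains.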
-
Super-resolution single-photon imaging at 8.2 kilometers
Authors:
Zheng-Ping Li,
Xin Huang,
Peng-Yu Jiang,
Yu Hong,
Chao Yu,
Yuan Cao,
Jun Zhang,
Feihu Xu,
Jian-Wei Pan
Abstract:
Single-photon light detection and ranging (LiDAR), offering single-photon sensitivity and picosecond time resolution, has been widely adopted for active imaging applications. Long-range active imaging is a great challenge, because the spatial resolution degrades significantly with the imaging range due to the diffraction limit of the optics, and only weak echo signal photons return, mixed with strong background noise. Here we propose and demonstrate a photon-efficient LiDAR approach that can achieve sub-Rayleigh resolution imaging over long ranges. This approach exploits fine sub-pixel scanning and a deconvolution algorithm tailored to this long-range application. Using this approach, we experimentally demonstrated active three-dimensional (3D) single-photon imaging by recognizing different postures of a mannequin model at a stand-off distance of 8.2 km in both daylight and at night. The observed spatial (transversal) resolution is about 5.5 cm at 8.2 km, which is about twice the system's resolution. This also beats the optical system's Rayleigh criterion. The results are valuable for geosciences and target recognition over long ranges.
Submitted 30 January, 2020;
originally announced January 2020.
-
GPRInvNet: Deep Learning-Based Ground Penetrating Radar Data Inversion for Tunnel Lining
Authors:
Bin Liu,
Yuxiao Ren,
Hanchi Liu,
Hui Xu,
Zhengfang Wang,
Anthony G. Cohn,
Peng Jiang
Abstract:
A DNN architecture referred to as GPRInvNet was proposed to tackle the challenges of mapping the ground-penetrating radar (GPR) B-Scan data to complex permittivity maps of subsurface structures. The GPRInvNet consisted of a trace-to-trace encoder and a decoder. It was specially designed to take into account the characteristics of GPR inversion when faced with complex GPR B-Scan data, as well as addressing the spatial alignment issues between time-series B-Scan data and spatial permittivity maps. It displayed the ability to fuse features from several adjacent traces on the B-Scan data to enhance each trace, and then further condense the features of each trace separately. As a result, the sensitive zones on the permittivity maps spatially aligned to the enhanced trace could be reconstructed accurately. The GPRInvNet has been utilized to reconstruct the permittivity map of tunnel linings. A diverse range of dielectric models of tunnel linings containing complex defects has been reconstructed using GPRInvNet. The results have demonstrated that the GPRInvNet is capable of effectively reconstructing complex tunnel lining defects with clear boundaries. Comparative results with existing baseline methods also demonstrated the superiority of the GPRInvNet. For the purpose of generalizing the GPRInvNet to real GPR data, some background noise patches recorded from practical model testing were integrated into the synthetic GPR data to retrain the GPRInvNet. The model testing has been conducted for validation, and experimental results revealed that the GPRInvNet had also achieved satisfactory results with regard to the real data.
Submitted 26 September, 2021; v1 submitted 11 December, 2019;
originally announced December 2019.
-
AI-Aided Online Adaptive OFDM Receiver: Design and Experimental Results
Authors:
Peiwen Jiang,
Tianqi Wang,
Bin Han,
Xuanxuan Gao,
Jing Zhang,
Chao-Kai Wen,
Shi Jin,
Geoffrey Ye Li
Abstract:
Orthogonal frequency division multiplexing (OFDM) has been widely applied in current communication systems. Artificial intelligence (AI)-aided OFDM receivers are currently being brought to the forefront to replace and improve the traditional OFDM receivers. In this study, we first compare two AI-aided OFDM receivers, namely, a data-driven fully connected deep neural network and the model-driven ComNet, through extensive simulation and real-time video transmission using a 5G rapid prototyping system for an over-the-air (OTA) test. We find a performance gap between the simulation and the OTA test caused by the discrepancy between the channel model for offline training and the real environment. We develop a novel online training system, called the SwitchNet receiver, to address this issue. This receiver has a flexible and extendable architecture and can adapt to real channels by training only several parameters online. From the OTA test, the AI-aided OFDM receivers, especially the SwitchNet receiver, are robust to real environments and promising for future communication systems. We discuss potential challenges and future research inspired by our initial study in this paper.
Submitted 24 December, 2021; v1 submitted 17 December, 2018;
originally announced December 2018.
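For context on what a traditional OFDM receiver does, here is a minimal sketch of the conventional baseline, not the AI-aided or SwitchNet receivers described above. It assumes a noiseless two-tap channel that the receiver knows perfectly, and a cyclic prefix at least as long as the channel memory; the function names `dft` and `ofdm_roundtrip` and all parameter values are illustrative assumptions.

```python
import cmath


def dft(x, inverse=False):
    """Plain O(N^2) discrete Fourier transform (inverse scaled by 1/N)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def ofdm_roundtrip(symbols, channel, cp_len):
    """Modulate, pass through a multipath channel, and equalize.

    Requires 1 <= cp_len <= len(symbols) and cp_len >= len(channel) - 1.
    """
    n = len(symbols)
    time = dft(symbols, inverse=True)  # subcarriers -> time samples
    tx = time[-cp_len:] + time         # prepend cyclic prefix
    # Linear convolution with the channel impulse response (no noise).
    rx = [
        sum(channel[j] * tx[i - j] for j in range(len(channel)) if i - j >= 0)
        for i in range(len(tx))
    ]
    freq = dft(rx[cp_len:])            # drop prefix, back to subcarriers
    h = dft(list(channel) + [0] * (n - len(channel)))  # channel per subcarrier
    return [y / hk for y, hk in zip(freq, h)]          # one-tap equalizer


qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
recovered = ofdm_roundtrip(qpsk, [1, 0.5], cp_len=2)
print(max(abs(r - s) for r, s in zip(recovered, qpsk)))
```

The cyclic prefix turns the channel's linear convolution into a circular one, so each subcarrier sees a single complex gain and a per-subcarrier division recovers the symbols. In practice the channel must be estimated and noise is present, which is the part learned receivers aim to improve.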
-
Design of remote control software of near infrared Sky Brightness Monitor in Antarctica
Authors:
Zhi-yue Wang,
Ya-qi Chen,
Ming-hao Jia,
Guang-yu Zhang,
Jun Zhang,
Yi-hao Zhang,
Jin-ting Chen,
Hong-fei Zhang,
Peng Jiang,
Tuo Ji,
Jian Wang
Abstract:
The Near-infrared Sky Brightness Monitor (NIRBM) aims to measure the middle infrared sky background in Antarctica. The NIRBM mainly consists of an InGaAs detector, a chopper, a reflector, a cooler and a black body. The reflector can rotate to scan the sky with a field of view ranging from 0° to 180°. Electromechanical control and weak signal readout functions are accomplished by the same circuit, whose core chip is an STM32F407VG microcontroller. Because the Antarctic environment is harsh for humans, a multi-level remote control software system is designed and implemented. A set of EPICS IOCs is developed to control each hardware module independently via serial port communication with the STM32 microcontroller. The Tornado web framework and PyEpics are used in combination: PyEpics monitors or changes the EPICS Process Variables, functioning as a client for the EPICS framework, while Tornado is responsible for the specific operation process of inter-device collaboration and exposes a set of interfaces for users to call. Considering the high delay and low bandwidth of the network environment, the Tornado back-end is designed as a master-and-agent architecture to improve the domestic user experience. The master node is deployed in Antarctica, while multiple agent nodes can be deployed domestically. The master and agent nodes communicate with each other through the WebSocket protocol to exchange the latest information so that bandwidth is saved. The GUI is implemented as a single-page application based on the Vue framework, which communicates with Tornado through WebSocket and AJAX requests. The web page integrates device control, data curve drawing, alarm display, auto observation and other functions.
Submitted 1 March, 2019; v1 submitted 5 June, 2018;
originally announced June 2018.