
Random Time-hopping Secure Ranging Strategy Against Distance-Reduction Attacks in UWB

Authors: Wenlong Gou, Chuanhang Yu, Gang Wu

Abstract: In order to mitigate the distance reduction attack in Ultra-Wide Band (UWB) ranging, this paper proposes a secure ranging scheme based on a random time-hopping mechanism without redundant signaling overhead. Additionally, a secure ranging strategy is designed for backward compatibility with existing standards such as IEEE 802.15.4a/z, combined with an attack detection scheme. The effectiveness and feasibility of the proposed strategy are demonstrated through both simulation and experimental results in the case of the Ghost Peak attack, as demonstrated by Patrick Leu et al. The random time-hopping mechanism is verified to be capable of reducing the success rate of distance reduction attacks to less than 0.01%, thereby significantly enhancing the security of UWB ranging.

Submitted 10 June, 2024; originally announced June 2024.

ACM Class: H.1.1

arXiv:2405.18255 [pdf, other]

Channel Reciprocity Based Attack Detection for Securing UWB Ranging by Autoencoder

Authors: Wenlong Gou, Chuanhang Yu, Juntao Ma, Gang Wu, Vladimir Mordachev

Abstract: A variety of ranging threats represented by the Ghost Peak attack have raised concerns regarding the security performance of Ultra-Wide Band (UWB) systems with the finalization of the IEEE 802.15.4z standard. Based on channel reciprocity, this paper proposes a low-complexity attack detection scheme that compares Channel Impulse Response (CIR) features of both ranging sides utilizing an autoencoder with the capability of data compression and feature extraction. Taking the Ghost Peak attack as an example, this paper demonstrates the effectiveness, feasibility and generalizability of the proposed attack detection scheme through simulation and experimental validation. The proposed scheme achieves an attack detection success rate of over 99% and can be implemented in current systems at low cost.

Submitted 10 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

ACM Class: H.1.1

arXiv:2405.12415 [pdf, other]

Distribution Steering for Discrete-Time Uncertain Ensemble Systems

Authors: Guangyu Wu, Panagiotis Tsiotras, Anders Lindquist

Abstract: Ensemble systems appear frequently in many engineering applications and, as a result, they have become an important research topic in control theory. These systems are best characterized by the evolution of their underlying state distribution. Despite the work to date, few results exist dealing with the problem of directly modifying (i.e., "steering") the distribution of an ensemble system. In addition, in most of the existing results, the distribution of the states of an ensemble of discrete-time systems is assumed to be Gaussian. However, in case the system parameters are uncertain, it is not always realistic to assume that the distribution of the system follows a Gaussian distribution, thus complicating the solution of the overall problem. In this paper, we address the general distribution steering problem for first-order discrete-time ensemble systems, where the distributions of the system parameters and the states are arbitrary with finite first few moments. Both linear and nonlinear system dynamics are considered using the method of power moments to transform the original infinite-dimensional problem into a finite-dimensional one. We also propose a control law for the ensuing moment system, which allows us to obtain the power moments of the desired control inputs. Finally, we solve the inverse problem to obtain the feasible control inputs from their corresponding power moments. We provide numerical results to validate our theoretical developments. These include cases where the parameter distribution is uniform, Gaussian, non-Gaussian, and multi-modal, respectively.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 16 pages, 18 figures

arXiv:2405.10691 [pdf, other]

LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infant brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets with missing time points. This limitation significantly impedes subsequent neuroscience and clinical modeling. Yet, existing deep generative models are facing difficulties in missing brain image completion, due to sparse data and the nonlinear, dramatic contrast/geometric variations in the developing brain. We propose LoCI-DiffCom, a novel Longitudinal Consistency-Informed Diffusion model for infant brain image Completion, which integrates the images from preceding and subsequent time points to guide a diffusion model for generating high-fidelity missing data. Our designed LoCI module can work on highly sparse sequences, relying solely on data from two temporal points. Despite wide separation and diversity between age time points, our approach can extract individualized developmental features while ensuring context-aware consistency. Our experiments on a large infant brain MR dataset demonstrate its effectiveness with consistent performance on missing infant brain MR completion even in big gap scenarios, aiding in better delineation of early developmental trajectories.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2404.19477 [pdf, other]

Hybrid Bit and Semantic Communications

Authors: Kaiwen Yu, Renhe Fan, Gang Wu, Zhijin Qin

Abstract: Semantic communication technology is regarded as a method surpassing the Shannon limit of bit transmission, capable of effectively enhancing transmission efficiency. However, current approaches that directly map content to transmission symbols are challenging to deploy in practice, imposing significant limitations on the development of semantic communication. To address this challenge, we propose a hybrid bit and semantic communication system, named HybridBSC, in which encoded semantic information is inserted into bit information for transmission via conventional digital communication systems utilizing the same spectrum resources. The system can be easily deployed using existing communication architecture to achieve bit and semantic information transmission. Particularly, we design a semantic insertion and extraction scheme to implement this strategy. Furthermore, we conduct experimental validation based on the pluto-based software defined radio (SDR) platform in a real wireless channel, demonstrating that the proposed strategy can simultaneously transmit semantic and bit information.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.00260 [pdf, other]

Exploiting Self-Supervised Constraints in Image Super-Resolution

Authors: Gang Wu, Junjun Jiang, Kui Jiang, Xianming Liu

Abstract: Recent advances in self-supervised learning, predominantly studied in high-level visual tasks, have been explored in low-level image processing. This paper introduces a novel self-supervised constraint for single image super-resolution, termed SSC-SR. SSC-SR uniquely addresses the divergence in image complexity by employing a dual asymmetric paradigm and a target model updated via exponential moving average to enhance stability. The proposed SSC-SR framework works as a plug-and-play paradigm and can be easily applied to existing SR models. Empirical evaluations reveal that our SSC-SR framework delivers substantial enhancements on a variety of benchmark datasets, achieving an average increase of 0.1 dB over EDSR and 0.06 dB over SwinIR. In addition, extensive ablation studies corroborate the effectiveness of each constituent in our SSC-SR framework. Codes are available at https://github.com/Aitical/SSCSR.

Submitted 30 March, 2024; originally announced April 2024.

Comments: ICME 2024
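
The abstract above mentions a target model updated via exponential moving average. As a rough illustration of that generic technique only (not the SSC-SR implementation), the following sketch applies an EMA update to plain lists of floats. The function name `ema_update`, the `decay` parameter, and the toy values are assumptions introduced here for illustration.

```python
def ema_update(target, online, decay=0.99):
    """Return new target parameters as decay * target + (1 - decay) * online.

    `target` and `online` are equal-length lists of floats standing in for
    model weights (a hypothetical simplification of real network parameters).
    """
    return [decay * t + (1.0 - decay) * o for t, o in zip(target, online)]


# Toy usage: the target drifts toward the (fixed) online weights.
target = [0.0, 0.0]
online = [1.0, 2.0]
for _ in range(3):
    target = ema_update(target, online, decay=0.5)
print(target)
```

A decay close to 1 makes the target change slowly, which is the usual reason an EMA copy is used as a stable reference.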

arXiv:2403.19251 [pdf, other]

Arbitrary State Transition of Open Qubit System Based on Switching Control

Authors: Guangpu Wu, Shibei Xue, Shan Ma, Sen Kuang, Daoyi Dong, Ian R. Petersen

Abstract: We present a switching control strategy based on Lyapunov control for arbitrary state transitions in open qubit systems. With coherent vector representation, we propose a switching control strategy, which can prevent the state of the qubit from entering invariant sets and singular value sets, effectively driving the system ultimately to a sufficiently small neighborhood of target states. In comparison to existing works, this control strategy relaxes the strict constraints on system models imposed by special target states. Furthermore, we identify conditions under which the open qubit system achieves finite-time stability (FTS) and finite-time contractive stability (FTCS), respectively. This represents a critical improvement in quantum state transitions, especially considering the asymptotic stability of arbitrary target states is unattainable in open quantum systems. The effectiveness of our proposed method is convincingly demonstrated through its application in a qubit system affected by various types of decoherence, including amplitude, dephasing and polarization decoherence.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 12 pages, 7 figures

arXiv:2401.05633 [pdf, other]

Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach

Authors: Gang Wu, Junjun Jiang, Junpeng Jiang, Xianming Liu

Abstract: Recent progress in single-image super-resolution (SISR) has achieved remarkable performance, yet the computational costs of these methods remain a challenge for deployment on resource-constrained devices. Especially for transformer-based methods, the self-attention mechanism in such models brings great breakthroughs while incurring substantial computational costs. To tackle this issue, we introduce the Convolutional Transformer layer (ConvFormer) and the ConvFormer-based Super-Resolution network (CFSR), which offer an effective and efficient solution for lightweight image super-resolution tasks. In detail, CFSR leverages the large kernel convolution as the feature mixer to replace the self-attention module, efficiently modeling long-range dependencies and extensive receptive fields with a slight computational cost. Furthermore, we propose an edge-preserving feed-forward network, simplified as EFN, to obtain local feature aggregation and simultaneously preserve more high-frequency information. Extensive experiments demonstrate that CFSR can achieve an advanced trade-off between computational cost and performance when compared to existing lightweight SR methods. Compared to state-of-the-art methods, e.g. ShuffleMixer, the proposed CFSR achieves 0.39 dB gains on Urban100 dataset for x2 SR task while containing 26% and 31% fewer parameters and FLOPs, respectively. Code and pre-trained models are available at https://github.com/Aitical/CFSR.

Submitted 10 January, 2024; originally announced January 2024.

Comments: submitting to TIP

arXiv:2311.18073 [pdf, other]

DiffGEPCI: 3D MRI Synthesis from mGRE Signals using 2.5D Diffusion Model

Authors: Yuyang Hu, Satya V. V. N. Kothapalli, Weijie Gan, Alexander L. Sukstanskii, Gregory F. Wu, Manu Goyal, Dmitriy A. Yablonskiy, Ulugbek S. Kamilov

Abstract: We introduce a new framework called DiffGEPCI for cross-modality generation in magnetic resonance imaging (MRI) using a 2.5D conditional diffusion model. DiffGEPCI can synthesize high-quality Fluid Attenuated Inversion Recovery (FLAIR) and Magnetization Prepared-Rapid Gradient Echo (MPRAGE) images, without acquiring corresponding measurements, by leveraging multi-Gradient-Recalled Echo (mGRE) MRI signals as conditional inputs. DiffGEPCI operates in a two-step fashion: it initially estimates a 3D volume slice-by-slice using the axial plane and subsequently applies a refinement algorithm (referred to as 2.5D) to enhance the quality of the coronal and sagittal planes. Experimental validation on real mGRE data shows that DiffGEPCI achieves excellent performance, surpassing generative adversarial networks (GANs) and traditional diffusion models.

Submitted 18 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.13254 [pdf, other]

DA-STC: Domain Adaptive Video Semantic Segmentation via Spatio-Temporal Consistency

Authors: Zhe Zhang, Gaochang Wu, Jing Zhang, Chunhua Shen, Dacheng Tao, Tianyou Chai

Abstract: Video semantic segmentation is a pivotal aspect of video representation learning. However, significant domain shifts present a challenge in effectively learning invariant spatio-temporal features across the labeled source domain and unlabeled target domain for video semantic segmentation. To solve the challenge, we propose a novel DA-STC method for domain adaptive video semantic segmentation, which incorporates a bidirectional multi-level spatio-temporal fusion module and a category-aware spatio-temporal feature alignment module to facilitate consistent learning for domain-invariant features. Firstly, we perform bidirectional spatio-temporal fusion at the image sequence level and shallow feature level, leading to the construction of two fused intermediate video domains. This prompts the video semantic segmentation model to consistently learn spatio-temporal features of shared patch sequences which are influenced by domain-specific contexts, thereby mitigating the feature gap between the source and target domain. Secondly, we propose a category-aware feature alignment module to promote the consistency of spatio-temporal features, facilitating adaptation to the target domain. Specifically, we adaptively aggregate the domain-specific deep features of each category along spatio-temporal dimensions, which are further constrained to achieve cross-domain intra-class feature alignment and inter-class feature separation. Extensive experiments demonstrate the effectiveness of our method, which achieves state-of-the-art mIOUs on multiple challenging benchmarks. Furthermore, we extend the proposed DA-STC to the image domain, where it also exhibits superior performance for domain adaptive semantic segmentation. The source code and models will be made available at https://github.com/ZHE-SAPI/DA-STC.

Submitted 22 November, 2023; originally announced November 2023.

Comments: 18 pages, 9 figures

arXiv:2311.04389 [pdf, other]

Structural Balance of Complex Weighted Graphs and Multi-partite Consensus

Authors: Honghui Wu, Ahmet Taha Koru, Guanxuan Wu, Frank L. Lewis, Hai Lin

Abstract: The structural balance of a signed graph is known to be necessary and sufficient to obtain a bipartite consensus among agents with friend-foe relationships. In the real world, relationships are multifarious, and the coexistence of different opinions is ubiquitous. We are therefore motivated to study the multi-partite consensus problem of multi-agent systems, for which we extend the concept of structural balance to graphs with complex edge weights. It is shown that the generalized structural balance property is necessary and sufficient for achieving multi-partite consensus.

Submitted 7 November, 2023; originally announced November 2023.
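
The abstract above starts from the classical notion of structural balance for signed graphs. As a minimal sketch of that classical case only (real signs of +1 and -1, not the paper's generalization to complex edge weights), the check below tries to split the nodes into two camps so that positive edges stay within a camp and negative edges cross camps. The function name `is_structurally_balanced` and the edge-list format are assumptions made for this illustration.

```python
from collections import deque


def is_structurally_balanced(n, edges):
    """Check classical structural balance of an undirected signed graph.

    n     : number of nodes, labeled 0..n-1
    edges : list of (u, v, sign) with sign in {+1, -1}
    Returns (True, camp) with camp[i] in {+1, -1} if balanced,
    otherwise (False, None).
    """
    adj = [[] for _ in range(n)]
    for u, v, s in edges:
        adj[u].append((v, s))
        adj[v].append((u, s))

    camp = [0] * n  # 0 means "not yet assigned"
    for start in range(n):
        if camp[start] != 0:
            continue
        camp[start] = 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, s in adj[u]:
                expected = camp[u] * s  # friend: same camp, foe: opposite
                if camp[v] == 0:
                    camp[v] = expected
                    queue.append(v)
                elif camp[v] != expected:
                    return False, None  # a cycle with an odd number of -1 edges
    return True, camp


# Two friends (0, 1) who are both foes of node 2: balanced.
print(is_structurally_balanced(3, [(0, 1, 1), (1, 2, -1), (0, 2, -1)]))
# Three mutual foes: not balanced.
print(is_structurally_balanced(3, [(0, 1, -1), (1, 2, -1), (0, 2, -1)]))
```

The breadth-first labeling runs in time linear in the number of nodes and edges; it fails exactly when some cycle contains an odd number of negative edges.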

arXiv:2310.03297 [pdf, other]

Passive Respiration Detection via mmWave Communication Signal Under Interference

Authors: Kehan Wu, Renqi Chen, Haiyu Wang, Chenqing Ji, Jiayuan Zhu, Guang Wu

Abstract: Recent research has highlighted the detection of human respiration rate using commodity WiFi devices. Nevertheless, these devices encounter challenges in accurately discerning human respiration amidst the prevailing human motion interference encountered in daily life. To tackle this predicament, this paper introduces a passive sensing and communication system designed specifically for respiration detection in the presence of robust human motion interference. Operating within the 60.48 GHz band, the proposed system aims to detect human respiration even when confronted with substantial human motion interference within close proximity. Subsequently, a neural network is trained using the data we collected to enable human respiration detection. The experimental results demonstrate a consistently high accuracy rate of over 90% for human respiration detection under interference, given an adequate sensing duration. Finally, an empirical model is derived analytically to achieve respiratory rate counting in 10 seconds.

Submitted 4 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to WCNC2024 Workshop

arXiv:2309.05115 [pdf, other]

Real-time Learning of Driving Gap Preference for Personalized Adaptive Cruise Control

Authors: Zhouqiao Zhao, Xishun Liao, Amr Abdelraouf, Kyungtae Han, Rohit Gupta, Matthew J. Barth, Guoyuan Wu

Abstract: Advanced Driver Assistance Systems (ADAS) are increasingly important in improving driving safety and comfort, with Adaptive Cruise Control (ACC) being one of the most widely used. However, pre-defined ACC settings may not always align with the driver's preferences and habits, leading to discomfort and potential safety issues. Personalized ACC (P-ACC) has been proposed to address this problem, but most existing research uses historical driving data to imitate behaviors that conform to driver preferences, neglecting real-time driver feedback. To bridge this gap, we propose a cloud-vehicle collaborative P-ACC framework that incorporates driver feedback adaptation in real time. The framework is divided into offline and online parts. The offline component records the driver's naturalistic car-following trajectory and uses inverse reinforcement learning (IRL) to train the model on the cloud. In the online component, driver feedback is used to update the driving gap preference in real time. The model is then retrained on the cloud with the driver's takeover trajectories, achieving incremental learning to better match the driver's preference. Human-in-the-loop (HuiL) simulation experiments demonstrate that our proposed method significantly reduces driver intervention in automatic control systems by up to 62.8%. By incorporating real-time driver feedback, our approach enhances the comfort and safety of P-ACC, providing a personalized and adaptable driving experience.

Submitted 10 September, 2023; originally announced September 2023.

arXiv:2308.14315 [pdf, other]

General Discrete-Time Fokker-Planck Control by Power Moments

Authors: Guangyu Wu, Anders Lindquist

Abstract: In this paper, we address the so-called general Fokker-Planck control problem for discrete-time first-order linear systems. Unlike conventional treatments, we don't assume the distributions of the system states to be Gaussian. Instead, we only assume the existence and finiteness of the first several order power moments of the distributions. It is proved in the literature that there doesn't exist a solution, which has a form of conventional feedback control, to this problem. We propose a moment representation of the system to turn the original problem into a finite-dimensional one. Then a novel feedback control term, which is a mixture of a feedback term and a Markovian transition kernel term, is proposed to serve as the control input of the moment system. The states of the moment system are obtained by maximizing the smoothness of the state transition. The power moments of the transition kernels are obtained by a convex optimization problem, of which the solution is proved to exist and be unique. Then they are mapped back to the probability distributions. The control inputs to the original system are then obtained by sampling from the realized distributions. Simulation results are provided to validate our algorithm in treating the general discrete-time Fokker-Planck control problem.

Submitted 28 August, 2023; originally announced August 2023.

Comments: 10 pages, 10 figures

arXiv:2308.06749 [pdf, other]

doi 10.1145/3581783.3611933

FastLLVE: Real-Time Low-Light Video Enhancement with Intensity-Aware Lookup Table

Authors: Wenhao Li, Guangyang Wu, Wenyi Wang, Peiran Ren, Xiaohong Liu

Abstract: Low-Light Video Enhancement (LLVE) has received considerable attention in recent years. One of the critical requirements of LLVE is inter-frame brightness consistency, which is essential for maintaining the temporal coherence of the enhanced video. However, most existing single-image-based methods fail to address this issue, resulting in a flickering effect that degrades the overall quality after enhancement. Moreover, 3D Convolution Neural Network (CNN)-based methods, which are designed for video to maintain inter-frame consistency, are computationally expensive, making them impractical for real-time applications. To address these issues, we propose an efficient pipeline named FastLLVE that leverages the Look-Up-Table (LUT) technique to maintain inter-frame brightness consistency effectively. Specifically, we design a learnable Intensity-Aware LUT (IA-LUT) module for adaptive enhancement, which addresses the low-dynamic problem in low-light scenarios. This enables FastLLVE to perform low-latency and low-complexity enhancement operations while maintaining high-quality results. Experimental results on benchmark datasets demonstrate that our method achieves the State-Of-The-Art (SOTA) performance in terms of both image quality and inter-frame brightness consistency. More importantly, our FastLLVE can process 1,080p videos at 50+ Frames Per Second (FPS), which is 2× faster than SOTA CNN-based methods in inference time, making it a promising solution for real-time applications. The code is available at https://github.com/Wenhao-Li-777/FastLLVE.

Submitted 13 August, 2023; originally announced August 2023.

Comments: 11 pages, 9 figures, and 6 tables. Accepted by ACMMM 2023

ACM Class: I.4.3

arXiv:2306.15753 [pdf]

Integrated Simulation Platform for Quantifying the Traffic-Induced Environmental and Health Impacts

Authors: Xuanpeng Zhao, Guoyuan Wu, Akula Venkatram, Ji Luo, Peng Hao, Kanok Boriboonsomsin, Shaohua Hu

Abstract: Air quality and human exposure to mobile source pollutants have become major concerns in urban transportation. Existing studies mainly focus on mitigating traffic congestion and reducing carbon footprints, with limited understanding of traffic-related health impacts from the environmental justice perspective. To address this gap, we present an innovative integrated simulation platform that models traffic-related air quality and human exposure at the microscopic level. The platform consists of five modules: SUMO for traffic modeling, MOVES for emissions modeling, a 3D grid-based dispersion model, a Matlab-based concentration visualizer, and a human exposure model. Our case study on multi-modal mobility on-demand services demonstrates that a distributed pickup strategy can reduce human cancer risk associated with PM2.5 by 33.4% compared to centralized pickup. Our platform offers quantitative results of traffic-related air quality and health impacts, useful for evaluating environmental issues and improving transportation systems management and operations strategies.

Submitted 13 June, 2023; originally announced June 2023.

Comments: 35 pages, 11 figures

arXiv:2306.15303 [pdf, other]

Energy-Efficient MIMO Integrated Sensing and Communications with On-off Non-transmission Power

Authors: Guanlin Wu, Yuan Fang, Jie Xu, Zhiyong Feng, Shuguang Cui

Abstract: This paper investigates the energy efficiency of a multiple-input multiple-output (MIMO) integrated sensing and communications (ISAC) system, in which one multi-antenna base station (BS) transmits unified ISAC signals to a multi-antenna communication user (CU) and at the same time uses the echo signals to estimate an extended target. We focus on one particular ISAC transmission block and take into account the practical on-off non-transmission power at the BS. Under this setup, we minimize the energy consumption at the BS while ensuring a minimum average data rate requirement for communication and a maximum Cramér-Rao bound (CRB) requirement for target estimation, by jointly optimizing the transmit covariance matrix and the "on" duration for active transmission. We obtain the optimal solution to the rate-and-CRB-constrained energy minimization problem in a semi-closed form. Interestingly, the obtained optimal solution is shown to unify the spectrum-efficient and energy-efficient communications and sensing designs. In particular, for the special MIMO sensing case with the rate constraint inactive, the optimal solution follows the isotropic transmission with the shortest "on" duration, in which the BS radiates the required sensing energy by using sufficiently high power over the shortest duration. For the general ISAC case, the optimal transmit covariance solution is of full rank and follows the eigenmode transmission based on the communication channel, while the optimal "on" duration is determined based on both the rate and CRB constraints. Numerical results show that the proposed ISAC design achieves significantly reduced energy consumption as compared to the benchmark schemes based on isotropic transmission, always-on transmission, and sensing or communications only designs, especially when the rate and CRB constraints become stringent.

Submitted 27 June, 2023; originally announced June 2023.

Comments: 13 pages, 9 figures

arXiv:2306.09736 [pdf]

Overtaking-enabled Eco-approach Control at Signalized Intersections for Connected and Automated Vehicles

Authors: Haoxuan Dong, Weichao Zhuang, Guoyuan Wu, Zhaojian Li, Guodong Yin, Ziyou Song

Abstract: Preceding vehicles typically dominate the movement of following vehicles in traffic systems, thereby significantly influencing the efficacy of eco-driving control that concentrates on vehicle speed optimization. To potentially mitigate the negative effect of preceding vehicles on eco-driving control at the signalized intersection, this paper proposes an overtaking-enabled eco-approach control (OEAC) strategy. It combines driving lane planning and speed optimization for connected and automated vehicles to relax the first-in-first-out queuing policy at the signalized intersection, minimizing the target vehicle's energy consumption and travel delay. The OEAC adopts a receding horizon two-stage control framework to derive optimal driving trajectories for adapting to dynamic traffic conditions. In the first stage, the driving lane optimization problem is formulated as a Markov decision process and solved using dynamic programming, which takes into account the uncertain disturbance from preceding vehicles. In the second stage, the vehicle's speed trajectory with the minimal driving cost is optimized rapidly using Pontryagin's minimum principle to obtain the closed-form analytical optimal solution. Extensive simulations are conducted to evaluate the effectiveness of the OEAC. The results show that the OEAC is excellent in driving cost reduction over constant speed and regular eco-approach and departure strategies in various traffic scenarios, with an average improvement of 20.91% and 5.62%, respectively.

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.08903 [pdf, other]

Two-Way Semantic Transmission of Images without Feedback

Authors: Kaiwen Yu, Qi He, Gang Wu

Abstract: As a competitive technology for 6G, semantic communications can significantly improve transmission efficiency. However, many existing semantic communication systems require information feedback during the training coding process, resulting in a significant communication overhead. In this article, we consider a two-way semantic communication (TW-SC) system, where information feedback can be omitted by exploiting the weight reciprocity in the transceiver. Particularly, the channel simulator and semantic transceiver are implemented on both TW-SC nodes and the channel distribution is modeled by a conditional generative adversarial network. Simulation results demonstrate that the proposed TW-SC system performs close to the state-of-the-art one-way semantic communication systems while requiring no feedback between the transceivers in the training process.

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2305.18096 [pdf, other]

Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target

Authors: Guan-Wei Wu, Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee

Abstract: Spoken Language Understanding (SLU) is a task that aims to extract semantic information from spoken utterances. Previous research has made progress in end-to-end SLU by using paired speech-text data, such as pre-trained Automatic Speech Recognition (ASR) models or paired text as intermediate targets. However, acquiring paired transcripts is expensive and impractical for unwritten languages. On the other hand, Textless SLU extracts semantic information from speech without utilizing paired transcripts. However, the absence of intermediate targets and training guidance for textless SLU often results in suboptimal performance. In this work, inspired by the content-disentangled discrete units from self-supervised speech models, we propose to use discrete units as intermediate guidance to improve textless SLU performance. Our method surpasses the baseline method on five SLU benchmark corpora. Additionally, we find that unit guidance facilitates few-shot learning and enhances the model's ability to handle noise.

Submitted 8 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: Accepted by Interspeech 2023. *Equal contribution

arXiv:2305.12701 [pdf, other]

More Perspectives Mean Better: Underwater Target Recognition and Localization with Multimodal Data via Symbiotic Transformer and Multiview Regression

Authors: Shipei Liu, Xiaoya Fan, Guowei Wu

Abstract: Underwater acoustic target recognition (UATR) and localization (UATL) play important roles in marine exploration. The highly noisy acoustic signal and time-frequency interference among various sources pose big challenges to this task. To tackle these issues, we propose a multimodal approach to extract and fuse audio-visual-textual information to recognize and localize underwater targets through the designed Symbiotic Transformer (Symb-Trans) and Multi-View Regression (MVR) method. The multimodal data were first preprocessed by a custom-designed HetNorm module to normalize the multi-source data in a common feature space. The Symb-Trans module embeds audiovisual features by co-training the preprocessed multimodal features through parallel branches and a content encoder with cross-attention. The audiovisual features are then used for underwater target recognition. Meanwhile, the text embedding combined with the audiovisual features is fed to an MVR module to predict the localization of the underwater targets through multi-view clustering and multiple regression. Since no off-the-shelf multimodal dataset is available for UATR and UATL, we combined multiple public datasets, consisting of acoustic, and/or visual, and/or textual data, to obtain audio-visual-textual triplets for model training and validation. Experiments show that our model outperforms comparative methods in 91.7% (11 out of 12 metrics) and 100% (4 metrics) of the quantitative metrics for the recognition and localization tasks, respectively. In a case study, we demonstrate the advantages of multi-view models in establishing sample discriminability through visualization methods. For UATL, the proposed MVR method produces the relation graphs, which allow predictions based on records of underwater targets with similar conditions.

Submitted 22 May, 2023; originally announced May 2023.

arXiv:2303.12479

Distributed Two-tier DRL Framework for Cell-Free Network: Association, Beamforming and Power Allocation

Authors: Kaiwen Yu, Chonghao Zhao, Gang Wu, Geoffrey Ye Li

Abstract: Intelligent wireless networks have long been expected to have self-configuration and self-optimization capabilities to adapt to various environments and demands. In this paper, we develop a novel distributed hierarchical deep reinforcement learning (DHDRL) framework with two-tier control networks in different timescales to optimize the long-term spectrum efficiency (SE) of the downlink cell-free multiple-input single-output (MISO) network, consisting of multiple distributed access points (AP) and user terminals (UT). To realize the proposed two-tier control strategy, we decompose the optimization problem into two sub-problems, AP-UT association (AUA) as well as beamforming and power allocation (BPA), resulting in a Markov decision process (MDP) and Partially Observable MDP (POMDP). The proposed method consists of two neural networks. At the system level, a distributed high-level neural network is introduced to optimize wireless network structure on a large timescale. At the link level, a distributed low-level neural network is proposed to mitigate inter-AP interference and improve the transmission performance on a small timescale. Numerical results show that our method is effective for high-dimensional problems, in terms of spectrum efficiency, signaling overhead as well as satisfaction probability, and generalizes well to diverse multi-object problems.

Submitted 5 December, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: The paper has some updated

arXiv:2303.11084 [pdf, other]

Quantitative Error Analyses of Spectral Density Estimators Using Covariance Lags

Authors: Guangyu Wu, Anders Lindquist

Abstract: Spectral density estimation is a core problem of system identification, which is an important research area of system control and signal processing. There have been numerous results on the design of spectral density estimators. However, to the best of our knowledge, quantitative error analyses of the spectral density estimation have not been proposed yet. In real practice, there are two main factors which induce errors in the spectral density estimation: the external additive noise and the limited number of samples. In this paper, which is a very preliminary version, we first consider a univariate spectral density estimator using covariance lags. The estimation task is performed by a convex optimization scheme, and the covariance lags of the estimated spectral density are exactly as desired, which makes quantitative error analyses, such as deriving tight error upper bounds, possible. We analyze the errors induced by the two factors and propose upper and lower bounds for the errors. Then the results of the univariate spectral estimator are generalized to the multivariate one.

Submitted 20 March, 2023; originally announced March 2023.

arXiv:2303.05680 [pdf, other]

Power Adaptation for Suborbital Downlink with Stochastic Satellites Interference

Authors: Yihao He, Juntao Ma, Zhendong Peng, Gang Wu

Abstract: This paper investigates downlink power adaptation for the suborbital node in suborbital-ground communication systems, which are subject to extremely high reliability and ultra-low latency communications requirements. The problem is formulated as a power threshold-minimization problem, where interference from satellites is modeled as an accumulation of stochastic point processes on different orbit planes, and hybrid beamforming (HBF) is considered. To ensure Quality of Service (QoS) constraints, the finite blocklength regime is adopted. Numerical results show that the required transmit power of the suborbital node decreases as the elevation angle at the receiving station increases.

Submitted 2 May, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

arXiv:2302.10391 [pdf, other]

doi 10.3390/drones7050318

Low-Complexity Three-Dimensional AOA-Cross Geometric Center Localization Methods via Multi-UAV network

Authors: Baihua Shi, Yifan Li, Guilu Wu, Shihao Yan, Feng Shu

Abstract: Angle of arrival (AOA) is widely used to locate a wireless signal emitter in unmanned aerial vehicle (UAV) localization. Compared with received signal strength (RSS) and time of arrival (TOA), it has higher accuracy and is not sensitive to time synchronization of the distributed sensors. However, few works focus on the three-dimensional (3-D) scenario. Furthermore, although the maximum likelihood estimator (MLE) has a relatively high performance, its computational complexity is very high, which makes it hard to employ in practical applications. This paper proposes two multiplane geometric center based methods for 3-D AOA in UAV positioning. The first method can estimate the source position and angle measurement noise at the same time by seeking the center of the inscribed sphere, called CIS. First, every sensor measures two angles, the azimuth angle and the elevation angle. Based on these, two planes are constructed. Then, the estimated values of source position and angle noise are obtained by seeking the center and radius of the corresponding inscribed sphere. Removing the estimation of the radius yields the second algorithm, called MSD-LS. It is not able to estimate angle noise but has lower computational complexity. Theoretical analysis and simulation results show that the proposed methods can approach the Cramér-Rao lower bound (CRLB) and have lower complexity than MLE.

Submitted 16 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: 7 pages, 5 figures

Journal ref: Drones 2023, 7(5), 318

arXiv:2301.06227 [pdf, other]

General Distribution Steering: A Sub-Optimal Solution by Convex Optimization

Authors: Guangyu Wu, Anders Lindquist

Abstract: General distribution steering is intrinsically an infinite-dimensional problem, where the distributions to steer are arbitrary. In the literature, the distribution steering problem governed by system dynamics is usually treated by assuming the distribution of the system states to be Gaussian. In our previous paper, we considered the distribution steering problem where the initial and terminal distributions are arbitrary (only required to have first several orders of power moments), and proposed to use the moments to turn this problem into a finite-dimensional one. We put forward a moment representation of the primal system for control. However, the control law in that paper was an empirical one without optimization towards a design criterion, which doesn't always ensure a most satisfactory solution. In this paper, which is a very preliminary version, we propose a convex optimization approach to the general distribution steering problem of the first-order discrete-time linear system, i.e., an optimal control law for the corresponding moment system. The optimal control inputs of the moment system are obtained by convex optimization, of which the convexity of the domain is proved. An algorithm of distribution steering is then put forward by extending a realization scheme of control inputs using the Kullback-Leibler distance to one realization using the squared Hellinger distance, of which the performance has been shown to be better than the former one. Experiments on different types of cost functions are given to validate the performance of our proposed algorithm. Since the moment system is a dimension-reduced counterpart of the primal system, and we are not optimizing the cost function over all feasible control inputs, we call this solution a sub-optimal one to the primal general distribution steering problem.

Submitted 15 January, 2023; originally announced January 2023.

Comments: 16 pages, 23 figures. arXiv admin note: text overlap with arXiv:2211.13370

arXiv:2211.13374 [pdf, other]

A Multivariate Non-Gaussian Bayesian Filter Using Power Moments

Authors: Guangyu Wu, Anders Lindquist

Abstract: In this paper, we extend our results on the univariate non-Gaussian Bayesian filter using power moments to the multivariate systems, which can be either linear or nonlinear. Doing this introduces several challenging problems, for example a positive parametrization of the density surrogate, which is not only a problem of filter design, but also one of the multiple dimensional Hamburger moment problem. We propose a parametrization of the density surrogate with the proofs to its existence, Positivstellensatz and uniqueness. Based on it, we analyze the errors of moments of the density estimates by the proposed density surrogate. A discussion on continuous and discrete treatments to the non-Gaussian Bayesian filtering problem is proposed to motivate the research on continuous parametrization of the system state. Simulation results on estimating different types of multivariate density functions are given to validate our proposed filter. To the best of our knowledge, the proposed filter is the first one implementing the multivariate Bayesian filter with the system state parameterized as a continuous function, which only requires the true states being Lebesgue integrable.

Submitted 9 November, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

Comments: 16 pages, 4 figures. arXiv admin note: text overlap with arXiv:2207.08519

arXiv:2211.02322 [pdf, other]

Density Steering by Power Moments

Authors: Guangyu Wu, Anders Lindquist

Abstract: This paper considers the problem of steering an arbitrary initial probability density function to an arbitrary terminal one, where the system dynamics is governed by a first-order linear stochastic difference equation. It is a generalization of the conventional stochastic control problem where the uncertainty of the system state is usually characterized by a Gaussian distribution. We propose to use the power moments to turn the infinite-dimensional problem into a finite-dimensional one and to present an empirical control scheme. By the designed control law, the moment sequence of the controls at each time step is positive, which ensures the existence of the control for the moment system. We then realize the control at each time step as a function in analytic form by a convex optimization scheme, for which the existence and uniqueness of the solution have been proved in our previous paper. Two numerical examples are given to validate our proposed algorithm.

Submitted 4 July, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

Comments: 6 pages, 6 figures

arXiv:2211.01294 [pdf]

Driver Digital Twin for Online Prediction of Personalized Lane Change Behavior

Authors: Xishun Liao, Xuanpeng Zhao, Ziran Wang, Zhouqiao Zhao, Kyungtae Han, Rohit Gupta, Matthew J. Barth, Guoyuan Wu

Abstract: Connected and automated vehicles (CAVs) are supposed to share the road with human-driven vehicles (HDVs) in the foreseeable future. Therefore, considering the mixed traffic environment is more pragmatic, as the well-planned operation of CAVs may be interrupted by HDVs. In the circumstance that human behaviors have significant impacts, CAVs need to understand HDV behaviors to make safe actions. In this study, we develop a Driver Digital Twin (DDT) for the online prediction of personalized lane change behavior, allowing CAVs to predict surrounding vehicles' behaviors with the help of the digital twin technology. DDT is deployed on a vehicle-edge-cloud architecture, where the cloud server models the driver behavior for each HDV based on the historical naturalistic driving data, while the edge server processes the real-time data from each driver with his/her digital twin on the cloud to predict the lane change maneuver. The proposed system is first evaluated on a human-in-the-loop co-simulation platform, and then in a field implementation with three passenger vehicles connected through the 4G/LTE cellular network. The lane change intention can be recognized in 6 seconds on average before the vehicle crosses the lane separation line, and the Mean Euclidean Distance between the predicted trajectory and GPS ground truth is 1.03 meters within a 4-second prediction window. Compared to the general model, using a personalized model can improve prediction accuracy by 27.8%. The demonstration video of the proposed system can be watched at https://youtu.be/5cbsabgIOdM.

Submitted 2 November, 2022; originally announced November 2022.

arXiv:2208.07069 [pdf, ps, other]

Channel Estimation for RIS-Aided Multi-User mmWave Systems with Uniform Planar Arrays

Authors: Zhendong Peng, Gui Zhou, Cunhua Pan, Hong Ren, A. Lee Swindlehurst, Petar Popovski, Gang Wu

Abstract: In this paper, we adopt a three-stage based uplink channel estimation protocol with reduced pilot overhead for a reconfigurable intelligent surface (RIS)-aided multi-user (MU) millimeter wave (mmWave) communication system, in which both the base station (BS) and the RIS are equipped with a uniform planar array (UPA). Specifically, in Stage I, the channel state information (CSI) of a typical user is estimated. To address the power leakage issue for the common angles-of-arrival (AoAs) estimation in this stage, we develop a low-complexity one-dimensional search method. In Stage II, a re-parameterized common BS-RIS channel is constructed with the estimated information from Stage I to estimate other users' CSI. In Stage III, only the rapidly varying channel gains need to be re-estimated. Furthermore, the proposed method can be extended to multi-antenna UPA-type users, by decomposing the estimation of a multi-antenna channel with $J$ scatterers into estimating $J$ single-scatterer channels for a virtual single-antenna user. An orthogonal matching pursuit (OMP)-based method is proposed to estimate the angles-of-departure (AoDs) at the users. Simulation results demonstrate that the proposed algorithm achieves high channel estimation accuracy, which approaches the genie-aided upper bound in the high SNR regime.

Submitted 15 April, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

Comments: 20 pages, 7 figures, 4 Appendices

arXiv:2208.02447 [pdf, other]

DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV

Authors: Xiao Mao, Zhiguang Cao, Mingfeng Fan, Guohua Wu, Witold Pedrycz

Abstract: Exploiting unmanned aerial vehicles (UAVs) to execute tasks is gaining growing popularity recently. To solve the underlying task scheduling problem, the deep reinforcement learning (DRL) based methods demonstrate notable advantage over the conventional heuristics as they rely less on hand-engineered rules. However, their decision space will become prohibitively huge as the problem scales up, thus deteriorating the computation efficiency. To alleviate this issue, we propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide and conquer framework (DCF), where we decompose the task scheduling of multi-UAV into task allocation and route planning. Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs, and we exploit another attention based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks given the maximum flight distance of the UAV. To effectively train the two models, we design an interactive training strategy (ITS), which includes pre-training, intensive training and alternate training. Experimental results show that our DL-DRL performs favorably against the learning-based and conventional baselines including the OR-Tools, in terms of solution quality and computation efficiency. We also verify the generalization performance of our approach by applying it to larger sizes of up to 1000 tasks. Moreover, we also show via an ablation study that our ITS can help achieve a balance between the performance and training efficiency.

Submitted 6 June, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

Comments: 13 pages, 7 figures

arXiv:2207.03052 [pdf, other]

Energy-Efficient Transmit Beamforming and Antenna Selection with Non-Linear PA Efficiency

Authors: Yuan Fang, Yi Huang, Chuan Ma, Yinghao **, Gaoyuan Cheng, Guanlin Wu, Jie Xu

Abstract: This letter studies the energy-efficient design in a downlink multi-antenna multi-user system consisting of a multi-antenna base station (BS) and multiple single-antenna users, by considering the practical non-linear power amplifier (PA) efficiency and the on-off power consumption of radio frequency (RF) chain at each transmit antenna. Under this setup, we jointly optimize the transmit beamforming and antenna on/off selection at the BS to minimize its total power consumption while ensuring the individual signal-to-interference-plus-noise ratio (SINR) constraints at the users. However, due to the non-linear PA efficiency and the on-off RF chain power consumption, the formulated SINR-constrained power minimization problem is highly non-convex and difficult to solve. To tackle this issue, we propose an efficient algorithm to obtain a high-quality solution based on the technique of sequential convex approximation (SCA). We provide numerical results to validate the performance of our proposed design. It is shown that at the optimized solution, the BS tends to activate fewer antennas and use higher power transmission at each antenna to exploit the non-linear PA efficiency.

Submitted 6 July, 2022; originally announced July 2022.

Comments: 5 pages, 5 figures, submitted to WCL

arXiv:2206.13774 [pdf, other]

Assessment of U.S. Department of Transportation Lane-Level Map for Connected Vehicle Applications

Authors: Wang Hu, David Oswald, Guoyuan Wu, Jay A. Farrell

Abstract: High-definition (Hi-Def) digital maps are an indispensable automated driving technology that is developing rapidly. There are various commercial or governmental map products in the market. It is notable that the U.S. Department of Transportation (USDOT) map tool allows the user to create MAP and Signal Phase and Timing (SPaT) messages with free access. However, an analysis of the accuracy of this map tool is currently lacking in the literature. This paper provides such an analysis. The analysis manually selects 39 feature points within about 200 meters of the verified point and 55 feature points over longer distances from the verified point. All feature locations are surveyed using GNSS and mapped using the USDOT tool. Different error sources are evaluated to allow assessment of the USDOT map accuracy. In this investigation, the USDOT map tool is demonstrated to achieve 17 centimeters horizontal accuracy, which meets the lane-level map requirement. The maximum horizontal map error is less than 30 centimeters.

Submitted 28 June, 2022; originally announced June 2022.

Comments: 6 pages, 6 figures

arXiv:2206.09736 [pdf, other]

Geo-NI: Geometry-aware Neural Interpolation for Light Field Rendering

Authors: Gaochang Wu, Yuemei Zhou, Yebin Liu, Lu Fang, Tianyou Chai

Abstract: In this paper, we present a Geometry-aware Neural Interpolation (Geo-NI) framework for light field rendering. Previous learning-based approaches either rely on the capability of neural networks to perform direct interpolation, which we dub Neural Interpolation (NI), or explore scene geometry for novel view synthesis, also known as Depth Image-Based Rendering (DIBR). Instead, we incorporate the ideas behind these two kinds of approaches by launching the NI with a novel DIBR pipeline. Specifically, the proposed Geo-NI first performs NI using input light field sheared by a set of depth hypotheses. Then the DIBR is implemented by assigning the sheared light fields with a novel reconstruction cost volume according to the reconstruction quality under different depth hypotheses. The reconstruction cost is interpreted as a blending weight to render the final output light field by blending the reconstructed light fields along the dimension of depth hypothesis. By combining the superiorities of NI and DIBR, the proposed Geo-NI is able to render views with large disparity with the help of scene geometry while also reconstructing non-Lambertian effects when depth is prone to be ambiguous. Extensive experiments on various datasets demonstrate the superior performance of the proposed geometry-aware light field rendering framework.

Submitted 20 June, 2022; originally announced June 2022.

Comments: 13 pages, 8 figures, 4 tables

arXiv:2205.08579 [pdf, other]

The Power of Fragmentation: A Hierarchical Transformer Model for Structural Segmentation in Symbolic Music Generation

Authors: Guowei Wu, Shipei Liu, Xiaoya Fan

Abstract: Symbolic Music Generation relies on the contextual representation capabilities of the generative model, where the most prevalent approach is the Transformer-based model. The learning of musical context is also related to the structural elements in music, i.e. intro, verse, and chorus, which are currently overlooked by the research community. In this paper, we propose a hierarchical Transformer model to learn multi-scale contexts in music. In the encoding phase, we first designed a Fragment Scope Localization layer to syncopate the music into chords and sections. Then, we use a multi-scale attention mechanism to learn note-, chord-, and section-level contexts. In the decoding phase, we proposed a hierarchical Transformer model that uses fine-decoders to generate sections in parallel and a coarse-decoder to decode the combined music. We also designed a Music Style Normalization layer to achieve a consistent music style between the generated sections. Our model is evaluated on two open MIDI datasets, and experiments show that our model outperforms the best contemporary music generative models. More excitingly, the visual evaluation shows that our model is superior in melody reuse, resulting in more realistic music.

Submitted 10 July, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

arXiv:2205.05675 [pdf, other]

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

Authors: Yawei Li, Kai Zhang, Radu Timofte, Luc Van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, et al. (86 additional authors not shown)

Abstract: This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00dB on DIV2K validation set. IMDN is set as the baseline for efficiency measurement. The challenge had 3 tracks including the main track (runtime), sub-track one (model complexity), and sub-track two (overall performance). In the main track, the practical runtime performance of the submissions was evaluated. The ranking of the teams was determined directly by the absolute value of the average runtime on the validation set and test set. In sub-track one, the number of parameters and FLOPs were considered, and the individual rankings of the two metrics were summed up to determine a final ranking in this track. In sub-track two, all of the five metrics mentioned in the description of the challenge including runtime, parameter count, FLOPs, activations, and memory consumption were considered. Similar to sub-track one, the rankings of five metrics were summed up to determine a final ranking. The challenge had 303 registered participants, and 43 teams made valid submissions. They gauge the state-of-the-art in efficient single image super-resolution.

Submitted 11 May, 2022; originally announced May 2022.

Comments: Validation code of the baseline model is available at https://github.com/ofsoundof/IMDN. Validation of all submitted models is available at https://github.com/ofsoundof/NTIRE2022_ESR

arXiv:2205.03225 [pdf, other]

doi 10.1364/OE.460704

Multiple-access relay stations for long-haul fiber-optic radio frequency transfer

Authors: Qi Li, Liang Hu, Jinbo Zhang, Jianping Chen, Guiling Wu

Abstract: We report on the realization of a long-haul radio frequency (RF) transfer scheme by using multiple-access relay stations (MARSs). The proposed scheme with independent link noise compensation for each fiber sub-link effectively solves the limitation of compensation bandwidth for long-haul transfer. The MARS can have the capability to share the same modulated optical signal for the front and rear fiber sub-links, simplifying the configuration at the repeater station and enabling the transfer system to have the multiple-access capability. At the same time, we for the first time theoretically model the effect of the MARS position on the fractional frequency instability of the fiber-optic RF transfer, demonstrating that the MARS position has little effect on system's performance when the ratio of the front and rear fiber sub-links is around $1:1$. We experimentally demonstrate a 1 GHz signal transfer by using one MARS connecting 260 and 280 km fiber links with the fractional frequency instabilities of less than $5.9\times10^{-14}$ at 1 s and $8.5\times10^{-17}$ at 10,000 s at the remote site and of $5.6\times10^{-14}$ and $6.6\times10^{-17}$ at the integration times of 1 s and 10,000 s at the MARS. The proposed scalable technique can arbitrarily add the same MARSs in the fiber link, which has great potential in realizing ultra-long-haul RF transfer.

Submitted 4 May, 2022; originally announced May 2022.

Comments: Accepted for publication in Optics Express

arXiv:2204.08397 [pdf, other]

Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution

Authors: Zongcai Du, Ding Liu, Jie Liu, Jie Tang, Gangshan Wu, Lean Fu

Abstract: Runtime and memory consumption are two important aspects for efficient image super-resolution (EISR) models to be deployed on resource-constrained devices. Recent advances in EISR exploit distillation and aggregation strategies with plenty of channel split and concatenation operations to make full use of limited hierarchical features. In contrast, sequential network operations avoid frequently accessing preceding states and extra nodes, and thus are beneficial to reducing the memory consumption and runtime overhead. Following this idea, we design our lightweight network backbone by mainly stacking multiple highly optimized convolution and activation layers and decreasing the usage of feature fusion. We propose a novel sequential attention branch, where every pixel is assigned an important factor according to local and global contexts, to enhance high-frequency details. In addition, we tailor the residual block for EISR and propose an enhanced residual block (ERB) to further accelerate the network inference. Finally, combining all the above techniques, we construct a fast and memory-efficient network (FMEN) and its small version FMEN-S, which runs 33% faster and reduces 74% memory consumption compared with the state-of-the-art EISR model: E-RFDN, the champion in AIM 2020 efficient super-resolution challenge. Besides, FMEN-S achieves the lowest memory consumption and the second shortest runtime in NTIRE 2022 challenge on efficient super-resolution. Code is available at https://github.com/NJU-Jet/FMEN.

Submitted 18 April, 2022; originally announced April 2022.

Comments: Accepted by NTIRE 2022 (CVPR Workshop)

arXiv:2204.01335 [pdf, ps, other]

Logistics in the Sky: A Two-phase Optimization Approach for the Drone Package Pickup and Delivery System

Authors: Fangyu Hong, Guohua Wu, Qizhang Luo, Huan Liu, ** Fang, Witold Pedrycz

Abstract: The application of drones in the last-mile distribution is a research hotspot in recent years. Different from the previous urban distribution mode that depends on trucks, this paper proposes a novel package pick-up and delivery mode and system in which multiple drones collaborate with automatic devices. The proposed mode uses free areas on the top of residential buildings to set automatic devices as delivery and pick-up points of packages, and employs drones to transport packages between buildings and depots. The integrated scheduling problem of package drop-pickup considering m-drone, m-depot, m-customer is crucial for the system. We propose a simulated-annealing-based two-phase optimization approach (SATO) to solve this problem. In the first phase, tasks are allocated to depots for serving, such that the initial problem is decomposed into multiple single depot scheduling problems with m-drone. In the second phase, considering the drone capability constraints and task demand constraints, we generate the route planning scheme for drones in each depot. Concurrently, an improved variable neighborhood descent algorithm (IVND) is designed in the first phase to reallocate tasks, and a local search algorithm (LS) is proposed to search the high-quality solution in the second phase. Finally, extensive experiments and comparative studies are conducted to test the effectiveness of the proposed approach. Experiments indicate that the proposed SATO-IVND can reduce the cost by more than 14% in a reasonable time compared with several other peer algorithms.

Submitted 4 April, 2022; originally announced April 2022.

arXiv:2203.12476 [pdf, other]

Stable Optimization for Large Vision Model Based Deep Image Prior in Cone-Beam CT Reconstruction

Authors: Minghui Wu, Yangdi Xu, Yingying Xu, Guangwei Wu, Qingqing Chen, Hongxiang Lin

Abstract: Large Vision Model (LVM) has recently demonstrated great potential for medical imaging tasks, potentially enabling image enhancement for sparse-view Cone-Beam Computed Tomography (CBCT), despite requiring a substantial amount of data for training. Meanwhile, Deep Image Prior (DIP) effectively guides an untrained neural network to generate high-quality CBCT images without any training data. However, the original DIP method relies on a well-defined forward model and a large-capacity backbone network, which is notoriously difficult to converge. In this paper, we propose a stable optimization method for the forward-model-free, LVM-based DIP model for sparse-view CBCT. Our approach consists of two main characteristics: (1) multi-scale perceptual loss (MSPL) which measures the similarity of perceptual features between the reference and output images at multiple resolutions without the need for any forward model, and (2) a reweighting mechanism that stabilizes the iteration trajectory of MSPL. One shot optimization is used to simultaneously and stably reweight MSPL and optimize LVM. We evaluate our approach on two publicly available datasets: SPARE and Walnut. The results show significant improvements in both image quality metrics and visualization that demonstrates reduced streak artifacts. The source code is available upon request.

Submitted 28 January, 2024; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: 5 pages, 4 figures, 1 table. Accepted to ICASSP 2024

arXiv:2202.10603 [pdf, other]

doi 10.1109/TPAMI.2022.3152488

Disentangling Light Fields for Super-Resolution and Disparity Estimation

Authors: Yingqian Wang, Longguang Wang, Gaochang Wu, Jungang Yang, Wei An, Jingyi Yu, Yulan Guo

Abstract: Light field (LF) cameras record both intensity and directions of light rays, and encode 3D scenes into 4D LF images. Recently, many convolutional neural networks (CNNs) have been proposed for various LF image processing tasks. However, it is challenging for CNNs to effectively process LF images since the spatial and angular information are highly inter-twined with varying disparities. In this paper, we propose a generic mechanism to disentangle this coupled information for LF image processing. Specifically, we first design a class of domain-specific convolutions to disentangle LFs from different dimensions, and then leverage these disentangled features by designing task-specific modules. Our disentangling mechanism can well incorporate the LF structure prior and effectively handle 4D LF data. Based on the proposed mechanism, we develop three networks (i.e., DistgSSR, DistgASR and DistgDisp) for spatial super-resolution, angular super-resolution and disparity estimation. Experimental results show that our networks achieve state-of-the-art performance on all these three tasks, which demonstrates the effectiveness, efficiency, and generality of our disentangling mechanism. Project page: https://yingqianwang.github.io/DistgLF/.

Submitted 22 July, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: We have corrected a mistake in Table 1 and updated Fig. 6 by using HR GT depth maps for evaluation

arXiv:2201.11871 [pdf, other]

Infrastructure-Based Object Detection and Tracking for Cooperative Driving Automation: A Survey

Authors: Zhengwei Bai, Guoyuan Wu, Xuewei Qi, Yongkang Liu, Kentaro Oguchi, Matthew J. Barth

Abstract: Object detection plays a fundamental role in enabling Cooperative Driving Automation (CDA), which is regarded as the revolutionary solution to addressing safety, mobility, and sustainability issues of contemporary transportation systems. Although current computer vision technologies could provide satisfactory object detection results in occlusion-free scenarios, the perception performance of onboard sensors could be inevitably limited by the range and occlusion. Owing to flexible position and pose for sensor installation, infrastructure-based detection and tracking systems can enhance the perception capability for connected vehicles and thus quickly become one of the most popular research topics. In this paper, we review the research progress for infrastructure-based object detection and tracking systems. Architectures of roadside perception systems based on different types of sensors are reviewed to show a high-level description of the workflows for infrastructure-based perception systems. Roadside sensors and different perception methodologies are reviewed and analyzed with detailed literature to provide a low-level explanation for specific methods followed by Datasets and Simulators to draw an overall landscape of infrastructure-based object detection and tracking methods. Discussions are conducted to point out current opportunities, open problems, and anticipated future trends.

Submitted 19 March, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

arXiv:2201.08996 [pdf, other]

Linear Array Network for Low-light Image Enhancement

Authors: Keqi Wang, Ziteng Cui, Jieru Jia, Hao Xu, Ge Wu, Yin Zhuang, Lu Chen, Zhiguo Hu, Yuhua Qian

Abstract: Convolution neural networks (CNNs) based methods have dominated the low-light image enhancement tasks due to their outstanding performance. However, the convolution operation is based on a local sliding window mechanism, which makes it difficult to construct the long-range dependencies of the feature maps. Meanwhile, the self-attention based global relationship aggregation methods have been widely used in computer vision, but these methods struggle to handle high-resolution images because of the high computational complexity. To solve this problem, this paper proposes a Linear Array Self-attention (LASA) mechanism, which uses only two 2-D feature encodings to construct 3-D global weights and then refines feature maps generated by convolution layers. Based on LASA, Linear Array Network (LAN) is proposed, which is superior to the existing state-of-the-art (SOTA) methods in both RGB and RAW based low-light enhancement tasks with a smaller amount of parameters. The code is released at https://github.com/cuiziteng/LASA_enhancement.

Submitted 16 February, 2022; v1 submitted 22 January, 2022; originally announced January 2022.

arXiv:2112.15386 [pdf, other]

Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning

Authors: Bin-Cheng Yang, Gangshan Wu

Abstract: Deep convolutional neural networks have been demonstrated to be effective for SISR in recent years. On the one hand, residual connections and dense connections have been used widely to ease forward information and backward gradient flows to boost performance. However, current methods use residual connections and dense connections separately in most network layers in a sub-optimal way. On the other hand, although various networks and methods have been designed to improve computation efficiency, save parameters, or utilize training data of multiple scale factors for each other to boost performance, they either perform super-resolution in HR space at a high computation cost or cannot share parameters between models of different scale factors to save parameters and inference time. To tackle these challenges, we propose an efficient single image super-resolution network using dual path connections with multiple scale learning named as EMSRDPN. By introducing dual path connections inspired by Dual Path Networks into EMSRDPN, it uses residual connections and dense connections in an integrated way in most network layers. Dual path connections have the benefits of both reusing common features of residual connections and exploring new features of dense connections to learn a good representation for SISR. To utilize the feature correlation of multiple scale factors, EMSRDPN shares all network units in LR space between different scale factors to learn shared features and only uses a separate reconstruction unit for each scale factor. This allows training data of multiple scale factors to help each other to boost performance, while also saving parameters and supporting shared inference for multiple scale factors to improve efficiency. Experiments show EMSRDPN achieves better performance and comparable or even better parameter and inference efficiency over SOTA methods.

Submitted 28 October, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

Comments: 21 pages, 9 figures, 5 tables

arXiv:2111.13905 [pdf, other]

AdaDM: Enabling Normalization for Image Super-Resolution

Authors: Jie Liu, Jie Tang, Gangshan Wu

Abstract: Normalization like Batch Normalization (BN) is a milestone technique to normalize the distributions of intermediate layers in deep learning, enabling faster training and better generalization accuracy. However, in fidelity image Super-Resolution (SR), it is believed that normalization layers get rid of range flexibility by normalizing the features and they are simply removed from modern SR networks. In this paper, we study this phenomenon quantitatively and qualitatively. We found that the standard deviation of the residual feature shrinks a lot after normalization layers, which causes the performance degradation in SR networks. Standard deviation reflects the amount of variation of pixel values. When the variation becomes smaller, the edges will become less discriminative for the network to resolve. To address this problem, we propose an Adaptive Deviation Modulator (AdaDM), in which a modulation factor is adaptively predicted to amplify the pixel deviation. For better generalization performance, we apply BN in state-of-the-art SR networks with the proposed AdaDM. Meanwhile, the deviation amplification strategy in AdaDM makes the edge information in the feature more distinguishable. As a consequence, SR networks with BN and our AdaDM can get substantial performance improvements on benchmark datasets. Extensive experiments have been conducted to show the effectiveness of our method.

Submitted 27 November, 2021; originally announced November 2021.

arXiv:2111.09521 [pdf]

Evaluating Cybersecurity Risks of Cooperative Ramp Merging in Mixed Traffic Environments

Authors: Xuanpeng Zhao, Ahmed Abdo, Xishun Liao, Matthew J. Barth, Guoyuan Wu

Abstract: Connected and Automated Vehicle (CAV) technology has the potential to greatly improve transportation mobility, safety, and energy efficiency. However, ubiquitous vehicular connectivity also opens up the door for cyber-attacks. In this study, we investigate cybersecurity risks of a representative cooperative traffic management application, i.e., highway on-ramp merging, in a mixed traffic environment. We develop threat models with two trajectory spoofing strategies on CAVs to create traffic congestion, and we also devise an attack-resilient strategy for system defense. Furthermore, we leverage VENTOS, a Veins extension simulator made for CAV applications, to evaluate cybersecurity risks of the attacks and performance of the proposed defense strategy. A comprehensive case study is conducted across different traffic congestion levels, penetration rates of CAVs, and attack ratios. As expected, the results show that the performance of mobility decreases by up to 55.19% in the worst case when the attack ratio increases, as do safety and energy. With our proposed mitigation defense algorithm, the system's cyber-attack resiliency is greatly improved.

Submitted 17 November, 2021; originally announced November 2021.

Comments: 10 pages, 7 figures, 4 tables

arXiv:2108.10001 [pdf, other]

doi 10.1109/LWC.2021.3102069

Automatic Modulation Classification Using Involution Enabled Residual Networks

Authors: Hao Zhang, Lu Yuan, Guangyu Wu, Fuhui Zhou, Qihui Wu

Abstract: Automatic modulation classification (AMC) is of crucial importance for realizing wireless intelligence communications. Many deep learning based models especially convolution neural networks (CNNs) have been proposed for AMC. However, the computation cost is very high, which makes them inappropriate for beyond the fifth generation wireless communication networks that have stringent requirements on the classification accuracy and computing time. In order to tackle those challenges, a novel involution enabled AMC scheme is proposed by using the bottleneck structure of the residual networks. Involution is utilized instead of convolution to enhance the discrimination capability and expressiveness of the model by incorporating a self-attention mechanism. Simulation results demonstrate that our proposed scheme achieves superior classification performance and faster convergence speed compared with other benchmark schemes.

Submitted 23 August, 2021; originally announced August 2021.

Journal ref: IEEE Wireless Communications Letters,2021

arXiv:2105.09750 [pdf, other]

Anchor-based Plain Net for Mobile Image Super-Resolution

Authors: Zongcai Du, Jie Liu, Jie Tang, Gangshan Wu

Abstract: Along with the rapid development of real-world applications, higher requirements on the accuracy and efficiency of image super-resolution (SR) are brought forward. Though existing methods have achieved remarkable success, the majority of them demand plenty of computational resources and a large amount of RAM, and thus they cannot be well applied to mobile devices. In this paper, we aim at designing an efficient architecture for 8-bit quantization and deploying it on a mobile device. First, we conduct an experiment about meta-node latency by decomposing lightweight SR architectures, which determines the portable operations we can utilize. Then, we dig deeper into what kind of architecture is beneficial to 8-bit quantization and propose anchor-based plain net (ABPN). Finally, we adopt quantization-aware training strategy to further boost the performance. Our model can outperform 8-bit quantized FSRCNN by nearly 2dB in terms of PSNR, while satisfying realistic needs at the same time. Code is available at https://github.com/NJU-Jet/SR_Mobile_Quantization.

Submitted 24 September, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

Comments: accepted by CVPR2021 MAI Workshop

arXiv:2104.06797 [pdf, other]

doi 10.1109/TPAMI.2021.3073739

Revisiting Light Field Rendering with Deep Anti-Aliasing Neural Network

Authors: Gaochang Wu, Yebin Liu, Lu Fang, Tianyou Chai

Abstract: The light field (LF) reconstruction is mainly confronted with two challenges, large disparity and the non-Lambertian effect. Typical approaches either address the large disparity challenge using depth estimation followed by view synthesis or eschew explicit depth information to enable non-Lambertian rendering, but rarely solve both challenges in a unified framework. In this paper, we revisit the classic LF rendering framework to address both challenges by incorporating it with advanced deep learning techniques. First, we analytically show that the essential issue behind the large disparity and non-Lambertian challenges is the aliasing problem. Classic LF rendering approaches typically mitigate the aliasing with a reconstruction filter in the Fourier domain, which is, however, intractable to implement within a deep learning pipeline. Instead, we introduce an alternative framework to perform anti-aliasing reconstruction in the image domain and analytically show comparable efficacy on the aliasing issue. To explore the full potential, we then embed the anti-aliasing framework into a deep neural network through the design of an integrated architecture and trainable parameters. The network is trained through end-to-end optimization using a peculiar training set, including regular LFs and unstructured LFs. The proposed deep learning pipeline shows a substantial superiority in solving both the large disparity and the non-Lambertian challenges compared with other state-of-the-art approaches. In addition to the view interpolation for an LF, we show that the proposed pipeline also benefits light field view extrapolation.

Submitted 27 April, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: 15 pages, 12 figures. Accepted by IEEE TPAMI

Journal ref: IEEE TPAMI, 2021

arXiv:2103.13043 [pdf, other]

doi 10.1109/TPAMI.2018.2845393

Light Field Reconstruction Using Convolutional Network on EPI and Extended Applications

Authors: Gaochang Wu, Yebin Liu, Lu Fang, Qionghai Dai, Tianyou Chai

Abstract: In this paper, a novel convolutional neural network (CNN)-based framework is developed for light field reconstruction from a sparse set of views. We indicate that the reconstruction can be efficiently modeled as angular restoration on an epipolar plane image (EPI). The main problem in direct reconstruction on the EPI involves an information asymmetry between the spatial and angular dimensions, where the detailed portion in the angular dimensions is damaged by undersampling. Directly upsampling or super-resolving the light field in the angular dimensions causes ghosting effects. To suppress these ghosting effects, we contribute a novel "blur-restoration-deblur" framework. First, the "blur" step is applied to extract the low-frequency components of the light field in the spatial dimensions by convolving each EPI slice with a selected blur kernel. Then, the "restoration" step is implemented by a CNN, which is trained to restore the angular details of the EPI. Finally, we use a non-blind "deblur" operation to recover the spatial high frequencies suppressed by the EPI blur. We evaluate our approach on several datasets, including synthetic scenes, real-world scenes and challenging microscope light field data. We demonstrate the high performance and robustness of the proposed framework compared with state-of-the-art algorithms. We further show extended applications, including depth enhancement and interpolation for unstructured input. More importantly, a novel rendering approach is presented by combining the proposed framework and depth information to handle large disparities.
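
The three stages named in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the Gaussian kernel is an assumed choice for the "selected blur kernel", linear interpolation along the angular axis stands in for the trained CNN, and a Wiener-style regularised inverse filter is one possible reading of the non-blind "deblur" step. The EPI is assumed to be a 2D array with angular rows and spatial columns.

```python
import numpy as np

def gaussian_kernel(width, sigma):
    """1D Gaussian centred (circularly) at index 0, normalised to sum to 1."""
    x = np.arange(width)
    x = np.minimum(x, width - x)  # circular distance from index 0
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(epi, kernel):
    """'Blur': circularly convolve each angular row along the spatial axis."""
    K = np.fft.fft(kernel)
    return np.real(np.fft.ifft(np.fft.fft(epi, axis=1) * K, axis=1))

def restore_angular(epi, factor):
    """'Restoration' placeholder (assumption): linear interpolation along the
    angular axis, used here in place of the trained CNN."""
    n_views, width = epi.shape
    src = np.arange(n_views)
    dst = np.linspace(0, n_views - 1, (n_views - 1) * factor + 1)
    out = np.empty((dst.size, width))
    for col in range(width):
        out[:, col] = np.interp(dst, src, epi[:, col])
    return out

def deblur(epi, kernel, eps=1e-3):
    """'Deblur': non-blind, Wiener-style deconvolution with the known kernel.
    eps regularises frequencies where the kernel response is close to zero."""
    K = np.fft.fft(kernel)
    inv = np.conj(K) / (np.abs(K) ** 2 + eps)
    return np.real(np.fft.ifft(np.fft.fft(epi, axis=1) * inv, axis=1))

def blur_restoration_deblur(sparse_epi, factor, sigma=1.0):
    """Chain the three stages on a sparsely sampled EPI."""
    kernel = gaussian_kernel(sparse_epi.shape[1], sigma)
    blurred = blur(sparse_epi, kernel)
    restored = restore_angular(blurred, factor)
    return deblur(restored, kernel)
```

Calling `blur_restoration_deblur(epi, factor=2)` on an EPI with 3 angular rows returns one with 5 rows. The circular (FFT-based) convolution is chosen only so that the blur and deblur steps share exactly the same kernel model in this toy setting.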

Submitted 24 March, 2021; originally announced March 2021.

Comments: Published in IEEE TPAMI, 2019

Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019

Showing 1–50 of 83 results for author: Wu, G