Search | arXiv e-print repository

Multi-Functional Beamforming Design for Integrated Sensing, Communication, and Computation

Authors: Yapeng Zhao, Qingqing Wu, Wen Chen, Yong Zeng, Ruiqi Liu, Weidong Mei, Fen Hou, Shaodan Ma

Abstract: Integrated sensing and communication (ISAC) systems may face a heavy computation burden since the sensory data needs to be further processed. This paper studies a novel system that integrates sensing, communication, and computation, aiming to provide services for different objectives efficiently. This system consists of a multi-antenna multi-functional base station (BS), an edge server, a target, and multiple single-antenna communication users. The BS needs to allocate the available resources to efficiently provide sensing, communication, and computation services. Due to the heavy service burden and limited power budget, the BS can partially offload the tasks to the nearby edge server instead of computing them locally. We consider the estimation of the target response matrix, a general problem in radar sensing, and utilize the Cramer-Rao bound (CRB) as the corresponding performance metric. To tackle the non-convex optimization problem, we propose both semidefinite relaxation (SDR)-based alternating optimization and SDR-based successive convex approximation (SCA) algorithms to minimize the CRB of radar sensing while meeting the requirement of communication users and the need for task computing. Furthermore, we demonstrate that the optimal rank-one solutions of both the alternating and SCA algorithms can be directly obtained via the solver or further constructed even when dealing with multiple functionalities. Simulation results show that the proposed algorithms can provide higher target estimation performance than state-of-the-art benchmarks while satisfying the communication and computation constraints.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.09627 [pdf, other]

RobustSAM: Segment Anything Robustly on Degraded Images

Authors: Wei-Ting Chen, Yu-Jiet Vong, Sy-Yen Kuo, Sizhuo Ma, Jian Wang

Abstract: Segment Anything Model (SAM) has emerged as a transformative approach in image segmentation, acclaimed for its robust zero-shot segmentation capabilities and flexible prompting system. Nonetheless, its performance is challenged by images with degraded quality. Addressing this limitation, we propose the Robust Segment Anything Model (RobustSAM), which enhances SAM's performance on low-quality images while preserving its promptability and zero-shot generalization. Our method leverages the pre-trained SAM model with only marginal parameter increments and computational requirements. The additional parameters of RobustSAM can be optimized within 30 hours on eight GPUs, demonstrating its feasibility and practicality for typical research laboratories. We also introduce the Robust-Seg dataset, a collection of 688K image-mask pairs with different degradations designed to train and evaluate our model optimally. Extensive experiments across various segmentation tasks and datasets confirm RobustSAM's superior performance, especially under zero-shot conditions, underscoring its potential for extensive real-world application. Additionally, our method has been shown to effectively improve the performance of SAM-based downstream tasks such as single image dehazing and deblurring.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted by CVPR2024 (Highlight); Project Page: https://robustsam.github.io/

arXiv:2406.09622 [pdf, other]

DSL-FIQA: Assessing Facial Image Quality via Dual-Set Degradation Learning and Landmark-Guided Transformer

Authors: Wei-Ting Chen, Gurunandan Krishnan, Qiang Gao, Sy-Yen Kuo, Sizhuo Ma, Jian Wang

Abstract: Generic Face Image Quality Assessment (GFIQA) evaluates the perceptual quality of facial images, which is crucial in improving image restoration algorithms and selecting high-quality face images for downstream tasks. We present a novel transformer-based method for GFIQA, which is aided by two unique mechanisms. First, a Dual-Set Degradation Representation Learning (DSL) mechanism uses facial images with both synthetic and real degradations to decouple degradation from content, ensuring generalizability to real-world scenarios. This self-supervised method learns degradation features on a global scale, providing a robust alternative to conventional methods that use local patch information in degradation learning. Second, our transformer leverages facial landmarks to emphasize visually salient parts of a face image in evaluating its perceptual quality. We also introduce a balanced and diverse Comprehensive Generic Face IQA (CGFIQA-40k) dataset of 40K images carefully designed to overcome the biases, in particular the imbalances in skin tone and gender representation, in existing datasets. Extensive analysis and evaluation demonstrate the robustness of our method, marking a significant improvement over prior methods.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted by CVPR 2024, Project Page: https://dsl-fiqa.github.io/

arXiv:2406.09389 [pdf, other]

Sagiri: Low Dynamic Range Image Enhancement with Generative Diffusion Prior

Authors: Baiang Li, Sizhuo Ma, Yanhong Zeng, Xiaogang Xu, Youqing Fang, Zhao Zhang, Jian Wang, Kai Chen

Abstract: Capturing High Dynamic Range (HDR) scenery using 8-bit cameras often suffers from over-/underexposure, loss of fine details due to low bit-depth compression, skewed color distributions, and strong noise in dark areas. Traditional LDR image enhancement methods primarily focus on color mapping, which enhances the visual representation by expanding the image's color range and adjusting the brightness. However, these approaches fail to effectively restore content in dynamic range extremes, which are regions with pixel values close to 0 or 255. To address the full scope of challenges in HDR imaging and surpass the limitations of current models, we propose a novel two-stage approach. The first stage maps the color and brightness to an appropriate range while keeping the existing details, and the second stage utilizes a diffusion prior to generate content in dynamic range extremes lost during capture. This generative refinement module can also be used as a plug-and-play module to enhance and complement existing LDR enhancement models. The proposed method markedly improves the quality and details of LDR images, demonstrating superior performance through rigorous experimental validation. The project page is at https://sagiri0208.github.io

Submitted 13 June, 2024; originally announced June 2024.

Comments: https://sagiri0208.github.io

arXiv:2406.08887 [pdf, other]

Low-Overhead Channel Estimation via 3D Extrapolation for TDD mmWave Massive MIMO Systems Under High-Mobility Scenarios

Authors: Binggui Zhou, Xi Yang, Shaodan Ma, Feifei Gao, Guanghua Yang

Abstract: In TDD mmWave massive MIMO systems, the downlink CSI can be attained through uplink channel estimation thanks to the uplink-downlink channel reciprocity. However, the channel aging issue is significant under high-mobility scenarios and thus necessitates frequent uplink channel estimation. In addition, large amounts of antennas and subcarriers lead to high-dimensional CSI matrices, aggravating the pilot training overhead. To systematically reduce the pilot overhead, a spatial, frequency, and temporal domain (3D) channel extrapolation framework is proposed in this paper. Considering the marginal effects of pilots in the spatial and frequency domains and the effectiveness of traditional knowledge-driven channel estimation methods, we first propose a knowledge-and-data driven spatial-frequency channel extrapolation network (KDD-SFCEN) for uplink channel estimation by exploiting the least square estimator for coarse channel estimation and joint spatial-frequency channel extrapolation to reduce the spatial-frequency domain pilot overhead. Then, resorting to the uplink-downlink channel reciprocity and temporal domain dependencies of downlink channels, a temporal uplink-downlink channel extrapolation network (TUDCEN) is proposed for slot-level channel extrapolation, aiming to enlarge the pilot signal period and thus reduce the temporal domain pilot overhead under high-mobility scenarios. Specifically, we propose the spatial-frequency sampling embedding module to reduce the representation dimension and consequent computational complexity, and we propose to exploit the autoregressive generative Transformer for generating downlink channels autoregressively. Numerical results demonstrate the superiority of the proposed framework in significantly reducing the pilot training overhead by more than 16 times and improving the system's spectral efficiency under high-mobility scenarios.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 13 pages, 11 figures, 3 tables. This paper has been submitted to IEEE journal for possible publication

arXiv:2405.15339 [pdf, other]

Environment Sensing-aided Beam Prediction with Transfer Learning for Smart Factory

Authors: Yuan Feng, Chuanbing Zhao, Feifei Gao, Yong Zhang, Shaodan Ma

Abstract: In this paper, we propose an environment sensing-aided beam prediction model for smart factories that can be transferred from given environments to a new environment. In particular, we first design a pre-training model that predicts the optimal beam by sensing the present environmental information. When encountering a new environment, it generally requires collecting a large amount of new training data to retrain the model, whose cost severely impedes the application of the designed pre-training model. Therefore, we next design a transfer learning strategy that fine-tunes the pre-trained model with limited labeled data from the new environment. Simulation results show that when the pre-trained model is fine-tuned with 30% of the labeled data from the new environment, the Top-10 beam prediction accuracy reaches 94%. Moreover, compared with completely retraining the prediction model, the proposed transfer learning strategy reduces the amount of training data and the time cost by 70% and 75%, respectively.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2404.03222 [pdf, other]

Enabling Clean Energy Resilience with Machine Learning-Empowered Underground Hydrogen Storage

Authors: Alvaro Carbonero, Shaowen Mao, Mohamed Mehana

Abstract: To address the urgent challenge of climate change, there is a critical need to transition away from fossil fuels towards sustainable energy systems, with renewable energy sources playing a pivotal role. However, the inherent variability of renewable energy, without effective storage solutions, often leads to imbalances between energy supply and demand. Underground Hydrogen Storage (UHS) emerges as a promising long-term storage solution to bridge this gap, yet its widespread implementation is impeded by the high computational costs associated with high fidelity UHS simulations. This paper introduces UHS from a data-driven perspective and outlines a roadmap for integrating machine learning into UHS, thereby facilitating the large-scale deployment of UHS.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 10 pages, 4 figures, accepted proposal track paper at ICLR CCAI workshop

arXiv:2404.00598 [pdf, other]

Robust Beamforming Design and Antenna Selection for Dynamic HRIS-aided Massive MIMO Systems

Authors: Jintao Wang, Binggui Zhou, Chengzhi Ma, Shiqi Gong, Guanghua Yang, Shaodan Ma

Abstract: In this paper, a dynamic hybrid active-passive reconfigurable intelligent surface (HRIS) is proposed to further enhance the massive multiple-input-multiple-output (MIMO) system, since it supports the dynamic placement of active and passive elements. Specifically, considering the impact of the hardware impairments (HWIs), we investigate the channel-aware configuration of the receive antennas at the base station (BS) and the active/passive elements at the HRIS to improve the reliability of the system. To this end, we investigate the average mean-square-error (MSE) minimization problem for the HRIS-aided massive MIMO system by jointly optimizing the BS receive antenna selection matrix, the reflection phase coefficients, the reflection amplitude matrix, and the mode selection matrix of the HRIS under the power budget of the HRIS. To tackle the non-convexity and intractability of this problem, we first transform the binary and discrete variables into continuous ones, and then propose a penalty-based exact block coordinate descent (BCD) algorithm to solve these subproblems alternately. Numerical simulations demonstrate the great superiority of the proposed scheme over the conventional benchmark schemes.

Submitted 31 March, 2024; originally announced April 2024.

Comments: 5 pages, 2 figures

arXiv:2403.19251 [pdf, other]

Arbitrary State Transition of Open Qubit System Based on Switching Control

Authors: Guangpu Wu, Shibei Xue, Shan Ma, Sen Kuang, Daoyi Dong, Ian R. Petersen

Abstract: We present a switching control strategy based on Lyapunov control for arbitrary state transitions in open qubit systems. With coherent vector representation, we propose a switching control strategy, which can prevent the state of the qubit from entering invariant sets and singular value sets, effectively driving the system ultimately to a sufficiently small neighborhood of target states. In comparison to existing works, this control strategy relaxes the strict constraints on system models imposed by special target states. Furthermore, we identify conditions under which the open qubit system achieves finite-time stability (FTS) and finite-time contractive stability (FTCS), respectively. This represents a critical improvement in quantum state transitions, especially considering the asymptotic stability of arbitrary target states is unattainable in open quantum systems. The effectiveness of our proposed method is convincingly demonstrated through its application in a qubit system affected by various types of decoherence, including amplitude, dephasing and polarization decoherence.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 12 pages, 7 figures

arXiv:2403.11417 [pdf, ps, other]

Positioning Using Wireless Networks: Applications, Recent Progress and Future Challenges

Authors: Yang Yang, Mingzhe Chen, Yufei Blankenship, Jemin Lee, Zabih Ghassemlooy, Julian Cheng, Shiwen Mao

Abstract: Positioning has recently received considerable attention as a key enabler in emerging applications such as extended reality, unmanned aerial vehicles and smart environments. These applications require both data communication and high-precision positioning, and thus they are particularly well-suited to be offered in wireless networks (WNs). The purpose of this paper is to provide a comprehensive overview of existing works and new trends in the field of positioning techniques from both the academic and industrial perspectives. The paper provides a comprehensive overview of positioning in WNs, covering the background, applications, measurements, state-of-the-art technologies and future challenges. The paper outlines the applications of positioning from the perspectives of public facilities, enterprises and individual users. We investigate the key performance indicators and measurements of positioning systems, followed by the review of the key enabler techniques such as artificial intelligence/large models and adaptive systems. Next, we discuss a number of typical wireless positioning technologies. We extend our overview beyond the academic progress, to include the standardization efforts, and finally, we provide insight into the challenges that remain. The comprehensive overview of existing efforts and new trends in the field of positioning from both the academic and industrial communities would be a useful reference to researchers in the field.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.11102 [pdf, other]

Jointly Optimizing Terahertz based Sensing and Communications in Vehicular Networks: A Dynamic Graph Neural Network Approach

Authors: Xuefei Li, Mingzhe Chen, Ye Hu, Zhilong Zhang, Danpu Liu, Shiwen Mao

Abstract: In this paper, the problem of vehicle service mode selection (sensing, communication, or both) and vehicle connections within terahertz (THz) enabled joint sensing and communications over vehicular networks is studied. The considered network consists of several service provider vehicles (SPVs) that can provide: 1) only sensing service, 2) only communication service, and 3) both services, sensing service request vehicles, and communication service request vehicles. Based on the vehicle network topology and their service accessibility, SPVs strategically select service request vehicles to provide sensing, communication, or both services. This problem is formulated as an optimization problem, aiming to maximize the number of successfully served vehicles by jointly determining the service mode of each SPV and its associated vehicles. To solve this problem, we propose a dynamic graph neural network (GNN) model that selects appropriate graph information aggregation functions according to the vehicle network topology, thus extracting more vehicle network information compared to traditional static GNNs that use fixed aggregation functions for different vehicle network topologies. Using the extracted vehicle network information, the service mode of each SPV and its served service request vehicles will be determined. Simulation results show that the proposed dynamic GNN based method can improve the number of successfully served vehicles by up to 17% and 28% compared to a GNN based algorithm with a fixed neural network model and a conventional optimization algorithm without using GNNs.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.07274 [pdf, other]

Achievable Rate Analysis and Optimization of Double-RIS Assisted Spatially Correlated MIMO with Statistical CSI

Authors: Kaizhe Xu, Jiajia Guo, Jun Zhang, Shi Jin, Shaodan Ma

Abstract: Reconfigurable intelligent surface (RIS) is a novel meta-material which can form a smart radio environment by dynamically altering reflection directions of the impinging electromagnetic waves. In the prior literature, the inter-RIS links which also contribute to the performance of the whole system are usually neglected when multiple RISs are deployed. In this paper we investigate a general double-RIS assisted multiple-input multiple-output (MIMO) wireless communication system under spatially correlated non line-of-sight propagation channels, where the cooperation of the double RISs is also considered. The design objective is to maximize the achievable ergodic rate based on full statistical channel state information (CSI). Specifically, we firstly present a closed-form asymptotic expression for the achievable ergodic rate by utilizing replica method from statistical physics. Then a full statistical CSI-enabled optimal design is proposed which avoids high pilot training overhead compared to instantaneous CSI-enabled design. To further reduce the signal processing overhead and lower the complexity for practical realization, a common-phase scheme is proposed to design the double RISs. Simulation results show that the derived asymptotic ergodic rate is quite accurate even for small-sized antenna arrays. And the proposed optimization algorithm can achieve substantial gain at the expense of a low overhead and complexity. Furthermore, the cooperative double-RIS assisted MIMO framework is proven to achieve superior ergodic rate performance and high communication reliability under harsh propagation environment.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.06579 [pdf, other]

Edge Information Hub: Orchestrating Satellites, UAVs, MEC, Sensing and Communications for 6G Closed-Loop Controls

Authors: Chengleyang Lei, Wei Feng, Peng Wei, Yunfei Chen, Ning Ge, Shiwen Mao

Abstract: An increasing number of field robots would be used for mission-critical tasks in remote or post-disaster areas. Due to usually-limited individual abilities, these robots require an edge information hub (EIH), which is capable of not only communications but also sensing and computing. Such EIH could be deployed on a flexibly-dispatched unmanned aerial vehicle (UAV). Different from traditional aerial base stations or mobile edge computing (MEC), the EIH would direct the operations of robots via sensing-communication-computing-control ($\textbf{SC}^3$) closed-loop orchestration. This paper aims to optimize the closed-loop control performance of multiple $\textbf{SC}^3$ loops, under the constraints of satellite-backhaul rate, computing capability, and on-board energy. Specifically, the linear quadratic regulator (LQR) control cost is used to measure the closed-loop utility, and a sum LQR cost minimization problem is formulated to jointly optimize the splitting of sensor data and allocation of communication and computing resources. We first derive the optimal splitting ratio of sensor data, and then recast the problem to a more tractable form. An iterative algorithm is finally proposed to provide a sub-optimal solution. Simulation results demonstrate the superiority of the proposed algorithm. We also uncover the influence of $\textbf{SC}^3$ parameters on closed-loop controls, highlighting more systematic understanding.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 13 pages, 9 figures

arXiv:2403.05826 [pdf, other]

Cached Model-as-a-Resource: Provisioning Large Language Model Agents for Edge Intelligence in Space-air-ground Integrated Networks

Authors: Minrui Xu, Dusit Niyato, Hongliang Zhang, Jiawen Kang, Zehui Xiong, Shiwen Mao, Zhu Han

Abstract: Edge intelligence in space-air-ground integrated networks (SAGINs) can enable worldwide network coverage beyond geographical limitations for users to access ubiquitous and low-latency intelligence services. Facing global coverage and complex environments in SAGINs, edge intelligence can provision approximate large language models (LLMs) agents for users via edge servers at ground base stations (BSs) or cloud data centers relayed by satellites. As LLMs with billions of parameters are pre-trained on vast datasets, LLM agents have few-shot learning capabilities, e.g., chain-of-thought (CoT) prompting for complex tasks, which raises a new trade-off between resource consumption and performance in SAGINs. In this paper, we propose a joint caching and inference framework for edge intelligence to provision sustainable and ubiquitous LLM agents in SAGINs. We introduce "cached model-as-a-resource" for offering LLMs with limited context windows and propose a novel optimization framework, i.e., joint model caching and inference, to utilize cached model resources for provisioning LLM agent services along with communication, computing, and storage resources. We design "age of thought" (AoT) considering the CoT prompting of LLMs, and propose a least AoT cached model replacement algorithm for optimizing the provisioning cost. We propose a deep Q-network-based modified second-bid (DQMSB) auction to incentivize network operators, which can enhance allocation efficiency by 23% while guaranteeing strategy-proofness and free from adverse selection.

Submitted 31 May, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.03809 [pdf, other]

Variational Bayesian Learning based Joint Localization and Channel Estimation with Distance-dependent Noise

Authors: Yunfei Li, Yiting Luo, Weiqiang Tan, Chunguo Li, Shaodan Ma, Guanghua Yang

Abstract: In the Industrial Internet of Things (IIoTs) and Ocean of Things (OoTs), the advent of massive intelligent services has imposed stringent requirements on both communication and localization, particularly emphasizing precise localization and channel information. This paper focuses on the challenge of jointly optimizing localization and communication in IoT networks. Departing from the conventional independent noise model used in localization and channel estimation problems, we consider a more realistic model incorporating distance-dependent noise variance, as revealed in recent theoretical analyses and experimental results. The distance-dependent noise introduces unknown noise power and a complex noise model, resulting in an exceptionally challenging non-convex and nonlinear optimization problem. In this study, we address a joint localization and channel estimation problem encompassing distance-dependent noise, unknown channel parameters, and uncertainties in sensor node locations. To surmount the intractable nonlinear and non-convex objective function inherent in the problem, we introduce a variational Bayesian learning-based framework. This framework enables the joint optimization of localization and channel parameters by leveraging an effective approximation to the true posterior distribution. Furthermore, the proposed joint learning algorithm provides an iterative closed-form solution and exhibits superior performance in terms of computational complexity compared to existing algorithms. Computer simulation results demonstrate that the proposed algorithm approaches the performance of the Bayesian Cramer-Rao bound (BCRB), achieves localization performance comparable to the ML-GMP algorithm, and outperforms the other two comparison algorithms.

Submitted 6 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

arXiv:2403.03736 [pdf, other]

Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer

Authors: Naifu Xue, Qi Mao, Zijian Wang, Yuan Zhang, Siwei Ma

Abstract: Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data. However, these advancements primarily focus on producing high-frequency details, often overlooking the ability of generative models to capture the prior distribution of image content, thus impeding further bitrate reduction in extreme compression scenarios (<0.05 bpp). Motivated by the capabilities of predictive language models for lossless compression, this paper introduces a novel Unified Image Generation-Compression (UIGC) paradigm, merging the processes of generation and compression. A key feature of the UIGC framework is the adoption of vector-quantized (VQ) image models for tokenization, alongside a multi-stage transformer designed to exploit spatial contextual information for modeling the prior distribution. As such, the dual-purpose framework effectively utilizes the learned prior for entropy estimation and assists in the regeneration of lost tokens. Extensive experiments demonstrate the superiority of the proposed UIGC framework over existing codecs in perceptual quality and human perception, particularly in ultra-low bitrate scenarios (<=0.03 bpp), pioneering a new direction in generative compression.

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2403.03145 [pdf, other]

Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization

Authors: Yuxin Guo, Shijie Ma, Hu Su, Zhiqing Wang, Yuhao Zhao, Wei Zou, Siyang Sun, Yun Zheng

Abstract: Audio-Visual Source Localization (AVSL) aims to locate sounding objects within video frames given the paired audio clips. Existing methods predominantly rely on self-supervised contrastive learning of audio-visual correspondence. Without any bounding-box annotations, they struggle to achieve precise localization, especially for small objects, and suffer from blurry boundaries and false positives. Moreover, the naive semi-supervised method is poor in fully leveraging the information of abundant unlabeled data. In this paper, we propose a novel semi-supervised learning framework for AVSL, namely Dual Mean-Teacher (DMT), comprising two teacher-student structures to circumvent the confirmation bias issue. Specifically, two teachers, pre-trained on limited labeled data, are employed to filter out noisy samples via the consensus between their predictions, and then generate high-quality pseudo-labels by intersecting their confidence maps. The sufficient utilization of both labeled and unlabeled data and the proposed unbiased framework enable DMT to outperform current state-of-the-art methods by a large margin, with CIoU of 90.4% and 48.8% on Flickr-SoundNet and VGG-Sound Source, obtaining 8.9%, 9.6% and 4.6%, 6.4% improvements over self- and semi-supervised methods respectively, given only 3% positional-annotations. We also extend our framework to some existing AVSL methods and consistently boost their performance.

Submitted 5 March, 2024; originally announced March 2024.

Comments: Accepted to NeurIPS 2023

arXiv:2403.03095 [pdf, other]

Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization

Authors: Yuxin Guo, Shijie Ma, Yuhao Zhao, Hu Su, Wei Zou

Abstract: Audio-Visual Source Localization (AVSL) is the task of identifying specific sounding objects in the scene given audio cues. In our work, we focus on semi-supervised AVSL with pseudo-labeling. To address the issues with vanilla hard pseudo-labels including bias accumulation, noise sensitivity, and instability, we propose a novel method named Cross Pseudo-Labeling (XPL), wherein two models learn from each other with the cross-refine mechanism to avoid bias accumulation. We equip XPL with two effective components. Firstly, the soft pseudo-labels with sharpening and pseudo-label exponential moving average mechanisms enable models to achieve gradual self-improvement and ensure stable training. Secondly, the curriculum data selection module adaptively selects pseudo-labels with high quality during training to mitigate potential bias. Experimental results demonstrate that XPL significantly outperforms existing methods, achieving state-of-the-art performance while effectively mitigating confirmation bias and ensuring training stability.

Submitted 5 March, 2024; originally announced March 2024.

Comments: Accepted to ICASSP 2024

arXiv:2403.01093 [pdf, other]

Variational Bayesian Learning Based Localization and Channel Reconstruction in RIS-aided Systems

Authors: Yunfei Li, Yiting Luo, Xianda Wu, Zheng Shi, Shaodan Ma, Guanghua Yang

Abstract: The emerging immersive and autonomous services have posed stringent requirements on both communications and localization. By considering the great potential of reconfigurable intelligent surface (RIS), this paper focuses on the joint channel estimation and localization for RIS-aided wireless systems. As opposed to existing works that treat channel estimation and localization independently, this paper exploits the intrinsic coupling and nonlinear relationships between the channel parameters and user location for enhancement of both localization and channel reconstruction. By noticing the non-convex, nonlinear objective function and the sparser angle pattern, a variational Bayesian learning-based framework is developed to jointly estimate the channel parameters and user location by leveraging an effective approximation of the posterior distribution. The proposed framework is capable of unifying near-field and far-field scenarios owing to exploitation of sparsity of the angular domain. Since the joint channel and location estimation problem has a closed-form solution in each iteration, our proposed iterative algorithm performs better than the conventional particle swarm optimization (PSO) and maximum likelihood (ML) based ones in terms of computational complexity. Simulations demonstrate that the proposed algorithm almost reaches the Bayesian Cramer-Rao bound (BCRB) and achieves superior estimation accuracy compared to the PSO and ML algorithms.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.03042 [pdf, other]

Semi-Passive Intelligent Reflecting Surface Enabled Sensing Systems

Authors: Qiaoyan Peng, Qingqing Wu, Wen Chen, Shaodan Ma, Ming-Min Zhao, Octavia A. Dobre

Abstract: Intelligent reflecting surface (IRS) has garnered growing interest and attention due to its potential for facilitating and supporting wireless communications and sensing. This paper studies a semi-passive IRS-enabled sensing system, where an IRS consists of both passive reflecting elements and active sensors. Our goal is to minimize the Cramér-Rao bound (CRB) for parameter estimation under both point and extended target cases. Towards this goal, we begin by deriving the CRB for the direction-of-arrival (DoA) estimation in closed-form and then theoretically analyze the IRS reflecting elements and sensors allocation design based on the CRB under the point target case with a single-antenna base station (BS). To efficiently solve the corresponding optimization problem for the case with a multi-antenna BS, we propose an efficient algorithm by jointly optimizing the IRS phase shifts and the BS beamformers. Under the extended target case, the CRB for the target response matrix (TRM) estimation is minimized via the optimization of the BS transmit beamformers. Moreover, we explore the influence of various system parameters on the CRB and compare these effects to those observed under the point target case. Simulation results show the effectiveness of the semi-passive IRS and our proposed beamforming design for improving the performance of the sensing system.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.01246 [pdf, other]

LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving

Authors: Daocheng Fu, Wenjie Lei, Licheng Wen, Pinlong Cai, Song Mao, Min Dou, Botian Shi, Yu Qiao

Abstract: The emergence of Multimodal Large Language Models ((M)LLMs) has ushered in new avenues in artificial intelligence, particularly for autonomous driving by offering enhanced understanding and reasoning capabilities. This paper introduces LimSim++, an extended version of LimSim designed for the application of (M)LLMs in autonomous driving. Acknowledging the limitations of existing simulation platforms, LimSim++ addresses the need for a long-term closed-loop infrastructure supporting continuous learning and improved generalization in autonomous driving. The platform offers extended-duration, multi-scenario simulations, providing crucial information for (M)LLM-driven vehicles. Users can engage in prompt engineering, model evaluation, and framework enhancement, making LimSim++ a versatile tool for research and practice. This paper additionally introduces a baseline (M)LLM-driven framework, systematically validated through quantitative experiments across diverse scenarios. The open-source resources of LimSim++ are available at: https://pjlab-adg.github.io/limsim-plus/.

Submitted 12 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted by 35th IEEE Intelligent Vehicles Symposium (IV 2024)

arXiv:2401.11205 [pdf, other]

Joint Beamforming Optimization and Mode Selection for RDARS-aided MIMO Systems

Authors: **tao Wang, Chengzhi Ma, Shiqi Gong, Xi Yang, Shaodan Ma

Abstract: Considering the appealing distribution gains of distributed antenna systems (DAS) and passive gains of reconfigurable intelligent surface (RIS), a flexible reconfigurable architecture called reconfigurable distributed antenna and reflecting surface (RDARS) is proposed. RDARS encompasses DAS and RIS as two special cases and maintains the advantages of distributed antennas while reducing the hardware cost by replacing some active antennas with low-cost passive reflecting surfaces. In this paper, we present an RDARS-aided uplink multi-user communication system and investigate the system transmission reliability with the newly proposed architecture. Specifically, in addition to the distribution gain and the reflection gain provided by the connection and reflection modes, respectively, we also consider the dynamic mode switching of each element, which introduces an additional degree of freedom (DoF) and thus results in a selection gain. As such, we aim to minimize the total sum mean-square-error (MSE) of all data streams by jointly optimizing the receive beamforming matrix, the reflection phase shifts and the channel-aware placement of elements in the connection mode. To tackle this nonconvex problem with intractable binary and cardinality constraints, we propose an inexact block coordinate descent (BCD) based penalty dual decomposition (PDD) algorithm with guaranteed convergence. Since the PDD algorithm usually suffers from high computational complexity, a low-complexity greedy-search-based alternating optimization (AO) algorithm is developed to yield a semi-closed-form solution with acceptable performance. Numerical results demonstrate the superiority of the proposed architecture compared to the conventional fully passive RIS or DAS. Furthermore, some insights about the practical implementation of RDARS are provided.

Submitted 20 January, 2024; originally announced January 2024.

Comments: 13 pages, 9 figures. This paper has been submitted to IEEE journal for possible publication

arXiv:2401.09455 [pdf, other]

Dynamic Routing for Integrated Satellite-Terrestrial Networks: A Constrained Multi-Agent Reinforcement Learning Approach

Authors: Yifeng Lyu, Han Hu, Rongfei Fan, Zhi Liu, Jian** An, Shiwen Mao

Abstract: The integrated satellite-terrestrial network (ISTN) system has experienced significant growth, offering seamless communication services in remote areas with limited terrestrial infrastructure. However, designing a routing scheme for ISTN is exceedingly difficult, primarily due to the heightened complexity resulting from the inclusion of additional ground stations, along with the requirement to satisfy various constraints related to satellite service quality. To address these challenges, we study packet routing with ground stations and satellites working jointly to transmit packets, while prioritizing fast communication and meeting energy efficiency and packet loss requirements. Specifically, we formulate the problem of packet routing with constraints as a max-min problem using the Lagrange method. Then we propose a novel constrained Multi-Agent reinforcement learning (MARL) dynamic routing algorithm named CMADR, which efficiently balances objective improvement and constraint satisfaction during the updating of policy and Lagrange multipliers. Finally, we conduct extensive experiments and an ablation study using the OneWeb and Telesat mega-constellations. Results demonstrate that CMADR reduces the packet delay by a minimum of 21% and 15%, while meeting stringent energy consumption and packet loss rate constraints, outperforming several baseline algorithms.

Submitted 22 December, 2023; originally announced January 2024.

arXiv:2401.05182 [pdf, other]

Integrated Sensing and Communication with Reconfigurable Distributed Antenna and Reflecting Surface: Joint Beamforming and Mode Selection

Authors: **** Zhang, **tao Wang, Yulin Shao, Shaodan Ma

Abstract: This paper presents a new integrated sensing and communication (ISAC) framework, leveraging the recent advancements of reconfigurable distributed antenna and reflecting surface (RDARS). RDARS is a programmable surface structure comprising numerous elements, each of which can be flexibly configured to operate either in a reflection mode, resembling a passive reconfigurable intelligent surface (RIS), or in a connected mode, functioning as a remote transmit or receive antenna. Our RDARS-aided ISAC framework effectively mitigates the adverse impact of multiplicative fading when compared to the passive RIS-aided ISAC, and reduces cost and energy consumption when compared to the active RIS-aided ISAC. Within our RDARS-aided ISAC framework, we consider a radar output signal-to-noise ratio (SNR) maximization problem under communication constraints to jointly optimize the active transmit beamforming matrix of the base station (BS), the reflection and mode selection matrices of RDARS, and the receive filter. To tackle the inherent non-convexity and the binary integer optimization introduced by the mode selection in this optimization challenge, we propose an efficient iterative algorithm with proven convergence based on majorization minimization (MM) and penalty-based methods. Numerical and simulation results demonstrate the superior performance of our new framework, and clearly verify substantial distribution, reflection as well as selection gains obtained by properly configuring the RDARS.

Submitted 10 January, 2024; originally announced January 2024.

Comments: 13 pages, 9 figures

arXiv:2312.14563 [pdf, other]

AI Generated Signal for Wireless Sensing

Authors: Hanxiang He, Han Hu, Xintao Huan, Heng Liu, Jian** An, Shiwen Mao

Abstract: Deep learning has significantly advanced wireless sensing technology by leveraging substantial amounts of high-quality training data. However, collecting wireless sensing data encounters diverse challenges, including unavoidable data noise, limited data scale due to significant collection overhead, and the necessity to reacquire data in new environments. Taking inspiration from the achievements of AI-generated content, this paper introduces a signal generation method that achieves data denoising, augmentation, and synthesis by disentangling distinct attributes within the signal, such as individual and environment. The approach encompasses two pivotal modules: structured signal selection and signal disentanglement generation. Structured signal selection establishes a minimal signal set with the target attributes for subsequent attribute disentanglement. Signal disentanglement generation disentangles the target attributes and reassembles them to generate novel signals. Extensive experimental results demonstrate that the proposed method can generate data that closely resembles real-world data on two wireless sensing datasets, exhibiting state-of-the-art performance. Our approach presents a robust framework for comprehending and manipulating attribute-specific information in wireless sensing.

Submitted 22 December, 2023; originally announced December 2023.

Comments: 6 pages, 6 figures, published to Globecom2023

arXiv:2312.13045 [pdf, ps, other]

Feasibility Conditions for Mobile LiFi

Authors: Shuai Ma, Haihong Sheng, Junchang Sun, Hang Li, Xiaodong Liu, Chen Qiu, Majid Safari, Naofal Al-Dhahir, Shiyin Li

Abstract: Light fidelity (LiFi) is a potential key technology for future 6G networks. However, its feasibility of supporting mobile communications has not been fundamentally discussed. In this paper, we investigate the time-varying channel characteristics of mobile LiFi based on measured mobile phone rotation and movement data. Specifically, we define LiFi channel coherence time to evaluate the correlation of the channel timing sequence. Then, we derive the expression of LiFi transmission rate based on M-ary pulse-amplitude modulation (M-PAM). The derived rate expression indicates that mobile LiFi communication is feasible by using at least two photodiodes (PDs) with different orientations. Further, we propose two channel estimation schemes and a LiFi channel tracking scheme to improve the communication performance. Finally, our experimental results show that the channel coherence time is on the order of tens of milliseconds, which indicates a relatively stable channel. In addition, based on the measured data, better communication performance can be realized in the multiple-input multiple-output (MIMO) scenario with a rate of 36 Mbit/s, compared to other scenarios. The results also show that the proposed channel estimation and tracking schemes are effective in designing mobile LiFi systems.

Submitted 20 December, 2023; originally announced December 2023.

arXiv:2312.04377 [pdf, other]

HARQ-IR Aided Short Packet Communications: BLER Analysis and Throughput Maximization

Authors: Fuchao He, Zheng Shi, Guanghua Yang, Xiaofan Li, Xinrong Ye, Shaodan Ma

Abstract: This paper introduces hybrid automatic repeat request with incremental redundancy (HARQ-IR) to boost the reliability of short packet communications. The finite blocklength information theory and correlated decoding events greatly complicate the analysis of average block error rate (BLER). Fortunately, the recursive form of average BLER motivates us to calculate its value through the trapezoidal approximation and Gauss-Laguerre quadrature. Moreover, the asymptotic analysis is performed to derive a simple expression for the average BLER at high signal-to-noise ratio (SNR). Then, we study the maximization of long-term average throughput (LTAT) via power allocation while satisfying the power and BLER constraints. For tractability, the asymptotic BLER is employed to solve the problem through geometric programming (GP). However, the GP-based solution underestimates the LTAT at low SNR due to a large approximation error in this case. Alternatively, we also develop a deep reinforcement learning (DRL)-based framework to learn power allocation policy. In particular, the optimization problem is transformed into a constrained Markov decision process, which is solved by integrating deep deterministic policy gradient (DDPG) with subgradient method. The numerical results finally demonstrate that the DRL-based method outperforms the GP-based one at low SNR, albeit at the cost of increasing computational burden.

Submitted 9 January, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: 13 pages, 10 figures

arXiv:2312.04062 [pdf, other]

A Low-Overhead Incorporation-Extrapolation based Few-Shot CSI Feedback Framework for Massive MIMO Systems

Authors: Binggui Zhou, Xi Yang, **tao Wang, Shaodan Ma, Feifei Gao, Guanghua Yang

Abstract: Accurate channel state information (CSI) is essential for downlink precoding in frequency division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems with orthogonal frequency-division multiplexing (OFDM). However, obtaining CSI through feedback from the user equipment (UE) becomes challenging with the increasing scale of antennas and subcarriers and leads to extremely high CSI feedback overhead. Deep learning-based methods have emerged for compressing CSI, but these methods generally require substantial collected samples and thus pose practical challenges. Moreover, existing deep learning methods also suffer from dramatically growing feedback overhead owing to their focus on full-dimensional CSI feedback. To address these issues, we propose a low-overhead Incorporation-Extrapolation based Few-Shot CSI feedback Framework (IEFSF) for massive MIMO systems. An incorporation-extrapolation scheme for eigenvector-based CSI feedback is proposed to reduce the feedback overhead. Then, to alleviate the necessity of extensive collected samples and enable few-shot CSI feedback, we further propose a knowledge-driven data augmentation (KDDA) method and an artificial intelligence-generated content (AIGC)-based data augmentation method by exploiting the domain knowledge of wireless channels and by exploiting a novel generative model, respectively. Experimental results based on the DeepMIMO dataset demonstrate that the proposed IEFSF significantly reduces CSI feedback overhead by 64 times compared with existing methods while maintaining higher feedback accuracy using only several hundred collected samples.

Submitted 21 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: 16 pages, 12 figures, 5 tables. Accepted by IEEE Transactions on Wireless Communications

arXiv:2311.07128 [pdf, other]

doi 10.1109/TVT.2023.3331707

Sum Rate Maximization under AoI Constraints for RIS-Assisted mmWave Communications

Authors: Ziqi Guo, Yong Niu, Shiwen Mao, Changming Zhang, Ning Wang, Zhangdui Zhong, Bo Ai

Abstract: The concept of age of information (AoI) has been proposed to quantify information freshness, which is crucial for time-sensitive applications. However, in millimeter wave (mmWave) communication systems, the link blockage caused by obstacles and the severe path loss greatly impair the freshness of information received by the user equipments (UEs). In this paper, we focus on reconfigurable intelligent surface (RIS)-assisted mmWave communications, where beamforming is performed at transceivers to provide directional beam gain and a RIS is deployed to combat link blockage. We aim to maximize the system sum rate while satisfying the information freshness requirements of UEs by jointly optimizing the beamforming at transceivers, the discrete RIS reflection coefficients, and the UE scheduling strategy. To facilitate a practical solution, we decompose the problem into two subproblems. For the first per-UE data rate maximization problem, we further decompose it into a beamforming optimization subproblem and a RIS reflection coefficient optimization subproblem. Considering the difficulty of channel estimation, we utilize the hierarchical search method for the former and the local search method for the latter, and then adopt the block coordinate descent (BCD) method to alternately solve them. For the second scheduling strategy design problem, a low-complexity heuristic scheduling algorithm is designed. Simulation results show that the proposed algorithm can effectively improve the system sum rate while satisfying the information freshness requirements of all UEs.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.06993 [pdf, other]

State-of-the-Art Review and Synthesis: A Requirement-based Roadmap for Standardized Predictive Maintenance Automation Using Digital Twin Technologies

Authors: Sizhe Ma, Katherine A. Flanigan, Mario Bergés

Abstract: Recent digital advances have popularized predictive maintenance (PMx), offering enhanced efficiency, automation, accuracy, cost savings, and independence in maintenance. Yet, it continues to face numerous limitations such as poor explainability, sample inefficiency of data-driven methods, complexity of physics-based methods, and limited generalizability and scalability of knowledge-based methods. This paper proposes leveraging Digital Twins (DTs) to address these challenges and enable automated PMx adoption at larger scales. While we argue that DTs have this transformative potential, they have not yet reached the level of maturity needed to bridge these gaps in a standardized way. Without a standard definition for such evolution, this transformation lacks a solid foundation upon which to base its development. This paper provides a requirement-based roadmap supporting standardized PMx automation using DT technologies. A systematic approach comprising two primary stages is presented. First, we methodically identify the Informational Requirements (IRs) and Functional Requirements (FRs) for PMx, which serve as a foundation from which any unified framework must emerge. Our approach to defining and using IRs and FRs to form the backbone of any PMx DT is supported by the track record of IRs and FRs being successfully used as blueprints in other areas, such as for product development within the software industry. Second, we conduct a thorough literature review spanning fields to determine the ways in which these IRs and FRs are currently being used within DTs, enabling us to point to the specific areas where further research is warranted to support the progress and maturation of requirement-based PMx DTs.

Submitted 12 November, 2023; originally announced November 2023.

Comments: (1) This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

arXiv:2311.06973 [pdf]

doi 10.35833/MPCE.2023.000432

Analytical Verification of Performance of Deep Neural Network Based Time-Synchronized Distribution System State Estimation

Authors: Behrouz Azimian, Shiva Moshtagh, Anamitra Pal, Shanshan Ma

Abstract: Recently, we demonstrated success of a time-synchronized state estimator using deep neural networks (DNNs) for real-time unobservable distribution systems. In this letter, we provide analytical bounds on the performance of that state estimator as a function of perturbations in the input measurements. It has already been shown that evaluating performance based on only the test dataset might not effectively indicate a trained DNN's ability to handle input perturbations. As such, we analytically verify robustness and trustworthiness of DNNs to input perturbations by treating them as mixed-integer linear programming (MILP) problems. The ability of batch normalization in addressing the scalability limitations of the MILP formulation is also highlighted. The framework is validated by performing time-synchronized distribution system state estimation for a modified IEEE 34-node system and a real-world large distribution system, both of which are incompletely observed by micro-phasor measurement units.

Submitted 22 February, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: 8 pages, in Journal of Modern Power Systems and Clean Energy, 2023

arXiv:2311.01674 [pdf, other]

Integrated Sensing and Communications in Clutter Environment

Authors: Hongliang Luo, Yucong Wang, Dongqi Luo, Jianwei Zhao, Huihui Wu, Shaodan Ma, Feifei Gao

Abstract: In this paper, we propose a practical integrated sensing and communications (ISAC) framework to sense dynamic targets from a clutter environment while ensuring users' communications quality. To implement communications function and sensing function simultaneously, we design multiple communications beams that can communicate with the users as well as one sensing beam that can rotate and scan the entire space. To minimize the interference of the sensing beam on existing communications systems, we divide the service area into sensing beam for sensing (S4S) sector and communications beam for sensing (C4S) sector, and provide beamforming design and power allocation optimization strategies for each type of sector. Unlike most existing ISAC studies that ignore the interference of static environmental clutter on target sensing, we construct a mixed sensing channel model that includes both static environment and dynamic targets. When the base station receives the echo signals, the mean phasor cancellation (MPC) method is employed to filter out the interference from static environmental clutter and to extract the effective dynamic target echoes. Then a complete and practical dynamic target sensing scheme is designed to detect the presence of dynamic targets and to estimate their angles, distances, and velocities. In particular, dynamic target detection and angle estimation are realized through angle-Doppler spectrum estimation (ADSE) and joint detection over multiple subcarriers (MSJD), while distance and velocity estimation are realized through the extended subspace algorithm. Simulation results demonstrate the effectiveness of the proposed scheme and its superiority over the existing methods that ignore environmental clutter.

Submitted 5 February, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.15574 [pdf, other]

3D Multi-Target Localization Via Intelligent Reflecting Surface: Protocol and Analysis

Authors: Meng Hua, Guangji Chen, Kaitao Meng, Shaodan Ma, Chau Yuen, Hing Cheung So

Abstract: With the emerging environment-aware applications, ubiquitous sensing is expected to play a key role in future networks. In this paper, we study a 3-dimensional (3D) multi-target localization system where multiple intelligent reflecting surfaces (IRSs) are applied to create virtual line-of-sight (LoS) links that bypass the base station (BS) and targets. To fully unveil the fundamental limit of IRS for sensing, we first study a single-target-single-IRS case and propose a novel two-stage localization protocol by controlling the on/off state of IRS. To be specific, in the IRS-off stage, we derive the Cramér-Rao bound (CRB) of the azimuth/elevation direction-of-arrival (DoA) of the BS-target link and design a DoA estimator based on the MUSIC algorithm. In the IRS-on stage, the CRB of the azimuth/elevation DoA of the IRS-target link is derived and a simple DoA estimator based on the on-grid IRS beam scanning method is proposed. Particularly, the impact of echo signals reflected by IRS from different paths on sensing performance is analyzed. Moreover, we prove that the single-beam of the IRS is not capable of sensing, but it can be achieved with multi-beam. Based on the two obtained DoAs, the 3D single-target location is constructed. We then extend to the multi-target-multi-IRS case and propose an IRS-adaptive sensing protocol by controlling the on/off state of multiple IRSs, and a multi-target localization algorithm is developed. Simulation results demonstrate the effectiveness of our scheme and show that sub-meter-level positioning accuracy can be achieved.

Submitted 28 February, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

Comments: This paper has been submitted to IEEE journal for possible publication

arXiv:2310.10964 [pdf, other]

Spectral-Efficiency and Energy-Efficiency of Variable-Length XP-HARQ

Authors: Jiahui Feng, Zheng Shi, Yaru Fu, Hong Wang, Guanghua Yang, Shaodan Ma

Abstract: A variable-length cross-packet hybrid automatic repeat request (VL-XP-HARQ) is proposed to boost the spectral efficiency (SE) and the energy efficiency (EE) of communications. The SE is firstly derived in terms of the outage probabilities, with which the SE is proved to be upper bounded by the ergodic capacity (EC). Moreover, to facilitate the maximization of the SE, the asymptotic outage probability is obtained at high signal-to-noise ratio (SNR), with which the SE is maximized by properly choosing the number of new information bits while guaranteeing outage requirement. By applying Dinkelbach's transform, the fractional objective function is transformed into a subtraction form, which can be decomposed into multiple sub-problems through alternating optimization. By noticing that the asymptotic outage probability is a convex function, each sub-problem can be easily relaxed to a convex problem by adopting successive convex approximation (SCA). Besides, the EE of VL-XP-HARQ is also investigated. An upper bound of the EE is found and proved to be attainable. Furthermore, by aiming at maximizing the EE via power allocation while confining outage within a certain constraint, the methods to the maximization of SE are invoked to solve the similar fractional problem. Finally, numerical results are presented for verification.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.01024 [pdf, other]

Joint Source-Channel Coding System for 6G Communication: Design, Prototype and Future Directions

Authors: Xinchao Zhong, Sean Longyu Ma, Hong-fu Chou, Arsham Mostaani, Thang X. Vu, Symeon Chatzinotas

Abstract: The goal of semantic communication is to surpass optimal Shannon's criterion regarding a notable problem for future communication which lies in the integration of collaborative efforts between the intelligence of the transmission source and the joint design of source coding and channel coding. The convergence of scholarly investigation and applicable products in the field of semantic communication is facilitated by the utilization of flexible structural hardware design, which is constrained by the computational capabilities of edge devices. This characteristic represents a significant benefit of joint source-channel coding (JSCC), as it enables the generation of source alphabets with diverse lengths and achieves a code rate of unity. Moreover, JSCC exhibits near-capacity performance while maintaining low complexity. Therefore, we leverage not only quasi-cyclic (QC) characteristics to propose a QC-LDPC code-based JSCC scheme but also Unequal Error Protection (UEP) to ensure the recovery of semantic importance. In this study, the feasibility for using a semantic encoder/decoder that is aware of UEP can be explored based on the existing JSCC system. This approach is aimed at protecting the significance of semantic task-oriented information. Additionally, the deployment of a JSCC system can be facilitated by employing Low-Density Parity-Check (LDPC) codes on a reconfigurable device. This is achieved by reconstructing the LDPC codes as QC-LDPC codes. The QC-LDPC layered decoding technique, which has been specifically optimized for hardware parallelism and tailored for channel decoding applications, can be suitably adapted to accommodate the JSCC system. The performance of the proposed system is evaluated by conducting BER measurements using both floating-point and 6-bit quantization.

Submitted 2 October, 2023; originally announced October 2023.

Comments: 14 pages, 9 figures, Journal

arXiv:2309.11913 [pdf, other]

Spatial-Temporal Transformer based Video Compression Framework

Authors: Yanbo Gao, Wenjia Huang, Shuai Li, Hui Yuan, Mao Ye, Siwei Ma

Abstract: Learned video compression (LVC) has witnessed remarkable advancements in recent years. Similar to traditional video coding, LVC inherits motion estimation/compensation, residual coding and other modules, all of which are implemented with neural networks (NNs). However, within the framework of NNs and its training mechanism using gradient backpropagation, most existing works often struggle to consistently generate stable motion information, which is in the form of geometric features, from the input color features. Moreover, the modules such as the inter-prediction and residual coding are independent from each other, making it inefficient to fully reduce the spatial-temporal redundancy. To address the above problems, in this paper, we propose a novel Spatial-Temporal Transformer based Video Compression (STT-VC) framework. It contains a Relaxed Deformable Transformer (RDT) with Uformer based offsets estimation for motion estimation and compensation, a Multi-Granularity Prediction (MGP) module based on multi-reference frames for prediction refinement, and a Spatial Feature Distribution prior based Transformer (SFD-T) for efficient temporal-spatial joint residual compression. Specifically, RDT is developed to stably estimate the motion information between frames by thoroughly investigating the relationship between the similarity based geometric motion feature extraction and self-attention. MGP is designed to fuse the multi-reference frame information by effectively exploring the coarse-grained prediction feature generated with the coded motion information. SFD-T is to compress the residual information by jointly exploring the spatial feature distributions in both residual and temporal prediction to further reduce the spatial-temporal redundancy. Experimental results demonstrate that our method achieves the best result with 13.5% BD-Rate saving over VTM.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2309.10575 [pdf, ps, other]

AI/ML for Beam Management in 5G-Advanced

Authors: Qing Xue, Jiajia Guo, Binggui Zhou, Yongjun Xu, Zhidu Li, Shaodan Ma

Abstract: In beamformed wireless cellular systems such as 5G New Radio (NR) networks, beam management (BM) is a crucial operation. In the second phase of 5G NR standardization, known as 5G-Advanced, which is being vigorously promoted, the key component is the use of artificial intelligence (AI) based on machine learning (ML) techniques. AI/ML for BM is selected as a representative use case. This article provides an overview of the AI/ML for BM in 5G-Advanced. The legacy non-AI and prime AI-enabled BM frameworks are first introduced and compared. Then, the main scope of AI/ML for BM is presented, including improving accuracy, reducing overhead and latency. Finally, the key challenges and open issues in the standardization of AI/ML for BM are discussed, especially the design of new protocols for AI-enabled BM. This article provides a guideline for the study of AI/ML-based BM standardization.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 4 figures

arXiv:2309.07589 [pdf, other]

MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding

Authors: Chuanmin Jia, Feng Ye, Fanke Dong, Kai Lin, Leonardo Chiariglione, Siwei Ma, Huifang Sun, Wen Gao

Abstract: The rapid advancement of artificial intelligence (AI) technology has led to the prioritization of standardizing the processing, coding, and transmission of video using neural networks. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." The aim of this AI-based video standard project is to compress the number of bits required to represent high-fidelity video data by utilizing data-trained neural coding technologies. This approach is not constrained by how data coding has traditionally been applied in the context of a hybrid framework. This paper presents an overview of recent and ongoing standardization efforts in this area and highlights the key technologies and design philosophy of EEV. It also provides a comparison and report on some primary efforts such as the coding efficiency of the reference model. Additionally, it discusses emerging activities such as learned Unmanned-Aerial-Vehicles (UAVs) video coding which are currently planned, under development, or in the exploration phase. With a focus on UAV video signals, this paper addresses the current status of these preliminary efforts. It also indicates development timelines, summarizes the main technical details, and provides pointers to further points of reference. The exploration experiment shows that the EEV model performs better than the state-of-the-art video coding standard H.266/VVC in terms of perceptual evaluation metric.

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.01426 [pdf, other]

A Unified Framework for Guiding Generative AI with Wireless Perception in Resource Constrained Mobile Edge Networks

Authors: Jiacheng Wang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Deepu Rajan, Shiwen Mao, Xuemin Shen

Abstract: With the significant advancements in artificial intelligence (AI) technologies and powerful computational capabilities, generative AI (GAI) has become a pivotal digital content generation technique for offering superior digital services. However, directing GAI towards desired outputs still suffers from the inherent instability of the AI model. In this paper, we design a novel framework that utilizes wireless perception to guide GAI (WiPe-GAI) for providing digital content generation service, i.e., AI-generated content (AIGC), in resource-constrained mobile edge networks. Specifically, we first propose a new sequential multi-scale perception (SMSP) algorithm to predict user skeleton based on the channel state information (CSI) extracted from wireless signals. This prediction then guides GAI to provide users with AIGC, such as virtual character generation. To ensure the efficient operation of the proposed framework in resource-constrained networks, we further design a pricing-based incentive mechanism and introduce a diffusion model based approach to generate an optimal pricing strategy for the service provisioning. The strategy maximizes the user's utility while enhancing the participation of the virtual service provider (VSP) in AIGC provision. The experimental results demonstrate the effectiveness of the designed framework in terms of skeleton prediction and optimal pricing strategy generation compared with other existing solutions.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2308.12797 [pdf, other]

TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search

Authors: Licheng Wen, Ze Fu, Pinlong Cai, Daocheng Fu, Song Mao, Botian Shi

Abstract: Digital twins for intelligent transportation systems are currently attracting great interest, in which generating realistic, diverse, and human-like traffic flow in simulations is a formidable challenge. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adaptability. In this paper, we introduce TrafficMCTS, an innovative framework that harnesses the synergy of group-based Monte Carlo tree search (MCTS) and Social Value Orientation (SVO) to engender a multifaceted traffic flow replete with varying driving styles and cooperative tendencies. Anchored by a closed-loop architecture, our framework enables vehicles to dynamically adapt to their environment in real time, and ensure feasible collision-free trajectories. Through comprehensive comparisons with state-of-the-art methods, we illuminate the advantages of our approach in terms of computational efficiency, planning success rate, intent completion time, and diversity metrics. Besides, we simulate highway and roundabout scenarios to illustrate the effectiveness of the proposed framework and highlight its ability to induce diverse social behaviors within the traffic flow. Finally, we validate the scalability of TrafficMCTS by showcasing its prowess in simultaneously mass vehicles within a sprawling road network, cultivating a landscape of traffic flow that mirrors the intricacies of human behavior.

Submitted 31 August, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.09349 [pdf, other]

Intelligent Reflecting Surface Aided Multi-Tier Hybrid Computing

Authors: Yapeng Zhao, Qingqing Wu, Guangji Chen, Wen Chen, Ruiqi Liu, Ming-Min Zhao, Yuan Wu, Shaodan Ma

Abstract: The digital twin edge network (DITEN) aims to integrate mobile edge computing (MEC) and digital twin (DT) to provide real-time system configuration and flexible resource allocation for the sixth-generation network. This paper investigates an intelligent reflecting surface (IRS)-aided multi-tier hybrid computing system that can achieve mutual benefits for DT and MEC in the DITEN. For the first time, this paper presents the opportunity to realize the network-wide convergence of DT and MEC. In the considered system, specifically, over-the-air computation (AirComp) is employed to monitor the status of the DT system, while MEC is performed with the assistance of DT to provide low-latency computing services. Besides, the IRS is utilized to enhance signal transmission and mitigate interference among heterogeneous nodes. We propose a framework for designing the hybrid computing system, aiming to maximize the sum computation rate under communication and computation resources constraints. To tackle the non-convex optimization problem, alternative optimization and successive convex approximation techniques are leveraged to decouple variables and then transform the problem into a more tractable form. Simulation results verify the effectiveness of the proposed algorithm and demonstrate the IRS can significantly improve the system performance with appropriate phase shift configurations. Moreover, the results indicate that the DT assisted MEC system can precisely achieve the balance between local computing and task offloading since real-time system status can be obtained with the help of DT.

Submitted 25 October, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

arXiv:2308.07991 [pdf, other]

Demo: Reconfigurable Distributed Antennas and Reflecting Surface (RDARS)-aided Integrated Sensing and Communication System

Authors: **tao Wang, Chengwang Ji, Jiajia Guo, Shaodan Ma

Abstract: Integrated sensing and communication (ISAC) system has been envisioned as a promising technology to be applied in future applications requiring both communication and high-accuracy sensing. Different from most research focusing on theoretical analysis and optimization in the area of ISAC, we implement a reconfigurable distributed antennas and reflecting surfaces (RDARS)-aided ISAC system prototype to achieve the dual-functionalities with the communication signal. A RDARS, composed of programmable elements capable of switching between reflection mode and connected mode, is introduced to assist in uplink signal transmission and sensing. The developed RDARS-aided ISAC prototype achieves reliable user localization without compromising the communication rate, showcasing its potential for future 6G systems.

Submitted 15 August, 2023; originally announced August 2023.

Comments: 2 pages, 3 figures. Accepted by IEEE/CIC International Conference on Communications in China, Dalian, China, 2023

arXiv:2308.05862 [pdf, other]

Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: the FLARE22 Challenge

Authors: Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Shihao Ma, Adamo Young, Cheng Zhu, Kangkang Meng, Xin Yang, Ziyan Huang, Fan Zhang, Wentao Liu, YuanKe Pan, Shou** Huang, Jiacheng Wang, Mingze Sun, Weixin Xu, Dengqiang Jia, Jae Won Choi, Natália Alves, Bram de Wilde, Gregor Koehler, Yajun Wu, Manuel Wiesenfarth, Qiongjie Zhu , et al. (4 additional authors not shown)

Abstract: Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations, we organized the FLARE 2022 Challenge, the largest abdominal organ analysis challenge to date, to benchmark fast, low-resource, accurate, annotation-efficient, and generalized AI algorithms. We constructed an intercontinental and multinational dataset from more than 50 medical groups, including Computed Tomography (CT) scans with different races, diseases, phases, and manufacturers. We independently validated that a set of AI algorithms achieved a median Dice Similarity Coefficient (DSC) of 90.0% by using 50 labeled scans and 2000 unlabeled scans, which can significantly reduce annotation requirements. The best-performing algorithms successfully generalized to holdout external validation sets, achieving a median DSC of 89.5%, 90.9%, and 88.3% on North American, European, and Asian cohorts, respectively. They also enabled automatic extraction of key organ biology features, which was labor-intensive with traditional manual measurements. This opens the potential to use unlabeled data to boost performance and alleviate annotation shortages for modern AI models.

Submitted 10 August, 2023; originally announced August 2023.

Comments: MICCAI FLARE22: https://flare22.grand-challenge.org/

arXiv:2308.02140 [pdf, ps, other]

Deep Reinforcement Learning Empowered Rate Selection of XP-HARQ

Authors: Da Wu, Jiahui Feng, Zheng Shi, Hongjiang Lei, Guanghua Yang, Shaodan Ma

Abstract: The complex transmission mechanism of cross-packet hybrid automatic repeat request (XP-HARQ) hinders its optimal system design. To overcome this difficulty, this letter attempts to use the deep reinforcement learning (DRL) to solve the rate selection problem of XP-HARQ over correlated fading channels. In particular, the long term average throughput (LTAT) is maximized by properly choosing the incremental information rate for each HARQ round on the basis of the outdated channel state information (CSI) available at the transmitter. The rate selection problem is first converted into a Markov decision process (MDP), which is then solved by capitalizing on the algorithm of deep deterministic policy gradient (DDPG) with prioritized experience replay. The simulation results finally corroborate the superiority of the proposed XP-HARQ scheme over the conventional HARQ with incremental redundancy (HARQ-IR) and the XP-HARQ with only statistical CSI.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2308.02135 [pdf, ps, other]

doi 10.1109/COMST.2024.3361991

A Survey of Beam Management for mmWave and THz Communications Towards 6G

Authors: Qing Xue, Chengwang Ji, Shaodan Ma, Jiajia Guo, Yongjun Xu, Qianbin Chen, Wei Zhang

Abstract: Communication in millimeter wave (mmWave) and even terahertz (THz) frequency bands is ushering in a new era of wireless communications. Beam management, namely initial access and beam tracking, has been recognized as an essential technique to ensure robust mmWave/THz communications, especially for mobile scenarios. However, narrow beams at higher carrier frequency lead to huge beam measurement overhead, which has a negative impact on beam acquisition and tracking. In addition, the beam management process is further complicated by the fluctuation of mmWave/THz channels, the random movement patterns of users, and the dynamic changes in the environment. For mmWave and THz communications toward 6G, we have witnessed a substantial increase in research and industrial attention on artificial intelligence (AI), reconfigurable intelligent surface (RIS), and integrated sensing and communications (ISAC). The introduction of these enabling technologies presents both open opportunities and unique challenges for beam management. In this paper, we present a comprehensive survey on mmWave and THz beam management. Further, we give some insights on technical challenges and future research directions in this promising area.

Submitted 6 February, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

Comments: accepted by IEEE Communications Surveys & Tutorials

arXiv:2308.02131 [pdf, other]

Graph Convolutional Network Enabled Power-Constrained HARQ Strategy for URLLC

Authors: Yi Chen, Zheng Shi, Hong Wang, Yaru Fu, Guanghua Yang, Shaodan Ma, Haichuan Ding

Abstract: In this paper, a power-constrained hybrid automatic repeat request (HARQ) transmission strategy is developed to support ultra-reliable low-latency communications (URLLC). In particular, we aim to minimize the delivery latency of HARQ schemes over time-correlated fading channels, meanwhile ensuring the high reliability and limited power consumption. To ease the optimization, the simple asymptotic outage expressions of HARQ schemes are adopted. Furthermore, by noticing the non-convexity of the latency minimization problem and the intricate connection between different HARQ rounds, the graph convolutional network (GCN) is invoked for the optimal power solution owing to its powerful ability of handling the graph data. The primal-dual learning method is then leveraged to train the GCN weights. Consequently, the numerical results are presented for verification together with the comparisons among three HARQ schemes in terms of the latency and the reliability, where the three HARQ schemes include Type-I HARQ, HARQ with chase combining (HARQ-CC), and HARQ with incremental redundancy (HARQ-IR). To recapitulate, it is revealed that HARQ-IR offers the lowest latency while guaranteeing the demanded reliability target under a stringent power constraint, albeit at the price of high coding complexity.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2307.08265 [pdf, other]

Extreme Image Compression using Fine-tuned VQGANs

Authors: Qi Mao, Tinghan Yang, Yinuo Zhang, Zijian Wang, Meng Wang, Shiqi Wang, Siwei Ma

Abstract: Recent advances in generative compression methods have demonstrated remarkable progress in enhancing the perceptual quality of compressed data, especially in scenarios with low bitrates. However, their efficacy and applicability to achieve extreme compression ratios ($<0.05$ bpp) remain constrained. In this work, we propose a simple yet effective coding framework by introducing vector quantization (VQ)--based generative models into the image compression domain. The main insight is that the codebook learned by the VQGAN model yields a strong expressive capacity, facilitating efficient compression of continuous information in the latent space while maintaining reconstruction quality. Specifically, an image can be represented as VQ-indices by finding the nearest codeword, which can be encoded using lossless compression methods into bitstreams. We propose clustering a pre-trained large-scale codebook into smaller codebooks through the K-means algorithm, yielding variable bitrates and different levels of reconstruction quality within the coding framework. Furthermore, we introduce a transformer to predict lost indices and restore images in unstable environments. Extensive qualitative and quantitative experiments on various benchmark datasets demonstrate that the proposed framework outperforms state-of-the-art codecs in terms of perceptual quality-oriented metrics and human perception at extremely low bitrates ($\le 0.04$ bpp). Remarkably, even with the loss of up to $20\%$ of indices, the images can be effectively restored with minimal perceptual loss.

Submitted 15 December, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: Generative Compression, Extreme Compression, VQGANs, Low Bitrate

arXiv:2307.06648 [pdf, other]

LimSim: A Long-term Interactive Multi-scenario Traffic Simulator

Authors: Licheng Wen, Daocheng Fu, Song Mao, Pinlong Cai, Min Dou, Yikang Li, Yu Qiao

Abstract: With the growing popularity of digital twin and autonomous driving in transportation, the demand for simulation systems capable of generating high-fidelity and reliable scenarios is increasing. Existing simulation systems suffer from a lack of support for different types of scenarios, and the vehicle models used in these systems are too simplistic. Thus, such systems fail to represent driving styles and multi-vehicle interactions, and struggle to handle corner cases in the dataset. In this paper, we propose LimSim, the Long-term Interactive Multi-scenario traffic Simulator, which aims to provide a long-term continuous simulation capability under the urban road network. LimSim can simulate fine-grained dynamic scenarios and focus on the diverse interactions between multiple vehicles in the traffic flow. This paper provides a detailed introduction to the framework and features of the LimSim, and demonstrates its performance through case studies and experiments. LimSim is now open source on GitHub: https://www.github.com/PJLab-ADG/LimSim .

Submitted 26 July, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

Comments: Accepted by 26th IEEE International Conference on Intelligent Transportation Systems (ITSC 2023)

arXiv:2307.05362 [pdf, other]

SleepEGAN: A GAN-enhanced Ensemble Deep Learning Model for Imbalanced Classification of Sleep Stages

Authors: Xuewei Cheng, Ke Huang, Yi Zou, Shujie Ma

Abstract: Deep neural networks have played an important role in automatic sleep stage classification because of their strong representation and in-model feature transformation abilities. However, class imbalance and individual heterogeneity which typically exist in raw EEG signals of sleep data can significantly affect the classification performance of any machine learning algorithms. To solve these two problems, this paper develops a generative adversarial network (GAN)-powered ensemble deep learning model, named SleepEGAN, for the imbalanced classification of sleep stages. To alleviate class imbalance, we propose a new GAN (called EGAN) architecture adapted to the features of EEG signals for data augmentation. The generated samples for the minority classes are used in the training process. In addition, we design a cost-free ensemble learning strategy to reduce the model estimation variance caused by the heterogeneity between the validation and test sets, so as to enhance the accuracy and robustness of prediction performance. We show that the proposed method can improve classification accuracy compared to several existing state-of-the-art methods using three public sleep datasets.

Submitted 3 July, 2023; originally announced July 2023.

Comments: 20 pages, 6 figures

arXiv:2307.05361 [pdf, other]

A Physics-Informed Low-Shot Learning For sEMG-Based Estimation of Muscle Force and Joint Kinematics

Authors: Yue Shi, Shuhao Ma, Yihui Zhao, Zhiqiang Zhang

Abstract: Muscle force and joint kinematics estimation from surface electromyography (sEMG) are essential for real-time biomechanical analysis of the dynamic interplay among neural muscle stimulation, muscle dynamics, and kinetics. Recent advances in deep neural networks (DNNs) have shown the potential to improve biomechanical analysis in a fully automated and reproducible manner. However, the small sample nature and physical interpretability of biomechanical analysis limit the applications of DNNs. This paper presents a novel physics-informed low-shot learning method for sEMG-based estimation of muscle force and joint kinematics. This method seamlessly integrates Lagrange's equation of motion and inverse dynamic muscle model into the generative adversarial network (GAN) framework for structured feature decoding and extrapolated estimation from the small sample data. Specifically, Lagrange's equation of motion is introduced into the generative model to restrain the structured decoding of the high-level features following the laws of physics. And a physics-informed policy gradient is designed to improve the adversarial learning efficiency by rewarding the consistent physical representation of the extrapolated estimations and the physical references. Experimental validations are conducted on two scenarios (i.e. the walking trials and wrist motion trials). Results indicate that the estimations of the muscle forces and joint kinematics are unbiased compared to the physics-based inverse dynamics, which outperforms the selected benchmark methods, including physics-informed convolution neural network (PI-CNN), vanilla generative adversarial network (GAN), and multi-layer extreme learning machine (ML-ELM).

Submitted 8 July, 2023; originally announced July 2023.

Comments: 17 pages, 8 Figures

Showing 1–50 of 181 results for author: Mao, S