Search | arXiv e-print repository

doi 10.1109/TPAMI.2024.3400041

Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation

Authors: Jingyuan Xia, Zhixiong Yang, Shengxi Li, Shuanghui Zhang, Yaowen Fu, Deniz Gündüz, Xiang Li

Abstract: Learning-based approaches have witnessed great successes in blind single image super-resolution (SISR) tasks; however, handcrafted kernel priors and learning-based kernel priors are typically required. In this paper, we propose a Meta-learning and Markov Chain Monte Carlo (MCMC) based SISR approach to learn kernel priors from organized randomness. Concretely, a lightweight network is adopted as the kernel generator and is optimized via learning from the MCMC simulation on random Gaussian distributions. This procedure provides an approximation for the rational blur kernel and introduces network-level Langevin dynamics into the SISR optimization process, which helps prevent bad local optima in kernel estimation. Meanwhile, a meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer, respectively. In contrast to the conventional alternating minimization strategy, a meta-learning-based framework is applied to learn an adaptive optimization strategy, which is less greedy and results in better convergence performance. These two procedures are iteratively processed in a plug-and-play fashion, realizing for the first time a learning-based but plug-and-play blind SISR solution in unsupervised inference. Extensive simulations demonstrate the superior performance and generalization ability of the proposed approach compared with state-of-the-art methods on synthetic and real-world datasets. The code is available at https://github.com/XYLGroup/MLMC.

Submitted 13 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence (2024)

arXiv:2405.15969 [pdf, other]

Massive Digital Over-the-Air Computation for Communication-Efficient Federated Edge Learning

Authors: Li Qiao, Zhen Gao, Mahdi Boloursaz Mashhadi, Deniz Gündüz

Abstract: Over-the-air computation (AirComp) is a promising technology converging communication and computation over wireless networks, which can be particularly effective in model training, inference, and more emerging edge intelligence applications. AirComp relies on uncoded transmission of individual signals, which are added naturally over the multiple access channel thanks to the superposition property of the wireless medium. Despite significantly improved communication efficiency, how to accommodate AirComp in existing and future digital communication networks, which are based on discrete modulation schemes, remains a challenge. This paper proposes a massive digital AirComp (MD-AirComp) scheme that leverages an unsourced massive access protocol to enhance compatibility with both current and next-generation wireless networks. MD-AirComp utilizes vector quantization to reduce the uplink communication overhead, and employs shared quantization and modulation codebooks. At the receiver, we propose a near-optimal approximate message passing-based algorithm to compute the model aggregation results from the superposed sequences, which relies on estimating the number of devices transmitting each code sequence, rather than trying to decode the messages of individual transmitters. We apply MD-AirComp to federated edge learning (FEEL), and show that it significantly accelerates FEEL convergence compared to the state of the art while using the same amount of communication resources.
To support further research and ensure reproducibility, we have made our code available at https://github.com/liqiao19/MD-AirComp.

Submitted 24 May, 2024; originally announced May 2024.

Comments: To be published in the IEEE Journal on Selected Areas in Communications

arXiv:2405.09698 [pdf, other]

A Deep Joint Source-Channel Coding Scheme for Hybrid Mobile Multi-hop Networks

Authors: Chenghong Bian, Yulin Shao, Deniz Gündüz

Abstract: Efficient data transmission across mobile multi-hop networks that connect edge devices to core servers presents significant challenges, particularly due to the variability in link qualities between wireless and wired segments. This variability necessitates a robust transmission scheme that transcends the limitations of existing deep joint source-channel coding (DeepJSCC) strategies, which often struggle at the intersection of analog and digital methods. Addressing this need, this paper introduces a novel hybrid DeepJSCC framework, h-DJSCC, tailored for effective image transmission from edge devices through a network architecture that includes initial wireless transmission followed by multiple wired hops. Our approach harnesses the strengths of DeepJSCC for the initial, variable-quality wireless link to avoid the cliff effect inherent in purely digital schemes. For the subsequent wired hops, which feature more stable and high-capacity connections, we implement digital compression and forwarding techniques to prevent noise accumulation. This dual-mode strategy is adaptable even in scenarios with limited knowledge of the image distribution, enhancing the framework's robustness and utility. Extensive numerical simulations demonstrate that our hybrid solution outperforms traditional fully digital approaches by effectively managing transitions between different network segments and optimizing for variable signal-to-noise ratios (SNRs).
We also introduce a fully adaptive h-DJSCC architecture capable of adjusting to different network conditions and achieving diverse rate-distortion objectives, thereby reducing the memory requirements on network nodes.

Submitted 15 May, 2024; originally announced May 2024.

Comments: Submitted to possible IEEE journal

arXiv:2403.20237 [pdf, other]

Evolving Semantic Communication with Generative Model

Authors: Shunpu Tang, Qianqian Yang, Deniz Gündüz, Zhaoyang Zhang

Abstract: Recently, learning-based semantic communication (SemCom) has emerged as a promising approach in the upcoming 6G network and researchers have made remarkable efforts in this field. However, existing works have yet to fully explore the advantages of the evolving nature of learning-based systems, where knowledge accumulated during transmission has the potential to enhance system performance. In this paper, we explore an evolving semantic communication system for image transmission, referred to as ESemCom, with the capability to continuously enhance transmission efficiency. The system features a novel channel-aware semantic encoder that utilizes a pre-trained Semantic StyleGAN to extract the channel-correlated latent variables consisting of several semantic vectors from the input images, which can be directly transmitted over a noisy channel without further channel coding. Moreover, we introduce a semantic caching mechanism that dynamically stores the transmitted semantic vectors in the local caching memory of both the transmitter and receiver. The cached semantic vectors are then exploited to eliminate the need to transmit similar codes in subsequent transmissions, thus further reducing communication overhead. Simulation results highlight the evolving performance of the proposed system in terms of transmission efficiency, achieving superior perceptual quality with an average bandwidth compression ratio (BCR) of 1/192 for a sequence of 100 testing images compared to DeepJSCC and Inverse JSCC with the same BCR.
The code for this paper is available at https://github.com/recusant7/GAN_SeCom.

Submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.13615 [pdf, other]

MIMO Channel as a Neural Function: Implicit Neural Representations for Extreme CSI Compression in Massive MIMO Systems

Authors: Haotian Wu, Maojun Zhang, Yulin Shao, Krystian Mikolajczyk, Deniz Gündüz

Abstract: Acquiring and utilizing accurate channel state information (CSI) can significantly improve transmission performance, thereby holding a crucial role in realizing the potential advantages of massive multiple-input multiple-output (MIMO) technology. Current prevailing CSI feedback approaches improve precision by employing advanced deep-learning methods to learn representative CSI features for a subsequent compression process. Diverging from previous works, we treat the CSI compression problem in the context of implicit neural representations. Specifically, each CSI matrix is viewed as a neural function that maps the CSI coordinates (antenna number and subchannel) to the corresponding channel gains. Instead of transmitting the parameters of the implicit neural functions directly, we transmit modulations based on the CSI matrix derived through a meta-learning algorithm. Modulations are then applied to a shared base network to generate the elements of the CSI matrix. Modulations corresponding to the CSI matrix are quantized and entropy-coded to further reduce the communication bandwidth, thus achieving extreme CSI compression ratios. Numerical results show that our proposed approach achieves state-of-the-art performance and showcases flexibility in feedback strategies.

Submitted 20 March, 2024; originally announced March 2024.

MSC Class: 94A24 ACM Class: E.4

arXiv:2403.10613 [pdf, other]

Process-and-Forward: Deep Joint Source-Channel Coding Over Cooperative Relay Networks

Authors: Chenghong Bian, Yulin Shao, Haotian Wu, Emre Ozfatura, Deniz Gunduz

Abstract: This paper introduces an innovative deep joint source-channel coding (DeepJSCC) approach to image transmission over a cooperative relay channel. The relay either amplifies and forwards a scaled version of its received signal, referred to as DeepJSCC-AF, or leverages neural networks to extract relevant features about the source signal before forwarding it to the destination, which we call DeepJSCC-PF (Process-and-Forward). In the full-duplex scheme, inspired by the block Markov coding (BMC) concept, we introduce a novel block transmission strategy built upon a novel vision transformer architecture. In the proposed scheme, the source transmits information in blocks, and the relay updates its knowledge about the input signal after each block and generates its own signal to be conveyed to the destination. To enhance practicality, we introduce an adaptive transmission model, which allows a single trained DeepJSCC model to adapt seamlessly to various channel qualities, making it a versatile solution. Simulation results demonstrate the superior performance of our proposed DeepJSCC compared to the state-of-the-art BPG image compression algorithm, even when operating at the maximum achievable rate of conventional decode-and-forward and compress-and-forward protocols, for both half-duplex and full-duplex relay scenarios.

Submitted 15 March, 2024; originally announced March 2024.

Comments: Submitted for possible IEEE journal

arXiv:2402.08934 [pdf, other]

Extreme Video Compression with Pre-trained Diffusion Models

Authors: Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz

Abstract: Diffusion models have achieved remarkable success in generating high quality image and video data. More recently, they have also been used for image compression with high perceptual quality. In this paper, we present a novel approach to extreme video compression leveraging the predictive power of diffusion-based generative models at the decoder. The conditional diffusion model takes several neural compressed frames and generates subsequent frames. When the reconstruction quality drops below the desired level, new frames are encoded to restart prediction. The entire video is sequentially encoded to achieve a visually pleasing reconstruction, considering perceptual quality metrics such as the learned perceptual image patch similarity (LPIPS) and the Frechet video distance (FVD), at bit rates as low as 0.02 bits per pixel (bpp). Experimental results demonstrate the effectiveness of the proposed scheme compared to standard codecs such as H.264 and H.265 in the low bpp regime. The results showcase the potential of exploiting the temporal relations in video data using generative models. Code is available at: https://github.com/ElesionKyrie/Extreme-Video-Compression-With-Prediction-Using-Pre-trainded-Diffusion-Models-

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.07573 [pdf, other]

Goal-Oriented and Semantic Communication in 6G AI-Native Networks: The 6G-GOALS Approach

Authors: Emilio Calvanese Strinati, Paolo Di Lorenzo, Vincenzo Sciancalepore, Adnan Aijaz, Marios Kountouris, Deniz Gündüz, Petar Popovski, Mohamed Sana, Photios A. Stavrou, Beatriz Soret, Nicola Cordeschi, Simone Scardapane, Mattia Merluzzi, Lanfranco Zanzi, Mauro Boldi Renato, Tony Quek, Nicola di Pietro, Olivier Forceville, Francesca Costanzo, Peizheng Li

Abstract: Recent advances in AI technologies have notably expanded device intelligence, fostering federation and cooperation among distributed AI agents. These advancements impose new requirements on future 6G mobile network architectures. To meet these demands, it is essential to transcend classical boundaries and integrate communication, computation, control, and intelligence. This paper presents the 6G-GOALS approach to goal-oriented and semantic communications for AI-Native 6G Networks. The proposed approach incorporates semantic, pragmatic, and goal-oriented communication into AI-native technologies, aiming to facilitate information exchange between intelligent agents in a more relevant, effective, and timely manner, thereby optimizing bandwidth, latency, energy, and electromagnetic field (EMF) radiation. The focus is on distilling data to its most relevant form and terse representation, aligning with the source's intent or the destination's objectives and context, or serving a specific goal. 6G-GOALS builds on three fundamental pillars: i) AI-enhanced semantic data representation, sensing, compression, and communication, ii) foundational AI reasoning and causal semantic data representation, contextual relevance, and value for goal-oriented effectiveness, and iii) sustainability enabled by more efficient wireless services. Finally, we illustrate two proofs of concept implementing semantic, goal-oriented, and pragmatic communication principles in near-future use cases. Our study covers the project's vision, methodologies, and potential impact.

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.03886 [pdf, other]

Full-Duplex Millimeter Wave MIMO Channel Estimation: A Neural Network Approach

Authors: Mehdi Sattari, Hao Guo, Deniz Gündüz, Ashkan Panahi, Tommy Svensson

Abstract: Millimeter wave (mmWave) multiple-input multiple-output (MIMO) is now a reality with great potential for further improvement. We study full-duplex transmissions as an effective way to improve mmWave MIMO systems. Compared to half-duplex systems, full-duplex transmissions may offer higher data rates and lower latency. However, full-duplex transmission is hindered by self-interference (SI) at the receive antennas, and SI channel estimation becomes a crucial step to make the full-duplex systems feasible. In this paper, we address the problem of channel estimation in full-duplex mmWave MIMO systems using neural networks (NNs). Our approach involves sharing pilot resources between user equipments (UEs) and transmit antennas at the base station (BS), aiming to reduce the pilot overhead in full-duplex systems and to achieve a comparable level to that of a half-duplex system. Additionally, in the case of separate antenna configurations in a full-duplex BS, providing channel estimates of transmit antenna (TX) arrays to the downlink UEs poses another challenge, as the TX arrays are not capable of receiving pilot signals. To address this, we employ an NN to map the channel from the downlink UEs to the receive antenna (RX) arrays to the channel from the TX arrays to the downlink UEs.
We further elaborate on how NNs perform the estimation with different architectures (e.g., different numbers of hidden layers), the introduction of non-linear distortion (e.g., with a 1-bit analog-to-digital converter (ADC)), and different channel conditions (e.g., low-correlated and high-correlated channels). Our work provides novel insights into NN-based channel estimators.

Submitted 18 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.17999 [pdf, ps, other]

Remote Estimation of Markov Processes over Costly Channels: On the Benefits of Implicit Information

Authors: Edoardo David Santi, Touraj Soleymani, Deniz Gunduz

Abstract: In this paper, we study the remote estimation problem of a Markov process over a channel with a cost. We formulate this problem as an infinite horizon optimization problem with two players, i.e., a sensor and a monitor, that have distinct information, and with a reward function that takes into account both the communication cost and the estimation quality. We show that the main challenge in solving this problem is associated with the consideration of implicit information, i.e., information that the monitor can obtain about the source when the sensor is silent. Our main objective is to develop a framework for finding solutions to this problem without neglecting implicit information a priori. To that end, we propose three different algorithms. The first one is an alternating policy algorithm that converges to a Nash equilibrium. The second one is an occupancy-state algorithm that is guaranteed to find a globally optimal solution. The last one is a heuristic algorithm that is able to find a near-optimal solution.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.00658 [pdf, other]

Point Cloud in the Air

Authors: Yulin Shao, Chenghong Bian, Li Yang, Qianqian Yang, Zhaoyang Zhang, Deniz Gunduz

Abstract: Acquisition and processing of point clouds (PCs) is a crucial enabler for many emerging applications reliant on 3D spatial data, such as robot navigation, autonomous vehicles, and augmented reality. In most scenarios, PCs acquired by remote sensors must be transmitted to an edge server for fusion, segmentation, or inference. Wireless transmission of PCs not only puts an increased burden on the already congested wireless spectrum, but also confronts a unique set of challenges arising from the irregular and unstructured nature of PCs. In this paper, we meticulously delineate these challenges and offer a comprehensive examination of existing solutions while candidly acknowledging their inherent limitations. In response to these intricacies, we proffer four pragmatic solution frameworks, spanning advanced techniques, hybrid schemes, and distributed data aggregation approaches. In doing so, our goal is to chart a path toward efficient, reliable, and low-latency wireless PC transmission.

Submitted 31 December, 2023; originally announced January 2024.

arXiv:2311.07028 [pdf, other]

A Hybrid Joint Source-Channel Coding Scheme for Mobile Multi-hop Networks

Authors: Chenghong Bian, Yulin Shao, Deniz Gunduz

Abstract: We propose a novel hybrid joint source-channel coding (JSCC) scheme for robust image transmission over multi-hop networks. In the considered scenario, a mobile user wants to deliver an image to its destination over a mobile cellular network. We assume a practical setting, where the links between the nodes belonging to the mobile core network are stable and of high quality, while the link between the mobile user and the first node (e.g., the access point) is potentially time-varying with poorer quality. In recent years, neural network based JSCC schemes (called DeepJSCC) have emerged as promising solutions to overcome the limitations of separation-based fully digital schemes. However, relying on analog transmission, DeepJSCC suffers from noise accumulation over multi-hop networks. Moreover, most of the hops within the mobile core network may be high-capacity wireless connections, calling for digital approaches. To this end, we propose a hybrid solution, where DeepJSCC is adopted for the first hop, while the received signal at the first relay is digitally compressed and forwarded through the mobile core network. We show through numerical simulations that the proposed scheme is able to outperform both the fully analog and fully digital schemes. Thanks to DeepJSCC, it can avoid the cliff effect over the first hop, while also avoiding noise forwarding over the mobile core network thanks to digital transmission. We believe this work paves the way for the practical deployment of DeepJSCC solutions in 6G and future wireless networks.

Submitted 7 February, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: Accepted to IEEE International Conference on Communications (ICC), 2024. Source code will be released soon

arXiv:2310.02559 [pdf, ps, other]

doi 10.1109/TWC.2023.3270908

Semi-Federated Learning: Convergence Analysis and Optimization of A Hybrid Learning Framework

Authors: Jingheng Zheng, Wanli Ni, Hui Tian, Deniz Gunduz, Tony Q. S. Quek, Zhu Han

Abstract: Under the organization of the base station (BS), wireless federated learning (FL) enables collaborative model training among multiple devices. However, the BS is merely responsible for aggregating local updates during the training process, which incurs a waste of the computational resource at the BS. To tackle this issue, we propose a semi-federated learning (SemiFL) paradigm to leverage the computing capabilities of both the BS and devices for a hybrid implementation of centralized learning (CL) and FL. Specifically, each device sends both local gradients and data samples to the BS for training a shared global model. To improve communication efficiency over the same time-frequency resources, we integrate over-the-air computation for aggregation and non-orthogonal multiple access for transmission by designing a novel transceiver structure. To gain deep insights, we conduct convergence analysis by deriving a closed-form optimality gap for SemiFL and extend the result to two extra cases. In the first case, the BS uses all accumulated data samples to calculate the CL gradient, while a decreasing learning rate is adopted in the second case. Our analytical results capture the destructive effect of wireless communication and show that both FL and CL are special cases of SemiFL. Then, we formulate a non-convex problem to reduce the optimality gap by jointly optimizing the transmit power and receive beamformers. Accordingly, we propose a two-stage algorithm to solve this intractable problem, in which we provide the closed-form solutions to the beamformers.
Extensive simulation results on two real-world datasets corroborate our theoretical analysis, and show that the proposed SemiFL outperforms conventional FL and achieves a 3.2% accuracy gain on the MNIST dataset compared to state-of-the-art benchmarks.

Submitted 3 October, 2023; originally announced October 2023.

Comments: This paper has been accepted by IEEE Transactions on Wireless Communications

arXiv:2310.01130 [pdf, other]

CommIN: Semantic Image Communications as an Inverse Problem with INN-Guided Diffusion Models

Authors: Jiakang Chen, Di You, Deniz Gündüz, Pier Luigi Dragotti

Abstract: Joint source-channel coding schemes based on deep neural networks (DeepJSCC) have recently achieved remarkable performance for wireless image transmission. However, these methods usually focus only on the distortion of the reconstructed signal at the receiver side with respect to the source at the transmitter side, rather than the perceptual quality of the reconstruction which carries more semantic information. As a result, severe perceptual distortion can be introduced under extreme conditions such as low bandwidth and low signal-to-noise ratio. In this work, we propose CommIN, which views the recovery of high-quality source images from degraded reconstructions as an inverse problem. To address this, CommIN combines Invertible Neural Networks (INN) with diffusion models, aiming for superior perceptual quality. Through experiments, we show that our CommIN significantly improves the perceptual quality compared to DeepJSCC under extreme conditions and outperforms other inverse problem approaches used in DeepJSCC.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.15889 [pdf, other]

High Perceptual Quality Wireless Image Delivery with Denoising Diffusion Models

Authors: Selim F. Yilmaz, Xueyan Niu, Bo Bai, Wei Han, Lei Deng, Deniz Gunduz

Abstract: We consider the image transmission problem over a noisy wireless channel via deep learning-based joint source-channel coding (DeepJSCC) along with a denoising diffusion probabilistic model (DDPM) at the receiver. Specifically, we are interested in the perception-distortion trade-off in the practical finite block length regime, in which separate source and channel coding can be highly suboptimal. We introduce a novel scheme that utilizes the range-null space decomposition of the target image. We transmit the range-space of the image after encoding and employ DDPM to progressively refine its null space contents. Through extensive experiments, we demonstrate significant improvements in distortion and perceptual quality of reconstructed images compared to standard DeepJSCC and the state-of-the-art generative learning-based method. We will publicly share our source code to facilitate further research and reproducibility.

Submitted 27 September, 2023; originally announced September 2023.

Comments: 6 pages, 4 figures

arXiv:2309.00470 [pdf, other]

Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels

Authors: Haotian Wu, Yulin Shao, Chenghong Bian, Krystian Mikolajczyk, Deniz Gündüz

Abstract: This paper introduces a vision transformer (ViT)-based deep joint source and channel coding (DeepJSCC) scheme for wireless image transmission over multiple-input multiple-output (MIMO) channels, denoted as DeepJSCC-MIMO. We consider DeepJSCC-MIMO for adaptive image transmission in both open-loop and closed-loop MIMO systems. The novel DeepJSCC-MIMO architecture surpasses the classical separation-based benchmarks with robustness to channel estimation errors and showcases remarkable flexibility in adapting to diverse channel conditions and antenna numbers without requiring retraining. Specifically, by harnessing the self-attention mechanism of ViT, DeepJSCC-MIMO intelligently learns feature mapping and power allocation strategies tailored to the unique characteristics of the source image and prevailing channel conditions. Extensive numerical experiments validate the significant improvements in transmission quality achieved by DeepJSCC-MIMO for both open-loop and closed-loop MIMO systems across a wide range of scenarios. Moreover, DeepJSCC-MIMO exhibits robustness to varying channel conditions, channel estimation errors, and different antenna numbers, making it an appealing solution for emerging semantic communication systems.

Submitted 7 May, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: arXiv admin note: text overlap with arXiv:2210.15347

MSC Class: 94A24 ACM Class: E.4

arXiv:2308.08244 [pdf, other]

A Hybrid Wireless Image Transmission Scheme with Diffusion

Authors: Xueyan Niu, Xu Wang, Deniz Gündüz, Bo Bai, Weichao Chen, Guohua Zhou

Abstract: We propose a hybrid joint source-channel coding (JSCC) scheme, in which the conventional digital communication scheme is complemented with a generative refinement component to improve the perceptual quality of the reconstruction. The input image is decomposed into two components: the first is a coarse compressed version, and is transmitted following the conventional separation based approach. An additional component is obtained through the diffusion process by adding independent Gaussian noise to the input image, and is transmitted using DeepJSCC. The decoder combines the two signals to produce a high quality reconstruction of the source. Experimental results show that the hybrid design provides bandwidth savings and enables graceful performance improvement as the channel quality improves.

Submitted 16 August, 2023; originally announced August 2023.

arXiv:2308.02892 [pdf, ps, other]

Secure Deep-JSCC Against Multiple Eavesdroppers

Authors: Seyyed Amirhossein Ameli Kalkhoran, Mehdi Letafati, Ecenaz Erdemir, Babak Hossein Khalaj, Hamid Behroozi, Deniz Gündüz

Abstract: In this paper, a generalization of the deep learning-aided joint source channel coding (Deep-JSCC) approach to secure communications is studied. We propose an end-to-end (E2E) learning-based approach for secure communication against multiple eavesdroppers over complex-valued fading channels. Both scenarios of colluding and non-colluding eavesdroppers are studied. For the colluding strategy, eavesdroppers share their logits to collaboratively infer private attributes based on an ensemble learning method, while for the non-colluding setup they act alone. The goal is to prevent eavesdroppers from inferring private (sensitive) information about the transmitted images, while delivering the images to a legitimate receiver with minimum distortion. By generalizing the ideas of privacy funnel and wiretap channel coding, the trade-off between the image recovery at the legitimate node and the information leakage to the eavesdroppers is characterized. To solve this secrecy funnel framework, we implement deep neural networks (DNNs) to realize a data-driven secure communication scheme, without relying on a specific data distribution. Simulations over the CIFAR-10 dataset verify the secrecy-utility trade-off. The adversarial accuracy of eavesdroppers is also studied over Rayleigh fading, Nakagami-m, and AWGN channels to verify the generalization of the proposed scheme. Our experiments show that employing the proposed secure neural encoding can decrease the adversarial accuracy by 28%.

Submitted 5 August, 2023; originally announced August 2023.

arXiv:2306.17580 [pdf, other]

Timely and Massive Communication in 6G: Pragmatics, Learning, and Inference

Authors: Deniz Gündüz, Federico Chiariotti, Kaibin Huang, Anders E. Kalør, Szymon Kobus, Petar Popovski

Abstract: 5G has expanded the traditional focus of wireless systems to embrace two new connectivity types: ultra-reliable low latency and massive communication. The technology context at the dawn of 6G is different from the past one for 5G, primarily due to the growing intelligence at the communicating nodes. This has driven the set of relevant communication problems beyond reliable transmission towards semantic and pragmatic communication. This paper puts the evolution of low-latency and massive communication towards 6G in the perspective of these new developments. At first, semantic/pragmatic communication problems are presented by drawing parallels to linguistics. We elaborate upon the relation of semantic communication to the information-theoretic problems of source/channel coding, while generalized real-time communication is put in the context of cyber-physical systems and real-time inference. The evolution of massive access towards massive closed-loop communication is elaborated upon, enabling interactive communication, learning, and cooperation among wireless sensors and actuators.

Submitted 26 September, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

Comments: Submitted for publication to IEEE BITS (revised version preprint)

arXiv:2306.09101 [pdf, other]

Transformer-aided Wireless Image Transmission with Channel Feedback

Authors: Haotian Wu, Yulin Shao, Emre Ozfatura, Krystian Mikolajczyk, Deniz Gündüz

Abstract: This paper presents a novel wireless image transmission paradigm that can exploit feedback from the receiver, called DeepJSCC-ViT-f. We consider a block feedback channel model, where the transmitter receives noiseless/noisy channel output feedback after each block. The proposed scheme employs a single encoder to facilitate transmission over multiple blocks, refining the receiver's estimation at each block. Specifically, the unified encoder of DeepJSCC-ViT-f can leverage the semantic information from the source image, and acquire channel state information and the decoder's current belief about the source image from the feedback signal to generate coded symbols at each block. Numerical experiments show that our DeepJSCC-ViT-f scheme achieves state-of-the-art transmission performance with robustness to noise in the feedback link. Additionally, DeepJSCC-ViT-f can adapt to the channel condition directly through feedback without the need for separate channel estimation. We further extend the scope of the DeepJSCC-ViT-f approach to include the broadcast channel, which enables the transmitter to generate broadcast codes in accordance with signal semantics and channel feedback from individual receivers.

Submitted 14 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

MSC Class: 94A24 ACM Class: E.4

arXiv:2306.08730 [pdf, other]

Wireless Point Cloud Transmission

Authors: Chenghong Bian, Yulin Shao, Deniz Gunduz

Abstract: 3D point cloud is a three-dimensional data format generated by LiDARs and depth sensors, and is being increasingly used in a large variety of applications. This paper presents a novel solution called SEmantic Point cloud Transmission (SEPT), for the transmission of point clouds over wireless channels with limited bandwidth. At the transmitter, SEPT encodes the point cloud via an iterative downsampling and feature extraction process. At the receiver, SEPT reconstructs the point cloud with latent reconstruction and offset-based upsampling. Extensive numerical experiments confirm that SEPT significantly outperforms the standard approach with octree-based compression followed by channel coding. Compared with a more advanced benchmark that utilizes state-of-the-art deep learning-based compression techniques, SEPT achieves comparable performance while eliminating the cliff and leveling effects. Thanks to its improved performance and robustness against channel variations, we believe that SEPT can be instrumental in collaborative sensing and inference applications among robots and vehicles, particularly in the low-latency and high-mobility scenarios.

Submitted 14 June, 2023; originally announced June 2023.

Comments: 7 pages

arXiv:2306.02726 [pdf, ps, other]

Learning-Based Rich Feedback HARQ for Energy-Efficient Short Packet Transmission

Authors: Martin Voigt Vejling, Federico Chiariotti, Anders Ellersgaard Kalør, Deniz Gündüz, Gianluigi Liva, Petar Popovski

Abstract: The trade-off between reliability, latency, and energy-efficiency is a central problem in communication systems. Advanced hybrid automated repeat request (HARQ) techniques can reduce the number of retransmissions required for reliable communication, but they have a significant computational cost. On the other hand, strict energy constraints apply mainly to devices, while the access point receiving their packets is usually connected to the electrical grid. Therefore, moving the computational complexity required for HARQ schemes from the transmitter to the receiver may provide a way to overcome this trade-off. To achieve this, we propose the Reinforcement-based Adaptive Feedback (RAF) scheme, in which the receiver adaptively learns how much additional redundancy it requires to decode a packet and sends rich feedback (i.e., more than a single bit), requesting the coded retransmission of specific symbols. Simulation results show that the RAF scheme achieves a better trade-off between energy-efficiency, reliability, and latency, compared to existing HARQ solutions and a fixed threshold-based policy. Our RAF scheme can easily adapt to different modulation schemes, and since it relies on the posterior probabilities of the codeword symbols at the decoder, it can generalize to different channel statistics.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2306.00659 [pdf, other]

Do not Interfere but Cooperate: A Fully Learnable Code Design for Multi-Access Channels with Feedback

Authors: Emre Ozfatura, Chenghong Bian, Deniz Gunduz

Abstract: Data-driven deep learning based code designs, including low-complexity neural decoders for existing codes, or end-to-end trainable auto-encoders have exhibited impressive results, particularly in scenarios for which we do not have high-performing structured code designs. However, the vast majority of existing data-driven solutions for channel coding focus on a point-to-point scenario. In this work, we consider a multiple access channel (MAC) with feedback and try to understand whether deep learning-based designs are capable of enabling coordination and cooperation among the encoders as well as allowing error correction. Simulation results show that the proposed multi-access block attention feedback (MBAF) code improves the upper bound of the achievable rate of MAC without feedback in finite block length regime.

Submitted 1 June, 2023; originally announced June 2023.

Comments: 5 pages

arXiv:2305.14094 [pdf, other]

Sustainable Edge Intelligence Through Energy-Aware Early Exiting

Authors: Marcello Bullo, Seifallah Jardak, Pietro Carnelli, Deniz Gündüz

Abstract: Deep learning (DL) models have emerged as a promising solution for the Internet of Things (IoT). However, due to their computational complexity, DL models consume significant amounts of energy, which can rapidly drain the battery and compromise the performance of IoT devices. For sustainable operation, we consider an edge device with a rechargeable battery and energy harvesting (EH) capabilities. In addition to the stochastic nature of the ambient energy source, the harvesting rate is often insufficient to meet the inference energy requirements, leading to drastic performance degradation in energy-agnostic devices. To mitigate this problem, we propose energy-adaptive dynamic early exiting (EE) to enable efficient and accurate inference in an EH edge intelligence system. Our approach derives an energy-aware EE policy that determines the optimal amount of computational processing on a per-sample basis. The proposed policy balances the energy consumption to match the limited incoming energy and achieves continuous availability. Numerical results show that accuracy and service rate are improved up to 25% and 35%, respectively, in comparison with an energy-agnostic policy.

Submitted 16 July, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: 6 pages, accepted at IEEE MLSP 2023

arXiv:2305.13161 [pdf, ps, other]

DeepJSCC-l++: Robust and Bandwidth-Adaptive Wireless Image Transmission

Authors: Chenghong Bian, Yulin Shao, Deniz Gunduz

Abstract: This paper presents a novel vision transformer (ViT) based deep joint source channel coding (DeepJSCC) scheme, dubbed DeepJSCC-l++, which can be adaptive to multiple target bandwidth ratios as well as different channel signal-to-noise ratios (SNRs) using a single model. To achieve this, we train the proposed DeepJSCC-l++ model with different bandwidth ratios and SNRs, which are fed to the model as side information. The reconstruction losses corresponding to different bandwidth ratios are calculated, and a new training methodology is proposed, which dynamically assigns different weights to the losses of different bandwidth ratios according to their individual reconstruction qualities. A shifted window (Swin) transformer is adopted as the backbone for our DeepJSCC-l++ model. Through extensive simulations, it is shown that the proposed DeepJSCC-l++ and successive refinement models can adapt to different bandwidth ratios and channel SNRs with marginal performance loss compared to the separately trained models. We also observe that the proposed schemes can outperform the digital baseline, which concatenates the BPG compression with capacity-achieving channel code.

Submitted 30 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted to IEEE Global Communications Conference 2023. Code available at https://github.com/aprilbian/deepjscc-lplusplus

arXiv:2305.10609 [pdf, other]

Unsourced Massive Access-Based Digital Over-the-Air Computation for Efficient Federated Edge Learning

Authors: Li Qiao, Zhen Gao, Zhongxiang Li, Deniz Gündüz

Abstract: Over-the-air computation (OAC) is a promising technique to achieve fast model aggregation across multiple devices in federated edge learning (FEEL). In addition to the analog schemes, the one-bit digital aggregation (OBDA) scheme was proposed to adapt OAC to modern digital wireless systems. However, one-bit quantization in OBDA can result in a serious information loss and slower convergence of FEEL. To overcome this limitation, this paper proposes an unsourced massive access (UMA)-based generalized digital OAC (GD-OAC) scheme. Specifically, at the transmitter, all the devices share the same non-orthogonal UMA codebook for uplink transmission. The local model update of each device is quantized based on the same quantization codebook. Then, each device transmits a sequence selected from the UMA codebook based on the quantized elements of its model update. At the receiver, we propose an approximate message passing-based algorithm for efficient UMA detection and model aggregation. Simulation results show that the proposed GD-OAC scheme significantly accelerates the FEEL convergence compared with the state-of-the-art OBDA scheme while using the same uplink communication resources.

Submitted 17 May, 2023; originally announced May 2023.

Comments: 2023 IEEE International Symposium on Information Theory (ISIT)

arXiv:2304.08221 [pdf, other]

Features-over-the-Air: Contrastive Learning Enabled Cooperative Edge Inference

Authors: Haotian Wu, Nitish Mital, Krystian Mikolajczyk, Deniz Gündüz

Abstract: We study the collaborative image retrieval problem at the wireless edge, where multiple edge devices capture images of the same object, which are then used jointly to retrieve similar images at the edge server over a shared multiple access channel. We propose a semantic non-orthogonal multiple access (NOMA) communication paradigm, in which extracted features from each device are mapped directly to channel inputs, which are then added over-the-air. We propose a novel contrastive learning (CL)-based semantic communication (CL-SC) paradigm, aiming to exploit signal correlations to maximize the retrieval accuracy under a total bandwidth constraint. Specifically, we treat noisy correlated signals as different augmentations of a common identity, and propose a cross-view CL algorithm to optimize the correlated signals in a coarse-to-fine fashion to improve retrieval accuracy. Extensive numerical experiments verify that our method achieves the state-of-the-art performance and can significantly improve retrieval accuracy, with particularly significant gains in low signal-to-noise ratio (SNR) and limited bandwidth regimes.

Submitted 17 April, 2023; originally announced April 2023.

MSC Class: 94A24 ACM Class: E.4

arXiv:2302.08447 [pdf, other]

Graph Neural Networks over the Air for Decentralized Tasks in Wireless Networks

Authors: Zhan Gao, Deniz Gunduz

Abstract: Graph neural networks (GNNs) model representations from networked data and allow for decentralized inference through localized communications. Existing GNN architectures often assume ideal communications and ignore potential channel effects, such as fading and noise, leading to performance degradation in real-world implementation. Considering a GNN implemented over nodes connected through wireless links, this paper conducts a stability analysis to study the impact of channel impairments on the performance of GNNs, and proposes graph neural networks over the air (AirGNNs), a novel GNN architecture that incorporates the communication model. AirGNNs modify graph convolutional operations that shift graph signals over random communication graphs to take into account channel fading and noise when aggregating features from neighbors, thus, improving architecture robustness to channel impairments during testing. We develop a channel-inversion signal transmission strategy for AirGNNs when channel state information (CSI) is available, and propose a stochastic gradient descent based method to train AirGNNs when CSI is unknown. The convergence analysis shows that the training procedure approaches a stationary solution of an associated stochastic optimization problem and the variance analysis characterizes the statistical behavior of the trained model. Experiments on decentralized source localization and multi-robot flocking corroborate theoretical findings and show superior performance of AirGNNs over wireless communication channels.

Submitted 21 May, 2024; v1 submitted 16 February, 2023; originally announced February 2023.

arXiv:2301.03996 [pdf, other]

Collaborative Semantic Communication for Edge Inference

Authors: Wing Fei Lo, Nitish Mital, Haotian Wu, Deniz Gündüz

Abstract: We study the collaborative image retrieval problem at the wireless edge, where multiple edge devices capture images of the same object from different angles and locations, which are then used jointly to retrieve similar images at the edge server over a shared multiple access channel (MAC). We propose two novel deep learning-based joint source and channel coding (JSCC) schemes for the task over both additive white Gaussian noise (AWGN) and Rayleigh slow fading channels, with the aim of maximizing the retrieval accuracy under a total bandwidth constraint. The proposed schemes are evaluated on a wide range of channel signal-to-noise ratios (SNRs), and shown to outperform the single-device JSCC and the separation-based multiple-access benchmarks. We also propose two novel SNR-aware JSCC schemes with attention modules to improve the performance in the case of channel mismatch between training and test instances.

Submitted 12 February, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

MSC Class: 94A24 ACM Class: E.4

arXiv:2211.13772 [pdf, other]

Generative Joint Source-Channel Coding for Semantic Image Transmission

Authors: Ecenaz Erdemir, Tze-Yang Tung, Pier Luigi Dragotti, Deniz Gunduz

Abstract: Recent works have shown that joint source-channel coding (JSCC) schemes using deep neural networks (DNNs), called DeepJSCC, provide promising results in wireless image transmission. However, these methods mostly focus on the distortion of the reconstructed signals with respect to the input image, rather than their perception by humans. Focusing on traditional distortion metrics alone does not necessarily result in high perceptual quality, especially in extreme physical conditions, such as very low bandwidth compression ratio (BCR) and low signal-to-noise ratio (SNR) regimes. In this work, we propose two novel JSCC schemes that leverage the perceptual quality of deep generative models (DGMs) for wireless image transmission, namely InverseJSCC and GenerativeJSCC. While the former is an inverse problem approach to DeepJSCC, the latter is an end-to-end optimized JSCC scheme. In both, we optimize a weighted sum of mean squared error (MSE) and learned perceptual image patch similarity (LPIPS) losses, which capture more semantic similarities than other distortion metrics. InverseJSCC performs denoising on the distorted reconstructions of a DeepJSCC model by solving an inverse optimization problem using style-based generative adversarial network (StyleGAN). Our simulation results show that InverseJSCC significantly improves the state-of-the-art (SotA) DeepJSCC in terms of perceptual quality in edge cases. In GenerativeJSCC, we carry out end-to-end training of an encoder and a StyleGAN-based decoder, and show that GenerativeJSCC significantly outperforms DeepJSCC both in terms of distortion and perceptual quality.

Submitted 24 November, 2022; originally announced November 2022.

Comments: 12 pages, 9 figures

arXiv:2211.09920 [pdf, other]

Distributed Deep Joint Source-Channel Coding over a Multiple Access Channel

Authors: Selim F. Yilmaz, Can Karamanli, Deniz Gunduz

Abstract: We consider distributed image transmission over a noisy multiple access channel (MAC) using deep joint source-channel coding (DeepJSCC). It is known that Shannon's separation theorem holds when transmitting independent sources over a MAC in the asymptotic infinite block length regime. However, we are interested in the practical finite block length regime, in which case separate source and channel coding is known to be suboptimal. We introduce a novel joint image compression and transmission scheme, where the devices send their compressed image representations in a non-orthogonal manner. While non-orthogonal multiple access (NOMA) is known to achieve the capacity region, to the best of our knowledge, non-orthogonal joint source channel coding (JSCC) scheme for practical systems has not been studied before. Through extensive experiments, we show significant improvements in terms of the quality of the reconstructed images compared to orthogonal transmission employing current DeepJSCC approaches particularly for low bandwidth ratios. We publicly share source code to facilitate further research and reproducibility.

Submitted 2 March, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

Comments: To appear in IEEE International Conference on Communications (ICC) 2023

arXiv:2211.08747 [pdf, other]

Deep Joint Source-Channel Coding for Semantic Communications

Authors: Jialong Xu, Tze-Yang Tung, Bo Ai, Wei Chen, Yuxuan Sun, Deniz Gunduz

Abstract: Semantic communications is considered as a promising technology to increase the efficiency of next-generation communication systems, particularly targeting human-machine and machine-type communications. In contrast to the source-agnostic approach of conventional wireless communication systems, semantic communication seeks to ensure that only the relevant information for the underlying task is communicated to the receiver. Considering that most semantic communication applications have strict latency, bandwidth, and power constraints, a prominent approach is to model them as a joint source-channel coding (JSCC) problem. Although JSCC has been a long-standing open problem in communication and coding theory, remarkable performance gains have been shown recently over existing separate source and channel coding systems, particularly in low-latency and low-power scenarios. Recent progress is thanks to the adoption of deep learning techniques for joint source-channel code design that outperform the concatenation of state-of-the-art compression and channel coding schemes, which are results of decades-long research efforts. In this article, we present an adaptive deep learning based JSCC (DeepJSCC) architecture for semantic communications, introduce its design principles, highlight its benefits, and outline future research challenges that lie ahead.

Submitted 18 July, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

Comments: 7 pages, 6 figures

arXiv:2211.06705 [pdf, ps, other]

Deep Joint Source-Channel Coding Over Cooperative Relay Networks

Authors: Chenghong Bian, Yulin Shao, Haotian Wu, Deniz Gunduz

Abstract: This paper presents a novel deep joint source-channel coding (DeepJSCC) scheme for image transmission over a half-duplex cooperative relay channel. Specifically, we apply DeepJSCC to two basic modes of cooperative communications, namely amplify-and-forward (AF) and decode-and-forward (DF). In DeepJSCC-AF, the relay simply amplifies and forwards its received signal. In DeepJSCC-DF, on the other hand, the relay first reconstructs the transmitted image and then re-encodes it before forwarding. Considering the excessive computation overhead of DeepJSCC-DF for recovering the image at the relay, we propose an alternative scheme, called DeepJSCC-PF, in which the relay processes and forwards its received signal without necessarily recovering the image. Simulation results show that the proposed DeepJSCC-AF, DF, and PF schemes are superior to the digital baselines with BPG compression and polar codes, and provide a graceful performance degradation with deteriorating channel quality. Further investigation shows that the PSNR gain of DeepJSCC-DF/PF over DeepJSCC-AF improves as the channel condition between the source and relay improves. Moreover, the DeepJSCC-PF scheme achieves a similar performance to DeepJSCC-DF with lower computational complexity.

Submitted 18 March, 2024; v1 submitted 12 November, 2022; originally announced November 2022.

Comments: Accepted to IEEE International Conference on Machine Learning for Communication and Networking (ICMLCN) 2024, code available via this link https://github.com/aprilbian/Relay_JSCC

arXiv:2211.01730 [pdf, other]

Feedback is Good, Active Feedback is Better: Block Attention Active Feedback Codes

Authors: Emre Ozfatura, Yulin Shao, Amin Ghazanfari, Alberto Perotti, Branislav Popovic, Deniz Gunduz

Abstract: Deep neural network (DNN)-assisted channel coding designs, such as low-complexity neural decoders for existing codes, or end-to-end neural-network-based auto-encoder designs, have been gaining interest recently due to their improved performance and flexibility, particularly for communication scenarios in which high-performing structured code designs do not exist. Communication in the presence of feedback is one such communication scenario, and practical code design for feedback channels has remained an open challenge in coding theory for many decades. Recently, DNN-based designs have shown impressive results in exploiting feedback. In particular, generalized block attention feedback (GBAF) codes, which utilize the popular transformer architecture, achieved significant improvement in terms of the block error rate (BLER) performance. However, previous works have focused mainly on passive feedback, where the transmitter observes a noisy version of the signal at the receiver. In this work, we show that GBAF codes can also be used for channels with active feedback. We implement a pair of transformer architectures, at the transmitter and the receiver, which interact with each other sequentially, and achieve a new state-of-the-art BLER performance, especially in the low SNR regime.

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2210.16985 [pdf, other]

Space-time design for deep joint source channel coding of images over MIMO channels

Authors: Chenghong Bian, Yulin Shao, Haotian Wu, Deniz Gunduz

Abstract: We propose novel deep joint source-channel coding (DeepJSCC) algorithms for wireless image transmission over multi-input multi-output (MIMO) Rayleigh fading channels, when channel state information (CSI) is available only at the receiver. We consider two different schemes; one exploiting the spatial diversity and the other exploiting the spatial multiplexing gain of the MIMO channel, respectively. For the former, we utilize an orthogonal space-time block code (OSTBC) to achieve full diversity and increase the robustness against channel variations. In the latter, we directly map the input to the antennas, where the additional degree of freedom can be used to send more information about the source signal. Simulation results show that the diversity scheme outperforms the multiplexing scheme for lower signal-to-noise ratio (SNR) values and a smaller number of receive antennas at the AP. When the number of transmit antennas is greater than two, however, the full-diversity scheme becomes less beneficial. We also show that both the diversity and multiplexing schemes can achieve comparable performance with the state-of-the-art BPG algorithm delivered at the instantaneous capacity of the MIMO channel, which serves as an upper bound on the performance of separation-based practical systems.

Submitted 20 June, 2023; v1 submitted 30 October, 2022; originally announced October 2022.

Comments: Accepted to SPAWC 2023, 5 pages

arXiv:2210.15347 [pdf, other]

Vision Transformer for Adaptive Image Transmission over MIMO Channels

Authors: Haotian Wu, Yulin Shao, Chenghong Bian, Krystian Mikolajczyk, Deniz Gündüz

Abstract: This paper presents a vision transformer (ViT) based joint source and channel coding (JSCC) scheme for wireless image transmission over multiple-input multiple-output (MIMO) systems, called ViT-MIMO. The proposed ViT-MIMO architecture, in addition to outperforming separation-based benchmarks, can flexibly adapt to different channel conditions without requiring retraining. Specifically, exploiting the self-attention mechanism of the ViT enables the proposed ViT-MIMO model to adaptively learn the feature mapping and power allocation based on the source image and channel conditions. Numerical experiments show that ViT-MIMO can significantly improve the transmission quality across a large variety of scenarios, including varying channel conditions, making it an attractive solution for emerging semantic communication systems.

Submitted 27 October, 2022; originally announced October 2022.

MSC Class: 94A24 ACM Class: E.4

arXiv:2210.13279 [pdf, other]

A Meta-Learning Based Gradient Descent Algorithm for MU-MIMO Beamforming

Authors: Jing-Yuan Xia, Zhixiong Yang, Tong Qiu, Huaizhang Liao, Deniz Gunduz

Abstract: Multi-user multiple-input multiple-output (MU-MIMO) beamforming design is typically formulated as a non-convex weighted sum rate (WSR) maximization problem that is known to be NP-hard. This problem is solved either by iterative algorithms, which suffer from slow convergence, or more recently by using deep learning tools, which require a time-consuming pre-training process. In this paper, we propose a low-complexity meta-learning based gradient descent algorithm. A meta network with lightweight architecture is applied to learn an adaptive gradient descent update rule to directly optimize the beamformer. This lightweight network is trained during the iterative optimization process, which we refer to as "training while solving", and which removes both the training process and the data-dependency of existing deep learning based solutions. Extensive simulations show that the proposed method achieves superior WSR performance compared to existing learning-based approaches as well as the conventional WMMSE algorithm, while enjoying much lower computational load.

Submitted 27 October, 2022; v1 submitted 24 October, 2022; originally announced October 2022.

arXiv:2209.15340 [pdf, other]

A Learnable Optimization and Regularization Approach to Massive MIMO CSI Feedback

Authors: Zhengyang Hu, Guanzhang Liu, Qi Xie, Jiang Xue, Deyu Meng, Deniz Gunduz

Abstract: Channel state information (CSI) plays a critical role in achieving the potential benefits of massive multiple input multiple output (MIMO) systems. In frequency division duplex (FDD) massive MIMO systems, the base station (BS) relies on sustained and accurate CSI feedback from the users. However, due to the large number of antennas and users being served in massive MIMO systems, feedback overhead can become a bottleneck. In this paper, we propose a model-driven deep learning method for CSI feedback, called learnable optimization and regularization algorithm (LORA). Instead of using l1-norm as the regularization term, a learnable regularization module is introduced in LORA to automatically adapt to the characteristics of CSI. We unfold the conventional iterative shrinkage-thresholding algorithm (ISTA) to a neural network and learn both the optimization process and regularization term by end-to-end training. We show that LORA improves the CSI feedback accuracy and speed. Besides, a novel learnable quantization method and the corresponding training scheme are proposed, and it is shown that LORA can operate successfully at different bit rates, providing flexibility in terms of the CSI feedback overhead. Various realistic scenarios are considered to demonstrate the effectiveness and robustness of LORA through numerical simulations.

Submitted 30 September, 2022; originally announced September 2022.

arXiv:2208.09245 [pdf, other]

Deep Joint Source-Channel and Encryption Coding: Secure Semantic Communications

Authors: Tze-Yang Tung, Deniz Gunduz

Abstract: Deep learning driven joint source-channel coding (JSCC) for wireless image or video transmission, also called DeepJSCC, has been a topic of interest recently with very promising results. The idea is to map similar source samples to nearby points in the channel input space such that, despite the noise introduced by the channel, the input can be recovered with minimal distortion. In DeepJSCC, this is achieved by an autoencoder architecture with a non-trainable channel layer between the encoder and decoder. DeepJSCC has many favorable properties, such as better end-to-end distortion performance than its separate source and channel coding counterpart as well as graceful degradation with respect to channel quality. However, due to the inherent correlation between the source sample and channel input, DeepJSCC is vulnerable to eavesdropping attacks. In this paper, we propose the first DeepJSCC scheme for wireless image transmission that is secure against eavesdroppers, called DeepJSCEC. DeepJSCEC not only preserves the favorable properties of DeepJSCC, it also provides security against chosen-plaintext attacks from the eavesdropper, without the need to make assumptions about the eavesdropper's channel condition, or its intended use of the intercepted signal. Numerical results show that DeepJSCEC achieves similar or better image quality than separate source coding using BPG compression, AES encryption, and LDPC codes for channel coding, while preserving the graceful degradation of image quality with respect to channel quality. We also show that the proposed encryption method is problem agnostic, meaning it can be applied to other end-to-end JSCC problems, such as remote classification, without modification. Given the importance of security in modern wireless communication systems, we believe this work brings DeepJSCC schemes much closer to adoption in practice.

Submitted 31 August, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

arXiv:2208.08342 [pdf, other]

Semantic Communications with Discrete-time Analog Transmission: A PAPR Perspective

Authors: Yulin Shao, Deniz Gunduz

Abstract: Recent progress in deep learning (DL)-based joint source-channel coding (DeepJSCC) has led to a new paradigm of semantic communications. Two salient features of DeepJSCC-based semantic communications are the exploitation of semantic-aware features directly from the source signal, and the discrete-time analog transmission (DTAT) of these features. Compared with traditional digital communications, semantic communications with DeepJSCC provide superior reconstruction performance at the receiver and graceful degradation with diminishing channel quality, but also exhibit a large peak-to-average power ratio (PAPR) in the transmitted signal. An open question has been whether the gains of DeepJSCC come from the additional freedom brought by the high-PAPR continuous-amplitude signal. In this paper, we address this question by exploring three PAPR reduction techniques in the application of image transmission. We confirm that the superior image reconstruction performance of DeepJSCC-based semantic communications can be retained while the transmitted PAPR is suppressed to an acceptable level. This observation is an important step towards the implementation of DeepJSCC in practical semantic communication systems.

Submitted 29 December, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

Comments: Keywords: semantic communication, DeepJSCC, discrete-time analog transmission, PAPR

arXiv:2207.08489 [pdf, other]

Neural Distributed Image Compression with Cross-Attention Feature Alignment

Authors: Nitish Mital, Ezgi Ozyilkan, Ali Garjani, Deniz Gunduz

Abstract: We consider the problem of compressing an information source when a correlated one is available as side information only at the decoder side, which is a special case of the distributed source coding problem in information theory. In particular, we consider a pair of stereo images, which have overlapping fields of view, and are captured by a synchronized and calibrated pair of cameras as correlated image sources. In previously proposed methods, the encoder transforms the input image to a latent representation using a deep neural network, and compresses the quantized latent representation losslessly using entropy coding. The decoder decodes the entropy-coded quantized latent representation, and reconstructs the input image using this representation and the available side information. In the proposed method, the decoder employs a cross-attention module to align the feature maps obtained from the received latent representation of the input image and a latent representation of the side information. We argue that aligning the correlated patches in the feature maps allows better utilization of the side information. We empirically demonstrate the competitiveness of the proposed algorithm on KITTI and Cityscape datasets of stereo image pairs. Our experimental results show that the proposed architecture is able to exploit the decoder-only side information in a more efficient manner compared to previous works.

Submitted 5 January, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: 16 pages, 15 figures, presented in WACV 2023

arXiv:2207.03605 [pdf, other]

Learning-based Autonomous Channel Access in the Presence of Hidden Terminals

Authors: Yulin Shao, Yucheng Cai, Taotao Wang, Ziyang Guo, Peng Liu, Jiajun Luo, Deniz Gunduz

Abstract: We consider the problem of autonomous channel access (AutoCA), where a group of terminals tries to discover a communication strategy with an access point (AP) via a common wireless channel in a distributed fashion. Due to the irregular topology and the limited communication range of terminals, a practical challenge for AutoCA is the hidden terminal problem, which is notorious in wireless networks for deteriorating the throughput and delay performances. To meet the challenge, this paper presents a new multi-agent deep reinforcement learning paradigm, dubbed MADRL-HT, tailored for AutoCA in the presence of hidden terminals. MADRL-HT exploits topological insights and transforms the observation space of each terminal into a scalable form independent of the number of terminals. To compensate for the partial observability, we put forth a look-back mechanism such that the terminals can infer behaviors of their hidden terminals from the carrier sensed channel states as well as feedback from the AP. A window-based global reward function is proposed, whereby the terminals are instructed to maximize the system throughput while balancing the terminals' transmission opportunities over the course of learning. Extensive numerical experiments verified the superior performance of our solution benchmarked against the legacy carrier-sense multiple access with collision avoidance (CSMA/CA) protocol.

Submitted 2 December, 2022; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: Keywords: multiple channel access, hidden terminal, multi-agent deep reinforcement learning, Wi-Fi, proximal policy optimization

arXiv:2206.10499 [pdf, other]

doi 10.1109/LWC.2022.3186160

A Learning Aided Flexible Gradient Descent Approach to MISO Beamforming

Authors: Zhixiong Yang, Jing-Yuan Xia, Junshan Luo, Shuanghui Zhang, Deniz Gündüz

Abstract: This paper proposes a learning aided gradient descent (LAGD) algorithm to solve the weighted sum rate (WSR) maximization problem for multiple-input single-output (MISO) beamforming. The proposed LAGD algorithm directly optimizes the transmit precoder through implicit gradient descent based iterations, at each of which the optimization strategy is determined by a neural network, and thus, is dynamic and adaptive. At each instance of the problem, this network is initialized randomly, and updated throughout the iterative solution process. Therefore, the LAGD algorithm can be implemented at any signal-to-noise ratio (SNR) and for arbitrary antenna/user numbers, and does not require labelled data or training prior to deployment. Numerical results show that the LAGD algorithm can outperform the well-known WMMSE algorithm as well as other learning-based solutions with a modest computational complexity. Our code is available at https://github.com/XiaGroup/LAGD.

Submitted 25 July, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

Journal ref: IEEE Wireless Communications Letters, 2022

arXiv:2206.09457 [pdf, other]

All you need is feedback: Communication with block attention feedback codes

Authors: Emre Ozfatura, Yulin Shao, Alberto Perotti, Branislav Popovic, Deniz Gunduz

Abstract: Deep learning based channel code designs have recently gained interest as an alternative to conventional coding algorithms, particularly for channels for which existing codes do not provide effective solutions. Communication over a feedback channel is one such problem, for which promising results have recently been obtained by employing various deep learning architectures. In this paper, we introduce a novel learning-aided code design for feedback channels, called generalized block attention feedback (GBAF) codes, which i) employs a modular architecture that can be implemented using different neural network architectures; ii) provides order-of-magnitude improvements in the probability of error compared to existing designs; and iii) can transmit at desired code rates.

Submitted 5 October, 2022; v1 submitted 19 June, 2022; originally announced June 2022.

arXiv:2206.08100 [pdf, other]

DeepJSCC-Q: Constellation Constrained Deep Joint Source-Channel Coding

Authors: Tze-Yang Tung, David Burth Kurka, Mikolaj Jankowski, Deniz Gunduz

Abstract: Recent works have shown that modern machine learning techniques can provide an alternative approach to the long-standing joint source-channel coding (JSCC) problem. Very promising initial results, superior to popular digital schemes that utilize separate source and channel codes, have been demonstrated for wireless image and video transmission using deep neural networks (DNNs). However, end-to-end training of such schemes requires a differentiable channel input representation; hence, prior works have assumed that any complex value can be transmitted over the channel. This can prevent the application of these codes in scenarios where the hardware or protocol can only admit certain sets of channel inputs, prescribed by a digital constellation. Herein, we propose DeepJSCC-Q, an end-to-end optimized JSCC solution for wireless image transmission using a finite channel input alphabet. We show that DeepJSCC-Q can achieve similar performance to prior works that allow any complex valued channel input, especially when high modulation orders are available, and that the performance asymptotically approaches that of unconstrained channel input as the modulation order increases. Importantly, DeepJSCC-Q preserves the graceful degradation of image quality in unpredictable channel conditions, a desirable property for deployment in mobile systems with rapidly changing channel conditions.

Submitted 16 June, 2022; originally announced June 2022.

Comments: arXiv admin note: text overlap with arXiv:2111.13042

arXiv:2205.03770 [pdf, other]

Transformer-Empowered 6G Intelligent Networks: From Massive MIMO Processing to Semantic Communication

Authors: Yang Wang, Zhen Gao, Dezhi Zheng, Sheng Chen, Deniz Gündüz, H. Vincent Poor

Abstract: It is anticipated that 6G wireless networks will accelerate the convergence of the physical and cyber worlds and enable a paradigm-shift in the way we deploy and exploit communication networks. Machine learning, in particular deep learning (DL), is expected to be one of the key technological enablers of 6G by offering a new paradigm for the design and optimization of networks with a high level of intelligence. In this article, we introduce an emerging DL architecture, known as the transformer, and discuss its potential impact on 6G network design. We first discuss the differences between the transformer and classical DL architectures, and emphasize the transformer's self-attention mechanism and strong representation capabilities, which make it particularly appealing for tackling various challenges in wireless network design. Specifically, we propose transformer-based solutions for various massive multiple-input multiple-output (MIMO) and semantic communication problems, and show their superiority compared to other architectures. Finally, we discuss key challenges and open issues in transformer-based solutions, and identify future research directions for their deployment in intelligent 6G networks.

Submitted 3 November, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: 9 pages, 6 figures. The current version has been accepted by IEEE Wireless Communications Magazine

arXiv:2205.02417 [pdf, ps, other]

doi 10.1109/LWC.2022.3204837

Channel-Adaptive Wireless Image Transmission with OFDM

Authors: Haotian Wu, Yulin Shao, Krystian Mikolajczyk, Deniz Gündüz

Abstract: We present a learning-based channel-adaptive joint source and channel coding (CA-JSCC) scheme for wireless image transmission over multipath fading channels. The proposed method is an end-to-end autoencoder architecture with a dual-attention mechanism employing orthogonal frequency division multiplexing (OFDM) transmission. Unlike the previous works, our approach is adaptive to channel-gain and noise-power variations by exploiting the estimated channel state information (CSI). Specifically, with the proposed dual-attention mechanism, our model can learn to map the features and allocate transmission-power resources judiciously based on the estimated CSI. Extensive numerical experiments verify that CA-JSCC achieves state-of-the-art performance among existing JSCC schemes. In addition, CA-JSCC is robust to varying channel conditions and can better exploit the limited channel resources by transmitting critical features over better subchannels.

Submitted 8 September, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

Comments: IEEE Wireless Communications Letters

MSC Class: 94A24 ACM Class: E.4

arXiv:2202.03129 [pdf, other]

Over-the-Air Ensemble Inference with Model Privacy

Authors: Selim F. Yilmaz, Burak Hasircioglu, Deniz Gunduz

Abstract: We consider distributed inference at the wireless edge, where multiple clients with an ensemble of models, each trained independently on a local dataset, are queried in parallel to make an accurate decision on a new sample. In addition to maximizing inference accuracy, we also want to maximize the privacy of local models. We exploit the superposition property of the air to implement bandwidth-efficient ensemble inference methods. We introduce different over-the-air ensemble methods and show that these schemes perform significantly better than their orthogonal counterparts, while using fewer resources and providing privacy guarantees. We also provide experimental results verifying the benefits of the proposed over-the-air inference approach, whose source code is shared publicly on GitHub.

Submitted 15 May, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: To appear in IEEE International Symposium on Information Theory (ISIT) 2022

arXiv:2112.11789 [pdf, other]

DRF Codes: Deep SNR-Robust Feedback Codes

Authors: Mahdi Boloursaz Mashhadi, Deniz Gunduz, Alberto Perotti, Branislav Popovic

Abstract: We present a new deep-neural-network (DNN) based error correction code for fading channels with output feedback, called deep SNR-robust feedback (DRF) code. At the encoder, parity symbols are generated by a long short term memory (LSTM) network based on the message as well as the past forward channel outputs observed by the transmitter in a noisy fashion. The decoder uses a bi-directional LSTM architecture along with a signal to noise ratio (SNR)-aware attention NN to decode the message. The proposed code overcomes two major shortcomings of the previously proposed DNN-based codes over channels with passive output feedback: (i) the SNR-aware attention mechanism at the decoder enables reliable application of the same trained NN over a wide range of SNR values; (ii) curriculum training with batch-size scheduling is used to speed up and stabilize training while improving the SNR-robustness of the resulting code. We show that the DRF codes significantly outperform state-of-the-art in terms of both the SNR-robustness and the error rate in additive white Gaussian noise (AWGN) channel with feedback. In fading channels with perfect phase compensation at the receiver, DRF codes learn to efficiently exploit knowledge of the instantaneous fading amplitude (which is available to the encoder through feedback) to reduce the overhead and complexity associated with channel estimation at the decoder. Finally, we show the effectiveness of DRF codes in multicast channels with feedback, where linear feedback codes are known to be strictly suboptimal.

Submitted 22 December, 2021; originally announced December 2021.

arXiv:2112.07244 [pdf, other]

Progressive Feature Transmission for Split Inference at the Wireless Edge

Authors: Qiao Lan, Qunsong Zeng, Petar Popovski, Deniz Gündüz, Kaibin Huang

Abstract: In edge inference, an edge server provides remote-inference services to edge devices. This requires the edge devices to upload high-dimensional features of data samples over resource-constrained wireless channels, which creates a communication bottleneck. The conventional solution of feature pruning requires that the device has access to the inference model, which is unavailable in the current scenario of split inference. To address this issue, we propose the progressive feature transmission (ProgressFTX) protocol, which minimizes the overhead by progressively transmitting features until a target confidence level is reached. The optimal control policy of the protocol to accelerate inference is derived, and it comprises two key operations. The first is importance-aware feature selection at the server, for which it is shown to be optimal to select the most important features, characterized by the largest discriminant gains of the corresponding feature dimensions. The second is transmission-termination control by the server, for which the optimal policy is shown to exhibit a threshold structure. Specifically, the transmission is stopped when the incremental uncertainty reduction by further feature transmission is outweighed by its communication cost. The indices of the selected features and the transmission decision are fed back to the device in each slot. The optimal policy is first derived for the tractable case of linear classification and then extended to the more complex case of classification using a convolutional neural network. Both Gaussian and fading channels are considered. Experimental results are obtained for both a statistical data model and a real dataset. It is seen that ProgressFTX can substantially reduce the communication latency compared to conventional feature pruning and random feature transmission.
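The progressive-transmission idea in this abstract can be illustrated with a minimal sketch. It is not the authors' ProgressFTX implementation: it assumes a two-class Gaussian feature model with equal priors and per-dimension variances, a noiseless channel, and a simple "stop once the posterior reaches a target confidence" rule in place of the paper's cost-aware threshold policy. All function and parameter names (`discriminant_gains`, `progressive_transmit`, `per_slot`, `target`) are hypothetical.

```python
import math


def discriminant_gains(mu0, mu1, var):
    """Per-dimension gain: squared class-mean separation over variance (assumed model)."""
    return [(a - b) ** 2 / v for a, b, v in zip(mu0, mu1, var)]


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def posterior_class1(x, idx, mu0, mu1, var):
    """Posterior of class 1 given only the feature dimensions in idx (equal priors)."""
    llr = sum(
        ((x[k] - mu0[k]) ** 2 - (x[k] - mu1[k]) ** 2) / (2.0 * var[k]) for k in idx
    )
    return _sigmoid(llr)


def progressive_transmit(x, mu0, mu1, var, per_slot=1, target=0.95):
    """Send features in descending gain order, stopping at the target confidence.

    Returns (label, confidence, indices_received).
    """
    gains = discriminant_gains(mu0, mu1, var)
    # Server-side selection: most discriminative dimensions first.
    order = sorted(range(len(x)), key=lambda k: gains[k], reverse=True)
    received = []
    p1 = 0.5  # No features received yet, so both classes are equally likely.
    for start in range(0, len(order), per_slot):
        received.extend(order[start:start + per_slot])
        p1 = posterior_class1(x, received, mu0, mu1, var)
        if max(p1, 1.0 - p1) >= target:
            break  # Simplified termination rule (assumption, not the paper's policy).
    label = 1 if p1 >= 0.5 else 0
    return label, max(p1, 1.0 - p1), received


if __name__ == "__main__":
    mu0, mu1, var = [0.0, 0.0, 0.0], [3.0, 0.1, 1.0], [1.0, 1.0, 1.0]
    print(progressive_transmit([3.0, 0.0, 1.0], mu0, mu1, var))
```

In this toy setting, a sample that is well separated along the highest-gain dimension is classified after a single slot, while a stricter `target` forces more dimensions to be sent.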

Submitted 14 December, 2021; originally announced December 2021.

Showing 1–50 of 97 results for author: Gündüz, D