Joint Source-Channel Coding for Wireless Image Transmission: A Deep Compressed-Sensing Based Method

Authors: Mohammad Amin Jarrahi, Eirina Bourtsoulatze, Vahid Abolghasemi

Abstract: Nowadays, the demand for image transmission over wireless networks has surged significantly. To meet the need for swift delivery of high-quality images through time-varying channels with limited bandwidth, the development of efficient transmission strategies and techniques for preserving image quality is of importance. This paper introduces an innovative approach to Joint Source-Channel Coding (JSCC) tailored for wireless image transmission. It capitalizes on the power of Compressed Sensing (CS) to achieve superior compression and resilience to channel noise. In this method, the process begins with the compression of images using a block-based CS technique implemented through a Convolutional Neural Network (CNN) structure. Subsequently, the images are encoded by directly mapping image blocks to complex-valued channel input symbols. Upon reception, the data is decoded to recover the channel-encoded information, effectively removing the noise introduced during transmission. To finalize the process, a novel CNN-based reconstruction network is employed to restore the original image from the channel-decoded data. The performance of the proposed method is assessed using the CIFAR-10 and Kodak datasets. The results illustrate a substantial improvement over existing JSCC frameworks when assessed in terms of metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) across various channel Signal-to-Noise Ratios (SNRs) and channel bandwidth values. These findings underscore the potential of harnessing CNN-based CS for the development of deep JSCC algorithms tailored for wireless image transmission.

Submitted 11 February, 2024; originally announced February 2024.

arXiv:1910.03579 [pdf, other]

doi 10.1109/TIP.2020.3023597

Graph-based Spatial-temporal Feature Learning for Neuromorphic Vision Sensing

Authors: Yin Bi, Aaron Chadha, Alhabib Abbas, Eirina Bourtsoulatze, Yiannis Andreopoulos

Abstract: Neuromorphic vision sensing (NVS) devices represent visual information as sequences of asynchronous discrete events (a.k.a., "spikes") in response to changes in scene reflectance. Unlike conventional active pixel sensing (APS), NVS allows for significantly higher event sampling rates at substantially increased energy efficiency and robustness to illumination changes. However, feature representation for NVS is far behind its APS-based counterparts, resulting in lower performance in high-level computer vision tasks. To fully utilize its sparse and asynchronous nature, we propose a compact graph representation for NVS, which allows for end-to-end learning with graph convolution neural networks. We couple this with a novel end-to-end feature learning framework that accommodates both appearance-based and motion-based tasks. The core of our framework comprises a spatial feature learning module, which utilizes residual-graph convolutional neural networks (RG-CNN), for end-to-end learning of appearance-based features directly from graphs. We extend this with our proposed Graph2Grid block and temporal feature learning module for efficiently modelling temporal dependencies over multiple graphs and a long temporal extent. We show how our framework can be configured for object classification, action recognition and action similarity labeling. Importantly, our approach preserves the spatial and temporal coherence of spike events, while requiring less computation and memory. The experimental validation shows that our proposed framework outperforms all recent methods on standard datasets. Finally, to address the absence of large real-world NVS datasets for complex recognition tasks, we introduce, evaluate and make available the American Sign Language letters (ASL-DVS), as well as human action dataset (UCF101-DVS, HMDB51-DVS and ASLAN-DVS).

Submitted 11 November, 2019; v1 submitted 8 October, 2019; originally announced October 2019.

Comments: 16 pages, 5 figures. This work is a journal extension of our ICCV'19 paper arXiv:1908.06648

arXiv:1908.06648 [pdf, ps, other]

Graph-Based Object Classification for Neuromorphic Vision Sensing

Authors: Yin Bi, Aaron Chadha, Alhabib Abbas, Eirina Bourtsoulatze, Yiannis Andreopoulos

Abstract: Neuromorphic vision sensing (NVS) devices represent visual information as sequences of asynchronous discrete events (a.k.a., "spikes") in response to changes in scene reflectance. Unlike conventional active pixel sensing (APS), NVS allows for significantly higher event sampling rates at substantially increased energy efficiency and robustness to illumination changes. However, object classification with NVS streams cannot leverage on state-of-the-art convolutional neural networks (CNNs), since NVS does not produce frame representations. To circumvent this mismatch between sensing and processing with CNNs, we propose a compact graph representation for NVS. We couple this with novel residual graph CNN architectures and show that, when trained on spatio-temporal NVS data for object classification, such residual graph CNNs preserve the spatial and temporal coherence of spike events, while requiring less computation and memory. Finally, to address the absence of large real-world NVS datasets for complex recognition tasks, we present and make available a 100k dataset of NVS recordings of the American sign language letters, acquired with an iniLabs DAVIS240c device under real-world conditions.

Submitted 19 August, 2019; originally announced August 2019.

Comments: 13 pages, 4 figures, ICCV 2019

arXiv:1908.00812 [pdf, other]

Deep Video Precoding

Authors: Eirina Bourtsoulatze, Aaron Chadha, Ilya Fadeev, Vasileios Giotsas, Yiannis Andreopoulos

Abstract: Several groups are currently investigating how deep learning may advance the state-of-the-art in image and video coding. An open question is how to make deep neural networks work in conjunction with existing (and upcoming) video codecs, such as MPEG AVC, HEVC, VVC, Google VP9 and AOM AV1, as well as existing container and transport formats, without imposing any changes at the client side. Such compatibility is a crucial aspect when it comes to practical deployment, especially due to the fact that the video content industry and hardware manufacturers are expected to remain committed to these standards for the foreseeable future. We propose to use deep neural networks as precoders for current and future video codecs and adaptive video streaming systems. In our current design, the core precoding component comprises a cascaded structure of downscaling neural networks that operates during video encoding, prior to transmission. This is coupled with a precoding mode selection algorithm for each independently-decodable stream segment, which adjusts the downscaling factor according to scene characteristics, the utilized encoder, and the desired bitrate and encoding configuration. Our framework is compatible with all current and future codec and transport standards, as our deep precoding network structure is trained in conjunction with linear upscaling filters (e.g., the bilinear filter), which are supported by all web video players. Results with FHD and UHD content and widely-used AVC, HEVC and VP9 encoders show that coupling such standards with the proposed deep video precoding allows for 15% to 45% rate reduction under encoding configurations and bitrates suitable for video-on-demand adaptive streaming systems. The use of precoding can also lead to encoding complexity reduction, which is essential for cost-effective cloud deployment of complex encoders like H.265/HEVC and VP9.

Submitted 13 December, 2019; v1 submitted 2 August, 2019; originally announced August 2019.

Comments: 16 pages, 14 figures, 11 tables, to appear in IEEE Trans. Circ. Syst. for Video Technology

arXiv:1902.09581 [pdf, other]

Tile-Based Joint Caching and Delivery of 360° Videos in Heterogeneous Networks

Authors: Pantelis Maniotis, Eirina Bourtsoulatze, Nikolaos Thomos

Abstract: The recent surge of applications involving the use of 360° video challenges mobile networks infrastructure, as 360° video files are of significant size, and current delivery and edge caching architectures are unable to guarantee their timely delivery. In this paper, we investigate the problem of joint collaborative content-aware caching and delivery of 360° videos in a video on demand setting. The proposed scheme takes advantage of 360° video encoding in multiple tiles and layers to make fine-grained decisions regarding which tiles to cache in each Small Base Station (SBS), and where to deliver them from to the end users, as users may reside in the coverage area of multiple SBSs. This permits to cache the most popular tiles in the SBSs, while the remaining tiles may be obtained through the backhaul. In addition, we explicitly consider the time delivery constraints to ensure continuous video playback. To reduce the computational complexity of the optimization problem, we simplify it by introducing a fairness constraint. This allows us to split the original problem into subproblems corresponding to Groups of Pictures (GoP). Each of the subproblems is then solved with the method of Lagrange partial relaxation. Finally, we evaluate the performance of the proposed method for various system parameters and compare it with schemes that do not consider 360° video encoding into multiple tiles and quality layers, as well as with two variants of the proposed method: one that considers layered encoding and SBSs collaboration and another that uses tiles encoding but with no SBSs collaboration. The results showcase the benefits coming from caching and delivery decisions on per tile basis and the importance of exploiting SBSs collaboration.

Submitted 25 October, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

arXiv:1901.01187 [pdf, other]

PopNetCod: A Popularity-based Caching Policy for Network Coding enabled Named Data Networking

Authors: Jonnahtan Saltarin, Torsten Braun, Eirina Bourtsoulatze, Nikolaos Thomos

Abstract: In this paper, we propose PopNetCod, a popularity-based caching policy for network coding enabled Named Data Networking. PopNetCod is a distributed caching policy, in which each router measures the local popularity of the content objects by analyzing the requests that it receives. It then uses this information to decide which Data packets to cache or evict from its content store. Since network coding is used, partial caching of content objects is supported, which facilitates the management of the content store. The routers decide the Data packets that they cache or evict in an online manner when they receive requests for Data packets. This allows the most popular Data packets to be cached closer to the network edges. The evaluation of PopNetCod shows an improved cache-hit rate compared to the widely used Leave Copy Everywhere placement policy and the Least Recently Used eviction policy. The improved cache-hit rate helps the clients to achieve higher goodput, while it also reduces the load on the source servers.

Submitted 4 January, 2019; originally announced January 2019.

Comments: presented at IFIP networking 2018
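
To make the baseline named in the PopNetCod abstract above concrete, the following is a minimal sketch of a content store that caches every Data packet it forwards (Leave Copy Everywhere placement) and evicts the least recently used entry. The class name `LRUCache`, the `request` method and the `fetch` callback are hypothetical names chosen for illustration. This is not PopNetCod itself and not code from the paper; it only illustrates the kind of baseline the abstract compares against.

```python
from collections import OrderedDict


class LRUCache:
    """Toy content store: cache everything, evict the least recently used."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._store = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def request(self, name, fetch):
        """Return the data for `name`, calling `fetch(name)` on a miss."""
        if name in self._store:
            self._store.move_to_end(name)  # mark as most recently used
            self.hits += 1
            return self._store[name]
        self.misses += 1
        data = fetch(name)  # hypothetical upstream retrieval
        self._store[name] = data  # Leave Copy Everywhere: always cache
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used
        return data

    def cached_names(self):
        """Names currently cached, from least to most recently used."""
        return list(self._store)


cache = LRUCache(capacity=2)
for name in ["a", "b", "a", "c", "b"]:
    cache.request(name, fetch=str.upper)
print(cache.hits, cache.misses, cache.cached_names())
```

The cache-hit rate mentioned in the abstract corresponds here to `hits / (hits + misses)`; a popularity-based policy would replace the unconditional insert and the eviction rule with decisions driven by measured request counts.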

arXiv:1809.01733 [pdf, other]

doi 10.1109/TCCN.2019.2919300

Deep Joint Source-Channel Coding for Wireless Image Transmission

Authors: Eirina Bourtsoulatze, David Burth Kurka, Deniz Gunduz

Abstract: We propose a joint source and channel coding (JSCC) technique for wireless image transmission that does not rely on explicit codes for either compression or error correction; instead, it directly maps the image pixel values to the complex-valued channel input symbols. We parameterize the encoder and decoder functions by two convolutional neural networks (CNNs), which are trained jointly, and can be considered as an autoencoder with a non-trainable layer in the middle that represents the noisy communication channel. Our results show that the proposed deep JSCC scheme outperforms digital transmission concatenating JPEG or JPEG2000 compression with a capacity achieving channel code at low signal-to-noise ratio (SNR) and channel bandwidth values in the presence of additive white Gaussian noise (AWGN). More strikingly, deep JSCC does not suffer from the "cliff effect", and it provides a graceful performance degradation as the channel SNR varies with respect to the SNR value assumed during training. In the case of a slow Rayleigh fading channel, deep JSCC learns noise resilient coded representations and significantly outperforms separation-based digital communication at all SNR and channel bandwidth values.

Submitted 17 June, 2019; v1 submitted 4 September, 2018; originally announced September 2018.

Comments: To appear in IEEE Transactions on Cognitive Communications and Networking
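
The "non-trainable layer in the middle" described in the abstract above can be pictured as a function that takes complex channel input symbols and returns noisy ones. The sketch below illustrates an AWGN channel of that kind using only the Python standard library. The function names `normalize_power` and `awgn_channel`, and the average-power normalization step, are assumptions made for illustration; the abstract does not specify how the paper implements the channel layer.

```python
import math
import random


def normalize_power(symbols, power=1.0):
    """Scale complex symbols so that their average power equals `power`."""
    avg = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    if avg == 0:
        return list(symbols)  # all-zero input cannot be rescaled
    scale = math.sqrt(power / avg)
    return [s * scale for s in symbols]


def awgn_channel(symbols, snr_db, rng=None):
    """Add circularly symmetric complex Gaussian noise at the given SNR in dB."""
    rng = rng or random.Random()
    signal_power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # standard deviation per real dimension
    return [
        s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for s in symbols
    ]


# Stand-in for encoder output: three arbitrary complex symbols.
tx = normalize_power([complex(3, 4), complex(0, 0), complex(1, -1)])
rx = awgn_channel(tx, snr_db=10, rng=random.Random(0))
print(rx)
```

In a learned system the encoder output would replace the hand-written `tx` list, and the decoder would take `rx` as its input. Because the channel has no trainable parameters, training only needs gradients to pass through the additive noise.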

arXiv:1804.01035 [pdf, ps, other]

Cache-Aided Interactive Multiview Video Streaming in Small Cell Wireless Networks

Authors: Eirina Bourtsoulatze, Deniz Gündüz

Abstract: The emergence of novel interactive multimedia applications with high rate and low latency requirements has led to a drastic increase in the video data traffic over wireless cellular networks. Endowing the small base stations of a macro-cell with caches that can store some of the content is a promising technology to cope with the increasing pressure on the backhaul connections, and to reduce the delay for demanding video applications. In this work, delivery of an interactive multiview video to a set of wireless users is studied in a heterogeneous cellular network. Differently from existing works that focus on the optimization of the delivery delay and ignore the video characteristics, the caching and scheduling policies are jointly optimized, taking into account the quality of the delivered video and the video delivery time constraints. We formulate our joint caching and scheduling problem as the average expected video distortion minimization, and show that this problem is NP-hard. We then provide an equivalent formulation based on submodular set function maximization and propose a greedy solution with $\frac{1}{2}(1-\mbox{e}^{-1})$ approximation guarantee. The evaluation of the proposed joint caching and scheduling policy shows that it significantly outperforms benchmark algorithms based on popularity caching and independent scheduling. Another important contribution of this paper is a new constant approximation ratio for the greedy submodular set function maximization subject to a $d$-dimensional knapsack constraint.

Submitted 3 April, 2018; originally announced April 2018.
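
As a generic illustration of greedy submodular set function maximization under a knapsack (budget) constraint, which the abstract above refers to, the sketch below repeatedly picks the feasible item with the best marginal gain per unit cost and then compares the result with the best single feasible item. The function name `greedy_knapsack`, the toy coverage objective and all item names are hypothetical. This is not the paper's algorithm, it handles only a single budget dimension, and no approximation guarantee is claimed for the code as written.

```python
def greedy_knapsack(items, cost, value, budget):
    """Cost-benefit greedy selection under a single budget constraint.

    `cost[item]` must be positive; `value(list_of_items)` is the set function.
    """
    chosen, spent = [], 0.0
    remaining = set(items)
    while remaining:
        base = value(chosen)
        best, best_ratio = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + cost[item] > budget:
                continue  # item does not fit in the remaining budget
            ratio = (value(chosen + [item]) - base) / cost[item]
            if ratio > best_ratio:
                best, best_ratio = item, ratio
        if best is None:
            break  # nothing feasible adds value
        chosen.append(best)
        spent += cost[best]
        remaining.discard(best)
    # Safeguard: a single valuable item can beat the greedy set.
    singles = [item for item in items if cost[item] <= budget]
    if singles:
        top = max(singles, key=lambda item: value([item]))
        if value([top]) > value(chosen):
            return [top]
    return chosen


# Toy coverage objective: the value of a selection is the number of covered elements.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2, 3, 4, 5, 6}}
cost = {"a": 2, "b": 1, "c": 1, "d": 4}


def coverage(selection):
    covered = set()
    for item in selection:
        covered |= covers[item]
    return len(covered)


picked = greedy_knapsack(list(covers), cost, coverage, budget=3)
print(picked, coverage(picked))
```

Coverage functions are a standard example of monotone submodular functions, which is why one is used as the stand-in objective here; in the setting of the abstract the objective would instead be derived from the expected video distortion.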

arXiv:1512.00259 [pdf, other]

doi 10.1109/INFOCOM.2016.7524382

NetCodCCN: a Network Coding approach for Content-Centric Networks

Authors: Jonnahtan Saltarin, Eirina Bourtsoulatze, Nikolaos Thomos, Torsten Braun

Abstract: Content-Centric Networking (CCN) naturally supports multi-path communication, as it allows the simultaneous use of multiple interfaces (e.g. LTE and WiFi). When multiple sources and multiple clients are considered, the optimal set of distribution trees should be determined in order to optimally use all the available interfaces. This is not a trivial task, as it is a computationally intense procedure that should be done centrally. The need for central coordination can be removed by employing network coding, which also offers improved resiliency to errors and large throughput gains. In this paper, we propose NetCodCCN, a protocol for integrating network coding in CCN. In comparison to previous works proposing to enable network coding in CCN, NetCodCCN permits Interest aggregation and Interest pipelining, which reduce the data retrieval times. The experimental evaluation shows that the proposed protocol leads to significant improvements in terms of content retrieval delay compared to the original CCN. Our results demonstrate that the use of network coding adds robustness to losses and permits to exploit more efficiently the available network resources. The performance gains are verified for content retrieval in various network scenarios.

Submitted 1 December, 2015; originally announced December 2015.

Comments: Accepted for inclusion in the IEEE INFOCOM 2016 technical program

arXiv:1510.06659 [pdf, ps, other]

Content-Aware Delivery of Scalable Video in Network Coding Enabled Named Data Networks

Authors: Eirina Bourtsoulatze, Nikolaos Thomos, Jonnahtan Saltarin, Torsten Braun

Abstract: In this paper, we propose a novel network coding enabled NDN architecture for the delivery of scalable video. Our scheme utilizes network coding in order to address the problem that arises in the original NDN protocol, where optimal use of the bandwidth and caching resources necessitates the coordination of the forwarding decisions. To optimize the performance of the proposed network coding based NDN protocol and render it appropriate for transmission of scalable video, we devise a novel rate allocation algorithm that decides on the optimal rates of Interest messages sent by clients and intermediate nodes. This algorithm guarantees that the achieved flow of Data objects will maximize the average quality of the video delivered to the client population. To support the handling of Interest messages and Data objects when intermediate nodes perform network coding, we modify the standard NDN protocol and introduce the use of Bloom filters, which store efficiently additional information about the Interest messages and Data objects. The proposed architecture is evaluated for transmission of scalable video over PlanetLab topologies. The evaluation shows that the proposed scheme performs very close to the optimal performance.

Submitted 22 October, 2015; originally announced October 2015.

arXiv:1307.7138 [pdf, ps, other]

Reconstruction of Network Coded Sources From Incomplete Datasets

Authors: Eirina Bourtsoulatze, Nikolaos Thomos, Pascal Frossard

Abstract: In this paper, we investigate the problem of recovering source information from an incomplete set of network coded data. We first study the theoretical performance of such systems under maximum a posteriori (MAP) decoding and derive the upper bound on the probability of decoding error as a function of the system parameters. We also establish the sufficient conditions on the number of network coded symbols required to achieve decoding error probability below a certain level. We then propose a low complexity iterative decoding algorithm based on message passing for decoding the network coded data of a particular class of statistically dependent sources that present pairwise linear correlation. The algorithm operates on a graph that captures the network coding constraints, while the knowledge about the source correlation is directly incorporated in the messages exchanged over the graph. We test the proposed method on both synthetic data and correlated image sequences and demonstrate that the prior knowledge about the source correlation can be effectively exploited at the decoder in order to provide a good reconstruction of the transmitted data in cases where the network coded data available at the decoder is not sufficient for exact decoding.

Submitted 13 March, 2015; v1 submitted 26 July, 2013; originally announced July 2013.

arXiv:1212.5032 [pdf, ps, other]

doi 10.1109/TMM.2014.2328320

Distributed Rate Allocation in Inter-Session Network Coding

Authors: Eirina Bourtsoulatze, Nikolaos Thomos, Pascal Frossard

Abstract: In this work, we propose a distributed rate allocation algorithm that minimizes the average decoding delay for multimedia clients in inter-session network coding systems. We consider a scenario where the users are organized in a mesh network and each user requests the content of one of the available sources. We propose a novel distributed algorithm where network users determine the coding operations and the packet rates to be requested from the parent nodes, such that the decoding delay is minimized for all the clients. A rate allocation problem is solved by every user, which seeks the rates that minimize the average decoding delay for its children and for itself. Since the optimization problem is a priori non-convex, we introduce the concept of equivalent packet flows, which permits to estimate the expected number of packets that every user needs to collect for decoding. We then decompose our original rate allocation problem into a set of convex subproblems, which are eventually combined to obtain an effective approximate solution to the delay minimization problem. The results demonstrate that the proposed scheme eliminates the bottlenecks and reduces the decoding delay experienced by users with limited bandwidth resources. We validate the performance of our distributed rate allocation algorithm in different video streaming scenarios using the NS-3 network simulator. We show that our system is able to take benefit of inter-session network coding for simultaneous delivery of video sessions in networks with path diversity.

Submitted 20 September, 2013; v1 submitted 20 December, 2012; originally announced December 2012.

Comments: Submitted to IEEE Transactions on Multimedia

arXiv:1211.0951 [pdf, ps, other]

doi 10.1109/TCOMM.2014.2318701

Decoding Delay Minimization in Inter-Session Network Coding

Authors: Eirina Bourtsoulatze, Nikolaos Thomos, Pascal Frossard

Abstract: Intra-session network coding has been shown to offer significant gains in terms of achievable throughput and delay in settings where one source multicasts data to several clients. In this paper, we consider a more general scenario where multiple sources transmit data to sets of clients and study the benefits of inter-session network coding, when network nodes have the opportunity to combine packets from different sources. In particular, we propose a novel framework for optimal rate allocation in inter-session network coding systems. We formulate the problem as the minimization of the average decoding delay in the client population and solve it with a gradient-based stochastic algorithm. Our optimized inter-session network coding solution is evaluated in different network topologies and compared with basic intra-session network coding solutions. Our results show the benefits of proper coding decisions and effective rate allocation for lowering the decoding delay when the network is used by concurrent multicast sessions.

Submitted 5 November, 2012; originally announced November 2012.

Comments: Submitted to IEEE Transactions on Communications

Showing 1–13 of 13 results for author: Bourtsoulatze, E