ECRF: Entropy-Constrained Neural Radiance Fields Compression with Frequency Domain Optimization

Authors: Soonbin Lee, Fangwen Shu, Yago Sanchez, Thomas Schierl, Cornelius Hellge

Abstract: Explicit feature-grid based NeRF models have shown promising results in terms of rendering quality and significant speed-up in training. However, these methods often require a significant amount of data to represent a single scene or object. In this work, we present a compression model that aims to minimize the entropy in the frequency domain in order to effectively reduce the data size. First, we propose using the discrete cosine transform (DCT) on the tensorial radiance fields to compress the feature-grid. This feature-grid is transformed into coefficients, which are then quantized and entropy encoded, following a similar approach to the traditional video coding pipeline. Furthermore, to achieve a higher level of sparsity, we propose using an entropy parameterization technique for the frequency domain, specifically for DCT coefficients of the feature-grid. Since the transformed coefficients are optimized during the training phase, the proposed model does not require any fine-tuning or additional information. Our model only requires a lightweight compression pipeline for encoding and decoding, making it easier to apply volumetric radiance field methods for real-world applications. Experimental results demonstrate that our proposed frequency domain entropy model can achieve superior compression performance across various datasets. The source code will be made publicly available.

Submitted 23 November, 2023; originally announced November 2023.

Comments: 10 pages, 6 figures, 4 tables
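
The transform, quantize, and entropy-code pipeline that the ECRF abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: it applies an orthonormal 2D DCT-II to a toy 2D grid, uses uniform scalar quantization, and estimates the coding cost with an empirical entropy rather than the learned entropy parameterization from the paper. The function names (`dct_matrix`, `encode`, `decode`, `empirical_entropy`) and the `step` parameter are assumptions made for illustration.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row gets its own scale for orthonormality
    return m


def encode(grid, step):
    """2D DCT of a grid followed by uniform quantization to integers."""
    d_rows = dct_matrix(grid.shape[0])
    d_cols = dct_matrix(grid.shape[1])
    coeffs = d_rows @ grid @ d_cols.T
    return np.round(coeffs / step).astype(np.int64)


def decode(quantized, step):
    """Dequantize and apply the inverse 2D DCT (transpose of the forward one)."""
    d_rows = dct_matrix(quantized.shape[0])
    d_cols = dct_matrix(quantized.shape[1])
    return d_rows.T @ (quantized * step) @ d_cols


def empirical_entropy(symbols):
    """Empirical entropy in bits per symbol, a rough proxy for coded size."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


if __name__ == "__main__":
    # Toy smooth "feature grid"; real feature grids are learned during training.
    grid = np.add.outer(np.linspace(0, 1, 16), np.linspace(0, 1, 16))
    step = 0.05
    q = encode(grid, step)
    print("bits/symbol in DCT domain:", empirical_entropy(q))
    print("max reconstruction error:", np.abs(decode(q, step) - grid).max())
```

For smooth inputs most quantized DCT coefficients are zero, which is why the entropy in the frequency domain tends to be lower than that of the directly quantized samples. A larger `step` trades reconstruction accuracy for fewer bits.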

arXiv:2305.05359

On the Limits of HARQ Prediction for Short Deterministic Codes with Error Detection in Memoryless Channels (Extended Version with Proofs)

Authors: Barış Göktepe, Cornelius Hellge, Tatiana Rykova, Thomas Schierl, Slawomir Stanczak

Abstract: We provide a mathematical framework to analyze the limits of Hybrid Automatic Repeat reQuest (HARQ) and derive analytical expressions for the most powerful test for estimating the decodability under maximum-likelihood decoding and $t$-error decoding. Furthermore, we numerically approximate the most powerful test for sum-product decoding. We compare the performance of previously studied HARQ prediction schemes and show that none of the state-of-the-art HARQ prediction schemes is most powerful for estimating the decodability of a partially received signal vector under maximum-likelihood decoding and sum-product decoding. Furthermore, we demonstrate that decoding in general is suboptimal for predicting the decodability.

Submitted 9 May, 2023; originally announced May 2023.

Comments: ISIT23

arXiv:2202.08706

doi 10.1109/TWC.2023.3275296

Distributed Machine-Learning for Early HARQ Feedback Prediction in Cloud RANs

Authors: Barış Göktepe, Cornelius Hellge, Thomas Schierl, Slawomir Stanczak

Abstract: In this work, we propose novel HARQ prediction schemes for Cloud RANs (C-RANs) that use feedback over a rate-limited feedback channel (2-6 bits) from the Remote Radio Heads (RRHs) to predict at the User Equipment (UE) the decoding outcome at the BaseBand Unit (BBU) ahead of actual decoding. In particular, we propose a Dual Autoencoding 2-Stage Gaussian Mixture Model (DA2SGMM) that is trained in an end-to-end fashion over the whole C-RAN setup. Using realistic link-level simulations in the sub-THz band at 100 GHz, we show that the novel DA2SGMM HARQ prediction scheme clearly outperforms all other adapted and state-of-the-art schemes. The DA2SGMM shows a superior performance in terms of blockage detection as well as HARQ prediction in the no-blockage and single-blockage cases. In particular, the DA2SGMM with 4-bit feedback achieves a more than 200% higher throughput on average compared to its best alternative. Compared to regular HARQ, the DA2SGMM reduces the maximum transmission latency by more than 72.4%, while maintaining more than 75% of the throughput in the no-blockage scenario. In the single-blockage scenario, DA2SGMM significantly increases the throughput for most of the evaluated Signal-to-Noise-Ratios (SNRs) compared to regular HARQ.

Submitted 18 May, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

arXiv:2103.06675

Open GOP Resolution Switching in HTTP Adaptive Streaming with VVC

Authors: Robert Skupin, Christian Bartnik, Adam Wieckowski, Yago Sanchez, Benjamin Bross, Cornelius Hellge, Thomas Schierl

Abstract: The user experience in adaptive HTTP streaming relies on offering bitrate ladders with suitable operation points for all users and typically involves multiple resolutions. While open GOP coding structures are generally known to provide substantial coding efficiency benefit, their use in HTTP streaming has been precluded through lacking support of reference picture resampling (RPR) in AVC and HEVC. The newly emerging Versatile Video Coding (VVC) standard supports RPR, but only conversational scenarios were primarily investigated during the design of VVC. This paper aims at enabling usage of RPR in HTTP streaming scenarios through analysing the drift potential of VVC coding tools and presenting a constrained encoding method that avoids severe drift artefacts in resolution switching with open GOP coding in VVC. In typical live streaming configurations, the presented method achieves -8.57% BD-rate reduction compared to closed GOP coding while in a typical Video on Demand configuration, -1.89% BD-rate reduction is reported. The constraints penalty compared to regular open GOP coding is 0.65% BD-rate in the worst case. The presented method was integrated into the publicly available open source VVC encoder VVenC v0.3.

Submitted 29 April, 2021; v1 submitted 11 March, 2021; originally announced March 2021.

Comments: Accepted at IEEE Picture Coding Symposium 2021, 5 pages, 3 figures

arXiv:2009.06301

doi 10.1109/GLOBECOM42002.2020.9322302

Feedback Prediction for Proactive HARQ in the Context of Industrial Internet of Things

Authors: Baris Göktepe, Tatiana Rykova, Thomas Fehrenbach, Thomas Schierl, Cornelius Hellge

Abstract: In this work, we investigate proactive Hybrid Automatic Repeat reQuest (HARQ) using link-level simulations for multiple packet sizes, modulation orders, BLock Error Rate (BLER) targets and two delay budgets of 1 ms and 2 ms, in the context of Industrial Internet of Things (IIOT) applications. In particular, we propose an enhanced proactive HARQ protocol using a feedback prediction mechanism. We show that the enhanced protocol achieves a significant gain over the classical proactive HARQ in terms of energy efficiency for almost all evaluated BLER targets at least for sufficiently large feedback delays. Furthermore, we demonstrate that the proposed protocol clearly outperforms the classical proactive HARQ in all scenarios when taking a processing delay reduction due to the less complex prediction approach into account, achieving an energy efficiency gain in the range of 11% up to 15% for very stringent latency budgets of 1 ms at $10^{-2}$ BLER and from 4% up to 7.5% for less stringent latency budgets of 2 ms at $10^{-3}$ BLER. Furthermore, we show that power-constrained proactive HARQ with prediction even outperforms unconstrained reactive HARQ for sufficiently large feedback delays.

Submitted 18 February, 2022; v1 submitted 14 September, 2020; originally announced September 2020.

Journal ref: GLOBECOM 2020 - 2020 IEEE Global Communications Conference

arXiv:2007.14084

doi 10.1145/3394171.3413699

Kalman Filter-based Head Motion Prediction for Cloud-based Mixed Reality

Authors: Serhan Gül, Sebastian Bosse, Dimitri Podborski, Thomas Schierl, Cornelius Hellge

Abstract: Volumetric video allows viewers to experience highly-realistic 3D content with six degrees of freedom in mixed reality (MR) environments. Rendering complex volumetric videos can require a prohibitively high amount of computational power for mobile devices. A promising technique to reduce the computational burden on mobile devices is to perform the rendering at a cloud server. However, cloud-based rendering systems suffer from an increased interaction (motion-to-photon) latency that may cause registration errors in MR environments. One way of reducing the effective latency is to predict the viewer's head pose and render the corresponding view from the volumetric video in advance. In this paper, we design a Kalman filter for head motion prediction in our cloud-based volumetric video streaming system. We analyze the performance of our approach using recorded head motion traces and compare its performance to an autoregression model for different prediction intervals (look-ahead times). Our results show that the Kalman filter can predict head orientations 0.5 degrees more accurately than the autoregression model for a look-ahead time of 60 ms.

Submitted 28 July, 2020; originally announced July 2020.

Comments: Accepted at the ACM Multimedia Conference (ACMMM) 2020. 9 pages, 9 figures

Journal ref: Proceedings of the 28th ACM International Conference on Multimedia (2020) 3632-3641
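
The idea of predicting a head pose ahead of time with a Kalman filter, as summarized in the abstract above, can be sketched under strong simplifying assumptions. The example below is not the filter designed in the paper: it tracks a single orientation angle (for instance yaw, in degrees) with a constant-velocity state model, and extrapolates it over a look-ahead time. The class name `ConstantVelocityKalman`, the noise parameters `process_var` and `meas_var`, and the sampling interval `dt` are all hypothetical choices for illustration.

```python
import numpy as np


class ConstantVelocityKalman:
    """1D constant-velocity Kalman filter over the state [angle, angular rate]."""

    def __init__(self, dt, process_var=1.0, meas_var=0.01):
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition
        self.H = np.array([[1.0, 0.0]])             # only the angle is measured
        # Process noise for a white-noise acceleration model (assumed).
        self.Q = process_var * np.array(
            [[dt**4 / 4, dt**3 / 2], [dt**3 / 2, dt**2]]
        )
        self.R = np.array([[meas_var]])             # measurement noise variance
        self.x = np.zeros((2, 1))                   # initial state estimate
        self.P = np.eye(2) * 1e3                    # large initial uncertainty

    def update(self, measured_angle):
        """Advance one sample interval, then correct with a new measurement."""
        # Time update (prediction to the current sample).
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Measurement update.
        innovation = np.array([[measured_angle]]) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innovation
        self.P = (np.eye(2) - K @ self.H) @ self.P

    def predict_ahead(self, lookahead):
        """Extrapolate the angle `lookahead` seconds past the last update."""
        angle, rate = self.x[0, 0], self.x[1, 0]
        return angle + rate * lookahead


if __name__ == "__main__":
    dt = 0.01  # assumed 100 Hz pose samples
    kf = ConstantVelocityKalman(dt)
    for k in range(1, 201):
        kf.update(30.0 * k * dt)  # synthetic head turn at 30 degrees per second
    print("predicted yaw 60 ms ahead:", kf.predict_ahead(0.06))
```

A renderer in the cloud could use the extrapolated angle to draw the view the user is expected to have when the frame arrives, instead of the last measured one. A full 6DoF system would also need to handle position and represent 3D orientation properly (for example with quaternions), which this sketch leaves out.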

arXiv:2003.02526

doi 10.1145/3339825.3393583

Cloud Rendering-based Volumetric Video Streaming System for Mixed Reality Services

Authors: Serhan Gül, Dimitri Podborski, Jangwoo Son, Gurdeep Singh Bhullar, Thomas Buchholz, Thomas Schierl, Cornelius Hellge

Abstract: Volumetric video is an emerging technology for immersive representation of 3D spaces that captures objects from all directions using multiple cameras and creates a dynamic 3D model of the scene. However, processing volumetric content requires high amounts of processing power and is still a very demanding task for today's mobile devices. To mitigate this, we propose a volumetric video streaming system that offloads the rendering to a powerful cloud/edge server and only sends the rendered 2D view to the client instead of the full volumetric content. We use 6DoF head movement prediction techniques, the WebRTC protocol and hardware video encoding to ensure low latency in different parts of the processing chain. We demonstrate our system using both a browser-based client and a Microsoft HoloLens client. Our application contains generic interfaces that allow for easy deployment of various augmented/mixed reality clients using the same server implementation.

Submitted 16 July, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

Comments: 4 pages, 2 figures

Journal ref: 11th ACM Multimedia Systems Conference (MMSys) 2020

arXiv:2001.06466

doi 10.1145/3386290.3396933

Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction

Authors: Serhan Gül, Dimitri Podborski, Thomas Buchholz, Thomas Schierl, Cornelius Hellge

Abstract: Volumetric video is an emerging key technology for immersive representation of 3D spaces and objects. Rendering volumetric video requires a lot of computational power, which is challenging especially for mobile devices. To mitigate this, we developed a streaming system that renders a 2D view from the volumetric video at a cloud server and streams a 2D video stream to the client. However, such network-based processing increases the motion-to-photon (M2P) latency due to the additional network and processing delays. In order to compensate for the added latency, prediction of the future user pose is necessary. We developed a head motion prediction model and investigated its potential to reduce the M2P latency for different look-ahead times. Our results show that the presented model reduces the rendering errors caused by the M2P latency compared to a baseline system in which no prediction is performed.

Submitted 16 July, 2020; v1 submitted 17 January, 2020; originally announced January 2020.

Comments: 7 pages, 4 figures

Journal ref: 30th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV) 2020

arXiv:1903.02971

doi 10.1145/3304109.3323835

HTML5 MSE Playback of MPEG 360 VR Tiled Streaming

Authors: Dimitri Podborski, Jangwoo Son, Gurdeep Singh Bhullar, Robert Skupin, Yago Sanchez, Cornelius Hellge, Thomas Schierl

Abstract: Virtual Reality (VR) and 360-degree video streaming have gained significant attention in recent years. First standards have been published in order to avoid market fragmentation. For instance, 3GPP released its first VR specification to enable 360-degree video streaming over 5G networks which relies on several technologies specified in ISO/IEC 23090-2, also known as MPEG-OMAF. While some implementations of OMAF-compatible players have already been demonstrated at several trade shows, so far, no web browser-based implementations have been presented. In this demo paper we describe a browser-based JavaScript player implementation of the most advanced media profile of OMAF: HEVC-based viewport-dependent OMAF video profile, also known as tile-based streaming, with multi-resolution HEVC tiles. We also describe the applied workarounds for the implementation challenges we encountered with state-of-the-art HTML5 browsers. The presented implementation was tested in the Safari browser with support of HEVC video through the HTML5 Media Source Extensions API. In addition, the WebGL API was used for rendering, using region-wise packing metadata as defined in OMAF.

Submitted 23 April, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

Comments: Accepted for the demo track of ACM MMSys'19

arXiv:1902.05392

doi 10.1109/ICIP.2019.8803335

Multi-Kernel Prediction Networks for Denoising of Burst Images

Authors: Talmaj Marinč, Vignesh Srinivasan, Serhan Gül, Cornelius Hellge, Wojciech Samek

Abstract: In low-light or short-exposure photography the image is often corrupted by noise. While longer exposure helps reduce the noise, it can produce blurry results due to object and camera motion. The reconstruction of a noise-free image is an ill-posed problem. Recent approaches for image denoising aim to predict kernels which are convolved with a set of successively taken images (burst) to obtain a clear image. We propose a deep neural network based approach called Multi-Kernel Prediction Networks (MKPN) for burst image denoising. MKPN predicts kernels of not just one size but of varying sizes and performs fusion of these different kernels resulting in one kernel per pixel. The advantages of our method are twofold: (a) the different sized kernels help in extracting different information from the image which results in better reconstruction and (b) kernel fusion assures retaining of the extracted information while maintaining computational efficiency. Experimental results reveal that MKPN outperforms the state of the art on our synthetic datasets with different noise levels.

Submitted 11 March, 2021; v1 submitted 5 February, 2019; originally announced February 2019.

Comments: 5 pages, 4 figures

Journal ref: 2019 IEEE International Conference on Image Processing (ICIP), pp. 2404-2408

arXiv:1808.07034

URLLC Services in 5G - Low Latency Enhancements for LTE

Authors: Thomas Fehrenbach, Rohit Datta, Barış Göktepe, Thomas Wirth, Cornelius Hellge

Abstract: 5G is envisioned to support three broad categories of services: eMBB, URLLC, and mMTC. URLLC services refer to future applications which require reliable data communications from one end to another, while fulfilling ultra-low latency constraints. In this paper, we highlight the requirements and mechanisms that are necessary for URLLC in LTE. Design challenges faced when reducing the latency in LTE are shown. The performance of short processing time and frame structure enhancements are analyzed. Our proposed DCI Duplication method to increase LTE control channel reliability is presented and evaluated. The feasibility of achieving low latency and high reliability for the IMT-2020 submission of LTE is shown. We further anticipate the opportunities and technical design challenges when evolving 3GPP's LTE and designing the new 5G NR standard to meet the requirements of novel URLLC services.

Submitted 22 August, 2018; v1 submitted 21 August, 2018; originally announced August 2018.

Comments: Accepted for publication at IEEE Vehicular Technology Conference (VTC), Fall 2018

arXiv:1807.10495

doi 10.1109/JSAC.2019.2934001

Enhanced Machine Learning Techniques for Early HARQ Feedback Prediction in 5G

Authors: Nils Strodthoff, Barış Göktepe, Thomas Schierl, Cornelius Hellge, Wojciech Samek

Abstract: We investigate Early Hybrid Automatic Repeat reQuest (E-HARQ) feedback schemes enhanced by machine learning techniques as a path towards ultra-reliable and low-latency communication (URLLC). To this end, we propose machine learning methods to predict the outcome of the decoding process ahead of the end of the transmission. We discuss different input features and classification algorithms ranging from traditional methods to newly developed supervised autoencoders. These methods are evaluated based on their prospects of complying with the URLLC requirements of effective block error rates below $10^{-5}$ at small latency overheads. We provide realistic performance estimates in a system model incorporating scheduling effects to demonstrate the feasibility of E-HARQ across different signal-to-noise ratios, subcode lengths, channel conditions and system loads, and show the benefit over regular HARQ and existing E-HARQ schemes without machine learning.

Submitted 25 October, 2019; v1 submitted 27 July, 2018; originally announced July 2018.

Comments: 14 pages, 15 figures; accepted version

Journal ref: IEEE JSAC 37 (2019), no. 11, 2573-2587

Showing 1–12 of 12 results for author: Hellge, C