-
ECRF: Entropy-Constrained Neural Radiance Fields Compression with Frequency Domain Optimization
Authors:
Soonbin Lee,
Fangwen Shu,
Yago Sanchez,
Thomas Schierl,
Cornelius Hellge
Abstract:
Explicit feature-grid based NeRF models have shown promising results in terms of rendering quality and significant speed-up in training. However, these methods often require a significant amount of data to represent a single scene or object. In this work, we present a compression model that aims to minimize the entropy in the frequency domain in order to effectively reduce the data size. First, we propose using the discrete cosine transform (DCT) on the tensorial radiance fields to compress the feature-grid. This feature-grid is transformed into coefficients, which are then quantized and entropy encoded, following a similar approach to the traditional video coding pipeline. Furthermore, to achieve a higher level of sparsity, we propose using an entropy parameterization technique for the frequency domain, specifically for DCT coefficients of the feature-grid. Since the transformed coefficients are optimized during the training phase, the proposed model does not require any fine-tuning or additional information. Our model only requires a lightweight compression pipeline for encoding and decoding, making it easier to apply volumetric radiance field methods for real-world applications. Experimental results demonstrate that our proposed frequency domain entropy model can achieve superior compression performance across various datasets. The source code will be made publicly available.
Submitted 23 November, 2023;
originally announced November 2023.
-
On the Limits of HARQ Prediction for Short Deterministic Codes with Error Detection in Memoryless Channels (Extended Version with Proofs)
Authors:
Barış Göktepe,
Cornelius Hellge,
Tatiana Rykova,
Thomas Schierl,
Slawomir Stanczak
Abstract:
We provide a mathematical framework to analyze the limits of Hybrid Automatic Repeat reQuest (HARQ) and derive analytical expressions for the most powerful test for estimating the decodability under maximum-likelihood decoding and $t$-error decoding. Furthermore, we numerically approximate the most powerful test for sum-product decoding. We compare the performance of previously studied HARQ prediction schemes and show that none of the state-of-the-art HARQ prediction schemes is the most powerful test for estimating the decodability of a partially received signal vector under maximum-likelihood decoding and sum-product decoding. Furthermore, we demonstrate that decoding in general is suboptimal for predicting the decodability.
Submitted 9 May, 2023;
originally announced May 2023.
-
Distributed Machine-Learning for Early HARQ Feedback Prediction in Cloud RANs
Authors:
Barış Göktepe,
Cornelius Hellge,
Thomas Schierl,
Slawomir Stanczak
Abstract:
In this work, we propose novel HARQ prediction schemes for Cloud RANs (C-RANs) that use feedback over a rate-limited feedback channel (2-6 bits) from the Remote Radio Heads (RRHs) to predict at the User Equipment (UE) the decoding outcome at the BaseBand Unit (BBU) ahead of actual decoding. In particular, we propose a Dual Autoencoding 2-Stage Gaussian Mixture Model (DA2SGMM) that is trained in an end-to-end fashion over the whole C-RAN setup. Using realistic link-level simulations in the sub-THz band at 100 GHz, we show that the novel DA2SGMM HARQ prediction scheme clearly outperforms all other adapted and state-of-the-art schemes. The DA2SGMM shows superior performance in terms of blockage detection as well as HARQ prediction in the no-blockage and single-blockage cases. In particular, the DA2SGMM with 4-bit feedback achieves a more than 200% higher throughput on average compared to its best alternative. Compared to regular HARQ, the DA2SGMM reduces the maximum transmission latency by more than 72.4%, while maintaining more than 75% of the throughput in the no-blockage scenario. In the single-blockage scenario, the DA2SGMM significantly increases the throughput for most of the evaluated Signal-to-Noise Ratios (SNRs) compared to regular HARQ.
Submitted 18 May, 2023; v1 submitted 17 February, 2022;
originally announced February 2022.
-
Open GOP Resolution Switching in HTTP Adaptive Streaming with VVC
Authors:
Robert Skupin,
Christian Bartnik,
Adam Wieckowski,
Yago Sanchez,
Benjamin Bross,
Cornelius Hellge,
Thomas Schierl
Abstract:
The user experience in adaptive HTTP streaming relies on offering bitrate ladders with suitable operation points for all users and typically involves multiple resolutions. While open GOP coding structures are generally known to provide a substantial coding efficiency benefit, their use in HTTP streaming has been precluded by the lack of support for reference picture resampling (RPR) in AVC and HEVC. The newly emerging Versatile Video Coding (VVC) standard supports RPR, but primarily conversational scenarios were investigated during the design of VVC. This paper aims at enabling the use of RPR in HTTP streaming scenarios by analysing the drift potential of VVC coding tools and presenting a constrained encoding method that avoids severe drift artefacts in resolution switching with open GOP coding in VVC. In typical live streaming configurations, the presented method achieves a BD-rate of -8.57% compared to closed GOP coding, while in a typical Video on Demand configuration, a BD-rate of -1.89% is reported. The penalty of the constraints compared to regular open GOP coding is 0.65% BD-rate in the worst case. The presented method was integrated into the publicly available open source VVC encoder VVenC v0.3.
Submitted 29 April, 2021; v1 submitted 11 March, 2021;
originally announced March 2021.
-
Feedback Prediction for Proactive HARQ in the Context of Industrial Internet of Things
Authors:
Baris Göktepe,
Tatiana Rykova,
Thomas Fehrenbach,
Thomas Schierl,
Cornelius Hellge
Abstract:
In this work, we investigate proactive Hybrid Automatic Repeat reQuest (HARQ) using link-level simulations for multiple packet sizes, modulation orders, BLock Error Rate (BLER) targets and two delay budgets of 1 ms and 2 ms, in the context of Industrial Internet of Things (IIoT) applications. In particular, we propose an enhanced proactive HARQ protocol using a feedback prediction mechanism. We show that the enhanced protocol achieves a significant gain over the classical proactive HARQ in terms of energy efficiency for almost all evaluated BLER targets, at least for sufficiently large feedback delays. Furthermore, we demonstrate that the proposed protocol clearly outperforms the classical proactive HARQ in all scenarios when taking into account a processing delay reduction due to the less complex prediction approach, achieving an energy efficiency gain in the range of 11% up to 15% for very stringent latency budgets of 1 ms at $10^{-2}$ BLER and from 4% up to 7.5% for less stringent latency budgets of 2 ms at $10^{-3}$ BLER. Finally, we show that power-constrained proactive HARQ with prediction even outperforms unconstrained reactive HARQ for sufficiently large feedback delays.
Submitted 18 February, 2022; v1 submitted 14 September, 2020;
originally announced September 2020.
-
Kalman Filter-based Head Motion Prediction for Cloud-based Mixed Reality
Authors:
Serhan Gül,
Sebastian Bosse,
Dimitri Podborski,
Thomas Schierl,
Cornelius Hellge
Abstract:
Volumetric video allows viewers to experience highly realistic 3D content with six degrees of freedom in mixed reality (MR) environments. Rendering complex volumetric videos can require a prohibitively high amount of computational power for mobile devices. A promising technique to reduce the computational burden on mobile devices is to perform the rendering at a cloud server. However, cloud-based rendering systems suffer from an increased interaction (motion-to-photon) latency that may cause registration errors in MR environments. One way of reducing the effective latency is to predict the viewer's head pose and render the corresponding view from the volumetric video in advance. In this paper, we design a Kalman filter for head motion prediction in our cloud-based volumetric video streaming system. We analyze the performance of our approach using recorded head motion traces and compare it to an autoregression model for different prediction intervals (look-ahead times). Our results show that the Kalman filter can predict head orientations 0.5 degrees more accurately than the autoregression model for a look-ahead time of 60 ms.
Submitted 28 July, 2020;
originally announced July 2020.
-
Cloud Rendering-based Volumetric Video Streaming System for Mixed Reality Services
Authors:
Serhan Gül,
Dimitri Podborski,
Jangwoo Son,
Gurdeep Singh Bhullar,
Thomas Buchholz,
Thomas Schierl,
Cornelius Hellge
Abstract:
Volumetric video is an emerging technology for immersive representation of 3D spaces that captures objects from all directions using multiple cameras and creates a dynamic 3D model of the scene. However, processing volumetric content requires a large amount of processing power and is still a very demanding task for today's mobile devices. To mitigate this, we propose a volumetric video streaming system that offloads the rendering to a powerful cloud/edge server and only sends the rendered 2D view to the client instead of the full volumetric content. We use 6DoF head movement prediction techniques, the WebRTC protocol and hardware video encoding to ensure low latency in different parts of the processing chain. We demonstrate our system using both a browser-based client and a Microsoft HoloLens client. Our application contains generic interfaces that allow for easy deployment of various augmented/mixed reality clients using the same server implementation.
Submitted 16 July, 2020; v1 submitted 5 March, 2020;
originally announced March 2020.
-
Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction
Authors:
Serhan Gül,
Dimitri Podborski,
Thomas Buchholz,
Thomas Schierl,
Cornelius Hellge
Abstract:
Volumetric video is an emerging key technology for immersive representation of 3D spaces and objects. Rendering volumetric video requires substantial computational power, which is especially challenging for mobile devices. To mitigate this, we developed a streaming system that renders a 2D view from the volumetric video at a cloud server and streams a 2D video stream to the client. However, such network-based processing increases the motion-to-photon (M2P) latency due to the additional network and processing delays. In order to compensate for the added latency, prediction of the future user pose is necessary. We developed a head motion prediction model and investigated its potential to reduce the M2P latency for different look-ahead times. Our results show that the presented model reduces the rendering errors caused by the M2P latency compared to a baseline system in which no prediction is performed.
Submitted 16 July, 2020; v1 submitted 17 January, 2020;
originally announced January 2020.
-
HTML5 MSE Playback of MPEG 360 VR Tiled Streaming
Authors:
Dimitri Podborski,
Jangwoo Son,
Gurdeep Singh Bhullar,
Robert Skupin,
Yago Sanchez,
Cornelius Hellge,
Thomas Schierl
Abstract:
Virtual Reality (VR) and 360-degree video streaming have gained significant attention in recent years. First standards have been published in order to avoid market fragmentation. For instance, 3GPP released its first VR specification to enable 360-degree video streaming over 5G networks which relies on several technologies specified in ISO/IEC 23090-2, also known as MPEG-OMAF. While some implementations of OMAF-compatible players have already been demonstrated at several trade shows, so far, no web browser-based implementations have been presented. In this demo paper we describe a browser-based JavaScript player implementation of the most advanced media profile of OMAF: HEVC-based viewport-dependent OMAF video profile, also known as tile-based streaming, with multi-resolution HEVC tiles. We also describe the applied workarounds for the implementation challenges we encountered with state-of-the-art HTML5 browsers. The presented implementation was tested in the Safari browser with support of HEVC video through the HTML5 Media Source Extensions API. In addition, the WebGL API was used for rendering, using region-wise packing metadata as defined in OMAF.
Submitted 23 April, 2019; v1 submitted 7 March, 2019;
originally announced March 2019.
-
Multi-Kernel Prediction Networks for Denoising of Burst Images
Authors:
Talmaj Marinč,
Vignesh Srinivasan,
Serhan Gül,
Cornelius Hellge,
Wojciech Samek
Abstract:
In low-light or short-exposure photography, the image is often corrupted by noise. While longer exposure helps reduce the noise, it can produce blurry results due to object and camera motion. The reconstruction of a noiseless image is an ill-posed problem. Recent approaches for image denoising aim to predict kernels which are convolved with a set of successively taken images (burst) to obtain a clear image. We propose a deep neural network based approach called Multi-Kernel Prediction Networks (MKPN) for burst image denoising. MKPN predicts kernels of not just one size but of varying sizes and fuses these different kernels, resulting in one kernel per pixel. The advantages of our method are twofold: (a) the differently sized kernels help in extracting different information from the image, which results in better reconstruction, and (b) kernel fusion ensures that the extracted information is retained while maintaining computational efficiency. Experimental results reveal that MKPN outperforms the state of the art on our synthetic datasets with different noise levels.
Submitted 11 March, 2021; v1 submitted 5 February, 2019;
originally announced February 2019.
-
URLLC Services in 5G - Low Latency Enhancements for LTE
Authors:
Thomas Fehrenbach,
Rohit Datta,
Barış Göktepe,
Thomas Wirth,
Cornelius Hellge
Abstract:
5G is envisioned to support three broad categories of services: eMBB, URLLC, and mMTC. URLLC services refer to future applications which require reliable data communications from one end to another, while fulfilling ultra-low latency constraints. In this paper, we highlight the requirements and mechanisms that are necessary for URLLC in LTE. Design challenges faced when reducing the latency in LTE are shown. The performance of short processing time and frame structure enhancements is analyzed. Our proposed DCI Duplication method to increase LTE control channel reliability is presented and evaluated. The feasibility of achieving low latency and high reliability for the IMT-2020 submission of LTE is shown. We further anticipate the opportunities and technical design challenges when evolving 3GPP's LTE and designing the new 5G NR standard to meet the requirements of novel URLLC services.
Submitted 22 August, 2018; v1 submitted 21 August, 2018;
originally announced August 2018.
-
Enhanced Machine Learning Techniques for Early HARQ Feedback Prediction in 5G
Authors:
Nils Strodthoff,
Barış Göktepe,
Thomas Schierl,
Cornelius Hellge,
Wojciech Samek
Abstract:
We investigate Early Hybrid Automatic Repeat reQuest (E-HARQ) feedback schemes enhanced by machine learning techniques as a path towards ultra-reliable and low-latency communication (URLLC). To this end, we propose machine learning methods to predict the outcome of the decoding process ahead of the end of the transmission. We discuss different input features and classification algorithms ranging from traditional methods to newly developed supervised autoencoders. These methods are evaluated based on their prospects of complying with the URLLC requirements of effective block error rates below $10^{-5}$ at small latency overheads. We provide realistic performance estimates in a system model incorporating scheduling effects to demonstrate the feasibility of E-HARQ across different signal-to-noise ratios, subcode lengths, channel conditions and system loads, and show the benefit over regular HARQ and existing E-HARQ schemes without machine learning.
Submitted 25 October, 2019; v1 submitted 27 July, 2018;
originally announced July 2018.