Search | arXiv e-print repository

3R-INN: How to be climate friendly while consuming/delivering videos?

Authors: Zoubida Ameur, Claire-Hélène Demarty, Daniel Menard, Olivier Le Meur

Abstract: The consumption of a video requires a considerable amount of energy during the various stages of its life-cycle. With a billion hours of video consumed daily, this contributes significantly to greenhouse gas emissions. Therefore, reducing the end-to-end carbon footprint of the video chain, while preserving the quality of experience at the user side, is of high importance. To contribute in an impactful manner, we propose 3R-INN, a single light invertible network that does three tasks at once: given a high-resolution grainy image, it Rescales it to a lower resolution, Removes film grain and Reduces its power consumption when displayed. Providing such minimum viable quality content contributes to reducing the energy consumption during encoding, transmission, decoding and display. 3R-INN also offers the possibility to restore either the high-resolution grainy original image or a grain-free version, thanks to its invertibility and the disentanglement of the high frequency, and without transmitting auxiliary data. Experiments show that, while enabling significant energy savings for encoding (78%), decoding (77%) and rendering (5% to 20%), 3R-INN outperforms state-of-the-art film grain synthesis and energy-aware methods and achieves state-of-the-art performance on the rescaling task on different test sets.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2212.04184 [pdf, other]

Customizing Number Representation and Precision

Authors: Olivier Sentieys, Daniel Menard

Abstract: There is a growing interest in the use of reduced-precision arithmetic, exacerbated by the recent interest in artificial intelligence, especially with deep learning. Most architectures already provide reduced-precision capabilities (e.g., 8-bit integer, 16-bit floating point). In the context of FPGAs, any number format and bit-width can even be considered. In computer arithmetic, the representation of real numbers is a major issue. Fixed-point (FxP) and floating-point (FlP) are the main options to represent reals, both with their advantages and drawbacks. This chapter presents both FxP and FlP number representations, and draws a fair comparison between their cost, performance and energy, as well as their impact on accuracy during computations. It is shown that the choice between FxP and FlP is not obvious and strongly depends on the application considered. In some cases, low-precision floating-point arithmetic can be the most effective and provides some benefits over the classical fixed-point choice for energy-constrained applications.

Submitted 8 December, 2022; originally announced December 2022.

Comments: In press

arXiv:2012.14792 [pdf, other]

doi 10.1109/ICIP40778.2020.9190928

Quality-Driven Dynamic VVC Frame Partitioning for Efficient Parallel Processing

Authors: Thomas Amestoy, Wassim Hamidouche, Cyril Bergeron, Daniel Menard

Abstract: VVC is the next-generation video coding standard, offering coding capability beyond the HEVC standard. The high computational complexity of the latest video coding standards requires high-level parallelism techniques in order to achieve real-time and low-latency encoding and decoding. HEVC and VVC include tile grid partitioning, which allows rectangular regions of a frame to be processed simultaneously by independent threads. The tile grid may be further partitioned into a horizontal sub-grid of Rectangular Slices (RSs), increasing the partitioning flexibility. The dynamic Tile and Rectangular Slice (TRS) partitioning solution proposed in this paper benefits from this flexibility. The TRS partitioning is carried out at the frame level, taking into account both the spatial texture of the content and the encoding times of previously encoded frames. The proposed solution searches for the partitioning configuration that offers the best trade-off between multi-thread encoding time and encoding quality loss. Experiments show that the proposed solution, compared to uniform TRS partitioning, significantly decreases multi-thread encoding time, with slightly better encoding quality.

Submitted 29 December, 2020; originally announced December 2020.

Journal ref: 27th IEEE International Conference on Image Processing (ICIP 2020), Oct 2020, Abu Dhabi, United Arab Emirates. pp.3129-3133

arXiv:1909.01394 [pdf]

A Novel Loss Function Incorporating Imaging Acquisition Physics for PET Attenuation Map Generation using Deep Learning

Authors: Luyao Shi, John A. Onofrey, Enette Mae Revilla, Takuya Toyonaga, David Menard, Joseph Ankrah, Richard E. Carson, Chi Liu, Yihuan Lu

Abstract: In PET/CT imaging, CT is used for PET attenuation correction (AC). Mismatch between CT and PET due to patient body motion results in AC artifacts. In addition, artifacts caused by metal, beam-hardening and count-starving in CT itself also introduce inaccurate AC for PET. Maximum likelihood reconstruction of activity and attenuation (MLAA) was proposed to solve those issues by simultaneously reconstructing tracer activity ($λ$-MLAA) and attenuation map ($μ$-MLAA) based on the PET raw data only. However, $μ$-MLAA suffers from high noise and $λ$-MLAA suffers from large bias as compared to the reconstruction using the CT-based attenuation map ($μ$-CT). Recently, a convolutional neural network (CNN) was applied to predict the CT attenuation map ($μ$-CNN) from $λ$-MLAA and $μ$-MLAA, in which an image-domain loss (IM-loss) function between the $μ$-CNN and the ground truth $μ$-CT was used. However, IM-loss does not directly measure the AC errors according to the PET attenuation physics, where the line-integral projection of the attenuation map ($μ$) along the path of the two annihilation events, instead of the $μ$ itself, is used for AC. Therefore, a network trained with the IM-loss may yield suboptimal performance in the $μ$ generation. Here, we propose a novel line-integral projection loss (LIP-loss) function that incorporates the PET attenuation physics for $μ$ generation. Eighty training and twenty testing datasets of whole-body 18F-FDG PET and paired ground truth $μ$-CT were used. Quantitative evaluations showed that the model trained with the additional LIP-loss significantly outperformed the model trained solely with the IM-loss function.

Submitted 3 September, 2019; originally announced September 2019.

Comments: Accepted at MICCAI 2019

Showing 1–4 of 4 results for author: Menard, D