Search | arXiv e-print repository

Cool-chic video: Learned video coding with 800 parameters

Authors: Thomas Leguay, Théo Ladune, Pierrick Philippe, Olivier Déforges

Abstract: We propose a lightweight learned video codec with 900 multiplications per decoded pixel and 800 parameters overall. To the best of our knowledge, this is one of the neural video codecs with the lowest decoding complexity. It is built upon the overfitted image codec Cool-chic and supplements it with an inter coding module to leverage the video's temporal redundancies. The proposed model is able to… ▽ More We propose a lightweight learned video codec with 900 multiplications per decoded pixel and 800 parameters overall. To the best of our knowledge, this is one of the neural video codecs with the lowest decoding complexity. It is built upon the overfitted image codec Cool-chic and supplements it with an inter coding module to leverage the video's temporal redundancies. The proposed model is able to compress videos using both low-delay and random access configurations and achieves rate-distortion close to AVC while out-performing other overfitted codecs such as FFNeRV. The system is made open-source: orange-opensource.github.io/Cool-Chic. △ Less

Submitted 6 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: 10 pages, published in Data Compression Conference 2024

arXiv:2202.04365 [pdf, other]

AIVC: Artificial Intelligence based Video Codec

Authors: Théo Ladune, Pierrick Philippe

Abstract: This paper introduces AIVC, an end-to-end neural video codec. It is based on two conditional autoencoders MNet and CNet, for motion compensation and coding. AIVC learns to compress videos using any coding configurations through a single end-to-end rate-distortion optimization. Furthermore, it offers performance competitive with the recent video coder HEVC under several established test conditions.… ▽ More This paper introduces AIVC, an end-to-end neural video codec. It is based on two conditional autoencoders MNet and CNet, for motion compensation and coding. AIVC learns to compress videos using any coding configurations through a single end-to-end rate-distortion optimization. Furthermore, it offers performance competitive with the recent video coder HEVC under several established test conditions. A comprehensive ablation study is performed to evaluate the benefits of the different modules composing AIVC. The implementation is made available at https://orange-opensource.github.io/AIVC/. △ Less

Submitted 28 June, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Journal ref: ICIP 2022 (IEEE International Conference on Image Processing), Oct 2022, Bordeaux, France

arXiv:2104.09103 [pdf, other]

Conditional Coding and Variable Bitrate for Practical Learned Video Coding

Authors: Théo Ladune, Pierrick Philippe, Wassim Hamidouche, Lu Zhang, Olivier Déforges

Abstract: This paper introduces a practical learned video codec. Conditional coding and quantization gain vectors are used to provide flexibility to a single encoder/decoder pair, which is able to compress video sequences at a variable bitrate. The flexibility is leveraged at test time by choosing the rate and GOP structure to optimize a rate-distortion cost. Using the CLIC21 video test conditions, the prop… ▽ More This paper introduces a practical learned video codec. Conditional coding and quantization gain vectors are used to provide flexibility to a single encoder/decoder pair, which is able to compress video sequences at a variable bitrate. The flexibility is leveraged at test time by choosing the rate and GOP structure to optimize a rate-distortion cost. Using the CLIC21 video test conditions, the proposed approach shows performance on par with HEVC. △ Less

Submitted 20 April, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

Journal ref: CLIC workshop, CVPR 2021, Jun 2021, Nashville, United States

arXiv:2008.02580 [pdf, other]

Optical Flow and Mode Selection for Learning-based Video Coding

Authors: Théo Ladune, Pierrick Philippe, Wassim Hamidouche, Lu Zhang, Olivier Déforges

Abstract: This paper introduces a new method for inter-frame coding based on two complementary autoencoders: MOFNet and CodecNet. MOFNet aims at computing and conveying the Optical Flow and a pixel-wise coding Mode selection. The optical flow is used to perform a prediction of the frame to code. The coding mode selection enables competition between direct copy of the prediction or transmission through Codec… ▽ More This paper introduces a new method for inter-frame coding based on two complementary autoencoders: MOFNet and CodecNet. MOFNet aims at computing and conveying the Optical Flow and a pixel-wise coding Mode selection. The optical flow is used to perform a prediction of the frame to code. The coding mode selection enables competition between direct copy of the prediction or transmission through CodecNet. The proposed coding scheme is assessed under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding conditions, where it is shown to perform on par with the state-of-the-art video codec ITU/MPEG HEVC. Moreover, the possibility of copying the prediction enables to learn the optical flow in an end-to-end fashion i.e. without relying on pre-training and/or a dedicated loss term. △ Less

Submitted 6 August, 2020; originally announced August 2020.

Comments: MMSP 2020, IEEE 22nd International Workshop on Multimedia Signal Processing, Sep 2020, Tampere, Finland

arXiv:2007.02532 [pdf, other]

ModeNet: Mode Selection Network For Learned Video Coding

Authors: Théo Ladune, Pierrick Philippe, Wassim Hamidouche, Lu Zhang, Olivier Déforges

Abstract: In this paper, a mode selection network (ModeNet) is proposed to enhance deep learning-based video compression. Inspired by traditional video coding, ModeNet purpose is to enable competition among several coding modes. The proposed ModeNet learns and conveys a pixel-wise partitioning of the frame, used to assign each pixel to the most suited coding mode. ModeNet is trained alongside the different… ▽ More In this paper, a mode selection network (ModeNet) is proposed to enhance deep learning-based video compression. Inspired by traditional video coding, ModeNet purpose is to enable competition among several coding modes. The proposed ModeNet learns and conveys a pixel-wise partitioning of the frame, used to assign each pixel to the most suited coding mode. ModeNet is trained alongside the different coding modes to minimize a rate-distortion cost. It is a flexible component which can be generalized to other systems to allow competition between different coding tools. Mod-eNet interest is studied on a P-frame coding task, where it is used to design a method for coding a frame given its prediction. ModeNet-based systems achieve compelling performance when evaluated under the Challenge on Learned Image Compression 2020 (CLIC20) P-frame coding track conditions. △ Less

Submitted 31 July, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

Journal ref: Machine Learning for Signal Processing (MLSP) 2020, Sep 2020, Espoo, Finland

arXiv:2002.09259 [pdf, other]

Binary Probability Model for Learning Based Image Compression

Authors: Théo Ladune, Pierrick Philippe, Wassim Hamidouche, Lu Zhang, Olivier Deforges

Abstract: In this paper, we propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding , we propose to signal the latents with three binary values and one integer, with different probability models. A relaxation method is designed to perform gra… ▽ More In this paper, we propose to enhance learned image compression systems with a richer probability model for the latent variables. Previous works model the latents with a Gaussian or a Laplace distribution. Inspired by binary arithmetic coding , we propose to signal the latents with three binary values and one integer, with different probability models. A relaxation method is designed to perform gradient-based training. The richer probability model results in a better entropy coding leading to lower rate. Experiments under the Challenge on Learned Image Compression (CLIC) test conditions demonstrate that this method achieves 18% rate saving compared to Gaussian or Laplace models. △ Less

Submitted 21 February, 2020; originally announced February 2020.

Journal ref: International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020, 2020

arXiv:1407.2806 [pdf, other]

Bandits Warm-up Cold Recommender Systems

Authors: Jérémie Mary, Romaric Gaudel, Preux Philippe

Abstract: We address the cold start problem in recommendation systems assuming no contextual information is available neither about users, nor items. We consider the case in which we only have access to a set of ratings of items by users. Most of the existing works consider a batch setting, and use cross-validation to tune parameters. The classical method consists in minimizing the root mean square error ov… ▽ More We address the cold start problem in recommendation systems assuming no contextual information is available neither about users, nor items. We consider the case in which we only have access to a set of ratings of items by users. Most of the existing works consider a batch setting, and use cross-validation to tune parameters. The classical method consists in minimizing the root mean square error over a training subset of the ratings which provides a factorization of the matrix of ratings, interpreted as a latent representation of items and users. Our contribution in this paper is 5-fold. First, we explicit the issues raised by this kind of batch setting for users or items with very few ratings. Then, we propose an online setting closer to the actual use of recommender systems; this setting is inspired by the bandit framework. The proposed methodology can be used to turn any recommender system dataset (such as Netflix, MovieLens,...) into a sequential dataset. Then, we explicit a strong and insightful link between contextual bandit algorithms and matrix factorization; this leads us to a new algorithm that tackles the exploration/exploitation dilemma associated to the cold start problem in a strikingly new perspective. Finally, experimental evidence confirm that our algorithm is effective in dealing with the cold start problem on publicly available datasets. Overall, the goal of this paper is to bridge the gap between recommender systems based on matrix factorizations and those based on contextual bandits. △ Less

Submitted 10 July, 2014; originally announced July 2014.

Report number: RR-8563

Showing 1–7 of 7 results for author: Philippe, P