Showing 1–7 of 7 results for author: Cohen, T S

  1. arXiv:2203.01978  [pdf, other]

    eess.IV cs.CV cs.LG

    Region-of-Interest Based Neural Video Compression

    Authors: Yura Perugachi-Diaz, Guillaume Sautière, Davide Abati, Yang Yang, Amirhossein Habibian, Taco S Cohen

    Abstract: Humans do not perceive all parts of a scene with the same resolution, but rather focus on a few regions of interest (ROIs). Traditional Object-Based codecs take advantage of this biological intuition, and are capable of non-uniform allocation of bits in favor of salient regions, at the expense of increased distortion in the remaining areas: such a strategy allows a boost in perceptual quality under low…

    Submitted 2 November, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Updated arxiv version to the camera-ready version after acceptance at British Machine Vision Conference (BMVC) 2022

  2. arXiv:2111.10302  [pdf, other]

    eess.IV cs.CV cs.LG

    Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set

    Authors: Ties van Rozendaal, Johann Brehmer, Yunfan Zhang, Reza Pourreza, Auke Wiggers, Taco S. Cohen

    Abstract: We introduce a video compression algorithm based on instance-adaptive learning. On each video sequence to be transmitted, we finetune a pretrained compression model. The optimal parameters are transmitted to the receiver along with the latent code. By entropy-coding the parameter updates under a suitable mixture model prior, we ensure that the network parameters can be encoded efficiently. This in…

    Submitted 23 June, 2023; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: Matches version published in TMLR

  3. arXiv:2104.00531  [pdf, other]

    eess.IV cs.CV cs.LG

    Extending Neural P-frame Codecs for B-frame Coding

    Authors: Reza Pourreza, Taco S Cohen

    Abstract: While most neural video codecs address P-frame coding (predicting each frame from past ones), in this paper we address B-frame compression (predicting frames using both past and future reference frames). Our B-frame solution is based on the existing P-frame methods. As a result, B-frame coding capability can easily be added to an existing neural codec. The basic idea of our B-frame coding method i…

    Submitted 5 August, 2021; v1 submitted 30 March, 2021; originally announced April 2021.

    Comments: ICCV 2021

  4. arXiv:2103.01760  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG cs.MM

    Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces

    Authors: Hilmi E. Egilmez, Ankitesh K. Singh, Muhammed Coban, Marta Karczewicz, Yinhao Zhu, Yang Yang, Amir Said, Taco S. Cohen

    Abstract: Most of the existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for non-subsampled RGB color format. However, in order to achieve a superior coding performance, many state-of-the-art block-based compression standards such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266) are designed primarily for YUV 4:2:0 format, where U…

    Submitted 27 August, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: 10 pages, accepted in IEEE Open Journal of Signal Processing (Special issue on Applied Artificial Intelligence and Machine Learning for Video Coding and Streaming)

  5. arXiv:2004.09691  [pdf, other]

    cs.LG eess.IV stat.ML

    A Data and Compute Efficient Design for Limited-Resources Deep Learning

    Authors: Mirgahney Mohamed, Gabriele Cesa, Taco S. Cohen, Max Welling

    Abstract: Thanks to their improved data efficiency, equivariant neural networks have gained increased interest in the deep learning community. They have been successfully applied in the medical domain where symmetries in the data can be effectively exploited to build more accurate and robust models. To be able to reach a much larger body of patients, mobile, on-device implementations of deep learning soluti…

    Submitted 8 July, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted for poster presentation at the Practical Machine Learning for Developing Countries (PML4DC) workshop, ICLR 2020

  6. arXiv:1911.04018  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Feedback Recurrent AutoEncoder

    Authors: Yang Yang, Guillaume Sautière, J. Jon Ryu, Taco S Cohen

    Abstract: In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the redundancy along the time dimension and allows a compact discrete representation of the data to be learned. We demonstrate its effectiveness in spee…

    Submitted 17 February, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

    Journal ref: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  7. arXiv:1908.05717  [pdf, other]

    eess.IV cs.LG stat.ML

    Video Compression With Rate-Distortion Autoencoders

    Authors: Amirhossein Habibian, Ties van Rozendaal, Jakub M. Tomczak, Taco S. Cohen

    Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find…

    Submitted 13 November, 2019; v1 submitted 14 August, 2019; originally announced August 2019.

    Comments: Accepted to ICCV 2019