Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces

Egilmez, Hilmi E.; Singh, Ankitesh K.; Coban, Muhammed; Karczewicz, Marta; Zhu, Yinhao; Yang, Yang; Said, Amir; Cohen, Taco S.

doi:10.1109/OJSP.2021.3092257

arXiv:2103.01760 (eess)

[Submitted on 27 Feb 2021 (v1), last revised 27 Aug 2021 (this version, v2)]

Authors:Hilmi E. Egilmez, Ankitesh K. Singh, Muhammed Coban, Marta Karczewicz, Yinhao Zhu, Yang Yang, Amir Said, Taco S. Cohen

Abstract: Most existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for the non-subsampled RGB color format. However, to achieve superior coding performance, many state-of-the-art block-based compression standards, such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266), are designed primarily for the YUV 4:2:0 format, in which the U and V components are subsampled to exploit the human visual system's lower sensitivity to chroma detail. This paper investigates various DLEC designs that support the YUV 4:2:0 format and compares their performance against the main profiles of the HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. Experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for the RGB format and achieves about 10% average BD-rate improvement over intra-frame coding in HEVC.
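To make the YUV 4:2:0 layout mentioned in the abstract concrete, the following minimal Python sketch shows how a chroma plane shrinks to half resolution in each dimension and how the total sample count compares with a non-subsampled frame. The function names and the 2x2 averaging filter are illustrative assumptions; they are not taken from the paper, and practical codecs may use other downsampling filters and chroma siting.

```python
def subsample_420(plane):
    """Downsample a chroma plane by 2 in each dimension.

    Assumption: plain 2x2 block averaging, chosen only for illustration.
    The plane is a list of rows with even height and width.
    """
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("plane dimensions must be even")
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def sample_count_420(height, width):
    """Samples per frame: full-resolution Y plus two quarter-size chroma planes."""
    return height * width + 2 * (height // 2) * (width // 2)


def sample_count_444(height, width):
    """Samples per frame when all three components are full resolution."""
    return 3 * height * width


# Hypothetical 4x4 U plane used only to exercise the functions.
u_plane = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 50, 70, 70],
    [50, 50, 90, 90],
]
u_sub = subsample_420(u_plane)  # 2x2 plane
ratio = sample_count_420(4, 4) / sample_count_444(4, 4)  # 24 / 48
```

A 4:2:0 frame therefore carries half as many samples as a non-subsampled three-component frame of the same size, which is why an architecture built for three equally sized input planes cannot take YUV 4:2:0 data without some adaptation.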

Comments:	10 pages, accepted in IEEE Open Journal of Signal Processing (Special issue on Applied Artificial Intelligence and Machine Learning for Video Coding and Streaming)
Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
Cite as:	arXiv:2103.01760 [eess.IV]
	(or arXiv:2103.01760v2 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2103.01760
Related DOI:	https://doi.org/10.1109/OJSP.2021.3092257

Submission history

From: Hilmi Enes Egilmez
[v1] Sat, 27 Feb 2021 06:47:27 UTC (1,817 KB)
[v2] Fri, 27 Aug 2021 07:44:13 UTC (1,999 KB)
