Frequency-Aware Transformer for Learned Image Compression

Li, Han; Li, Shaohui; Dai, Wenrui; Li, Chenglin; Zou, Junni; Xiong, Hongkai

arXiv:2310.16387 (eess)

[Submitted on 25 Oct 2023 (v1), last revised 21 Mar 2024 (this version, v3)]

Title: Frequency-Aware Transformer for Learned Image Compression

Authors: Han Li, Shaohui Li, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong

Abstract: Learned image compression (LIC) has gained traction as an effective solution for image storage and transmission in recent years. However, existing LIC methods produce redundant latent representations because they are limited in capturing anisotropic frequency components and preserving directional details. To overcome these challenges, we propose a novel frequency-aware transformer (FAT) block that, for the first time, achieves multiscale directional analysis for LIC. The FAT block comprises frequency-decomposition window attention (FDWA) modules to capture multiscale and directional frequency components of natural images. Additionally, we introduce a frequency-modulation feed-forward network (FMFFN) to adaptively modulate different frequency components, improving rate-distortion performance. Furthermore, we present a transformer-based channel-wise autoregressive (T-CA) model that effectively exploits channel dependencies. Experiments show that our method achieves state-of-the-art rate-distortion performance compared to existing LIC methods, and clearly outperforms the latest standardized codec VTM-12.1 by 14.5%, 15.1%, and 13.0% in BD-rate on the Kodak, Tecnick, and CLIC datasets, respectively.
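
The BD-rate figures quoted in the abstract summarize the average bitrate difference between two codecs at equal quality. The following is a minimal sketch of how such a number is commonly computed, not the authors' evaluation script. It assumes PSNR as the quality measure and exactly four rate-quality points per codec, and the function names and sample values are hypothetical. It fits a cubic to log-rate as a function of PSNR for each codec, averages both fits over the shared PSNR range, and converts the difference to a percentage.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial through the points (xs, ys) at x.

    With four points this is the interpolating cubic.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(psnr, rate, lo, hi):
    """Average of the interpolated log-rate curve over [lo, hi].

    Simpson's rule is exact for cubics, so no numerical error is
    introduced beyond floating-point rounding.
    """
    log_rate = [math.log(r) for r in rate]
    mid = 0.5 * (lo + hi)
    f_lo = _lagrange(psnr, log_rate, lo)
    f_mid = _lagrange(psnr, log_rate, mid)
    f_hi = _lagrange(psnr, log_rate, hi)
    return (f_lo + 4.0 * f_mid + f_hi) / 6.0


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate in percent; negative means the test codec saves bits."""
    for seq in (rate_anchor, psnr_anchor, rate_test, psnr_test):
        if len(seq) != 4:
            raise ValueError("this sketch assumes exactly four points per codec")
    # Integrate only over the quality range covered by both curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")
    avg_anchor = _mean_log_rate(psnr_anchor, rate_anchor, lo, hi)
    avg_test = _mean_log_rate(psnr_test, rate_test, lo, hi)
    return (math.exp(avg_test - avg_anchor) - 1.0) * 100.0


# Hypothetical data: the test codec needs 85.5% of the anchor's bits
# at every PSNR level, so the result is approximately -14.5.
anchor_bpp = [0.2, 0.4, 0.7, 1.1]
psnr_db = [29.0, 32.0, 35.0, 38.0]
test_bpp = [0.855 * r for r in anchor_bpp]
example = bd_rate(anchor_bpp, psnr_db, test_bpp, psnr_db)
```

Under this sign convention a saving is negative, so "outperforms by 14.5% in BD-rate" corresponds to a value near -14.5 when VTM-12.1 is the anchor.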

Comments:	ICLR2024 poster
Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2310.16387 [eess.IV]
	(or arXiv:2310.16387v3 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2310.16387

Submission history

From: Han Li
[v1] Wed, 25 Oct 2023 05:59:25 UTC (10,467 KB)
[v2] Tue, 19 Mar 2024 04:15:28 UTC (10,035 KB)
[v3] Thu, 21 Mar 2024 04:52:57 UTC (10,035 KB)
