Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks

Liu, Yuhao; Ke, Zhanghan; Liu, Fang; Zhao, Nanxuan; Lau, Rynson W. H.

arXiv:2403.00644 (cs)

[Submitted on 1 Mar 2024 (v1), last revised 28 May 2024 (this version, v4)]

Abstract: Diffusion models trained on large-scale datasets have achieved remarkable progress in image synthesis. However, due to the randomness in the diffusion process, they often struggle with diverse low-level tasks that require detail preservation. To overcome this limitation, we present a new Diff-Plugin framework that enables a single pre-trained diffusion model to generate high-fidelity results across a variety of low-level tasks. Specifically, we first propose a lightweight Task-Plugin module with a dual-branch design to provide task-specific priors, guiding the diffusion process in preserving image content. We then propose a Plugin-Selector that can automatically select different Task-Plugins based on the text instruction, allowing users to edit images by indicating multiple low-level tasks with natural language. We conduct extensive experiments on 8 low-level vision tasks. The results demonstrate the superiority of Diff-Plugin over existing methods, particularly in real-world scenarios. Our ablations further validate that Diff-Plugin is stable, schedulable, and supports robust training across different dataset sizes.

Comments: Accepted to CVPR 2024. Replaced some celebrity images to avoid copyright disputes.
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2403.00644 [cs.CV] (or arXiv:2403.00644v4 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2403.00644

Submission history

From: Yuhao Liu
[v1] Fri, 1 Mar 2024 16:25:17 UTC (18,631 KB)
[v2] Sat, 6 Apr 2024 10:10:33 UTC (29,639 KB)
[v3] Thu, 11 Apr 2024 05:48:36 UTC (31,624 KB)
[v4] Tue, 28 May 2024 07:34:22 UTC (26,481 KB)
