A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting

Zhuang, Junhao; Zeng, Yanhong; Liu, Wenran; Yuan, Chun; Chen, Kai

arXiv:2312.03594 (cs)

[Submitted on 6 Dec 2023 (v1), last revised 12 Dec 2023 (this version, v3)]

Abstract: Achieving high-quality versatile image inpainting, where user-specified regions are filled with plausible content according to user intent, presents a significant challenge. Existing methods face difficulties in simultaneously addressing context-aware image inpainting and text-guided object inpainting due to the distinct optimal training strategies required. To overcome this challenge, we introduce PowerPaint, the first high-quality and versatile inpainting model that excels in both tasks. First, we introduce learnable task prompts along with tailored fine-tuning strategies to guide the model's focus on different inpainting targets explicitly. This enables PowerPaint to accomplish various inpainting tasks by utilizing different task prompts, resulting in state-of-the-art performance. Second, we demonstrate the versatility of the task prompt in PowerPaint by showcasing its effectiveness as a negative prompt for object removal. Additionally, we leverage prompt interpolation techniques to enable controllable shape-guided object inpainting. Finally, we extensively evaluate PowerPaint on various inpainting benchmarks to demonstrate its superior performance for versatile image inpainting. We release our code and models on our project page: this https URL.
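
The abstract names two prompt-level mechanisms without giving formulas: a task prompt used as a negative prompt, and interpolation between task prompts. The sketch below illustrates one common way such operations are written, assuming a linear blend of two embedding vectors and the standard classifier-free guidance combination in which the negative-prompt prediction takes the place of the unconditional one. The function names, the plain-list vector type, and both formulas are assumptions made for illustration and are not taken from the PowerPaint code.

```python
from typing import List

Vector = List[float]


def interpolate_prompts(a: Vector, b: Vector, alpha: float) -> Vector:
    """Blend two prompt embeddings linearly: (1 - alpha) * a + alpha * b.

    Assumption: the abstract only says "prompt interpolation"; a linear
    blend is one plausible reading, not a confirmed detail of the paper.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]


def guided_prediction(pos: Vector, neg: Vector, scale: float) -> Vector:
    """Combine two noise predictions as neg + scale * (pos - neg).

    Assumption: this is the usual classifier-free guidance form, with the
    prediction under the negative prompt replacing the unconditional one.
    """
    if len(pos) != len(neg):
        raise ValueError("predictions must have the same length")
    return [n + scale * (p - n) for p, n in zip(pos, neg)]


# Toy vectors stand in for real embeddings and model outputs.
blended = interpolate_prompts([0.0, 2.0], [4.0, 6.0], alpha=0.25)
steered = guided_prediction([1.0, 2.0], [0.0, 1.0], scale=2.0)
```

With `alpha` at 0 the blend returns the first embedding unchanged and at 1 it returns the second, so a single scalar moves the conditioning between two task behaviours. In the guidance step, a `scale` above 1 pushes the result away from whatever the negative prompt describes.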

Comments:	Project page with code: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.03594 [cs.CV]
	(or arXiv:2312.03594v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.03594

Submission history

From: Junhao Zhuang
[v1] Wed, 6 Dec 2023 16:34:46 UTC (30,122 KB)
[v2] Thu, 7 Dec 2023 03:13:33 UTC (30,122 KB)
[v3] Tue, 12 Dec 2023 04:44:42 UTC (29,393 KB)
