Multi-modal Domain Adaptation for REG via Relation Transfer

Ding, Yifan; Wang, Liqiang; Gong, Boqing

arXiv:2309.13247 (cs)

[Submitted on 23 Sep 2023]

Abstract: Domain adaptation, which aims to transfer knowledge between domains, has been well studied in areas such as image classification and object detection. For multi-modal tasks, however, conventional approaches rely on large-scale pre-training, which is often impractical because multi-modal data is difficult to acquire. Domain adaptation, which can efficiently utilize knowledge from different datasets (domains), is therefore crucial for multi-modal tasks. In this paper, we focus on the Referring Expression Grounding (REG) task, which is to localize the image region described by a natural language expression. Specifically, we propose a novel relation-tailored approach that effectively transfers multi-modal knowledge for the REG problem. Our approach tackles the multi-modal domain adaptation problem by simultaneously enriching inter-domain relations and transferring relations between domains. Experiments show that our proposed approach significantly improves the transferability of multi-modal domains and enhances adaptation performance on the REG problem.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2309.13247 [cs.CV]
	(or arXiv:2309.13247v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.13247

Submission history

From: Yifan Ding
[v1] Sat, 23 Sep 2023 04:02:06 UTC (20,446 KB)
