Open-vocabulary Attribute Detection

Bravo, María A.; Mittal, Sudhanshu; Ging, Simon; Brox, Thomas

Computer Science > Computer Vision and Pattern Recognition

arXiv:2211.12914v2 (cs)

[Submitted on 23 Nov 2022 (v1), last revised 8 Mar 2023 (this version, v2)]

Title: Open-vocabulary Attribute Detection

Authors: María A. Bravo, Sudhanshu Mittal, Simon Ging, Thomas Brox


Abstract: Vision-language modeling has enabled open-vocabulary tasks where predictions can be queried using any text prompt in a zero-shot manner. Existing open-vocabulary tasks focus on object classes, whereas research on object attributes is limited due to the lack of a reliable attribute-focused evaluation benchmark. This paper introduces the Open-Vocabulary Attribute Detection (OVAD) task and the corresponding OVAD benchmark. The objective of the novel task and benchmark is to probe object-level attribute information learned by vision-language models. To this end, we created a clean and densely annotated test set covering 117 attribute classes on the 80 object classes of MS COCO. It includes positive and negative annotations, which enables open-vocabulary evaluation. Overall, the benchmark consists of 1.4 million annotations. For reference, we provide a first baseline method for open-vocabulary attribute detection. Moreover, we demonstrate the benchmark's value by studying the attribute detection performance of several foundation models. Project page this https URL
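To illustrate why explicit positive and negative annotations matter for evaluation, the following is a minimal sketch of a per-attribute average precision that scores only annotated instances and ignores unannotated ones. It is not the paper's evaluation code: the choice of average precision as the metric, the function name `masked_average_precision`, and the label encoding (1 = positive, 0 = negative, `None` = unannotated) are all assumptions made for this example.

```python
from typing import Optional, Sequence


def masked_average_precision(
    scores: Sequence[float],
    labels: Sequence[Optional[int]],
) -> Optional[float]:
    """Average precision for one attribute over annotated instances only.

    Hypothetical label encoding (an assumption, not taken from the paper):
    1 = positive, 0 = negative, None = unannotated and therefore ignored.
    Returns None when no positive annotation exists for the attribute.
    """
    # Keep only instances that carry an explicit positive or negative label.
    annotated = [(s, l) for s, l in zip(scores, labels) if l is not None]
    num_pos = sum(l for _, l in annotated)
    if num_pos == 0:
        return None

    # Rank annotated instances by descending model score.
    annotated.sort(key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(annotated, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive
    return precision_sum / num_pos


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.4]
    labels = [1, None, 0, 1]  # the second instance is unannotated
    print(masked_average_precision(scores, labels))
```

In this toy case the unannotated instance is skipped rather than counted as a false positive. Treating it as a negative would lower the score (0.75 instead of roughly 0.83), which is the kind of distortion that explicit negative annotations avoid.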

Comments:	Accepted at CVPR 2023. this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2211.12914 [cs.CV]
	(or arXiv:2211.12914v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2211.12914

Submission history

From: Maria A. Bravo
[v1] Wed, 23 Nov 2022 12:34:43 UTC (26,591 KB)
[v2] Wed, 8 Mar 2023 19:29:46 UTC (30,752 KB)
