Object as Query: Lifting any 2D Object Detector to 3D Detection

Wang, Zitian; Huang, Zehao; Fu, Jiahui; Wang, Naiyan; Liu, Si

arXiv:2301.02364 (cs)

[Submitted on 6 Jan 2023 (v1), last revised 6 Nov 2023 (this version, v3)]

Abstract: 3D object detection from multi-view images has drawn much attention over the past few years. Existing methods mainly establish 3D representations from multi-view images and adopt a dense detection head for object detection, or employ object queries distributed in 3D space to localize objects. In this paper, we design the Multi-View 2D Objects guided 3D Object Detector (MV2D), which can lift any 2D object detector to multi-view 3D object detection. Since 2D detections can provide valuable priors for object existence, MV2D exploits 2D detectors to generate object queries conditioned on the rich image semantics. These dynamically generated queries help MV2D recall objects in the field of view and show a strong capability for localizing 3D objects. For the generated queries, we design a sparse cross-attention module that forces them to focus on the features of specific objects, which suppresses interference from noise. The evaluation results on the nuScenes dataset demonstrate that the dynamic object queries and sparse feature aggregation can improve 3D detection capability. MV2D also achieves state-of-the-art performance among existing methods. We hope MV2D can serve as a new baseline for future research. Code is available at this https URL.
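
The sparse cross-attention idea in the abstract can be illustrated with a small sketch. This is not the MV2D implementation: it assumes, for illustration only, that "focusing on the features of specific objects" means each query attends only to the feature-map cells inside its own 2D detection box. It uses a single view, a single head, and no learned projections, and the names `cells_in_box` and `sparse_cross_attention` are hypothetical.

```python
import math


def cells_in_box(box, height, width):
    """Flat indices of feature-map cells whose centers lie inside box.

    box is (x1, y1, x2, y2) in feature-map cell units (an assumption of
    this sketch; a real pipeline would rescale image-space boxes first).
    """
    x1, y1, x2, y2 = box
    return [
        r * width + c
        for r in range(height)
        for c in range(width)
        if x1 <= c + 0.5 <= x2 and y1 <= r + 0.5 <= y2
    ]


def sparse_cross_attention(queries, boxes, features, height, width):
    """Update each query using only the features inside its own 2D box.

    queries:  list of D-dimensional lists, one per 2D detection
    boxes:    list of (x1, y1, x2, y2), aligned with queries
    features: list of height * width D-dimensional lists (row-major)
    """
    dim = len(queries[0])
    outputs = []
    for query, box in zip(queries, boxes):
        idx = cells_in_box(box, height, width)
        if not idx:
            # No cell falls inside the box: leave the query unchanged.
            outputs.append(list(query))
            continue
        # Scaled dot-product scores, restricted to the in-box cells.
        scores = [
            sum(q * f for q, f in zip(query, features[i])) / math.sqrt(dim)
            for i in idx
        ]
        # Numerically stable softmax over the restricted set.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * features[i][d] for w, i in zip(weights, idx)) / total
            for d in range(dim)
        ])
    return outputs
```

Cells outside a query's box never enter its softmax, so features of other objects or background cannot contribute to that query, which is one simple way to read the noise-suppression claim.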

Comments:	Accepted by ICCV 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2301.02364 [cs.CV]
	(or arXiv:2301.02364v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2301.02364

Submission history

From: Zitian Wang
[v1] Fri, 6 Jan 2023 04:08:20 UTC (6,695 KB)
[v2] Mon, 5 Jun 2023 05:40:56 UTC (11,572 KB)
[v3] Mon, 6 Nov 2023 04:37:47 UTC (11,572 KB)
