Multi-View Fusion of Sensor Data for Improved Perception and Prediction in Autonomous Driving

Fadadu, Sudeep; Pandey, Shreyash; Hegde, Darshan; Shi, Yi; Chou, Fang-Chieh; Djuric, Nemanja; Vallespi-Gonzalez, Carlos

arXiv:2008.11901 (cs)

[Submitted on 27 Aug 2020 (v1), last revised 19 Oct 2021 (this version, v2)]

Abstract: We present an end-to-end method for object detection and trajectory prediction utilizing multi-view representations of LiDAR returns and camera images. In this work, we recognize the strengths and weaknesses of different view representations, and we propose an efficient and generic fusing method that aggregates benefits from all views. Our model builds on a state-of-the-art Bird's-Eye View (BEV) network that fuses voxelized features from a sequence of historical LiDAR data as well as a rasterized high-definition map to perform detection and prediction tasks. We extend this model with additional LiDAR Range-View (RV) features that use the raw LiDAR information in its native, non-quantized representation. The RV feature map is projected into BEV and fused with the BEV features computed from LiDAR and the high-definition map. The fused features are then further processed to output the final detections and trajectories, within a single end-to-end trainable network. In addition, the RV fusion of LiDAR and camera is performed in a straightforward and computationally efficient manner using this framework. The proposed multi-view fusion approach improves the state-of-the-art on proprietary large-scale real-world data collected by a fleet of self-driving vehicles, as well as on the public nuScenes dataset, with minimal increase in computational cost.
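
The abstract does not specify how the RV feature map is projected into BEV or how the two feature sets are combined. As a rough illustration only, the NumPy sketch below assumes that each LiDAR point carries a feature vector sampled from the RV feature map, that points landing in the same BEV cell are averaged, and that fusion is a channel-wise concatenation. The function names `project_rv_to_bev` and `fuse_views`, the grid parameters, and the pooling and fusion choices are all hypothetical and are not taken from the paper.

```python
import numpy as np


def project_rv_to_bev(rv_feats, xy, x_range, y_range, cell):
    """Average per-point RV features into the BEV cells the points fall in.

    Assumed inputs (hypothetical, not from the paper):
      rv_feats: (N, C) array, one RV feature vector per LiDAR point.
      xy:       (N, 2) array of point x/y coordinates in metres.
      x_range, y_range: (min, max) extent of the BEV grid in metres.
      cell:     BEV cell size in metres.
    Returns a (C, H, W) BEV feature map; cells with no points stay zero.
    """
    w = int(round((x_range[1] - x_range[0]) / cell))
    h = int(round((y_range[1] - y_range[0]) / cell))

    # Integer cell index of every point.
    col = np.floor((xy[:, 0] - x_range[0]) / cell).astype(int)
    row = np.floor((xy[:, 1] - y_range[0]) / cell).astype(int)

    # Drop points that fall outside the grid.
    keep = (col >= 0) & (col < w) & (row >= 0) & (row < h)
    flat = row[keep] * w + col[keep]

    # Unbuffered scatter-add, so repeated cell indices accumulate correctly.
    channels = rv_feats.shape[1]
    sums = np.zeros((h * w, channels), dtype=np.float64)
    counts = np.zeros(h * w, dtype=np.float64)
    np.add.at(sums, flat, rv_feats[keep])
    np.add.at(counts, flat, 1.0)

    # Mean-pool; dividing by at least 1 leaves empty cells at zero.
    mean = sums / np.maximum(counts, 1.0)[:, None]
    return mean.reshape(h, w, channels).transpose(2, 0, 1)


def fuse_views(bev_feats, rv_in_bev):
    """Concatenate two (C, H, W) feature maps along the channel axis."""
    return np.concatenate([bev_feats, rv_in_bev], axis=0)
```

In this sketch the fused tensor would then be passed to the remaining BEV layers; a learned fusion (for example a convolution over the concatenated channels) could replace the plain concatenation.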

Comments:	Accepted for publication at IEEE Winter Conference on Applications of Computer Vision (WACV) 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2008.11901 [cs.CV]
	(or arXiv:2008.11901v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2008.11901

Submission history

From: Fang-Chieh Chou
[v1] Thu, 27 Aug 2020 03:32:25 UTC (1,600 KB)
[v2] Tue, 19 Oct 2021 00:36:07 UTC (2,199 KB)
