Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory

Authors: Jonas Kälble, Sascha Wirges, Maxim Tatarchenko, Eddy Ilg

Abstract: Automated driving fundamentally requires knowledge about the surrounding geometry of the scene. Modern approaches use only captured images to predict occupancy maps that represent the geometry. Training these approaches requires accurate data that may be acquired with the help of LiDAR scanners. We show that the techniques used for current benchmarks and training datasets to convert LiDAR scans into occupancy grid maps yield very low quality, and subsequently present a novel approach using evidence theory that yields more accurate reconstructions. We demonstrate that these are superior by a large margin, both qualitatively and quantitatively, and that we additionally obtain meaningful uncertainty estimates. When converting the occupancy maps back to depth estimates and comparing them with the raw LiDAR measurements, our method yields a MAE improvement of 30% to 52% on nuScenes and 53% on Waymo over other occupancy ground-truth data. Finally, we use the improved occupancy maps to train a state-of-the-art occupancy prediction method and demonstrate that it improves the MAE by 25% on nuScenes.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2204.08780

Sensor Data Fusion in Top-View Grid Maps using Evidential Reasoning with Advanced Conflict Resolution

Authors: Sven Richter, Frank Bieder, Sascha Wirges, Christoph Stiller

Abstract: We present a new method to combine evidential top-view grid maps estimated based on heterogeneous sensor sources. Dempster's combination rule that is usually applied in this context provides undesired results with highly conflicting inputs. Therefore, we use more advanced evidential reasoning techniques and improve the conflict resolution by modeling the reliability of the evidence sources. We propose a data-driven reliability estimation to optimize the fusion quality using the Kitti-360 dataset. We apply the proposed method to the fusion of LiDAR and stereo camera data and evaluate the results qualitatively and quantitatively. The results demonstrate that our proposed method robustly combines measurements from heterogeneous sensors and successfully resolves sensor conflicts.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2204.07887

Mapping LiDAR and Camera Measurements in a Dual Top-View Grid Representation Tailored for Automated Vehicles

Authors: Sven Richter, Frank Bieder, Sascha Wirges, Christoph Stiller

Abstract: We present a generic evidential grid mapping pipeline designed for imaging sensors such as LiDARs and cameras. Our grid-based evidential model contains semantic estimates for cell occupancy and ground separately. We specify the estimation steps for input data represented by point sets, but mainly focus on input data represented by images such as disparity maps or LiDAR range images. Instead of relying on an external ground segmentation only, we deduce occupancy evidence by analyzing the surface orientation around measurements. We conduct experiments and evaluate the presented method using LiDAR and stereo camera data recorded in real traffic scenarios. Our method estimates cell occupancy robustly and with a high level of detail while maximizing efficiency and minimizing the dependency to external processing modules.

Submitted 21 April, 2022; v1 submitted 16 April, 2022; originally announced April 2022.

arXiv:2203.01180

Fast and Robust Ground Surface Estimation from LIDAR Measurements using Uniform B-Splines

Authors: Sascha Wirges, Kevin Rösch, Frank Bieder, Christoph Stiller

Abstract: We propose a fast and robust method to estimate the ground surface from LIDAR measurements on an automated vehicle. The ground surface is modeled as a UBS which is robust towards varying measurement densities and with a single parameter controlling the smoothness prior. We model the estimation process as a robust LS optimization problem which can be reformulated as a linear problem and thus solved efficiently. Using the SemanticKITTI data set, we conduct a quantitative evaluation by classifying the point-wise semantic annotations into ground and non-ground points. Finally, we validate the approach on our research vehicle in real-world scenarios.

Submitted 2 March, 2022; originally announced March 2022.

arXiv:2009.12276

doi 10.1109/MFI49285.2020.9235240

SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation

Authors: Juncong Fei, Wenbo Chen, Philipp Heidenreich, Sascha Wirges, Christoph Stiller

Abstract: 3D pedestrian detection is a challenging task in automated driving because pedestrians are relatively small, frequently occluded and easily confused with narrow vertical objects. LiDAR and camera are two commonly used sensor modalities for this task, which should provide complementary information. Unexpectedly, LiDAR-only detection methods tend to outperform multisensor fusion methods in public benchmarks. Recently, PointPainting has been presented to eliminate this performance drop by effectively fusing the output of a semantic segmentation network instead of the raw image information. In this paper, we propose a generalization of PointPainting to be able to apply fusion at different levels. After the semantic augmentation of the point cloud, we encode raw point data in pillars to get geometric features and semantic point data in voxels to get semantic features and fuse them in an effective way. Experimental results on the KITTI test set show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird's eye view pedestrian detection benchmarks. In particular, our approach demonstrates its strength in detecting challenging pedestrian cases and outperforms current state-of-the-art approaches.

Submitted 25 September, 2020; originally announced September 2020.

Comments: Accepted to present in the 2020 IEEE International Conference on Multisensor Fusion and Integration (MFI 2020)

arXiv:2005.06667

Exploiting Multi-Layer Grid Maps for Surround-View Semantic Segmentation of Sparse LiDAR Data

Authors: Frank Bieder, Sascha Wirges, Johannes Janosovits, Sven Richter, Zheyuan Wang, Christoph Stiller

Abstract: In this paper, we consider the transformation of laser range measurements into a top-view grid map representation to approach the task of LiDAR-only semantic segmentation. Since the recent publication of the SemanticKITTI data set, researchers are now able to study semantic segmentation of urban LiDAR sequences based on a reasonable amount of data. While other approaches propose to directly learn on the 3D point clouds, we are exploiting a grid map framework to extract relevant information and represent them by using multi-layer grid maps. This representation allows us to use well-studied deep learning architectures from the image domain to predict a dense semantic grid map using only the sparse input data of a single LiDAR scan. We compare single-layer and multi-layer approaches and demonstrate the benefit of a multi-layer grid map input. Since the grid map representation allows us to predict a dense, 360° semantic environment representation, we further develop a method to combine the semantic information from multiple scans and create dense ground truth grids. This method allows us to evaluate and compare the performance of our models not only based on grid cells with a detection, but on the full visible measurement range.

Submitted 13 May, 2020; originally announced May 2020.

arXiv:2003.00710

Learned Enrichment of Top-View Grid Maps Improves Object Detection

Authors: Sascha Wirges, Ye Yang, Sven Richter, Haohao Hu, Christoph Stiller

Abstract: We propose an object detector for top-view grid maps which is additionally trained to generate an enriched version of its input. Our goal in the joint model is to improve generalization by regularizing towards structural knowledge in form of a map fused from multiple adjacent range sensor measurements. This training data can be generated in an automatic fashion, thus does not require manual annotations. We present an evidential framework to generate training data, investigate different model architectures and show that predicting enriched inputs as an additional task can improve object detection performance.

Submitted 9 March, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

Comments: 6 pages, 6 figures, 4 tables

arXiv:2002.00667

Single-Stage Object Detection from Top-View Grid Maps on Custom Sensor Setups

Authors: Sascha Wirges, Shuxiao Ding, Christoph Stiller

Abstract: We present our approach to unsupervised domain adaptation for single-stage object detectors on top-view grid maps in automated driving scenarios. Our goal is to train a robust object detector on grid maps generated from custom sensor data and setups. We first introduce a single-stage object detector for grid maps based on RetinaNet. We then extend our model by image- and instance-level domain classifiers at different feature pyramid levels which are trained in an adversarial manner. This allows us to train robust object detectors for unlabeled domains. We evaluate our approach quantitatively on the nuScenes and KITTI benchmarks and present qualitative domain adaptation results for unlabeled measurements recorded by our experimental vehicle. Our results demonstrate that object detection accuracy for unlabeled domains can be improved by applying our domain adaptation strategy.

Submitted 3 February, 2020; originally announced February 2020.

Comments: 6 pages, 5 figures, 4 tables

arXiv:1906.01540

Localization in Aerial Imagery with Grid Maps using LocGAN

Authors: Haohao Hu, Junyi Zhu, Sascha Wirges, Martin Lauer

Abstract: In this work, we present LocGAN, our localization approach based on a geo-referenced aerial imagery and LiDAR grid maps. Currently, most self-localization approaches relate the current sensor observations to a map generated from previously acquired data. Unfortunately, this data is not always available and the generated maps are usually sensor setup specific. Global Navigation Satellite Systems (GNSS) can overcome this problem. However, they are not always reliable especially in urban areas due to multi-path and shadowing effects. Since aerial imagery is usually available, we can use it as prior information. To match aerial images with grid maps, we use conditional Generative Adversarial Networks (cGANs) which transform aerial images to the grid map domain. The transformation between the predicted and measured grid map is estimated using a localization network (LocNet). Given the geo-referenced aerial image transformation the vehicle pose can be estimated. Evaluations performed on the data recorded in region Karlsruhe, Germany show that our LocGAN approach provides reliable global localization results.

Submitted 16 July, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: 8 pages, 21 figures

arXiv:1904.12599

Self-Supervised Flow Estimation using Geometric Regularization with Applications to Camera Image and Grid Map Sequences

Authors: Sascha Wirges, Johannes Gräter, Qiuhao Zhang, Christoph Stiller

Abstract: We present a self-supervised approach to estimate flow in camera image and top-view grid map sequences using fully convolutional neural networks in the domain of automated driving. We extend existing approaches for self-supervised optical flow estimation by adding a regularizer expressing motion consistency assuming a static environment. However, as this assumption is violated for other moving traffic participants we also estimate a mask to scale this regularization. Adding a regularization towards motion consistency improves convergence and flow estimation accuracy. Furthermore, we scale the errors due to spatial flow inconsistency by a mask that we derive from the motion mask. This improves accuracy in regions where the flow drastically changes due to a better separation between static and dynamic environment. We apply our approach to optical flow estimation from camera image sequences, validate on odometry estimation and suggest a method to iteratively increase optical flow estimation accuracy using the generated motion masks. Finally, we provide quantitative and qualitative results based on the KITTI odometry and tracking benchmark for scene flow estimation based on grid map sequences. We show that we can improve accuracy and convergence when applying motion and spatial consistency regularization.

Submitted 17 April, 2019; originally announced April 2019.

Comments: 6 pages, 5 figures

arXiv:1901.11284

Capturing Object Detection Uncertainty in Multi-Layer Grid Maps

Authors: Sascha Wirges, Marcel Reith-Braun, Martin Lauer, Christoph Stiller

Abstract: We propose a deep convolutional object detector for automated driving applications that also estimates classification, pose and shape uncertainty of each detected object. The input consists of a multi-layer grid map which is well-suited for sensor fusion, free-space estimation and machine learning. Based on the estimated pose and shape uncertainty we approximate object hulls with bounded collision probability which we find helpful for subsequent trajectory planning tasks. We train our models based on the KITTI object detection data set. In a quantitative and qualitative evaluation some models show a similar performance and superior robustness compared to previously developed object detectors. However, our evaluation also points to undesired data set properties which should be addressed when training data-driven models or creating new data sets.

Submitted 31 January, 2019; originally announced January 2019.

Comments: 8 pages, 8 figures, 2 tables

arXiv:1805.08689

Object Detection and Classification in Occupancy Grid Maps using Deep Convolutional Networks

Authors: Sascha Wirges, Tom Fischer, Jesus Balado Frias, Christoph Stiller

Abstract: A detailed environment perception is a crucial component of automated vehicles. However, to deal with the amount of perceived information, we also require segmentation strategies. Based on a grid map environment representation, well-suited for sensor fusion, free-space estimation and machine learning, we detect and classify objects using deep convolutional neural networks. As input for our networks we use a multi-layer grid map efficiently encoding 3D range sensor information. The inference output consists of a list of rotated bounding boxes with associated semantic classes. We conduct extensive ablation studies, highlight important design considerations when using grid maps and evaluate our models on the KITTI Bird's Eye View benchmark. Qualitative and quantitative benchmark results show that we achieve robust detection and state of the art accuracy solely using top-view grid maps from range sensor data.

Submitted 5 December, 2018; v1 submitted 2 May, 2018; originally announced May 2018.

Comments: 6 pages, 4 tables, 4 figures

Journal ref: 2018 IEEE Intelligent Transportation Systems Conference (ITSC)

arXiv:1802.08632

An Approach to Vehicle Trajectory Prediction Using Automatically Generated Traffic Maps

Authors: Jannik Quehl, Haohao Hu, Sascha Wirges, Martin Lauer

Abstract: Trajectory and intention prediction of traffic participants is an important task in automated driving and crucial for safe interaction with the environment. In this paper, we present a new approach to vehicle trajectory prediction based on automatically generated maps containing statistical information about the behavior of traffic participants in a given area. These maps are generated based on trajectory observations using image processing and map matching techniques and contain all typical vehicle movements and probabilities in the considered area. Our prediction approach matches an observed trajectory to a behavior contained in the map and uses this information to generate a prediction. We evaluated our approach on a dataset containing over 14000 trajectories and found that it produces significantly more precise mid-term predictions compared to motion model-based prediction approaches.

Submitted 14 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

Comments: 6 Pages, 9 figures, submitted to IV 2018

arXiv:1801.05297

doi 10.1109/IVS.2018.8500635

Evidential Occupancy Grid Map Augmentation using Deep Learning

Authors: Sascha Wirges, Felix Hartenbach, Christoph Stiller

Abstract: A detailed environment representation is a crucial component of automated vehicles. Using single range sensor scans, data is often too sparse and subject to occlusions. Therefore, we present a method to augment occupancy grid maps from single views to be similar to evidential occupancy maps acquired from different views using Deep Learning. To accomplish this, we estimate motion between subsequent range sensor measurements and create an evidential 3D voxel map in an extensive post-processing step. Within this voxel map, we explicitly model uncertainty using evidence theory and create a 2D projection using combination rules. As input for our neural networks, we use a multi-layer grid map consisting of the three features detections, transmissions and intensity, each for ground and non-ground measurements. Finally, we perform a quantitative and qualitative evaluation which shows that different network architectures accurately infer evidential measures in real-time.

Submitted 5 December, 2018; v1 submitted 16 January, 2018; originally announced January 2018.

Comments: 6 pages, 5 figures

Journal ref: 2018 IEEE Intelligent Vehicles Symposium (IV)

arXiv:1706.05999

Guided Depth Upsampling for Precise Mapping of Urban Environments

Authors: Sascha Wirges, Björn Roxin, Eike Rehder, Tilman Kühner, Martin Lauer

Abstract: We present an improved model for MRF-based depth upsampling, guided by image- as well as 3D surface normal features. By exploiting the underlying camera model we define a novel regularization term that implicitly evaluates the planarity of arbitrary oriented surfaces. Our method improves upsampling quality in scenes composed of predominantly planar surfaces, such as urban areas. We use a synthetic dataset to demonstrate that our approach outperforms recent methods that implement distance-based regularization terms. Finally, we validate our approach for mapping applications on our experimental vehicle.

Submitted 19 June, 2017; originally announced June 2017.

Comments: 6 pages, 6 figures

Showing 1–15 of 15 results for author: Wirges, S