
doi 10.1016/j.cose.2023.103292

Duopoly insurers' incentives for data quality under a mandatory cyber data sharing regime

Authors: Carlos Barreto, Olof Reinert, Tobias Wiesinger, Ulrik Franke

Abstract: We study the impact of data sharing policies on cyber insurance markets. These policies have been proposed to address the scarcity of data about cyber threats, which is essential to manage cyber risks. We propose a Cournot duopoly competition model in which two insurers choose the number of policies they offer (i.e., their production level) and also the resources they invest to ensure the quality of data regarding the cost of claims (i.e., the data quality of their production cost). We find that enacting mandatory data sharing sometimes creates situations in which at most one of the two insurers invests in data quality, whereas both insurers would invest when information sharing is not mandatory. This raises concerns about the merits of making data sharing mandatory.

Submitted 29 May, 2023; originally announced August 2023.

Comments: 46 pages, 8 figures, to be published at Computers & Security

arXiv:2107.03070 [pdf, other]

Learning Stixel-based Instance Segmentation

Authors: Monty Santarossa, Lukas Schneider, Claudius Zelenka, Lars Schmarje, Reinhard Koch, Uwe Franke

Abstract: Stixels have been successfully applied to a wide range of vision tasks in autonomous driving, recently including instance segmentation. However, due to their sparse occurrence in the image, until now Stixels seldom served as input for Deep Learning algorithms, restricting their utility for such approaches. In this work we present StixelPointNet, a novel method to perform fast instance segmentation directly on Stixels. By regarding the Stixel representation as unstructured data similar to point clouds, architectures like PointNet are able to learn features from Stixels. We use a bounding box detector to propose candidate instances, for which the relevant Stixels are extracted from the input image. On these Stixels, a PointNet model learns binary segmentations, which we then unify throughout the whole image in a final selection step. StixelPointNet achieves state-of-the-art performance on Stixel-level, is considerably faster than pixel-based segmentation methods, and shows that with our approach the Stixel domain can be introduced to many new 3D Deep Learning tasks.

Submitted 7 July, 2021; originally announced July 2021.

Comments: Accepted for publication in IEEE Intelligent Vehicles Symposium

arXiv:2006.13084 [pdf, other]

Single-Shot 3D Detection of Vehicles from Monocular RGB Images via Geometry Constrained Keypoints in Real-Time

Authors: Nils Gählert, Jun-Jun Wan, Nicolas Jourdan, Jan Finkbeiner, Uwe Franke, Joachim Denzler

Abstract: In this paper we propose a novel 3D single-shot object detection method for detecting vehicles in monocular RGB images. Our approach lifts 2D detections to 3D space by predicting additional regression and classification parameters and hence keeping the runtime close to pure 2D object detection. The additional parameters are transformed to 3D bounding box keypoints within the network under geometric constraints. Our proposed method features a full 3D description including all three angles of rotation without supervision by any labeled ground truth data for the object's orientation, as it focuses on certain keypoints within the image plane. While our approach can be combined with any modern object detection framework with only little computational overhead, we exemplify the extension of SSD for the prediction of 3D bounding boxes. We test our approach on different datasets for autonomous driving and evaluate it using the challenging KITTI 3D Object Detection as well as the novel nuScenes Object Detection benchmarks. While we achieve competitive results on both benchmarks we outperform current state-of-the-art methods in terms of speed with more than 20 FPS for all tested datasets and image resolutions.

Submitted 23 June, 2020; originally announced June 2020.

Comments: 2020 IEEE IV Symposium

arXiv:2006.08547 [pdf, other]

Visibility Guided NMS: Efficient Boosting of Amodal Object Detection in Crowded Traffic Scenes

Authors: Nils Gählert, Niklas Hanselmann, Uwe Franke, Joachim Denzler

Abstract: Object detection is an important task in environment perception for autonomous driving. Modern 2D object detection frameworks such as Yolo, SSD or Faster R-CNN predict multiple bounding boxes per object that are refined using Non-Maximum-Suppression (NMS) to suppress all but one bounding box. While object detection itself is fully end-to-end learnable and does not require any manual parameter selection, standard NMS is parametrized by an overlap threshold that has to be chosen by hand. In practice, this often leads to an inability of standard NMS strategies to distinguish different objects in crowded scenes in the presence of high mutual occlusion, e.g. for parked cars or crowds of pedestrians. Our novel Visibility Guided NMS (vg-NMS) leverages both pixel-based as well as amodal object detection paradigms and improves the detection performance especially for highly occluded objects with little computational overhead. We evaluate vg-NMS using KITTI, VIPER as well as the Synscapes dataset and show that it outperforms current state-of-the-art NMS.

Submitted 15 June, 2020; originally announced June 2020.

Comments: Machine Learning for Autonomous Driving Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
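
Illustration (not part of the listing): the abstract above notes that standard NMS depends on a hand-chosen overlap threshold. The following is a minimal sketch of plain greedy NMS only, not of vg-NMS, whose details the abstract does not give. The names `iou`, `greedy_nms` and `iou_threshold`, and the `(x1, y1, x2, y2)` box layout, are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box
    does not exceed the hand-chosen iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Two heavily overlapping boxes (IoU = 81/119, about 0.68) and one distant box.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores, iou_threshold=0.5))
    print(greedy_nms(boxes, scores, iou_threshold=0.7))
```

With the threshold at 0.5 the second box is suppressed; at 0.7 it survives. If the two overlapping boxes were in fact two mutually occluding objects, the lower threshold would drop a true detection, which is the failure mode the abstract describes for crowded scenes.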

arXiv:2006.07864 [pdf, other]

Cityscapes 3D: Dataset and Benchmark for 9 DoF Vehicle Detection

Authors: Nils Gählert, Nicolas Jourdan, Marius Cordts, Uwe Franke, Joachim Denzler

Abstract: Detecting vehicles and representing their position and orientation in the three dimensional space is a key technology for autonomous driving. Recently, methods for 3D vehicle detection solely based on monocular RGB images gained popularity. In order to facilitate this task as well as to compare and drive state-of-the-art methods, several new datasets and benchmarks have been published. Ground truth annotations of vehicles are usually obtained using lidar point clouds, which often induces errors due to imperfect calibration or synchronization between both sensors. To this end, we propose Cityscapes 3D, extending the original Cityscapes dataset with 3D bounding box annotations for all types of vehicles. In contrast to existing datasets, our 3D annotations were labeled using stereo RGB images only and capture all nine degrees of freedom. This leads to a pixel-accurate reprojection in the RGB image and a higher range of annotations compared to lidar-based approaches. In order to ease multitask learning, we provide a pairing of 2D instance segments with 3D bounding boxes. In addition, we complement the Cityscapes benchmark suite with 3D vehicle detection based on the new annotations as well as metrics presented in this work. Dataset and benchmark are available online.

Submitted 14 June, 2020; originally announced June 2020.

Comments: 2020 "Scalability in Autonomous Driving" CVPR Workshop

arXiv:2005.12607 [pdf, other]

Illuminating a Blind Spot in Digitalization -- Software Development in Sweden's Private and Public Sector

Authors: Markus Borg, Joakim Wernberg, Thomas Olsson, Ulrik Franke, Martin Andersson

Abstract: As Netscape co-founder Marc Andreessen famously remarked in 2011, software is eating the world - becoming a pervasive invisible critical infrastructure. Data on the distribution of software use and development in society is scarce, but we compile results from two novel surveys to provide a fuller picture of the role software plays in the public and private sectors in Sweden, respectively. Three out of ten Swedish firms, across industry sectors, develop software in-house. The corresponding figure for Sweden's government agencies is four out of ten, i.e., the public sector should not be underestimated. The digitalization of society will continue, thus the demand for software developers will further increase. Many private firms report that the limited supply of software developers in Sweden is directly affecting their expansion plans. Based on our findings, we outline directions that need additional research to allow evidence-informed policy-making. We argue that such work should ideally be conducted by academic researchers and national statistics agencies in collaboration.

Submitted 26 May, 2020; originally announced May 2020.

Journal ref: In Proc. of the 1st International Workshop on Governance in Software Engineering (IEEE/ACM 42nd International Conference on Software Engineering Workshops (ICSEW'20), May 23-29, 2020, Seoul, Republic of Korea)

arXiv:1910.01466 [pdf, other]

doi 10.1007/s11263-019-01226-9

Slanted Stixels: A way to represent steep streets

Authors: Daniel Hernandez-Juarez, Lukas Schneider, Pau Cebrian, Antonio Espinosa, David Vazquez, Antonio M. Lopez, Uwe Franke, Marc Pollefeys, Juan C. Moure

Abstract: This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm, and then achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discarding the unlikely Stixel cuts, and apply the algorithm only on the remaining Stixel cuts. This work presents a novel over-segmentation strategy based on a Fully Convolutional Network (FCN), which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed methods in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset.

Submitted 2 October, 2019; originally announced October 2019.

Comments: Journal preprint (published in IJCV 2019: https://link.springer.com/article/10.1007/s11263-019-01226-9). arXiv admin note: text overlap with arXiv:1707.05397

Journal ref: IJCV 2019

arXiv:1907.08412 [pdf, other]

doi 10.1145/3338906.3340443

Risks and Assets: A Qualitative Study of a Software Ecosystem in the Mining Industry

Authors: Thomas Olsson, Ulrik Franke

Abstract: Digitalization and servitization are impacting many domains, including the mining industry. As the equipment becomes connected and technical infrastructure evolves, business models and risk management need to adapt. In this paper, we present a study on how changes in asset and risk distribution are evolving for the actors in a software ecosystem (SECO) and system-of-systems (SoS) around a mining operation. We have performed a survey to understand how Service Level Agreements (SLAs) -- a common mechanism for managing risk -- are used in other domains. Furthermore, we have performed a focus group study with companies. There is an overall trend in the mining industry to move the investment cost (CAPEX) from the mining operator to the vendors. Hence, the mining operator instead leases the equipment (as operational expense, OPEX) or even acquires a service. This change in business model impacts operation, as knowledge is moved from the mining operator to the suppliers. Furthermore, as the infrastructure becomes more complex, this implies that the mining operator is more and more reliant on the suppliers for the operation and maintenance. As this change is still in an early stage, there is no formalized risk management, e.g. through SLAs, in place. Rather, at present, the companies in the ecosystem rely more on trust and the incentives created by the promise of mutual future benefits of innovation activities. We believe there is a need to better understand how to manage risk in SECO as it is established and evolves.
At the same time, because the focus in a SECO is on cooperation and innovation, the companies do not have incentives to address this unless there is an incident. Therefore, industry needs, we believe, help in systematically understanding risk and defining quality aspects such as reliability and performance in the new business environment.

Submitted 19 July, 2019; originally announced July 2019.

arXiv:1906.04424 [pdf, other]

Sharing of vulnerability information among companies -- a survey of Swedish companies

Authors: Thomas Olsson, Martin Hell, Martin Höst, Ulrik Franke, Markus Borg

Abstract: Software products are rarely developed from scratch and vulnerabilities in such products might reside in parts that are either open source software or provided by another organization. Hence, the total cybersecurity of a product often depends on cooperation, explicit or implicit, between several organizations. We study the attitudes and practices of companies in software ecosystems towards sharing vulnerability information. Furthermore, we compare these practices to contemporary cybersecurity recommendations. This is performed through a questionnaire-based qualitative survey. The questionnaire is divided into two parts: the providers' perspective and the acquirers' perspective. The results show that companies are willing to share information with each other regarding vulnerabilities. Sharing is not considered to be harmful either to the cybersecurity or to their business, even though a majority of the respondents consider vulnerability information sensitive. However, the companies, despite being open to sharing, are less inclined to proactively share vulnerability information. Furthermore, the providers do not perceive that there is a large interest in vulnerability information from their customers. Hence, the companies' overall attitude to sharing vulnerability information is passive but open. In contrast, contemporary cybersecurity guidelines recommend active disclosure and sharing among actors in an ecosystem.

Submitted 11 June, 2019; originally announced June 2019.

Journal ref: Euromicro Conference on Software Engineering and Advanced Applications 2019

arXiv:1802.00312 [pdf, other]

Digitalization of Swedish Government Agencies - A Perspective Through the Lens of a Software Development Census

Authors: Markus Borg, Thomas Olsson, Ulrik Franke, Saïd Assar

Abstract: Software engineering is at the core of the digitalization of society. Ill-informed decisions can have major consequences, as made evident in the 2017 government crisis in Sweden, originating in a data breach caused by an outsourcing deal made by the Swedish Transport Agency. Many Government Agencies (GovAgs) in Sweden are rapidly undergoing a digital transition, thus it is important to overview how widespread, and mature, software development is in this part of the public sector. We present a software development census of Swedish GovAgs, complemented by document analysis and a survey. We show that 39.2% of the GovAgs develop software internally, some matching the number of developers in large companies. Our findings suggest that the development largely resembles private sector counterparts, and that established best practices are implemented. Still, we identify improvement potential in the areas of strategic sourcing, openness, collaboration across GovAgs, and quality requirements. The Swedish Government has announced the establishment of a new digitalization agency next year, and our hope is that the software engineering community will contribute its expertise with a clear voice.

Submitted 11 February, 2018; v1 submitted 1 February, 2018; originally announced February 2018.

Comments: Preprint of paper accepted for the 40th International Conference on Software Engineering, May 27-3 June 2018, Gothenburg, Sweden, Software Engineering in Society Track

arXiv:1708.06500 [pdf, other]

Sparsity Invariant CNNs

Authors: Jonas Uhrig, Nick Schneider, Lukas Schneider, Uwe Franke, Thomas Brox, Andreas Geiger

Abstract: In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings and will be made available upon publication.

Submitted 30 August, 2017; v1 submitted 22 August, 2017; originally announced August 2017.
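
Illustration (not part of the listing): the abstract above says the layer takes the location of missing data into account during convolution but does not spell out how. One plausible reading, shown here purely as an assumption, is a mask-normalized convolution: only observed inputs contribute, and the sum is divided by the number of observed inputs under the kernel. The 1D setting and the names `masked_conv1d`, `values`, `mask` and `weights` are hypothetical choices for this sketch.

```python
from typing import List, Tuple


def masked_conv1d(values: List[float], mask: List[int],
                  weights: List[float]) -> Tuple[List[float], List[int]]:
    """1D convolution that ignores unobserved inputs (mask == 0).

    Assumption for this sketch: each output is the weighted sum over
    observed inputs divided by the count of observed inputs under the
    kernel, so the result does not shrink as the input gets sparser.
    Returns the outputs and an output mask (1 where at least one
    observed input fell under the kernel).
    """
    n, k = len(values), len(weights)
    half = k // 2
    out: List[float] = []
    out_mask: List[int] = []
    for i in range(n):
        total = 0.0
        observed = 0
        for j in range(k):
            p = i + j - half  # input position under kernel tap j
            if 0 <= p < n and mask[p]:
                total += weights[j] * values[p]
                observed += 1
        out.append(total / observed if observed else 0.0)
        out_mask.append(1 if observed else 0)
    return out, out_mask


if __name__ == "__main__":
    # The middle sample is missing; its stored value (0.0) must not leak in.
    values = [2.0, 0.0, 4.0]
    mask = [1, 0, 1]
    print(masked_conv1d(values, mask, weights=[1.0, 1.0, 1.0]))
```

In this example the missing middle sample is filled with the mean of its two observed neighbours, whereas an ordinary convolution would treat the placeholder zero as a real measurement.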

arXiv:1707.05397 [pdf, other]

Slanted Stixels: Representing San Francisco's Steepest Streets

Authors: Daniel Hernandez-Juarez, Lukas Schneider, Antonio Espinosa, David Vázquez, Antonio M. López, Uwe Franke, Marc Pollefeys, Juan C. Moure

Abstract: In this work we present a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced that uses an extremely efficient over-segmentation. In doing so, the computational complexity of the Stixel inference algorithm is reduced significantly, achieving real-time computation capabilities with only a slight drop in accuracy. We evaluate the proposed approach in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset.

Submitted 17 July, 2017; originally announced July 2017.

Comments: Accepted to BMVC 2017 as oral presentation

arXiv:1707.03167 [pdf, other]

RegNet: Multimodal Sensor Registration Using Deep Neural Networks

Authors: Nick Schneider, Florian Piewak, Christoph Stiller, Uwe Franke

Abstract: In this paper, we present RegNet, the first deep convolutional neural network (CNN) to infer a 6 degrees of freedom (DOF) extrinsic calibration between multimodal sensors, exemplified using a scanning LiDAR and a monocular camera. Compared to existing approaches, RegNet casts all three conventional calibration steps (feature extraction, feature matching and global regression) into a single real-time capable CNN. Our method does not require any human interaction and bridges the gap between classical offline and target-less online calibration approaches as it provides both a stable initial estimation as well as a continuous online correction of the extrinsic parameters. During training we randomly decalibrate our system in order to train RegNet to infer the correspondence between projected depth measurements and RGB image and finally regress the extrinsic calibration. Additionally, with an iterative execution of multiple CNNs, that are trained on different magnitudes of decalibration, our approach compares favorably to state-of-the-art methods in terms of a mean calibration error of 0.28 degrees for the rotational and 6 cm for the translation components even for large decalibrations up to 1.5 m and 20 degrees.

Submitted 11 July, 2017; originally announced July 2017.

Comments: published in IEEE Intelligent Vehicles Symposium, 2017

arXiv:1704.00280 [pdf, other]

doi 10.1016/j.imavis.2017.01.009

The Stixel world: A medium-level representation of traffic scenes

Authors: Marius Cordts, Timo Rehfeld, Lukas Schneider, David Pfeiffer, Markus Enzweiler, Stefan Roth, Marc Pollefeys, Uwe Franke

Abstract: Recent progress in advanced driver assistance systems and the race towards autonomous vehicles is mainly driven by two factors: (1) increasingly sophisticated algorithms that interpret the environment around the vehicle and react accordingly, and (2) the continuous improvements of sensor technology itself. In terms of cameras, these improvements typically include higher spatial resolution, which as a consequence requires more data to be processed. The trend to add multiple cameras to cover the entire surrounding of the vehicle is not conducive in that matter. At the same time, an increasing number of special purpose algorithms need access to the sensor input data to correctly interpret the various complex situations that can occur, particularly in urban traffic. By observing those trends, it becomes clear that a key challenge for vision architectures in intelligent vehicles is to share computational resources. We believe this challenge should be faced by introducing a representation of the sensory data that provides compressed and structured access to all relevant visual content of the scene. The Stixel World discussed in this paper is such a representation. It is a medium-level model of the environment that is specifically designed to compress information about obstacles by leveraging the typical layout of outdoor traffic scenes. It has proven useful for a multitude of automotive vision applications, including object detection, tracking, segmentation, and mapping.
In this paper, we summarize the ideas behind the model and generalize it to take into account multiple dense input streams: the image itself, stereo depth maps, and semantic class probability maps that can be generated, e.g., by CNNs. Our generalization is embedded into a novel mathematical formulation for the Stixel model. We further sketch how the free parameters of the model can be learned using structured SVMs.

Submitted 2 April, 2017; originally announced April 2017.

Comments: Accepted for publication in Image and Vision Computing

arXiv:1612.06573 [pdf, other]

Detecting Unexpected Obstacles for Self-Driving Cars: Fusing Deep Learning and Geometric Modeling

Authors: Sebastian Ramos, Stefan Gehrig, Peter Pinggera, Uwe Franke, Carsten Rother

Abstract: The detection of small road hazards, such as lost cargo, is a vital capability for self-driving cars. We tackle this challenging and rarely addressed problem with a vision system that leverages appearance, contextual as well as geometric cues. To utilize the appearance and contextual cues, we propose a new deep learning-based obstacle detection framework. Here a variant of a fully convolutional network is used to predict a pixel-wise semantic labeling of (i) free-space, (ii) on-road unexpected obstacles, and (iii) background. The geometric cues are exploited using a state-of-the-art detection approach that predicts obstacles from stereo input images via model-based statistical hypothesis tests. We present a principled Bayesian framework to fuse the semantic and stereo-based detection results. The mid-level Stixel representation is used to describe obstacles in a flexible, compact and robust manner. We evaluate our new obstacle detection system on the Lost and Found dataset, which includes very challenging scenes with obstacles of only 5 cm height. Overall, we report a major improvement over the state-of-the-art, with relative performance gains of up to 50%. In particular, we achieve a detection rate of over 90% for distances of up to 50 m. Our system operates at 22 Hz on our self-driving platform.

Submitted 20 December, 2016; originally announced December 2016.

Comments: Submitted to the IEEE International Conference on Robotics and Automation (ICRA) 2017

arXiv:1609.04653 [pdf, other]

Lost and Found: Detecting Small Road Hazards for Self-Driving Vehicles

Authors: Peter Pinggera, Sebastian Ramos, Stefan Gehrig, Uwe Franke, Carsten Rother, Rudolf Mester

Abstract: Detecting small obstacles on the road ahead is a critical part of the driving task which has to be mastered by fully autonomous cars. In this paper, we present a method based on stereo vision to reliably detect such obstacles from a moving vehicle. The proposed algorithm performs statistical hypothesis tests in disparity space directly on stereo image data, assessing freespace and obstacle hypotheses on independent local patches. This detection approach does not depend on a global road model and handles both static and moving obstacles. For evaluation, we employ a novel lost-cargo image sequence dataset comprising more than two thousand frames with pixelwise annotations of obstacle and free-space and provide a thorough comparison to several stereo-based baseline methods. The dataset will be made available to the community to foster further research on this important topic. The proposed approach outperforms all considered baselines in our evaluations on both pixel and object level and runs at frame rates of up to 20 Hz on 2 mega-pixel stereo imagery. Small obstacles down to the height of 5 cm can successfully be detected at 20 m distance at low false positive rates.

Submitted 15 September, 2016; originally announced September 2016.

Comments: To be presented at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2016

arXiv:1608.00753 [pdf, other]

Semantically Guided Depth Upsampling

Authors: Nick Schneider, Lukas Schneider, Peter Pinggera, Uwe Franke, Marc Pollefeys, Christoph Stiller

Abstract: We present a novel method for accurate and efficient upsampling of sparse depth data, guided by high-resolution imagery. Our approach goes beyond the use of intensity cues only and additionally exploits object boundary cues through structured edge detection and semantic scene labeling for guidance. Both cues are combined within a geodesic distance measure that allows for boundary-preserving depth interpolation while utilizing local context. We model the observed scene structure by locally planar elements and formulate the upsampling task as a global energy minimization problem. Our method determines globally consistent solutions and preserves fine details and sharp depth boundaries. In our experiments on several public datasets at different levels of application, we demonstrate superior performance of our approach over the state-of-the-art, even for very sparse measurements.

Submitted 2 August, 2016; originally announced August 2016.

Comments: German Conference on Pattern Recognition 2016 (Oral)

arXiv:1604.05096 [pdf, other]

Pixel-level Encoding and Depth Layering for Instance-level Semantic Labeling

Authors: Jonas Uhrig, Marius Cordts, Uwe Franke, Thomas Brox

Abstract: Recent approaches for instance-aware semantic labeling have augmented convolutional neural networks (CNNs) with complex multi-task architectures or computationally expensive graphical models. We present a method that leverages a fully convolutional network (FCN) to predict semantic labels, depth and an instance-based encoding using each pixel's direction towards its corresponding instance center. Subsequently, we apply low-level computer vision techniques to generate state-of-the-art instance segmentation on the street scene datasets KITTI and Cityscapes. Our approach outperforms existing works by a large margin and can additionally predict absolute distances of individual instances from a monocular image as well as a pixel-level semantic labeling.

Submitted 14 July, 2016; v1 submitted 18 April, 2016; originally announced April 2016.

Comments: Accepted at GCPR 2016. Includes supplementary material

arXiv:1604.01685 [pdf, other]

The Cityscapes Dataset for Semantic Urban Scene Understanding

Authors: Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele

Abstract: Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes is comprised of a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high quality pixel-level annotations; 20000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.

Submitted 7 April, 2016; v1 submitted 6 April, 2016; originally announced April 2016.

Comments: Includes supplemental material
