-
OORD: The Oxford Offroad Radar Dataset
Authors:
Matthew Gadd,
Daniele De Martini,
Oliver Bartlett,
Paul Murcutt,
Matt Towlson,
Matthew Widojo,
Valentina Muşat,
Luke Robinson,
Efimia Panagiotaki,
Georgi Pramatarov,
Marc Alexander Kühn,
Letizia Marchegiani,
Paul Newman,
Lars Kunze
Abstract:
There is a growing academic interest as well as commercial exploitation of millimetre-wave scanning radar for autonomous vehicle localisation and scene understanding. Although several datasets to support this research area have been released, they are primarily focused on urban or semi-urban environments. Nevertheless, rugged offroad deployments are important application areas which also present unique challenges and opportunities for this sensor technology. Therefore, the Oxford Offroad Radar Dataset (OORD) presents data collected in the rugged Scottish highlands in extreme weather. The radar data we offer to the community are accompanied by GPS/INS reference - to further stimulate research in radar place recognition. In total we release over 90GiB of radar scans as well as GPS and IMU readings by driving a diverse set of four routes over 11 forays, totalling approximately 154km of rugged driving. This is an area increasingly explored in literature, and we therefore present and release examples of recent open-sourced radar place recognition systems and their performance on our dataset. This includes a learned neural network, the weights of which we also release. The data and tools are made freely available to the community at https://oxford-robotics-institute.github.io/oord-dataset.
Submitted 25 May, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
Frequency-Time Diffusion with Neural Cellular Automata
Authors:
John Kalkhof,
Arlene Kühn,
Yannik Frisch,
Anirban Mukhopadhyay
Abstract:
Despite considerable success, large Denoising Diffusion Models (DDMs) with UNet backbone pose practical challenges, particularly on limited hardware and in processing gigapixel images. To address these limitations, we introduce two Neural Cellular Automata (NCA)-based DDMs: Diff-NCA and FourierDiff-NCA. Capitalizing on the local communication capabilities of NCA, Diff-NCA significantly reduces the parameter counts of NCA-based DDMs. Integrating Fourier-based diffusion enables global communication early in the diffusion process. This feature is particularly valuable in synthesizing complex images with important global features, such as the CelebA dataset. We demonstrate that even a 331k parameter Diff-NCA can generate 512x512 pathology slices, while FourierDiff-NCA (1.1m parameters) reaches a three times lower FID score of 43.86, compared to the four times bigger UNet (3.94m parameters) with a score of 128.2. Additionally, FourierDiff-NCA can perform diverse tasks such as super-resolution, out-of-distribution image synthesis, and inpainting without explicit training.
Submitted 13 May, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
S-TREK: Sequential Translation and Rotation Equivariant Keypoints for local feature extraction
Authors:
Emanuele Santellani,
Christian Sormann,
Mattia Rossi,
Andreas Kuhn,
Friedrich Fraundorfer
Abstract:
In this work we introduce S-TREK, a novel local feature extractor that combines a deep keypoint detector, which is both translation and rotation equivariant by design, with a lightweight deep descriptor extractor. We train the S-TREK keypoint detector within a framework inspired by reinforcement learning, where we leverage a sequential procedure to maximize a reward directly related to keypoint repeatability. Our descriptor network is trained following a "detect, then describe" approach, where the descriptor loss is evaluated only at those locations where keypoints have been selected by the already trained detector. Extensive experiments on multiple benchmarks confirm the effectiveness of our proposed method, with S-TREK often outperforming other state-of-the-art methods in terms of repeatability and quality of the recovered poses, especially when dealing with in-plane rotations.
Submitted 28 August, 2023;
originally announced August 2023.
-
Textual Explanations for Automated Commentary Driving
Authors:
Marc Alexander Kühn,
Daniel Omeiza,
Lars Kunze
Abstract:
The provision of natural language explanations for the predictions of deep-learning-based vehicle controllers is critical as it enhances transparency and easy audit. In this work, a state-of-the-art (SOTA) prediction and explanation model is thoroughly evaluated and validated (as a benchmark) on the new Sense--Assess--eXplain (SAX). Additionally, we developed a new explainer model that improved over the baseline architecture in two ways: (i) an integration of part of speech prediction and (ii) an introduction of special token penalties. On the BLEU metric, our explanation generation technique outperformed SOTA by a factor of 7.7 when applied on the BDD-X dataset. The description generation technique is also improved by a factor of 1.3. Hence, our work contributes to the realisation of future explainable autonomous vehicles.
Submitted 12 April, 2023;
originally announced April 2023.
-
DELS-MVS: Deep Epipolar Line Search for Multi-View Stereo
Authors:
Christian Sormann,
Emanuele Santellani,
Mattia Rossi,
Andreas Kuhn,
Friedrich Fraundorfer
Abstract:
We propose a novel approach for deep learning-based Multi-View Stereo (MVS). For each pixel in the reference image, our method leverages a deep architecture to search for the corresponding point in the source image directly along the corresponding epipolar line. We denote our method DELS-MVS: Deep Epipolar Line Search Multi-View Stereo. Previous works in deep MVS select a range of interest within the depth space, discretize it, and sample the epipolar line according to the resulting depth values: this can result in an uneven scanning of the epipolar line, hence of the image space. Instead, our method works directly on the epipolar line: this guarantees an even scanning of the image space and avoids both the need to select a depth range of interest, which is often not known a priori and can vary dramatically from scene to scene, and the need for a suitable discretization of the depth space. In fact, our search is iterative, which avoids the building of a cost volume, costly both to store and to process. Finally, our method performs a robust geometry-aware fusion of the estimated depth maps, leveraging a confidence predicted alongside each depth. We test DELS-MVS on the ETH3D, Tanks and Temples and DTU benchmarks and achieve competitive results with respect to state-of-the-art approaches.
Submitted 13 December, 2022;
originally announced December 2022.
-
MD-Net: Multi-Detector for Local Feature Extraction
Authors:
Emanuele Santellani,
Christian Sormann,
Mattia Rossi,
Andreas Kuhn,
Friedrich Fraundorfer
Abstract:
Establishing a sparse set of keypoint correspondences between images is a fundamental task in many computer vision pipelines. Often, this translates into a computationally expensive nearest neighbor search, where every keypoint descriptor at one image must be compared with all the descriptors at the others. In order to lower the computational cost of the matching phase, we propose a deep feature extraction network capable of detecting a predefined number of complementary sets of keypoints at each image. Since only the descriptors within the same set need to be compared across the different images, the matching phase computational complexity decreases with the number of sets. We train our network to predict the keypoints and compute the corresponding descriptors jointly. In particular, in order to learn complementary sets of keypoints, we introduce a novel unsupervised loss which penalizes intersections among the different sets. Additionally, we propose a novel descriptor-based weighting scheme meant to penalize the detection of keypoints with non-discriminative descriptors. With extensive experiments we show that our feature extraction network, trained only on synthetically warped images and in a fully unsupervised manner, achieves competitive results on 3D reconstruction and re-localization tasks at a reduced matching complexity.
Submitted 10 August, 2022;
originally announced August 2022.
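The complexity claim in the MD-Net abstract can be made concrete with a small count. The sketch below is illustrative only: it assumes brute-force matching between two images and an even split of keypoints across sets, neither of which the abstract states. The function name `descriptor_comparisons` and the keypoint counts are hypothetical.

```python
def descriptor_comparisons(keypoints_per_image: int, num_sets: int = 1) -> int:
    """Count descriptor comparisons needed to match two images.

    Assumption (not from the abstract): keypoints are split evenly into
    `num_sets` sets, and descriptors are compared by brute force only
    against descriptors of the same set in the other image.
    """
    if keypoints_per_image % num_sets != 0:
        raise ValueError("keypoints_per_image must be divisible by num_sets")
    per_set = keypoints_per_image // num_sets
    # Each set contributes per_set * per_set comparisons.
    return num_sets * per_set * per_set


single = descriptor_comparisons(2048)     # all keypoints in one set
split = descriptor_comparisons(2048, 4)   # four complementary sets
print(single, split, single // split)
```

Under these assumptions, N keypoints in K sets cost K * (N/K)^2 = N^2 / K comparisons, so the count falls in proportion to the number of sets.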
-
IB-MVS: An Iterative Algorithm for Deep Multi-View Stereo based on Binary Decisions
Authors:
Christian Sormann,
Mattia Rossi,
Andreas Kuhn,
Friedrich Fraundorfer
Abstract:
We present a novel deep-learning-based method for Multi-View Stereo. Our method estimates high-resolution and highly precise depth maps iteratively, by traversing the continuous space of feasible depth values at each pixel in a binary decision fashion. The decision process leverages a deep-network architecture: this computes a pixelwise binary mask that establishes whether each pixel's actual depth is in front of or behind its individual depth hypothesis at the current iteration. Moreover, in order to handle occluded regions, at each iteration the results from different source images are fused using pixelwise weights estimated by a second network. Thanks to the adopted binary decision strategy, which permits an efficient exploration of the depth space, our method can handle high-resolution images without trading resolution and precision. This sets it apart from most alternative learning-based Multi-View Stereo methods, where the explicit discretization of the depth space requires the processing of large cost volumes. We compare our method with state-of-the-art Multi-View Stereo methods on the DTU, Tanks and Temples and the challenging ETH3D benchmarks and show competitive results.
Submitted 29 November, 2021;
originally announced November 2021.
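The binary decision idea in the IB-MVS abstract can be sketched for a single pixel as plain bisection over a depth interval. This is not the authors' implementation: in the paper the in-front/behind decision comes from a network, whereas here a hypothetical oracle `is_behind` that knows the true depth stands in for it. The depth range and iteration count are assumptions.

```python
from typing import Callable


def binary_depth_search(
    is_behind: Callable[[float], bool],
    d_min: float,
    d_max: float,
    iterations: int = 10,
) -> float:
    """Refine one pixel's depth by repeated in-front/behind decisions.

    `is_behind(h)` is a hypothetical stand-in for the learned decision:
    it returns True when the actual depth lies behind hypothesis `h`.
    """
    lo, hi = d_min, d_max
    for _ in range(iterations):
        hypothesis = 0.5 * (lo + hi)
        if is_behind(hypothesis):
            lo = hypothesis   # actual depth is farther than the hypothesis
        else:
            hi = hypothesis   # actual depth is at or in front of the hypothesis
    return 0.5 * (lo + hi)


# Toy oracle: compare each hypothesis against a known depth.
true_depths = [1.0, 2.5, 4.0, 7.25]
estimates = [
    binary_depth_search(lambda h, t=t: t > h, 0.0, 8.0) for t in true_depths
]
print(estimates)
```

Each decision halves the interval, so precision grows exponentially with the number of iterations and no discretized cost volume over the depth range is needed.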
-
BP-MVSNet: Belief-Propagation-Layers for Multi-View-Stereo
Authors:
Christian Sormann,
Patrick Knöbelreiter,
Andreas Kuhn,
Mattia Rossi,
Thomas Pock,
Friedrich Fraundorfer
Abstract:
In this work, we propose BP-MVSNet, a convolutional neural network (CNN)-based Multi-View-Stereo (MVS) method that uses a differentiable Conditional Random Field (CRF) layer for regularization. To this end, we propose to extend the BP layer and add what is necessary to successfully use it in the MVS setting. We therefore show how we can calculate a normalization based on the expected 3D error, which we can then use to normalize the label jumps in the CRF. This is required to make the BP layer invariant to different scales in the MVS setting. In order to also enable fractional label jumps, we propose a differentiable interpolation step, which we embed into the computation of the pairwise term. These extensions allow us to integrate the BP layer into a multi-scale MVS network, where we continuously improve a rough initial estimate until we get high quality depth maps as a result. We evaluate the proposed BP-MVSNet in an ablation study and conduct extensive experiments on the DTU, Tanks and Temples and ETH3D data sets. The experiments show that we can significantly outperform the baseline and achieve state-of-the-art results.
Submitted 23 October, 2020;
originally announced October 2020.
-
Joint Graph-based Depth Refinement and Normal Estimation
Authors:
Mattia Rossi,
Mireille El Gheche,
Andreas Kuhn,
Pascal Frossard
Abstract:
Depth estimation is an essential component in understanding the 3D geometry of a scene, with numerous applications in urban and indoor settings. These scenes are characterized by a prevalence of human-made structures, which, in most cases, are either inherently piece-wise planar or can be approximated as such. In these settings, we devise a novel depth refinement framework that aims at recovering the underlying piece-wise planarity of the inverse depth map. We formulate this task as an optimization problem involving a data fidelity term that minimizes the distance to the input inverse depth map, as well as a regularization that enforces a piece-wise planar solution. As for the regularization term, we model the inverse depth map as a weighted graph between pixels. The proposed regularization is designed to estimate a plane automatically at each pixel, without any need for an a priori estimation of the scene planes, and at the same time it encourages similar pixels to be assigned to the same plane. The resulting optimization problem is efficiently solved with the ADAM algorithm. Experiments show that our method leads to a significant improvement in depth refinement, both visually and numerically, with respect to state-of-the-art algorithms on Middlebury, KITTI and ETH3D multi-view stereo datasets.
Submitted 2 September, 2020; v1 submitted 3 December, 2019;
originally announced December 2019.
-
DeepC-MVS: Deep Confidence Prediction for Multi-View Stereo Reconstruction
Authors:
Andreas Kuhn,
Christian Sormann,
Mattia Rossi,
Oliver Erdler,
Friedrich Fraundorfer
Abstract:
Deep Neural Networks (DNNs) have the potential to improve the quality of image-based 3D reconstructions. However, the use of DNNs in the context of 3D reconstruction from large and high-resolution image datasets is still an open challenge, due to memory and computational constraints. We propose a pipeline which takes advantage of DNNs to improve the quality of 3D reconstructions while being able to handle large and high-resolution datasets. In particular, we propose a confidence prediction network explicitly tailored for Multi-View Stereo (MVS) and we use it for both depth map outlier filtering and depth map refinement within our pipeline, in order to improve the quality of the final 3D reconstructions. We train our confidence prediction network on (semi-)dense ground truth depth maps from publicly available real world MVS datasets. With extensive experiments on popular benchmarks, we show that our overall pipeline can produce state-of-the-art 3D reconstructions, both qualitatively and quantitatively.
Submitted 13 August, 2020; v1 submitted 1 December, 2019;
originally announced December 2019.
-
On Designing Better Tools for Learning APIs
Authors:
Adrian Kuhn,
Robert DeLine
Abstract:
Modern software development requires a large investment in learning application programming interfaces (APIs). Recent research found that the learning materials themselves are often inadequate: developers struggle to find answers beyond simple usage scenarios. Solving these problems requires a large investment in tool and search engine development. To understand where further investment would be most useful, we ran a study with 19 professional developers to understand what a solution might look like, free of technical constraints. In this paper, we report on design implications of tools for API learning, grounded in the reality of the professional developers themselves. The reoccurring themes in the participants' feedback were trustworthiness, confidentiality, information overload and the need for code examples as first-class documentation artifacts.
Submitted 5 February, 2014;
originally announced February 2014.
-
On Extracting Unit Tests from Interactive Programming Sessions
Authors:
Adrian Kuhn
Abstract:
Software engineering methodologies propose that developers should capture their efforts in ensuring that programs run correctly in repeatable and automated artifacts, such as unit tests. However, when looking at developer activities on a spectrum from exploratory testing to scripted testing we find that many engineering activities include bursts of exploratory testing. In this paper we propose to leverage these exploratory testing bursts by automatically extracting scripted tests from a recording of these sessions. In order to do so, we wiretap the development environment so we can record all program input, all user-issued function calls, and all program output of an exploratory testing session. We propose to then use machine learning (i.e. clustering) to extract scripted test cases from these recordings in real-time. We outline two early-stage prototypes, one for a static and one for a dynamic language. We also outline how this idea fits into the bigger research direction of programming by example.
Submitted 8 December, 2012;
originally announced December 2012.
-
Lessons Learned from Evaluating MDE Abstractions in an Industry Case Study
Authors:
Adrian Kuhn,
Gail C. Murphy
Abstract:
In a recent empirical study we found that evaluating abstractions of Model-Driven Engineering (MDE) is not as straightforward as it might seem. In this paper, we report on the challenges that we as researchers faced when we conducted the aforementioned field study. In our study we found that modeling happens within a complex ecosystem of different people working in different roles. An empirical evaluation should thus mind the ecosystem, that is, focus on both technical and human factors. In the following, we present and discuss five lessons learnt from our recent work.
Submitted 25 September, 2012;
originally announced September 2012.
-
Consistent Layout for Thematic Software Maps
Authors:
Adrian Kuhn,
Peter Loretan,
Oscar Nierstrasz
Abstract:
Software visualizations can provide a concise overview of a complex software system. Unfortunately, since software has no physical shape, there is no "natural" mapping of software to a two-dimensional space. As a consequence most visualizations tend to use a layout in which position and distance have no meaning, and consequently the layout typically diverges from one visualization to another. We propose a consistent layout for software maps in which the position of a software artifact reflects its vocabulary, and distance corresponds to similarity of vocabulary. We use Latent Semantic Indexing (LSI) to map software artifacts to a vector space, and then use Multidimensional Scaling (MDS) to map this vector space down to two dimensions. The resulting consistent layout allows us to develop a variety of thematic software maps that express very different aspects of software while making it easy to compare them. The approach is especially suitable for comparing views of evolving software, since the vocabulary of software artifacts tends to be stable over time.
Submitted 25 September, 2012;
originally announced September 2012.
-
An Exploratory Study of Forces and Frictions affecting Large-Scale Model-Driven Development
Authors:
Adrian Kuhn,
Gail C. Murphy,
C. Albert Thompson
Abstract:
In this paper, we investigate model-driven engineering, reporting on an exploratory case-study conducted at a large automotive company. The study consisted of interviews with 20 engineers and managers working in different roles. We found that, in the context of a large organization, contextual forces dominate the cognitive issues of using model-driven technology. The four forces we identified that are likely independent of the particular abstractions chosen as the basis of software development are the need for diffing in software product lines, the need for problem-specific languages and types, the need for live modeling in exploratory activities, and the need for point-to-point traceability between artifacts. We also identified triggers of accidental complexity, which we refer to as points of friction introduced by languages and tools. Examples of the friction points identified are insufficient support for model diffing, point-to-point traceability, and model changes at runtime.
Submitted 3 July, 2012;
originally announced July 2012.
-
Embedding Spatial Software Visualization in the IDE: an Exploratory Study
Authors:
Adrian Kuhn,
David Erni,
Oscar Nierstrasz
Abstract:
Software visualization can be of great use for understanding and exploring a software system in an intuitive manner. Spatial representation of software is a promising approach of increasing interest. However, little is known about how developers interact with spatial visualizations that are embedded in the IDE. In this paper, we present a pilot study that explores the use of Software Cartography for program comprehension of an unknown system. We investigated whether developers establish a spatial memory of the system, whether clustering by topic offers a sound base layout, and how developers interact with maps. We report our results in the form of observations, hypotheses, and implications. Key findings are a) that developers made good use of the map to inspect search results and call graphs, and b) that developers found the base layout surprising and often confusing. We conclude with concrete advice for the design of embedded software maps.
Submitted 25 July, 2010;
originally announced July 2010.
-
Empowering Collections with Swarm Behavior
Authors:
Adrian Kuhn,
David Erni,
Marcus Denker
Abstract:
Often, when modelling a system there are properties and operations that are related to a group of objects rather than to a single object. In this paper we extend Java with Swarm Behavior, a new composition operator that associates behavior with a collection of instances. The lookup resolution of swarm behavior is based on the element type of a collection and is thus orthogonal to the collection hierarchy.
Submitted 1 July, 2010;
originally announced July 2010.
-
A Trustability Metric for Code Search based on Developer Karma
Authors:
Florian S. Gysin,
Adrian Kuhn
Abstract:
The promise of search-driven development is that developers will save time and resources by reusing external code in their local projects. To efficiently integrate this code, users must be able to trust it, thus trustability of code search results is just as important as their relevance. In this paper, we introduce a trustability metric to help users assess the quality of code search results and therefore ease the cost-benefit analysis they undertake trying to find suitable integration candidates. The proposed trustability metric incorporates both user votes and cross-project activity of developers to calculate a "karma" value for each developer. Through the karma value of all its developers a project is ranked on a trustability scale. We present JBender, a proof-of-concept code search engine which implements our trustability metric and we discuss preliminary results from an evaluation of the prototype.
Submitted 25 February, 2010;
originally announced February 2010.
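The abstract above describes the trustability metric only in words: votes and cross-project activity feed a per-developer karma, and a project is ranked through the karma of its developers. A minimal sketch of that shape follows. The abstract gives no formula, so the weighted sum, the averaging step, and all names and numbers here are assumptions for illustration, not JBender's actual computation.

```python
from statistics import mean


def developer_karma(votes: int, active_projects: int,
                    activity_weight: float = 1.0) -> float:
    """Hypothetical karma: user votes plus weighted cross-project activity."""
    return votes + activity_weight * active_projects


def project_trust(developers: list[str], karma: dict[str, float]) -> float:
    """Hypothetical project score: mean karma of the project's developers."""
    return mean(karma[name] for name in developers)


# Made-up data: developer -> (votes received, number of projects active in).
developers = {"alice": (10, 3), "bob": (2, 1), "carol": (5, 5)}
karma = {name: developer_karma(v, p) for name, (v, p) in developers.items()}

# Made-up projects and their contributors.
projects = {"parser": ["alice", "bob"], "renderer": ["carol"], "utils": ["bob"]}
ranking = sorted(projects,
                 key=lambda p: project_trust(projects[p], karma),
                 reverse=True)
print(karma, ranking)
```

Averaging is only one possible aggregation; a sum or a maximum would favour large teams or star contributors instead, and the abstract does not say which the authors chose.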
-
Towards Improving the Mental Model of Software Developers through Cartographic Visualization
Authors:
Adrian Kuhn,
David Erni,
Oscar Nierstrasz
Abstract:
Software is intangible and knowledge about software systems is typically tacit. The mental model of software developers is thus an important factor in software engineering. It is our vision that developers should be able to refer to code as being "up in the north", "over in the west", or "down-under in the south". We want to provide developers, and everyone else involved in software development, with a shared, spatial and stable mental model of their software project. We aim to reinforce this by embedding a cartographic visualization in the IDE (Integrated Development Environment). The visualization is always visible in the bottom-left, similar to the GPS navigation device for car drivers. For each development task, related information is displayed on the map. In this paper we present CODEMAP, an Eclipse plug-in, and report on preliminary results from an ongoing user study with professional developers and students.
Submitted 14 January, 2010;
originally announced January 2010.