Search | arXiv e-print repository

Towards Deep Active Learning in Avian Bioacoustics

Authors: Lukas Rauch, Denis Huseljic, Moritz Wirth, Jens Decke, Bernhard Sick, Christoph Scholz

Abstract: Passive acoustic monitoring (PAM) in avian bioacoustics enables cost-effective and extensive data collection with minimal disruption to natural habitats. Despite advancements in computational avian bioacoustics, deep learning models continue to encounter challenges in adapting to diverse environments in practical PAM scenarios. This is primarily due to the scarcity of annotations, which requires labor-intensive efforts from human experts. Active learning (AL) reduces annotation cost and speeds up adaptation to diverse scenarios by querying the most informative instances for labeling. This paper outlines a deep AL approach, introduces key challenges, and conducts a small-scale pilot study.

Submitted 26 June, 2024; originally announced June 2024.

Comments: preprint, under review IAL@ECML-PKDD24
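
The abstract above describes active learning as querying the most informative instances for labeling. As a rough illustration only, the following sketch ranks unlabeled instances by predictive entropy, one common uncertainty-based query strategy. The listing does not say which strategy the paper uses, so the function names, the toy probabilities, and the choice of entropy are all assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def query_most_uncertain(pool_probs, batch_size):
    """Return indices of the `batch_size` pool instances with the highest entropy.

    `pool_probs` is a list of class-probability vectors, one per unlabeled instance.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical model outputs for four unlabeled recordings (three classes).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],
]
selected = query_most_uncertain(pool, batch_size=2)
```

In a full loop, the selected instances would be sent to an annotator, added to the labeled set, and the model retrained before the next query round.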

arXiv:2406.00081 [pdf, other]

From Structured to Unstructured: A Comparative Analysis of Computer Vision and Graph Models in solving Mesh-based PDEs

Authors: Jens Decke, Olaf Wünsch, Bernhard Sick, Christian Gruhl

Abstract: This article investigates the application of computer vision and graph-based models in solving mesh-based partial differential equations within high-performance computing environments. Focusing on structured, graded structured, and unstructured meshes, the study compares the performance and computational efficiency of three computer vision-based models against three graph-based models across three datasets. The research aims to identify the most suitable models for different mesh topographies, particularly highlighting the exploration of graded meshes, a less studied area. Results demonstrate that computer vision-based models, notably U-Net, outperform the graph models in prediction performance and efficiency in two (structured and graded) out of three mesh topographies. The study also reveals the unexpected effectiveness of computer vision-based models in handling unstructured meshes, suggesting a potential shift in methodological approaches for data-driven partial differential equation learning. The article underscores deep learning as a viable and potentially sustainable way to enhance traditional high-performance computing methods, advocating for informed model selection based on the topography of the mesh.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2406.00080 [pdf, other]

An Efficient Multi Quantile Regression Network with Ad Hoc Prevention of Quantile Crossing

Authors: Jens Decke, Arne Jenß, Bernhard Sick, Christian Gruhl

Abstract: This article presents the Sorting Composite Quantile Regression Neural Network (SCQRNN), an advanced quantile regression model designed to prevent quantile crossing and enhance computational efficiency. Integrating ad hoc sorting in training, the SCQRNN ensures non-intersecting quantiles, boosting model reliability and interpretability. We demonstrate that the SCQRNN not only prevents quantile crossing and reduces computational complexity but also achieves faster convergence than traditional models. This advancement meets the requirements of high-performance computing for sustainable, accurate computation. In organic computing, the SCQRNN enhances self-aware systems with predictive uncertainties, enriching applications across finance, meteorology, climate science, and engineering.

Submitted 31 May, 2024; originally announced June 2024.
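
The SCQRNN abstract above mentions sorting the predicted quantiles so that they cannot cross. The sketch below shows that general idea on a single sample: raw outputs are sorted in ascending order, paired with ascending quantile levels, and scored with the pinball (quantile) loss. It is an illustration under assumptions, not the paper's implementation; the function names and toy values are hypothetical, and how the sort interacts with gradient-based training is not covered here.

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for a single prediction at quantile level tau."""
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def sorted_composite_loss(y_true, raw_outputs, taus):
    """Sort raw outputs, then average the pinball loss over all quantile levels.

    `taus` must be in ascending order so that the i-th smallest output is
    treated as the prediction for the i-th quantile level.
    """
    ordered = sorted(raw_outputs)  # enforces q_0.1 <= q_0.5 <= q_0.9
    losses = [pinball_loss(y_true, q, t) for q, t in zip(ordered, taus)]
    return ordered, sum(losses) / len(losses)


taus = [0.1, 0.5, 0.9]
raw = [2.0, 1.0, 3.0]  # hypothetical network outputs that cross (2.0 > 1.0)
ordered, loss = sorted_composite_loss(2.5, raw, taus)
```

After sorting, the three predictions are monotone in the quantile level by construction, which is the non-crossing property the abstract refers to.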

arXiv:2405.03386 [pdf, other]

Annot-Mix: Learning with Noisy Class Labels from Multiple Annotators via a Mixup Extension

Authors: Marek Herde, Lukas Lührs, Denis Huseljic, Bernhard Sick

Abstract: Training with noisy class labels impairs neural networks' generalization performance. In this context, mixup is a popular regularization technique to improve training robustness by making memorizing false class labels more difficult. However, mixup neglects that, typically, multiple annotators, e.g., crowdworkers, provide class labels. Therefore, we propose an extension of mixup, which handles multiple class labels per instance while considering which class label originates from which annotator. Integrated into our multi-annotator classification framework annot-mix, it outperforms eight state-of-the-art approaches on eleven datasets with noisy class labels provided either by human or simulated annotators. Our code is publicly available through our repository at https://github.com/ies-research/annot-mix.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Under review

ACM Class: I.2.6; I.5.1
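
For readers unfamiliar with mixup, which the Annot-Mix abstract above extends, the following minimal sketch shows standard mixup only: two training examples and their one-hot labels are combined with a weight drawn from a Beta distribution. The multi-annotator extension proposed in the paper is not reproduced; the function names, `alpha` value, and toy vectors are assumptions for illustration.

```python
import random


def sample_lambda(alpha, rng):
    """Draw the mixing weight from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Convexly combine two feature vectors and their one-hot label vectors."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


rng = random.Random(0)  # seeded only to make the sketch repeatable
lam_random = sample_lambda(1.0, rng)

# Fixed weight so the arithmetic is easy to follow by hand.
x_mix, y_mix = mixup_pair(
    x_a=[1.0, 0.0], y_a=[1.0, 0.0],
    x_b=[0.0, 2.0], y_b=[0.0, 1.0],
    lam=0.25,
)
```

With `lam=0.25` the mixed label puts weight 0.25 on the first class and 0.75 on the second, so a network trained on such targets is discouraged from fitting any single, possibly wrong, hard label exactly.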

arXiv:2404.11266 [pdf, other]

Criteria for Uncertainty-based Corner Cases Detection in Instance Segmentation

Authors: Florian Heidecker, Ahmad El-Khateeb, Maarten Bieshaar, Bernhard Sick

Abstract: The operating environment of a highly automated vehicle is subject to change, e.g., weather, illumination, or the scenario containing different objects and other participants in which the highly automated vehicle has to navigate its passengers safely. These situations must be considered when developing and validating highly automated driving functions. This already poses a problem for training and evaluating deep learning models because without the costly labeling of thousands of recordings, not knowing whether the data contains relevant, interesting data for further model training, it is a guess under which conditions and situations the model performs poorly. For this purpose, we present corner case criteria based on the predictive uncertainty. With our corner case criteria, we are able to detect uncertainty-based corner cases of an object instance segmentation model without relying on ground truth (GT) data. We evaluated each corner case criterion using the COCO and the NuImages dataset to analyze the potential of our approach. We also provide a corner case decision function that allows us to distinguish each object into True Positive (TP), localization and/or classification corner case, or False Positive (FP). We also present our first results of an iterative training cycle that outperforms the baseline and where the data added to the training dataset is selected based on the corner case decision function.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.10420 [pdf, other]

AudioProtoPNet: An interpretable deep learning model for bird sound classification

Authors: René Heinrich, Bernhard Sick, Christoph Scholz

Abstract: Recently, scientists have proposed several deep learning models to monitor the diversity of bird species. These models can detect bird species with high accuracy by analyzing acoustic signals. However, traditional deep learning algorithms are black-box models that provide no insight into their decision-making process. For domain experts, such as ornithologists, it is crucial that these models are not only efficient, but also interpretable in order to be used as assistive tools. In this study, we present an adaptation of the Prototypical Part Network (ProtoPNet) for audio classification that provides inherent interpretability through its model architecture. Our approach is based on a ConvNeXt backbone architecture for feature extraction and learns prototypical patterns for each bird species using spectrograms of the training data. Classification of new data is done by comparison with these prototypes in latent space, which simultaneously serve as easily understandable explanations for the model's decisions. We evaluated the performance of our model on seven different datasets representing bird species from different geographical regions. In our experiments, the model showed excellent results, achieving an average AUROC of 0.82 and an average cmAP of 0.37 across the seven datasets, making it comparable to state-of-the-art black-box models for bird sound classification. Thus, this work demonstrates that even for the challenging task of bioacoustic bird classification, powerful yet interpretable deep learning models can be developed to provide valuable insights to domain experts.

Submitted 29 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: Work in progress

arXiv:2404.08981 [pdf, other]

Fast Fishing: Approximating BAIT for Efficient and Scalable Deep Active Image Classification

Authors: Denis Huseljic, Paul Hahn, Marek Herde, Lukas Rauch, Bernhard Sick

Abstract: Deep active learning (AL) seeks to minimize the annotation costs for training deep neural networks. BAIT, a recently proposed AL strategy based on the Fisher Information, has demonstrated impressive performance across various datasets. However, BAIT's high computational and memory requirements hinder its applicability on large-scale classification tasks, resulting in current research neglecting BAIT in their evaluation. This paper introduces two methods to enhance BAIT's computational efficiency and scalability. Notably, we significantly reduce its time complexity by approximating the Fisher Information. In particular, we adapt the original formulation by i) taking the expectation over the most probable classes, and ii) constructing a binary classification task, leading to an alternative likelihood for gradient computations. Consequently, this allows the efficient use of BAIT on large-scale datasets, including ImageNet. Our unified and comprehensive evaluation across a variety of datasets demonstrates that our approximations achieve strong performance with considerably reduced time complexity. Furthermore, we provide an extensive open-source toolbox that implements recent state-of-the-art AL strategies, available at https://github.com/dhuseljic/dal-toolbox.

Submitted 13 April, 2024; originally announced April 2024.

arXiv:2403.10380 [pdf, other]

BirdSet: A Dataset and Benchmark for Classification in Avian Bioacoustics

Authors: Lukas Rauch, Raphael Schwinger, Moritz Wirth, René Heinrich, Denis Huseljic, Jonas Lange, Stefan Kahl, Bernhard Sick, Sven Tomforde, Christoph Scholz

Abstract: Deep learning (DL) models have emerged as a powerful tool in avian bioacoustics to assess environmental health. To maximize the potential of cost-effective and minimally invasive passive acoustic monitoring (PAM), DL models must analyze bird vocalizations across a wide range of species and environmental conditions. However, data fragmentation challenges a comprehensive evaluation of generalization performance. Therefore, we introduce the BirdSet dataset, comprising approximately 520,000 global bird recordings for training and over 400 hours of PAM recordings for testing. Our benchmark offers baselines for several DL models to enhance comparability and consolidate research across studies, along with code implementations that include comprehensive training and evaluation protocols.

Submitted 17 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: Under review @NeurIPS2024 Datasets & Benchmarks

arXiv:2401.15964 [pdf, other]

Spatio-Temporal Attention Graph Neural Network for Remaining Useful Life Prediction

Authors: Zhixin Huang, Yujiang He, Bernhard Sick

Abstract: Remaining useful life prediction plays a crucial role in the health management of industrial systems. Given the increasing complexity of systems, data-driven predictive models have attracted significant research interest. Upon reviewing the existing literature, it appears that many studies either do not fully integrate both spatial and temporal features or employ only a single attention mechanism. Furthermore, there seems to be inconsistency in the choice of data normalization methods, particularly concerning operating conditions, which might influence predictive performance. To bridge these observations, this study presents the Spatio-Temporal Attention Graph Neural Network. Our model combines graph neural networks and temporal convolutional neural networks for spatial and temporal feature extraction, respectively. The cascade of these extractors, combined with multi-head attention mechanisms for both spatio-temporal dimensions, aims to improve predictive precision and refine model explainability. Comprehensive experiments were conducted on the C-MAPSS dataset to evaluate the impact of unified versus clustering normalization. The findings suggest that our model achieves state-of-the-art results using only the unified normalization. Additionally, when dealing with datasets with multiple operating conditions, cluster normalization enhances the performance of our proposed model by up to 27%.

Submitted 29 January, 2024; originally announced January 2024.

Comments: This article has been accepted in the International Conference Computational Science & Computational Intelligence (CSCI'23)

arXiv:2401.12950 [pdf, other]

Bayesian Semi-structured Subspace Inference

Authors: Daniel Dold, David Rügamer, Beate Sick, Oliver Dürr

Abstract: Semi-structured regression models enable the joint modeling of interpretable structured and complex unstructured feature effects. The structured model part is inspired by statistical models and can be used to infer the input-output relationship for features of particular importance. The complex unstructured part defines an arbitrary deep neural network and thereby provides enough flexibility to achieve competitive prediction performance. While these models can also account for aleatoric uncertainty, there is still a lack of work on accounting for epistemic uncertainty. In this paper, we address this problem by presenting a Bayesian approximation for semi-structured regression models using subspace inference. To this end, we extend subspace inference for joint posterior sampling from a full parameter space for structured effects and a subspace for unstructured effects. Apart from this hybrid sampling scheme, our method allows for tunable complexity of the subspace and can capture multiple minima in the loss landscape. Numerical experiments validate our approach's efficacy in recovering structured effect parameter posteriors in semi-structured models and approaching the full-space posterior distribution of MCMC for increasing subspace dimension. Further, our approach exhibits competitive predictive performance across simulated and real-world datasets.

Submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted at AISTATS 2024

arXiv:2309.13179 [pdf, other]

Enhancing Multi-Objective Optimization through Machine Learning-Supported Multiphysics Simulation

Authors: Diego Botache, Jens Decke, Winfried Ripken, Abhinay Dornipati, Franz Götz-Hahn, Mohamed Ayeb, Bernhard Sick

Abstract: This paper presents a methodological framework for training, self-optimising, and self-organising surrogate models to approximate and speed up multiobjective optimisation of technical systems based on multiphysics simulations. Using two real-world datasets, we illustrate that surrogate models can be trained on relatively small amounts of data to approximate the underlying simulations accurately. Including explainable AI techniques allows for highlighting feature relevancy or dependencies and supporting the possible extension of the used datasets. One of the datasets was created for this paper and is made publicly available for the broader scientific community. Extensive experiments combine four machine learning and deep learning algorithms with an evolutionary optimisation algorithm. The performance of the combined training and optimisation pipeline is evaluated by verifying the generated Pareto-optimal results using the ground truth simulations. The results from our pipeline and a comprehensive evaluation strategy show the potential for efficiently acquiring solution candidates in multiobjective optimisation tasks by reducing the number of simulations and conserving a higher prediction accuracy, i.e., with a MAPE score under 5% for one of the presented use cases.

Submitted 3 April, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.06159 [pdf, other]

Active Label Refinement for Semantic Segmentation of Satellite Images

Authors: Tuan Pham Minh, Jayan Wijesingha, Daniel Kottke, Marek Herde, Denis Huseljic, Bernhard Sick, Michael Wachendorf, Thomas Esch

Abstract: Remote sensing through semantic segmentation of satellite images contributes to the understanding and utilisation of the earth's surface. For this purpose, semantic segmentation networks are typically trained on large sets of labelled satellite images. However, obtaining expert labels for these images is costly. Therefore, we propose to rely on a low-cost approach, e.g. crowdsourcing or pretrained networks, to label the images in the first step. Since these initial labels are partially erroneous, we use active learning strategies to cost-efficiently refine the labels in the second step. We evaluate the active learning strategies using satellite images of Bengaluru in India, labelled with land cover and land use labels. Our experimental results suggest that an active label refinement to improve the semantic segmentation network's performance is beneficial.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2308.12785 [pdf, ps, other]

Single-shot Bayesian approximation for neural networks

Authors: Kai Brach, Beate Sick, Oliver Dürr

Abstract: Deep neural networks (NNs) are known for their high prediction performance. However, NNs are prone to yield unreliable predictions when encountering completely new situations without indicating their uncertainty. Bayesian variants of NNs (BNNs), such as Monte Carlo (MC) dropout BNNs, do provide uncertainty measures and simultaneously increase the prediction performance. The only disadvantage of BNNs is their higher computation time at test time because they rely on a sampling approach. Here we present a single-shot MC dropout approximation that preserves the advantages of BNNs while being as fast as NNs. Our approach is based on moment propagation (MP) and allows analytical approximation of the expected value and the variance of the MC dropout signal for commonly used layers in NNs, i.e. convolution, max pooling, dense, softmax, and dropout layers. The MP approach can convert an NN into a BNN without re-training given the NN has been trained with standard dropout. We evaluate our approach on different benchmark datasets and a simulated toy example in a classification and regression setting. We demonstrate that our single-shot MC dropout approximation resembles the point estimate and the uncertainty estimate of the predictive distribution that is achieved with an MC approach, while being fast enough for real-time deployments of BNNs. We show that using part of the saved time to combine our MP approach with deep ensemble techniques does further improve the uncertainty measures.

Submitted 24 August, 2023; originally announced August 2023.

Comments: arXiv admin note: text overlap with arXiv:2007.03293
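
The moment propagation idea in the abstract above, replacing MC dropout sampling with analytic mean and variance updates, can be illustrated for a single dropout layer. The sketch below assumes inverted dropout (each unit is kept with probability `keep_prob` and rescaled by `1 / keep_prob`) and an input that is independent of the dropout mask; the paper's own parameterization and its treatment of the other layer types may differ, and the function name is hypothetical.

```python
def dropout_moments(mean_in, var_in, keep_prob):
    """Propagate mean and variance of one unit through an inverted-dropout layer.

    With z = x * b / keep_prob and b ~ Bernoulli(keep_prob) independent of x:
      E[z]   = E[x]
      E[z^2] = E[x^2] / keep_prob
      Var[z] = (Var[x] + E[x]^2) / keep_prob - E[x]^2
    """
    second_moment_in = var_in + mean_in ** 2
    mean_out = mean_in
    var_out = second_moment_in / keep_prob - mean_in ** 2
    return mean_out, var_out


# A deterministic input of 2.0 with keep probability 0.5 becomes either 0.0 or 4.0,
# each with probability 0.5, so the output has mean 2.0 and variance 4.0.
mean_out, var_out = dropout_moments(mean_in=2.0, var_in=0.0, keep_prob=0.5)
```

A single forward pass that carries such `(mean, variance)` pairs through every layer takes the place of averaging many stochastic forward passes.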

arXiv:2308.07121 [pdf, other]

Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers

Authors: Lukas Rauch, Raphael Schwinger, Moritz Wirth, Bernhard Sick, Sven Tomforde, Christoph Scholz

Abstract: We propose a shift towards end-to-end learning in bird sound monitoring by combining self-supervised (SSL) and deep active learning (DAL). Leveraging transformer models, we aim to bypass traditional spectrogram conversions, enabling direct raw audio processing. ActiveBird2Vec is set to generate high-quality bird sound representations through SSL, potentially accelerating the assessment of environmental changes and decision-making processes for wind farms. Additionally, we seek to utilize the wide variety of bird vocalizations through DAL, reducing the reliance on extensively labeled datasets by human experts. We plan to curate a comprehensive set of tasks through Huggingface Datasets, enhancing future comparability and reproducibility of bioacoustic research. A comparative analysis between various transformer models will be conducted to evaluate their proficiency in bird sound recognition tasks. We aim to accelerate the progression of avian bioacoustic research and contribute to more effective conservation strategies.

Submitted 21 November, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: Accepted @AI4S ECAI2023. This is the author's version of the work

arXiv:2308.00971 [pdf, other]

Height Change Feature Based Free Space Detection

Authors: Steven Schreck, Hannes Reichert, Manuel Hetzel, Konrad Doll, Bernhard Sick

Abstract: In the context of autonomous forklifts, ensuring non-collision during travel, pick, and place operations is crucial. To accomplish this, the forklift must be able to detect and locate areas of free space and potential obstacles in its environment. However, this is particularly challenging in highly dynamic environments, such as factory sites and production halls, due to numerous industrial trucks and workers moving throughout the area. In this paper, we present a novel method for free space detection, which consists of the following steps. We introduce a novel technique for surface normal estimation relying on spherical projected LiDAR data. Subsequently, we employ the estimated surface normals to detect free space. The presented method is a heuristic approach that does not require labeling and can ensure real-time application due to high processing speed. The effectiveness of the proposed method is demonstrated through its application to a real-world dataset obtained on a factory site both indoors and outdoors, and its evaluation on the Semantic KITTI dataset [2]. We achieved a mean Intersection over Union (mIoU) score of 50.90 % on the benchmark dataset, with a processing speed of 105 Hz. In addition, we evaluated our approach on our factory site dataset. Our method achieved an mIoU score of 63.30 % at 54 Hz.

Submitted 2 August, 2023; originally announced August 2023.
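
The abstract above reports results as mean Intersection over Union (mIoU). As a reminder of how that metric is defined, here is a small sketch that computes it from flat lists of predicted and ground-truth class labels. The two-class toy data is hypothetical (for example 0 = obstacle, 1 = free space), and benchmark protocols may differ in details such as ignored labels or accumulation over a whole dataset.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical per-point labels for six LiDAR points.
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
```

Here each class has two correctly labeled points out of four points in its union, so both per-class IoU values and their mean equal 0.5.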

arXiv:2307.14294 [pdf, other]

Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis

Authors: Diego Botache, Kristina Dingel, Rico Huhnstock, Arno Ehresmann, Bernhard Sick

Abstract: Splitting of sequential data, such as videos and time series, is an essential step in various data analysis tasks, including object tracking and anomaly detection. However, splitting sequential data presents a variety of challenges that can impact the accuracy and reliability of subsequent analyses. This concept article examines the challenges associated with splitting sequential data, including data acquisition, data representation, split ratio selection, setting up quality criteria, and choosing suitable selection strategies. We explore these challenges through two real-world examples: motor test benches and particle tracking in liquids.

Submitted 26 July, 2023; originally announced July 2023.

arXiv:2307.06177 [pdf, other]

doi 10.1109/ISC253183.2021.9562809

Smart Infrastructure: A Research Junction

Authors: Manuel Hetzel, Hannes Reichert, Konrad Doll, Bernhard Sick

Abstract: Complex inner-city junctions are among the most critical traffic areas for injury and fatal accidents. The development of highly automated driving (HAD) systems struggles with the complex and hectic everyday life within those areas. Sensor-equipped smart infrastructures, which can communicate and cooperate with vehicles, are essential to enable a holistic scene understanding and to resolve occlusions that drivers and vehicle perception systems cannot cover by themselves. We introduce an intelligent research infrastructure equipped with visual sensor technology, located at a public inner-city junction in Aschaffenburg, Germany. A multiple-view camera system monitors the traffic situation to perceive road users' behavior. Both motorized and non-motorized traffic is considered. The system is used for research in data generation and for evaluating new HAD sensor systems, algorithms, and Artificial Intelligence (AI) training strategies using real, synthetic, and augmented data. In addition, the junction features a highly accurate digital twin. Real-world data can be taken into the digital twin for simulation purposes and synthetic data generation.

Submitted 12 July, 2023; originally announced July 2023.

Comments: IEEE International Smart Cities Conference (ISC2) 2021

arXiv:2307.06165 [pdf, other]

The IMPTC Dataset: An Infrastructural Multi-Person Trajectory and Context Dataset

Authors: Manuel Hetzel, Hannes Reichert, Günther Reitberger, Erich Fuchs, Konrad Doll, Bernhard Sick

Abstract: Inner-city intersections are among the most critical traffic areas for injury and fatal accidents. Automated vehicles struggle with the complex and hectic everyday life within those areas. Sensor-equipped smart infrastructures, which can cooperate with vehicles, can benefit automated traffic by extending the perception capabilities of drivers and vehicle perception systems. Additionally, they offer the opportunity to gather reproducible and precise data of a holistic scene understanding, including context information as a basis for training algorithms for various applications in automated traffic. Therefore, we introduce the Infrastructural Multi-Person Trajectory and Context Dataset (IMPTC). We use an intelligent public inner-city intersection in Germany with visual sensor technology. A multi-view camera and LiDAR system perceives traffic situations and road users' behavior. Additional sensors monitor contextual information like weather, lighting, and traffic light signal status. The data acquisition system focuses on Vulnerable Road Users (VRUs) and multi-agent interaction. The resulting dataset consists of eight hours of measurement data. It contains over 2,500 VRU trajectories, including pedestrians, cyclists, e-scooter riders, strollers, and wheelchair users, and over 20,000 vehicle trajectories at different day times, weather conditions, and seasons. In addition, to enable the entire stack of research capabilities, the dataset includes all data, from the sensor, calibration, and detection data through to the trajectory and context data. The dataset is continuously expanded and is available online for non-commercial research at https://github.com/kav-institute/imptc-dataset.

Submitted 12 July, 2023; originally announced July 2023.

Comments: IEEE Intelligent Vehicles Conference (IV) 2023

arXiv:2307.04536 [pdf, other]

doi 10.1109/ICMLA58977.2023.00244

DADO -- Low-Cost Query Strategies for Deep Active Design Optimization

Authors: Jens Decke, Christian Gruhl, Lukas Rauch, Bernhard Sick

Abstract: In this experience report, we apply deep active learning to the field of design optimization to reduce the number of computationally expensive numerical simulations. We are interested in optimizing the design of structural components, where the shape is described by a set of parameters. If we can predict the performance based on these parameters and consider only the promising candidates for simulation, there is an enormous potential for saving computing power. We present two selection strategies for self-optimization to reduce the computational cost in multi-objective design optimization problems. Our proposed methodology provides an intuitive approach that is easy to apply, offers significant improvements over random sampling, and circumvents the need for uncertainty estimation. We evaluate our strategies on a large dataset from the domain of fluid dynamics and introduce two new evaluation metrics to determine the model's performance. Findings from our evaluation highlight the effectiveness of our selection strategies in accelerating design optimization. We believe that the introduced method is easily transferable to other self-optimization problems.

Submitted 2 October, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

Journal ref: 2023 International Conference on Machine Learning and Applications (ICMLA) p.1611-1618

arXiv:2306.10087 [pdf, other]

ActiveGLAE: A Benchmark for Deep Active Learning with Transformers

Authors: Lukas Rauch, Matthias Aßenmacher, Denis Huseljic, Moritz Wirth, Bernd Bischl, Bernhard Sick

Abstract: Deep active learning (DAL) seeks to reduce annotation costs by enabling the model to actively query instance annotations from which it expects to learn the most. Despite extensive research, there is currently no standardized evaluation protocol for transformer-based language models in the field of DAL. Diverse experimental settings lead to difficulties in comparing research and deriving recommendations for practitioners. To tackle this challenge, we propose the ActiveGLAE benchmark, a comprehensive collection of data sets and evaluation guidelines for assessing DAL. Our benchmark aims to facilitate and streamline the evaluation process of novel DAL strategies. Additionally, we provide an extensive overview of current practice in DAL with transformer-based language models. We identify three key challenges - data set selection, model training, and DAL settings - that pose difficulties in comparing query strategies. We establish baseline results through an extensive set of experiments as a reference point for evaluating future work. Based on our findings, we provide guidelines for researchers and practitioners.

Submitted 16 June, 2023; originally announced June 2023.

Comments: Accepted @ ECML PKDD 2023. This is the author's version of the work. The definitive Version of Record will be published in the Proceedings of ECML PKDD 2023

arXiv:2305.14977 [pdf, other]

Sampling-based Uncertainty Estimation for an Instance Segmentation Network

Authors: Florian Heidecker, Ahmad El-Khateeb, Bernhard Sick

Abstract: The examination of uncertainty in the predictions of machine learning (ML) models is receiving increasing attention. One uncertainty modeling technique used for this purpose is Monte-Carlo (MC)-Dropout, where repeated predictions are generated for a single input. Therefore, clustering is required to describe the resulting uncertainty, but only through efficient clustering is it possible to describe the uncertainty from the model attached to each object. This article uses Bayesian Gaussian Mixture (BGM) to solve this problem. In addition, we investigate different values for the dropout rate and other techniques, such as focal loss and calibration, which we integrate into the Mask-RCNN model to obtain the most accurate uncertainty approximation of each instance and showcase it graphically.

Submitted 24 May, 2023; originally announced May 2023.

arXiv:2305.05216 [pdf, other]

doi 10.1016/j.dib.2023.109477

Dataset of a parameterized U-bend flow for Deep Learning Applications

Authors: Jens Decke, Olaf Wünsch, Bernhard Sick

Abstract: This dataset contains 10,000 fluid flow and heat transfer simulations in U-bend shapes. Each of them is described by 28 design parameters, which are processed with the help of Computational Fluid Dynamics methods. The dataset provides a comprehensive benchmark for investigating various problems and methods from the field of design optimization. For these investigations, supervised, semi-supervised, and unsupervised deep learning approaches can be employed. One unique feature of this dataset is that each shape can be represented by three distinct data types: design parameter and objective combinations; five different resolutions of 2D images from the geometry and the solution variables of the numerical simulation; and a representation using the cell values of the numerical mesh. This third representation enables considering the specific data structure of numerical simulations for deep learning approaches. The source code and the container used to generate the data are published as part of this work.

Submitted 9 May, 2023; originally announced May 2023.

Comments: submitted to Data in Brief

arXiv:2305.00221 [pdf, other]

Sensor Equivariance by LiDAR Projection Images

Authors: Hannes Reichert, Manuel Hetzel, Steven Schreck, Konrad Doll, Bernhard Sick

Abstract: In this work, we propose an extension of conventional image data by an additional channel in which the associated projection properties are encoded. This addresses the issue of sensor-dependent object representation in projection-based sensors, such as LiDAR, which can lead to distorted physical and geometric properties due to variations in sensor resolution and field of view. To that end, we propose an architecture for processing this data in an instance segmentation framework. We focus specifically on LiDAR as a key sensor modality for machine vision tasks and highly automated driving (HAD). Through an experimental setup in a controlled synthetic environment, we identify a bias on sensor resolution and field of view and demonstrate that our proposed method can reduce said bias for the task of LiDAR instance segmentation. Furthermore, we define our method such that it can be applied to other projection-based sensors, such as cameras. To promote transparency, we make our code and dataset publicly available. This method shows the potential to improve performance and robustness in various machine vision tasks that utilize projection-based sensors.

Submitted 29 April, 2023; originally announced May 2023.

arXiv:2304.02539 [pdf, other]

Multi-annotator Deep Learning: A Probabilistic Framework for Classification

Authors: Marek Herde, Denis Huseljic, Bernhard Sick

Abstract: Solving complex classification tasks using deep neural networks typically requires large amounts of annotated data. However, corresponding class labels are noisy when provided by error-prone annotators, e.g., crowdworkers. Training standard deep neural networks leads to subpar performances in such multi-annotator supervised learning settings. We address this issue by presenting a probabilistic training framework named multi-annotator deep learning (MaDL). A downstream ground truth and an annotator performance model are jointly trained in an end-to-end learning approach. The ground truth model learns to predict instances' true class labels, while the annotator performance model infers probabilistic estimates of annotators' performances. A modular network architecture enables us to make varying assumptions regarding annotators' performances, e.g., an optional class or instance dependency. Further, we learn annotator embeddings to estimate annotators' densities within a latent space as proxies of their potentially correlated annotations. Together with a weighted loss function, we improve the learning from correlated annotation patterns. In a comprehensive evaluation, we examine three research questions about multi-annotator supervised learning. Our findings show MaDL's state-of-the-art performance and robustness against many correlated, spamming annotators.

Submitted 23 October, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

Comments: Transactions on Machine Learning Research, see https://openreview.net/forum?id=MgdoxzImlK

ACM Class: I.2.6; I.5.1

Journal ref: Transactions on Machine Learning Research, 2023

arXiv:2302.14522 [pdf, other]

AdaptiveShape: Solving Shape Variability for 3D Object Detection with Geometry Aware Anchor Distributions

Authors: Benjamin Sick, Michael Walter, Jochen Abhau

Abstract: 3D object detection with point clouds and images plays an important role in perception tasks such as autonomous driving. Current methods show great performance on detection and pose estimation of standard-shaped vehicles but lag behind on more complex shapes such as semi-trailer truck combinations. Determining the shape and motion of those special vehicles accurately is crucial in yard operation and maneuvering and in industrial automation applications. This work introduces several new methods to improve and measure the performance for such classes. State-of-the-art methods are based on predefined anchor grids or heatmaps for ground truth targets. However, the underlying representations do not take the shape of differently sized objects into account. Our main contribution, AdaptiveShape, uses shape-aware anchor distributions and heatmaps to improve the detection capabilities. For large vehicles we achieve +10.9% AP in comparison to current shape-agnostic methods. Furthermore, we introduce a new fast LiDAR-camera fusion. It is based on 2D bounding box camera detections, which are available in many processing pipelines. This fusion method does not rely on perfectly calibrated or temporally synchronized systems and is therefore applicable to a broad range of robotic applications. We extend a standard point pillar network to account for temporal data and improve learning of complex object movements. In addition, we extend a ground truth augmentation to use grouped object pairs to further improve truck AP by +2.2% compared to conventional augmentation.

Submitted 28 February, 2023; originally announced February 2023.

arXiv:2210.08885 [pdf, other]

Space, Time, and Interaction: A Taxonomy of Corner Cases in Trajectory Datasets for Automated Driving

Authors: Kevin Rösch, Florian Heidecker, Julian Truetsch, Kamil Kowol, Clemens Schicktanz, Maarten Bieshaar, Bernhard Sick, Christoph Stiller

Abstract: Trajectory data analysis is an essential component for highly automated driving. Complex models developed with these data predict other road users' movement and behavior patterns. Based on these predictions - and additional contextual information such as the course of the road, (traffic) rules, and interaction with other road users - the highly automated vehicle (HAV) must be able to reliably and safely perform the task assigned to it, e.g., moving from point A to B. Ideally, the HAV moves safely through its environment, just as we would expect a human driver to do. However, if unusual trajectories occur, so-called trajectory corner cases, a human driver can usually cope well, but an HAV can quickly get into trouble. In the definition of trajectory corner cases, which we provide in this work, we will consider the relevance of unusual trajectories with respect to the task at hand. Based on this, we will also present a taxonomy of different trajectory corner cases. The categorization of corner cases into the taxonomy will be shown with examples and is done by cause and required data sources. To illustrate the complexity between the machine learning (ML) model and the corner case cause, we present a general processing chain underlying the taxonomy.

Submitted 17 October, 2022; originally announced October 2022.

arXiv:2210.06112 [pdf, other]

Fast Bayesian Updates for Deep Learning with a Use Case in Active Learning

Authors: Marek Herde, Zhixin Huang, Denis Huseljic, Daniel Kottke, Stephan Vogt, Bernhard Sick

Abstract: Retraining deep neural networks when new data arrives is typically computationally expensive. Moreover, certain applications do not allow such costly retraining due to time or computational constraints. Fast Bayesian updates are a possible solution to this issue. Therefore, we propose a Bayesian update based on Monte-Carlo samples and a last-layer Laplace approximation for different Bayesian neural network types, i.e., Dropout, Ensemble, and Spectral Normalized Neural Gaussian Process (SNGP). In a large-scale evaluation study, we show that our updates combined with SNGP represent a fast and competitive alternative to costly retraining. As a use case, we combine the Bayesian updates for SNGP with different sequential query strategies to exemplarily demonstrate their improved selection performance in active learning.

Submitted 12 October, 2022; originally announced October 2022.

Comments: 25 pages, 10 figures, submitted to ICLR

arXiv:2210.02935 [pdf, other]

A Review of Uncertainty Calibration in Pretrained Object Detectors

Authors: Denis Huseljic, Marek Herde, Mehmet Muejde, Bernhard Sick

Abstract: In the field of deep learning based computer vision, the development of deep object detection has led to unique paradigms (e.g., two-stage or set-based) and architectures (e.g., Faster-RCNN or DETR) which enable outstanding performance on challenging benchmark datasets. Despite this, the trained object detectors typically do not reliably assess uncertainty regarding their own knowledge, and the quality of their probabilistic predictions is usually poor. As these are often used to make subsequent decisions, such inaccurate probabilistic predictions must be avoided. In this work, we investigate the uncertainty calibration properties of different pretrained object detection architectures in a multi-class setting. We propose a framework to ensure a fair, unbiased, and repeatable evaluation and conduct detailed analyses assessing the calibration under distributional changes (e.g., distributional shift and application to out-of-distribution data). Furthermore, by investigating the influence of different detector paradigms, post-processing steps, and suitable choices of metrics, we deliver novel insights into why poor detector calibration emerges. Based on these insights, we are able to improve the calibration of a detector by simply finetuning its last layer.

Submitted 6 October, 2022; originally announced October 2022.

Comments: 17 pages, 6 figures, submitted to IJCV

ACM Class: I.4.0; I.5.0

arXiv:2205.12729 [pdf, other]

Deep interpretable ensembles

Authors: Lucas Kook, Andrea Götschi, Philipp FM Baumann, Torsten Hothorn, Beate Sick

Abstract: Ensembles improve prediction performance and allow uncertainty quantification by aggregating predictions from multiple models. In deep ensembling, the individual models are usually black box neural networks, or recently, partially interpretable semi-structured deep transformation models. However, interpretability of the ensemble members is generally lost upon aggregation. This is a crucial drawback of deep ensembles in high-stake decision fields, in which interpretable models are desired. We propose a novel transformation ensemble which aggregates probabilistic predictions with the guarantee to preserve interpretability and yield uniformly better predictions than the ensemble members on average. Transformation ensembles are tailored towards interpretable deep transformation models but are applicable to a wider range of probabilistic neural networks. In experiments on several publicly available data sets, we demonstrate that transformation ensembles perform on par with classical deep ensembles in terms of prediction performance, discrimination, and calibration. In addition, we demonstrate how transformation ensembles quantify both aleatoric and epistemic uncertainty, and produce minimax optimal predictions under certain conditions.

Submitted 25 May, 2022; originally announced May 2022.

Comments: 22 pages main text, 8 figures

arXiv:2204.13939 [pdf, other]

doi 10.1109/TSG.2023.3254890

Short-Term Density Forecasting of Low-Voltage Load using Bernstein-Polynomial Normalizing Flows

Authors: Marcel Arpogaus, Marcus Voss, Beate Sick, Mark Nigge-Uricher, Oliver Dürr

Abstract: The transition to a fully renewable energy grid requires better forecasting of demand at the low-voltage level to increase efficiency and ensure reliable control. However, high fluctuations and increasing electrification cause huge forecast variability, not reflected in traditional point estimates. Probabilistic load forecasts take future uncertainties into account and thus allow more informed decision-making for the planning and operation of low-carbon energy systems. We propose an approach for flexible conditional density forecasting of short-term load based on Bernstein polynomial normalizing flows, where a neural network controls the parameters of the flow. In an empirical study with 363 smart meter customers, our density predictions compare favorably against Gaussian and Gaussian mixture densities. Also, they outperform a non-parametric approach based on the pinball loss for 24h-ahead load forecasting for two different neural network architectures.

Submitted 15 June, 2023; v1 submitted 29 April, 2022; originally announced April 2022.

arXiv:2204.13908 [pdf]

doi 10.1007/978-3-030-86514-6_8

Task Embedding Temporal Convolution Networks for Transfer Learning Problems in Renewable Power Time-Series Forecast

Authors: Jens Schreiber, Stephan Vogt, Bernhard Sick

Abstract: Task embeddings in multi-layer perceptrons for multi-task learning and inductive transfer learning in renewable power forecasts have recently been introduced. In many cases, this approach improves the forecast error and reduces the required training data. However, it does not take the seasonal influences in power forecasts within a day into account, i.e., the diurnal cycle. Therefore, we extended this idea to temporal convolutional networks to consider those seasonalities. We propose transforming the embedding space, which contains the latent similarities between tasks, through convolution and providing these results to the network's residual block. The proposed architecture improves multi-task learning for power forecasts by up to 25 percent on the EuropeWindFarm and GermanSolarFarm datasets compared to the multi-layer perceptron approach. Based on the same data, we achieve a ten percent improvement for the wind datasets and more than 20 percent in most cases for the solar dataset for inductive transfer learning without catastrophic forgetting. Finally, we are the first to propose zero-shot learning for renewable power forecasts to provide predictions even if no training data is available.

Submitted 29 April, 2022; originally announced April 2022.

Comments: Accepted at European Conference on Machine Learning 2021

arXiv:2204.13293 [pdf, other]

Model Selection, Adaptation, and Combination for Transfer Learning in Wind and Photovoltaic Power Forecasts

Authors: Jens Schreiber, Bernhard Sick

Abstract: There is recent interest in using model hubs, a collection of pre-trained models, in computer vision tasks. To utilize the model hub, we first select a source model and then adapt the model for the target to compensate for differences. While there is as yet limited research on model selection and adaptation for computer vision tasks, this holds even more for the field of renewable power. At the same time, it is a crucial challenge to provide forecasts for the increasing demand for power forecasts based on weather features from a numerical weather prediction. We close these gaps by conducting the first thorough experiment for model selection and adaptation for transfer learning in renewable power forecast, adopting recent results from the field of computer vision on 667 wind and photovoltaic parks. To the best of our knowledge, this makes it the most extensive study for transfer learning in renewable power forecasts reducing the computational effort and improving the forecast error. Therefore, we adopt source models based on target data from different seasons and limit the amount of training data. As an extension of the current state of the art, we utilize a Bayesian linear regression for forecasting the response based on features extracted from a neural network. This approach outperforms the baseline with only seven days of training data. We further show how combining multiple models through ensembles can significantly improve the model selection and adaptation approach.

Submitted 18 July, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

arXiv:2204.00411 [pdf, other]

Synthetic Photovoltaic and Wind Power Forecasting Data

Authors: Stephan Vogt, Jens Schreiber, Bernhard Sick

Abstract: Photovoltaic and wind power forecasts in power systems with a high share of renewable energy are essential in several applications. These include stable grid operation, profitable power trading, and forward-looking system planning. However, there is a lack of publicly available datasets for research on machine learning based prediction methods. This paper provides an openly accessible time series dataset with realistic synthetic power data. Other publicly and non-publicly available datasets often lack precise geographic coordinates, timestamps, or static power plant information, e.g., to protect business secrets. In contrast, this dataset provides them. The dataset comprises 120 photovoltaic and 273 wind power plants with distinct sites all over Germany from 500 days in hourly resolution. This large number of available sites allows forecasting experiments to include spatial correlations and run experiments in transfer and multi-task learning. It includes site-specific, power source-dependent, non-synthetic input features from the ICON-EU weather model. A simulation of virtual power plants with physical models and actual meteorological measurements provides realistic synthetic power measurement time series. These time series correspond to the power output of virtual power plants at the location of the respective weather measurements. Since the synthetic time series are based exclusively on weather measurements, possible errors in the weather forecast are comparable to those in actual power data.
In addition to the data description, we evaluate the quality of weather-prediction-based power forecasts by comparing simplified physical models and a machine learning model. This experiment shows that forecast errors on the synthetic power data are comparable to real-world historical power measurements.

Submitted 1 April, 2022; originally announced April 2022.

Comments: 12 pages, 8 figures, and 2 tables

arXiv:2202.06781 [pdf, other]

Design of Explainability Module with Experts in the Loop for Visualization and Dynamic Adjustment of Continual Learning

Authors: Yujiang He, Zhixin Huang, Bernhard Sick

Abstract: Continual learning can enable neural networks to evolve by learning new tasks sequentially in task-changing scenarios. However, two general and related challenges should be overcome in further research before we apply this technique to real-world applications. Firstly, newly collected novelties from the data stream in applications could contain anomalies that are meaningless for continual learning. Instead of viewing them as a new task for updating, we have to filter out such anomalies to reduce the disturbance of extremely high-entropy data for the progression of convergence. Secondly, fewer efforts have been put into research regarding the explainability of continual learning, which leads to a lack of transparency and credibility of the updated neural networks. Elaborated explanations about the process and result of continual learning can help experts in judgment and making decisions. Therefore, we propose the conceptual design of an explainability module with experts in the loop based on techniques, such as dimension reduction, visualization, and evaluation strategies. This work aims to overcome the mentioned challenges by sufficiently explaining and visualizing the identified anomalies and the updated neural network. With the help of this module, experts can be more confident in decision-making regarding anomaly filtering, dynamic adjustment of hyperparameters, data backup, etc.

Submitted 14 February, 2022; originally announced February 2022.

Comments: Accepted at the AAAI-22 Workshop on Interactive Machine Learning (IML@AAAI'22)

arXiv:2202.05650 [pdf, other]

Bernstein Flows for Flexible Posteriors in Variational Bayes

Authors: Oliver Dürr, Stephan Hörling, Daniel Dold, Ivonne Kovylov, Beate Sick

Abstract: Variational inference (VI) is a technique to approximate difficult-to-compute posteriors by optimization. In contrast to MCMC, VI scales to many observations. In the case of complex posteriors, however, state-of-the-art VI approaches often yield unsatisfactory posterior approximations. This paper presents Bernstein flow variational inference (BF-VI), a robust and easy-to-use method, flexible enough to approximate complex multivariate posteriors. BF-VI combines ideas from normalizing flows and Bernstein polynomial-based transformation models. In benchmark experiments, we compare BF-VI solutions with exact posteriors, MCMC solutions, and state-of-the-art VI methods including normalizing-flow-based VI. We show for low-dimensional models that BF-VI accurately approximates the true posterior; in higher-dimensional models, BF-VI outperforms other VI methods. Further, we develop with BF-VI a Bayesian model for the semi-structured Melanoma challenge data, combining a CNN model part for image data with an interpretable model part for tabular data, and demonstrate for the first time the use of VI in semi-structured models.

Submitted 23 February, 2024; v1 submitted 11 February, 2022; originally announced February 2022.

arXiv:2109.11301 [pdf, other]

doi 10.1109/ACCESS.2021.3135514

A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification

Authors: Marek Herde, Denis Huseljic, Bernhard Sick, Adrian Calma

Abstract: Pool-based active learning (AL) aims to optimize the annotation process (i.e., labeling) as the acquisition of annotations is often time-consuming and therefore expensive. For this purpose, an AL strategy queries annotations intelligently from annotators to train a high-performance classification model at a low annotation cost. Traditional AL strategies operate in an idealized framework. They assume a single, omniscient annotator who never gets tired and charges uniformly regardless of query difficulty. However, in real-world applications, we often face human annotators, e.g., crowd or in-house workers, who make annotation mistakes and can be reluctant to respond if tired or faced with complex queries. Recently, a wide range of novel AL strategies has been proposed to address these issues. They differ in at least one of the following three central aspects from traditional AL: (1) They explicitly consider (multiple) human annotators whose performances can be affected by various factors, such as missing expertise. (2) They generalize the interaction with human annotators by considering different query and annotation types, such as asking an annotator for feedback on an inferred classification rule. (3) They take more complex cost schemes regarding annotations and misclassifications into account. This survey provides an overview of these AL strategies and refers to them as real-world AL. Therefore, we introduce a general real-world AL strategy as part of a learning cycle and use its elements, e.g., the query and annotator selection algorithm, to categorize about 60 real-world AL strategies. Finally, we outline possible directions for future research in the field of AL.

Submitted 23 September, 2021; originally announced September 2021.

Journal ref: IEEE Access 9 (2021) 166970-166989

arXiv:2109.09607 [pdf, other]

doi 10.1109/ICCVW54120.2021.00119

Description of Corner Cases in Automated Driving: Goals and Challenges

Authors: Daniel Bogdoll, Jasmin Breitenstein, Florian Heidecker, Maarten Bieshaar, Bernhard Sick, Tim Fingscheidt, J. Marius Zöllner

Abstract: Scaling the distribution of automated vehicles requires handling various unexpected and possibly dangerous situations, termed corner cases (CC). Since many modules of automated driving systems are based on machine learning (ML), CC are an essential part of the data for their development. However, there is only a limited amount of CC data in large-scale data collections, which makes them challenging in the context of ML. With a better understanding of CC, offline applications, e.g., dataset analysis, and online methods, e.g., improved performance of automated driving systems, can be improved. While there are knowledge-based descriptions and taxonomies for CC, there is little research on machine-interpretable descriptions. In this extended abstract, we will give a brief overview of the challenges and goals of such a description.

Submitted 28 September, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

Comments: Daniel Bogdoll, Jasmin Breitenstein and Florian Heidecker contributed equally. Accepted for publication at ICCV 2021 ERCVAD Workshop

arXiv:2108.13979 [pdf, other]

doi 10.1038/s41598-022-21646-x

Artificial intelligence for online characterization of ultrashort X-ray free-electron laser pulses

Authors: Kristina Dingel, Thorsten Otto, Lutz Marder, Lars Funke, Arne Held, Sara Savio, Andreas Hans, Gregor Hartmann, David Meier, Jens Viefhaus, Bernhard Sick, Arno Ehresmann, Markus Ilchen, Wolfram Helml

Abstract: X-ray free-electron lasers (XFELs) as the world's brightest light sources provide ultrashort X-ray pulses with a duration typically in the order of femtoseconds. Recently, they have approached and entered the attosecond regime, which holds new promises for single-molecule imaging and studying nonlinear and ultrafast phenomena such as localized electron dynamics. The technological evolution of XFELs toward well-controllable light sources for precise metrology of ultrafast processes has been, however, hampered by the diagnostic capabilities for characterizing X-ray pulses at the attosecond frontier. In this regard, the spectroscopic technique of photoelectron angular streaking has successfully proven how to non-destructively retrieve the exact time-energy structure of XFEL pulses on a single-shot basis. By using artificial intelligence techniques, in particular convolutional neural networks, we here show how this technique can be leveraged from its proof-of-principle stage toward routine diagnostics even at high-repetition-rate XFELs, thus enhancing and refining their scientific accessibility in all related disciplines.

Submitted 9 January, 2023; v1 submitted 31 August, 2021; originally announced August 2021.

Comments: This version includes Supplementary Information

Journal ref: Scientific Reports, 12, 1 (2022) 1-14

arXiv:2108.03891 [pdf, other]

Probabilistic Active Learning for Active Class Selection

Authors: Daniel Kottke, Georg Krempl, Marianne Stecklina, Cornelius Styp von Rekowski, Tim Sabsch, Tuan Pham Minh, Matthias Deliano, Myra Spiliopoulou, Bernhard Sick

Abstract: In machine learning, active class selection (ACS) algorithms aim to actively select a class and ask the oracle to provide an instance for that class to optimize a classifier's performance while minimizing the number of requests. In this paper, we propose a new algorithm (PAL-ACS) that transforms the ACS problem into an active learning task by introducing pseudo instances. These are used to estimate the usefulness of an upcoming instance for each class using the performance gain model from probabilistic active learning. Our experimental evaluation (on synthetic and real data) shows the advantages of our algorithm compared to state-of-the-art algorithms. It effectively prefers the sampling of difficult classes and thereby improves the classification performance.

Submitted 9 August, 2021; originally announced August 2021.

Journal ref: Proc. of the NIPS Workshop on the Future of Interactive Learning Machines (2016)

arXiv:2106.15991 [pdf, other]

Cyclist Trajectory Forecasts by Incorporation of Multi-View Video Information

Authors: Stefan Zernetsch, Oliver Trupp, Viktor Kress, Konrad Doll, Bernhard Sick

Abstract: This article presents a novel approach to incorporate visual cues from video-data from a wide-angle stereo camera system mounted at an urban intersection into the forecast of cyclist trajectories. We extract features from image and optical flow (OF) sequences using 3D convolutional neural networks (3D-ConvNet) and combine them with features extracted from the cyclist's past trajectory to forecast future cyclist positions. By the use of additional information, we are able to improve positional accuracy by about 7.5 % for our test dataset and by up to 22 % for specific motion types compared to a method solely based on past trajectories. Furthermore, we compare the use of image sequences to the use of OF sequences as additional information, showing that OF alone leads to significant improvements in positional accuracy. By training and testing our methods using a real-world dataset recorded at a heavily frequented public intersection and evaluating the methods' runtimes, we demonstrate the applicability in real traffic scenarios. Our code and parts of our dataset are made publicly available.

Submitted 30 June, 2021; originally announced June 2021.

arXiv:2106.02598 [pdf, other]

Pose and Semantic Map Based Probabilistic Forecast of Vulnerable Road Users' Trajectories

Authors: Viktor Kress, Fabian Jeske, Stefan Zernetsch, Konrad Doll, Bernhard Sick

Abstract: In this article, an approach for probabilistic trajectory forecasting of vulnerable road users (VRUs) is presented, which considers past movements and the surrounding scene. Past movements are represented by 3D poses reflecting the posture and movements of individual body parts. The surrounding scene is modeled in the form of semantic maps showing, e.g., the course of streets, sidewalks, and the occurrence of obstacles. The forecasts are generated in grids discretizing the space and in the form of arbitrary discrete probability distributions. The distributions are evaluated in terms of their reliability, sharpness, and positional accuracy. We compare our method with an approach that provides forecasts in the form of Gaussian distributions and discuss the respective advantages and disadvantages. Thereby, we investigate the impact of using poses and semantic maps. With a technique called spatial label smoothing, our approach achieves reliable forecasts. Overall, the poses have a positive impact on the forecasts. The semantic maps offer the opportunity to adapt the probability distributions to the individual situation, although at the considered forecasted time horizon of 2.52 s they play a minor role compared to the past movements of the VRU. Our method is evaluated on a dataset recorded in inner-city traffic using a research vehicle. The dataset is made publicly available.

Submitted 4 June, 2021; originally announced June 2021.

arXiv:2106.00528 [pdf, other]

Transformation Models for Flexible Posteriors in Variational Bayes

Authors: Sefan Hörtling, Daniel Dold, Oliver Dürr, Beate Sick

Abstract: The main challenge in Bayesian models is to determine the posterior for the model parameters. Already, in models with only one or few parameters, the analytical posterior can only be determined in special settings. In Bayesian neural networks, variational inference is widely used to approximate difficult-to-compute posteriors by variational distributions. Usually, Gaussians are used as variational distributions (Gaussian-VI), which limits the quality of the approximation due to their limited flexibility. Transformation models, on the other hand, are flexible enough to fit any distribution. Here we present transformation model-based variational inference (TM-VI) and demonstrate that it allows accurate approximation of complex posteriors in models with one parameter and also works in a mean-field fashion for multi-parameter models like neural networks.

Submitted 1 June, 2021; originally announced June 2021.

Comments: 5 pages, 4 figures

arXiv:2105.06896 [pdf, other]

doi 10.1109/ISC253183.2021.9562912

Towards Sensor Data Abstraction of Autonomous Vehicle Perception Systems

Authors: Hannes Reichert, Lukas Lang, Kevin Rösch, Daniel Bogdoll, Konrad Doll, Bernhard Sick, Hans-Christian Reuss, Christoph Stiller, J. Marius Zöllner

Abstract: Full-stack autonomous driving perception modules usually consist of data-driven models based on multiple sensor modalities. However, these models might be biased to the sensor setup used for data acquisition. This bias can seriously impair the perception models' transferability to new sensor setups, which continuously occur due to the market's competitive nature. We envision sensor data abstraction as an interface between sensor data and machine learning applications for highly automated vehicles (HAD). For this purpose, we review the primary sensor modalities, camera, lidar, and radar, published in autonomous-driving related datasets, examine single sensor abstraction and abstraction of sensor setups, and identify critical paths towards an abstraction of sensor data from multiple perception configurations.

Submitted 28 September, 2021; v1 submitted 14 May, 2021; originally announced May 2021.

Comments: Hannes Reichert, Lukas Lang, Kevin Rösch and Daniel Bogdoll contributed equally. Accepted for publication at ISC2 2021

arXiv:2105.02965 [pdf, other]

Out-of-distribution Detection and Generation using Soft Brownian Offset Sampling and Autoencoders

Authors: Felix Möller, Diego Botache, Denis Huseljic, Florian Heidecker, Maarten Bieshaar, Bernhard Sick

Abstract: Deep neural networks often suffer from overconfidence, which can be partly remedied by improved out-of-distribution detection. For this purpose, we propose a novel approach that allows for the generation of out-of-distribution datasets based on a given in-distribution dataset. This new dataset can then be used to improve out-of-distribution detection for the given dataset and machine learning task at hand. The samples in this dataset are, with respect to the feature space, close to the in-distribution dataset and therefore realistic and plausible. Hence, this dataset can also be used to safeguard neural networks, i.e., to validate the generalization performance. Our approach first generates suitable representations of an in-distribution dataset using an autoencoder and then transforms them using our newly proposed Soft Brownian Offset method. After transformation, the decoder part of the autoencoder allows for the generation of these implicit out-of-distribution samples. This newly generated dataset then allows for mixing with other datasets and thus improved training of an out-of-distribution classifier, increasing its performance. Experimentally, we show that our approach is promising for time series using synthetic data. Using our new method, we also show in a quantitative case study that we can improve the out-of-distribution detection for the MNIST dataset. Finally, we provide another case study on the synthetic generation of out-of-distribution trajectories, which can be used to validate trajectory prediction algorithms for automated driving.

Submitted 4 May, 2021; originally announced May 2021.

Comments: 10 pages, 7 figures, accepted for publication at CVPR 2021 Workshop Safe Artificial Intelligence for Automated Driving (SAIAD)

arXiv:2104.09176 [pdf, other]

Cyclist Intention Detection: A Probabilistic Approach

Authors: Stefan Zernetsch, Hannes Reichert, Viktor Kress, Konrad Doll, Bernhard Sick

Abstract: This article presents a holistic approach for probabilistic cyclist intention detection. A basic movement detection based on motion history images (MHI) and a residual convolutional neural network (ResNet) are used to estimate probabilities for the current cyclist motion state. These probabilities are used as weights in a probabilistic ensemble trajectory forecast. The ensemble consists of specialized models, which produce individual forecasts in the form of Gaussian distributions under the assumption of a certain motion state of the cyclist (e.g., cyclist is starting or turning left). By weighting the specialized models, we create forecasts in the form of Gaussian mixtures that define regions within which the cyclists will reside with a certain probability. To evaluate our method, we rate the reliability, sharpness, and positional accuracy of our forecasted distributions. We compare our method to a single model approach which produces forecasts in the form of Gaussian distributions and show that our method is able to produce more reliable and sharper outputs while retaining comparable positional accuracy. Both methods are evaluated using a dataset created at a public traffic intersection. Our code and the dataset are made publicly available.

Submitted 19 April, 2021; originally announced April 2021.

arXiv:2103.03678 [pdf, other]

doi 10.1109/IV48863.2021.9575933

An Application-Driven Conceptualization of Corner Cases for Perception in Highly Automated Driving

Authors: Florian Heidecker, Jasmin Breitenstein, Kevin Rösch, Jonas Löhdefink, Maarten Bieshaar, Christoph Stiller, Tim Fingscheidt, Bernhard Sick

Abstract: Systems and functions that rely on machine learning (ML) are the basis of highly automated driving. An essential task of such ML models is to reliably detect and interpret unusual, new, and potentially dangerous situations. The detection of those situations, which we refer to as corner cases, is highly relevant for successfully developing, applying, and validating automotive perception functions in future vehicles where multiple sensor modalities will be used. A complication for the development of corner case detectors is the lack of consistent definitions, terms, and corner case descriptions, especially when taking into account various automotive sensors. In this work, we provide an application-driven view of corner cases in highly automated driving. To achieve this goal, we first consider existing definitions from the general outlier, novelty, anomaly, and out-of-distribution detection to show relations and differences to corner cases. Moreover, we extend an existing camera-focused systematization of corner cases by adding RADAR (radio detection and ranging) and LiDAR (light detection and ranging) sensors. For this, we describe an exemplary toolchain for data acquisition and processing, highlighting the interfaces of the corner case detection. We also define a novel level of corner cases, the method layer corner cases, which appear due to uncertainty inherent in the methodology or the data distribution.

Submitted 5 March, 2021; originally announced March 2021.

Comments: This paper is submitted to IEEE Intelligent Vehicles Symposium 2021

arXiv:2101.00926 [pdf, other]

doi 10.1186/s42467-021-00009-8

CLeaR: An Adaptive Continual Learning Framework for Regression Tasks

Authors: Yujiang He, Bernhard Sick

Abstract: Catastrophic forgetting means that a trained neural network model gradually forgets the previously learned tasks when being retrained on new tasks. Overcoming the forgetting problem is a major problem in machine learning. Numerous continual learning algorithms are very successful in incremental learning of classification tasks, where new samples with their labels appear frequently. However, there is currently no research that addresses the catastrophic forgetting problem in regression tasks as far as we know. This problem has emerged as one of the primary constraints in some applications, such as renewable energy forecasts. This article clarifies problem-related definitions and proposes a new methodological framework that can forecast targets and update itself by means of continual learning. The framework consists of forecasting neural networks and buffers, which store newly collected data from a non-stationary data stream in an application. The changed probability distribution of the data stream, which the framework has identified, will be learned sequentially. The framework is called CLeaR (Continual Learning for Regression Tasks), where components can be flexibly customized for a specific application scenario. We design two sets of experiments to evaluate the CLeaR framework concerning fitting error (training), prediction error (test), and forgetting ratio. The first one is based on an artificial time series to explore how hyperparameters affect the CLeaR framework. The second one is designed with data collected from European wind farms to evaluate the CLeaR framework's performance in a real-world application. The experimental results demonstrate that the CLeaR framework can continually acquire knowledge in the data stream and improve the prediction accuracy. The article concludes with further research issues arising from requirements to extend the framework.

Submitted 16 July, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

Journal ref: Published on AI Perspectives (2021)

arXiv:2010.08376 [pdf, other]

Deep and interpretable regression models for ordinal outcomes

Authors: Lucas Kook, Lisa Herzog, Torsten Hothorn, Oliver Dürr, Beate Sick

Abstract: Outcomes with a natural order commonly occur in prediction tasks and often the available input data are a mixture of complex data like images and tabular predictors. Deep Learning (DL) models are state-of-the-art for image classification tasks but frequently treat ordinal outcomes as unordered and lack interpretability. In contrast, classical ordinal regression models consider the outcome's order and yield interpretable predictor effects but are limited to tabular data. We present ordinal neural network transformation models (ONTRAMs), which unite DL with classical ordinal regression approaches. ONTRAMs are a special case of transformation models and trade off flexibility and interpretability by additively decomposing the transformation function into terms for image and tabular data using jointly trained neural networks. The performance of the most flexible ONTRAM is by definition equivalent to a standard multi-class DL model trained with cross-entropy while being faster in training when facing ordinal outcomes. Lastly, we discuss how to interpret model components for both tabular and image data on two publicly available datasets.

Submitted 20 April, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: 41 pages (incl. appendix, figures and literature), 11 figures in main text, 4 figures in appendix

arXiv:2010.05898 [pdf, other]

Quantile Surfaces -- Generalizing Quantile Regression to Multivariate Targets

Authors: Maarten Bieshaar, Jens Schreiber, Stephan Vogt, André Gensler, Bernhard Sick

Abstract: In this article, we present a novel approach to multivariate probabilistic forecasting. Our approach is based on an extension of single-output quantile regression (QR) to multivariate targets, called quantile surfaces (QS). QS uses a simple yet compelling idea of indexing observations of a probabilistic forecast through direction and vector length to estimate a central tendency. We extend the single-output QR technique to multivariate probabilistic targets. QS efficiently models dependencies in multivariate target variables and represents probability distributions through discrete quantile levels. Therefore, we present a novel two-stage process. In the first stage, we perform a deterministic point forecast (i.e., central tendency estimation). Subsequently, we model the prediction uncertainty using QS involving neural networks called quantile surface regression neural networks (QSNN). Additionally, we introduce new methods for efficient and straightforward evaluation of the reliability and sharpness of the issued probabilistic QS predictions. We complement this by the directional extension of the Continuous Ranked Probability Score (CRPS). Finally, we evaluate our novel approach on synthetic data and two currently researched real-world challenges in two different domains: first, probabilistic forecasting for renewable energy power generation, and second, short-term cyclist trajectory forecasting for autonomously driving vehicles. Especially for the latter, our empirical results show that even a simple one-layer QSNN outperforms traditional parametric multivariate forecasting techniques, thus improving the state-of-the-art performance.

Submitted 29 September, 2020; originally announced October 2020.

Comments: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), currently under review, 15 pages, 23 figures, 2 tables

arXiv:2009.13853 [pdf, other]

Efficient SVDD Sampling with Approximation Guarantees for the Decision Boundary

Authors: Adrian Englhardt, Holger Trittenbach, Daniel Kottke, Bernhard Sick, Klemens Böhm

Abstract: Support Vector Data Description (SVDD) is a popular one-class classifier for anomaly and novelty detection. But despite its effectiveness, SVDD does not scale well with data size. To avoid prohibitive training times, sampling methods select small subsets of the training data on which SVDD trains a decision boundary hopefully equivalent to the one obtained on the full data set. According to the literature, a good sample should therefore contain so-called boundary observations that SVDD would select as support vectors on the full data set. However, non-boundary observations are also essential to not fragment contiguous inlier regions and avoid poor classification accuracy. Other aspects, such as selecting a sufficiently representative sample, are important as well. But existing sampling methods largely overlook them, resulting in poor classification accuracy. In this article, we study how to select a sample considering these points. Our approach is to frame SVDD sampling as an optimization problem, where constraints guarantee that sampling indeed approximates the original decision boundary. We then propose RAPID, an efficient algorithm to solve this optimization problem. RAPID does not require any tuning of parameters, is easy to implement and scales well to large data sets. We evaluate our approach on real-world and synthetic data. Our evaluation is the most comprehensive one for SVDD sampling so far. Our results show that RAPID outperforms its competitors in classification accuracy, in sample size, and in runtime.

Submitted 29 September, 2020; originally announced September 2020.

Showing 1–50 of 86 results for author: Sick, B