Showing 1–41 of 41 results for author: Fingscheidt, T

  1. arXiv:2406.04660  [pdf, other]

    eess.AS cs.SD

    URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

    Authors: Wangyou Zhang, Robin Scheibler, Kohei Saijo, Samuele Cornell, Chenda Li, Zhaoheng Ni, Anurag Kumar, Jan Pirklbauer, Marvin Sach, Shinji Watanabe, Tim Fingscheidt, Yanmin Qian

    Abstract: The last decade has witnessed significant advancements in deep learning-based speech enhancement (SE). However, most existing SE research has limitations on the coverage of SE sub-tasks, data diversity and amount, and evaluation metrics. To fill this gap and promote research toward universal SE, we establish a new SE challenge, named URGENT, to focus on the universality, robustness, and generaliza…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 6 pages, 3 figures, 3 tables. Accepted by Interspeech 2024. An extended version of the accepted manuscript with appendix

  2. arXiv:2312.01850  [pdf, other]

    cs.CV cs.LG

    Generalization by Adaptation: Diffusion-Based Domain Extension for Domain-Generalized Semantic Segmentation

    Authors: Joshua Niemeijer, Manuel Schwonberg, Jan-Aike Termöhlen, Nico M. Schmidt, Tim Fingscheidt

    Abstract: When models, e.g., for semantic segmentation, are applied to images that are vastly different from training data, the performance will drop significantly. Domain adaptation methods try to overcome this issue, but need samples from the target domain. However, this might not always be feasible for various reasons and therefore domain generalization methods are useful as they do not require any targe…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Accepted to WACV 2024

  3. arXiv:2309.02432  [pdf, other]

    eess.AS cs.SD

    Employing Real Training Data for Deep Noise Suppression

    Authors: Ziyi Xu, Marvin Sach, Jan Pirklbauer, Tim Fingscheidt

    Abstract: Most deep noise suppression (DNS) models are trained with reference-based losses requiring access to clean speech. However, sometimes an additive microphone model is insufficient for real-world applications. Accordingly, ways to use real training data in supervised learning for DNS models promise to reduce a potential training/inference mismatch. Employing real data for DNS training requires eithe…

    Submitted 5 September, 2023; originally announced September 2023.

  4. arXiv:2308.13331  [pdf, other]

    cs.CV

    A Re-Parameterized Vision Transformer (ReVT) for Domain-Generalized Semantic Segmentation

    Authors: Jan-Aike Termöhlen, Timo Bartels, Tim Fingscheidt

    Abstract: The task of semantic segmentation requires a model to assign semantic labels to each pixel of an image. However, the performance of such models degrades when deployed in an unseen domain with different data distributions compared to the training domain. We present a new augmentation-driven approach to domain generalization for semantic segmentation using a re-parameterized vision transformer (ReVT…

    Submitted 25 August, 2023; originally announced August 2023.

  5. arXiv:2304.11928  [pdf, other]

    cs.CV cs.AI

    Survey on Unsupervised Domain Adaptation for Semantic Segmentation for Visual Perception in Automated Driving

    Authors: Manuel Schwonberg, Joshua Niemeijer, Jan-Aike Termöhlen, Jörg P. Schäfer, Nico M. Schmidt, Hanno Gottschalk, Tim Fingscheidt

    Abstract: Deep neural networks (DNNs) have proven their capabilities in many areas in the past years, such as robotics, or automated driving, enabling technological breakthroughs. DNNs play a significant role in environment perception for the challenging application of automated driving and are employed for tasks such as detection, semantic segmentation, and sensor fusion. Despite this progress and tremendo…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: submitted to IEEE Access; Project Website: https://uda-survey.github.io/survey/

  6. arXiv:2304.09226  [pdf, other]

    eess.AS cs.SD

    Coded Speech Quality Measurement by a Non-Intrusive PESQ-DNN

    Authors: Ziyi Xu, Ziyue Zhao, Tim Fingscheidt

    Abstract: Wideband codecs such as AMR-WB or EVS are widely used in (mobile) speech communication. Evaluation of coded speech quality is often performed subjectively by an absolute category rating (ACR) listening test. However, the ACR test is impractical for online monitoring of speech communication networks. Perceptual evaluation of speech quality (PESQ) is one of the widely used metrics instrumentally pre…

    Submitted 18 April, 2023; originally announced April 2023.

  7. arXiv:2209.09735  [pdf, ps, other]

    cs.LG cs.CL eess.AS eess.IV

    Relaxed Attention for Transformer Models

    Authors: Timo Lohrenz, Björn Möller, Zhengyang Li, Tim Fingscheidt

    Abstract: The powerful modeling capabilities of all-attention-based transformer architectures often cause overfitting and - for natural language processing tasks - lead to an implicitly learned internal language model in the autoregressive transformer decoder complicating the integration of external language models. In this paper, we explore relaxed attention, a simple and easy-to-implement smoothing of the…

    Submitted 20 September, 2022; originally announced September 2022.

  8. arXiv:2206.00608  [pdf, other]

    cs.CV cs.LG cs.RO

    On the Choice of Data for Efficient Training and Validation of End-to-End Driving Models

    Authors: Marvin Klingner, Konstantin Müller, Mona Mirzaie, Jasmin Breitenstein, Jan-Aike Termöhlen, Tim Fingscheidt

    Abstract: The emergence of data-driven machine learning (ML) has facilitated significant progress in many complicated tasks such as highly-automated driving. While much effort is put into improving the ML models and learning algorithms in such applications, little focus is put into how the training data and/or validation setting should be designed. In this paper we investigate the influence of several data…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: Accepted at CVPR VDU Workshop 2022

  9. arXiv:2206.00527  [pdf, other]

    cs.CV

    Amodal Cityscapes: A New Dataset, its Generation, and an Amodal Semantic Segmentation Challenge Baseline

    Authors: Jasmin Breitenstein, Tim Fingscheidt

    Abstract: Amodal perception terms the ability of humans to imagine the entire shapes of occluded objects. This gives humans an advantage to keep track of everything that is going on, especially in crowded situations. Typical perception functions, however, lack amodal perception abilities and are therefore at a disadvantage in situations with occlusions. Complex urban driving scenarios often experience many…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: This paper is accepted at IEEE Intelligent Vehicles Symposium 2022

  10. arXiv:2205.04276  [pdf, ps, other]

    eess.AS cs.SD

    Bandwidth-Scalable Fully Mask-Based Deep FCRN Acoustic Echo Cancellation and Postfiltering

    Authors: Ernst Seidel, Rasmus Kongsgaard Olsson, Karim Haddad, Zhengyang Li, Pejman Mowlaee, Tim Fingscheidt

    Abstract: Although today's speech communication systems support various bandwidths from narrowband to super-wideband and beyond, state-of-the art DNN methods for acoustic echo cancellation (AEC) are lacking modularity and bandwidth scalability. Our proposed DNN model builds upon a fully convolutional recurrent network (FCRN) and introduces scalability over various bandwidths up to a fullband (FB) system (48…

    Submitted 7 November, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: 5 pages, 1 figure, accepted for IWAENC 2022

  11. arXiv:2205.02085  [pdf, other]

    eess.AS cs.SD

    Does a PESQNet (Loss) Require a Clean Reference Input? The Original PESQ Does, But ACR Listening Tests Don't

    Authors: Ziyi Xu, Maximilian Strake, Tim Fingscheidt

    Abstract: Perceptual evaluation of speech quality (PESQ) requires a clean speech reference as input, but predicts the results from (reference-free) absolute category rating (ACR) tests. In this work, we train a fully convolutional recurrent neural network (FCRN) as deep noise suppression (DNS) model, with either a non-intrusive or an intrusive PESQNet, where only the latter has access to a clean speech refe…

    Submitted 13 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

  12. arXiv:2203.01177  [pdf, other]

    cs.CV

    Detecting Adversarial Perturbations in Multi-Task Perception

    Authors: Marvin Klingner, Varun Ravi Kumar, Senthil Yogamani, Andreas Bär, Tim Fingscheidt

    Abstract: While deep neural networks (DNNs) achieve impressive performance on environment perception tasks, their sensitivity to adversarial perturbations limits their use in practical applications. In this paper, we (i) propose a novel adversarial perturbation detection scheme based on multi-task perception of complex vision tasks (i.e., depth estimation and semantic segmentation). Specifically, adversaria…

    Submitted 11 September, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted at IROS 2022

  13. arXiv:2203.01074  [pdf, other]

    cs.CV

    Continual BatchNorm Adaptation (CBNA) for Semantic Segmentation

    Authors: Marvin Klingner, Mouadh Ayache, Tim Fingscheidt

    Abstract: Environment perception in autonomous driving vehicles often heavily relies on deep neural networks (DNNs), which are subject to domain shifts, leading to a significantly decreased performance during DNN deployment. Usually, this problem is addressed by unsupervised domain adaptation (UDA) approaches trained either simultaneously on source and target domain datasets or even source-free only on targ…

    Submitted 8 July, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted to IEEE Transactions on Intelligent Transportation Systems

  14. arXiv:2201.06415  [pdf, other]

    cs.CV eess.IV

    Improving Performance of Semantic Segmentation CycleGANs by Noise Injection into the Latent Segmentation Space

    Authors: Jonas Löhdefink, Tim Fingscheidt

    Abstract: In recent years, semantic segmentation has taken benefit from various works in computer vision. Inspired by the very versatile CycleGAN architecture, we combine semantic segmentation with the concept of cycle consistency to enable a multitask training protocol. However, learning is largely prevented by the so-called steganography effect, which expresses itself as watermarks in the latent segmentat…

    Submitted 17 January, 2022; originally announced January 2022.

  15. arXiv:2201.02834  [pdf, other]

    eess.SP cs.LG

    Reconfigurable Intelligent Surface Enabled Spatial Multiplexing with Fully Convolutional Network

    Authors: Bile Peng, Jan-Aike Termöhlen, Cong Sun, Danping He, Ke Guan, Tim Fingscheidt, Eduard A. Jorswieck

    Abstract: Reconfigurable intelligent surface (RIS) is an emerging technology for future wireless communication systems. In this work, we consider downlink spatial multiplexing enabled by the RIS for weighted sum-rate (WSR) maximization. In the literature, most solutions use alternating gradient-based optimization, which has moderate performance, high complexity, and limited scalability. We propose to apply…

    Submitted 21 September, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

  16. arXiv:2111.03847  [pdf, other]

    eess.AS cs.SD

    Deep Noise Suppression Maximizing Non-Differentiable PESQ Mediated by a Non-Intrusive PESQNet

    Authors: Ziyi Xu, Maximilian Strake, Tim Fingscheidt

    Abstract: Speech enhancement employing deep neural networks (DNNs) for denoising are called deep noise suppression (DNS). During training, DNS methods are typically trained with mean squared error (MSE) type loss functions, which do not guarantee good perceptual quality. Perceptual evaluation of speech quality (PESQ) is a widely used metric for evaluating speech quality. However, the original PESQ algorithm…

    Submitted 6 November, 2021; originally announced November 2021.

  17. Description of Corner Cases in Automated Driving: Goals and Challenges

    Authors: Daniel Bogdoll, Jasmin Breitenstein, Florian Heidecker, Maarten Bieshaar, Bernhard Sick, Tim Fingscheidt, J. Marius Zöllner

    Abstract: Scaling the distribution of automated vehicles requires handling various unexpected and possibly dangerous situations, termed corner cases (CC). Since many modules of automated driving systems are based on machine learning (ML), CC are an essential part of the data for their development. However, there is only a limited amount of CC data in large-scale data collections, which makes them challengin…

    Submitted 28 September, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

    Comments: Daniel Bogdoll, Jasmin Breitenstein and Florian Heidecker contributed equally. Accepted for publication at ICCV 2021 ERCVAD Workshop

  18. arXiv:2107.01275  [pdf, ps, other]

    eess.AS cs.CL cs.LG cs.SD

    Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

    Authors: Timo Lohrenz, Patrick Schwarz, Zhengyang Li, Tim Fingscheidt

    Abstract: Recently, attention-based encoder-decoder (AED) models have shown high performance for end-to-end automatic speech recognition (ASR) across several tasks. Addressing overconfidence in such models, in this paper we introduce the concept of relaxed attention, which is a simple gradual injection of a uniform distribution to the encoder-decoder attention weights during training that is easily implemen…

    Submitted 15 December, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: Accepted at ASRU 2021, code contributed to http://github.com/freewym/espresso

  19. Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety

    Authors: Sebastian Houben, Stephanie Abrecht, Maram Akila, Andreas Bär, Felix Brockherde, Patrick Feifel, Tim Fingscheidt, Sujan Sai Gannamaneni, Seyed Eghbal Ghobadi, Ahmed Hammam, Anselm Haselhoff, Felix Hauser, Christian Heinzemann, Marco Hoffmann, Nikhil Kapoor, Falk Kappel, Marvin Klingner, Jan Kronenberger, Fabian Küppers, Jonas Löhdefink, Michael Mlynarski, Michael Mock, Firas Mualla, Svetlana Pavlitskaya, Maximilian Poretschkin, et al. (16 additional authors not shown)

    Abstract: The use of deep neural networks (DNNs) in safety-critical applications like mobile health and autonomous driving is challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability to problems with malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from safety conce…

    Submitted 29 April, 2021; originally announced April 2021.

    Comments: 94 pages

    Journal ref: Fingscheidt, T., Gottschalk, H., Houben, S. (eds) Deep Neural Networks and Data for Automated Driving, Springer, Cham (2022)

  20. arXiv:2104.05255  [pdf, other]

    cs.CV

    Improving Online Performance Prediction for Semantic Segmentation

    Authors: Marvin Klingner, Andreas Bär, Marcel Mross, Tim Fingscheidt

    Abstract: In this work we address the task of observing the performance of a semantic segmentation deep neural network (DNN) during online operation, i.e., during inference, which is of high importance in safety-critical applications such as autonomous driving. Here, many high-level decisions rely on such DNNs, which are usually evaluated offline, while their performance in online operation remains unknown.…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR SAIAD Workshop

  21. arXiv:2104.04420  [pdf, other]

    cs.CV cs.RO

    SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround View Fisheye Cameras

    Authors: Varun Ravi Kumar, Marvin Klingner, Senthil Yogamani, Markus Bach, Stefan Milz, Tim Fingscheidt, Patrick Mäder

    Abstract: A 360° perception of scene geometry is essential for automated driving, notably for parking and urban driving scenarios. Typically, it is achieved using surround-view fisheye cameras, focusing on the near-field area around the vehicle. The majority of current depth estimation approaches focus on employing just a single camera, which cannot be straightforwardly generalized to multiple cameras. The…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: To be published at IEEE Transactions on Intelligent Transportation Systems

  22. arXiv:2104.00120  [pdf, ps, other]

    eess.AS cs.CL cs.LG cs.SD

    Multi-Encoder Learning and Stream Fusion for Transformer-Based End-to-End Automatic Speech Recognition

    Authors: Timo Lohrenz, Zhengyang Li, Tim Fingscheidt

    Abstract: Stream fusion, also known as system combination, is a common technique in automatic speech recognition for traditional hybrid hidden Markov model approaches, yet mostly unexplored for modern deep neural network end-to-end model architectures. Here, we investigate various fusion techniques for the all-attention-based encoder-decoder architecture known as the transformer, striving to achieve optimal…

    Submitted 14 July, 2021; v1 submitted 31 March, 2021; originally announced April 2021.

    Comments: accepted at INTERSPEECH 2021

  23. arXiv:2103.17189  [pdf, ps, other]

    eess.AS cs.SD

    Y$^2$-Net FCRN for Acoustic Echo and Noise Suppression

    Authors: Ernst Seidel, Jan Franzen, Maximilian Strake, Tim Fingscheidt

    Abstract: In recent years, deep neural networks (DNNs) were studied as an alternative to traditional acoustic echo cancellation (AEC) algorithms. The proposed models achieved remarkable performance for the separate tasks of AEC and residual echo suppression (RES). A promising network topology is a fully convolutional recurrent network (FCRN) structure, which has already proven its performance on both noise…

    Submitted 18 July, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

    Comments: 5 pages, 2 figures, accepted for Interspeech 2021

  24. An Application-Driven Conceptualization of Corner Cases for Perception in Highly Automated Driving

    Authors: Florian Heidecker, Jasmin Breitenstein, Kevin Rösch, Jonas Löhdefink, Maarten Bieshaar, Christoph Stiller, Tim Fingscheidt, Bernhard Sick

    Abstract: Systems and functions that rely on machine learning (ML) are the basis of highly automated driving. An essential task of such ML models is to reliably detect and interpret unusual, new, and potentially dangerous situations. The detection of those situations, which we refer to as corner cases, is highly relevant for successfully developing, applying, and validating automotive perception functions i…

    Submitted 5 March, 2021; originally announced March 2021.

    Comments: This paper is submitted to IEEE Intelligent Vehicles Symposium 2021

  25. arXiv:2102.05897  [pdf, other]

    cs.CV

    Corner Cases for Visual Perception in Automated Driving: Some Guidance on Detection Approaches

    Authors: Jasmin Breitenstein, Jan-Aike Termöhlen, Daniel Lipinski, Tim Fingscheidt

    Abstract: Automated driving has become a major topic of interest not only in the active research community but also in mainstream media reports. Visual perception of such intelligent vehicles has experienced large progress in the last decade thanks to advances in deep learning techniques but some challenges still remain. One such challenge is the detection of corner cases. They are unexpected and unknown si…

    Submitted 11 February, 2021; originally announced February 2021.

  26. The Vulnerability of Semantic Segmentation Networks to Adversarial Attacks in Autonomous Driving: Enhancing Extensive Environment Sensing

    Authors: Andreas Bär, Jonas Löhdefink, Nikhil Kapoor, Serin J. Varghese, Fabian Hüger, Peter Schlicht, Tim Fingscheidt

    Abstract: Enabling autonomous driving (AD) can be considered one of the biggest challenges in today's technology. AD is a complex task accomplished by several functionalities, with environment perception being one of its core functions. Environment perception is usually performed by combining the semantic information captured by several sensors, i.e., lidar or camera. The semantic information from the respe…

    Submitted 13 January, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: IEEE Signal Processing Magazine (Volume: 38, Issue: 1, Jan. 2021), pp. 42 - 52

  27. arXiv:2012.01558  [pdf, other]

    cs.CV cs.LG eess.IV

    From a Fourier-Domain Perspective on Adversarial Examples to a Wiener Filter Defense for Semantic Segmentation

    Authors: Nikhil Kapoor, Andreas Bär, Serin Varghese, Jan David Schneider, Fabian Hüger, Peter Schlicht, Tim Fingscheidt

    Abstract: Despite recent advancements, deep neural networks are not robust against adversarial perturbations. Many of the proposed adversarial defense approaches use computationally expensive training mechanisms that do not scale to complex real-world tasks such as semantic segmentation, and offer only marginal improvements. In addition, fundamental questions on the nature of adversarial perturbations and t…

    Submitted 21 April, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: Accepted by The International Joint Conference on Neural Network (IJCNN) 2021

  28. arXiv:2012.01386  [pdf, other]

    cs.CV cs.AI cs.LG

    A Self-Supervised Feature Map Augmentation (FMA) Loss and Combined Augmentations Finetuning to Efficiently Improve the Robustness of CNNs

    Authors: Nikhil Kapoor, Chun Yuan, Jonas Löhdefink, Roland Zimmermann, Serin Varghese, Fabian Hüger, Nico Schmidt, Peter Schlicht, Tim Fingscheidt

    Abstract: Deep neural networks are often not robust to semantically-irrelevant changes in the input. In this work we address the issue of robustness of state-of-the-art deep convolutional neural networks (CNNs) against commonly occurring distortions in the input such as photometric changes, or the addition of blur and noise. These changes in the input are often accounted for during training in the form of d…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: Accepted at ACM CSCS 2020 (8 pages, 4 figures)

  29. arXiv:2011.08502  [pdf, other]

    cs.CV

    Unsupervised BatchNorm Adaptation (UBNA): A Domain Adaptation Method for Semantic Segmentation Without Using Source Domain Representations

    Authors: Marvin Klingner, Jan-Aike Termöhlen, Jacob Ritterbach, Tim Fingscheidt

    Abstract: In this paper we present a solution to the task of "unsupervised domain adaptation (UDA) of a given pre-trained semantic segmentation model without relying on any source domain representations". Previous UDA approaches for semantic segmentation either employed simultaneous training of the model in the source and target domains, or they relied on an additional network, replaying source domain knowl…

    Submitted 11 November, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: Accepted to WACV DNOW Workshop

  30. arXiv:2010.14919  [pdf, other]

    cs.CV

    Transferable Universal Adversarial Perturbations Using Generative Models

    Authors: Atiye Sadat Hashemi, Andreas Bär, Saeed Mozaffari, Tim Fingscheidt

    Abstract: Deep neural networks tend to be vulnerable to adversarial perturbations, which by adding to a natural image can fool a respective model with high confidence. Recently, the existence of image-agnostic perturbations, also known as universal adversarial perturbations (UAPs), were discovered. However, existing UAPs still lack a sufficiently high fooling rate, when being applied to an unknown target mo…

    Submitted 29 October, 2020; v1 submitted 28 October, 2020; originally announced October 2020.

  31. arXiv:2008.04017  [pdf, other]

    cs.CV cs.RO

    SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving

    Authors: Varun Ravi Kumar, Marvin Klingner, Senthil Yogamani, Stefan Milz, Tim Fingscheidt, Patrick Maeder

    Abstract: State-of-the-art self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity. They do not generalize well when applied on distance estimation for complex projection models such as in fisheye and omnidirectional cameras. This paper introduces a novel multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhol…

    Submitted 14 November, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: Camera ready version + supplementary. Accepted for presentation at Winter Conference on Applications of Computer Vision 2021

  32. arXiv:2007.08463  [pdf, other]

    cs.CV cs.LG

    openDD: A Large-Scale Roundabout Drone Dataset

    Authors: Antonia Breuer, Jan-Aike Termöhlen, Silviu Homoceanu, Tim Fingscheidt

    Abstract: Analyzing and predicting the traffic scene around the ego vehicle has been one of the key challenges in autonomous driving. Datasets including the trajectories of all road users present in a scene, as well as the underlying road topology are invaluable to analyze the behavior of the different traffic participants. The interaction between the various traffic participants is especially high in inter…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: ITSC 2020 Conference Paper

  33. arXiv:2007.06936  [pdf, other]

    cs.CV

    Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance

    Authors: Marvin Klingner, Jan-Aike Termöhlen, Jonas Mikolajczyk, Tim Fingscheidt

    Abstract: Self-supervised monocular depth estimation presents a powerful method to obtain 3D scene information from single camera images, which is trainable on arbitrary image sequences without requiring depth labels, e.g., from a LiDAR sensor. In this work we present a new self-supervised semantically-guided depth estimation (SGDepth) method to deal with moving dynamic-class (DC) objects, such as moving ca…

    Submitted 21 July, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  34. arXiv:2006.08613  [pdf, other]

    cs.CV

    Self-Supervised Domain Mismatch Estimation for Autonomous Perception

    Authors: Jonas Löhdefink, Justin Fehrling, Marvin Klingner, Fabian Hüger, Peter Schlicht, Nico M. Schmidt, Tim Fingscheidt

    Abstract: Autonomous driving requires self awareness of its perception functions. Technically spoken, this can be realized by observers, which monitor the performance indicators of various perception modules. In this work we choose, exemplarily, a semantic segmentation to be monitored, and propose an autoencoder, trained in a self-supervised fashion on the very same training data as the semantic segmentatio…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: Proc. of CVPR - Workshops

  35. arXiv:2005.06050  [pdf, other]

    cs.CV cs.LG eess.IV

    Class-Incremental Learning for Semantic Segmentation Re-Using Neither Old Data Nor Old Labels

    Authors: Marvin Klingner, Andreas Bär, Philipp Donn, Tim Fingscheidt

    Abstract: While neural networks trained for semantic segmentation are essential for perception in autonomous driving, most current algorithms assume a fixed number of classes, presenting a major limitation when developing new autonomous driving systems with the need of additional classes. In this paper we present a technique implementing class-incremental learning for semantic segmentation without using the…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: ITSC 2020 Conference Paper

  36. arXiv:2004.11072  [pdf, other]

    cs.CV

    Improved Noise and Attack Robustness for Semantic Segmentation by Using Multi-Task Training with Self-Supervised Depth Estimation

    Authors: Marvin Klingner, Andreas Bär, Tim Fingscheidt

    Abstract: While current approaches for neural network training often aim at improving performance, less focus is put on training methods aiming at robustness towards varying noise conditions or directed attacks by adversarial examples. In this paper, we propose to improve robustness by a multi-task training, which extends supervised semantic segmentation by a self-supervised monocular depth estimation on un…

    Submitted 23 April, 2020; originally announced April 2020.

    Comments: CVPR 2020 Workshop on Safe Artificial Intelligence for Automated Driving

  37. arXiv:1908.05087  [pdf, ps, other]

    eess.AS cs.SD

    Components Loss for Neural Networks in Mask-Based Speech Enhancement

    Authors: Ziyi Xu, Samy Elshamy, Ziyue Zhao, Tim Fingscheidt

    Abstract: Estimating time-frequency domain masks for single-channel speech enhancement using deep learning methods has recently become a popular research field with promising results. In this paper, we propose a novel components loss (CL) for the training of neural networks for mask-based speech enhancement. During the training process, the proposed CL offers separate control over preservation of the speech…

    Submitted 14 August, 2019; originally announced August 2019.

  38. arXiv:1905.09754  [pdf, ps, other]

    eess.AS cs.SD

    A Perceptual Weighting Filter Loss for DNN Training in Speech Enhancement

    Authors: Ziyue Zhao, Samy Elshamy, Tim Fingscheidt

    Abstract: Single-channel speech enhancement with deep neural networks (DNNs) has shown promising performance and is thus intensively being studied. In this paper, instead of applying the mean squared error (MSE) as the loss function during DNN training for speech enhancement, we design a perceptual weighting filter loss motivated by the weighting filter as it is employed in analysis-by-synthesis speech codi…

    Submitted 18 August, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

  39. arXiv:1902.09184  [pdf, other]

    cs.CV

    Towards Corner Case Detection for Autonomous Driving

    Authors: Jan-Aike Bolte, Andreas Bär, Daniel Lipinski, Tim Fingscheidt

    Abstract: The progress in autonomous driving is also due to the increased availability of vast amounts of training data for the underlying machine learning approaches. Machine learning systems are generally known to lack robustness, e.g., if the training data did rarely or not at all cover critical situations. The challenging task of corner case detection in video, which is also somehow related to unusual e…

    Submitted 26 February, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

  40. arXiv:1902.04311  [pdf, other]

    cs.CV

    GAN- vs. JPEG2000 Image Compression for Distributed Automotive Perception: Higher Peak SNR Does Not Mean Better Semantic Segmentation

    Authors: Jonas Löhdefink, Andreas Bär, Nico M. Schmidt, Fabian Hüger, Peter Schlicht, Tim Fingscheidt

    Abstract: The high amount of sensors required for autonomous driving poses enormous challenges on the capacity of automotive bus systems. There is a need to understand tradeoffs between bitrate and perception performance. In this paper, we compare the image compression standards JPEG, JPEG2000, and WebP to a modern encoder/decoder image compression approach based on generative adversarial networks (GANs). W…

    Submitted 12 February, 2019; originally announced February 2019.

  41. arXiv:1806.09411  [pdf, ps, other]

    eess.AS cs.SD

    Convolutional Neural Networks to Enhance Coded Speech

    Authors: Ziyue Zhao, Huijun Liu, Tim Fingscheidt

    Abstract: Enhancing coded speech suffering from far-end acoustic background noise, quantization noise, and potentially transmission errors, is a challenging task. In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time domain approach follo…

    Submitted 24 January, 2019; v1 submitted 25 June, 2018; originally announced June 2018.

    Comments: More analysis are added for version 4