Showing 1–45 of 45 results for author: Stiefelhagen, R

  1. arXiv:2404.01816  [pdf, other]

    eess.IV cs.CV cs.HC

    Rethinking Annotator Simulation: Realistic Evaluation of Whole-Body PET Lesion Interactive Segmentation Methods

    Authors: Zdravko Marinov, Moon Kim, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: Interactive segmentation plays a crucial role in accelerating the annotation, particularly in domains requiring specialized expertise such as nuclear medicine. For example, annotating lesions in whole-body Positron Emission Tomography (PET) images can require over an hour per volume. While previous works evaluate interactive segmentation models through either real user studies or simulated annotat…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures, 1 table

  2. arXiv:2403.09975  [pdf, other]

    cs.CV cs.RO eess.IV

    Skeleton-Based Human Action Recognition with Noisy Labels

    Authors: Yi Xu, Kunyu Peng, Di Wen, Ruiping Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen

    Abstract: Understanding human actions from body poses is critical for assistive robots sharing space with humans in order to make informed and safe decisions about the next interaction. However, precise temporal localization and annotation of activity sequences is time-consuming and the resulting labels are often noisy. If not effectively addressed, label noise negatively affects the model's training, resul…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: The source code will be made accessible at https://github.com/xuyizdby/NoiseEraSAR

  3. arXiv:2402.18302  [pdf, other]

    cs.CV cs.RO eess.AS eess.IV

    EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving

    Authors: Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang

    Abstract: This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving. Due to the lack of semantic modeling capacity in audio and video, existing works have mainly focused on text-based multi-object tracking, which often comes at the cos…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: The source code and datasets will be made publicly available at https://github.com/lab206/EchoTrack

  4. arXiv:2401.16923  [pdf, other]

    cs.CV cs.RO eess.IV

    Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation

    Authors: Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen

    Abstract: Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework. However, the modality incompleteness in multi-modal segmentation remains under-explored. In this work, we establish a task called Modality-Incomplete Scene Segmentation (MISS), which encompasses both system-level…

    Submitted 10 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted to IEEE IV 2024. The source code is publicly available at https://github.com/RuipingL/MISS

  5. arXiv:2312.06330  [pdf, other]

    cs.CV cs.AI cs.RO eess.IV

    Navigating Open Set Scenarios for Skeleton-based Action Recognition

    Authors: Kunyu Peng, Cheng Yin, Junwei Zheng, Ruiping Liu, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

    Abstract: In real-world scenarios, human actions often fall outside the distribution of training data, making it crucial for models to recognize known actions and reject unknown ones. However, using pure skeleton data in such open-set conditions poses challenges due to the lack of visual background cues and the distinct sparse structure of body pose sequences. In this paper, we tackle the unexplored Open-Se…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024. The benchmark, code, and models will be released at https://github.com/KPeng9510/OS-SAR

  6. arXiv:2311.14482  [pdf, other]

    eess.IV cs.AI cs.CV cs.HC

    Sliding Window FastEdit: A Framework for Lesion Annotation in Whole-body PET Images

    Authors: Matthias Hadlich, Zdravko Marinov, Moon Kim, Enrico Nasca, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: Deep learning has revolutionized the accurate segmentation of diseases in medical imaging. However, achieving such results requires training with numerous manual voxel annotations. This requirement presents a challenge for whole-body Positron Emission Tomography (PET) imaging, where lesions are scattered throughout the body. To tackle this problem, we introduce SW-FastEdit - an interactive segment…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures, 4 tables

  7. arXiv:2311.13964  [pdf, other]

    eess.IV cs.AI cs.CV cs.HC cs.LG

    Deep Interactive Segmentation of Medical Images: A Systematic Review and Taxonomy

    Authors: Zdravko Marinov, Paul F. Jäger, Jan Egger, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: Interactive segmentation is a crucial research area in medical image analysis aiming to boost the efficiency of costly annotations by incorporating human feedback. This feedback takes the form of clicks, scribbles, or masks and allows for iterative refinement of the model output so as to efficiently guide the system towards the desired behavior. In recent years, deep learning-based approaches have…

    Submitted 9 January, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: 26 pages, 8 figures, 10 tables; Zdravko Marinov and Paul F. Jäger are co-first authors; This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  8. arXiv:2310.02815  [pdf, other]

    cs.CV cs.RO eess.IV

    CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

    Authors: Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

    Abstract: Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety. While previous studies have limitations in using only depth or height information, we find both depth and height matter and they are in fact complementary. The depth feature encompasses pre…

    Submitted 17 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: The source code will be made publicly available at https://github.com/MasterHow/CoBEV

  9. arXiv:2309.12114  [pdf, other]

    eess.IV cs.CV

    AutoPET Challenge 2023: Sliding Window-based Optimization of U-Net

    Authors: Matthias Hadlich, Zdravko Marinov, Rainer Stiefelhagen

    Abstract: Tumor segmentation in medical imaging is crucial and relies on precise delineation. Fluorodeoxyglucose Positron-Emission Tomography (FDG-PET) is widely used in clinical practice to detect metabolically active tumors. However, FDG-PET scans may misinterpret irregular glucose consumption in healthy or benign tissues as cancer. Combining PET with Computed Tomography (CT) can enhance tumor segmentatio…

    Submitted 4 October, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: 9 pages, 1 figure, MICCAI 2023 - AutoPET Challenge Submission Version 2: Added all results on the preliminary test set

  10. arXiv:2309.12029  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Unveiling the Hidden Realm: Self-supervised Skeleton-based Action Recognition in Occluded Environments

    Authors: Yifei Chen, Kunyu Peng, Alina Roitberg, David Schneider, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

    Abstract: To integrate action recognition methods into autonomous robotic systems, it is crucial to consider adverse situations involving target occlusions. Such a scenario, despite its practical relevance, is rarely addressed in existing self-supervised skeleton-based action recognition methods. To empower robots with the capacity to address occlusion, we propose a simple and effective method. We first pre…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: The source code will be made publicly available at https://github.com/cyfml/OPSTL

  11. arXiv:2309.12009  [pdf, other]

    cs.CV cs.MM cs.RO eess.IV

    Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

    Authors: Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

    Abstract: Self-supervised representation learning for human action recognition has developed rapidly in recent years. Most of the existing works are based on skeleton data while using a multi-modality setup. These works overlooked the differences in performance among modalities, which led to the propagation of erroneous knowledge between modalities while only three fundamental modalities, i.e., joints, bone…

    Submitted 10 January, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024. The source code will be made publicly available at https://github.com/desehuileng0o0/IKEM

  12. arXiv:2307.15588  [pdf, other]

    cs.CV cs.RO eess.IV

    OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation

    Authors: Fei Teng, Jiaming Zhang, Kunyu Peng, Yaonan Wang, Rainer Stiefelhagen, Kailun Yang

    Abstract: Light field cameras, by harnessing the power of micro-lens array, are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation, a critical aspect of scene interpretation in vision intelligence. However, the extensive angular information of light…

    Submitted 21 December, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: The source code of OAFuser will be made publicly available at https://github.com/FeiBryantkit/OAFuser

  13. arXiv:2307.13375  [pdf, other]

    eess.IV cs.CV

    Towards Unifying Anatomy Segmentation: Automated Generation of a Full-body CT Dataset via Knowledge Aggregation and Anatomical Guidelines

    Authors: Alexander Jaus, Constantin Seibold, Kelsey Hermann, Alexandra Walter, Kristina Giske, Johannes Haubold, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: In this study, we present a method for generating automated anatomy segmentation datasets using a sequential process that involves nnU-Net-based pseudo-labeling and anatomy-guided pseudo-label refinement. By combining various fragmented knowledge bases, we generate a dataset of whole-body CT scans with $142$ voxel-level labels for 533 volumes providing comprehensive anatomical coverage which exper…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 18 pages, 8 figures, 2 tables

  14. arXiv:2307.07763  [pdf, other]

    cs.RO cs.CV eess.IV

    Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents

    Authors: Ke Cao, Ruiping Liu, Ze Wang, Kunyu Peng, Jiaming Zhang, Junwei Zheng, Zhifeng Teng, Kailun Yang, Rainer Stiefelhagen

    Abstract: The mobile robot relies on SLAM (Simultaneous Localization and Mapping) to provide autonomous navigation and task execution in complex and unknown environments. However, it is hard to develop a dedicated algorithm for mobile robots due to dynamic and challenging situations, such as poor lighting conditions and motion blur. To tackle this issue, we propose a tightly-coupled LiDAR-visual SLAM based…

    Submitted 25 December, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

    Comments: Accepted to ROBIO 2023

  15. arXiv:2307.07757  [pdf, other]

    cs.CV cs.HC cs.RO eess.IV

    Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments

    Authors: Ruiping Liu, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ke Cao, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

    Abstract: Grounded Situation Recognition (GSR) is capable of recognizing and interpreting visual scenes in a contextually intuitive way, yielding salient activities (verbs) and the involved entities (roles) depicted in images. In this work, we focus on the application of GSR in assisting people with visual impairments (PVI). However, precise localization information of detected objects is often required to…

    Submitted 15 July, 2023; originally announced July 2023.

    Comments: Code will be available at https://github.com/RuipingL/OpenSU

  16. arXiv:2306.03934  [pdf, other]

    eess.IV cs.CV cs.LG

    Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling

    Authors: Constantin Seibold, Alexander Jaus, Matthias A. Fink, Moon Kim, Simon Reiß, Ken Herrmann, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: Purpose: Interpreting chest radiographs (CXR) remains challenging due to the ambiguity of overlapping structures such as the lungs, heart, and bones. To address this issue, we propose a novel method for extracting fine-grained anatomical structures in CXR using pseudo-labeling of three-dimensional computed tomography (CT) scans. Methods: We created a large-scale dataset of 10,021 thoracic CTs wi…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: 28 pages, 1 table, 10 figures

    ACM Class: I.4.6; I.4.7; I.4.8

  17. arXiv:2305.08420  [pdf, other]

    cs.CV cs.AI cs.RO eess.IV

    Exploring Few-Shot Adaptation for Activity Recognition on Diverse Domains

    Authors: Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

    Abstract: Domain adaptation is essential for activity recognition to ensure accurate and robust performance across diverse environments, sensor types, and data sources. Unsupervised domain adaptation methods have been extensively studied, yet, they require large-scale unlabeled data from the target domain. In this work, we focus on Few-Shot Domain Adaptation for Activity Recognition (FSDA-AR), which leverag…

    Submitted 27 April, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: The benchmark and source code will be publicly available at https://github.com/KPeng9510/RelaMiX

  18. arXiv:2303.13842  [pdf, other]

    cs.CV cs.RO eess.IV

    FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

    Authors: Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen

    Abstract: This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV). Fisheye cameras have larger FoV than ordinary pinhole cameras, yet its unique special imaging model naturally leads to a blind area at the edge of the image plane. This is suboptimal for safety-critical applic…

    Submitted 20 April, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR OmniCV 2023. Code and datasets will be available at https://github.com/MasterHow/FishDreamer

  19. arXiv:2303.07126  [pdf, ps, other]

    eess.IV cs.CV

    Mirror U-Net: Marrying Multimodal Fission with Multi-task Learning for Semantic Segmentation in Medical Imaging

    Authors: Zdravko Marinov, Simon Reiß, David Kersting, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: Positron Emission Tomography (PET) and Computer Tomography (CT) are routinely used together to detect tumors. PET/CT segmentation models can automate tumor delineation, however, current multimodal models do not fully exploit the complementary information in each modality, as they either concatenate PET and CT data or fuse them at the decision level. To combat this, we propose Mirror U-Net, which r…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: 8 pages; 8 figures; 5 tables

  20. arXiv:2303.00952  [pdf, other]

    cs.CV cs.RO eess.IV

    Towards Activated Muscle Group Estimation in the Wild

    Authors: Kunyu Peng, David Schneider, Alina Roitberg, Kailun Yang, Jiaming Zhang, Chen Deng, Kaiyu Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen

    Abstract: In this paper, we tackle the new task of video-based Activated Muscle Group Estimation (AMGE) aiming at identifying active muscle regions during physical activity in the wild. To this intent, we provide the MuscleMap dataset featuring >15K video clips with 135 different activities and 20 labeled muscle groups. This dataset opens the vistas to multiple video-based applications in sports and rehabil…

    Submitted 27 April, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: The contributed dataset and code will be publicly available at https://github.com/KPeng9510/MuscleMap

  21. arXiv:2209.01112  [pdf, other]

    eess.IV cs.CV

    AutoPET Challenge: Combining nn-Unet with Swin UNETR Augmented by Maximum Intensity Projection Classifier

    Authors: Lars Heiliger, Zdravko Marinov, Max Hasin, André Ferreira, Jana Fragemann, Kelsey Pomykala, Jacob Murray, David Kersting, Victor Alves, Rainer Stiefelhagen, Jan Egger, Jens Kleesiek

    Abstract: Tumor volume and changes in tumor characteristics over time are important biomarkers for cancer therapy. In this context, FDG-PET/CT scans are routinely used for staging and re-staging of cancer, as the radiolabeled fluorodeoxyglucose is taken up in regions of high metabolism. Unfortunately, these regions with high metabolism are not specific to tumors and can also represent physiological uptake b…

    Submitted 14 October, 2022; v1 submitted 2 September, 2022; originally announced September 2022.

    Comments: 11 pages, 2 figures

  22. arXiv:2207.11860  [pdf, other]

    cs.CV cs.RO eess.IV

    Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

    Authors: Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen

    Abstract: In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360° imagery. To tackle these problems, first, we propose the upgraded Transformer for Panoramic Semantic Segmentation, i.e., Trans4PASS+, equipped with Deformable Patch Embedding (DPE)…

    Submitted 31 May, 2024; v1 submitted 24 July, 2022; originally announced July 2022.

    Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Extended version of CVPR 2022 paper arXiv:2203.01452. Code is available at https://github.com/jamycheung/Trans4PASS

  23. arXiv:2206.10711  [pdf, other]

    cs.CV cs.RO eess.IV

    Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning

    Authors: Alexander Jaus, Kailun Yang, Rainer Stiefelhagen

    Abstract: In this work, we introduce panoramic panoptic segmentation, as the most holistic scene understanding, both in terms of Field of View (FoV) and image-level understanding for standard camera-based input. A complete surrounding understanding provides a maximum of information to a mobile agent. This is essential information for any intelligent vehicle to make informed decisions in a safety-critical dy…

    Submitted 27 December, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: Accepted to IEEE Transactions on Intelligent Transportation Systems (T-ITS). Extended version of arXiv:2103.00868. The project is at https://github.com/alexanderjaus/PPS

  24. arXiv:2204.01154  [pdf, other]

    cs.CV cs.HC cs.RO eess.IV

    Indoor Navigation Assistance for Visually Impaired People via Dynamic SLAM and Panoptic Segmentation with an RGB-D Sensor

    Authors: Wenyan Ou, Jiaming Zhang, Kunyu Peng, Kailun Yang, Gerhard Jaworek, Karin Müller, Rainer Stiefelhagen

    Abstract: Exploring an unfamiliar indoor environment and avoiding obstacles is challenging for visually impaired people. Currently, several approaches achieve the avoidance of static obstacles based on the mapping of indoor scenes. To solve the issue of distinguishing dynamic obstacles, we propose an assistive system with an RGB-D sensor to detect dynamic information of a scene. Once the system captures an…

    Submitted 3 April, 2022; originally announced April 2022.

    Comments: Accepted to ICCHP 2022

  25. arXiv:2203.10395  [pdf, other]

    cs.CV cs.RO eess.IV

    Towards Robust Semantic Segmentation of Accident Scenes via Multi-Source Mixed Sampling and Meta-Learning

    Authors: Xinyu Luo, Jiaming Zhang, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen

    Abstract: Autonomous vehicles utilize urban scene segmentation to understand the real world like a human and react accordingly. Semantic segmentation of normal scenes has experienced a remarkable rise in accuracy on conventional benchmarks. However, a significant portion of real-life accidents features abnormal scenes, such as those with object deformations, overturns, and unexpected traffic behaviors. Sinc…

    Submitted 19 March, 2022; originally announced March 2022.

    Comments: Code will be made publicly available at https://github.com/xinyu-laura/MMUDA

  26. arXiv:2203.09645  [pdf, other]

    cs.CV cs.RO eess.IV

    MatchFormer: Interleaving Attention in Transformers for Feature Matching

    Authors: Qing Wang, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

    Abstract: Local feature matching is a computationally intensive task at the subpixel level. While detector-based methods coupled with feature descriptors struggle in low-texture scenes, CNN-based methods with a sequential extract-to-match pipeline, fail to make use of the matching capacity of the encoder and tend to overburden the decoder for matching. In contrast, we propose a novel hierarchical extract-an…

    Submitted 23 September, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted to ACCV 2022. Code is available at https://github.com/jamycheung/MatchFormer

  27. arXiv:2203.04838  [pdf, other]

    cs.CV cs.RO eess.IV

    CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers

    Authors: Jiaming Zhang, Huayao Liu, Kailun Yang, Xinxin Hu, Ruiping Liu, Rainer Stiefelhagen

    Abstract: Scene understanding based on image segmentation is a crucial component of autonomous vehicles. Pixel-wise semantic segmentation of RGB images can be advanced by exploiting complementary features from the supplementary modality (X-modality). However, covering a wide variety of sensors with a modality-agnostic model remains an unresolved problem due to variations in sensor characteristics among diff…

    Submitted 24 November, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: Accepted to IEEE Transactions on Intelligent Transportation Systems (T-ITS). The source code of CMX is publicly available at https://github.com/huaaaliu/RGBX_Semantic_Segmentation

  28. arXiv:2203.01452  [pdf, other]

    cs.CV cs.RO eess.IV

    Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation

    Authors: Jiaming Zhang, Kailun Yang, Chaoxiang Ma, Simon Reiß, Kunyu Peng, Rainer Stiefelhagen

    Abstract: Panoramic images with their 360-degree directional view encompass exhaustive information about the surrounding space, providing a rich foundation for scene understanding. To unfold this potential in the form of robust panoramic segmentation models, large quantities of expensive, pixel-wise annotations are crucial for success. Such annotations are available, but predominantly for narrow-angle, pinh…

    Submitted 17 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR2022. Code will be made publicly available at https://github.com/jamycheung/Trans4PASS

  29. arXiv:2203.00927  [pdf, other]

    cs.CV cs.RO eess.IV

    TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration

    Authors: Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

    Abstract: Traditional video-based human activity recognition has experienced remarkable progress linked to the rise of deep learning, but this effect was slower as it comes to the downstream task of driver behavior understanding. Understanding the situation inside the vehicle cabin is essential for Advanced Driving Assistant System (ADAS) as it enables identifying distraction, predicting driver's intent and…

    Submitted 28 July, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted to IROS 2022. Code is publicly available at https://github.com/KPeng9510/TransDARC

  30. arXiv:2202.13393  [pdf, other]

    cs.CV cs.RO eess.IV

    TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation

    Authors: Ruiping Liu, Kailun Yang, Alina Roitberg, Jiaming Zhang, Kunyu Peng, Huayao Liu, Yaonan Wang, Rainer Stiefelhagen

    Abstract: Semantic segmentation benchmarks in the realm of autonomous driving are dominated by large pre-trained transformers, yet their widespread adoption is impeded by substantial computational costs and prolonged training durations. To lift this constraint, we look at efficient semantic segmentation from a perspective of comprehensive knowledge distillation and consider to bridge the gap between multi-s…

    Submitted 24 December, 2023; v1 submitted 27 February, 2022; originally announced February 2022.

    Comments: The source code is publicly available at https://github.com/RuipingL/TransKD

  31. arXiv:2112.00735  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Reference-guided Pseudo-Label Generation for Medical Semantic Segmentation

    Authors: Constantin Seibold, Simon Reiß, Jens Kleesiek, Rainer Stiefelhagen

    Abstract: Producing densely annotated data is a difficult and tedious task for medical imaging applications. To address this problem, we propose a novel approach to generate supervision for semi-supervised semantic segmentation. We argue that visually similar regions between labeled and unlabeled images likely contain the same semantics and therefore should share their label. Following this thought, we use…

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 36th AAAI Conference on Artificial Intelligence 2022

    MSC Class: 68T07; 68T45 ACM Class: I.5.4

  32. arXiv:2110.11062  [pdf, other]

    cs.CV cs.RO eess.IV

    Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation

    Authors: Jiaming Zhang, Chaoxiang Ma, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen

    Abstract: Autonomous vehicles clearly benefit from the expanded Field of View (FoV) of 360-degree sensors, but modern semantic segmentation approaches rely heavily on annotated training data which is rarely available for panoramic images. We look at this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting, where labelled training data originates from a di…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted to IEEE Transactions on Intelligent Transportation Systems (IEEE T-ITS). Dataset and code will be made publicly available at https://github.com/chma1024/DensePASS. arXiv admin note: substantial text overlap with arXiv:2108.06383

  33. arXiv:2108.07007  [pdf, other]

    cs.CV cs.HC cs.RO eess.IV

    Flying Guide Dog: Walkable Path Discovery for the Visually Impaired Utilizing Drones and Transformer-based Semantic Segmentation

    Authors: Haobin Tan, Chang Chen, Xinyu Luo, Jiaming Zhang, Constantin Seibold, Kailun Yang, Rainer Stiefelhagen

    Abstract: Lacking the ability to sense ambient environments effectively, blind and visually impaired people (BVIP) face difficulty in walking outdoors, especially in urban areas. Therefore, tools for assisting BVIP are of great importance. In this paper, we propose a novel "flying guide dog" prototype for BVIP assistance using drone and street view semantic segmentation. Based on the walkable areas extracte…

    Submitted 16 August, 2021; originally announced August 2021.

    Comments: Code, dataset, and video demo will be made publicly available at https://github.com/EckoTan0804/flying-guide-dog

  34. arXiv:2108.06383  [pdf, other]

    cs.CV cs.RO eess.IV

    DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange

    Authors: Chaoxiang Ma, Jiaming Zhang, Kailun Yang, Alina Roitberg, Rainer Stiefelhagen

    Abstract: Intelligent vehicles clearly benefit from the expanded Field of View (FoV) of the 360-degree sensors, but the vast majority of available semantic segmentation training images are captured with pinhole cameras. In this work, we look at this problem through the lens of domain adaptation and bring panoramic semantic segmentation to a setting, where labelled training data originates from a different d…

    Submitted 13 August, 2021; originally announced August 2021.

    Comments: Accepted to IEEE ITSC 2021. Dataset and code will be made publicly available at https://github.com/chma1024/DensePASS

  35. arXiv:2103.05687  [pdf, other]

    cs.CV cs.RO eess.IV

    Capturing Omni-Range Context for Omnidirectional Segmentation

    Authors: Kailun Yang, Jiaming Zhang, Simon Reiß, Xinxin Hu, Rainer Stiefelhagen

    Abstract: Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving. Enabling an all-encompassing view of street-scenes, omnidirectional cameras present themselves as a perfect fit in such systems. Most segmentation models for parsing urban environments operate on common, narrow Field of View (FoV) images. Transferring these model…

    Submitted 9 March, 2021; originally announced March 2021.

    Comments: Accepted to CVPR2021

  36. arXiv:2103.04136  [pdf, other]

    cs.CV cs.RO eess.IV

    Perception Framework through Real-Time Semantic Segmentation and Scene Recognition on a Wearable System for the Visually Impaired

    Authors: Yingzhi Zhang, Haoye Chen, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

    Abstract: As the scene information, including objectness and scene type, are important for people with visual impairment, in this work we present a multi-task efficient perception system for the scene parsing and recognition tasks. Building on the compact ResNet backbone, our designed network architecture has two paths with shared parameters. In the structure, the semantic segmentation path integrates fast…

    Submitted 6 March, 2021; originally announced March 2021.

    Comments: 6 pages, 7 figures, 2 tables

  37. arXiv:2103.04128  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    Panoptic Lintention Network: Towards Efficient Navigational Perception for the Visually Impaired

    Authors: Wei Mao, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

    Abstract: Classic computer vision algorithms, instance segmentation, and semantic segmentation can not provide a holistic understanding of the surroundings for the visually impaired. In this paper, we utilize panoptic segmentation to assist the navigation of visually impaired people by offering both things and stuff awareness in the proximity of the visually impaired efficiently. To this end, we propose an…

    Submitted 6 March, 2021; originally announced March 2021.

    Comments: 6 pages, 4 figures, 2 tables

  38. arXiv:2103.00879  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    DR-TANet: Dynamic Receptive Temporal Attention Network for Street Scene Change Detection

    Authors: Shuo Chen, Kailun Yang, Rainer Stiefelhagen

    Abstract: Street scene change detection continues to capture researchers' interests in the computer vision community. It aims to identify the changed regions of the paired street-view images captured at different times. The state-of-the-art network based on the encoder-decoder architecture leverages the feature maps at the corresponding level between two channels to gain sufficient information of changes. S…

    Submitted 28 May, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: 8 pages, 9 figures, 6 tables. Accepted to IEEE Intelligent Vehicles Symposium 2021 (IV2021). Code is available at https://github.com/Herrccc/DR-TANet

  39. arXiv:2103.00868  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning

    Authors: Alexander Jaus, Kailun Yang, Rainer Stiefelhagen

    Abstract: In this work, we introduce panoramic panoptic segmentation as the most holistic scene understanding both in terms of field of view and image level understanding for standard camera based input. A complete surrounding understanding provides a maximum of information to the agent, which is essential for any intelligent vehicle in order to make informed decisions in a safety-critical dynamic environme…

    Submitted 28 May, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: 7 pages, 4 figures, 2 tables. Accepted to 2021 IEEE Intelligent Vehicles Symposium (IV2021). The project is at https://github.com/alexanderjaus/PPS

  40. Prediction of low-keV monochromatic images from polyenergetic CT scans for improved automatic detection of pulmonary embolism

    Authors: Constantin Seibold, Matthias A. Fink, Charlotte Goos, Hans-Ulrich Kauczor, Heinz-Peter Schlemmer, Rainer Stiefelhagen, Jens Kleesiek

    Abstract: Detector-based spectral computed tomography is a recent dual-energy CT (DECT) technology that offers the possibility of obtaining spectral information. From this spectral data, different types of images can be derived, amongst others virtual monoenergetic (monoE) images. MonoE images potentially exhibit decreased artifacts, improve contrast, and overall contain lower noise values, making them idea…

    Submitted 23 February, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

    Comments: 4 pages, ISBI 2021

    MSC Class: 92C55 68T07

  41. Self-Guided Multiple Instance Learning for Weakly Supervised Disease Classification and Localization in Chest Radiographs

    Authors: Constantin Seibold, Jens Kleesiek, Heinz-Peter Schlemmer, Rainer Stiefelhagen

    Abstract: The lack of fine-grained annotations hinders the deployment of automated diagnosis systems, which require human-interpretable justification for their decision process. In this paper, we address the problem of weakly supervised identification and localization of abnormalities in chest radiographs. To that end, we introduce a novel loss function for training convolutional neural networks increasing…

    Submitted 30 September, 2020; originally announced October 2020.

    ACM Class: I.4.0

  42. arXiv:1909.07721  [pdf, other]

    cs.CV cs.RO eess.IV eess.SP

    DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation through SwaftNet for Surrounding Sensing

    Authors: Kailun Yang, Xinxin Hu, Hao Chen, Kaite Xiang, Kaiwei Wang, Rainer Stiefelhagen

    Abstract: Semantically interpreting the traffic scene is crucial for autonomous transportation and robotics systems. However, state-of-the-art semantic segmentation pipelines are dominantly designed to work with pinhole cameras and train with narrow Field-of-View (FoV) images. In this sense, the perception capacity is severely limited to offer higher-level confidence for upstream navigation tasks. In this p…

    Submitted 7 February, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: 8 pages, 10 figures

  43. arXiv:1908.00274  [pdf, other]

    cs.CV cs.LG eess.IV

    Content and Colour Distillation for Learning Image Translations with the Spatial Profile Loss

    Authors: M. Saquib Sarfraz, Constantin Seibold, Haroon Khalid, Rainer Stiefelhagen

    Abstract: Generative adversarial networks have emerged as a de facto standard for image translation problems. To successfully drive such models, one has to rely on additional networks e.g., discriminators and/or perceptual networks. Training these networks with pixel-based losses alone is generally not sufficient to learn the target distribution. In this paper, we propose a novel method of computing the loss…

    Submitted 1 August, 2019; originally announced August 2019.

    Comments: BMVC 2019

  44. arXiv:1607.05485  [pdf, other]

    eess.SY

    Exact Maximum Entropy Inverse Optimal Control for Modelling Human Attention Switching and Control

    Authors: Felix Schmitt, Hans-Joachim Bieg, Dietrich Manstetten, Michael Herman, Rainer Stiefelhagen

    Abstract: Maximum Causal Entropy (MCE) Inverse Optimal Control (IOC) has become an effective tool for modelling human behaviour in many control tasks. Its advantage over classic techniques for estimating human policies is the transferability of the inferred objectives: Behaviour can be predicted in variations of the control task by policy computation using a relaxed optimality criterion. However, exact poli…

    Submitted 19 July, 2016; originally announced July 2016.

    Comments: accepted for publication on IEEE Conference on Systems, Man and Cybernetics 2016

    ACM Class: I.2.6; I.2.8; J.4

  45. Predicting Lane Keeping Behavior of Visually Distracted Drivers Using Inverse Suboptimal Control

    Authors: Felix Schmitt, Hans-Joachim Bieg, Dietrich Manstetten, Michael Herman, Rainer Stiefelhagen

    Abstract: Driver distraction strongly contributes to crash-risk. Therefore, assistance systems that warn the driver if her distraction poses a hazard to road safety, promise a great safety benefit. Current approaches either seek to detect critical situations using environmental sensors or estimate a driver's attention state solely from her behavior. However, this neglects that driving situation, driver defi…

    Submitted 29 April, 2016; v1 submitted 13 April, 2016; originally announced April 2016.

    Comments: 7 pages, 6 figures, accepted for 2016 IEEE Intelligent Vehicles Symposium

    ACM Class: I.2.8