Showing 1–26 of 26 results for author: Wandt, B

  1. arXiv:2406.00637  [pdf, other]

    cs.CV cs.AI cs.GR

    Representing Animatable Avatar via Factorized Neural Fields

    Authors: Chunjie Song, Zhijie Wu, Bastian Wandt, Leonid Sigal, Helge Rhodin

    Abstract: For reconstructing high-fidelity human 3D models from monocular videos, it is crucial to maintain consistent large-scale body shapes along with finely matched subtle wrinkles. This paper explores the observation that the per-frame rendering results can be factorized into a pose-independent component and a corresponding pose-dependent equivalent to facilitate frame consistency. Pose adaptive textur…

    Submitted 2 June, 2024; originally announced June 2024.

  2. arXiv:2405.06845  [pdf, other]

    cs.CV

    CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras

    Authors: James Tang, Shashwat Suri, Daniel Ajisafe, Bastian Wandt, Helge Rhodin

    Abstract: It is now possible to estimate 3D human pose from monocular images with off-the-shelf 3D pose estimators. However, many practical applications require fine-grained absolute pose information for which multi-view cues and camera calibration are necessary. Such multi-view recordings are laborious because they require manual calibration, and are expensive when using dedicated hardware. Our goal is ful…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to the 18th IEEE International Conference on Automatic Face and Gesture Recognition

  3. arXiv:2403.05327  [pdf, other]

    cs.CV

    DiffSF: Diffusion Models for Scene Flow Estimation

    Authors: Yushan Zhang, Bastian Wandt, Maria Magnusson, Michael Felsberg

    Abstract: Scene flow estimation is an essential ingredient for a variety of real-world applications, especially for autonomous agents, such as self-driving cars and robots. While recent scene flow estimation approaches achieve a reasonable accuracy, their applicability to real-world systems additionally benefits from a reliability measure. Aiming at improving accuracy while additionally providing an estimat…

    Submitted 14 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  4. arXiv:2311.04765  [pdf, other]

    cs.RO cs.AI cs.LG

    The voraus-AD Dataset for Anomaly Detection in Robot Applications

    Authors: Jan Thieß Brockmann, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt

    Abstract: During the operation of industrial robots, unusual events may endanger the safety of humans and the quality of production. When collecting data to detect such cases, it is not ensured that data from all potentially occurring errors is included as unforeseeable events may happen over time. Therefore, anomaly detection (AD) delivers a practical solution, using only normal data to learn to detect unu…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 14 pages, 14 figures, accepted to Transactions on Robotics

  5. arXiv:2309.04750  [pdf, other]

    cs.CV

    Mirror-Aware Neural Humans

    Authors: Daniel Ajisafe, James Tang, Shih-Yang Su, Bastian Wandt, Helge Rhodin

    Abstract: Human motion capture either requires multi-camera systems or is unreliable when using single-view input due to depth ambiguities. Meanwhile, mirrors are readily available in urban environments and form an affordable alternative by recording two views with only a single camera. However, the mirror setting poses the additional challenge of handling occlusions of real and mirror image. Going beyond e…

    Submitted 15 May, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

    Comments: The 11th International Conference on 3D Vision (3DV 2024). Project website: https://danielajisafe.github.io/mirror-aware-neural-humans/

  6. arXiv:2308.11951  [pdf, other]

    cs.CV cs.AI cs.GR

    Pose Modulated Avatars from Video

    Authors: Chunjie Song, Bastian Wandt, Helge Rhodin

    Abstract: It is now possible to reconstruct dynamic human motion and shape from a sparse set of cameras using Neural Radiance Fields (NeRF) driven by an underlying skeleton. However, a challenge remains to model the deformation of cloth and skin in relation to skeleton pose. Unlike existing avatar models that are learned implicitly or rely on a proxy surface, our approach is motivated by the observation tha…

    Submitted 29 September, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

  7. arXiv:2306.00560  [pdf, other]

    cs.LG stat.ML

    Hinge-Wasserstein: Estimating Multimodal Aleatoric Uncertainty in Regression Tasks

    Authors: Ziliang Xiong, Arvi Jonnarth, Abdelrahman Eldesokey, Joakim Johnander, Bastian Wandt, Per-Erik Forssén

    Abstract: Computer vision systems that are deployed in safety-critical applications need to quantify their output uncertainty. We study regression from images to parameter values and here it is common to detect uncertainty by predicting probability distributions. In this context, we investigate the regression-by-classification paradigm which can represent multimodal distributions, without a prior assumption…

    Submitted 21 June, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  8. arXiv:2305.17432  [pdf, other]

    cs.CV

    GMSF: Global Matching Scene Flow

    Authors: Yushan Zhang, Johan Edstedt, Bastian Wandt, Per-Erik Forssén, Maria Magnusson, Michael Felsberg

    Abstract: We tackle the task of scene flow estimation from point clouds. Given a source and a target point cloud, the objective is to estimate a translation from each point in the source point cloud to the target, resulting in a 3D motion vector field. Previous dominant scene flow estimation methods require complicated coarse-to-fine or recurrent architectures as a multi-stage refinement. In contrast, we pr…

    Submitted 30 October, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

  9. arXiv:2211.16487  [pdf, other]

    cs.CV

    DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models

    Authors: Karl Holmquist, Bastian Wandt

    Abstract: Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and induces multiple plausible solutions for the 2D-3D lifting step which results in overly confident 3D pose predictors. To this end, we propose DiffPose, a conditional diffusion model, that predicts m…

    Submitted 29 November, 2022; originally announced November 2022.

  10. arXiv:2210.07829  [pdf, other]

    cs.LG cs.AI cs.CV

    Asymmetric Student-Teacher Networks for Industrial Anomaly Detection

    Authors: Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, Bastian Wandt

    Abstract: Industrial defect detection is commonly addressed with anomaly detection (AD) methods where no or only incomplete data of potentially occurring defects is available. This work discovers previously unknown problems of student-teacher approaches for AD and proposes a solution, where two neural networks are trained to produce the same output for the defect-free training examples. The core assumption…

    Submitted 18 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: accepted to WACV 2023

  11. arXiv:2205.10636  [pdf, other]

    cs.CV

    AutoLink: Self-supervised Learning of Human Skeletons and Object Outlines by Linking Keypoints

    Authors: Xingzhe He, Bastian Wandt, Helge Rhodin

    Abstract: Structured representations such as keypoints are widely used in pose transfer, conditional image generation, animation, and 3D reconstruction. However, their supervised learning requires expensive annotation for each target domain. We propose a self-supervised method that learns to disentangle object structure from the appearance with a graph of 2D keypoints linked by straight edges. Both the keyp…

    Submitted 23 March, 2023; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

    Journal ref: Advances in Neural Information Processing Systems 2022

  12. arXiv:2205.03448  [pdf, other]

    cs.CV

    LatentKeypointGAN: Controlling Images via Latent Keypoints -- Extended Abstract

    Authors: Xingzhe He, Bastian Wandt, Helge Rhodin

    Abstract: Generative adversarial networks (GANs) can now generate photo-realistic images. However, how to best control the image content remains an open challenge. We introduce LatentKeypointGAN, a two-stage GAN internally conditioned on a set of keypoints and associated appearance embeddings providing control of the position and style of the generated objects and their respective parts. A major difficulty…

    Submitted 17 May, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2103.15812

    Journal ref: CVPR Workshop 2022

  13. arXiv:2112.12193  [pdf, other]

    cs.CV

    Improved 2D Keypoint Detection in Out-of-Balance and Fall Situations -- combining input rotations and a kinematic model

    Authors: Michael Zwölfer, Dieter Heinrich, Kurt Schindelwig, Bastian Wandt, Helge Rhodin, Joerg Spoerri, Werner Nachbauer

    Abstract: Injury analysis may be one of the most beneficial applications of deep learning based human pose estimation. To facilitate further research on this topic, we provide an injury specific 2D dataset for alpine skiing, covering in total 533 images. We further propose a post processing routine, that combines rotational information with a simple kinematic model. We could improve detection results in fal…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: extended abstract, 4 pages, 3 figures, 2 tables

  14. arXiv:2112.11593  [pdf, other]

    cs.CV

    AdaptPose: Cross-Dataset Adaptation for 3D Human Pose Estimation by Learnable Motion Generation

    Authors: Mohsen Gholami, Bastian Wandt, Helge Rhodin, Rabab Ward, Z. Jane Wang

    Abstract: This paper addresses the problem of cross-dataset generalization of 3D human pose estimation models. Testing a pre-trained 3D pose estimator on a new dataset results in a major performance drop. Previous methods have mainly addressed this problem by improving the diversity of the training data. We argue that diversity alone is not sufficient and that the characteristics of the training data need t…

    Submitted 15 March, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

  15. arXiv:2112.07088  [pdf, other]

    cs.CV

    ElePose: Unsupervised 3D Human Pose Estimation by Predicting Camera Elevation and Learning Normalizing Flows on 2D Poses

    Authors: Bastian Wandt, James J. Little, Helge Rhodin

    Abstract: Human pose estimation from single images is a challenging problem that is typically solved by supervised learning. Unfortunately, labeled training data does not yet exist for many human activities since 3D annotation requires dedicated motion capture systems. Therefore, we propose an unsupervised approach that learns to predict a 3D human pose from a single image while only being trained with 2D p…

    Submitted 13 December, 2021; originally announced December 2021.

  16. arXiv:2112.01036  [pdf, other]

    cs.CV

    GANSeg: Learning to Segment by Unsupervised Hierarchical Image Generation

    Authors: Xingzhe He, Bastian Wandt, Helge Rhodin

    Abstract: Segmenting an image into its parts is a frequent preprocess for high-level vision tasks such as image editing. However, annotating masks for supervised training is expensive. Weakly-supervised and unsupervised methods exist, but they depend on the comparison of pairs of images, such as from multi-views, frames of videos, and image augmentation, which limits their applicability. To address this, we…

    Submitted 8 October, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: CVPR 2022

  17. arXiv:2110.02855  [pdf, other]

    cs.CV

    Fully Convolutional Cross-Scale-Flows for Image-based Defect Detection

    Authors: Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, Bastian Wandt

    Abstract: In industrial manufacturing processes, errors frequently occur at unpredictable times and in unknown manifestations. We tackle the problem of automatic defect detection without requiring any image samples of defective parts. Recent works model the distribution of defect-free image data, using either strong statistical priors or overly simplified data representations. In contrast, our approach hand…

    Submitted 6 October, 2021; originally announced October 2021.

  18. arXiv:2107.13788  [pdf, other]

    cs.CV

    Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows

    Authors: Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt

    Abstract: 3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and only estimate a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing flow based method that exp…

    Submitted 2 August, 2021; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: Accepted to ICCV 2021

  19. arXiv:2103.15812  [pdf, other]

    cs.CV

    LatentKeypointGAN: Controlling GANs via Latent Keypoints

    Authors: Xingzhe He, Bastian Wandt, Helge Rhodin

    Abstract: Generative adversarial networks (GANs) have attained photo-realistic quality in image generation. However, how to best control the image content remains an open challenge. We introduce LatentKeypointGAN, a two-stage GAN which is trained end-to-end on the classical GAN objective with internal conditioning on a set of space keypoints. These keypoints have associated appearance embeddings that respec…

    Submitted 8 June, 2023; v1 submitted 29 March, 2021; originally announced March 2021.

    Journal ref: CRV 2023

  20. arXiv:2012.13341  [pdf, other]

    cs.HC cs.CV cs.LG cs.SD eess.AS

    AudioViewer: Learning to Visualize Sounds

    Authors: Chunjie Song, Yuchi Zhang, Willis Peng, Parmis Mohaghegh, Bastian Wandt, Helge Rhodin

    Abstract: A long-standing goal in the field of sensory substitution is to enable sound perception for deaf and hard of hearing (DHH) people by visualizing audio content. Different from existing models that translate to hand sign language, between speech and text, or text and images, we target immediate and low-level audio to video translation that applies to generic environment sounds as well as human speec…

    Submitted 10 November, 2022; v1 submitted 22 December, 2020; originally announced December 2020.

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 2206-2216

  21. arXiv:2011.14679  [pdf, other]

    cs.CV

    CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild

    Authors: Bastian Wandt, Marco Rudolph, Petrissa Zell, Helge Rhodin, Bodo Rosenhahn

    Abstract: Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g. outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single…

    Submitted 30 November, 2020; originally announced November 2020.

  22. arXiv:2008.12577  [pdf, other]

    cs.CV cs.LG eess.IV

    Same Same But DifferNet: Semi-Supervised Defect Detection with Normalizing Flows

    Authors: Marco Rudolph, Bastian Wandt, Bodo Rosenhahn

    Abstract: The detection of manufacturing errors is crucial in fabrication processes to ensure product quality and safety standards. Since many defects occur very rarely and their characteristics are mostly unknown a priori, their detection is still an open research question. To this end, we propose DifferNet: It leverages the descriptiveness of features extracted by convolutional neural networks to estimate…

    Submitted 28 August, 2020; originally announced August 2020.

  23. arXiv:2007.08969  [pdf, other]

    cs.CV

    Weakly-supervised Learning of Human Dynamics

    Authors: Petrissa Zell, Bodo Rosenhahn, Bastian Wandt

    Abstract: This paper proposes a weakly-supervised learning framework for dynamics estimation from human motion. Although there are many solutions to capture pure human motion readily available, their data is not sufficient to analyze quality and efficiency of movements. Instead, the forces and moments driving human motion (the dynamics) need to be considered. Since recording dynamics is a laborious task tha…

    Submitted 23 April, 2021; v1 submitted 17 July, 2020; originally announced July 2020.

  24. arXiv:1908.02626  [pdf, other]

    cs.LG cs.CV stat.ML

    Structuring Autoencoders

    Authors: Marco Rudolph, Bastian Wandt, Bodo Rosenhahn

    Abstract: In this paper we propose Structuring AutoEncoders (SAE). SAEs are neural networks which learn a low dimensional representation of data which are additionally enriched with a desired structure in this low dimensional space. While traditional Autoencoders have proven to structure data naturally they fail to discover semantic structure that is hard to recognize in the raw data. The SAE solves the pro…

    Submitted 7 August, 2019; originally announced August 2019.

  25. arXiv:1902.09868  [pdf, other]

    cs.CV

    RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation

    Authors: Bastian Wandt, Bodo Rosenhahn

    Abstract: This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied a…

    Submitted 12 March, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

    Comments: accepted to CVPR 2019

  26. arXiv:1702.00186  [pdf, other]

    cs.CV

    A Kinematic Chain Space for Monocular Motion Capture

    Authors: Bastian Wandt, Hanno Ackermann, Bodo Rosenhahn

    Abstract: This paper deals with motion capture of kinematic chains (e.g. human skeletons) from monocular image sequences taken by uncalibrated cameras. We present a method based on projecting an observation into a kinematic chain space (KCS). An optimization of the nuclear norm is proposed that implicitly enforces structural properties of the kinematic chain. Unlike other approaches our method does not requ…

    Submitted 1 February, 2017; originally announced February 2017.