
Showing 1–15 of 15 results for author: Baradel, F

Searching in archive cs.
  1. arXiv:2406.00636

    cs.CV

    T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences

    Authors: Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, Gregory Rogez

    Abstract: In this paper, we address the challenging problem of long-term 3D human motion generation. Specifically, we aim to generate a long sequence of smoothly connected actions from a stream of multiple sentences (i.e., a paragraph). Previous long-term motion generating approaches were mostly based on recurrent methods, using previously generated motion chunks as input for the next step. However, this appr…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 HuMoGen Workshop

  2. arXiv:2404.12942

    cs.CV

    Purposer: Putting Human Motion Generation in Context

    Authors: Nicolas Ugrinovic, Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Gregory Rogez, Francesc Moreno-Noguer

    Abstract: We present a novel method to generate human motion to populate 3D indoor scenes. It can be controlled with various combinations of conditioning signals such as a path in a scene, target poses, past motions, and scenes represented as 3D point clouds. State-of-the-art methods are either models specialized to one single setting, require vast amounts of high-quality and diverse training data, or are u…

    Submitted 19 April, 2024; originally announced April 2024.

  3. arXiv:2402.14654

    cs.CV

    Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot

    Authors: Fabien Baradel, Matthieu Armando, Salma Galaaoui, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez, Thomas Lucas

    Abstract: We present Multi-HMR, a strong single-shot model for multi-person 3D human mesh recovery from a single RGB image. Predictions encompass the whole body, i.e., including hands and facial expressions, using the SMPL-X parametric model and spatial location in the camera coordinate system. Our model detects people by predicting coarse 2D heatmaps of person centers, using features produced by a standard…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: https://github.com/naver/multi-hmr

  4. arXiv:2311.09104

    cs.CV

    Cross-view and Cross-pose Completion for 3D Human Understanding

    Authors: Matthieu Armando, Salma Galaaoui, Fabien Baradel, Thomas Lucas, Vincent Leroy, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez

    Abstract: Human perception and understanding is a major domain of computer vision which, like many other vision subdomains recently, stands to gain from the use of large models pre-trained on large datasets. We hypothesize that the most common pre-training strategy of relying on general purpose, object-centric image datasets such as ImageNet, is limited by an important domain shift. On the other hand, colle…

    Submitted 18 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: CVPR 2024

  5. arXiv:2309.10748

    cs.CV cs.AI cs.GR cs.LG cs.RO

    SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction

    Authors: Anilkumar Swamy, Vincent Leroy, Philippe Weinzaepfel, Fabien Baradel, Salma Galaaoui, Romain Bregier, Matthieu Armando, Jean-Sebastien Franco, Gregory Rogez

    Abstract: Recent hand-object interaction datasets show limited real object variability and rely on fitting the MANO parametric model to obtain groundtruth hand shapes. To go beyond these limitations and spur further research, we introduce the SHOWMe dataset which consists of 96 videos, annotated with real and detailed hand-object 3D textured meshes. Following recent work, we consider a rigid hand-object sce…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Paper and Appendix, accepted at the ACVR workshop at ICCV

  6. arXiv:2210.10542

    cs.CV

    PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting

    Authors: Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez

    Abstract: We address the problem of action-conditioned generation of human motion sequences. Existing work falls into two categories: forecast models conditioned on observed past motions, or generative models conditioned on action labels and duration only. In contrast, we generate motion conditioned on observations of arbitrary length, including none. To solve this generalized problem, we propose PoseGPT, a…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: ECCV'22 Conference paper

  7. arXiv:2208.10211

    cs.CV

    PoseBERT: A Generic Transformer Module for Temporal 3D Human Modeling

    Authors: Fabien Baradel, Romain Brégier, Thibault Groueix, Philippe Weinzaepfel, Yannis Kalantidis, Grégory Rogez

    Abstract: Training state-of-the-art models for human pose estimation in videos requires datasets with annotations that are really hard and expensive to obtain. Although transformers have been recently utilized for body pose sequence modeling, related methods rely on pseudo-ground truth to augment the currently limited training data available for learning such models. In this paper, we introduce PoseBERT, a…

    Submitted 19 October, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: Accepted to TPAMI 2022

  8. arXiv:2202.00368

    cs.CV cs.LG

    Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space

    Authors: Steeven Janny, Fabien Baradel, Natalia Neverova, Madiha Nadri, Greg Mori, Christian Wolf

    Abstract: Learning causal relationships in high-dimensional data (images, videos) is a hard task, as they are often defined on low dimensional manifolds and must be extracted from complex signals dominated by appearance, lighting, textures and also spurious correlations in the data. We present a method for learning counterfactual reasoning of physical processes in pixel space, which requires the prediction…

    Submitted 1 February, 2022; originally announced February 2022.

    Journal ref: International Conference on Learning Representations (2022)

  9. arXiv:2110.09243

    cs.CV

    Leveraging MoCap Data for Human Mesh Recovery

    Authors: Fabien Baradel, Thibault Groueix, Philippe Weinzaepfel, Romain Brégier, Yannis Kalantidis, Grégory Rogez

    Abstract: Training state-of-the-art models for human body pose and shape recovery from images or videos requires datasets with corresponding annotations that are really hard and expensive to obtain. Our goal in this paper is to study whether poses from 3D Motion Capture (MoCap) data can be used to improve image-based and video-based human mesh recovery methods. We find that fine-tuning image-based models with…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 3DV 2021

  10. arXiv:1909.12000

    cs.CV

    CoPhy: Counterfactual Learning of Physical Dynamics

    Authors: Fabien Baradel, Natalia Neverova, Julien Mille, Greg Mori, Christian Wolf

    Abstract: Understanding causes and effects in mechanical systems is an essential component of reasoning in the physical world. This work poses a new problem of counterfactual learning of object mechanics from visual input. We develop the CoPhy benchmark to assess the capacity of the state-of-the-art models for causal physical reasoning in a synthetic 3D environment and propose a model for learning the physi…

    Submitted 7 April, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: ICLR 2020 - Spotlight presentation

  11. arXiv:1906.05743

    cs.LG cs.CV stat.ML

    Learning Video Representations using Contrastive Bidirectional Transformer

    Authors: Chen Sun, Fabien Baradel, Kevin Murphy, Cordelia Schmid

    Abstract: This paper proposes a self-supervised learning approach for video features that results in significantly improved performance on downstream tasks (such as video classification, captioning and segmentation) compared to existing methods. Our method extends the BERT model for text sequences to the case of sequences of real-valued feature vectors, by replacing the softmax loss with noise contrastive e…

    Submitted 27 September, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

  12. arXiv:1806.06157

    cs.CV

    Object Level Visual Reasoning in Videos

    Authors: Fabien Baradel, Natalia Neverova, Christian Wolf, Julien Mille, Greg Mori

    Abstract: Human activity recognition is typically addressed by detecting key concepts like global and local motion, features related to object classes present in the scene, as well as features related to the global context. The next open challenges in activity recognition require a level of understanding that pushes beyond this and call for models with capabilities for fine distinction and detailed comprehe…

    Submitted 20 September, 2018; v1 submitted 15 June, 2018; originally announced June 2018.

    Comments: Accepted at ECCV 2018 - long version (16 pages + ref)

    Journal ref: ECCV 2018

  13. arXiv:1802.07898

    cs.CV

    Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points

    Authors: Fabien Baradel, Christian Wolf, Julien Mille, Graham W. Taylor

    Abstract: We propose a method for human activity recognition from RGB data that does not rely on any pose information during test time and does not explicitly calculate pose information internally. Instead, a visual attention module learns to predict glimpse sequences in each frame. These glimpses correspond to interest points in the scene that are relevant to the classified activities. No spatial coherence…

    Submitted 21 August, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: CVPR 2018 - project page: https://fabienbaradel.github.io/cvpr18_glimpseclouds/

    Journal ref: CVPR 2018

  14. arXiv:1712.08002

    cs.CV

    Human Action Recognition: Pose-based Attention draws focus to Hands

    Authors: Fabien Baradel, Christian Wolf, Julien Mille

    Abstract: We propose a new spatio-temporal attention-based mechanism for human action recognition able to automatically attend to the hands most involved in the studied action and detect the most discriminative moments in an action. Attention is handled in a recurrent manner employing a Recurrent Neural Network (RNN) and is fully differentiable. In contrast to standard soft-attention-based mechanisms, our a…

    Submitted 20 December, 2017; originally announced December 2017.

    Comments: ICCV 2017 Workshop "Hands in action". arXiv admin note: text overlap with arXiv:1703.10106

    Journal ref: ICCV 2017

  15. arXiv:1703.10106

    cs.CV

    Pose-conditioned Spatio-Temporal Attention for Human Action Recognition

    Authors: Fabien Baradel, Christian Wolf, Julien Mille

    Abstract: We address human action recognition from multi-modal video data involving articulated pose and RGB frames and propose a two-stream approach. The pose stream is processed with a convolutional model taking as input a 3D tensor holding data from a sub-sequence. A specific joint ordering, which respects the topology of the human body, ensures that different convolutional layers correspond to meaningfu…

    Submitted 6 August, 2017; v1 submitted 29 March, 2017; originally announced March 2017.

    Comments: 10 pages, project page: https://fabienbaradel.github.io/pose_rgb_attention_human_action