Showing 1–17 of 17 results for author: Lucas, T

  1. arXiv:2406.00636  [pdf, other]

    cs.CV

    T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences

    Authors: Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, Gregory Rogez

    Abstract: In this paper, we address the challenging problem of long-term 3D human motion generation. Specifically, we aim to generate a long sequence of smoothly connected actions from a stream of multiple sentences (i.e., paragraph). Previous long-term motion generating approaches were mostly based on recurrent methods, using previously generated motion chunks as input for the next step. However, this appr…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 HuMoGen Workshop

  2. arXiv:2404.12942  [pdf, other]

    cs.CV

    Purposer: Putting Human Motion Generation in Context

    Authors: Nicolas Ugrinovic, Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Gregory Rogez, Francesc Moreno-Noguer

    Abstract: We present a novel method to generate human motion to populate 3D indoor scenes. It can be controlled with various combinations of conditioning signals such as a path in a scene, target poses, past motions, and scenes represented as 3D point clouds. State-of-the-art methods are either models specialized to one single setting, require vast amounts of high-quality and diverse training data, or are u…

    Submitted 19 April, 2024; originally announced April 2024.

  3. arXiv:2402.14654  [pdf, other]

    cs.CV

    Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot

    Authors: Fabien Baradel, Matthieu Armando, Salma Galaaoui, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez, Thomas Lucas

    Abstract: We present Multi-HMR, a strong single-shot model for multi-person 3D human mesh recovery from a single RGB image. Predictions encompass the whole body, i.e., including hands and facial expressions, using the SMPL-X parametric model and spatial location in the camera coordinate system. Our model detects people by predicting coarse 2D heatmaps of person centers, using features produced by a standard…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: https://github.com/naver/multi-hmr

  4. arXiv:2311.09104  [pdf, other]

    cs.CV

    Cross-view and Cross-pose Completion for 3D Human Understanding

    Authors: Matthieu Armando, Salma Galaaoui, Fabien Baradel, Thomas Lucas, Vincent Leroy, Romain Brégier, Philippe Weinzaepfel, Grégory Rogez

    Abstract: Human perception and understanding is a major domain of computer vision which, like many other vision subdomains recently, stands to gain from the use of large models pre-trained on large datasets. We hypothesize that the most common pre-training strategy of relying on general purpose, object-centric image datasets such as ImageNet, is limited by an important domain shift. On the other hand, colle…

    Submitted 18 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: CVPR 2024

  5. arXiv:2310.00632  [pdf, other]

    cs.CV

    Win-Win: Training High-Resolution Vision Transformers from Two Windows

    Authors: Vincent Leroy, Jerome Revaud, Thomas Lucas, Philippe Weinzaepfel

    Abstract: Transformers have become the standard in state-of-the-art vision architectures, achieving impressive performance on both image-level and dense pixelwise tasks. However, training vision transformers for high-resolution pixelwise tasks has a prohibitive cost. Typical solutions boil down to hierarchical architectures, fast and approximate attention, or training on low-resolution crops. This latter so…

    Submitted 22 March, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  6. arXiv:2211.10408  [pdf, other]

    cs.CV

    CroCo v2: Improved Cross-view Completion Pre-training for Stereo Matching and Optical Flow

    Authors: Philippe Weinzaepfel, Thomas Lucas, Vincent Leroy, Yohann Cabon, Vaibhav Arora, Romain Brégier, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, Jérôme Revaud

    Abstract: Despite impressive performance for high-level downstream tasks, self-supervised pre-training methods have not yet fully delivered on dense geometric vision tasks such as stereo matching or optical flow. The application of self-supervised concepts, such as instance discrimination or masked image modeling, to geometric tasks is an active area of research. In this work, we build on the recent cross-v…

    Submitted 18 August, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: ICCV 2023

  7. arXiv:2210.11795  [pdf, other]

    cs.CV

    PoseScript: Linking 3D Human Poses and Natural Language

    Authors: Ginger Delmas, Philippe Weinzaepfel, Thomas Lucas, Francesc Moreno-Noguer, Grégory Rogez

    Abstract: Natural language plays a critical role in many computer vision applications, such as image captioning, visual question answering, and cross-modal retrieval, to provide fine-grained semantic information. Unfortunately, while human pose is key to human understanding, current 3D human pose datasets lack detailed language descriptions. To address this issue, we have introduced the PoseScript dataset.…

    Submitted 19 January, 2024; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: Extended version of the ECCV 2022 paper

  8. arXiv:2210.10716  [pdf, other]

    cs.CV

    CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion

    Authors: Philippe Weinzaepfel, Vincent Leroy, Thomas Lucas, Romain Brégier, Yohann Cabon, Vaibhav Arora, Leonid Antsfeld, Boris Chidlovskii, Gabriela Csurka, Jérôme Revaud

    Abstract: Masked Image Modeling (MIM) has recently been established as a potent pre-training paradigm. A pretext task is constructed by masking patches in an input image, and this masked content is then predicted by a neural network using visible patches as sole input. This pre-training leads to state-of-the-art performance when finetuned for high-level semantic tasks, e.g. image classification and object d…

    Submitted 12 January, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  9. arXiv:2210.10542  [pdf, other]

    cs.CV

    PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting

    Authors: Thomas Lucas, Fabien Baradel, Philippe Weinzaepfel, Grégory Rogez

    Abstract: We address the problem of action-conditioned generation of human motion sequences. Existing work falls into two categories: forecast models conditioned on observed past motions, or generative models conditioned on action labels and duration only. In contrast, we generate motion conditioned on observations of arbitrary length, including none. To solve this generalized problem, we propose PoseGPT, a…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: ECCV'22 Conference paper

  10. arXiv:2201.13182  [pdf, other]

    cs.CV

    Learning Super-Features for Image Retrieval

    Authors: Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis

    Abstract: Methods that combine local and global features have recently shown excellent performance on multiple challenging deep image retrieval benchmarks, but their use of local features raises at least two issues. First, these local features simply boil down to the localized map activations of a neural network, and hence can be extremely redundant. Second, they are typically trained with a global loss tha…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: ICLR 2022

  11. arXiv:2112.12004  [pdf, other]

    cs.CV

    Barely-Supervised Learning: Semi-Supervised Learning with very few labeled images

    Authors: Thomas Lucas, Philippe Weinzaepfel, Gregory Rogez

    Abstract: This paper tackles the problem of semi-supervised learning when the set of labeled samples is limited to a small number of images per class, typically fewer than 10, a problem that we refer to as barely-supervised learning. We analyze in depth the behavior of a state-of-the-art semi-supervised method, FixMatch, which relies on a weakly-augmented version of an image to obtain supervision signal for a…

    Submitted 22 December, 2021; originally announced December 2021.

  12. arXiv:2109.13745  [pdf]

    cs.NE cs.LG

    Meta-aprendizado para otimizacao de parametros de redes neurais

    Authors: Tarsicio Lucas, Teresa Ludermir, Ricardo Prudencio, Carlos Soares

    Abstract: The optimization of Artificial Neural Networks (ANNs) is an important task to the success of using these models in real-world applications. The solutions adopted to this task are expensive in general, involving trial-and-error procedures or expert knowledge which are not always available. In this work, we investigated the use of meta-learning to the optimization of ANNs. Meta-learning is a researc…

    Submitted 10 July, 2021; originally announced September 2021.

    Comments: 12 pages, in Portuguese, 2 figures, 2 tables

  13. arXiv:1901.01091  [pdf, other]

    cs.CV cs.LG

    Adaptive Density Estimation for Generative Models

    Authors: Thomas Lucas, Konstantin Shmelkov, Karteek Alahari, Cordelia Schmid, Jakob Verbeek

    Abstract: Unsupervised learning of generative models has seen tremendous progress over recent years, in particular due to generative adversarial networks (GANs), variational autoencoders, and flow-based models. GANs have dramatically improved sample quality, but suffer from two drawbacks: (i) they mode-drop, i.e., do not cover the full support of the train data, and (ii) they do not allow for likelihood eva…

    Submitted 3 January, 2020; v1 submitted 4 January, 2019; originally announced January 2019.

  14. arXiv:1806.07185  [pdf, other]

    cs.LG cs.CV stat.ML

    Mixed batches and symmetric discriminators for GAN training

    Authors: Thomas Lucas, Corentin Tallec, Jakob Verbeek, Yann Ollivier

    Abstract: Generative adversarial networks (GANs) are powerful generative models based on providing feedback to a generative network via a discriminator network. However, the discriminator usually assesses individual samples. This prevents the discriminator from accessing global distributional statistics of generated samples, and often leads to mode dropping: the generator models only part of the tar…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: Accepted at ICML 2018 (long oral)

  15. arXiv:1805.08463  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Variational Learning on Aggregate Outputs with Gaussian Processes

    Authors: Ho Chung Leon Law, Dino Sejdinovic, Ewan Cameron, Tim CD Lucas, Seth Flaxman, Katherine Battle, Kenji Fukumizu

    Abstract: While a typical supervised learning framework assumes that the inputs and the outputs are measured at the same levels of granularity, many applications, including global mapping of disease, only have access to outputs at a much coarser level than that of the inputs. Aggregation of outputs makes generalization to new inputs much more difficult. We consider an approach to this problem based on varia…

    Submitted 22 May, 2018; originally announced May 2018.

  16. arXiv:1711.11479  [pdf, other]

    cs.CV

    Auxiliary Guided Autoregressive Variational Autoencoders

    Authors: Thomas Lucas, Jakob Verbeek

    Abstract: Generative modeling of high-dimensional data is a key problem in machine learning. Successful approaches include latent variable models and autoregressive models. The complementary strengths of these approaches, to model global and local image statistics respectively, suggest hybrid models that encode global image structure into latent variables while autoregressively modeling low level detail. Pr…

    Submitted 18 April, 2019; v1 submitted 30 November, 2017; originally announced November 2017.

    Comments: Published as a conference paper at ECML-PKDD 2018

  17. arXiv:1612.01033  [pdf, other]

    cs.CV

    Areas of Attention for Image Captioning

    Authors: Marco Pedersoli, Thomas Lucas, Cordelia Schmid, Jakob Verbeek

    Abstract: We propose "Areas of Attention", a novel attention-based model for automatic image captioning. Our approach models the dependencies between image regions, caption words, and the state of an RNN language model, using three pairwise interactions. In contrast to previous attention-based approaches that associate image regions only to the RNN state, our method allows a direct association between capti…

    Submitted 25 August, 2017; v1 submitted 3 December, 2016; originally announced December 2016.

    Comments: Accepted in ICCV 2017