
Showing 1–29 of 29 results for author: de Charette, R

  1. arXiv:2404.07202

    cs.CV cs.AI cs.CL

    UMBRAE: Unified Multimodal Decoding of Brain Signals

    Authors: Weihao Xia, Raoul de Charette, Cengiz Öztireli, Jing-Hao Xue

    Abstract: We address prevailing challenges of the brain-powered research, departing from the observation that the literature hardly recover accurate spatial information and require subject-specific models. To address these challenges, we propose UMBRAE, a unified multimodal decoding of brain signals. First, to extract instance-level conceptual and spatial details from neural signals, we introduce an efficie…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Project Page: https://weihaox.github.io/UMBRAE

  2. arXiv:2312.02158

    cs.CV cs.AI

    PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

    Authors: Anh-Quan Cao, Angela Dai, Raoul de Charette

    Abstract: We propose the task of Panoptic Scene Completion (PSC) which extends the recently popular Semantic Scene Completion (SSC) task with instance-level information to produce a richer understanding of the 3D scene. Our PSC proposal utilizes a hybrid mask-based technique on the non-empty voxels from sparse multi-scale completions. Whereas the SSC literature overlooks uncertainty which is critical for ro…

    Submitted 25 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Oral - Best paper award candidate. Project page: https://astra-vision.github.io/PaSCo

  3. arXiv:2311.17922

    cs.CV

    A Simple Recipe for Language-guided Domain Generalized Segmentation

    Authors: Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Patrick Pérez, Raoul de Charette

    Abstract: Generalization to new domains not seen during training is one of the long-standing challenges in deploying neural networks in real-world applications. Existing generalization techniques either necessitate external images for augmentation, and/or aim at learning invariant representations by imposing various alignment constraints. Large-scale pretraining has recently shown promising generalization c…

    Submitted 2 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: CVPR 2024

  4. arXiv:2311.17060

    cs.CV cs.GR

    Material Palette: Extraction of Materials from a Single Image

    Authors: Ivan Lopes, Fabio Pizzati, Raoul de Charette

    Abstract: In this paper, we propose a method to extract physically-based rendering (PBR) materials from a single real-world image. We do so in two steps: first, we map regions of the image to material concepts using a diffusion model, which allows the sampling of texture images resembling each material in the scene. Second, we benefit from a separate network to decompose the generated textures into Spatiall…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 8 pages, 11 figures, 2 tables. Webpage https://astra-vision.github.io/MaterialPalette/

  5. arXiv:2310.02265

    cs.CV cs.LG eess.IV q-bio.NC

    DREAM: Visual Decoding from Reversing Human Visual System

    Authors: Weihao Xia, Raoul de Charette, Cengiz Öztireli, Jing-Hao Xue

    Abstract: In this work we present DREAM, an fMRI-to-image method for reconstructing viewed images from brain activities, grounded on fundamental knowledge of the human visual system. We craft reverse pathways that emulate the hierarchical and parallel nature of how humans perceive the visual world. These tailored pathways are specialized to decipher semantics, color, and depth cues from fMRI data, mirroring…

    Submitted 10 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Project Page: https://weihaox.github.io/DREAM

  6. arXiv:2212.03241

    cs.CV cs.LG

    PØDA: Prompt-driven Zero-shot Domain Adaptation

    Authors: Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc, Patrick Pérez, Raoul de Charette

    Abstract: Domain adaptation has been vastly investigated in computer vision but still requires access to target images at train time, which might be intractable in some uncommon conditions. In this paper, we propose the task of `Prompt-driven Zero-shot Domain Adaptation', where we adapt a model trained on a source domain using only a general description in natural language of the target domain, i.e., a prom…

    Submitted 19 August, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: Accepted to ICCV 2023, Project Page: https://astra-vision.github.io/PODA/

  7. arXiv:2212.02501

    cs.CV cs.AI cs.GR cs.RO

    SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields

    Authors: Anh-Quan Cao, Raoul de Charette

    Abstract: 3D reconstruction from a single 2D image was extensively covered in the literature but relies on depth supervision at training time, which limits its applicability. To relax the dependence to depth we propose SceneRF, a self-supervised monocular scene reconstruction method using only posed image sequences for training. Fueled by the recent progress in neural radiance fields (NeRF) we optimize a ra…

    Submitted 24 August, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

    Comments: ICCV 2023. Project page: https://astra-vision.github.io/SceneRF

  8. arXiv:2210.01784

    cs.CV

    COARSE3D: Class-Prototypes for Contrastive Learning in Weakly-Supervised 3D Point Cloud Segmentation

    Authors: Rong Li, Anh-Quan Cao, Raoul de Charette

    Abstract: Annotation of large-scale 3D data is notoriously cumbersome and costly. As an alternative, weakly-supervised learning alleviates such a need by reducing the annotation by several order of magnitudes. We propose COARSE3D, a novel architecture-agnostic contrastive learning strategy for 3D segmentation. Since contrastive learning requires rich and diverse examples as keys and anchors, we leverage a p…

    Submitted 7 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

  9. arXiv:2206.08927

    cs.CV cs.AI cs.RO

    Cross-task Attention Mechanism for Dense Multi-task Learning

    Authors: Ivan Lopes, Tuan-Hung Vu, Raoul de Charette

    Abstract: Multi-task learning has recently become a promising solution for a comprehensive understanding of complex scenes. Not only being memory-efficient, multi-task models with an appropriate design can favor exchange of complementary signals across tasks. In this work, we jointly address 2D semantic segmentation, and two geometry-related tasks, namely dense depth, surface normal estimation as well as ed…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: 10 figures, 6 tables, 23 pages

  10. arXiv:2112.00726

    cs.CV cs.AI cs.RO

    MonoScene: Monocular 3D Semantic Scene Completion

    Authors: Anh-Quan Cao, Raoul de Charette

    Abstract: MonoScene proposes a 3D Semantic Scene Completion (SSC) framework, where the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Different from the SSC literature, relying on 2.5 or 3D input, we solve the complex problem of 2D to 3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2…

    Submitted 29 March, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: Accepted at CVPR 2022. Project page: https://cv-rits.github.io/MonoScene/

  11. arXiv:2111.13681

    cs.CV cs.AI cs.LG cs.RO

    ManiFest: Manifold Deformation for Few-shot Image Translation

    Authors: Fabio Pizzati, Jean-François Lalonde, Raoul de Charette

    Abstract: Most image-to-image translation methods require a large number of training images, which restricts their applicability. We instead propose ManiFest: a framework for few-shot image translation that learns a context-aware representation of a target domain from a few images only. To enforce feature consistency, our framework learns a style manifold between source and proxy anchor domains (assumed to…

    Submitted 20 July, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: ECCV 2022

  12. arXiv:2109.04468

    cs.CV cs.AI cs.LG cs.RO

    Leveraging Local Domains for Image-to-Image Translation

    Authors: Anthony Dell'Eva, Fabio Pizzati, Massimo Bertozzi, Raoul de Charette

    Abstract: Image-to-image (i2i) networks struggle to capture local changes because they do not affect the global scene structure. For example, translating from highway scenes to offroad, i2i networks easily focus on global color features but ignore obvious traits for humans like the absence of lane markings. In this paper, we leverage human knowledge about spatial domain characteristics which we refer to as…

    Submitted 14 February, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: VISAPP 2022 Best Paper Award

  13. arXiv:2107.14229

    cs.CV cs.AI cs.LG cs.RO

    Physics-informed Guided Disentanglement in Generative Networks

    Authors: Fabio Pizzati, Pietro Cerri, Raoul de Charette

    Abstract: Image-to-image translation (i2i) networks suffer from entanglement effects in presence of physics-related phenomena in target domain (such as occlusions, fog, etc), lowering altogether the translation quality, controllability and variability. In this paper, we propose a general framework to disentangle visual traits in target images. Primarily, we build upon collection of simple physics models, gu…

    Submitted 27 April, 2023; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: TPAMI 2023. Code: https://github.com/astra-vision/GuidedDisent

  14. arXiv:2103.09189

    cs.RO cs.AI cs.CV

    Goal-constrained Sparse Reinforcement Learning for End-to-End Driving

    Authors: Pranav Agarwal, Pierre de Beaucorps, Raoul de Charette

    Abstract: Deep reinforcement Learning for end-to-end driving is limited by the need of complex reward engineering. Sparse rewards can circumvent this challenge but suffers from long training time and leads to sub-optimal policy. In this work, we explore full-control driving with only goal-constrained sparse reward and propose a curriculum learning approach for end-to-end driving using only navigation view m…

    Submitted 31 July, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: Conference submission 6 pages, 8 figures

  15. arXiv:2103.07466

    cs.CV

    3D Semantic Scene Completion: a Survey

    Authors: Luis Roldão, Raoul de Charette, Anne Verroust-Blondet

    Abstract: Semantic Scene Completion (SSC) aims to jointly estimate the complete geometry and semantics of a scene, assuming partial sparse input. In the last years following the multiplication of large-scale 3D datasets, SSC has gained significant momentum in the research community because it holds unresolved challenges. Specifically, SSC lies in the ambiguous completion of large unobserved areas and the we…

    Submitted 12 July, 2021; v1 submitted 12 March, 2021; originally announced March 2021.

    Comments: Accepted in IJCV

  16. arXiv:2103.06879

    cs.CV cs.AI cs.LG

    CoMoGAN: continuous model-guided image-to-image translation

    Authors: Fabio Pizzati, Pietro Cerri, Raoul de Charette

    Abstract: CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. To that matter, we introduce a new Functional Instance Normalization layer and residual mechanism, which together disentangle image content from position on target manifold. We rely on naive physics-inspired models to guide the training while allowing private model/translations featu…

    Submitted 29 June, 2022; v1 submitted 11 March, 2021; originally announced March 2021.

    Comments: CVPR 2021 oral

  17. arXiv:2101.07253

    cs.CV

    Cross-modal Learning for Domain Adaptation in 3D Semantic Segmentation

    Authors: Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, Émilie Wirbel, Patrick Pérez

    Abstract: Domain adaptation is an important task to enable learning when labels are scarce. While most works focus only on the image modality, there are many important multi-modal datasets. In order to leverage multi-modality for domain adaptation, we propose cross-modal learning, where we enforce consistency between the predictions of two modalities via mutual mimicking. We constrain our network to make co…

    Submitted 22 June, 2022; v1 submitted 18 January, 2021; originally announced January 2021.

    Comments: TPAMI 2022

  18. Rain rendering for evaluating and improving robustness to bad weather

    Authors: Maxime Tremblay, Shirsendu Sukanta Halder, Raoul de Charette, Jean-François Lalonde

    Abstract: Rain fills the atmosphere with water particles, which breaks the common assumption that light travels unaltered from the scene to the camera. While it is well-known that rain affects computer vision algorithms, quantifying its impact is difficult. In this context, we present a rain rendering pipeline that enables the systematic evaluation of common computer vision algorithms to controlled amounts…

    Submitted 6 September, 2020; originally announced September 2020.

    Comments: 19 pages, 19 figures, IJCV 2020 preprint. arXiv admin note: text overlap with arXiv:1908.10335

  19. arXiv:2008.10559

    cs.CV

    LMSCNet: Lightweight Multiscale 3D Semantic Completion

    Authors: Luis Roldão, Raoul de Charette, Anne Verroust-Blondet

    Abstract: We introduce a new approach for multiscale 3D semantic scene completion from voxelized sparse 3D LiDAR scans. As opposed to the literature, we use a 2D UNet backbone with comprehensive multiscale skip connections to enhance feature flow, along with 3D segmentation heads. On the SemanticKITTI benchmark, our method performs on par on semantic completion and better on occupancy completion than all oth…

    Submitted 25 October, 2020; v1 submitted 24 August, 2020; originally announced August 2020.

    Comments: Accepted at 3DV 2020 (Oral). For a demo video, see http://tiny.cc/lmscnet. Code is available at https://github.com/cv-rits/LMSCNet

  20. arXiv:2006.05011

    cs.CV

    RGB-D-E: Event Camera Calibration for Fast 6-DOF Object Tracking

    Authors: Etienne Dubeau, Mathieu Garon, Benoit Debaque, Raoul de Charette, Jean-François Lalonde

    Abstract: Augmented reality devices require multiple sensors to perform various tasks such as localization and tracking. Currently, popular cameras are mostly frame-based (e.g. RGB and Depth) which impose a high data bandwidth and power usage. With the necessity for low power and more responsive augmented reality systems, using solely frame-based sensors imposes limits to the various algorithms that needs h…

    Submitted 5 August, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: 9 pages, 9 figures

  21. arXiv:2004.01071

    cs.CV cs.LG eess.IV

    Model-based occlusion disentanglement for image-to-image translation

    Authors: Fabio Pizzati, Pietro Cerri, Raoul de Charette

    Abstract: Image-to-image translation is affected by entanglement phenomena, which may occur in case of target data encompassing occlusions such as raindrops, dirt, etc. Our unsupervised model-based learning disentangles scene and occlusions, while benefiting from an adversarial pipeline to regress physical parameters of the occlusion model. The experiments demonstrate our method is able to handle varying ty…

    Submitted 20 July, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: ECCV 2020

  22. arXiv:1911.12676

    cs.CV

    xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

    Authors: Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, Émilie Wirbel, Patrick Pérez

    Abstract: Unsupervised Domain Adaptation (UDA) is crucial to tackle the lack of annotations in a new domain. There are many multi-modal datasets, but most UDA approaches are uni-modal. In this work, we explore how to learn from multi-modality and propose cross-modal UDA (xMUDA) where we assume the presence of 2D images and 3D point clouds for 3D semantic segmentation. This is challenging as the two input sp…

    Submitted 30 March, 2020; v1 submitted 28 November, 2019; originally announced November 2019.

    Comments: Accepted at CVPR 2020. For a demo video, see http://tiny.cc/xmuda

  23. arXiv:1910.10563

    cs.CV cs.LG

    Domain Bridge for Unpaired Image-to-Image Translation and Unsupervised Domain Adaptation

    Authors: Fabio Pizzati, Raoul de Charette, Michela Zaccaria, Pietro Cerri

    Abstract: Image-to-image translation architectures may have limited effectiveness in some circumstances. For example, while generating rainy scenarios, they may fail to model typical traits of rain as water drops, and this ultimately impacts the synthetic images realism. With our method, called domain bridge, web-crawled data are exploited to reduce the domain gap, leading to the inclusion of previously ign…

    Submitted 14 March, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: WACV 20 camera ready

  24. arXiv:1908.10335

    cs.CV cs.GR cs.LG eess.IV

    Physics-Based Rendering for Improving Robustness to Rain

    Authors: Shirsendu Sukanta Halder, Jean-François Lalonde, Raoul de Charette

    Abstract: To improve the robustness to rain, we present a physically-based rain rendering pipeline for realistically inserting rain into clear weather images. Our rendering relies on a physical particle simulator, an estimation of the scene lighting and an accurate rain photometric modeling to augment images with arbitrary amount of realistic rain or fog. We validate our rendering with a user study, proving…

    Submitted 27 August, 2019; originally announced August 2019.

    Comments: ICCV 2019. Supplementary pdf / videos available on project page

  25. arXiv:1908.01523

    cs.CV cs.HC

    3D Reconstruction of Deformable Revolving Object under Heavy Hand Interaction

    Authors: Raoul de Charette, Sotiris Manitsaris

    Abstract: We reconstruct 3D deformable object through time, in the context of a live pottery making process where the crafter molds the object. Because the object suffers from heavy hand interaction, and is being deformed, classical techniques cannot be applied. We use particle energy optimization to estimate the object profile and benefit of the object radial symmetry to increase the robustness of the reco…

    Submitted 5 August, 2019; originally announced August 2019.

    Comments: 7 pages, 10 figures. Submitted to journal

  26. arXiv:1906.10515

    cs.CV cs.RO

    3D Surface Reconstruction from Voxel-based Lidar Data

    Authors: Luis Roldão, Raoul de Charette, Anne Verroust-Blondet

    Abstract: To achieve fully autonomous navigation, vehicles need to compute an accurate model of their direct surrounding. In this paper, a 3D surface reconstruction algorithm from heterogeneous density 3D data is presented. The proposed method is based on a TSDF voxel-based representation, where an adaptive neighborhood kernel sourced on a Gaussian confidence evaluation is introduced. This enables to keep a…

    Submitted 25 June, 2019; originally announced June 2019.

    Comments: IEEE Intelligent Transportation Systems Conference (ITSC) 2019

  27. arXiv:1808.00769

    cs.CV

    Sparse and Dense Data with CNNs: Depth Completion and Semantic Segmentation

    Authors: Maximilian Jaritz, Raoul de Charette, Emilie Wirbel, Xavier Perrotton, Fawzi Nashashibi

    Abstract: Convolutional neural networks are designed for dense data, but vision data is often sparse (stereo depth, point clouds, pen stroke, etc.). We present a method to handle sparse depth data with optional dense RGB, and accomplish depth completion and semantic segmentation changing only the last layer. Our proposal efficiently learns sparse features without the need of an additional validity mask. We…

    Submitted 31 August, 2018; v1 submitted 2 August, 2018; originally announced August 2018.

    Comments: 3DV 2018

  28. arXiv:1807.08483

    cs.RO

    A Statistical Update of Grid Representations from Range Sensors

    Authors: Luis Roldão, Raoul de Charette, Anne Verroust-Blondet

    Abstract: In a wide range of robotic applications, being able to create a 3D model of the surrounding environment is a key feature for autonomous tasks. In this research report, we present a statistical model to perform 3D reconstructions of the environment from range sensors using an occupancy grid. To do so, we take into account all the available information obtained from the sensor, considering the dista…

    Submitted 4 July, 2019; v1 submitted 23 July, 2018; originally announced July 2018.

    Comments: Metadata change. Typo in author's name

  29. arXiv:1807.02371

    cs.CV cs.RO

    End-to-End Race Driving with Deep Reinforcement Learning

    Authors: Maximilian Jaritz, Raoul de Charette, Marin Toromanoff, Etienne Perot, Fawzi Nashashibi

    Abstract: We present research using the latest reinforcement learning algorithm for end-to-end driving without any mediated perception (object recognition, scene understanding). The newly proposed reward and learning strategies lead together to faster convergence and more robust driving using only RGB image from a forward facing camera. An Asynchronous Actor Critic (A3C) framework is used to learn the car c…

    Submitted 31 August, 2018; v1 submitted 6 July, 2018; originally announced July 2018.

    Comments: ICRA 2018