Showing 1–13 of 13 results for author: Rambour, C

  1. arXiv:2407.01400  [pdf, other]

    cs.CV

    GalLoP: Learning Global and Local Prompts for Vision-Language Models

    Authors: Marc Lafon, Elias Ramzi, Clément Rambour, Nicolas Audebert, Nicolas Thome

    Abstract: Prompt learning has been widely adopted to efficiently adapt vision-language models (VLMs), e.g. CLIP, for few-shot image classification. Despite their success, most prompt learning methods trade off classification accuracy against robustness, e.g. in domain generalization or out-of-distribution (OOD) detection. In this work, we introduce Global-Local Prompts (GalLoP), a new prompt learning me…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: To be published at ECCV 2024

  2. arXiv:2403.10403  [pdf, other]

    cs.CV cs.AI cs.LG

    Energy Correction Model in the Feature Space for Out-of-Distribution Detection

    Authors: Marc Lafon, Clément Rambour, Nicolas Thome

    Abstract: In this work, we study the out-of-distribution (OOD) detection problem through the use of the feature space of a pre-trained deep classifier. We show that learning the density of in-distribution (ID) features with an energy-based model (EBM) leads to competitive detection results. However, we found that the non-mixing of MCMC sampling during the EBM's training undermines its detection performance…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: NeurIPS ML Safety Workshop (2022)

  3. arXiv:2309.08250  [pdf, other]

    cs.CV

    Optimization of Rank Losses for Image Retrieval

    Authors: Elias Ramzi, Nicolas Audebert, Clément Rambour, André Araujo, Xavier Bitot, Nicolas Thome

    Abstract: In image retrieval, standard evaluation metrics rely on score ranking, e.g. average precision (AP), recall at k (R@k), normalized discounted cumulative gain (NDCG). In this work, we introduce a general framework for robust and decomposable optimization of rank losses. It addresses two major challenges for end-to-end training of deep neural networks with rank losses: non-differentiability and non-decomp…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: text overlap with arXiv:2207.04873

  4. arXiv:2307.06795  [pdf, other]

    cs.CV

    Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks

    Authors: Denis Coquenet, Clément Rambour, Emanuele Dalsasso, Nicolas Thome

    Abstract: Vision-language foundation models such as CLIP have shown impressive zero-shot performance on many tasks and datasets, especially thanks to their free-text inputs. However, they struggle to handle some downstream tasks, such as fine-grained attribute detection and localization. In this paper, we propose a multitask fine-tuning strategy based on a positive/negative prompt formulation to further lev…

    Submitted 13 July, 2023; originally announced July 2023.

  5. arXiv:2306.08707  [pdf, other]

    cs.CV

    VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing

    Authors: Paul Couairon, Clément Rambour, Jean-Emmanuel Haugeard, Nicolas Thome

    Abstract: Recently, diffusion-based generative models have achieved remarkable success for image generation and editing. However, existing diffusion-based video editing approaches lack the ability to offer precise control over generated content that maintains temporal consistency in long-term videos. On the other hand, atlas-based methods provide strong temporal consistency but are costly to edit a video an…

    Submitted 2 April, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: TMLR 2024. Project web-page at https://videdit.github.io

  6. arXiv:2305.16966  [pdf, other]

    cs.CV

    Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection

    Authors: Marc Lafon, Elias Ramzi, Clément Rambour, Nicolas Thome

    Abstract: Out-of-distribution (OOD) detection is a critical requirement for the deployment of deep neural networks. This paper introduces the HEAT model, a new post-hoc OOD detection method estimating the density of in-distribution (ID) samples using hybrid energy-based models (EBM) in the feature space of a pre-trained backbone. HEAT complements prior density estimators of the ID density, e.g. parametric m…

    Submitted 1 June, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Journal ref: International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA

  7. arXiv:2212.07890  [pdf, other]

    cs.CV

    Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation

    Authors: Loic Themyr, Clement Rambour, Nicolas Thome, Toby Collins, Alexandre Hostettler

    Abstract: Transformers have proved to be very effective for visual recognition tasks. In particular, vision transformers construct compressed global representations through self-attention and learnable class tokens. Multi-resolution transformers have shown recent successes in semantic segmentation but can only capture local interactions in high-resolution feature maps. This paper extends the notion of globa…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: Winter Conference on Applications of Computer Vision (WACV 2023)

    MSC Class: 68T45

  8. arXiv:2210.05313  [pdf, other]

    cs.CV

    Memory transformers for full context and high-resolution 3D Medical Segmentation

    Authors: Loic Themyr, Clément Rambour, Nicolas Thome, Toby Collins, Alexandre Hostettler

    Abstract: Transformer models achieve state-of-the-art results for image segmentation. However, achieving long-range attention, necessary to capture global context, with high-resolution 3D images is a fundamental challenge. This paper introduces the Full resolutIoN mEmory (FINE) transformer to overcome this issue. The core idea behind FINE is to learn memory tokens to indirectly model full range interactions…

    Submitted 11 October, 2022; originally announced October 2022.

    MSC Class: 68T45

  9. arXiv:2207.04873  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Hierarchical Average Precision Training for Pertinent Image Retrieval

    Authors: Elias Ramzi, Nicolas Audebert, Nicolas Thome, Clément Rambour, Xavier Bitot

    Abstract: Image Retrieval is commonly evaluated with Average Precision (AP) or Recall@k. Yet those metrics are limited to binary labels and do not take into account the severity of errors. This paper introduces a new hierarchical AP training method for pertinent image retrieval (HAP-PIER). HAPPIER is based on a new H-AP metric, which leverages a concept hierarchy to refine AP by integrating errors' importance a…

    Submitted 22 July, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

    Journal ref: ECCV 2022, Oct 2022, Tel-Aviv, Israel

  10. arXiv:2207.03790  [pdf, other]

    cs.CV stat.ML

    Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction

    Authors: Vincent Le Guen, Clément Rambour, Nicolas Thome

    Abstract: State-of-the-art methods for optical flow estimation rely on deep learning and require complex sequential training schemes to reach optimal performance on real-world data. In this work, we introduce the COMBO deep network that explicitly exploits the brightness constancy (BC) model used in traditional methods. Since BC is an approximate physical model violated in several situations, we propose…

    Submitted 12 July, 2022; v1 submitted 8 July, 2022; originally announced July 2022.

  11. arXiv:2110.01445  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Robust and Decomposable Average Precision for Image Retrieval

    Authors: Elias Ramzi, Nicolas Thome, Clément Rambour, Nicolas Audebert, Xavier Bitot

    Abstract: In image retrieval, standard evaluation metrics rely on score ranking, e.g. average precision (AP). In this paper, we introduce a method for robust and decomposable average precision (ROADMAP) addressing two major challenges for end-to-end training of deep neural networks with AP: non-differentiability and non-decomposability. Firstly, we propose a new differentiable approximation of the rank func…

    Submitted 8 December, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

    Journal ref: Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021), Dec 2021, Sydney, Australia

  12. Urban Surface Reconstruction in SAR Tomography by Graph-Cuts

    Authors: Clément Rambour, Loïc Denis, Florence Tupin, Hélène Oriot, Yue Huang, Laurent Ferro-Famil

    Abstract: SAR (Synthetic Aperture Radar) tomography reconstructs 3-D volumes from stacks of SAR images. High-resolution satellites such as TerraSAR-X provide images that can be combined to produce 3-D models. In urban areas, sparsity priors are generally enforced during the tomographic inversion process in order to retrieve the location of scatterers seen within a given radar resolution cell. However, such…

    Submitted 12 March, 2021; originally announced March 2021.

    Journal ref: Computer Vision and Image Understanding 188 (2019) 102791

  13. arXiv:2103.06104  [pdf, other]

    eess.IV cs.CV

    U-Net Transformer: Self and Cross Attention for Medical Image Segmentation

    Authors: Olivier Petit, Nicolas Thome, Clément Rambour, Luc Soler

    Abstract: Medical image segmentation remains particularly challenging for complex and low-contrast anatomical structures. In this paper, we introduce the U-Transformer network, which combines a U-shaped architecture for image segmentation with self- and cross-attention from Transformers. U-Transformer overcomes the inability of U-Nets to model long-range contextual interactions and spatial dependencies, whi…

    Submitted 12 March, 2021; v1 submitted 10 March, 2021; originally announced March 2021.