Showing 1–9 of 9 results for author: Engilberge, M

  1. arXiv:2405.13781  [pdf, other]

    cs.CV

    Addressing the Elephant in the Room: Robust Animal Re-Identification with Unsupervised Part-Based Feature Alignment

    Authors: Yingxue Yu, Vidit Vidit, Andrey Davydov, Martin Engilberge, Pascal Fua

    Abstract: Animal Re-ID is crucial for wildlife conservation, yet it faces unique challenges compared to person Re-ID. First, the scarcity and lack of diversity in datasets lead to background-biased models. Second, animal Re-ID depends on subtle, species-specific cues, further complicated by variations in pose, background, and lighting. This study addresses background biases by proposing a method to systemat…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR workshop CV4Animals 2024

  2. arXiv:2403.09050  [pdf, other]

    cs.CV

    CLOAF: CoLlisiOn-Aware Human Flow

    Authors: Andrey Davydov, Martin Engilberge, Mathieu Salzmann, Pascal Fua

    Abstract: Even the best current algorithms for estimating body 3D shape and pose yield results that include body self-intersections. In this paper, we present CLOAF, which exploits the diffeomorphic nature of Ordinary Differential Equations to eliminate such self-intersections while still imposing body shape constraints. We show that, unlike earlier approaches to addressing this issue, ours completely elimi…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: CVPR 2024, 13 pages

  3. arXiv:2301.05499  [pdf, other]

    cs.CV

    CLIP the Gap: A Single Domain Generalization Approach for Object Detection

    Authors: Vidit Vidit, Martin Engilberge, Mathieu Salzmann

    Abstract: Single Domain Generalization (SDG) tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain. While this has been well studied for image classification, the literature on SDG object detection remains almost non-existent. To address the challenges of simultaneously learning robust object localization and representation, we propose to levera…

    Submitted 6 March, 2023; v1 submitted 13 January, 2023; originally announced January 2023.

  4. arXiv:2301.05496  [pdf, other]

    cs.CV

    Learning Transformations To Reduce the Geometric Shift in Object Detection

    Authors: Vidit Vidit, Martin Engilberge, Mathieu Salzmann

    Abstract: The performance of modern object detectors drops when the test distribution differs from the training one. Most of the methods that address this focus on object appearance changes caused by, e.g., different illumination conditions, or gaps between synthetic and real images. Here, by contrast, we tackle geometric shifts emerging from variations in the image capture process, or due to the constraint…

    Submitted 13 January, 2023; originally announced January 2023.

  5. arXiv:2210.10771  [pdf, other]

    cs.CV cs.LG

    Multi-view Tracking Using Weakly Supervised Human Motion Prediction

    Authors: Martin Engilberge, Weizhe Liu, Pascal Fua

    Abstract: Multi-view approaches to people-tracking have the potential to better handle occlusions than single-view ones in crowded scenes. They often rely on the tracking-by-detection paradigm, which involves detecting people first and then connecting the detections. In this paper, we argue that an even more effective approach is to predict people motion over time and infer people's presence in individual f…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted at WACV 2023

  6. arXiv:2210.10756  [pdf, other]

    cs.CV cs.LG

    Two-level Data Augmentation for Calibrated Multi-view Detection

    Authors: Martin Engilberge, Haixin Shi, Zhiye Wang, Pascal Fua

    Abstract: Data augmentation has proven its usefulness to improve model generalization and performance. While it is commonly applied in computer vision applications, it is rarely used in multi-view systems. Indeed, geometric data augmentation can break the alignment among views. This is problematic since multi-view data tend to be scarce and expensive to annotate. In this work we propose to…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted at WACV 2023

  7. arXiv:1904.04272  [pdf, other]

    cs.LG cs.CV cs.IR stat.ML

    SoDeep: a Sorting Deep net to learn ranking loss surrogates

    Authors: Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord

    Abstract: Several tasks in machine learning are evaluated using non-differentiable metrics such as mean average precision or Spearman correlation. However, their non-differentiability prevents their use as objective functions in a learning framework. Surrogate and relaxation methods exist but tend to be specific to a given metric. In the present work, we introduce a new method to learn approximation…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019

  8. arXiv:1812.01973  [pdf, other]

    cs.CV cs.MM

    VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability

    Authors: Romain Cohendet, Claire-Hélène Demarty, Ngoc Q. K. Duong, Martin Engilberge

    Abstract: Humans share a strong tendency to memorize/forget some of the visual information they encounter. This paper focuses on providing computational models for the prediction of the intrinsic memorability of visual content. To address this new challenge, we introduce a large scale dataset (VideoMem) composed of 10,000 videos annotated with memorability scores. In contrast to previous work on image memor…

    Submitted 5 December, 2018; originally announced December 2018.

    Journal ref: ICCV 2019

  9. arXiv:1804.01720  [pdf, other]

    cs.CV cs.CL cs.LG

    Finding beans in burgers: Deep semantic-visual embedding with localization

    Authors: Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord

    Abstract: Several works have proposed to learn a two-path neural network that maps images and texts, respectively, to the same shared Euclidean space where geometry captures useful semantic relationships. Such a multi-modal embedding can be trained and used for various tasks, notably image captioning. In the present work, we introduce a new architecture of this type, with a visual path that leverages recent s…

    Submitted 6 April, 2018; v1 submitted 5 April, 2018; originally announced April 2018.

    Comments: Accepted to CVPR 2018