Showing 1–7 of 7 results for author: Ibrahimi, S

  1. arXiv:2307.12964  [pdf, other]

    cs.CV

    Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment

    Authors: Sarah Ibrahimi, Xiaohang Sun, Pichao Wang, Amanmeet Garg, Ashutosh Sanan, Mohamed Omar

    Abstract: Text-to-video retrieval systems have recently made significant progress by utilizing pre-trained models trained on large-scale image-text pairs. However, most of the latest methods primarily focus on the video modality while disregarding the audio signal for this task. Nevertheless, a recent advancement by ECLIPSE has improved long-range text-to-video retrieval by developing an audiovisual video r…

    Submitted 18 October, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023

  2. arXiv:2202.01747  [pdf, other]

    cs.CV

    The Met Dataset: Instance-level Recognition for Artworks

    Authors: Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah Ibrahimi, Nanne Van Noord, Giorgos Tolias

    Abstract: This work introduces a dataset for large-scale instance-level recognition in the domain of artworks. The proposed benchmark exhibits a number of different challenges such as large inter-class similarity, long tail distribution, and many classes. We rely on the open access collection of The Met museum to form a large training set of about 224k classes, where each class corresponds to a museum exhib…

    Submitted 3 February, 2022; originally announced February 2022.

  3. arXiv:2112.11743  [pdf, other]

    cs.LG cs.CV

    Simple and Effective Balance of Contrastive Losses

    Authors: Arnaud Sors, Rafael Sampaio de Rezende, Sarah Ibrahimi, Jean-Marc Andreoli

    Abstract: Contrastive losses have long been a key ingredient of deep metric learning and are now becoming more popular due to the success of self-supervised learning. Recent research has shown the benefit of decomposing such losses into two sub-losses which act in a complementary way when learning the representation network: a positive term and an entropy term. Although the overall loss is thus defined as a…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 15 pages, 10 figures

  4. arXiv:2112.10453  [pdf, other]

    cs.CV

    Learning with Label Noise for Image Retrieval by Selecting Interactions

    Authors: Sarah Ibrahimi, Arnaud Sors, Rafael Sampaio de Rezende, Stéphane Clinchant

    Abstract: Learning with noisy labels is an active research area for image classification. However, the effect of noisy labels on image retrieval has been less studied. In this work, we propose a noise-resistant method for image retrieval named Teacher-based Selection of Interactions, T-SINT, which identifies noisy interactions, i.e., elements in the distance matrix, and selects correct positive and negative i…

    Submitted 21 December, 2021; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: Accepted at WACV 2022. 13 pages, 5 figures

  5. arXiv:2111.13546  [pdf, other]

    cs.CV

    Inside Out Visual Place Recognition

    Authors: Sarah Ibrahimi, Nanne van Noord, Tim Alpherts, Marcel Worring

    Abstract: Visual Place Recognition (VPR) is generally concerned with localizing outdoor images. However, localizing indoor scenes that contain part of an outdoor scene can be of large value for a wide range of applications. In this paper, we introduce Inside Out Visual Place Recognition (IOVPR), a task aiming to localize images based on outdoor scenes visible through windows. For this task we present the ne…

    Submitted 26 November, 2021; originally announced November 2021.

    Comments: Accepted at British Machine Vision Conference (BMVC) 2021

  6. arXiv:2005.05632  [pdf, other]

    cs.CV

    Detecting CNN-Generated Facial Images in Real-World Scenarios

    Authors: Nils Hulzebosch, Sarah Ibrahimi, Marcel Worring

    Abstract: Artificial, CNN-generated images are now of such high quality that humans have trouble distinguishing them from real images. Several algorithmic detection methods have been proposed, but these appear to generalize poorly to data from unknown sources, making them infeasible for real-world scenarios. In this work, we present a framework for evaluating detection methods under real-world conditions, c…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: Accepted to the workshop on Media Forensics at CVPR 2020

  7. arXiv:2005.04909  [pdf, other]

    cs.CV

    Conditional Image Generation and Manipulation for User-Specified Content

    Authors: David Stap, Maurits Bleeker, Sarah Ibrahimi, Maartje ter Hoeve

    Abstract: In recent years, Generative Adversarial Networks (GANs) have improved steadily towards generating increasingly impressive real-world images. It is useful to steer the image generation process for purposes such as content creation. This can be done by conditioning the model on additional information. However, when conditioning on additional information, there still exists a large set of images that…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: Accepted to the AI for content creation workshop at CVPR 2020