Showing 1–11 of 11 results for author: Almazan, J

  1. arXiv:2312.06592  [pdf, other]

    cs.CV

    Flexible visual prompts for in-context learning in computer vision

    Authors: Thomas Foster, Ioana Croitoru, Robert Dorfman, Christoffer Edlund, Thomas Varsavsky, Jon Almazán

    Abstract: In this work, we address in-context learning (ICL) for the task of image segmentation, introducing a novel approach that adapts a modern Video Object Segmentation (VOS) technique for visual in-context learning. This adaptation is inspired by the VOS method's ability to efficiently and flexibly learn objects from a few examples. Through evaluations across a range of support set sizes and on diverse…

    Submitted 11 December, 2023; originally announced December 2023.

  2. arXiv:2210.02254  [pdf, other]

    cs.CV

    Granularity-aware Adaptation for Image Retrieval over Multiple Tasks

    Authors: Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis

    Abstract: Strong image search models can be learned for a specific domain, i.e., set of labels, provided that some labeled images of that domain are available. A practical visual search model, however, should be versatile enough to solve multiple retrieval tasks simultaneously, even if those cover very different specialized domains. Additionally, it should be able to benefit from even unlabeled images from th…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  3. arXiv:2110.09455  [pdf, other]

    cs.CV cs.AI cs.LG

    TLDR: Twin Learning for Dimensionality Reduction

    Authors: Yannis Kalantidis, Carlos Lassance, Jon Almazan, Diane Larlus

    Abstract: Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved. Such methods usually require propagation on large k-NN graphs or complicated optimization solvers. On the other hand, self-supervised learning approaches, typically used to learn representations from scrat…

    Submitted 15 June, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

    Comments: Accepted at Transactions on Machine Learning Research (TMLR). Code available at: https://github.com/naver/tldr

  4. arXiv:2001.01788  [pdf, other]

    cs.CV

    MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection

    Authors: James H. Elder, Emilio J. Almazàn, Yiming Qian, Ron Tal

    Abstract: Traditional approaches to line segment detection typically involve perceptual grouping in the image domain and/or global accumulation in the Hough domain. Here we propose a probabilistic algorithm that merges the advantages of both approaches. In a first stage lines are detected using a global probabilistic Hough approach. In the second stage each detected line is analyzed in the image domain to l…

    Submitted 6 January, 2020; originally announced January 2020.

  5. arXiv:1906.07589  [pdf, other]

    cs.CV

    Learning with Average Precision: Training Image Retrieval with a Listwise Loss

    Authors: Jerome Revaud, Jon Almazan, Rafael Sampaio de Rezende, Cesar Roberto de Souza

    Abstract: Image retrieval can be formulated as a ranking problem where the goal is to order database images by decreasing similarity to the query. Recent deep models for image retrieval have outperformed traditional methods by leveraging ranking-tailored loss functions, but important theoretical and practical problems remain. First, rather than directly optimizing the global ranking, they minimize an upper-…

    Submitted 18 June, 2019; originally announced June 2019.

  6. arXiv:1905.10858  [pdf, other]

    cs.CV

    Integration of Text-maps in Convolutional Neural Networks for Region Detection among Different Textual Categories

    Authors: Roberto Arroyo, Javier Tovar, Francisco J. Delgado, Emilio J. Almazán, Diego G. Serrador, Antonio Hurtado

    Abstract: In this work, we propose a new technique that combines appearance and text in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that allows a seamless integration in a standard CNN architecture. This representation, referred to as text-map, is integrated with the actual im…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR). Language and Vision Workshop 2019

  7. arXiv:1801.05339  [pdf, other]

    cs.CV

    Re-ID done right: towards good practices for person re-identification

    Authors: Jon Almazan, Bojana Gajic, Naila Murray, Diane Larlus

    Abstract: Training a deep architecture using a ranking loss has become standard for the person re-identification task. Increasingly, these deep architectures include additional components that leverage part detections, attribute predictions, pose estimators and other auxiliary information, in order to more effectively localize and align discriminative image regions. In this paper we adopt a different approa…

    Submitted 16 January, 2018; originally announced January 2018.

  8. arXiv:1610.07940  [pdf, other]

    cs.CV

    End-to-end Learning of Deep Visual Representations for Image Retrieval

    Authors: Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus

    Abstract: While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval. In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: i) noisy training data, ii) inappropriate deep architecture, and iii) suboptimal trai…

    Submitted 5 May, 2017; v1 submitted 25 October, 2016; originally announced October 2016.

    Comments: Accepted for publication at the International Journal of Computer Vision (IJCV). Extended version of our ECCV2016 paper "Deep Image Retrieval: Learning global representations for image search"

  9. arXiv:1604.01325  [pdf, other]

    cs.CV

    Deep Image Retrieval: Learning global representations for image search

    Authors: Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus

    Abstract: We propose a novel approach for instance-level image retrieval. It produces a global and compact fixed-length representation for each image by aggregating many region-wise descriptors. In contrast to previous works employing pre-trained deep networks as a black box to produce features, our method leverages a deep architecture trained for the specific task of image retrieval. Our contribution is tw…

    Submitted 28 July, 2016; v1 submitted 5 April, 2016; originally announced April 2016.

    Comments: ECCV 2016 version + additional results

  10. arXiv:1603.01076  [pdf, other]

    cs.CV

    What is the right way to represent document images?

    Authors: Gabriela Csurka, Diane Larlus, Albert Gordo, Jon Almazan

    Abstract: In this article we study the problem of document image representation based on visual features. We propose a comprehensive experimental study that compares three types of visual document image representations: (1) traditional so-called shallow features, such as the RunLength and the Fisher-Vector descriptors, (2) deep features based on Convolutional Neural Networks, and (3) features extracted from…

    Submitted 2 December, 2016; v1 submitted 3 March, 2016; originally announced March 2016.

  11. arXiv:1509.06243  [pdf, other]

    cs.CV

    LEWIS: Latent Embeddings for Word Images and their Semantics

    Authors: Albert Gordo, Jon Almazan, Naila Murray, Florent Perronnin

    Abstract: The goal of this work is to bring semantics into the tasks of text recognition and retrieval in natural images. Although text recognition and retrieval have received a lot of attention in recent years, previous works have focused on recognizing or retrieving exactly the same word used as a query, without taking the semantics into consideration. In this paper, we ask the following question: \emph…

    Submitted 21 September, 2015; originally announced September 2015.

    Comments: Accepted for publication at the International Conference on Computer Vision (ICCV) 2015