
Showing 1–16 of 16 results for author: Gordo, A

  1. arXiv:2212.04825

    cs.CV

    A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

    Authors: Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim

    Abstract: Machine learning models have been found to learn shortcuts -- unintended decision rules that are unable to generalize -- undermining models' reliability. Previous works address this problem under the tenuous assumption that only a single shortcut exists in the training data. Real-world images are rife with multiple visual cues from background to texture. Key to advancing the reliability of vision…

    Submitted 21 March, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: CVPR 2023. Code is available at https://github.com/facebookresearch/Whac-A-Mole

  2. arXiv:2203.17260

    cs.CV cs.LG

    Generating High Fidelity Data from Low-density Regions using Diffusion Models

    Authors: Vikash Sehwag, Caner Hazirbas, Albert Gordo, Firat Ozgenel, Cristian Canton Ferrer

    Abstract: Our work focuses on addressing sample deficiency from low-density regions of data manifold in common image datasets. We leverage diffusion process based generative models to synthesize novel images from low-density regions. We observe that uniform sampling from diffusion models predominantly samples from high-density regions of the data manifold. Therefore, we modify the sampling process to guide…

    Submitted 26 June, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: CVPR 2022 (fixed some discrepancies in notation - v2)

  3. arXiv:2105.11373

    cs.CV

    Large-Scale Attribute-Object Compositions

    Authors: Filip Radenovic, Animesh Sinha, Albert Gordo, Tamara Berg, Dhruv Mahajan

    Abstract: We study the problem of learning how to predict attribute-object compositions from images, and its generalization to unseen compositions missing from the training data. To the best of our knowledge, this is a first large-scale study of this problem, involving hundreds of thousands of compositions. We train our framework with images from Instagram using hashtags as noisy weak supervision. We make c…

    Submitted 24 May, 2021; originally announced May 2021.

  4. arXiv:2104.02821

    cs.CV cs.AI cs.LG

    Towards Measuring Fairness in AI: the Casual Conversations Dataset

    Authors: Caner Hazirbas, Joanna Bitton, Brian Dolhansky, Jacqueline Pan, Albert Gordo, Cristian Canton Ferrer

    Abstract: This paper introduces a novel dataset to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of age, genders, apparent skin tones and ambient lighting conditions. Our dataset is composed of 3,011 subjects and contains over 45,000 videos, with an average of 15 videos per person. The videos were recorded in multiple U.S. states with a diverse set of adu…

    Submitted 3 November, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

  5. arXiv:2007.08019

    cs.CV cs.LG

    Attention-Based Query Expansion Learning

    Authors: Albert Gordo, Filip Radenovic, Tamara Berg

    Abstract: Query expansion is a technique widely used in image search consisting in combining highly ranked images from an original query into an expanded query that is then reissued, generally leading to increased recall and precision. An important aspect of query expansion is choosing an appropriate way to combine the images into a new query. Interestingly, despite the undeniable empirical success of query…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: Accepted for publication at ECCV2020

  6. arXiv:2002.08165

    cs.LG stat.ML

    Using Hindsight to Anchor Past Knowledge in Continual Learning

    Authors: Arslan Chaudhry, Albert Gordo, Puneet K. Dokania, Philip Torr, David Lopez-Paz

    Abstract: In continual learning, the learner faces a stream of data whose distribution changes over time. Modern neural networks are known to suffer under this setting, as they quickly forget previously acquired knowledge. To address such catastrophic forgetting, many continual learning methods implement different types of experience replay, re-learning on past data stored in a small buffer known as episodi…

    Submitted 2 March, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: Accepted at AAAI 2021

  7. arXiv:1910.09217

    cs.CV

    Decoupling Representation and Classifier for Long-Tailed Recognition

    Authors: Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis

    Abstract: The long-tail distribution of the visual world poses great challenges for deep learning based classification models on how to handle the class imbalance problem. Existing solutions usually involve class-balancing strategies, e.g., by loss re-weighting, data re-sampling, or transfer learning from head- to tail-classes, but most of them adhere to the scheme of jointly learning representations and cl…

    Submitted 19 February, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

    Journal ref: Published as a conference paper at ICLR 2020

  8. Rosetta: Large scale system for text detection and recognition in images

    Authors: Fedor Borisyuk, Albert Gordo, Viswanath Sivakumar

    Abstract: In this paper we present a deployed, scalable optical character recognition (OCR) system, which we call Rosetta, designed to process images uploaded daily at Facebook scale. Sharing of image content has become one of the primary ways to communicate information among internet users within social networks such as Facebook and Instagram, and the understanding of such media, including its textual info…

    Submitted 11 October, 2019; originally announced October 2019.

    Comments: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD) 2018, London, United Kingdom

  9. Considerations When Learning Additive Explanations for Black-Box Models

    Authors: Sarah Tan, Giles Hooker, Paul Koch, Albert Gordo, Rich Caruana

    Abstract: Many methods to explain black-box models, whether local or global, are additive. In this paper, we study global additive explanations for non-additive models, focusing on four explanation methods: partial dependence, Shapley explanations adapted to a global setting, distilled additive explanations, and gradient-based explanations. We show that different explanation methods characterize non-additiv…

    Submitted 31 July, 2023; v1 submitted 25 January, 2018; originally announced January 2018.

    Comments: Published at Machine Learning (2023). Previously titled "Learning Global Additive Explanations for Neural Nets Using Model Distillation". A short version was presented at NeurIPS 2018 Machine Learning for Health Workshop

  10. arXiv:1708.04890

    cs.CV

    A deep architecture for unified aesthetic prediction

    Authors: Naila Murray, Albert Gordo

    Abstract: Image aesthetics has become an important criterion for visual content curation on social media sites and media content repositories. Previous work on aesthetic prediction models in the computer vision community has focused on aesthetic score prediction or binary image labeling. However, raw aesthetic annotations are in the form of score histograms and provide richer and more precise information th…

    Submitted 16 August, 2017; originally announced August 2017.

  11. arXiv:1610.07940

    cs.CV

    End-to-end Learning of Deep Visual Representations for Image Retrieval

    Authors: Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus

    Abstract: While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval. In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: i) noisy training data, ii) inappropriate deep architecture, and iii) suboptimal trai…

    Submitted 5 May, 2017; v1 submitted 25 October, 2016; originally announced October 2016.

    Comments: Accepted for publication at the International Journal of Computer Vision (IJCV). Extended version of our ECCV2016 paper "Deep Image Retrieval: Learning global representations for image search"

  12. arXiv:1604.01325

    cs.CV

    Deep Image Retrieval: Learning global representations for image search

    Authors: Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus

    Abstract: We propose a novel approach for instance-level image retrieval. It produces a global and compact fixed-length representation for each image by aggregating many region-wise descriptors. In contrast to previous works employing pre-trained deep networks as a black box to produce features, our method leverages a deep architecture trained for the specific task of image retrieval. Our contribution is tw…

    Submitted 28 July, 2016; v1 submitted 5 April, 2016; originally announced April 2016.

    Comments: ECCV 2016 version + additional results

  13. arXiv:1603.01076

    cs.CV

    What is the right way to represent document images?

    Authors: Gabriela Csurka, Diane Larlus, Albert Gordo, Jon Almazan

    Abstract: In this article we study the problem of document image representation based on visual features. We propose a comprehensive experimental study that compares three types of visual document image representations: (1) traditional so-called shallow features, such as the RunLength and the Fisher-Vector descriptors, (2) deep features based on Convolutional Neural Networks, and (3) features extracted from…

    Submitted 2 December, 2016; v1 submitted 3 March, 2016; originally announced March 2016.

  14. arXiv:1509.06243

    cs.CV

    LEWIS: Latent Embeddings for Word Images and their Semantics

    Authors: Albert Gordo, Jon Almazan, Naila Murray, Florent Perronnin

    Abstract: The goal of this work is to bring semantics into the tasks of text recognition and retrieval in natural images. Although text recognition and retrieval have received a lot of attention in recent years, previous works have focused on recognizing or retrieving exactly the same word used as a query, without taking the semantics into consideration. In this paper, we ask the following question: …

    Submitted 21 September, 2015; originally announced September 2015.

    Comments: Accepted for publication at the International Conference on Computer Vision (ICCV) 2015

  15. arXiv:1507.06429

    cs.CV

    Deep Fishing: Gradient Features from Deep Nets

    Authors: Albert Gordo, Adrien Gaidon, Florent Perronnin

    Abstract: Convolutional Networks (ConvNets) have recently improved image recognition performance thanks to end-to-end learning of deep feed-forward models from raw pixels. Deep learning is a marked departure from the previous state of the art, the Fisher Vector (FV), which relied on gradient-based encoding of local hand-crafted features. In this paper, we discuss a novel connection between these two approac…

    Submitted 23 July, 2015; originally announced July 2015.

    Comments: To appear at BMVC 2015

  16. arXiv:1410.5224

    cs.CV

    Supervised mid-level features for word image representation

    Authors: Albert Gordo

    Abstract: This paper addresses the problem of learning word image representations: given the cropped image of a word, we are interested in finding a descriptive, robust, and compact fixed-length representation. Machine learning techniques can then be supplied with these representations to produce models useful for word retrieval or recognition tasks. Although many works have focused on the machine learning…

    Submitted 14 November, 2014; v1 submitted 20 October, 2014; originally announced October 2014.