Showing 1–4 of 4 results for author: Almazán, E

  1. arXiv:2307.02912  [pdf, other]

    cs.CL cs.AI

    LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias

    Authors: Mario Almagro, Emilio Almazán, Diego Ortego, David Jiménez

    Abstract: Textual noise, such as typos or abbreviations, is a well-known issue that penalizes vanilla Transformers for most downstream tasks. We show that this is also the case for sentence similarity, a fundamental task in multiple domains, e.g. matching, retrieval or paraphrasing. Sentence similarity can be approached using cross-encoders, where the two sentences are concatenated in the input allowing the…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: KDD'23 conference (main research track). (*) These authors contributed equally

  2. arXiv:2207.02008  [pdf, other]

    cs.CL

    Block-SCL: Blocking Matters for Supervised Contrastive Learning in Product Matching

    Authors: Mario Almagro, David Jiménez, Diego Ortego, Emilio Almazán, Eva Martínez

    Abstract: Product matching is a fundamental step for the global understanding of consumer behavior in e-commerce. In practice, product matching refers to the task of deciding if two product offers from different data sources (e.g. retailers) represent the same product. Standard pipelines use a previous stage called blocking, where for a given product offer a set of potential matching candidates are retrieve…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: 7 pages, 2 figures, e-commerce, conference

  3. arXiv:2001.01788  [pdf, other]

    cs.CV

    MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection

    Authors: James H. Elder, Emilio J. Almazàn, Yiming Qian, Ron Tal

    Abstract: Traditional approaches to line segment detection typically involve perceptual grouping in the image domain and/or global accumulation in the Hough domain. Here we propose a probabilistic algorithm that merges the advantages of both approaches. In a first stage lines are detected using a global probabilistic Hough approach. In the second stage each detected line is analyzed in the image domain to l…

    Submitted 6 January, 2020; originally announced January 2020.

  4. arXiv:1905.10858  [pdf, other]

    cs.CV

    Integration of Text-maps in Convolutional Neural Networks for Region Detection among Different Textual Categories

    Authors: Roberto Arroyo, Javier Tovar, Francisco J. Delgado, Emilio J. Almazán, Diego G. Serrador, Antonio Hurtado

    Abstract: In this work, we propose a new technique that combines appearance and text in a Convolutional Neural Network (CNN), with the aim of detecting regions of different textual categories. We define a novel visual representation of the semantic meaning of text that allows a seamless integration in a standard CNN architecture. This representation, referred to as text-map, is integrated with the actual im…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR). Language and Vision Workshop 2019