Showing 1–15 of 15 results for author: López-Sastre, R J

  1. arXiv:2406.14206

    cs.CV

    Live Video Captioning

    Authors: Eduardo Blanco-Fernández, Carlos Gutiérrez-Álvarez, Nadia Nasri, Saturnino Maldonado-Bascón, Roberto J. López-Sastre

    Abstract: Dense video captioning is the task that involves the detection and description of events within video sequences. While traditional approaches focus on offline solutions where the entire video of analysis is available for the captioning model, in this work we introduce a paradigm shift towards Live Video Captioning (LVC). In LVC, dense video captioning models must generate captions for video stream…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2404.07729

    cs.LG cs.CV

    Realistic Continual Learning Approach using Pre-trained Models

    Authors: Nadia Nasri, Carlos Gutiérrez-Álvarez, Sergio Lafuente-Arroyo, Saturnino Maldonado-Bascón, Roberto J. López-Sastre

    Abstract: Continual learning (CL) is crucial for evaluating adaptability in learning solutions to retain knowledge. Our research addresses the challenge of catastrophic forgetting, where models lose proficiency in previously learned tasks as they acquire new ones. While numerous solutions have been proposed, existing experimental setups often rely on idealized class-incremental learning scenarios. We introd…

    Submitted 11 April, 2024; originally announced April 2024.

  3. arXiv:2311.16623

    cs.RO cs.CV

    Visual Semantic Navigation with Real Robots

    Authors: Carlos Gutiérrez-Álvarez, Pablo Ríos-Navarro, Rafael Flor-Rodríguez, Francisco Javier Acevedo-Rodríguez, Roberto J. López-Sastre

    Abstract: Visual Semantic Navigation (VSN) is the ability of a robot to learn visual semantic information for navigating in unseen environments. These VSN models are typically tested in those virtual environments where they are trained, mainly using reinforcement learning based approaches. Therefore, we do not yet have an in-depth analysis of how these models would behave in the real world. In this work, we…

    Submitted 28 November, 2023; originally announced November 2023.

  4. arXiv:2003.12041

    cs.CV

    Rethinking Online Action Detection in Untrimmed Videos: A Novel Online Evaluation Protocol

    Authors: Marcos Baptista Rios, Roberto J. López-Sastre, Fabian Caba Heilbron, Jan van Gemert, F. Javier Acevedo-Rodríguez, S. Maldonado-Bascón

    Abstract: The Online Action Detection (OAD) problem needs to be revisited. Unlike traditional offline action detection approaches, where the evaluation metrics are clear and well established, in the OAD setting we find very few works and no consensus on the evaluation protocols to be used. In this work we propose to rethink the OAD scenario, clearly defining the problem itself and the main characteristics t…

    Submitted 26 March, 2020; originally announced March 2020.

    Comments: Published at IEEE Access journal

  5. arXiv:2003.09970

    cs.CV

    The Instantaneous Accuracy: a Novel Metric for the Problem of Online Human Behaviour Recognition in Untrimmed Videos

    Authors: Marcos Baptista Rios, Roberto J. López-Sastre, Fabian Caba Heilbron, Jan van Gemert, Francisco Javier Acevedo-Rodríguez, Saturnino Maldonado-Bascón

    Abstract: The problem of Online Human Behaviour Recognition in untrimmed videos, aka Online Action Detection (OAD), needs to be revisited. Unlike traditional offline action detection approaches, where the evaluation metrics are clear and well established, in the OAD setting we find few works and no consensus on the evaluation protocols to be used. In this paper we introduce a novel online metric, the Instan…

    Submitted 25 March, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

    Comments: Published at ICCV 2019 workshop: Human Behaviour Understanding

  6. arXiv:1904.08241

    cs.CV

    Deep Anomaly Detection for Generalized Face Anti-Spoofing

    Authors: Daniel Pérez-Cabo, David Jiménez-Cabello, Artur Costa-Pazo, Roberto J. López-Sastre

    Abstract: Face recognition has achieved unprecedented results, surpassing human capabilities in certain scenarios. However, these automatic solutions are not ready for production because they can be easily fooled by simple identity impersonation attacks. And although much effort has been devoted to develop face anti-spoofing models, their generalization capacity still remains a challenge in real scenarios.…

    Submitted 17 April, 2019; originally announced April 2019.

    Comments: To appear at CVPR19 (workshop)

  7. arXiv:1904.06213

    cs.CV

    Generalized Presentation Attack Detection: a face anti-spoofing evaluation proposal

    Authors: Artur Costa-Pazo, David Jimenez-Cabello, Esteban Vazquez-Fernandez, Jose L. Alba-Castro, Roberto J. López-Sastre

    Abstract: Over the past few years, Presentation Attack Detection (PAD) has become a fundamental part of facial recognition systems. Although much effort has been devoted to anti-spoofing research, generalization in real scenarios remains a challenge. In this paper we present a new open-source evaluation framework to study the generalization capacity of face PAD methods, coined here as face-GPAD. This framew…

    Submitted 12 April, 2019; originally announced April 2019.

    Comments: 8 pages, to appear at International Conference on Biometrics (ICB19)

  8. arXiv:1810.07420

    cs.CV

    Embarrassingly Simple Model for Early Action Proposal

    Authors: Marcos Baptista-Ríos, Roberto J. López-Sastre, Francisco Javier Acevedo-Rodríguez, Saturnino Maldonado-Bascón

    Abstract: Early action proposal consists in generating high quality candidate temporal segments that are likely to contain an action in a video stream, as soon as they happen. Many sophisticated approaches have been proposed for the action proposal problem but from the off-line perspective. On the contrary, we focus on the on-line version of the problem, proposing a simple classifier-based model, using stan…

    Submitted 18 October, 2018; v1 submitted 17 October, 2018; originally announced October 2018.

    Comments: Published in the Anticipating Human Behavior Workshop, ECCV 2018

  9. arXiv:1810.05016

    cs.CV

    ISA$^2$: Intelligent Speed Adaptation from Appearance

    Authors: Carlos Herranz-Perdiguero, Roberto J. López-Sastre

    Abstract: In this work we introduce a new problem named Intelligent Speed Adaptation from Appearance (ISA$^2$). Technically, the goal of an ISA$^2$ model is to predict for a given image of a driving scenario the proper speed of the vehicle. Note this problem is different from predicting the actual speed of the vehicle. It defines a novel regression problem where the appearance information has to be directly…

    Submitted 11 October, 2018; originally announced October 2018.

    Comments: IROS 2018 Workshop: 10th Planning, Perception and Navigation for Intelligent Vehicles (PPNIV'18)

  10. arXiv:1807.07284

    cs.CV cs.RO

    In pixels we trust: From Pixel Labeling to Object Localization and Scene Categorization

    Authors: Carlos Herranz-Perdiguero, Carolina Redondo-Cabrera, Roberto J. López-Sastre

    Abstract: While there has been significant progress in solving the problems of image pixel labeling, object detection and scene classification, existing approaches normally address them separately. In this paper, we propose to tackle these problems from a bottom-up perspective, where we simply need a semantic segmentation of the scene as input. We employ the DeepLab architecture, based on the ResNet deep ne…

    Submitted 19 July, 2018; originally announced July 2018.

    Comments: IROS 2018

  11. arXiv:1805.02919

    cs.CV

    Learning Short-Cut Connections for Object Counting

    Authors: Daniel Oñoro-Rubio, Mathias Niepert, Roberto J. López-Sastre

    Abstract: Object counting is an important task in computer vision due to its growing demand in applications such as traffic monitoring or surveillance. In this paper, we consider object counting as a learning problem of a joint feature extraction and pixel-wise object density estimation with Convolutional-Deconvolutional networks. We introduce a novel counting model, named Gated U-Net (GU-Net). Specifically…

    Submitted 15 November, 2018; v1 submitted 8 May, 2018; originally announced May 2018.

  12. Learning to Exploit the Prior Network Knowledge for Weakly-Supervised Semantic Segmentation

    Authors: Carolina Redondo-Cabrera, Marcos Baptista-Ríos, Roberto J. López-Sastre

    Abstract: Training a Convolutional Neural Network (CNN) for semantic segmentation typically requires to collect a large amount of accurate pixel-level annotations, a hard and expensive task. In contrast, simple image tags are easier to gather. With this paper we introduce a novel weakly-supervised semantic segmentation model able to learn from image labels, and just image labels. Our model uses the prior kn…

    Submitted 22 February, 2019; v1 submitted 13 April, 2018; originally announced April 2018.

    Journal ref: IEEE Transactions on Image Processing, 2019

  13. The challenge of simultaneous object detection and pose estimation: a comparative study

    Authors: Daniel Oñoro-Rubio, Roberto J. López-Sastre, Carolina Redondo-Cabrera, Pedro Gil-Jiménez

    Abstract: Detecting objects and estimating their pose remains as one of the major challenges of the computer vision research community. There exists a compromise between localizing the objects and estimating their viewpoints. The detector ideally needs to be view-invariant, while the pose estimation process should be able to generalize towards the category-level. This work is an exploration of using deep le…

    Submitted 24 January, 2018; originally announced January 2018.

    Journal ref: Image and Vision Computing, 2018

  14. Unsupervised learning from videos using temporal coherency deep networks

    Authors: Carolina Redondo-Cabrera, Roberto J. López-Sastre

    Abstract: In this work we address the challenging problem of unsupervised learning from videos. Existing methods utilize the spatio-temporal continuity in contiguous video frames as regularization for the learning process. Typically, this temporal coherence of close frames is used as a free form of annotation, encouraging the learned representations to exhibit small differences between these frames. But thi…

    Submitted 11 October, 2018; v1 submitted 24 January, 2018; originally announced January 2018.

    Journal ref: Computer Vision and Image Understanding, 2018

  15. arXiv:1709.02314

    cs.LG cs.AI

    Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs

    Authors: Daniel Oñoro-Rubio, Mathias Niepert, Alberto García-Durán, Roberto González, Roberto J. López-Sastre

    Abstract: A visual-relational knowledge graph (KG) is a multi-relational graph whose entities are associated with images. We explore novel machine learning approaches for answering visual-relational queries in web-extracted knowledge graphs. To this end, we have created ImageGraph, a KG with 1,330 relation types, 14,870 entities, and 829,931 images crawled from the web. With visual-relational KGs such as Im…

    Submitted 3 May, 2019; v1 submitted 7 September, 2017; originally announced September 2017.

    Journal ref: AKBC2019