Showing 51–66 of 66 results for author: Moreno-Noguer, F

  1. arXiv:1903.12161  [pdf, other]

    cs.CV

    Fast video object segmentation with Spatio-Temporal GANs

    Authors: Sergi Caelles, Albert Pumarola, Francesc Moreno-Noguer, Alberto Sanfeliu, Luc Van Gool

    Abstract: Learning descriptive spatio-temporal object models from data is paramount for the task of semi-supervised video object segmentation. Most existing approaches mainly rely on models that estimate the segmentation mask based on a reference mask at the first frame (aided sometimes by optical flow or the previous mask). These models, however, are prone to fail under rapid appearance changes or occlusio…

    Submitted 28 March, 2019; originally announced March 2019.

  2. arXiv:1812.05478  [pdf, other]

    cs.CV

    Human Motion Prediction via Spatio-Temporal Inpainting

    Authors: Alejandro Hernandez Ruiz, Juergen Gall, Francesc Moreno-Noguer

    Abstract: We propose a Generative Adversarial Network (GAN) to forecast 3D human motion given a sequence of past 3D skeleton poses. While recent GANs have shown promising results, they can only forecast plausible motion over relatively short periods of time (few hundred milliseconds) and typically ignore the absolute position of the skeleton w.r.t. the camera. Our scheme provides long term predictions (two…

    Submitted 28 October, 2019; v1 submitted 13 December, 2018; originally announced December 2018.

    Comments: 8 pages

  3. arXiv:1810.12738  [pdf, other]

    cs.CV

    Visual Re-ranking with Natural Language Understanding for Text Spotting

    Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

    Abstract: Many scene text recognition approaches are based on purely visual information and ignore the semantic relation between scene and text. In this paper, we tackle this problem from natural language processing perspective to fill the gap between language and vision. We propose a post-processing approach to improve scene text recognition accuracy by using occurrence probabilities of words (unigram lang…

    Submitted 29 October, 2018; originally announced October 2018.

    Comments: Accepted by ACCV 2018. arXiv admin note: substantial text overlap with arXiv:1810.09776

  4. arXiv:1810.09776  [pdf, other]

    cs.CV

    Visual Semantic Re-ranker for Text Spotting

    Authors: Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró

    Abstract: Many current state-of-the-art methods for text recognition are based on purely local information and ignore the semantic correlation between text and its surrounding visual context. In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene. We initially rely on an off-the-shelf deep neural network tha…

    Submitted 27 October, 2018; v1 submitted 23 October, 2018; originally announced October 2018.

  5. arXiv:1809.10305  [pdf, other]

    cs.CV

    Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View

    Authors: Albert Pumarola, Antonio Agudo, Lorenzo Porzi, Alberto Sanfeliu, Vincent Lepetit, Francesc Moreno-Noguer

    Abstract: We propose a method for predicting the 3D shape of a deformable surface from a single view. By contrast with previous approaches, we do not need a pre-registered template of the surface, and our method is robust to the lack of texture and partial occlusions. At the core of our approach is a geometry-aware deep architecture that tackles the problem as usually done in analytic solutions: first…

    Submitted 26 September, 2018; originally announced September 2018.

    Comments: Accepted at CVPR 2018

  6. arXiv:1809.10280  [pdf, other]

    cs.CV

    Unsupervised Person Image Synthesis in Arbitrary Poses

    Authors: Albert Pumarola, Antonio Agudo, Alberto Sanfeliu, Francesc Moreno-Noguer

    Abstract: We present a novel approach for synthesizing photo-realistic images of people in arbitrary poses using generative adversarial learning. Given an input image of a person and a desired pose represented by a 2D skeleton, our model renders the image of the same person under the new pose, synthesizing novel views of the parts visible in the input image and hallucinating those that are not seen. This pr…

    Submitted 26 September, 2018; originally announced September 2018.

    Comments: Accepted as Spotlight at CVPR 2018

  7. arXiv:1808.10542  [pdf, other]

    cs.CV

    Hallucinating Dense Optical Flow from Sparse Lidar for Autonomous Vehicles

    Authors: Victor Vaquero, Alberto Sanfeliu, Francesc Moreno-Noguer

    Abstract: In this paper we propose a novel approach to estimate dense optical flow from sparse lidar data acquired on an autonomous vehicle. This is intended to be used as a drop-in replacement of any image-based optical flow system when images are not reliable due to e.g. adverse weather conditions or at night. In order to infer high resolution 2D flows from discrete range data we devise a three-block arch…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: Accepted in ICPR 2018. More information: www.victorvaquero.me

  8. arXiv:1808.09526  [pdf, other]

    cs.CV stat.ML

    Deep Lidar CNN to Understand the Dynamics of Moving Vehicles

    Authors: Victor Vaquero, Alberto Sanfeliu, Francesc Moreno-Noguer

    Abstract: Perception technologies in Autonomous Driving are experiencing their golden age due to the advances in Deep Learning. Yet, most of these systems rely on the semantically rich information of RGB images. Deep Learning solutions applied to the data of other sensors typically mounted on autonomous cars (e.g. lidars or radars) are not explored much. In this paper we propose a novel solution to understa…

    Submitted 30 August, 2018; v1 submitted 28 August, 2018; originally announced August 2018.

    Comments: Presented in IEEE ICRA 2018. IEEE Copyrights: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses. (V2 just corrected comments on arxiv submission)

  9. Deconvolutional Networks for Point-Cloud Vehicle Detection and Tracking in Driving Scenarios

    Authors: Victor Vaquero, Ivan del Pino, Francesc Moreno-Noguer, Joan Solà, Alberto Sanfeliu, Juan Andrade-Cetto

    Abstract: Vehicle detection and tracking is a core ingredient for developing autonomous driving applications in urban scenarios. Recent image-based Deep Learning (DL) techniques are obtaining breakthrough results in these perceptive tasks. However, DL research has not yet advanced much towards processing 3D point clouds from lidar range-finders. These sensors are very common in autonomous vehicles since, de…

    Submitted 23 August, 2018; originally announced August 2018.

    Comments: Presented in IEEE ECMR 2017. IEEE Copyrights: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses

  10. Joint Coarse-And-Fine Reasoning for Deep Optical Flow

    Authors: Victor Vaquero, German Ros, Francesc Moreno-Noguer, Antonio M. Lopez, Alberto Sanfeliu

    Abstract: We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach…

    Submitted 22 August, 2018; originally announced August 2018.

    Comments: Accepted in IEEE ICIP 2017. IEEE Copyrights: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses

  11. arXiv:1807.09251  [pdf, other]

    cs.CV

    GANimation: Anatomically-aware Facial Animation from a Single Image

    Authors: Albert Pumarola, Antonio Agudo, Aleix M. Martinez, Alberto Sanfeliu, Francesc Moreno-Noguer

    Abstract: Recent advances in Generative Adversarial Networks (GANs) have shown impressive results for task of facial expression synthesis. The most successful architecture is StarGAN, that conditions GANs generation process with images of a specific domain, namely a set of images of persons sharing the same expression. While effective, this approach can only generate a discrete number of expressions, determ…

    Submitted 28 August, 2018; v1 submitted 24 July, 2018; originally announced July 2018.

    Comments: Accepted as oral at ECCV 2018. Code available at https://github.com/albertpumarola/GANimation. Added minor updates

  12. arXiv:1611.09010  [pdf, other]

    cs.CV

    3D Human Pose Estimation from a Single Image via Distance Matrix Regression

    Authors: Francesc Moreno-Noguer

    Abstract: This paper addresses the problem of 3D human pose estimation from a single image. We follow a standard two-step pipeline by first detecting the 2D position of the $N$ body joints, and then using these observations to infer 3D pose. For the first step, we use a recent CNN-based detector. For the second step, most existing approaches perform 2$N$-to-3$N$ regression of the Cartesian joint coordinates…

    Submitted 28 November, 2016; originally announced November 2016.

  13. arXiv:1603.07141  [pdf, other]

    cs.CV

    BreakingNews: Article Annotation by Image and Text Processing

    Authors: Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk

    Abstract: Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of computer vision and natural language processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual c…

    Submitted 23 March, 2016; originally announced March 2016.

  14. arXiv:1509.02130  [pdf, other]

    cs.CV

    Structured Prediction with Output Embeddings for Semantic Image Annotation

    Authors: Ariadna Quattoni, Arnau Ramisa, Pranava Swaroop Madhyastha, Edgar Simo-Serra, Francesc Moreno-Noguer

    Abstract: We address the task of annotating images with semantic tuples. Solving this problem requires an algorithm which is able to deal with hundreds of classes for each argument of the tuple. In such contexts, data sparsity becomes a key challenge, as there will be a large number of classes for which only a few examples are available. We propose handling this by incorporating feature representations of b…

    Submitted 7 September, 2015; originally announced September 2015.

    Comments: 11 pages

  15. arXiv:1503.02291  [pdf, other]

    cs.CV

    TED: A Tolerant Edit Distance for Segmentation Evaluation

    Authors: Jan Funke, Francesc Moreno-Noguer, Albert Cardona, Matthew Cook

    Abstract: In this paper, we present a novel error measure to compare a segmentation against ground truth. This measure, which we call Tolerant Edit Distance (TED), is motivated by two observations: (1) Some errors, like small boundary shifts, are tolerable in practice. Which errors are tolerable is application dependent and should be a parameter of the measure. (2) Non-tolerable errors have to be corrected…

    Submitted 1 February, 2016; v1 submitted 8 March, 2015; originally announced March 2015.

  16. arXiv:1412.6537  [pdf, other]

    cs.CV

    Fracking Deep Convolutional Image Descriptors

    Authors: Edgar Simo-Serra, Eduard Trulls, Luis Ferraz, Iasonas Kokkinos, Francesc Moreno-Noguer

    Abstract: In this paper we propose a novel framework for learning local image descriptors in a discriminative manner. For this purpose we explore a siamese architecture of Deep Convolutional Neural Networks (CNN), with a Hinge embedding loss on the L2 distance between descriptors. Since a siamese architecture uses pairs rather than single image patches to train, there exist a large number of positive sample…

    Submitted 25 February, 2015; v1 submitted 19 December, 2014; originally announced December 2014.