Showing 1–24 of 24 results for author: Smeulders, A W M

Searching in archive cs.
  1. arXiv:2010.14438  [pdf, other]

    cs.CV

    Structured Visual Search via Composition-aware Learning

    Authors: Mert Kilickaya, Arnold W. M. Smeulders

    Abstract: This paper studies visual search using structured queries. The structure is in the form of a 2D composition that encodes the position and the category of the objects. The transformation of the position and the category of the objects leads to a continuous-valued relationship between visual compositions, which carries highly beneficial information, although not leveraged by previous techniques. To…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted at WACV'21 (Algorithms Track)

  2. arXiv:2006.16571  [pdf, other]

    cs.CV cs.LG eess.IV

    Tackling Occlusion in Siamese Tracking with Structured Dropouts

    Authors: Deepak K. Gupta, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: Occlusion is one of the most difficult challenges in object tracking to model. This is because unlike other challenges, where data augmentation can be of help, occlusion is hard to simulate as the occluding object can be anything in any shape. In this paper, we propose a simple solution to simulate the effects of occlusion in the latent space. Specifically, we present structured dropout to mimick…

    Submitted 30 June, 2020; originally announced June 2020.

    Comments: Accepted at ICPR2020

  3. arXiv:2003.08275  [pdf, other]

    cs.CV

    PIC: Permutation Invariant Convolution for Recognizing Long-range Activities

    Authors: Noureldien Hussein, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: Neural operations as convolutions, self-attention, and vector aggregation are the go-to choices for recognizing short-range actions. However, they have three limitations in modeling long-range activities. This paper presents PIC, Permutation Invariant Convolution, a novel neural layer to model the temporal structure of long-range activities. It has three desirable properties. i. Unlike standard co…

    Submitted 18 March, 2020; originally announced March 2020.

  4. arXiv:2003.05065  [pdf, other]

    cs.CV

    Cloth in the Wind: A Case Study of Physical Measurement through Simulation

    Authors: Tom F. H. Runia, Kirill Gavrilyuk, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: For many of the physical phenomena around us, we have developed sophisticated models explaining their behavior. Nevertheless, measuring physical properties from visual observations is challenging due to the high number of causally underlying physical parameters -- including material properties and external forces. In this paper, we propose to measure latent physical properties for cloth in the win…

    Submitted 9 March, 2020; originally announced March 2020.

    Comments: CVPR 2020. arXiv admin note: substantial text overlap with arXiv:1910.07861

  5. arXiv:1910.07861  [pdf, other]

    cs.CV

    Go with the Flow: Perception-refined Physics Simulation

    Authors: Tom F. H. Runia, Kirill Gavrilyuk, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: For many of the physical phenomena around us, we have developed sophisticated models explaining their behavior. Nevertheless, inferring specifics from visual observations is challenging due to the high number of causally underlying physical parameters -- including material properties and external forces. This paper addresses the problem of inferring such latent physical properties from observation…

    Submitted 17 October, 2019; originally announced October 2019.

  6. arXiv:1908.01603  [pdf, other]

    cs.CV

    Model Decay in Long-Term Tracking

    Authors: Efstratios Gavves, Ran Tao, Deepak K. Gupta, Arnold W. M. Smeulders

    Abstract: Updating the tracker model with adverse bounding box predictions adds an unavoidable bias term to the learning. This bias term, which we refer to as model decay, offsets the learning and causes tracking drift. While its adverse effect might not be visible in short-term tracking, accumulation of this bias over a long-term can eventually lead to a permanent loss of the target. In this paper, we look…

    Submitted 5 August, 2019; originally announced August 2019.

  7. arXiv:1905.05143  [pdf, other]

    cs.CV

    VideoGraph: Recognizing Minutes-Long Human Activities in Videos

    Authors: Noureldien Hussein, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: Many human activities take minutes to unfold. To represent them, related works opt for statistical pooling, which neglects the temporal structure. Others opt for convolutional methods, as CNN and Non-Local. While successful in learning temporal concepts, they are short of modeling minutes-long temporal dependencies. We propose VideoGraph, a method to achieve the best of two worlds: represent minut…

    Submitted 13 October, 2019; v1 submitted 13 May, 2019; originally announced May 2019.

    Journal ref: ICCV 2019, Workshop on Scene Graph Representation and Learning

  8. arXiv:1904.01421  [pdf, other]

    cs.CV

    Cooperative Embeddings for Instance, Attribute and Category Retrieval

    Authors: William Thong, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: The goal of this paper is to retrieve an image based on instance, attribute and category similarity notions. Different from existing works, which usually address only one of these entities in isolation, we introduce a cooperative embedding to integrate them while preserving their specific level of semantic representation. An algebraic structure defines a superspace filled with instances. Attribute…

    Submitted 2 April, 2019; originally announced April 2019.

  9. arXiv:1812.01289  [pdf, other]

    cs.CV

    Timeception for Complex Action Recognition

    Authors: Noureldien Hussein, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: This paper focuses on the temporal aspect for recognizing human activities in videos; an important visual cue that has long been undervalued. We revisit the conventional definition of activity and restrict it to Complex Action: a set of one-actions with a weak temporal pattern that serves a specific purpose. Related works use spatiotemporal 3D convolutions with fixed kernel size, too rigid to capt…

    Submitted 28 April, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

    Comments: IEEE CVPR 2019 (Oral)

  10. arXiv:1809.00720  [pdf, other]

    cs.CV

    Estimating Small Differences in Car-Pose from Orbits

    Authors: Berkay Kicanaoglu, Ran Tao, Arnold W. M. Smeulders

    Abstract: Distinction among nearby poses and among symmetries of an object is challenging. In this paper, we propose a unified, group-theoretic approach to tackle both. Different from existing works which directly predict absolute pose, our method measures the pose of an object relative to another pose, i.e., the pose difference. The proposed method generates the complete orbit of an object from a single vi…

    Submitted 3 September, 2018; originally announced September 2018.

    Comments: to appear in BMVC2018

  11. arXiv:1806.06984  [pdf, other]

    cs.CV

    Repetition Estimation

    Authors: Tom F. H. Runia, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: Visual repetition is ubiquitous in our world. It appears in human activity (sports, cooking), animal behavior (a bee's waggle dance), natural phenomena (leaves in the wind) and in urban environments (flashing lights). Estimating visual repetition from realistic video is challenging as periodic motion is rarely perfectly static and stationary. To better deal with realistic video, we elevate the sta…

    Submitted 18 June, 2018; originally announced June 2018.

  12. arXiv:1803.06962  [pdf, other]

    cs.CV

    Featureless: Bypassing feature extraction in action categorization

    Authors: Silvia L. Pintea, Pascal S. Mettes, Jan C. van Gemert, Arnold W. M. Smeulders

    Abstract: This method introduces an efficient manner of learning action categories without the need of feature estimation. The approach starts from low-level values, in a similar style to the successful CNN methods. However, rather than extracting general image features, we learn to predict specific video representations from raw video data. The benefit of such an approach is that at the same computational…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: Published in the proceedings of the International Conference on Image Processing (ICIP), 2016

  13. arXiv:1803.06952  [pdf, other]

    cs.LG cs.CV stat.ML

    Asymmetric kernel in Gaussian Processes for learning target variance

    Authors: Silvia L. Pintea, Jan C. van Gemert, Arnold W. M. Smeulders

    Abstract: This work incorporates the multi-modality of the data distribution into a Gaussian Process regression model. We approach the problem from a discriminative perspective by learning, jointly over the training data, the target space variance in the neighborhood of a certain sample through metric learning. We start by using data centers rather than all training samples. Subsequently, each center select…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: Accepted in Pattern Recognition Letters, 2018

  14. arXiv:1803.06951  [pdf, other]

    cs.CV

    Deja Vu: Motion Prediction in Static Images

    Authors: Silvia L. Pintea, Jan C. van Gemert, Arnold W. M. Smeulders

    Abstract: This paper proposes motion prediction in single still images by learning it from a set of videos. The building assumption is that similar motion is characterized by similar appearance. The proposed method learns local motion patterns given a specific appearance and adds the predicted motion in a number of applications. This work (i) introduces a novel method to predict motion from appearance in a…

    Submitted 21 March, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

    Comments: Published in the proceedings of the European Conference on Computer Vision (ECCV), 2014

  15. arXiv:1802.09971  [pdf, other]

    cs.CV

    Real-World Repetition Estimation by Div, Grad and Curl

    Authors: Tom F. H. Runia, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: We consider the problem of estimating repetition in video, such as performing push-ups, cutting a melon or playing violin. Existing work shows good results under the assumption of static and stationary periodicity. As realistic video is rarely perfectly static and stationary, the often preferred Fourier-based measurements are inapt. Instead, we adopt the wavelet transform to better handle non-stati…

    Submitted 27 February, 2018; originally announced February 2018.

  16. arXiv:1711.10217  [pdf, other]

    cs.CV

    Tracking for Half an Hour

    Authors: Ran Tao, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: Long-term tracking requires extreme stability to the multitude of model updates and robustness to the disappearance and loss of the target as such will inevitably happen. For motivation, we have taken 10 randomly selected OTB-sequences, doubled each by attaching a reversed version and repeated each double sequence 20 times. On most of these repetitive videos, the best current tracker performs wors…

    Submitted 28 November, 2017; originally announced November 2017.

    Comments: tech report

  17. arXiv:1706.00598  [pdf, other]

    cs.CV stat.ML

    Dynamic Steerable Blocks in Deep Residual Networks

    Authors: Jörn-Henrik Jacobsen, Bert de Brabandere, Arnold W. M. Smeulders

    Abstract: Filters in convolutional networks are typically parameterized in a pixel basis, that does not take prior knowledge about the visual world into account. We investigate the generalized notion of frames designed with image properties in mind, as alternatives to this parametrization. We show that frame-based ResNets and Densenets can improve performance on Cifar-10+ consistently, while having addition…

    Submitted 19 July, 2017; v1 submitted 2 June, 2017; originally announced June 2017.

  18. arXiv:1705.02148  [pdf, other]

    cs.CV

    Unified Embedding and Metric Learning for Zero-Exemplar Event Detection

    Authors: Noureldien Hussein, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: Event detection in unconstrained videos is conceived as a content-based video retrieval with two modalities: textual and visual. Given a text describing a novel event, the goal is to rank related videos accordingly. This task is zero-exemplar, no video examples are given to the novel event. Related works train a bank of concept detectors on external data sources. These detectors predict confiden…

    Submitted 5 May, 2017; originally announced May 2017.

    Comments: IEEE CVPR 2017

  19. arXiv:1703.04140  [pdf, other]

    cs.LG stat.ML

    Multiscale Hierarchical Convolutional Networks

    Authors: Jörn-Henrik Jacobsen, Edouard Oyallon, Stéphane Mallat, Arnold W. M. Smeulders

    Abstract: Deep neural network algorithms are difficult to analyze because they lack structure allowing to understand the properties of underlying transforms and invariants. Multiscale hierarchical convolutional networks are structured deep convolutional networks where layers are indexed by progressively higher dimensional attributes, which are learned from training data. Each new layer is computed with mult…

    Submitted 12 March, 2017; originally announced March 2017.

  20. arXiv:1610.03708  [pdf, other]

    cs.CV cs.CL

    Generating captions without looking beyond objects

    Authors: Hendrik Heuer, Christof Monz, Arnold W. M. Smeulders

    Abstract: This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparative image caption generation performance by translating from a set of nouns to captions. This implies that in image captioning, all word categories other than nouns can be evoked by a powerful language model without sacrificing performance on n-gram precision. The paper…

    Submitted 18 October, 2016; v1 submitted 12 October, 2016; originally announced October 2016.

    Comments: This paper was presented at the ECCV2016 2nd Workshop on Storytelling with Images and Videos (VisStory)

  21. arXiv:1610.01801  [pdf, other]

    cs.CV

    Searching Scenes by Abstracting Things

    Authors: Svetlana Kordumova, Jan C. van Gemert, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: In this paper we propose to represent a scene as an abstraction of 'things'. We start from 'things' as generated by modern object proposals, and we investigate their immediately observable properties: position, size, aspect ratio and color, and those only. Where the recent successes and excitement of the field lie in object identification, we represent the scene composition independent of object i…

    Submitted 6 October, 2016; originally announced October 2016.

  22. arXiv:1605.07104  [pdf, other]

    cs.CV

    Generic Instance Search and Re-identification from One Example via Attributes and Categories

    Authors: Ran Tao, Arnold W. M. Smeulders, Shih-Fu Chang

    Abstract: This paper aims for generic instance search from one example where the instance can be an arbitrary object like shoes, not just near-planar and one-sided instances like buildings and logos. First, we evaluate state-of-the-art instance search methods on this problem. We observe that what works for buildings loses its generality on shoes. Second, we propose to use automatically learned category-spec…

    Submitted 23 May, 2016; originally announced May 2016.

    Comments: This technical report is an extended version of our previous conference paper 'Attributes and Categories for Generic Instance Search from One Example' (CVPR 2015)

  23. arXiv:1605.05863  [pdf, ps, other]

    cs.CV

    Siamese Instance Search for Tracking

    Authors: Ran Tao, Efstratios Gavves, Arnold W. M. Smeulders

    Abstract: In this paper we present a tracker, which is radically different from state-of-the-art trackers: we apply no model updating, no occlusion detection, no combination of trackers, no geometric matching, and still deliver state-of-the-art tracking performance, as demonstrated on the popular online tracking benchmark (OTB) and six very challenging YouTube videos. The presented tracker simply matches th…

    Submitted 19 May, 2016; originally announced May 2016.

    Comments: This paper is accepted to the IEEE Conference on Computer Vision and Pattern Recognition, 2016

  24. arXiv:1605.02971  [pdf, other]

    cs.CV

    Structured Receptive Fields in CNNs

    Authors: Jörn-Henrik Jacobsen, Jan van Gemert, Zhongyu Lou, Arnold W. M. Smeulders

    Abstract: Learning powerful feature representations with CNNs is hard when training data are limited. Pre-training is one way to overcome this, but it requires large datasets sufficiently similar to the target domain. Another option is to design priors into the model, which can range from tuned hyperparameters to fully engineered representations like Scattering Networks. We combine these ideas into structur…

    Submitted 13 May, 2016; v1 submitted 10 May, 2016; originally announced May 2016.

    Comments: Reason for update: i) Fix Reference for "Deep roto-translation scattering for object classification" by Oyallon and Mallat. ii) Fixed two minor typos. iii) Removed implicit assumption in equation (4) where scale is represented with diffusion time and adapted to rest of paper where scale is represented with standard deviation, to avoid possible confusion