Showing 101–119 of 119 results for author: Snoek, C

  1. arXiv:1710.08011  [pdf, other]

    cs.CV

    ActivityNet Challenge 2017 Summary

    Authors: Bernard Ghanem, Juan Carlos Niebles, Cees Snoek, Fabian Caba Heilbron, Humam Alwassel, Ranjay Krishna, Victor Escorcia, Kenji Hata, Shyamal Buch

    Abstract: The ActivityNet Large Scale Activity Recognition Challenge 2017 Summary: results and challenge participants' papers.

    Submitted 22 October, 2017; originally announced October 2017.

    Comments: 76 pages

  2. Predicting Visual Features from Text for Image and Video Caption Retrieval

    Authors: Jianfeng Dong, Xirong Li, Cees G. M. Snoek

    Abstract: This paper strives to find amidst a set of sentences the one best describing the content of a given image or video. Different from existing works, which rely on a joint subspace for their image and video caption retrieval, we propose to do so in a visual space exclusively. Apart from this conceptual novelty, we contribute Word2VisualVec, a deep neural network architecture that learns to pre…

    Submitted 14 July, 2018; v1 submitted 5 September, 2017; originally announced September 2017.

    Comments: Accepted by Transactions on Multimedia. Code is available at https://github.com/danieljf24/w2vv

  3. arXiv:1707.09145  [pdf, other]

    cs.CV

    Spatial-Aware Object Embeddings for Zero-Shot Localization and Classification of Actions

    Authors: Pascal Mettes, Cees G. M. Snoek

    Abstract: We aim for zero-shot localization and classification of human actions in video. Where traditional approaches rely on global attribute or object classification scores for their zero-shot knowledge transfer, our main contribution is a spatial-aware object embedding. To arrive at spatial awareness, we build our embedding on top of freely available actor and object detectors. Relevance of objects is d…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: ICCV

    Report number: ICCV/2017/10

  4. arXiv:1707.09143  [pdf, other]

    cs.CV

    Localizing Actions from Video Labels and Pseudo-Annotations

    Authors: Pascal Mettes, Cees G. M. Snoek, Shih-Fu Chang

    Abstract: The goal of this paper is to determine the spatio-temporal location of actions in video. Where training from hard-to-obtain box annotations is the norm, we propose an intuitive and effective algorithm that localizes actions from their class label only. We are inspired by recent work showing that unsupervised action proposals selected with human point-supervision perform as well as using expensive…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: BMVC

    Report number: BMVC/2017/09

  5. arXiv:1612.06753  [pdf, other]

    cs.IR cs.MM

    Video Stream Retrieval of Unseen Queries using Semantic Memory

    Authors: Spencer Cappallo, Thomas Mensink, Cees G. M. Snoek

    Abstract: Retrieval of live, user-broadcast video streams is an under-addressed and increasingly relevant challenge. The on-line nature of the problem requires temporal evaluation and the unforeseeable scope of potential queries motivates an approach which can accommodate arbitrary search queries. To account for the breadth of possible queries, we adopt a no-example approach to query retrieval, which uses a…

    Submitted 20 December, 2016; originally announced December 2016.

    Comments: Presented at BMVC 2016, British Machine Vision Conference, 2016

  6. arXiv:1610.01801  [pdf, other]

    cs.CV

    Searching Scenes by Abstracting Things

    Authors: Svetlana Kordumova, Jan C. van Gemert, Cees G. M. Snoek, Arnold W. M. Smeulders

    Abstract: In this paper we propose to represent a scene as an abstraction of 'things'. We start from 'things' as generated by modern object proposals, and we investigate their immediately observable properties: position, size, aspect ratio and color, and those only. Where the recent successes and excitement of the field lie in object identification, we represent the scene composition independent of object i…

    Submitted 6 October, 2016; originally announced October 2016.

  7. arXiv:1607.02003  [pdf, other]

    cs.CV

    Tubelets: Unsupervised action proposals from spatiotemporal super-voxels

    Authors: Mihir Jain, Jan van Gemert, Hervé Jégou, Patrick Bouthemy, Cees G. M. Snoek

    Abstract: This paper considers the problem of localizing actions in videos as sequences of bounding boxes. The objective is to generate action proposals that are likely to include the action of interest, ideally achieving high recall with few proposals. Our contributions are threefold. First, inspired by selective search for object proposals, we introduce an approach to generate action proposals from spat…

    Submitted 7 July, 2016; originally announced July 2016.

    Comments: submitted to International Journal of Computer Vision

  8. arXiv:1607.01794  [pdf, other]

    cs.CV

    VideoLSTM Convolves, Attends and Flows for Action Recognition

    Authors: Zhenyang Li, Efstratios Gavves, Mihir Jain, Cees G. M. Snoek

    Abstract: We present a new architecture for end-to-end sequence learning of actions in video, which we call VideoLSTM. Rather than adapting the video to the peculiarities of established recurrent or convolutional architectures, we adapt the architecture to fit the requirements of the video medium. Starting from the soft-Attention LSTM, VideoLSTM makes three novel contributions. First, video has a spatial layout.…

    Submitted 6 July, 2016; originally announced July 2016.

  9. arXiv:1604.07602  [pdf, other]

    cs.CV

    Spot On: Action Localization from Pointly-Supervised Proposals

    Authors: Pascal Mettes, Jan C. van Gemert, Cees G. M. Snoek

    Abstract: We strive for spatio-temporal localization of actions in videos. The state-of-the-art relies on action proposals at test time and selects the best one with a classifier trained on carefully annotated box annotations. Annotating action boxes in video is cumbersome, tedious, and error prone. Rather than annotating boxes, we propose to annotate actions in video with points on a sparse subset of frame…

    Submitted 25 July, 2016; v1 submitted 26 April, 2016; originally announced April 2016.

    Report number: ECCV/2016/10

  10. arXiv:1604.06838  [pdf, other]

    cs.CV

    Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction

    Authors: Jianfeng Dong, Xirong Li, Cees G. M. Snoek

    Abstract: This paper strives to find the sentence best describing the content of an image or video. Different from existing works, which rely on a joint subspace for image / video to sentence matching, we propose to do so in a visual space only. We contribute Word2VisualVec, a deep neural network architecture that learns to predict a deep visual encoding of textual input based on sentence vectorization and…

    Submitted 25 November, 2016; v1 submitted 22 April, 2016; originally announced April 2016.

  11. arXiv:1604.06506  [pdf, other]

    cs.CV

    Online Action Detection

    Authors: Roeland De Geest, Efstratios Gavves, Amir Ghodrati, Zhenyang Li, Cees Snoek, Tinne Tuytelaars

    Abstract: In online action detection, the goal is to detect the start of an action in a video stream as soon as it happens. For instance, if a child is chasing a ball, an autonomous car should recognize what is going on and respond immediately. This is a very challenging problem for four reasons. First, only partial actions are observed. Second, there is a large variability in negative data. Third, the star…

    Submitted 30 August, 2016; v1 submitted 21 April, 2016; originally announced April 2016.

    Comments: Project page: http://homes.esat.kuleuven.be/~rdegeest/OnlineActionDetection.html

  12. The ImageNet Shuffle: Reorganized Pre-training for Video Event Detection

    Authors: Pascal Mettes, Dennis C. Koelma, Cees G. M. Snoek

    Abstract: This paper strives for video event detection using a representation learned from deep convolutional neural networks. Different from the leading approaches, which all learn from the 1,000 classes defined in the ImageNet Large Scale Visual Recognition Challenge, we investigate how to leverage the complete ImageNet hierarchy for pre-training deep networks. To deal with the problems of over-specific cla…

    Submitted 23 February, 2016; originally announced February 2016.

    Report number: ICMR/2016/06

  13. arXiv:1511.02492  [pdf, other]

    cs.CV cs.MM

    VideoStory Embeddings Recognize Events when Examples are Scarce

    Authors: Amirhossein Habibian, Thomas Mensink, Cees G. M. Snoek

    Abstract: This paper aims for event recognition when video examples are scarce or even completely absent. The key in such a challenging setting is a semantic video representation. Rather than building the representation from individual attribute detectors and their annotations, we propose to learn the entire representation from freely available web videos and their descriptions using an embedding between vi…

    Submitted 8 November, 2015; originally announced November 2015.

  14. arXiv:1510.06939  [pdf, other]

    cs.CV

    Objects2action: Classifying and localizing actions without any video example

    Authors: Mihir Jain, Jan C. van Gemert, Thomas Mensink, Cees G. M. Snoek

    Abstract: The goal of this paper is to recognize actions in video without the need for examples. Different from traditional zero-shot approaches we do not demand the design and specification of attribute classifiers and class-to-attribute mappings to allow for transfer from seen classes to unseen classes. Our key contribution is objects2action, a semantic word embedding that is spanned by a skip-gram model…

    Submitted 23 October, 2015; originally announced October 2015.

  15. arXiv:1510.04908  [pdf, other]

    cs.CV

    No Spare Parts: Sharing Part Detectors for Image Categorization

    Authors: Pascal Mettes, Jan C. van Gemert, Cees G. M. Snoek

    Abstract: This work aims for image categorization using a representation of distinctive parts. Different from existing part-based work, we argue that parts are naturally shared between image categories and should be modeled as such. We motivate our approach with a quantitative and qualitative analysis by backtracking where selected parts come from. Our analysis shows that in addition to the category parts d…

    Submitted 12 July, 2016; v1 submitted 16 October, 2015; originally announced October 2015.

  16. arXiv:1510.02899  [pdf, ps, other]

    cs.CV cs.MM

    TagBook: A Semantic Video Representation without Supervision for Event Detection

    Authors: Masoud Mazloom, Xirong Li, Cees G. M. Snoek

    Abstract: We consider the problem of event detection in video for scenarios where only few, or even zero examples are available for training. For this challenging setting, the prevailing solutions in the literature rely on a semantic video representation obtained from thousands of pre-trained concept detectors. Different from existing work, we propose a new semantic video representation that is based on fre…

    Submitted 23 April, 2016; v1 submitted 10 October, 2015; originally announced October 2015.

    Comments: accepted for publication as a regular paper in the IEEE Transactions on Multimedia

  17. arXiv:1510.01544  [pdf, other]

    cs.CV

    Active Transfer Learning with Zero-Shot Priors: Reusing Past Datasets for Future Tasks

    Authors: Efstratios Gavves, Thomas Mensink, Tatiana Tommasi, Cees G. M. Snoek, Tinne Tuytelaars

    Abstract: How can we reuse existing knowledge, in the form of available datasets, when solving a new and apparently unrelated target task from a set of unlabeled data? In this work we make a first contribution to answer this question in the context of image classification. We frame this quest as an active learning problem and use zero-shot classifiers to guide the learning process by linking the new task to…

    Submitted 6 October, 2015; originally announced October 2015.

  18. arXiv:1503.08248  [pdf, other]

    cs.IR cs.CV cs.MM cs.SI

    Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

    Authors: Xirong Li, Tiberio Uricchio, Lamberto Ballan, Marco Bertini, Cees G. M. Snoek, Alberto Del Bimbo

    Abstract: Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems, i.e., image tag assignment, refinement, and tag-based image retrieval is presented. While existing works vary in terms of their targeted tasks and methodology, t…

    Submitted 23 March, 2016; v1 submitted 27 March, 2015; originally announced March 2015.

    Comments: to appear in ACM Computing Surveys

    ACM Class: H.3.1; H.3.3

    Journal ref: ACM Computing Surveys, Volume 49 Issue 1, 14:1-14:39, June 2016

  19. Modelling and Refinement in CODA

    Authors: Michael Butler, John Colley, Andrew Edmunds, Colin Snook, Neil Evans, Neil Grant, Helen Marshall

    Abstract: This paper provides an overview of the CODA framework for modelling and refinement of component-based embedded systems. CODA is an extension of Event-B and UML-B and is supported by a plug-in for the Rodin toolset. CODA augments Event-B with constructs for component-based modelling including components, communications ports, port connectors, timed communications and timing triggers. Component beha…

    Submitted 27 May, 2013; originally announced May 2013.

    Comments: In Proceedings Refine 2013, arXiv:1305.5634

    Journal ref: EPTCS 115, 2013, pp. 36-51