Showing 1–5 of 5 results for author: Thoker, F M

  1. arXiv:2303.11003  [pdf, other]

    cs.CV cs.AI

    Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization

    Authors: Fida Mohammad Thoker, Hazel Doughty, Cees Snoek

    Abstract: We propose a self-supervised method for learning motion-focused video representations. Existing approaches minimize distances between temporally augmented videos, which maintain high spatial similarity. We instead propose to learn similarities between videos with identical local motion dynamics but an otherwise different appearance. We do so by adding synthetic motion trajectories to videos which…

    Submitted 28 September, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: Accepted in ICCV 2023

  2. arXiv:2203.14221  [pdf, other]

    cs.CV

    How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?

    Authors: Fida Mohammad Thoker, Hazel Doughty, Piyush Bagad, Cees Snoek

    Abstract: Despite the recent success of video self-supervised learning models, there is much still to be understood about their generalization capability. In this paper, we investigate how sensitive video self-supervised learning is to the current conventional benchmark and whether methods generalize beyond the canonical evaluation setting. We do this across four different factors of sensitivity: domain, sa…

    Submitted 30 July, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

    Comments: Accepted in ECCV 2022

  3. arXiv:2108.03656  [pdf, other]

    cs.CV

    Skeleton-Contrastive 3D Action Representation Learning

    Authors: Fida Mohammad Thoker, Hazel Doughty, Cees G. M. Snoek

    Abstract: This paper strives for self-supervised learning of a feature space suitable for skeleton-based action recognition. Our proposal is built upon learning invariances to input skeleton representations and various skeleton augmentations via a noise contrastive estimation. In particular, we propose inter-skeleton contrastive learning, which learns from multiple different input skeleton representations i…

    Submitted 8 August, 2021; originally announced August 2021.

    Comments: Accepted in ACM Multimedia 2021

  4. arXiv:2108.03329  [pdf, other]

    cs.CV

    Feature-Supervised Action Modality Transfer

    Authors: Fida Mohammad Thoker, Cees G. M. Snoek

    Abstract: This paper strives for action recognition and detection in video modalities like RGB, depth maps or 3D-skeleton sequences when only limited modality-specific labeled examples are available. For the RGB, and derived optical-flow, modality many large-scale labeled datasets have been made available. They have become the de facto pre-training choice when recognizing or detecting new actions from RGB d…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: IEEE International Conference on Pattern Recognition (ICPR), 2020

  5. arXiv:1910.04641  [pdf, other]

    cs.CV

    Cross-modal knowledge distillation for action recognition

    Authors: Fida Mohammad Thoker, Juergen Gall

    Abstract: In this work, we address the problem of how a network for action recognition that has been trained on a modality like RGB videos can be adapted to recognize actions for another modality like sequences of 3D human poses. To this end, we extract the knowledge of the trained teacher network for the source modality and transfer it to a small ensemble of student networks for the target modality. For the c…

    Submitted 10 October, 2019; originally announced October 2019.

    Comments: Published in: 2019 IEEE International Conference on Image Processing (ICIP)