Showing 1–6 of 6 results for author: Vahdani, E

  1. arXiv:2311.15916  [pdf, other]

    cs.CV

    ADM-Loc: Actionness Distribution Modeling for Point-supervised Temporal Action Localization

    Authors: Elahe Vahdani, Yingli Tian

    Abstract: This paper addresses the challenge of point-supervised temporal action detection, in which only one frame per action instance is annotated in the training set. Self-training aims to provide supplementary supervision for the training process by generating pseudo-labels (action proposals) from a base model. However, most current methods generate action proposals by applying manually designed thresho…

    Submitted 27 November, 2023; originally announced November 2023.

  2. arXiv:2310.13585  [pdf, other]

    cs.CV

    POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization

    Authors: Elahe Vahdani, Yingli Tian

    Abstract: This paper tackles the challenge of point-supervised temporal action detection, wherein only a single frame is annotated for each action instance in the training set. Most of the current methods, hindered by the sparse nature of annotated points, struggle to effectively represent the continuous structure of actions or the inherent temporal and semantic dependencies within action instances. Consequ…

    Submitted 5 June, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

  3. arXiv:2110.00111  [pdf, other]

    cs.CV

    Deep Learning-based Action Detection in Untrimmed Videos: A Survey

    Authors: Elahe Vahdani, Yingli Tian

    Abstract: Understanding human behavior and activity facilitates advancement of numerous real-world applications, and is critical for video analysis. Despite the progress of action recognition algorithms in trimmed videos, the majority of real-world videos are lengthy and untrimmed with sparse segments of interest. The task of temporal activity detection in untrimmed videos aims to localize the temporal boun…

    Submitted 30 September, 2021; originally announced October 2021.

  4. arXiv:2008.03561  [pdf, other]

    cs.CV

    Cross-modal Center Loss

    Authors: Longlong Jing, Elahe Vahdani, Jiaxing Tan, Yingli Tian

    Abstract: Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities. Unlike the existing methods which usually learn from the features extracted by offline networks, in this paper, we propose an approach to jointly train the components of cross-modal retrieval framework with metadata, and enable the network to find optimal features. The proposed end-t…

    Submitted 8 August, 2020; originally announced August 2020.

  5. arXiv:2005.00253  [pdf, other]

    cs.CV

    Recognizing American Sign Language Nonmanual Signal Grammar Errors in Continuous Videos

    Authors: Elahe Vahdani, Longlong Jing, Yingli Tian, Matt Huenerfauth

    Abstract: As part of the development of an educational tool that can help students achieve fluency in American Sign Language (ASL) through independent and interactive practice with immediate feedback, this paper introduces a near real-time system to recognize grammatical errors in continuous signing videos without necessarily identifying the entire sequence of signs. Our system automatically recognizes if p…

    Submitted 1 May, 2020; originally announced May 2020.

  6. arXiv:1906.02851  [pdf, other]

    cs.CV

    Recognizing American Sign Language Manual Signs from RGB-D Videos

    Authors: Longlong Jing, Elahe Vahdani, Matt Huenerfauth, Yingli Tian

    Abstract: In this paper, we propose a 3D Convolutional Neural Network (3DCNN) based multi-stream framework to recognize American Sign Language (ASL) manual signs (consisting of movements of the hands, as well as non-manual face movements in some cases) in real-time from RGB-D videos, by fusing multimodality features including hand gestures, facial expressions, and body poses from multi-channels (RGB, depth,…

    Submitted 6 June, 2019; originally announced June 2019.