-
Weakly-Supervised Completion Moment Detection using Temporal Attention
Authors:
Farnoosh Heidarivincheh,
Majid Mirmehdi,
Dima Damen
Abstract:
Monitoring the progression of an action towards completion offers fine-grained insight into the actor's behaviour. In this work, we target detecting the completion moment of actions, that is, the moment when the action's goal has been successfully accomplished. This has potential applications from surveillance to assistive living and human-robot interactions. Previous efforts required human annotations of the completion moment for training (i.e. full supervision). In this work, we present an approach for moment detection from weak video-level labels. Given both complete and incomplete sequences of the same action, we learn temporal attention, along with accumulated completion prediction from all frames in the sequence. We also demonstrate how the approach can be used when completion moment supervision is available. We evaluate and compare our approach on actions from three datasets, namely HMDB, UCF101 and RGBD-AC, and show that temporal attention improves detection in both weakly-supervised and fully-supervised settings.
Submitted 22 October, 2019;
originally announced October 2019.
-
Action Completion: A Temporal Model for Moment Detection
Authors:
Farnoosh Heidarivincheh,
Majid Mirmehdi,
Dima Damen
Abstract:
We introduce completion moment detection for actions: the problem of locating the moment of completion, when the action's goal is confidently considered achieved. The paper proposes a joint classification-regression recurrent model that predicts completion from a given frame, and then integrates frame-level contributions to detect the sequence-level completion moment. We introduce a recurrent voting node that predicts the position of the completion moment relative to the frame, by either classification or regression. The method is also capable of detecting incompletion. For example, it can detect a missed ball-catch, as well as the moment at which the ball is safely caught. We test the method on 16 actions from three public datasets, covering sports as well as daily actions. Results show that when combining contributions from frames prior to the completion moment as well as frames post completion, the completion moment is detected within one second in 89% of all tested sequences.
Submitted 23 July, 2018; v1 submitted 17 May, 2018;
originally announced May 2018.
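The abstract above describes integrating frame-level predictions of the completion moment's relative position into a single sequence-level detection. A minimal sketch of that idea follows, assuming each frame has already produced an offset (in frames) to the completion moment and that votes are combined by a simple majority. The function name `detect_completion_moment`, the hard-vote scheme, and the tie-breaking rule are illustrative assumptions, not the authors' model.

```python
from collections import Counter


def detect_completion_moment(relative_offsets):
    """Combine per-frame votes into one completion frame index.

    relative_offsets[t] is a hypothetical per-frame prediction: how many
    frames after (positive) or before (negative) frame t the completion
    moment lies. In practice these would come from a trained model.
    """
    if not relative_offsets:
        raise ValueError("need at least one frame-level prediction")

    n_frames = len(relative_offsets)
    votes = Counter()
    for t, offset in enumerate(relative_offsets):
        # Each frame votes for the frame index it believes is the completion moment.
        target = t + round(offset)
        # Clamp votes that fall outside the sequence to its first or last frame.
        target = min(max(target, 0), n_frames - 1)
        votes[target] += 1

    # Pick the most-voted frame; break ties in favour of the earliest frame.
    return min(votes, key=lambda frame: (-votes[frame], frame))


# Frames 0..5 all point at frame 3; frame 6 votes for itself.
print(detect_completion_moment([3, 2, 1, 0, -1, -2, 0]))
```

Hard majority voting is used here only because it is the simplest way to accumulate frame-level contributions; a weighted or probabilistic accumulation would follow the same structure.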
-
Detecting the Moment of Completion: Temporal Models for Localising Action Completion
Authors:
Farnoosh Heidarivincheh,
Majid Mirmehdi,
Dima Damen
Abstract:
Action completion detection is the problem of modelling the action's progression towards localising the moment of completion, when the action's goal is confidently considered achieved. In this work, we assess the ability of two temporal models, namely Hidden Markov Models (HMM) and Long Short-Term Memory (LSTM), to localise completion for six object interactions: switch, plug, open, pull, pick and drink. We use a supervised approach, where annotations of pre-completion and post-completion frames are available per action, and fine-tuned CNN features are used to train temporal models. Tested on the Action-Completion-2016 dataset, we detect completion within 10 frames of annotations for ~75% of completed action sequences using both temporal models. Results show that fine-tuned CNN features outperform hand-crafted features for localisation, and that observing incomplete instances is necessary when incomplete sequences are also present in the test set.
Submitted 6 October, 2017;
originally announced October 2017.
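The abstract above reports detection "within 10 frames of annotations" for a fraction of sequences. A minimal sketch of how such a tolerance-based accuracy could be computed is given below. The function name `fraction_within_tolerance` and the example frame indices are made up for illustration and are not taken from the paper or its dataset.

```python
def fraction_within_tolerance(predicted, annotated, tolerance):
    """Fraction of sequences whose predicted completion frame lies within
    `tolerance` frames of the annotated completion frame.

    `predicted` and `annotated` are equal-length lists of frame indices,
    one entry per completed sequence.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        raise ValueError("need at least one sequence")

    hits = sum(
        1 for p, a in zip(predicted, annotated) if abs(p - a) <= tolerance
    )
    return hits / len(predicted)


# Illustrative values only: absolute errors are 2, 15, 5 and 1 frames,
# so three of the four sequences fall within a 10-frame tolerance.
predicted_frames = [12, 40, 95, 60]
annotated_frames = [10, 55, 90, 61]
print(fraction_within_tolerance(predicted_frames, annotated_frames, tolerance=10))
```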