-
EMG subspace alignment and visualization for cross-subject hand gesture classification
Authors:
Martin Colot,
Cédric Simar,
Mathieu Petieau,
Ana Maria Cebolla Alvarez,
Guy Cheron,
Gianluca Bontempi
Abstract:
Electromyogram (EMG)-based hand gesture recognition systems are a promising technology for human/machine interfaces. However, one of their main limitations is the long calibration time typically required to handle new users. The paper discusses and analyses the challenge of cross-subject generalization using an original dataset containing the EMG signals of 14 human subjects during hand gestures. The experimental results show that, although accurate generalization based on pooling multiple subjects is hardly achievable, it is possible to improve cross-subject estimation by identifying a robust low-dimensional subspace for multiple subjects and aligning it to a target subject. A visualization of the subspace provides insights for improving cross-subject generalization with EMG signals.
Submitted 18 December, 2023;
originally announced January 2024.
-
A flexible model for training action localization with varying levels of supervision
Authors:
Guilhem Chéron,
Jean-Baptiste Alayrac,
Ivan Laptev,
Cordelia Schmid
Abstract:
Spatio-temporal action detection in videos is typically addressed in a fully-supervised setup with manual annotation of training videos required at every frame. Since such annotation is extremely tedious and prohibits scalability, there is a clear need to minimize the amount of manual supervision. In this work, we propose a unifying framework that can handle and combine varying types of less-demanding weak supervision. Our model is based on discriminative clustering and integrates different types of supervision as constraints on the optimization. We investigate applications of such a model to training setups with alternative supervisory signals ranging from video-level class labels to the full per-frame annotation of action bounding boxes. Experiments on the challenging UCF101-24 and DALY datasets demonstrate competitive performance of our method with a fraction of the supervision used by previous methods. The flexibility of our model enables joint learning from data with different levels of annotation. Experimental results demonstrate a significant gain from adding a few fully supervised examples to otherwise weakly labeled videos.
Submitted 27 November, 2018; v1 submitted 29 June, 2018;
originally announced June 2018.
-
Modeling Spatio-Temporal Human Track Structure for Action Localization
Authors:
Guilhem Chéron,
Anton Osokin,
Ivan Laptev,
Cordelia Schmid
Abstract:
This paper addresses spatio-temporal localization of human actions in video. In order to localize actions in time, we propose a recurrent localization network (RecLNet) designed to model the temporal structure of actions on the level of person tracks. Our model is trained to simultaneously recognize and localize action classes in time and is based on two-layer gated recurrent units (GRU) applied separately to two streams, i.e. appearance and optical flow streams. When used together with state-of-the-art person detection and tracking, our model is shown to substantially improve spatio-temporal action localization in videos. The gain is shown to be mainly due to improved temporal localization. We evaluate our method on two recent datasets for spatio-temporal action localization, UCF101-24 and DALY, demonstrating a significant improvement over the state of the art.
Submitted 28 June, 2018;
originally announced June 2018.
-
P-CNN: Pose-based CNN Features for Action Recognition
Authors:
Guilhem Chéron,
Ivan Laptev,
Cordelia Schmid
Abstract:
This work targets human action recognition in video. While recent methods typically represent actions by statistics of local video features, here we argue for the importance of a representation derived from human pose. To this end we propose a new Pose-based Convolutional Neural Network descriptor (P-CNN) for action recognition. The descriptor aggregates motion and appearance information along tracks of human body parts. We investigate different schemes of temporal aggregation and experiment with P-CNN features obtained both for automatically estimated and manually annotated human poses. We evaluate our method on the recent and challenging JHMDB and MPII Cooking datasets. For both datasets our method shows consistent improvement over the state of the art.
Submitted 23 September, 2015; v1 submitted 11 June, 2015;
originally announced June 2015.