DeepActsNet: Spatial and Motion features from Face, Hands, and Body Combined with Convolutional and Graph Networks for Improved Action Recognition

Asif, Umar; Mehta, Deval; von Cavallar, Stefan; Tang, Jianbin; Harrer, Stefan

arXiv:2009.09818v1 (cs)

[Submitted on 21 Sep 2020 (this version), latest version 4 Jun 2021 (v3)]

Abstract: Existing action recognition methods mainly focus on joint and bone information in human body skeleton data, because such data is robust to complex backgrounds and to dynamic changes in the environment. In this paper, we combine body skeleton data with spatial and motion information from the face and both hands, and present Deep Action Stamps (DeepActs), a novel data representation that encodes actions from video sequences. We also present DeepActsNet, a deep-learning-based model with modality-specific convolutional and graph sub-networks for highly accurate action recognition based on Deep Action Stamps. Experiments on three challenging action recognition datasets (NTU60, NTU120, and SYSU) show that DeepActs considerably improve the recognition performance of standard convolutional and graph networks. Experiments also show that fusing the modality-specific convolutional and structural features learnt by DeepActsNet yields consistent improvements in action recognition accuracy over the state of the art on the target datasets.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2009.09818 [cs.CV]
	(or arXiv:2009.09818v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2009.09818

Submission history

From: Umar Asif
[v1] Mon, 21 Sep 2020 12:41:56 UTC (2,128 KB)
[v2] Wed, 31 Mar 2021 22:52:13 UTC (6,932 KB)
[v3] Fri, 4 Jun 2021 04:09:54 UTC (15,383 KB)
