Skeleton-Based Action Segmentation with Multi-Stage Spatial-Temporal Graph Convolutional Neural Networks
Authors:
Benjamin Filtjens,
Bart Vanrumste,
Peter Slaets
Abstract:
The ability to identify and temporally segment fine-grained actions in motion capture sequences is crucial for applications in human movement analysis. Motion capture is typically performed with optical or inertial measurement systems, which encode human movement as a time series of human joint locations and orientations or their higher-order representations. State-of-the-art action segmentation approaches use multiple stages of temporal convolutions. The main idea is to generate an initial prediction with several layers of temporal convolutions and refine these predictions over multiple stages, also with temporal convolutions. Although these approaches capture long-term temporal patterns, the initial predictions do not adequately consider the spatial hierarchy among the human joints. To address this limitation, we recently introduced multi-stage spatial-temporal graph convolutional neural networks (MS-GCN). Our framework replaces the initial stage of temporal convolutions with spatial graph convolutions and dilated temporal convolutions, which better exploit the spatial configuration of the joints and their long-term temporal dynamics. Our framework was compared to four strong baselines on five tasks. Experimental results demonstrate that our framework is a strong baseline for skeleton-based action segmentation.
Submitted 9 October, 2022; v1 submitted 3 February, 2022;
originally announced February 2022.
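The two building blocks named in the abstract above, a spatial graph convolution over the joints and a dilated temporal convolution over the frames, can be illustrated with a small sketch. This is not the authors' implementation: it assumes one scalar feature per joint, a row-normalised adjacency matrix with self-loops, and fixed rather than learned weights. The function names (`normalize_adjacency`, `spatial_aggregate`, `dilated_temporal_conv`) are hypothetical and chosen only for illustration.

```python
def normalize_adjacency(edges, num_joints):
    """Build a row-normalised adjacency matrix with self-loops.

    Assumption: row normalisation is used here for simplicity; the
    paper may use a different normalisation or partitioning scheme.
    """
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)]
           for i in range(num_joints)]
    for i, j in edges:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def spatial_aggregate(frame, adj):
    """Average each joint's feature with those of its skeleton neighbours."""
    return [sum(a * x for a, x in zip(row, frame)) for row in adj]


def dilated_temporal_conv(seq, weights, dilation):
    """Centred 1D convolution over time with zero padding.

    Kernel taps are spaced `dilation` frames apart, so the receptive
    field grows without adding parameters.
    """
    half = len(weights) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t + (i - half) * dilation
            if 0 <= idx < len(seq):
                acc += w * seq[idx]
        out.append(acc)
    return out


# Toy skeleton (hypothetical): three joints connected in a chain 0-1-2.
adj = normalize_adjacency([(0, 1), (1, 2)], num_joints=3)
smoothed = spatial_aggregate([0.0, 3.0, 6.0], adj)

# One joint's feature over five frames, kernel size 3, dilation 2.
temporal = dilated_temporal_conv([1.0, 2.0, 3.0, 4.0, 5.0],
                                 weights=[1.0, 1.0, 1.0], dilation=2)
```

In a real network each aggregation would be followed by a learned linear map and a non-linearity, and the dilation would typically increase with depth to cover long-term dynamics.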
Automated freezing of gait assessment with marker-based motion capture and multi-stage spatial-temporal graph convolutional neural networks
Authors:
Benjamin Filtjens,
Pieter Ginis,
Alice Nieuwboer,
Peter Slaets,
Bart Vanrumste
Abstract:
Freezing of gait (FOG) is a common and debilitating gait impairment in Parkinson's disease. Further insight into this phenomenon is hampered by the difficulty of objectively assessing FOG. To meet this clinical need, this paper proposes an automated motion-capture-based FOG assessment method driven by a novel deep neural network. Automated FOG assessment can be formulated as an action segmentation problem, where temporal models are tasked to recognize and temporally localize the FOG segments in untrimmed motion capture trials. This paper takes a closer look at the performance of state-of-the-art action segmentation models when tasked to automatically assess FOG. Furthermore, a novel deep neural network architecture is proposed that aims to capture the spatial and temporal dependencies better than the state-of-the-art baselines. The proposed network, termed multi-stage spatial-temporal graph convolutional network (MS-GCN), combines the spatial-temporal graph convolutional network (ST-GCN) and the multi-stage temporal convolutional network (MS-TCN). The ST-GCN captures the hierarchical spatial-temporal motion among the joints inherent to motion capture, while the multi-stage component reduces over-segmentation errors by refining the predictions over multiple stages. The experiments indicate that the proposed model outperforms four state-of-the-art baselines. Moreover, FOG outcomes derived from MS-GCN predictions had an excellent (r=0.93 [0.87, 0.97]) and moderately strong (r=0.75 [0.55, 0.87]) linear relationship with FOG outcomes derived from manual annotations. The proposed MS-GCN may provide an automated and objective alternative to labor-intensive clinician-based FOG assessment. Future work can now assess the generalization of MS-GCN to a larger and more varied verification cohort.
Submitted 3 February, 2022; v1 submitted 29 March, 2021;
originally announced March 2021.
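The abstract above treats FOG assessment as action segmentation and notes that over-segmentation errors affect the result. A minimal sketch of how frame-wise predictions might be turned into segments and summary outcomes follows. The two outcomes computed here (number of FOG episodes and percentage of time frozen) are an assumption, since the abstract does not name the outcomes it correlates. The label encoding (1 = FOG, 0 = no FOG) and the helper names are likewise hypothetical.

```python
from itertools import groupby


def to_segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs, end exclusive."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


def fog_outcomes(labels, fog_label=1):
    """Summarise a trial. Both outcome definitions are assumptions."""
    fog_segments = [s for s in to_segments(labels) if s[0] == fog_label]
    frozen_frames = sum(end - start for _, start, end in fog_segments)
    percent = 100.0 * frozen_frames / len(labels) if labels else 0.0
    return {"num_episodes": len(fog_segments), "percent_time_frozen": percent}


# Hypothetical ten-frame trial: one annotated FOG episode.
manual = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
# An over-segmented prediction splits that episode in two.
predicted = [0, 0, 1, 0, 1, 1, 0, 0, 0, 0]

manual_outcomes = fog_outcomes(manual)
predicted_outcomes = fog_outcomes(predicted)
```

A single misclassified frame changes the episode count from one to two while the frame-wise accuracy stays at 90%, which is why refinement stages that suppress such fragments matter for episode-level outcomes.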