Abstract
Action segmentation is an important approach to understanding actions in video. Most conventional action recognition methods can recognize only a single action in a given input video, and therefore require a pre-trimmed clip containing exactly one type of action. In contrast, temporal action segmentation (TAS) aims to partition a temporally untrimmed video sequence along the time axis, which gives it broader application prospects across various fields. Previously proposed TAS methods use only RGB video as input to segment actions, but RGB video is not robust to diverse backgrounds. Skeleton-based features are more resilient because they do not incorporate any background information, yet there has been limited research exploring this modality. To this end, we propose a motion-aware and temporal-enhanced spatial–temporal graph convolutional network for skeleton-based human action segmentation. Our framework contains a motion-aware module, a multi-scale temporal convolutional network, a temporal-enhanced graph convolutional network module, and a refinement module. Our method efficiently captures motion information and long-range dependencies from skeleton features while improving temporal modeling. We conduct experiments on four publicly available datasets to demonstrate the effectiveness of the proposed method. The code is available at https://github.com/11yxk/openpack.
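As a rough illustration of two ideas named in the abstract, the sketch below shows how a motion-aware feature (frame-to-frame skeleton differences) might be fused with a spatial graph convolution over the joint graph. This is a minimal, hypothetical PyTorch example, not the authors' implementation; the module name, the fixed adjacency `A`, and all hyperparameters are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MotionAwareGCNBlock(nn.Module):
    """Hypothetical block (not from the paper's repo): fuses a position
    stream with a motion stream (first-order temporal differences), then
    applies one spatial graph convolution over the skeleton joints."""

    def __init__(self, in_channels: int, out_channels: int, A: torch.Tensor):
        super().__init__()
        # Fixed joint adjacency (V x V), e.g. derived from the skeleton topology.
        self.register_buffer("A", A)
        # Separate 1x1 projections for the position and motion streams.
        self.proj_pos = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.proj_motion = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, T, V) = (batch, channels, frames, joints)
        # Motion-aware feature: frame-to-frame difference of joint features,
        # zero-padded at the first frame to keep the sequence length T.
        motion = x[:, :, 1:] - x[:, :, :-1]
        motion = torch.nn.functional.pad(motion, (0, 0, 1, 0))
        # Fuse the two streams, then aggregate each joint's neighbors
        # through the adjacency matrix (einsum over the joint dimension).
        h = self.proj_pos(x) + self.proj_motion(motion)
        h = torch.einsum("nctv,vw->nctw", h, self.A)
        return self.relu(h)

# Usage: a 17-joint skeleton, 64 frames, 3-channel (x, y, confidence) input.
V = 17
A = torch.eye(V)  # placeholder adjacency; a real one encodes bone links
block = MotionAwareGCNBlock(in_channels=3, out_channels=64, A=A)
out = block(torch.randn(2, 3, 64, V))
print(out.shape)  # torch.Size([2, 64, 64, 17])
```

In a full TAS model of this kind, blocks like this would typically be stacked with multi-scale temporal convolutions and a refinement stage to produce per-frame action labels.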
| Original language | English |
| --- | --- |
| Article number | 127482 |
| Journal | Neurocomputing |
| Volume | 580 |
| DOI | |
| Publication status | Published - 01-05-2024 |
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Cognitive Neuroscience
- Artificial Intelligence