Abstract
Human action recognition from skeleton motion sequences is widely applied in fields such as virtual reality, human-computer interaction, and kinematic rehabilitation. With graph neural networks now widely used to extract spatial features from the skeleton's anatomical connectivity, extending a single graph model over both space and time may further improve network performance. In this paper, we propose the priori separation graph convolution (PS-GCN) network, composed of a priori mixed GCN, which introduces a hypergraph representation of the skeleton's spatial features, and a dynamic adaptive GCN, which learns a distinct graph for each sample at each layer of the network. For temporal feature analysis, a global attention unit is added to capture long-term relationships, and a feature fusion structure is applied to short-term temporal features at the network input. The proposed model is evaluated on the NTU-RGB+D, NTU-RGB+D 120, and NW-UCLA datasets through a comprehensive ablation study. The results show that our model is comparable in accuracy to state-of-the-art methods.
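To make the adaptive-graph idea in the abstract concrete, the following is a minimal sketch of one spatial graph-convolution step that combines a fixed (a priori) skeleton adjacency with a per-sample, data-dependent adjacency. All function names, shapes, and the mixing scheme here are illustrative assumptions for exposition, not the paper's actual PS-GCN implementation.

```python
import numpy as np

def normalize_adj(a):
    """Symmetrically normalize an adjacency matrix with self-loops."""
    a = a + np.eye(a.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a.sum(axis=1)))
    return d_inv_sqrt @ a @ d_inv_sqrt

def adaptive_gcn_layer(x, a_prior, w, theta):
    """x: (V, C) joint features; a_prior: (V, V) skeleton adjacency;
    w: (C, C_out) feature weights; theta: (C, d) embedding weights
    for the data-dependent graph. Returns (V, C_out)."""
    # Data-dependent graph: similarity of embedded joint features,
    # row-normalized with a softmax so each joint's weights sum to 1.
    e = x @ theta                       # (V, d) joint embeddings
    s = e @ e.T                         # (V, V) pairwise similarity
    a_adapt = np.exp(s) / np.exp(s).sum(axis=1, keepdims=True)
    # Mix the fixed and adaptive graphs, then aggregate and transform.
    a = normalize_adj(a_prior) + a_adapt
    return a @ x @ w

# Toy example: 5 joints connected in a chain, 3-channel features.
rng = np.random.default_rng(0)
a_prior = np.zeros((5, 5))
for i in range(4):
    a_prior[i, i + 1] = a_prior[i + 1, i] = 1.0
x = rng.standard_normal((5, 3))
out = adaptive_gcn_layer(x, a_prior,
                         rng.standard_normal((3, 8)),
                         rng.standard_normal((3, 4)))
print(out.shape)  # (5, 8)
```

Because `a_adapt` is recomputed from the input features, each sample (and each layer, with its own `theta`) effectively uses its own graph, which is the intuition behind the dynamic adaptive branch described above.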
Availability of data and materials
The datasets are available at https://rose1.ntu.edu.sg/dataset/actionRecognition/ and https://wangjiangb.github.io/my_data.html.
Code availability
All relevant code will be released upon acceptance of this paper.
Acknowledgements
This research is funded by the National Key R&D Program of China (grant number 2023YFB2603900), the Major Discipline Academic and Technical Leaders Training Program of Jiangxi Province (grant number 20225BCJ22012), the National Natural Science Foundation of China (grant number 61801180), and the Natural Science Foundation of Jiangxi Province (grant number 20202BAB202003).
Author information
Authors and Affiliations
Contributions
The authors confirm their contributions to the paper as follows: study conception and design: Tuo Zang, Lingfeng Liu; methodology and data collection: Tuo Zang, Jianfeng Tu; analysis and interpretation of results: Tuo Zang, Jianfeng Tu, Mengran Duan, Lingfeng Liu; draft manuscript preparation: Tuo Zang, Zhipeng Chen, Hao Cheng; manuscript revision: Tuo Zang, Jianfeng Tu, Hanrui Jiang, Jiahui Zhao. All authors reviewed the results and approved the final version of the manuscript.
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: Model hyperparameters
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zang, T., Tu, J., Duan, M. et al. Priori separation graph convolution with long-short term temporal modeling for skeleton-based action recognition. Appl Intell 54, 7621–7635 (2024). https://doi.org/10.1007/s10489-024-05544-5