
Priori separation graph convolution with long-short term temporal modeling for skeleton-based action recognition

Published in Applied Intelligence.

Abstract

Human action recognition from skeleton motion sequences is widely applied in fields such as virtual reality, human-computer interaction, and kinematic rehabilitation. With graph neural networks now widely used to extract spatial features from the anatomical connectivity of the skeleton, extending single-graph models over space and time may further improve network performance. In this paper, we propose a priori separation graph convolution network (PS-GCN) composed of a priori mixed GCN, which introduces a hypergraph representation of skeletal spatial features, and a dynamic adaptive GCN, which learns a separate graph for each sample at each layer of the spatial network. For temporal feature analysis, a global attention unit is added to capture long-term relationships, and a feature fusion structure at the network input handles short-term temporal features. The proposed model is evaluated on the NTU-RGB+D, NTU-RGB+D 120, and NW-UCLA datasets through a comprehensive ablation study. The results show that our model is comparable in accuracy to state-of-the-art methods.
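The spatial pipeline described above builds on graph convolution over the skeleton's joint-adjacency graph. As a rough illustration of that underlying operation only (not the authors' PS-GCN, its hypergraph representation, or its adaptive components; the joint count, feature dimensions, and ReLU nonlinearity here are illustrative assumptions), a single spatial graph-convolution step over joint features can be sketched as:

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize an adjacency matrix with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(X, A, W):
    """One graph-convolution step: aggregate each joint's neighbors
    along the skeleton graph, then apply a learned projection.
    X: (num_joints, in_dim) joint features; W: (in_dim, out_dim)."""
    return np.maximum(normalize_adjacency(A) @ X @ W, 0.0)  # ReLU

# Toy 3-joint chain (e.g. shoulder-elbow-wrist) with 3-D coordinates.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.random.randn(3, 3)        # per-joint (x, y, z) coordinates
W = np.random.randn(3, 8) * 0.1  # projection to 8 feature channels
H = gcn_layer(X, A, W)
print(H.shape)  # (3, 8)
```

Skeleton-based models typically stack such spatial layers with temporal convolutions over the frame axis; the paper's contribution lies in how the graphs themselves are constructed and separated, which this sketch does not attempt to reproduce.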


Availability of data and materials

The datasets are available at https://rose1.ntu.edu.sg/dataset/actionRecognition/ and https://wangjiangb.github.io/my_data.html.

Code availability

All relevant code will be released upon acceptance of this paper.


Acknowledgements

This research was funded by the National Key R&D Program of China (grant number 2023YFB2603900), the Major Discipline Academic and Technical Leaders Training Program of Jiangxi Province (grant number 20225BCJ22012), the National Natural Science Foundation of China (grant number 61801180), and the Natural Science Foundation of Jiangxi Province (grant number 20202BAB202003).

Author information


Contributions

The authors confirm their contributions to the paper as follows: study conception and design: Tuo Zang, Lingfeng Liu; methodology and data collection: Tuo Zang, Jianfeng Tu; analysis and interpretation of results: Tuo Zang, Jianfeng Tu, Mengran Duan, Lingfeng Liu; draft manuscript preparation: Tuo Zang, Zhipeng Chen, Hao Cheng; manuscript revision: Tuo Zang, Jianfeng Tu, Hanrui Jiang, Jiahui Zhao. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Lingfeng Liu.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Model hyperparameters

Table 11 Hyperparameters of the model

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zang, T., Tu, J., Duan, M. et al. Priori separation graph convolution with long-short term temporal modeling for skeleton-based action recognition. Appl Intell 54, 7621–7635 (2024). https://doi.org/10.1007/s10489-024-05544-5
