DOI:10.1145/3549737.3549786
research-article

Early Fusion of Visual Representations of Skeletal Data for Human Activity Recognition

Published: 09 September 2022

Abstract

In this work we present an approach to human activity recognition based on skeletal motion, i.e., the motion of skeletal joints in 3D space. More specifically, we propose applying four well-known image transformations (DFT, FFT, DCT, DST) to images created from the skeletal motion. In this way we create “activity” images, which are used to train four deep convolutional neural networks. These networks are then used for feature extraction. The extracted features are fused, scaled and, after a dimensionality reduction step, given as input to a support vector machine for classification. We evaluate our approach on two well-known, publicly available, challenging datasets and demonstrate the superiority of the fusion approach.
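
To make the pipeline in the abstract more concrete, the following is a minimal sketch of two of its steps: applying a transform (here a DCT-II) to a small image, and early fusion of several feature vectors followed by scaling. It is an illustration under stated assumptions, not the authors' implementation. The function names (`dct2_1d`, `dct2_2d`, `early_fusion`, `standardize`) are hypothetical, the way the paper builds its images from joint coordinates is not specified in the abstract, and the CNN feature extraction, dimensionality reduction and SVM steps are left out.

```python
import math


def dct2_1d(signal):
    """Unnormalized DCT-II of a 1D sequence (one common DCT variant)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct2_2d(image):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct2_1d(row) for row in image]
    cols = [dct2_1d(list(col)) for col in zip(*rows)]
    # cols holds transformed columns; transpose back to row-major order.
    return [list(r) for r in zip(*cols)]


def early_fusion(feature_vectors):
    """Early (feature-level) fusion: concatenate vectors into a single one."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


def standardize(samples):
    """Scale each feature column to zero mean and unit variance.

    Constant columns are given a divisor of 1.0 to avoid division by zero.
    """
    n = len(samples)
    columns = list(zip(*samples))
    means = [sum(c) / n for c in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in samples]


if __name__ == "__main__":
    # Toy 2x2 "image"; real inputs would be built from joint trajectories.
    toy_image = [[0.1, 0.4], [0.3, 0.2]]
    transformed = dct2_2d(toy_image)

    # Made-up per-network feature vectors standing in for CNN outputs.
    sample_a = early_fusion([[0.2, 0.7], [1.5], [0.0, 0.3]])
    sample_b = early_fusion([[0.4, 0.1], [0.5], [0.9, 0.3]])
    scaled = standardize([sample_a, sample_b])
    print(transformed, scaled)
```

In this sketch the fused vector is simply the concatenation of the per-network vectors, so its length is the sum of their lengths; scaling is applied afterwards so that no single network's features dominate the later reduction and classification steps.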

Published In

SETN '22: Proceedings of the 12th Hellenic Conference on Artificial Intelligence
September 2022
450 pages
ISBN:9781450395977
DOI:10.1145/3549737
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. convolutional neural networks
  2. deep learning
  3. early fusion
  4. human activity recognition

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SETN 2022
