Abstract
Yoga has become part of daily life for people around the world, helping them unite body and mind and maintain a healthy lifestyle. Practicing with a trainer is recommended for maximum benefit, but a fast-paced lifestyle makes it difficult to find time for guided lessons, and many middle-class families cannot afford a trainer. A system is therefore needed that is accessible to everyone and helps users perform yoga poses and improve them. This paper introduces a novel framework for estimating yoga poses through computer vision, using a 3-D top-down semantic key-landmark estimator together with a Recurrent Neural Network (RNN) for classification. For training and validation, we built a custom dataset of 10 yoga poses comprising 300 sequences in total. On this dataset, the model achieved an average accuracy of 92.34% using a Long Short-Term Memory (LSTM) classifier.
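The pipeline described in the abstract (per-frame 3-D landmarks fed as a sequence into an LSTM classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the landmark count (33, as in BlazePose-style estimators), the sequence length of 30 frames, the hidden size of 64, and all function names are assumptions made for illustration. The weights are random and untrained, and random numbers stand in for real landmark coordinates, so the predicted class is meaningless. Only the number of classes (10) comes from the abstract.

```python
import numpy as np

NUM_LANDMARKS = 33   # assumption: BlazePose-style landmark count
COORDS = 3           # x, y, z per landmark (3-D estimator)
NUM_CLASSES = 10     # from the abstract: 10 yoga poses
SEQ_LEN = 30         # assumption: frames per sequence
HIDDEN = 64          # assumption: LSTM hidden size


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_dim, hidden, num_classes, seed=0):
    """Create small random (untrained) weights for one LSTM layer plus a linear head."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W": rng.normal(0, scale, (4 * hidden, input_dim)),  # input-to-gate weights
        "U": rng.normal(0, scale, (4 * hidden, hidden)),     # hidden-to-gate weights
        "b": np.zeros(4 * hidden),
        "W_out": rng.normal(0, scale, (num_classes, hidden)),
        "b_out": np.zeros(num_classes),
    }


def lstm_classify(sequence, params):
    """Run an LSTM over a (frames, features) array and return class probabilities."""
    hidden = params["U"].shape[1]
    h = np.zeros(hidden)  # hidden state
    c = np.zeros(hidden)  # cell state
    for x_t in sequence:
        gates = params["W"] @ x_t + params["U"] @ h + params["b"]
        i = sigmoid(gates[0:hidden])               # input gate
        f = sigmoid(gates[hidden:2 * hidden])      # forget gate
        o = sigmoid(gates[2 * hidden:3 * hidden])  # output gate
        g = np.tanh(gates[3 * hidden:])            # candidate cell update
        c = f * c + i * g
        h = o * np.tanh(c)
    # Classify from the final hidden state with a softmax over the pose classes.
    logits = params["W_out"] @ h + params["b_out"]
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


# Stand-in input: random values in place of real landmark coordinates.
rng = np.random.default_rng(1)
sequence = rng.uniform(0, 1, (SEQ_LEN, NUM_LANDMARKS * COORDS))
params = init_params(NUM_LANDMARKS * COORDS, HIDDEN, NUM_CLASSES)
probs = lstm_classify(sequence, params)
predicted_pose = int(np.argmax(probs))
```

In practice the weights would be learned from labeled sequences, and each row of `sequence` would hold the flattened landmark coordinates produced by the pose estimator for one video frame.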
Data availability
The data that support the findings of this study are available from the corresponding author, Dr. Lokendra Singh Umrao, upon reasonable request.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest, financial or otherwise.
About this article
Cite this article
Srivastava, R.P., Umrao, L.S. & Yadav, R.S. Real-time yoga pose classification with 3-D pose estimation model with LSTM. Multimed Tools Appl 83, 33019–33030 (2024). https://doi.org/10.1007/s11042-023-17036-8
DOI: https://doi.org/10.1007/s11042-023-17036-8