Abstract
Few approaches use Kinect for upper body motion recognition. Most of them were evaluated without joint occlusion, and the few that handled joint occlusion recognized only a small number of motions at low recognition rates. This paper uses OptiTrack and its supporting software, together with Kinect v2, to capture motion data and convert it into human skeleton coordinates, and selects the vectors between key joint points and the joint angles as feature values. The AP (affinity propagation) clustering algorithm is used to obtain the key frames of the motions, which are labeled. The feature values are then relatively normalized, and random forest regression is used for two purposes: (1) learn a joint-offset regression function from the offsets between the joints in frames detected with Kinect v2 and those detected with OptiTrack, and correct the skeleton using the predicted joint offsets; (2) determine the motion from the predicted posture. The method recognizes 8 types of upper body motions with an average accuracy of 90.86%.
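The feature step described above (vectors between key joints, joint angles, and relative normalization) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the choice of joints, the reference length used for scaling, the function names, and the sample coordinates are all assumptions made for the example.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def joint_angle(parent, joint, child):
    """Angle in degrees at `joint` between the bones to `parent` and `child`."""
    u, w = sub(parent, joint), sub(child, joint)
    cos_t = sum(a * b for a, b in zip(u, w)) / (norm(u) * norm(w))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def arm_features(shoulder, elbow, wrist, ref_length):
    """Bone vectors scaled by a body-relative length, plus the elbow angle.

    Dividing by `ref_length` (assumed here to be a per-subject body
    measurement) is one possible form of relative normalization.
    """
    upper = sub(elbow, shoulder)  # upper-arm vector
    fore = sub(wrist, elbow)      # forearm vector
    scaled = [c / ref_length for c in upper + fore]
    return scaled + [joint_angle(shoulder, elbow, wrist)]


# Hypothetical joint positions in metres (not data from the paper).
shoulder = (0.0, 0.0, 2.0)
elbow = (0.0, -0.3, 2.0)
wrist = (0.3, -0.3, 2.0)
features = arm_features(shoulder, elbow, wrist, ref_length=0.3)
```

Scaling by a body-relative length rather than using raw coordinates makes the feature vector less sensitive to subject size and distance from the sensor, which is presumably the motivation for normalizing before regression.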
Acknowledgments
We thank the National and Local Joint Engineering Research Centre of Special Film Technology and Equipment at Changchun University of Science and Technology for funding and equipment support. This work was also supported by the Science and Technology Key Breakthrough Project of Jilin Province, China (Project No. 20170203004GX).
Cite this article
Li, B., Bai, B. & Han, C. Upper body motion recognition based on key frame and random forest regression. Multimed Tools Appl 79, 5197–5212 (2020). https://doi.org/10.1007/s11042-018-6357-y
DOI: https://doi.org/10.1007/s11042-018-6357-y