Abstract
Manipulative action recognition is one of the most important and challenging topics in the field of image processing. In this paper, three kinds of sensor modules are used to capture motion, force and object information during manipulative actions. Two fusion methods are proposed, and the recognition accuracy is further improved by using the object as context. In the feature-level fusion method, significant features are selected first, and Hidden Markov Models (HMMs) are then built on these selected features to characterize the temporal sequences. In the decision-level fusion method, an HMM is built for each feature group, and the individual decisions are then fused. On top of these two fusion methods, the object/action context is modeled with a Bayesian network. Assembly tasks are used to evaluate the algorithms. The experimental results show that the proposed approach is effective for manipulative action recognition: the recognition accuracies of the decision-level fusion, feature-level fusion and Bayesian context models are 72%, 80% and 90%, respectively.
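For illustration, the sketch below shows one way the decision-level fusion and object/action context described above could be realized: per-action, per-modality Gaussian HMMs are trained, their log-likelihoods are summed at test time, and an object-conditioned action prior is added in log space. This is a minimal sketch built on the hmmlearn library, with hypothetical action labels, feature shapes and state counts; it is not the authors' implementation.

```python
# Minimal sketch of decision-level HMM fusion with an object/action context prior.
# Hypothetical action labels, data shapes, and state counts; the paper's actual
# features, HMM topologies, and Bayesian network structure are not reproduced here.
import numpy as np
from hmmlearn import hmm

ACTIONS = ["pick", "place", "screw"]        # hypothetical action labels
MODALITIES = ["motion", "force", "object"]  # the three sensor groups used in the paper

def train_models(train_data, n_states=4):
    """train_data[action][modality] -> list of (T_i, D) feature sequences."""
    models = {}
    for a in ACTIONS:
        for m in MODALITIES:
            seqs = train_data[a][m]
            X = np.concatenate(seqs)              # stack all training sequences
            lengths = [len(s) for s in seqs]      # per-sequence lengths for hmmlearn
            model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag")
            model.fit(X, lengths)
            models[(a, m)] = model
    return models

def classify(models, test_seqs, context_prior=None):
    """test_seqs[modality] -> (T, D) sequence; context_prior -> P(action | object)."""
    scores = {}
    for a in ACTIONS:
        # Decision-level fusion: sum the per-modality log-likelihoods.
        ll = sum(models[(a, m)].score(test_seqs[m]) for m in MODALITIES)
        if context_prior is not None:             # object/action context as a Bayesian prior
            ll += np.log(context_prior.get(a, 1e-6))
        scores[a] = ll
    return max(scores, key=scores.get)
```

A feature-level variant would instead concatenate the selected features from all modalities into a single observation vector and train one HMM per action on that joint sequence.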
Acknowledgements
This project is supported by the National Natural Science Foundation of China (Nos. 61906123, U1713216 and 61976070), the Fundamental Research Funds for Shenzhen Technology University, the Shenzhen Overseas High Level Talent (Peacock Plan) Program (No. KQTD20140630154026047), the Shenzhen Basic Research Projects (JCYJ20160429161539298), and the Scientific Research Platforms and Projects in Universities in Guangdong Province under Grant 2019KTSCX204.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Gu, Y., Liu, M., Sheng, W. et al. Sensor fusion based manipulative action recognition. Auton Robot 45, 1–13 (2021). https://doi.org/10.1007/s10514-020-09943-8