Abstract
We propose a real-time, vision-based teleoperation approach for robotic arms that relies on a single depth camera, freeing the user from any wearable device. Through a natural user interface, the approach replaces conventional fine-tuning control with a direct body-pose capture process. It comprises two main parts. The first is a nonlinear, customizable pose mapping based on Thin-Plate Splines (TPS) that transfers human body motion directly to robotic arm motion, allowing bodies with dissimilar workspace shapes and kinematic constraints to be matched. The second is a deep neural network hand-state classifier based on Long-term Recurrent Convolutional Networks (LRCNs) that exploits the temporal coherence of the acquired depth data. We validate, evaluate, and compare our approach through classical cross-validation experiments on the proposed hand-state classifier, and through user studies covering practical variants of pick-and-place and manufacturing tasks. Results show that LRCNs outperform single-image Convolutional Neural Networks, and that users' learning curves were steep, allowing them to complete the proposed tasks successfully. Compared to a previous approach, the TPS mapping introduced no additional task complexity and yielded similar completion times, while providing more precise operation near workspace boundaries.
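To make the two components concrete, the sketches below illustrate the underlying ideas in Python. They are minimal illustrations rather than the implementation used in this work: the anchor correspondences, clip length, crop size, class count, and layer sizes are all assumed for the example, and off-the-shelf SciPy and Keras routines stand in for the actual pipeline.

A TPS mapping is fully determined by a set of corresponding control points, here a few hypothetical calibration poses pairing human wrist positions with desired robot end-effector positions. SciPy's RBFInterpolator with the thin-plate-spline kernel then yields a smooth nonlinear warp between the two workspaces:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Assumed calibration correspondences: human wrist positions captured by
# the depth camera (in metres) paired with robot end-effector targets.
human_anchors = np.array([
    [0.00, 0.00, 0.40],   # arm at rest
    [0.30, 0.00, 0.40],   # reach right
    [-0.30, 0.00, 0.40],  # reach left
    [0.00, 0.35, 0.40],   # reach up
    [0.00, 0.00, 0.70],   # reach forward
])
robot_anchors = np.array([
    [0.20, 0.00, 0.15],
    [0.35, -0.10, 0.15],
    [0.35, 0.10, 0.15],
    [0.20, 0.00, 0.35],
    [0.45, 0.00, 0.15],
])

# Smooth nonlinear R^3 -> R^3 warp that interpolates the anchor pairs.
tps = RBFInterpolator(human_anchors, robot_anchors,
                      kernel='thin_plate_spline')

# Map a live wrist sample to an end-effector goal for the motion controller.
wrist = np.array([[0.15, 0.10, 0.50]])
print(tps(wrist))
```

Because the thin-plate spline minimizes bending energy while interpolating the anchors exactly, adding anchors near a workspace boundary reshapes the map locally, which is what allows dissimilar workspaces to be matched and boundary regions to be operated precisely.

The LRCN classifier follows the general pattern of a per-frame CNN feature extractor whose outputs are integrated over time by a recurrent layer before the hand-state decision. A minimal Keras sketch, assuming 8-frame depth clips, 64x64 crops, and two hand states (open and closed):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, H, W = 8, 64, 64  # assumed clip length and depth-crop size

# Per-frame feature extractor (illustrative layer sizes).
cnn = models.Sequential([
    layers.Input((H, W, 1)),
    layers.Conv2D(16, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
])

# LRCN: apply the CNN to every frame, then let an LSTM exploit the
# temporal coherence of the depth clip before classifying the hand state.
model = models.Sequential([
    layers.Input((SEQ_LEN, H, W, 1)),
    layers.TimeDistributed(cnn),
    layers.LSTM(64),
    layers.Dense(2, activation='softmax'),  # assumed classes: open/closed
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```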
Availability of data and material
Our collected depth-image dataset is publicly available through a link provided in Section 7.1.
Code Availability
The code used in this article is stored on GitHub and will be made publicly available.
Acknowledgements
The authors would like to thank: (1) FAPEAL/CAPES for funding and supporting this research through grant 05/2018; and (2) the Instituto de Computação (IC/UFAL) for providing the infrastructure necessary for the development of this project.
Funding
This research was supported by FAPEAL/CAPES grant 05/2018.
Contributions
Bruno Lima: Methodology, Software, Investigation, Writing - original draft preparation, Writing - review and editing; Lucas Amaral: Methodology, Software, Investigation, Writing - review and editing; Givanildo Nascimento-Jr: Methodology, Software, Investigation, Writing - original draft preparation, Writing - review and editing; Victor Mafra: Methodology, Investigation, Writing - original draft preparation, Writing - review and editing; Bruno Georgevich Ferreira: Methodology, Investigation, Writing - review and editing; Tiago Vieira: Conceptualization, Investigation, Writing - review and editing, Resources, Supervision; Thales Vieira: Conceptualization, Investigation, Writing - review and editing, Resources, Supervision.
Ethics declarations
Ethics approval
Ethics approval was not required, since participants were not explicitly identified and were not asked to perform dangerous, harmful, or invasive tasks.
Consent to participate
Informed consent was obtained from all individual participants included in the study for research purposes.
Consent for Publication
Informed consent was obtained from all individual participants regarding the publication of their data and photographs for research purposes.
Conflict of Interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Lima, B., Amaral, L., Nascimento-Jr, G. et al. User-oriented Natural Human-Robot Control with Thin-Plate Splines and LRCN. J Intell Robot Syst 104, 50 (2022). https://doi.org/10.1007/s10846-021-01560-6