Abstract
Hand gesture recognition is one of the most natural and intuitive ways for people to communicate with machines, since it closely mimics how humans interact with each other. This paper presents a novel method for real-time markerless hand gesture recognition from depth images. The proposed method comprises a collection of techniques that enable the detection, segmentation and recognition of hand gestures. A hand detection and localization method is employed using the depth information acquired from a depth sensor. The hand is then robustly segmented from a cluttered background without the use of any markers. A convex shape decomposition method based on the Radius Morse function is proposed for decomposing the hand shape in real time. The palm, fingertips and hand skeleton are recognized based on the hand shape decomposition and hand features. Moreover, we present a method for the recognition of two-hand gestures. Representative experimental results demonstrate qualitatively and quantitatively that accurate hand gesture recognition can be achieved for real-time applications.
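The abstract gives no implementation details, so the following is only a minimal sketch of the first two stages it names (depth-based hand localization and segmentation), followed by a crude fingertip candidate. It assumes the hand is the object nearest to the sensor, a common simplification that the abstract does not confirm. The function names, the 150 mm depth band and the toy frame are illustrative assumptions, and the farthest-point step is a simple stand-in, not the paper's Radius Morse function decomposition.

```python
from typing import List, Optional, Tuple

Depth = List[List[int]]   # depth in millimetres; 0 marks an invalid reading
Mask = List[List[bool]]
Point = Tuple[int, int]   # (row, column)


def segment_nearest(depth: Depth, band_mm: int = 150) -> Mask:
    """Keep pixels within band_mm of the closest valid depth reading.

    Assumption: the hand is the object nearest to the sensor.
    """
    valid = [d for row in depth for d in row if d > 0]
    if not valid:
        return [[False] * len(row) for row in depth]
    nearest = min(valid)
    return [[0 < d <= nearest + band_mm for d in row] for row in depth]


def mask_points(mask: Mask) -> List[Point]:
    """List the (row, column) coordinates of all foreground pixels."""
    return [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]


def centroid(mask: Mask) -> Optional[Tuple[float, float]]:
    """Mean position of the foreground pixels, a rough palm-centre estimate."""
    pts = mask_points(mask)
    if not pts:
        return None
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def farthest_point(mask: Mask, centre: Tuple[float, float]) -> Optional[Point]:
    """Foreground pixel farthest from centre, a crude fingertip candidate."""
    pts = mask_points(mask)
    if not pts:
        return None
    return max(pts, key=lambda p: (p[0] - centre[0]) ** 2 + (p[1] - centre[1]) ** 2)


# Toy 5x5 depth frame: background at 2000 mm, a hand-like blob near 600 mm.
frame: Depth = [
    [2000, 2000,  600, 2000, 2000],
    [2000, 2000,  610, 2000, 2000],
    [2000,  620,  620,  620, 2000],
    [2000,  630,  630,  630, 2000],
    [   0, 2000, 2000, 2000, 2000],
]

hand_mask = segment_nearest(frame)
palm = centroid(hand_mask)
tip = farthest_point(hand_mask, palm) if palm is not None else None
```

On real sensor data the nearest-object assumption fails whenever another object sits in front of the hand, which is one reason a practical system needs a more robust detection step than this threshold.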
Acknowledgments
This work is supported by CASIA - Beijing CAS X-Vision Digital Technology Co., Ltd. Joint Lab of Information Visualization.
Cite this article
Qin, S., Zhu, X., Yang, Y. et al. Real-time Hand Gesture Recognition from Depth Images Using Convex Shape Decomposition Method. J Sign Process Syst 74, 47–58 (2014). https://doi.org/10.1007/s11265-013-0778-7
DOI: https://doi.org/10.1007/s11265-013-0778-7