Abstract
Retrieving a specific human motion from 3D skeleton data is challenging because of the articulated complexity of the human body. We propose a context-based motion document formation method that captures geometric variations by computing covariance descriptors over skeletal joint locations and relative joint distances, and temporal variations by applying a coarse-to-fine segmentation to the motion sequence. The descriptors of a query motion are matched against all motion categories to determine its motion words, which serve as the basic units of a motion document. The discrete motion words of different spatiotemporal descriptors are also mapped to separate index ranges, which gives latent Dirichlet allocation (LDA) prior knowledge of the temporal order of the motion. Similarity matching is then based on the semantically meaningful motion-topic distributions inferred by LDA. Experiments on public datasets show that the proposed method is effective and robust compared with existing models.
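The descriptor and indexing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal-length non-overlapping segments, the toy data, and the example word ids are all assumptions made for illustration, and the step that assigns a descriptor to a motion word is left out because the abstract does not specify it.

```python
import numpy as np


def covariance_descriptor(segment):
    """Upper triangle of the covariance matrix of a (frames, dims) segment.

    Each row is one frame of flattened joint coordinates (assumed layout).
    """
    cov = np.cov(segment, rowvar=False)      # dims x dims covariance
    rows, cols = np.triu_indices(cov.shape[0])
    return cov[rows, cols]                   # symmetric, so keep one half


def coarse_to_fine_descriptors(motion, levels=2):
    """One descriptor per segment; level l splits the sequence into 2**l parts.

    Equal-length, non-overlapping segments are an assumption of this sketch.
    """
    descriptors = []
    for level in range(levels):
        for segment in np.array_split(motion, 2 ** level, axis=0):
            descriptors.append(covariance_descriptor(segment))
    return descriptors


def offset_word_ids(word_ids, vocab_size):
    """Shift the word id of descriptor slot s into its own index range.

    Slot s uses indices [s * vocab_size, (s + 1) * vocab_size), so words from
    different spatiotemporal descriptors never share an index.
    """
    return [slot * vocab_size + word for slot, word in enumerate(word_ids)]


# Toy motion: 40 frames, 2 joints x 3 coordinates (random data, illustration only).
rng = np.random.default_rng(0)
motion = rng.normal(size=(40, 6))

descs = coarse_to_fine_descriptors(motion, levels=2)

# Hypothetical word ids, one per descriptor slot, with 5 words per slot.
document = offset_word_ids([3, 1, 4], vocab_size=5)
```

With two levels the sketch yields three descriptors (one for the whole sequence, two for its halves), and the resulting list of shifted word indices plays the role of a motion document that a topic model such as LDA could take as input.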
Acknowledgements
The work described in this paper was fully supported by grants from City University of Hong Kong (Project Nos. 7004681 and 7004916).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Men, Q., Leung, H. Retrieval of spatial–temporal motion topics from 3D skeleton data. Vis Comput 35, 973–984 (2019). https://doi.org/10.1007/s00371-019-01690-x
DOI: https://doi.org/10.1007/s00371-019-01690-x