Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding

  • Original Paper
  • Machine Vision and Applications

Abstract

In this paper, we present a human action recognition method for human silhouette sequences. Inspired by locality preserving projection and its variants, we propose a novel manifold embedding method, maximum spatio-temporal dissimilarity embedding, which embeds each action frame into a manifold where frames from different action classes are well separated. Unlike existing methods that incorporate both inter-class and intra-class information in the embedding process, the proposed method focuses on maximizing distances between frames that are similar in appearance but belong to different classes, and it takes temporal information into consideration. A variant of the Hausdorff distance is introduced for frame and sequence classification. Extensive experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness and robustness of the proposed method for human action silhouette analysis.
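
The abstract does not say which Hausdorff variant is used, so the following is only a minimal sketch of how a Hausdorff-style distance could compare two sequences of embedded frames. It assumes the modified Hausdorff distance (the mean of nearest-frame distances, taken in both directions) as a stand-in and is not the authors' formulation. The function names and toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedded frames (equal-length tuples)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def directed_mean_min(seq_a, seq_b):
    """Average, over frames in seq_a, of the distance to the closest frame in seq_b."""
    return sum(min(euclidean(a, b) for b in seq_b) for a in seq_a) / len(seq_a)


def modified_hausdorff(seq_a, seq_b):
    """Symmetric distance: the larger of the two directed averages.

    Assumption: this is the 'modified' Hausdorff distance, used here only
    as an illustrative variant; the paper's own variant may differ.
    """
    return max(directed_mean_min(seq_a, seq_b), directed_mean_min(seq_b, seq_a))


def classify_sequence(query, labelled_sequences):
    """Return the label of the training sequence closest to the query.

    labelled_sequences is a list of (label, sequence) pairs (hypothetical layout).
    """
    label, _ = min(
        ((lab, modified_hausdorff(query, seq)) for lab, seq in labelled_sequences),
        key=lambda pair: pair[1],
    )
    return label


# Toy 2-D "embedded" frames for two made-up action classes.
walk = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
wave = [(0.0, 5.0), (0.0, 6.0), (0.0, 7.0)]
query = [(0.1, 0.0), (1.1, 0.1), (1.9, 0.0)]

print(classify_sequence(query, [("walk", walk), ("wave", wave)]))  # prints "walk"
```

Averaging the nearest-frame distances, rather than taking their maximum as in the classical Hausdorff distance, makes the comparison less sensitive to a single outlier frame, which is one common motivation for using a variant.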

Notes

  1. The results of LSTDE are obtained from its original paper.

References

  1. Levin, E., Pieraccini, R., Eckert, W.: A stochastic model of human–machine interaction for learning dialog strategies. IEEE Trans. Speech Audio Process. 8, 11–23 (2000)

  2. Dufaux, F., Ebrahimi, T.: Scrambling for Video Surveillance with Privacy. IEEE Conference on Computer Vision and Pattern Recognition Workshop (2006)

  3. Rougier, C., Meunier, J., St-Arnaud, A., Rousseau, J.: Fall detection from human shape and motion history using video surveillance. Int. Conf. Adv. Inf. Netw. Appl. Workshops 2, 875–880 (2007)

  4. Niebles, J.C., Chen, C., Li, F.: Modeling temporal structure of decomposable motion segments for activity classification. European Conference on Computer Vision, pp. 392–405 (2010)

  5. Petković, M., Jonker, W.: Content-based video retrieval by integrating spatio-temporal and stochastic recognition of events. IEEE Workshop on Detection and Recognition of Events in Video, pp. 75–82 (2001)

  6. Geetha, P., Narayanan, V.: A survey of content-based video retrieval. J. Comput. Sci. 4, 474–486 (2008)

  7. Efros, A., Berg, A.C., Mori, G., Malik, J.: Recognizing action at a distance. Int. Conf. Comput. Vis. 2, 726–733 (2003)

  8. Collins, R., Gross, R., Shi, J.: Silhouette-based human identification from body shape and gait. IEEE Conference on Automatic Face and Gesture Recognition, pp. 366–371 (2002)

  9. Schüldt, C., Laptev, I., Caputo, B.: Recognizing human actions: a local svm approach. IEEE Conf. Autom. Face Gesture Recognit. 3, 32–36 (2004)

  10. Ke, Y., Sukthankar, R., Hebert, M.H.: Efficient visual event detection using volumetric features. Int. Conf. Comput. Vis. 1, 166–173 (2005)

  11. Laptev, I.: On space-time interest points. Int. J. Comput. Vis. 64, 107–123 (2005)

  12. Wang, L., Suter, D.: Learning and matching of dynamic shape manifolds for human action recognition. IEEE Trans. Image Process. 16, 1646–1661 (2007)

  13. Wang, L., Suter, D.: Recognizing human activities from silhouettes: motion subspace and factorial discriminative graphical model. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

  14. Bobick, A., Davis, J.: The recognition of human movement using temporal template. IEEE Trans. Pattern Anal. Mach. Intell. 23, 257–267 (2001)

  15. He, X., Niyogi, P.: Locality preserving projections. Neural Inf. Process. Syst. 16, 153–160 (2003)

  16. Blackburn, J., Ribeiro, E.: Human motion recognition using isomap and dynamic time warping. Int. Conf. Comput. Vis. Workshop Human Motion 4814, 285–298 (2007)

  17. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323 (2000)

  18. Rabiner, L., Juang, B.: Fundamentals of Speech Recognition. Prentice Hall, New York (1993)

  19. Roweis, S., Saul, L.: Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326 (2000)

  20. Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding and clustering. Neural Inf. Process. Syst. 14, 585–591 (2001)

  21. Zhang, Z., Zha, H.: Principal manifolds and nonlinear dimension reduction via local tangent space alignment. SIAM J. Sci. Comput. 8, 406–424 (2005)

  22. Yan, S., Xu, D., Zhang, B., Zhang, H., Yang, Q., Lin, S.: Graph embedding and extensions: a general framework for dimensionality reduction. IEEE Trans. Pattern Anal. Mach. Intell. 40–51 (2007)

  23. Jenkins, O., Mataric, M.: A spatio-temporal extension to isomap nonlinear dimension reduction. International Conference on Machine Learning, pp. 56–61 (2004)

  24. Fang, C., Chen, J., Tseng, C., Lien, J.: Human action recognition using spatio-temporal classification. Asian Conf. Comput. Vis. 5995, 98–109 (2009)

  25. Lewandowski, M., del Rincon, J.M., Makris, D., Nebel, J.: Temporal extension of laplacian eigenmaps for unsupervised dimensionality reduction of time series. International Conference on Pattern Recognition, pp. 161–164 (2010)

  26. Jia, K., Yeung, D.: Human action recognition using local spatio-temporal discriminant embedding. IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)

  27. Zheng, Z., Yang, F., Tan, W., Jia, J., Yang, J.: Gabor feature-based face recognition using supervised locality preserving projection. Signal Process. 87, 2473–2483 (2007)

  28. Kokiopoulou, E., Saad, Y.: Orthogonal neighborhood preserving projections: a projection-based dimensionality reduction technique. IEEE Trans. Pattern Anal. Mach. Intell. 29, 2143–2156 (2007)

  29. Cai, D., He, X., Zhou, K.: Locality sensitive discriminant analysis. International Joint Conference on Artificial Intelligence, pp. 708–713 (2007)

  30. Cai, D., He, X.: Orthogonal locality preserving indexing. ACM SIGIR Conference on Research and development in Information Retrieval, pp. 3–10 (2005)

  31. Wang, L., Suter, D.: Visual learning and recognition of sequential data manifolds with applications to human movement analysis. Comput. Vis. Image Underst. 110, 153–172 (2008)

  32. Ma, J., Yuen, P.C., Zou, W., Lai, J.H.: Supervised neighborhood topology learning for human action recognition. International Conference on Computer Vision Workshops, pp. 476–481 (2009)

  33. Gorelick, L., Blank, M., Shechtman, E., Irani, M., Basri, R.: Action as space-time shapes. IEEE Trans. Pattern Anal. Mach. Intell. 29, 2247–2253 (2007)

  34. Wang, L., Tan, T.: Silhouette analysis based gait recognition for human identification. IEEE Trans. Pattern Anal. Mach. Intell. 25, 1505–1518 (2003)

  35. Weinland, D., Ronfard, R., Boyer, E.: Free viewpoint action recognition using motion history volumes. Comput. Vis. Image Underst. 104, 249–257 (2006)

  36. Weinland, D., Boyer, E., Ronfard, R.: Action recognition from arbitrary views using 3D exemplars. International Conference on Computer Vision, pp. 1–7 (2007)

Acknowledgments

This work was supported by the National Science Foundation of China (No. 61301269 and No. 61201271), the Research Fund for the Doctoral Program of Higher Education (No. 20100185120021), and the Science and Technology Cooperation Program with the Academy of China and Sichuan Province (No. 2012JZ0001).

Author information

Corresponding author

Correspondence to Hongsheng Li.

About this article

Cite this article

Cheng, J., Liu, H. & Li, H. Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding. Machine Vision and Applications 25, 1007–1018 (2014). https://doi.org/10.1007/s00138-013-0581-2

  • DOI: https://doi.org/10.1007/s00138-013-0581-2
