Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding

Jian Cheng¹,
Haijun Liu¹ &
Hongsheng Li¹

Abstract

In this paper, we present a human action recognition method for human silhouette sequences. Inspired by locality preserving projection and its variants, we propose a novel manifold embedding method, maximum spatio-temporal dissimilarity embedding, which embeds each action frame into a manifold where frames from different action classes are well separated. Unlike existing methods that incorporate both inter-class and intra-class information in the embedding process, the proposed method focuses on maximizing the distances between frames that are similar in appearance but belong to different classes, and it takes temporal information into account. A variant of the Hausdorff distance is introduced for frame and sequence classification. Extensive experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness and robustness of the proposed method for human action silhouette analysis.
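
To illustrate how a Hausdorff-type distance can drive sequence classification, the following is a minimal sketch, not the authors' formulation. The abstract does not specify which variant the paper uses, so the sketch assumes the well-known modified Hausdorff distance (the mean of nearest-neighbor distances, symmetrized with a maximum) as a stand-in. It also assumes each sequence is already embedded as an array of low-dimensional frame vectors; the function names and the nearest-sequence classifier are hypothetical.

```python
import numpy as np


def directed_mean_min_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean, over the rows of `a`, of the distance to the nearest row of `b`."""
    # Pairwise Euclidean distances between frames, shape (len(a), len(b)).
    diffs = a[:, None, :] - b[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return float(dists.min(axis=1).mean())


def modified_hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric distance between two sets of embedded frames.

    Assumption: this is the generic modified Hausdorff distance, used here
    only as a stand-in for the unspecified variant in the paper.
    """
    return max(directed_mean_min_distance(a, b),
               directed_mean_min_distance(b, a))


def classify_sequence(query, train_sequences, train_labels):
    """Assign the label of the training sequence closest to `query`."""
    distances = [modified_hausdorff(query, seq) for seq in train_sequences]
    return train_labels[int(np.argmin(distances))]
```

Because the distance compares unordered sets of frames, two sequences of different lengths can be compared directly without temporal alignment. Any temporal information would have to come from the embedding itself in this sketch.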

Notes

The results of LSTDE (local spatio-temporal discriminant embedding; Jia and Yeung, 2008) are taken from its original paper.

Acknowledgments

This work was supported by the National Science Foundation of China (No. 61301269 and No. 61201271), the Research Fund for the Doctoral Program of Higher Education (No. 20100185120021), and the Science and Technology Cooperation Program with the Academy of China and Sichuan Province (No. 2012JZ0001).

Author information

Authors and Affiliations

School of Electronic Engineering, University of Electronic Science and Technology of China, 2006 Xiyuan Ave., Chengdu, 611731, China
Jian Cheng, Haijun Liu & Hongsheng Li

Corresponding author

Correspondence to Hongsheng Li.

About this article

Cite this article

Cheng, J., Liu, H. & Li, H. Silhouette analysis for human action recognition based on maximum spatio-temporal dissimilarity embedding. Machine Vision and Applications 25, 1007–1018 (2014). https://doi.org/10.1007/s00138-013-0581-2

Received: 25 March 2013
Revised: 29 October 2013
Accepted: 04 November 2013
Published: 23 November 2013
Issue Date: May 2014
DOI: https://doi.org/10.1007/s00138-013-0581-2
