Tensor discriminant analysis on Grassmann manifold with application to video-based human action recognition

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Representing videos as linear subspaces on Grassmann manifolds has led to substantial progress in action recognition. Recent studies have exploited Grassmann kernels to carry out discriminant analysis on these manifolds. However, traditional methods rely on a matrix representation of each video that is organized along the temporal dimension, and they therefore ignore the structure of the two spatial dimensions. To overcome this problem, we keep the natural form of videos by representing them as multidimensional arrays, known as tensors, and propose a tensor discriminant analysis approach on Grassmann manifolds. Because conventional matrix algebra does not extend directly to tensor data, we introduce a new Grassmann projection kernel based on the tensor-tensor decomposition and product. Experiments on human action databases show that the proposed method performs well compared with state-of-the-art algorithms.
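
The abstract does not spell out the kernel, so the following is only a minimal sketch of how a projection kernel could be built from a tensor-tensor product. It assumes the FFT-based t-product and t-SVD of Kilmer and Martin [19] and a direct tensor analogue of the matrix projection kernel of Hamm and Lee [4], namely the squared Frobenius norm of U1^T * U2. The function names, the truncation parameter `q`, and the orientation of the video tensor are hypothetical choices made for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3)."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # In the Fourier domain the t-product is one matrix product per frontal slice.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    return np.real(np.fft.ifft(C_hat, axis=2))


def t_transpose(A):
    """Transpose every frontal slice and reverse the order of slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)


def t_svd_basis(X, q):
    """Real n1 x q x n3 tensor U whose product U^T * U is the identity tensor."""
    n1, n2, n3 = X.shape
    if q > min(n1, n2):
        raise ValueError("q must not exceed min(n1, n2)")
    X_hat = np.fft.fft(X, axis=2)
    U_hat = np.zeros((n1, q, n3), dtype=complex)
    for k in range(n3 // 2 + 1):
        if k == 0 or 2 * k == n3:
            # These Fourier slices are real for real X; keep their bases real.
            U, _, _ = np.linalg.svd(X_hat[:, :, k].real, full_matrices=False)
            U_hat[:, :, k] = U[:, :q]
        else:
            U, _, _ = np.linalg.svd(X_hat[:, :, k], full_matrices=False)
            U_hat[:, :, k] = U[:, :q]
            # Enforce conjugate symmetry so that the inverse FFT is real.
            U_hat[:, :, n3 - k] = np.conj(U[:, :q])
    return np.real(np.fft.ifft(U_hat, axis=2))


def tensor_projection_kernel(X1, X2, q):
    """Assumed tensor analogue of the projection kernel: ||U1^T * U2||_F^2."""
    U1 = t_svd_basis(X1, q)
    U2 = t_svd_basis(X2, q)
    G = t_product(t_transpose(U1), U2)
    return float(np.sum(G ** 2))


# Synthetic stand-ins for two video tensors (random data, not a real data set).
rng = np.random.default_rng(0)
video_a = rng.standard_normal((16, 10, 12))
video_b = rng.standard_normal((16, 10, 12))
k_aa = tensor_projection_kernel(video_a, video_a, q=3)
k_ab = tensor_projection_kernel(video_a, video_b, q=3)
```

Under these assumptions the kernel of a tensor with itself equals `q` up to rounding, because U^T * U is the identity tensor, and the kernel of two different tensors lies between 0 and `q`. A kernel matrix assembled from such values could then be passed to a kernel discriminant analysis, which this sketch does not implement.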

Availability of data and materials

The experimental results were obtained on four well-known data sets: the Cambridge Hand Gesture [47], Weizmann [49], UTD-MHAD [48], and UCF sports action [50] data sets. These data sets are freely available. The source code of the proposed method is available at: https://github.com/Cagri-Ozdemir/TGDA

Notes

  1. Note that while we generally use upper-case calligraphic letters to denote tensors, to remain consistent with the literature we denote a Grassmann manifold by the upper-case calligraphic \(\mathcal{G}\).

  2. Our source code is available in the GitHub repository: https://github.com/Cagri-Ozdemir/TGDA.

References

  1. Aggarwal JK, Ryoo MS (2011) Human activity analysis: a review. ACM Comput Surv 43(3):1–43

  2. Moeslund TB, Granum E (2001) A survey of computer vision-based human motion capture. Comput Vis Image Underst 81(3):231–268

  3. Kuo C-H, Nevatia R (2011) How does person identity recognition help multi-person tracking? In: CVPR 2011. IEEE, pp 1217–1224

  4. Hamm J, Lee DD (2008) Grassmann discriminant analysis: a unifying view on subspace-based learning. In: Proceedings of the 25th international conference on machine learning, pp 376–383

  5. Harandi MT, Sanderson C, Shirazi S, Lovell BC (2011) Graph embedding discriminant analysis on Grassmannian manifolds for improved image set matching. In: CVPR 2011. IEEE, pp 2705–2712

  6. Lui YM, Beveridge JR, Kirby M (2010) Action classification on product manifolds. In: 2010 IEEE computer society conference on computer vision and pattern recognition. IEEE, pp 833–839

  7. Lui YM (2011) Tangent bundles on special manifolds for action recognition. IEEE Trans Circuits Syst Video Technol 22(6):930–942

  8. Lui YM (2012) Human gesture recognition on product manifolds. J Mach Learn Res 13(1):3297–3321

  9. Sharma K, Rameshan R (2020) Image set classification using a distance-based kernel over affine Grassmann manifold. IEEE Trans Neural Netw Learn Syst 32(3):1082–1095

  10. Gatto BB, dos Santos EM, Koerich AL, Fukui K, Junior WS (2021) Tensor analysis with n-mode generalized difference subspace. Expert Syst Appl 171:114559

  11. Liu Y, Gao Q, Miao S, Gao X, Nie F, Li Y (2016) A non-greedy algorithm for L1-norm LDA. IEEE Trans Image Process 26(2):684–695

  12. Liu Y, Gao X, Gao Q, Shao L, Han J (2019) Adaptive robust principal component analysis. Neural Netw 119:85–92

  13. Lai Z, Xu Y, Yang J, Tang J, Zhang D (2013) Sparse tensor discriminant analysis. IEEE Trans Image Process 22(10):3904–3915

  14. Lu J, Lai Z, Wang H, Chen Y, Zhou J, Shen L (2020) Generalized embedding regression: a framework for supervised feature extraction. IEEE Trans Neural Netw Learn Syst 33(1):185–199

  15. Lu J, Wang H, Zhou J, Chen Y, Lai Z, Hu Q (2021) Low-rank adaptive graph embedding for unsupervised feature extraction. Pattern Recognit 113:107758

  16. Harandi MT, Salzmann M, Jayasumana S, Hartley R, Li H (2014) Expanding the family of Grassmannian kernels: an embedding perspective. In: European conference on computer vision. Springer, pp 408–423

  17. Kilmer ME, Martin CD, Perrone L (2008) A third-order generalization of the matrix SVD as a product of third-order tensors. Tufts University, Department of Computer Science, technical report TR-2008-4

  18. Braman K (2010) Third-order tensors as linear operators on a space of matrices. Linear Algebra Appl 433(7):1241–1253

  19. Kilmer ME, Martin CD (2011) Factorization strategies for third-order tensors. Linear Algebra Appl 435(3):641–658

  20. Karner H, Schneid J, Ueberhuber CW (2003) Spectral decomposition of real circulant matrices. Linear Algebra Appl 367:301–311

  21. Gleich DF, Greif C, Varah JM (2013) The power and Arnoldi methods in an algebra of circulants. Numer Linear Algebra Appl 20(5):809–831

  22. Kernfeld E, Kilmer M, Aeron S (2015) Tensor-tensor products with invertible linear transforms. Linear Algebra Appl 485:545–570

  23. Tarzanagh DA, Michailidis G (2018) Fast randomized algorithms for t-product based tensor operations and decompositions with applications to imaging data. SIAM J Imaging Sci 11(4):2629–2664

  24. Ozdemir C, Hoover RC, Caudle K (2021) Fast tensor singular value decomposition using the low-resolution features of tensors. In: 2021 20th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 527–533

  25. Zhang Z, Aeron S (2016) Exact tensor completion using t-SVD. IEEE Trans Signal Process 65(6):1511–1526

  26. Zhou P, Lu C, Lin Z, Zhang C (2017) Tensor factorization for low-rank tensor completion. IEEE Trans Image Process 27(3):1152–1163

  27. Zhang L, Song L, Du B, Zhang Y (2019) Nonlocal low-rank tensor completion for visual data. IEEE Trans Cybern 51(2):673–685

  28. Soltani S, Kilmer ME, Hansen PC (2016) A tensor-based dictionary learning approach to tomographic image reconstruction. BIT Numer Math 56(4):1425–1454

  29. Zhang C, Hu W, Jin T, Mei Z (2018) Nonlocal image denoising via adaptive tensor nuclear norm minimization. Neural Comput Appl 29(1):3–19

  30. Ozdemir C, Hoover RC, Caudle K, Braman K (2022) Kernelization of tensor discriminant analysis with application to image recognition. In: 2022 21st IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 183–189

  31. Kilmer ME, Braman K, Hao N, Hoover RC (2013) Third-order tensors as operators on matrices: a theoretical and computational framework with applications in imaging. SIAM J Matrix Anal Appl 34(1):148–172

  32. Hao N, Kilmer ME, Braman K, Hoover RC (2013) Facial recognition using tensor-tensor decompositions. SIAM J Imaging Sci 6(1):437–463

  33. Li Q, Schonfeld D (2014) Multilinear discriminant analysis for higher-order tensor data classification. IEEE Trans Pattern Anal Mach Intell 36(12):2524–2537

  34. Zhang J, Li Z, Jing P, Liu Y, Su Y (2019) Tensor-driven low-rank discriminant analysis for image set classification. Multim Tools Appl 78(4):4001–4020

  35. Hoover RC, Braman KS, Hao N (2011) Pose estimation from a single image using tensor decomposition and an algebra of circulants. In: 2011 IEEE/RSJ international conference on intelligent robots and systems. IEEE, pp 2928–2934

  36. Ozdemir C, Hoover RC, Caudle K (2021) 2DTPCA: a new framework for multilinear principal component analysis. In: 2021 IEEE international conference on image processing (ICIP). IEEE, pp 344–348

  37. Strang G, Nguyen T (1996) Wavelets and filter banks. SIAM

  38. Jensen A, la Cour-Harbo A (2001) Ripples in mathematics: the discrete wavelet transform. Springer, New York

  39. Haar A (1910) Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen 69(3):331–371

  40. Daubechies I (1993) Orthonormal bases of compactly supported wavelets II. Variations on a theme. SIAM J Math Anal 24(2):499–519

  41. Porwik P, Lisowska A (2004) The Haar-wavelet transform in digital image processing: its status and achievements. Mach Graphics Vis 13(1/2):79–98

  42. Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3):455–500

  43. Hoover RC, Caudle K, Braman K (2018) Multilinear discriminant analysis through tensor-tensor eigendecomposition. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 578–584

  44. Ozdemir C, Hoover RC, Caudle K, Braman K (2022) High-order multilinear discriminant analysis via order-n tensor eigendecomposition. arXiv:2205.09191

  45. Edelman A, Arias TA, Smith ST (1998) The geometry of algorithms with orthogonality constraints. SIAM J Matrix Anal Appl 20(2):303–353

  46. Hamm J, Lee D (2008) Extended Grassmann kernels for subspace-based learning. Adv Neural Inf Process Syst 21

  47. Kim T-K, Wong S-F, Cipolla R (2007) Tensor canonical correlation analysis for action classification. In: 2007 IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8

  48. Chen C, Jafari R, Kehtarnavaz N (2015) UTD-MHAD: a multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor. In: 2015 IEEE international conference on image processing (ICIP). IEEE, pp 168–172

  49. Blank M, Gorelick L, Shechtman E, Irani M, Basri R (2005) Actions as space-time shapes. In: Tenth IEEE international conference on computer vision (ICCV’05) volume 1, vol 2. IEEE, pp 1395–1402

  50. Rodriguez M (2010) Spatio-temporal maximum average correlation height templates in action recognition and video summarization

  51. Kim T-K, Cipolla R (2008) Canonical correlation analysis of video volume tensors for action categorization and detection. IEEE Trans Pattern Anal Mach Intell 31(8):1415–1428

  52. Suryanto CH, Xue J-H, Fukui K (2016) Randomized time warping for motion recognition. Image Vis Comput 54:1–11

Acknowledgements

This research was supported in part by the Department of the Navy, Naval Engineering Education Consortium under Grant No. N00174-19-1-0014 and by the National Science Foundation under Grant No. 2007367.

Author information

Contributions

Cagri Ozdemir: conducting the research and investigation; developing the methodology; performing the experiments; writing the initial draft of the paper. Randy C. Hoover: managing and coordinating the research planning and execution; verifying the research outputs; critically reviewing, commenting on, and revising the paper. Kyle Caudle: verifying the research outputs; critically reviewing, commenting on, and revising the paper. Karen Braman: verifying the research outputs; critically reviewing, commenting on, and revising the paper.

Corresponding author

Correspondence to Cagri Ozdemir.

Ethics declarations

Conflict of Interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ozdemir, C., Hoover, R.C., Caudle, K. et al. Tensor discriminant analysis on grassmann manifold with application to video based human action recognition. Int. J. Mach. Learn. & Cyber. 15, 3353–3365 (2024). https://doi.org/10.1007/s13042-024-02096-5

  • DOI: https://doi.org/10.1007/s13042-024-02096-5
