Abstract
Representing videos as linear subspaces on Grassmann manifolds has led to significant progress in action recognition. Recent studies have used Grassmann kernels to perform discriminant analysis on these manifolds. However, traditional methods rely on a matrix representation of videos built along the temporal dimension and therefore neglect the two spatial dimensions. To overcome this problem, we keep the natural form of videos by representing them as multidimensional arrays, known as tensors, and propose a tensor discriminant analysis approach on Grassmann manifolds. Because standard matrix algebra does not directly handle tensor data, we introduce a new Grassmann projection kernel based on the tensor-tensor decomposition and product. Experiments on human action databases show that the proposed method performs well compared with state-of-the-art algorithms.
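To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard matrix projection kernel \(k(X, Y) = \Vert X^{\top} Y \Vert_F^2\) for orthonormal bases, and the standard FFT-based tensor-tensor product (t-product) for third-order tensors. The function names `projection_kernel` and `t_product` are illustrative, and the tensor kernel proposed in the paper is not reproduced here.

```python
import numpy as np


def projection_kernel(X, Y):
    """Matrix Grassmann projection kernel ||X^T Y||_F^2.

    X and Y are assumed to have orthonormal columns (subspace bases).
    """
    return np.linalg.norm(X.T @ Y, "fro") ** 2


def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3).

    Computed by taking the FFT along the third mode, multiplying the
    matching frontal slices, and transforming back.
    """
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Slice-wise matrix products in the Fourier domain.
    C_hat = np.einsum("ijk,jlk->ilk", A_hat, B_hat)
    # For real inputs the result is real up to round-off.
    return np.real(np.fft.ifft(C_hat, axis=2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two random 2-dimensional subspaces of R^6 (illustrative sizes).
    X, _ = np.linalg.qr(rng.standard_normal((6, 2)))
    Y, _ = np.linalg.qr(rng.standard_normal((6, 2)))
    print("projection kernel:", projection_kernel(X, Y))

    A = rng.standard_normal((3, 4, 5))
    B = rng.standard_normal((4, 2, 5))
    print("t-product shape:", t_product(A, B).shape)
```

The FFT route is equivalent to a circular convolution of frontal slices along the third mode, which is why the t-product can be evaluated slice by slice in the Fourier domain.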
Availability of data and materials
The experiments reported in this article use four well-known data sets: the Cambridge Hand Gesture [47], Weizmann [49], UTD-MHAD [48], and UCF sports action [50] data sets. These data sets are freely available. The source code of the proposed method is available at: https://github.com/Cagri-Ozdemir/TGDA
Notes
Although we generally use upper-case calligraphic letters to denote tensors, we also denote a Grassmann manifold by an upper-case calligraphic \({\mathcal {G}}\), consistent with the literature.
Our source code is available in the GitHub repository: https://github.com/Cagri-Ozdemir/TGDA.
References
Aggarwal JK, Ryoo MS (2011) Human activity analysis: a review. ACM Comput Surv 43(3):1–43
Moeslund TB, Granum E (2001) A survey of computer vision-based human motion capture. Comput Vis Image Underst 81(3):231–268
Kuo C-H, Nevatia R (2011) How does person identity recognition help multi-person tracking? In: CVPR 2011. IEEE, pp 1217–1224
Hamm J, Lee DD (2008) Grassmann discriminant analysis: a unifying view on subspace-based learning. In: Proceedings of the 25th international conference on machine learning, pp 376–383
Harandi MT, Sanderson C, Shirazi S, Lovell BC (2011) Graph embedding discriminant analysis on Grassmannian manifolds for improved image set matching. In: CVPR 2011. IEEE, pp 2705–2712
Lui YM, Beveridge JR, Kirby M (2010) Action classification on product manifolds. In: 2010 IEEE computer society conference on computer vision and pattern recognition. IEEE, pp 833–839
Lui YM (2011) Tangent bundles on special manifolds for action recognition. IEEE Trans Circuits Syst Video Technol 22(6):930–942
Lui YM (2012) Human gesture recognition on product manifolds. J Mach Learn Res 13(1):3297–3321
Sharma K, Rameshan R (2020) Image set classification using a distance-based kernel over affine Grassmann manifold. IEEE Trans Neural Netw Learn Syst 32(3):1082–1095
Gatto BB, dos Santos EM, Koerich AL, Fukui K, Junior WS (2021) Tensor analysis with n-mode generalized difference subspace. Expert Syst Appl 171:114559
Liu Y, Gao Q, Miao S, Gao X, Nie F, Li Y (2016) A non-greedy algorithm for L1-norm LDA. IEEE Trans Image Process 26(2):684–695
Liu Y, Gao X, Gao Q, Shao L, Han J (2019) Adaptive robust principal component analysis. Neural Netw 119:85–92
Lai Z, Xu Y, Yang J, Tang J, Zhang D (2013) Sparse tensor discriminant analysis. IEEE Trans Image Process 22(10):3904–3915
Lu J, Lai Z, Wang H, Chen Y, Zhou J, Shen L (2020) Generalized embedding regression: a framework for supervised feature extraction. IEEE Trans Neural Netw Learn Syst 33(1):185–199
Lu J, Wang H, Zhou J, Chen Y, Lai Z, Hu Q (2021) Low-rank adaptive graph embedding for unsupervised feature extraction. Pattern Recognit 113:107758
Harandi MT, Salzmann M, Jayasumana S, Hartley R, Li H (2014) Expanding the family of Grassmannian kernels: an embedding perspective. In: European conference on computer vision. Springer, pp 408–423
Kilmer ME, Martin CD, Perrone L (2008) A third-order generalization of the matrix SVD as a product of third-order tensors. Tufts University, Department of Computer Science, technical report TR-2008-4
Braman K (2010) Third-order tensors as linear operators on a space of matrices. Linear Algebra Appl 433(7):1241–1253
Kilmer ME, Martin CD (2011) Factorization strategies for third-order tensors. Linear Algebra Appl 435(3):641–658
Karner H, Schneid J, Ueberhuber CW (2003) Spectral decomposition of real circulant matrices. Linear Algebra Appl 367:301–311
Gleich DF, Greif C, Varah JM (2013) The power and Arnoldi methods in an algebra of circulants. Numer Linear Algebra Appl 20(5):809–831
Kernfeld E, Kilmer M, Aeron S (2015) Tensor-tensor products with invertible linear transforms. Linear Algebra Appl 485:545–570
Tarzanagh DA, Michailidis G (2018) Fast randomized algorithms for t-product based tensor operations and decompositions with applications to imaging data. SIAM J Imaging Sci 11(4):2629–2664
Ozdemir C, Hoover RC, Caudle K (2021) Fast tensor singular value decomposition using the low-resolution features of tensors. In: 2021 20th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 527–533
Zhang Z, Aeron S (2016) Exact tensor completion using t-SVD. IEEE Trans Signal Process 65(6):1511–1526
Zhou P, Lu C, Lin Z, Zhang C (2017) Tensor factorization for low-rank tensor completion. IEEE Trans Image Process 27(3):1152–1163
Zhang L, Song L, Du B, Zhang Y (2019) Nonlocal low-rank tensor completion for visual data. IEEE Trans Cybern 51(2):673–685
Soltani S, Kilmer ME, Hansen PC (2016) A tensor-based dictionary learning approach to tomographic image reconstruction. BIT Numer Math 56(4):1425–1454
Zhang C, Hu W, Jin T, Mei Z (2018) Nonlocal image denoising via adaptive tensor nuclear norm minimization. Neural Comput Appl 29(1):3–19
Ozdemir C, Hoover RC, Caudle K, Braman K (2022) Kernelization of tensor discriminant analysis with application to image recognition. In: 2022 21st IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 183–189
Kilmer ME, Braman K, Hao N, Hoover RC (2013) Third-order tensors as operators on matrices: a theoretical and computational framework with applications in imaging. SIAM J Matrix Anal Appl 34(1):148–172
Hao N, Kilmer ME, Braman K, Hoover RC (2013) Facial recognition using tensor-tensor decompositions. SIAM J Imaging Sci 6(1):437–463
Li Q, Schonfeld D (2014) Multilinear discriminant analysis for higher-order tensor data classification. IEEE Trans Pattern Anal Mach Intell 36(12):2524–2537
Zhang J, Li Z, Jing P, Liu Y, Su Y (2019) Tensor-driven low-rank discriminant analysis for image set classification. Multim Tools Appl 78(4):4001–4020
Hoover RC, Braman KS, Hao N (2011) Pose estimation from a single image using tensor decomposition and an algebra of circulants. In: 2011 IEEE/RSJ international conference on intelligent robots and systems. IEEE, pp 2928–2934
Ozdemir C, Hoover RC, Caudle K (2021) 2DTPCA: a new framework for multilinear principal component analysis. In: 2021 IEEE international conference on image processing (ICIP). IEEE, pp 344–348
Strang G, Nguyen T (1996) Wavelets and filter banks. SIAM
Jensen A, la Cour-Harbo A (2001) Ripples in mathematics: the discrete wavelet transform. Springer, New York
Haar A (1910) Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen 69(3):331–371
Daubechies I (1993) Orthonormal bases of compactly supported wavelets II. Variations on a theme. SIAM J Math Anal 24(2):499–519
Porwik P, Lisowska A (2004) The Haar-wavelet transform in digital image processing: its status and achievements. Mach Graphics Vis 13(1/2):79–98
Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3):455–500
Hoover RC, Caudle K, Braman K (2018) Multilinear discriminant analysis through tensor-tensor eigendecomposition. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 578–584
Ozdemir C, Hoover RC, Caudle K, Braman K (2022) High-order multilinear discriminant analysis via order-n tensor eigendecomposition. arXiv:2205.09191
Edelman A, Arias TA, Smith ST (1998) The geometry of algorithms with orthogonality constraints. SIAM J Matrix Anal Appl 20(2):303–353
Hamm J, Lee D (2008) Extended Grassmann kernels for subspace-based learning. Adv Neural Inf Process Syst 21
Kim T-K, Wong S-F, Cipolla R (2007) Tensor canonical correlation analysis for action classification. In: 2007 IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8
Chen C, Jafari R, Kehtarnavaz N (2015) UTD-MHAD: a multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor. In: 2015 IEEE international conference on image processing (ICIP). IEEE, pp 168–172
Blank M, Gorelick L, Shechtman E, Irani M, Basri R (2005) Actions as space-time shapes. In: Tenth IEEE international conference on computer vision (ICCV’05), vol 2. IEEE, pp 1395–1402
Rodriguez M (2010) Spatio-temporal maximum average correlation height templates in action recognition and video summarization
Kim T-K, Cipolla R (2008) Canonical correlation analysis of video volume tensors for action categorization and detection. IEEE Trans Pattern Anal Mach Intell 31(8):1415–1428
Suryanto CH, Xue J-H, Fukui K (2016) Randomized time warping for motion recognition. Image Vis Comput 54:1–11
Acknowledgements
The current research was supported in part by the Department of the Navy, Naval Engineering Education Consortium under Grant No. (N00174-19-1-0014) and the National Science Foundation under Grant No. (2007367).
Author information
Contributions
Cagri Ozdemir: Conducting the research and investigation process; development of the methodology; performing the experiments; preparation of the paper, specifically writing the initial draft. Randy C. Hoover: Management and coordination responsibility for the research planning and execution; verification of the research outputs; critical review, commentary, and revision of the paper. Kyle Caudle: Verification of the research outputs; critical review, commentary, and revision of the paper. Karen Braman: Verification of the research outputs; critical review, commentary, and revision of the paper.
Ethics declarations
Conflict of Interest
The authors declare that they have no competing interests.
About this article
Cite this article
Ozdemir, C., Hoover, R.C., Caudle, K. et al. Tensor discriminant analysis on Grassmann manifold with application to video based human action recognition. Int. J. Mach. Learn. & Cyber. 15, 3353–3365 (2024). https://doi.org/10.1007/s13042-024-02096-5