Representing videos as linear subspaces on Grassmann manifolds has led to significant progress in action recognition. Recent studies have used Grassmann kernels to make discriminant analysis on these manifolds convenient. However, traditional methods rely on a matrix representation of videos organized along the temporal dimension, and therefore do not account for the two spatial dimensions. To overcome this problem, we keep the natural form of videos by representing video inputs as multidimensional arrays known as tensors, and we propose a tensor discriminant analysis approach on Grassmann manifolds. Because standard matrix algebra does not apply directly to tensor data, we introduce a new Grassmann projection kernel based on the tensor-tensor decomposition and product. Experiments on human action data sets show that the proposed method performs well compared with state-of-the-art algorithms.
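As a rough illustration of the ingredients named above, the following NumPy sketch shows the tensor-tensor product (t-product) computed with an FFT along the third mode, the associated tensor transpose, and the classical matrix projection kernel \(k(X, Y) = \Vert X^{T} Y \Vert_F^2\) for orthonormal bases. The function `tensor_projection_kernel` is a hypothetical tensor analogue written for illustration only; it is an assumption and not necessarily the kernel defined in the paper, and all function names here are illustrative.

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x n4 x n3).

    Computed slice-wise in the Fourier domain along the third mode.
    """
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Multiply matching frontal slices: C_hat[:, :, k] = A_hat[:, :, k] @ B_hat[:, :, k]
    C_hat = np.einsum('ijk,jlk->ilk', A_hat, B_hat)
    # Real inputs give a real result; drop the numerical imaginary residue.
    return np.real(np.fft.ifft(C_hat, axis=2))

def t_transpose(A):
    """Tensor transpose: transpose each frontal slice, then reverse slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def projection_kernel(X, Y):
    """Classical Grassmann projection kernel for matrices with orthonormal columns."""
    return np.linalg.norm(X.T @ Y, 'fro') ** 2

def tensor_projection_kernel(U, V):
    """Hypothetical tensor analogue (an assumption, not the paper's definition):
    squared Frobenius norm of the t-product U^T * V."""
    G = t_product(t_transpose(U), V)
    return np.sum(G ** 2)
```

When the third dimension has length one, the t-product reduces to the ordinary matrix product, so the tensor sketch agrees with the matrix projection kernel in that case.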
In this section, we present experimental results on four well-known data sets: the Cambridge Hand Gesture [47], Weizmann [49], UTD-MHAD [48], and UCF sports action [50] data sets. These data sets are freely available. The source code of the proposed method is available at: https://github.com/Cagri-Ozdemir/TGDA
Note that although we generally use upper-case calligraphic letters to denote tensors, we denote a Grassmann manifold by the upper-case calligraphic \({\mathcal {G}}\) to remain consistent with the literature.
Our source code is available in the GitHub repository: https://github.com/Cagri-Ozdemir/TGDA.
This research was supported in part by the Department of the Navy, Naval Engineering Education Consortium under Grant No. N00174-19-1-0014 and the National Science Foundation under Grant No. 2007367.
