Abstract
With advances in technology and the growing availability of multimedia content, human action recognition has become a major research area in computer vision, contributing to the semantic analysis of videos. How spatio-temporal information in videos is represented and matched strongly affects the design and performance of existing convolutional neural network approaches to human action recognition. In this paper, in contrast to the traditional approach of using raw video as input, we derive attributes from action bank features to represent and match spatio-temporal information effectively. The derived features are arranged in a square matrix and used as input to a convolutional neural network for action recognition. The effectiveness of the proposed approach is demonstrated on the KTH and UCF Sports datasets.
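The abstract does not say how the derived features are laid out in the square matrix. As a minimal sketch only, one plausible arrangement fills the smallest square that can hold the feature vector row by row and pads the remainder. The function name `to_square_matrix`, the row-major ordering, and the zero padding are all assumptions made for illustration, not the authors' stated method.

```python
import math
from typing import List, Sequence


def to_square_matrix(features: Sequence[float],
                     pad_value: float = 0.0) -> List[List[float]]:
    """Arrange a 1-D feature vector row by row into the smallest square
    matrix that can hold it, padding unused cells with pad_value.

    Hypothetical helper: the layout and padding are assumptions.
    """
    n = len(features)
    if n == 0:
        raise ValueError("feature vector must be non-empty")

    # Smallest side length whose square is at least n.
    side = math.isqrt(n)
    if side * side < n:
        side += 1

    # Pad to side*side entries, then slice into rows of length `side`.
    padded = list(features) + [pad_value] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Five features do not fill a 2x2 matrix, so a 3x3 matrix is used.
matrix = to_square_matrix([0.1, 0.2, 0.3, 0.4, 0.5])
for row in matrix:
    print(row)
```

A matrix built this way could then be passed to a 2-D convolutional network as a single-channel image; whether spatial adjacency in such a layout is meaningful depends on how the features are ordered, which the abstract leaves open.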
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ijjina, E.P., Mohan, C.K. (2015). Human Action Recognition Using Action Bank Features and Convolutional Neural Networks. In: Jawahar, C., Shan, S. (eds) Computer Vision - ACCV 2014 Workshops. ACCV 2014. Lecture Notes in Computer Science, vol 9008. Springer, Cham. https://doi.org/10.1007/978-3-319-16628-5_24
DOI: https://doi.org/10.1007/978-3-319-16628-5_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16627-8
Online ISBN: 978-3-319-16628-5
eBook Packages: Computer Science, Computer Science (R0)