DOI: 10.1109/CVPR.2014.83
Article

Multi-view Super Vector for Action Recognition

Published: 23 June 2014

Abstract

Images and videos are often characterized by multiple types of local descriptors, such as SIFT, HOG, and HOF, each of which describes certain aspects of an object's features. Recognition systems benefit from fusing multiple types of these descriptors. Two widely applied fusion pipelines are descriptor concatenation and kernel average. The first is effective when different descriptors are strongly correlated, while the second is probably better when the descriptors are relatively independent. In practice, however, different descriptors are neither fully independent nor fully correlated, and previous fusion methods may not be satisfactory. In this paper, we propose a new global representation, the Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors. Kernel average is then applied to these components to produce the recognition result. To obtain MVSV, we develop a generative mixture model of probabilistic canonical correlation analyzers (M-PCCA), and utilize the hidden factors and gradient vectors of M-PCCA to construct MVSV for video representation. Experiments on video-based action recognition tasks show that MVSV achieves promising results and outperforms FV and VLAD with either the descriptor concatenation or the kernel average fusion strategy.
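
For readers unfamiliar with the two baseline pipelines named in the abstract, the sketch below (Python with NumPy and scikit-learn) contrasts descriptor concatenation (early fusion) with kernel averaging (late fusion). It is not the authors' code: the encodings X_hog and X_hof are random stand-ins for real per-video HOG/HOF-based vectors, and the split, class count, and kernel choice are illustrative assumptions.

import numpy as np
from sklearn.svm import SVC

# Simulated per-video global encodings (hypothetical shapes; a real system would
# use e.g. Fisher vectors computed from HOG and HOF local descriptors).
rng = np.random.default_rng(0)
n_videos = 200
y = rng.integers(0, 5, size=n_videos)            # 5 hypothetical action classes
X_hog = rng.standard_normal((n_videos, 128))     # stand-in for a HOG-based encoding
X_hof = rng.standard_normal((n_videos, 128))     # stand-in for an HOF-based encoding
train, test = np.arange(150), np.arange(150, 200)

# (a) Descriptor concatenation: stack the two encodings, train one linear SVM.
X_cat = np.hstack([X_hog, X_hof])
clf_cat = SVC(kernel="linear").fit(X_cat[train], y[train])
acc_cat = clf_cat.score(X_cat[test], y[test])

# (b) Kernel average: average the per-descriptor linear kernels, then train an
#     SVM on the precomputed, averaged kernel.
def lin(A, B):
    return A @ B.T

K_train = 0.5 * (lin(X_hog[train], X_hog[train]) + lin(X_hof[train], X_hof[train]))
K_test = 0.5 * (lin(X_hog[test], X_hog[train]) + lin(X_hof[test], X_hof[train]))
clf_avg = SVC(kernel="precomputed").fit(K_train, y[train])
acc_avg = clf_avg.score(K_test, y[test])

print(f"concatenation fusion accuracy: {acc_cat:.2f}")
print(f"kernel-average fusion accuracy: {acc_avg:.2f}")

On real data the abstract's point applies: concatenation tends to help when the two descriptor channels are strongly correlated, kernel averaging when they are largely complementary, and MVSV targets the intermediate regime by separating shared and private components with M-PCCA before averaging kernels.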

Information & Contributors

Published In

CVPR '14: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition
June 2014
4302 pages
ISBN: 9781479951185

Publisher

IEEE Computer Society

United States

Publication History

Published: 23 June 2014

Author Tags

  1. action recognition
  2. canonical correlation analysis
  3. mixture model
  4. multi-view

Qualifiers

  • Article


Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 0
  • Downloads (last 6 weeks): 0
Reflects downloads up to 24 Nov 2024

Citations

Cited By

  • (2023) Deep Unsupervised Key Frame Extraction for Efficient Video Classification. ACM Transactions on Multimedia Computing, Communications, and Applications 19(3), 1-17. https://doi.org/10.1145/3571735. Online publication date: 25-Feb-2023.
  • (2020) Multi-task Information Bottleneck Co-clustering for Unsupervised Cross-view Human Action Categorization. ACM Transactions on Knowledge Discovery from Data 14(2), 1-23. https://doi.org/10.1145/3375394. Online publication date: 9-Feb-2020.
  • (2019) Second-order Temporal Pooling for Action Recognition. International Journal of Computer Vision 127(4), 340-362. https://doi.org/10.1007/s11263-018-1111-5. Online publication date: 1-Apr-2019.
  • (2019) Multi-input 1-dimensional deep belief network. Multimedia Tools and Applications 78(13), 17739-17761. https://doi.org/10.1007/s11042-018-7076-0. Online publication date: 1-Jul-2019.
  • (2018) A Large-scale RGB-D Database for Arbitrary-view Human Action Recognition. Proceedings of the 26th ACM international conference on Multimedia, 1510-1518. https://doi.org/10.1145/3240508.3240675. Online publication date: 15-Oct-2018.
  • (2018) End-to-end temporal attention extraction and human action recognition. Machine Vision and Applications 29(7), 1127-1142. https://doi.org/10.1007/s00138-018-0956-5. Online publication date: 1-Oct-2018.
  • (2017) Temporal Pyramid Pooling-Based Convolutional Neural Network for Action Recognition. IEEE Transactions on Circuits and Systems for Video Technology 27(12), 2613-2622. https://doi.org/10.1109/TCSVT.2016.2576761. Online publication date: 1-Dec-2017.
  • (2017) Exploring hybrid spatio-temporal convolutional networks for human action recognition. Multimedia Tools and Applications 76(13), 15065-15081. https://doi.org/10.1007/s11042-017-4514-3. Online publication date: 1-Jul-2017.
  • (2017) Action recognition with spatio-temporal augmented descriptor and fusion method. Multimedia Tools and Applications 76(12), 13953-13969. https://doi.org/10.1007/s11042-016-3789-0. Online publication date: 1-Jun-2017.
  • (2016) Discriminative-Element-Aware Sparse Representation for Action Recognition. Proceedings of the International Conference on Internet Multimedia Computing and Service, 22-26. https://doi.org/10.1145/3007669.3007693. Online publication date: 19-Aug-2016.
