Article

Heterogeneous Visual Features Fusion via Sparse Multimodal Machine

Authors:

Hua Wang,

Feiping Nie,

Heng Huang,

Chris Ding

CVPR '13: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition

Pages 3097 - 3102

https://doi.org/10.1109/CVPR.2013.398

Published: 23 June 2013

Abstract

To better understand, search, and classify image and video information, many visual feature descriptors have been proposed to describe elementary visual characteristics, such as shape, color, and texture. How to integrate these heterogeneous visual features and identify the important ones for specific vision tasks has become an increasingly critical problem. In this paper, we propose a novel Sparse Multimodal Learning (SMML) approach that integrates such heterogeneous features by using joint structured sparsity regularizations to learn the importance of features for vision tasks from both the group-wise and the individual points of view. A new optimization algorithm is also introduced to solve the non-smooth objective, with rigorously proved global convergence. We applied our SMML method to five broadly used object categorization and scene understanding image data sets for both single-label and multi-label image classification tasks. For each data set, we integrate six different types of popularly used image features. Compared to existing scene and object categorization methods using either a single modality or multiple modalities of features, our approach always achieves better measured performance.
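The abstract does not spell out the form of the joint structured sparsity regularizers. As a minimal sketch, assuming the group-wise term is a group l1-norm taken over (modality, class) blocks of a weight matrix and the individual term is an l2,1-norm over its rows (both are common structured-sparsity penalties, chosen here purely for illustration), the combined penalty could be computed as below. The names `W`, `groups`, `gamma1`, and `gamma2` are hypothetical and do not come from the paper.

```python
import math


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (one row per feature).

    Driving a whole row to zero discards that individual feature.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def group_l1_norm(W, groups):
    """Sum of Euclidean norms over every (modality, class) block of W.

    groups is a list of (start, stop) row ranges, one per feature modality
    (an assumed layout). Driving a block to zero discards a whole modality
    for that class.
    """
    n_classes = len(W[0])
    total = 0.0
    for start, stop in groups:
        for c in range(n_classes):
            total += math.sqrt(sum(W[i][c] ** 2 for i in range(start, stop)))
    return total


def joint_penalty(W, groups, gamma1, gamma2):
    """Weighted sum of the group-wise and individual sparsity penalties."""
    return gamma1 * group_l1_norm(W, groups) + gamma2 * l21_norm(W)


# Toy weights: 4 features (rows) x 2 classes (columns),
# split into two modalities of 2 features each.
W = [
    [3.0, 0.0],
    [4.0, 0.0],
    [0.0, 0.0],
    [0.0, 1.0],
]
groups = [(0, 2), (2, 4)]

print(group_l1_norm(W, groups))            # group-wise penalty
print(l21_norm(W))                         # individual (row-wise) penalty
print(joint_penalty(W, groups, 1.0, 0.5))  # combined penalty
```

In a full method such a penalty would be added to a classification loss and minimized over `W`; the loss and the optimization algorithm are not described in the abstract and are left out of this sketch.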


Index Terms

Heterogeneous Visual Features Fusion via Sparse Multimodal Machine
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding



Information & Contributors

Information

Published In

CVPR '13: Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition

June 2013

3752 pages

ISBN:9780769549897

Publisher

IEEE Computer Society

United States


Bibliometrics & Citations

Bibliometrics

Total Citations: 11
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 13 Nov 2024

Citations

Cited By

Zeng S, Rao Y, Zhang B, Xu Y (2023). Joint Augmented and Compressed Dictionaries for Robust Image Classification. ACM Transactions on Multimedia Computing, Communications, and Applications 19:3s (1-24). DOI: 10.1145/3572910. Online publication date: 24-Feb-2023.
https://dl.acm.org/doi/10.1145/3572910
Wang N, Xue Y, Lin Q, Zhong P (2019). Structured sparse multi-view feature selection based on weighted hinge loss. Multimedia Tools and Applications 78:11 (15455-15481). DOI: 10.1007/s11042-018-6937-x. Online publication date: 1-Jun-2019.
https://dl.acm.org/doi/10.1007/s11042-018-6937-x
Li M, Leung H, Shum H, Neff M, Geraerts R, Shum H (2016). Human action recognition via skeletal and depth based feature fusion. Proceedings of the 9th International Conference on Motion in Games (123-132). DOI: 10.1145/2994258.2994268. Online publication date: 10-Oct-2016.
https://dl.acm.org/doi/10.1145/2994258.2994268
Shahroudy A, Ng T, Yang Q, Wang G (2016). Multimodal Multipart Learning for Action Recognition in Depth Videos. IEEE Transactions on Pattern Analysis and Machine Intelligence 38:10 (2123-2129). DOI: 10.1109/TPAMI.2015.2505295. Online publication date: 1-Oct-2016.
https://dl.acm.org/doi/10.1109/TPAMI.2015.2505295
Cong Y, Wang S, Fan B, Yang Y, Yu H (2016). UDSFS. Neurocomputing 196:C (150-158). DOI: 10.1016/j.neucom.2015.10.130. Online publication date: 5-Jul-2016.
https://dl.acm.org/doi/10.1016/j.neucom.2015.10.130
Deng C, Lv Z, Liu W, Huang J, Tao D, Gao X (2015). Multi-view matrix decomposition. Proceedings of the 24th International Conference on Artificial Intelligence (3438-3444). DOI: 10.5555/2832581.2832728. Online publication date: 25-Jul-2015.
https://dl.acm.org/doi/10.5555/2832581.2832728
Gao H, Yan L, Cai W, Huang H, Cao L, Zhang C, Joachims T, Webb G, Margineantu D, Williams G (2015). Anatomical Annotations for Drosophila Gene Expression Patterns via Multi-Dimensional Visual Descriptors Integration. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (339-348). DOI: 10.1145/2783258.2783384. Online publication date: 10-Aug-2015.
https://dl.acm.org/doi/10.1145/2783258.2783384
Gupta N, Das S, Dwivedi G, Pal S, Kundu M, Chaudhury S, Mitra S, Mazumdar D (2015). Cognitive Inspired WOR Framework to Reveal Image Semantics, for Efficient Content Based Image Retrieval. Proceedings of the 2nd International Conference on Perception and Machine Intelligence (201-210). DOI: 10.1145/2708463.2709034. Online publication date: 26-Feb-2015.
https://dl.acm.org/doi/10.1145/2708463.2709034
Gupta N, Das S, Chakraborti S, Ramakrishnan A, Malik J, Efros A, Jawahar C, Varma M (2014). Revealing What to Extract from Where, for Object-Centric Content Based Image Retrieval (CBIR). Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing (1-8). DOI: 10.1145/2683483.2683540. Online publication date: 14-Dec-2014.
https://dl.acm.org/doi/10.1145/2683483.2683540
Wang Y, Lin X, Wu L, Zhang W, Zhang Q, Hua K, Rui Y, Steinmetz R, Hanjalic A, Natsev A, Zhu W (2014). Exploiting Correlation Consensus. Proceedings of the 22nd ACM international conference on Multimedia (981-984). DOI: 10.1145/2647868.2654999. Online publication date: 3-Nov-2014.
https://dl.acm.org/doi/10.1145/2647868.2654999