research-article

Online Multimodal Co-indexing and Retrieval of Weakly Labeled Web Image Collections

Authors:

Chunyan Miao

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval

Pages 219 - 226

https://doi.org/10.1145/2671188.2749362

Published: 22 June 2015

Abstract

Weak supervisory information attached to web images, such as captions, tags, and descriptions, makes it possible to better understand images at the semantic level. In this paper, we propose a novel online multimodal co-indexing algorithm based on Adaptive Resonance Theory, named OMC-ART, for the automatic co-indexing and retrieval of images using their multimodal information. Compared with existing studies, OMC-ART has four distinct characteristics. First, OMC-ART is able to perform online learning of sequential data. Second, OMC-ART builds a two-layer indexing structure: the first layer co-indexes the images by the key visual and textual features, based on the generalized distributions of the clusters they belong to, and the second layer co-indexes the images by their own feature distributions. Third, OMC-ART enables flexible multimodal search using visual features, keywords, or a combination of both. Fourth, OMC-ART employs a ranking algorithm that does not need to traverse the whole indexing system when only a limited number of images need to be retrieved. Experiments on two published data sets demonstrate the efficiency and effectiveness of our proposed approach.
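
The two-layer structure and the early-stopping retrieval described in the abstract can be illustrated with a minimal sketch. This is not the OMC-ART algorithm itself: the names `similarity`, `Cluster`, `TwoLayerIndex`, and the `vigilance` threshold are hypothetical, features are assumed to be vectors with values in [0, 1], and the cluster prototype is updated with a simple running mean rather than an ART learning rule. The sketch only shows the general idea of ranking clusters first and scanning images cluster by cluster until enough results are collected.

```python
from dataclasses import dataclass, field


def similarity(query, target):
    """Overlap sum(min(q, t)) / sum(q) for vectors with values in [0, 1]."""
    norm = sum(query)
    if norm == 0:
        return 0.0
    return sum(min(q, t) for q, t in zip(query, target)) / norm


@dataclass
class Cluster:
    visual: list    # first layer: generalized visual prototype of the cluster
    textual: list   # first layer: generalized textual prototype of the cluster
    images: list = field(default_factory=list)  # second layer: per-image features


class TwoLayerIndex:
    def __init__(self, vigilance=0.5):
        self.vigilance = vigilance  # assumed threshold for joining a cluster
        self.clusters = []

    def _score(self, q_vis, q_txt, vis, txt):
        # Average the similarity over the modalities present in the query.
        parts = []
        if q_vis is not None:
            parts.append(similarity(q_vis, vis))
        if q_txt is not None:
            parts.append(similarity(q_txt, txt))
        return sum(parts) / len(parts)

    def add(self, image_id, vis, txt):
        """Index one image at a time (online), creating clusters as needed."""
        best, best_score = None, -1.0
        for c in self.clusters:
            s = self._score(vis, txt, c.visual, c.textual)
            if s > best_score:
                best, best_score = c, s
        if best is None or best_score < self.vigilance:
            best = Cluster(list(vis), list(txt))
            self.clusters.append(best)
        else:
            # Assumption: running-mean prototype update, not an ART rule.
            n = len(best.images)
            best.visual = [(p * n + x) / (n + 1) for p, x in zip(best.visual, vis)]
            best.textual = [(p * n + x) / (n + 1) for p, x in zip(best.textual, txt)]
        best.images.append((image_id, list(vis), list(txt)))

    def search(self, k, q_vis=None, q_txt=None):
        """Return up to k image ids; stops before visiting every cluster."""
        if q_vis is None and q_txt is None:
            raise ValueError("provide visual features, keywords, or both")
        ranked = sorted(
            self.clusters,
            key=lambda c: self._score(q_vis, q_txt, c.visual, c.textual),
            reverse=True,
        )
        results = []
        for c in ranked:
            scored = [(self._score(q_vis, q_txt, v, t), i) for i, v, t in c.images]
            scored.sort(key=lambda pair: pair[0], reverse=True)
            results.extend(i for _, i in scored)
            if len(results) >= k:
                break  # enough images collected; remaining clusters are skipped
        return results[:k]
```

In this sketch a query may carry only `q_vis`, only `q_txt`, or both, which mirrors the flexible search described in the abstract. Results are ordered cluster-first, so the ranking is approximate by design.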

Index Terms

Online Multimodal Co-indexing and Retrieval of Weakly Labeled Web Image Collections
1. Information systems
  1. Information retrieval
  2. Information systems applications
    1. Data mining
      1. Clustering

Published In

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval

June 2015

700 pages

ISBN: 9781450332743

DOI: 10.1145/2671188

General Chairs:
Alex Hauptmann (Carnegie Mellon University, USA), Chong-Wah Ngo (City University of Hong Kong, China), Xiangyang Xue (Fudan University, China)

Program Chairs:
Yu-Gang Jiang (Fudan University, China), Cees Snoek (University of Amsterdam and Qualcomm Research Netherlands), Nuno Vasconcelos (University of California, San Diego, USA)
Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Research Foundation-Prime Minister's office, Republic of Singapore

Conference

ICMR '15

Sponsor:

SIGMM

ICMR '15: International Conference on Multimedia Retrieval

June 23 - 26, 2015

Shanghai, China

Acceptance Rates

ICMR '15 Paper Acceptance Rate 48 of 127 submissions, 38%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Article Metrics

Total Citations: 12
Total Downloads: 139
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 31 Dec 2024
