DOI: 10.1145/2671188.2749362
research-article

Online Multimodal Co-indexing and Retrieval of Weakly Labeled Web Image Collections

Published: 22 June 2015

Abstract

Weak supervisory information attached to web images, such as captions, tags, and descriptions, makes it possible to better understand images at the semantic level. In this paper, we propose a novel online multimodal co-indexing algorithm based on Adaptive Resonance Theory, named OMC-ART, for the automatic co-indexing and retrieval of images using their multimodal information. Compared with existing studies, OMC-ART has several distinct characteristics. First, OMC-ART is able to perform online learning of sequential data. Second, OMC-ART builds a two-layer indexing structure: in the first layer, images are co-indexed by the key visual and textual features derived from the generalized distributions of the clusters they belong to; in the second layer, images are co-indexed by their own feature distributions. Third, OMC-ART enables flexible multimodal search using visual features, keywords, or a combination of both. Fourth, OMC-ART employs a ranking algorithm that does not need to traverse the whole indexing system when only a limited number of images need to be retrieved. Experiments on two published data sets demonstrate the efficiency and effectiveness of the proposed approach.
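
The abstract does not give OMC-ART's equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard fuzzy ART choice, match, and learning rules applied per channel (visual and textual), feature values already scaled to [0, 1], and no complement coding. The class name CoIndex, the parameter values, and the greedy cluster-first search are all hypothetical choices made for illustration; the paper's own ranking algorithm may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence

Vector = List[float]


def overlap(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 norm of the element-wise minimum (the fuzzy AND used in fuzzy ART)."""
    return sum(min(x, y) for x, y in zip(a, b))


@dataclass
class Cluster:
    # First layer: one prototype per channel, ordered [visual, textual].
    weights: List[Vector]
    # Second layer: each member image keeps its own feature vectors.
    members: Dict[str, List[Vector]] = field(default_factory=dict)


class CoIndex:
    """Hypothetical two-channel online index; all parameter values are assumptions."""

    def __init__(self, vigilance=(0.7, 0.5), gamma=(0.5, 0.5), alpha=0.01, beta=0.6):
        self.vigilance = vigilance  # per-channel match thresholds
        self.gamma = gamma          # per-channel contribution weights
        self.alpha = alpha          # choice parameter
        self.beta = beta            # learning rate
        self.clusters: List[Cluster] = []

    def add(self, image_id: str, x: List[Vector]) -> Cluster:
        """Process one image online: join the best matching cluster or open a new one."""
        def choice(c: Cluster) -> float:
            return sum(g * overlap(xk, wk) / (self.alpha + sum(wk))
                       for g, xk, wk in zip(self.gamma, x, c.weights))

        for c in sorted(self.clusters, key=choice, reverse=True):
            # Vigilance test: every channel must match well enough.
            if all(overlap(xk, wk) >= rho * sum(xk)
                   for rho, xk, wk in zip(self.vigilance, x, c.weights)):
                # Move each prototype toward the fuzzy AND of input and prototype.
                c.weights = [[self.beta * min(xi, wi) + (1 - self.beta) * wi
                              for xi, wi in zip(xk, wk)]
                             for xk, wk in zip(x, c.weights)]
                c.members[image_id] = x
                return c
        c = Cluster(weights=[list(xk) for xk in x], members={image_id: x})
        self.clusters.append(c)
        return c

    def _similarity(self, query: List[Optional[Vector]], vectors: List[Vector]) -> float:
        # Channels missing from the query (None or all zeros) are skipped.
        score = 0.0
        for g, qk, vk in zip(self.gamma, query, vectors):
            if qk is not None and sum(qk) > 0:
                score += g * overlap(qk, vk) / sum(qk)
        return score

    def search(self, query: List[Optional[Vector]], top_k: int) -> List[str]:
        """Rank clusters first, then members; stop once top_k images are collected."""
        results: List[str] = []
        ranked = sorted(self.clusters,
                        key=lambda c: self._similarity(query, c.weights), reverse=True)
        for c in ranked:
            ids = sorted(c.members,
                         key=lambda i: self._similarity(query, c.members[i]), reverse=True)
            results.extend(ids)
            if len(results) >= top_k:
                break  # lower-ranked clusters are never visited
        return results[:top_k]


index = CoIndex()
index.add("a", [[0.9, 0.1], [1.0, 0.0]])
index.add("b", [[0.8, 0.2], [1.0, 0.0]])
index.add("c", [[0.1, 0.9], [0.0, 1.0]])
print(index.search([None, [0.0, 1.0]], top_k=1))       # keyword-only query
print(index.search([[0.9, 0.1], None], top_k=2))       # visual-only query
```

In this toy run, images "a" and "b" fall into one cluster and "c" into another, and each query is answered from the best matching cluster alone. The early exit is a greedy approximation: an image in a lower-ranked cluster could in principle outscore one already returned, which is one reason this sketch should not be read as the paper's ranking algorithm.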

Published In

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval
June 2015
700 pages
ISBN: 9781450332743
DOI: 10.1145/2671188
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. hierarchical image co-indexing
  3. multimodal search
  4. online learning
  5. weakly supervised learning

Qualifiers

  • Research-article

Funding Sources

  • National Research Foundation-Prime Minister's office, Republic of Singapore

Conference

ICMR '15
Acceptance Rates

ICMR '15 Paper Acceptance Rate 48 of 127 submissions, 38%;
Overall Acceptance Rate 254 of 830 submissions, 31%

Cited By

  • (2024) MIWE: Multimodal Indexing of Web Entities Incorporating Semantic Artificial Intelligence. Data Science and Security, pages 485--494. DOI: 10.1007/978-981-97-0975-5_43. Online publication date: 31-May-2024
  • (2023) Multi-channel Attentive Weighting of Visual Frames for Multimodal Video Classification. 2023 International Joint Conference on Neural Networks (IJCNN), pages 1--8. DOI: 10.1109/IJCNN54540.2023.10192036. Online publication date: 18-Jun-2023
  • (2023) Cross-Training with Prototypical Distillation for improving the generalization of Federated Learning. 2023 IEEE International Conference on Multimedia and Expo (ICME), pages 648--653. DOI: 10.1109/ICME55011.2023.00117. Online publication date: Jul-2023
  • (2022) MMH-index: Enhancing Apache Lucene with High-Performance Multi-Modal Indexing and Searching. Proceedings of the 30th ACM International Conference on Multimedia, pages 7279--7289. DOI: 10.1145/3503161.3548768. Online publication date: 10-Oct-2022
  • (2020) A Preliminary Study of Fusion ARTs with Adaptively Information Intensity Attenuation Controlling. 2020 International Joint Conference on Neural Networks (IJCNN), pages 1--7. DOI: 10.1109/IJCNN48605.2020.9207553. Online publication date: Jul-2020
  • (2019) A survey of adaptive resonance theory neural network models for engineering applications. Neural Networks, 120:C, pages 167--203. DOI: 10.1016/j.neunet.2019.09.012. Online publication date: 1-Dec-2019
  • (2019) Online Multimodal Co-indexing and Retrieval of Social Media Data. Adaptive Resonance Theory in Social Media Data Clustering, pages 155--174. DOI: 10.1007/978-3-030-02985-2_7. Online publication date: 1-May-2019
  • (2019) Adaptive Resonance Theory (ART) for Social Media Analytics. Adaptive Resonance Theory in Social Media Data Clustering, pages 45--89. DOI: 10.1007/978-3-030-02985-2_3. Online publication date: 1-May-2019
  • (2017) Towards Age-friendly E-commerce Through Crowd-Improved Speech Recognition, Multimodal Search, and Personalized Speech Feedback. Proceedings of the 2nd International Conference on Crowd Science and Engineering, pages 127--135. DOI: 10.1145/3126973.3129306. Online publication date: 6-Jul-2017
  • (2017) GEO matching regions. Multimedia Tools and Applications, 76:14, pages 15377--15411. DOI: 10.1007/s11042-016-3834-z. Online publication date: 1-Jul-2017
