Multimodal Retrieval with Diversification and Relevance Feedback for Tourist Attraction Images

Published: 12 August 2017

Abstract

In this article, we present a novel framework that produces a visual description of a tourist attraction by selecting, from community-contributed datasets, the most diverse pictures that describe different details of the queried location. The main strength of the proposed approach is its flexibility, which lets us filter out non-relevant images and obtain a reliable set of diverse and relevant images: similar images are first clustered according to their textual descriptions and visual content, and images are then extracted from different clusters according to a measure of the user’s credibility. Clustering is a two-step process, in which textual descriptions are used first and the resulting clusters are then refined according to visual features. The degree of diversification can be further increased by exploiting users’ judgments of the results produced by the proposed algorithm through a novel approach in which users provide not only relevance feedback but also diversity feedback. Experiments on the MediaEval 2015 “Retrieving Diverse Social Images” dataset show that the proposed framework achieves very good performance both in the automatic retrieval of diverse images and when users’ feedback is exploited. The effectiveness of the proposed approach has also been confirmed by a small case study involving a number of real users.
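
The pipeline described in the abstract (cluster on text, refine on visual features, then pick from different clusters by user credibility) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Image` record, the Jaccard and Euclidean distances, the greedy threshold clustering, the threshold values, and the round-robin selection are all assumptions chosen to keep the example short and self-contained.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Image:
    image_id: str
    tags: frozenset      # textual description as a set of tags (assumed representation)
    visual: tuple        # visual feature vector (assumed representation)
    credibility: float   # credibility score of the uploading user (assumed scale)


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty tag sets count as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def greedy_cluster(items, distance, threshold):
    """Put each item in the first cluster whose seed (first member) is close enough."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if distance(cluster[0], item) <= threshold:
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def diversify(images, k, text_threshold=0.5, visual_threshold=1.0):
    # Step 1: cluster on textual descriptions.
    text_clusters = greedy_cluster(
        images, lambda a, b: jaccard_distance(a.tags, b.tags), text_threshold)
    # Step 2: refine each text cluster using visual features.
    refined = []
    for cluster in text_clusters:
        refined.extend(greedy_cluster(
            cluster, lambda a, b: dist(a.visual, b.visual), visual_threshold))
    # Step 3: rank by credibility, then take one image per cluster in turn.
    for cluster in refined:
        cluster.sort(key=lambda img: img.credibility, reverse=True)
    refined.sort(key=lambda c: c[0].credibility, reverse=True)
    selected, rank = [], 0
    while len(selected) < k and any(rank < len(c) for c in refined):
        for cluster in refined:
            if rank < len(cluster) and len(selected) < k:
                selected.append(cluster[rank])
        rank += 1
    return selected


# Hypothetical toy data: "b" is a near-duplicate of "a"; "c" shares the
# text of "a" but looks different; "d" has unrelated text.
a = Image("a", frozenset({"facade", "day"}), (0.0, 0.0), 0.9)
b = Image("b", frozenset({"facade", "day"}), (0.1, 0.0), 0.4)
c = Image("c", frozenset({"facade", "day"}), (5.0, 5.0), 0.6)
d = Image("d", frozenset({"interior", "altar"}), (0.0, 0.0), 0.7)

top3 = [img.image_id for img in diversify([a, b, c, d], k=3)]
print(top3)  # ['a', 'd', 'c']
```

In this toy run the near-duplicate "b" is held back until every other cluster has contributed an image, which is the diversification effect the abstract describes; the relevance and diversity feedback loop is not modeled here.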

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 13, Issue 4
November 2017
362 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3129737
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 August 2017
Accepted: 01 May 2017
Revised: 01 April 2017
Received: 01 October 2016
Published in TOMM Volume 13, Issue 4

Author Tags

  1. Diversification
  2. tourist attraction images retrieval

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Regional Administration of Sardinia, Italy
  • Advanced and secure sharing of multimedia data over social networks in the future Internet

Cited By

  • (2023) Design-time Reference Current Generation for Robust Spintronic-based Neuromorphic Architecture. ACM Journal on Emerging Technologies in Computing Systems 20:1 (1-20). DOI: 10.1145/3625556. Online publication date: 14-Nov-2023
  • (2023) Boosting Diversity in Visual Search with Pareto Non-Dominated Re-Ranking. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-23). DOI: 10.1145/3625296. Online publication date: 10-Nov-2023
  • (2023) Geometric and Learning-Based Mesh Denoising: A Comprehensive Survey. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-28). DOI: 10.1145/3625098. Online publication date: 10-Nov-2023
  • (2023) A Review on Methods and Applications in Multimodal Deep Learning. ACM Transactions on Multimedia Computing, Communications, and Applications 19:2s (1-41). DOI: 10.1145/3545572. Online publication date: 17-Feb-2023
  • (2022) Densely Enhanced Semantic Network for Conversation System in Social Media. ACM Transactions on Multimedia Computing, Communications, and Applications 18:4 (1-24). DOI: 10.1145/3501799. Online publication date: 4-Mar-2022
  • (2022) ExpertosLF: dynamic late fusion of CBIR systems using online learning with relevance feedback. Multimedia Tools and Applications 82:8 (11619-11661). DOI: 10.1007/s11042-022-13119-0. Online publication date: 20-Aug-2022
  • (2021) Impact of Interaction Strategies on User Relevance Feedback. Proceedings of the 2021 International Conference on Multimedia Retrieval (590-598). DOI: 10.1145/3460426.3463663. Online publication date: 24-Aug-2021
  • (2020) Multi-View Graph Matching for 3D Model Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 16:3 (1-20). DOI: 10.1145/3387920. Online publication date: 5-Jul-2020
  • (2020) Hypergraph-based image search reranking with elastic net regularized regression. Multimedia Tools and Applications. DOI: 10.1007/s11042-020-09418-z. Online publication date: 15-Aug-2020
  • (2020) Convolutional neural networks for relevance feedback in content based image retrieval. Multimedia Tools and Applications 79:37-38 (26995-27021). DOI: 10.1007/s11042-020-09292-9. Online publication date: 1-Oct-2020
