DOI: 10.1145/2623330.2623346
research-article

Mining text snippets for images on the web

Published: 24 August 2014

Abstract

Images are often used to convey many different concepts or illustrate many different stories. We propose an algorithm to mine multiple diverse, relevant, and interesting text snippets for images on the web. Our algorithm scales to all images on the web. For each image, all webpages that contain it are considered. The top-K text snippet selection problem is posed as combinatorial subset selection with the goal of choosing an optimal set of snippets that maximizes a combination of relevancy, interestingness, and diversity. The relevancy and interestingness are scored by machine-learned models. Our algorithm is run at scale on the entire image index of a major search engine, resulting in the construction of a database of images with their corresponding text snippets. We validate the quality of the database through a large-scale comparative study. We showcase the utility of the database through two web-scale applications: (a) augmentation of images on the web as webpages are browsed and (b) an image browsing experience (similar in spirit to web browsing) that is enabled by interconnecting semantically related images (which may not be visually related) through shared concepts in their corresponding text snippets.
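
The abstract does not state the exact objective or optimization procedure, so the following is only a minimal sketch of what top-K snippet selection under a combined relevancy, interestingness, and diversity score could look like. It uses a greedy heuristic in the style of maximal marginal relevance, which is an assumption and not necessarily the paper's method. The function names, the weights `alpha` and `lam`, the token-level Jaccard distance used for diversity, and the precomputed scores (standing in for the machine-learned models) are all hypothetical.

```python
def jaccard_distance(a: str, b: str) -> float:
    """Token-level Jaccard distance; a stand-in diversity measure (assumption)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


def select_top_k(candidates, k, alpha=0.5, lam=1.0):
    """Greedily pick up to k snippets.

    candidates: list of (text, relevance, interestingness) tuples, where the
    two scores are assumed to come from separately trained models.
    alpha weights relevance against interestingness; lam weights diversity.
    """
    selected = []                          # indices chosen so far
    remaining = list(range(len(candidates)))

    while remaining and len(selected) < k:

        def gain(i):
            text, rel, intr = candidates[i]
            quality = alpha * rel + (1 - alpha) * intr
            # Diversity: distance to the closest already-selected snippet
            # (1.0 when nothing has been selected yet).
            diversity = min(
                (jaccard_distance(text, candidates[j][0]) for j in selected),
                default=1.0,
            )
            return quality + lam * diversity

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)

    return [candidates[i][0] for i in selected]
```

With candidates such as two near-duplicate captions and one distinct fact about the same image, the diversity term causes the distinct snippet to be chosen second even when its quality score is slightly lower. A greedy pass like this is cheap per image, which matters when the same selection has to run over a very large image index.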

Supplementary Material

MP4 File (p1534-sidebyside.mp4)


Cited By

  • (2018) Extracting semantic knowledge from web context for multimedia IR. Multimedia Tools and Applications 77:11 (13853-13889). DOI: 10.1007/s11042-017-4997-y. Online publication date: 1-Jun-2018

Published In

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN: 9781450329569
DOI: 10.1145/2623330
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. browsing
  2. diversity
  3. interestingness
  4. relevance
  5. semantic image browsing
  6. text mining for images
  7. text snippets
  8. web image augmentation

Qualifiers

  • Research-article

Conference

KDD '14

Acceptance Rates

KDD '14 Paper Acceptance Rate: 151 of 1,036 submissions, 15%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%


