A Picture Is Worth a Thousand Tags: Automatic Web Based Image Tag Expansion

Andrew Gilbert & Richard Bowden

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7725)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We present an approach that automatically expands the annotation of images, using the internet as an additional information source. The novelty of the work lies in expanding image tags with new, previously unseen, complex linguistic labels that are collected without supervision from associated webpages. A web-based search seeded with a small subset of the existing image tags retrieves additional textual information. A textual bag-of-words model and a visual bag-of-words model are then combined and symbolised for data mining. Association rule mining identifies rules that relate words to visual content, and unseen images that fit these rules are re-tagged. The approach adds a large number of annotations to unseen images: on average 12.8 new tags per image, with an 87.2% true positive rate. Results are shown on two datasets, including a new 2,800-image annotation dataset of landmarks. The results include pictures of buildings tagged with the architect, the year of construction, and even events that have taken place there, which widens the impact of the tag annotations and their use in retrieval. The landmark dataset is made available along with its tags and the 1,970 webpages and additional images that form the information corpus. In addition, results on the widely used MIRFlickr25000 dataset are presented to compare the learning framework against previous work.
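The rule-mining step described in the abstract can be illustrated with a minimal sketch. It assumes each training image has already been reduced to a set of visual-word symbols and a set of text words; the symbol names (`v1`, `v2`, ...), the toy data, the thresholds, and the functions `mine_rules` and `tag_image` are all hypothetical. The sketch enumerates small antecedents exhaustively and keeps rules by support and confidence, which is a simplification for illustration and not necessarily the mining procedure used in the paper.

```python
from collections import Counter
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.8, max_len=2):
    """Return rules (visual itemset -> text word) that meet both thresholds.

    Each transaction is a pair (visual_symbols, text_words) of sets.
    Antecedents are limited to max_len visual symbols (hypothetical cap).
    """
    n = len(transactions)
    antecedent_counts = Counter()
    joint_counts = Counter()
    for visual, words in transactions:
        for size in range(1, max_len + 1):
            for antecedent in combinations(sorted(visual), size):
                antecedent_counts[antecedent] += 1
                for word in words:
                    joint_counts[(antecedent, word)] += 1

    rules = []
    for (antecedent, word), joint in joint_counts.items():
        support = joint / n                                   # P(antecedent and word)
        confidence = joint / antecedent_counts[antecedent]    # P(word | antecedent)
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(antecedent), word, support, confidence))
    return rules


def tag_image(visual_symbols, rules):
    """Collect the word of every rule whose antecedent occurs in the image."""
    return {word for antecedent, word, _, _ in rules if antecedent <= visual_symbols}


# Toy training data (made up): visual symbols paired with words from web text.
training = [
    ({"v1", "v2"}, {"cathedral", "architect"}),
    ({"v1", "v2", "v3"}, {"cathedral", "architect"}),
    ({"v1", "v4"}, {"cathedral"}),
    ({"v5"}, {"bridge"}),
]

rules = mine_rules(training)
print(sorted(tag_image({"v1", "v2", "v9"}, rules)))  # ['architect', 'cathedral']
print(sorted(tag_image({"v1"}, rules)))              # ['cathedral']
```

In this toy example the symbol `v1` alone supports "cathedral" but not "architect", because "architect" co-occurs with `v1` in only two of its three images. An unseen image therefore receives "architect" only when `v2` is also present.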

Author information

Authors and Affiliations

Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, UK
Andrew Gilbert & Richard Bowden

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, 151-744, Seoul, Korea
Kyoung Mu Lee
Microsoft Research Asia, No. 5, Danling st., Haidian district, 100080, Beijing, P.R. China
Yasuyuki Matsushita
School of Interactive Computing, Georgia Institute of Technology, 801 Atlantic Drive, CCB 315, 30332, Atlanta, GA, USA
James M. Rehg
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Zhong Quan Cun East Road 95, Haidian District, 100 190, Beijing, P.R. China
Zhanyi Hu

Cite this paper

Gilbert, A., Bowden, R. (2013). A Picture Is Worth a Thousand Tags: Automatic Web Based Image Tag Expansion. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37444-9_35

DOI: https://doi.org/10.1007/978-3-642-37444-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37443-2
Online ISBN: 978-3-642-37444-9
eBook Packages: Computer Science, Computer Science (R0)