DOI: 10.1145/2671188.2749330

Large Scale Image Annotation via Deep Representation Learning and Tag Embedding Learning

Published: 22 June 2015

Abstract

This paper addresses large-scale image annotation, whereas most existing methods are devised for small datasets. We propose a novel model based on deep representation learning and tag embedding learning. Specifically, the model learns a unified latent space for image visual features and tag embeddings simultaneously. A metric matrix is introduced to estimate the relevance scores between images and tags. The objective function models triplet relationships (irrelevant tag, image, relevant tag) under a maximum-margin criterion. The model readily accommodates new images and tags via online learning and has relatively low computational complexity at test time. Experimental results on the NUS-WIDE dataset demonstrate the effectiveness of the proposed model.
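
The abstract does not give the scoring function or the loss in closed form. As a rough illustration only, the sketch below assumes a bilinear relevance score s(x, t) = x^T M t between an image feature x and a tag embedding t, and a hinge loss over one (irrelevant tag, image, relevant tag) triplet. The function names, the margin of 1.0, the learning rate, and the choice to update only M are assumptions made for this example, not details taken from the paper.

```python
def bilinear_score(image_vec, metric, tag_vec):
    """Assumed relevance score s(x, t) = x^T M t."""
    # M t has one entry per image-feature dimension.
    projected = [sum(m * t for m, t in zip(row, tag_vec)) for row in metric]
    return sum(x * p for x, p in zip(image_vec, projected))


def triplet_hinge_loss(image_vec, metric, relevant_tag, irrelevant_tag, margin=1.0):
    """Hinge loss: the relevant tag should outscore the irrelevant one by `margin`."""
    s_pos = bilinear_score(image_vec, metric, relevant_tag)
    s_neg = bilinear_score(image_vec, metric, irrelevant_tag)
    return max(0.0, margin - s_pos + s_neg)


def sgd_step(image_vec, metric, relevant_tag, irrelevant_tag, margin=1.0, lr=0.5):
    """One online gradient step on M alone (hypothetical simplification)."""
    loss = triplet_hinge_loss(image_vec, metric, relevant_tag, irrelevant_tag, margin)
    if loss == 0.0:
        # Margin already satisfied: return an unchanged copy.
        return [row[:] for row in metric]
    # For an active triplet, d(loss)/dM[i][j] = x[i] * (t_neg[j] - t_pos[j]).
    return [
        [m - lr * x * (tn - tp) for m, tn, tp in zip(row, irrelevant_tag, relevant_tag)]
        for x, row in zip(image_vec, metric)
    ]


# Toy data: a 2-d image feature, two 2-d tag embeddings, identity metric.
x = [1.0, 0.0]
M = [[1.0, 0.0], [0.0, 1.0]]
t_pos = [0.0, 1.0]   # tag that should be ranked higher
t_neg = [1.0, 0.0]   # tag that should be ranked lower

loss_before = triplet_hinge_loss(x, M, t_pos, t_neg)
M_new = sgd_step(x, M, t_pos, t_neg)
loss_after = triplet_hinge_loss(x, M_new, t_pos, t_neg)
print(loss_before, loss_after)
```

Under these assumptions, scoring a new image against every tag costs one matrix-vector product followed by dot products with the tag embeddings, and a new triplet can be absorbed with a single update step, which is one way to read the abstract's remarks on test-time cost and online learning.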


Cited By

  • (2019) Inductive Zero-Shot Image Annotation via Embedding Graph. IEEE Access, 7, pages 107816--107830. DOI: 10.1109/ACCESS.2019.2925383
  • (2018) Image Annotation through Adaptive Dependency Fusion. 2018 IEEE International Conference on Progress in Informatics and Computing (PIC), pages 196--202. DOI: 10.1109/PIC.2018.8706284
  • (2017) Large scale automatic image annotation based on convolutional neural network. Journal of Visual Communication and Image Representation, 49, pages 213--224. DOI: 10.1016/j.jvcir.2017.07.004 (also listed as 10.5555/3163595.3163810)


      Published In

      ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval
      June 2015
      700 pages
      ISBN:9781450332743
      DOI:10.1145/2671188
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. deep representation learning
      2. large scale image annotation
      3. tag embedding learning

      Qualifiers

      • Short-paper

      Conference

      ICMR '15

      Acceptance Rates

      ICMR '15 Paper Acceptance Rate 48 of 127 submissions, 38%;
      Overall Acceptance Rate 254 of 830 submissions, 31%
