short-paper

Large Scale Image Annotation via Deep Representation Learning and Tag Embedding Learning

Authors:

Chunhong Pan

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval

Pages 523 - 526

https://doi.org/10.1145/2671188.2749330

Published: 22 June 2015

Abstract

This paper addresses large-scale image annotation, whereas most existing methods are designed for small datasets. We propose a novel model based on deep representation learning and tag embedding learning. Specifically, the model simultaneously learns a unified latent space for image visual features and tag embeddings. A metric matrix is introduced to estimate the relevance scores between images and tags. Finally, a maximum-margin objective function is proposed that models triplet relationships (irrelevant tag, image, relevant tag). The model readily handles new images and tags via online learning and has relatively low computational complexity at test time. Experimental results on the NUS-WIDE dataset demonstrate the effectiveness of the proposed model.
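The triplet objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: a bilinear score x^T M e between an image feature x and a tag embedding e, a hinge loss with a fixed margin, and a plain stochastic gradient step on the metric matrix M alone (the paper also learns the latent space and tag embeddings, which are held fixed here). The names `relevance`, `triplet_hinge`, and `sgd_step` are hypothetical.

```python
# Illustrative sketch only: the bilinear score and hinge loss are assumptions,
# not the paper's published formulation.

def relevance(x, M, e):
    """Bilinear relevance score x^T M e between image feature x and tag embedding e."""
    return sum(x[i] * M[i][j] * e[j] for i in range(len(x)) for j in range(len(e)))


def triplet_hinge(x, M, e_pos, e_neg, margin=1.0):
    """Max-margin loss: the relevant tag should outscore the irrelevant one by `margin`."""
    return max(0.0, margin - relevance(x, M, e_pos) + relevance(x, M, e_neg))


def sgd_step(x, M, e_pos, e_neg, margin=1.0, lr=0.1):
    """One online update of M on a single (irrelevant tag, image, relevant tag) triplet."""
    if triplet_hinge(x, M, e_pos, e_neg, margin) == 0.0:
        return M  # margin already satisfied, so no update is needed
    # When the hinge is active, its gradient with respect to M is x (e_neg - e_pos)^T.
    return [
        [M[i][j] - lr * x[i] * (e_neg[j] - e_pos[j]) for j in range(len(e_pos))]
        for i in range(len(x))
    ]


# Toy data (hypothetical values): a 2-D image feature and two 2-D tag embeddings.
x = [1.0, 0.0]
e_pos = [1.0, 0.0]  # embedding of the relevant tag
e_neg = [0.0, 1.0]  # embedding of the irrelevant tag
M = [[0.0, 0.0], [0.0, 0.0]]

loss_before = triplet_hinge(x, M, e_pos, e_neg)
M = sgd_step(x, M, e_pos, e_neg)
loss_after = triplet_hinge(x, M, e_pos, e_neg)
```

Because each update touches only one triplet, a scheme of this kind can absorb new images and tags incrementally, which is consistent with the online-learning property the abstract claims. At test time, scoring an image against a tag under this assumed form costs one matrix-vector product and one dot product.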

Cited By

F. Wang, J. Liu, S. Zhang, G. Zhang, Y. Li, and F. Yuan. Inductive Zero-Shot Image Annotation via Embedding Graph. IEEE Access, 7:107816--107830, 2019.
https://doi.org/10.1109/ACCESS.2019.2925383

F. Wang, J. Liu, S. Zhang, G. Zhang, Y. Zheng, and X. Li. Image Annotation through Adaptive Dependency Fusion. In 2018 IEEE International Conference on Progress in Informatics and Computing (PIC), pages 196--202, 2018.
https://doi.org/10.1109/PIC.2018.8706284

R. Wang, Y. Xie, J. Yang, L. Xue, M. Hu, and Q. Zhang. Large scale automatic image annotation based on convolutional neural network. Journal of Visual Communication and Image Representation, 49:213--224, 2017.
https://doi.org/10.1016/j.jvcir.2017.07.004
https://dl.acm.org/doi/10.5555/3163595.3163810

Index Terms

Large Scale Image Annotation via Deep Representation Learning and Tag Embedding Learning
1. Applied computing
  1. Document management and text processing
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Information & Contributors

Information

Published In

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval

June 2015

700 pages

ISBN:9781450332743

DOI:10.1145/2671188

General Chairs: Alex Hauptmann (Carnegie Mellon University, USA), Chong-Wah Ngo (City University of Hong Kong, China), Xiangyang Xue (Fudan University, China)

Program Chairs: Yu-Gang Jiang (Fudan University, China), Cees Snoek (University of Amsterdam and Qualcomm Research Netherlands), Nuno Vasconcelos (University of California, San Diego, USA)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

Beijing Natural Science Foundation
National Basic Research Program of China
National Natural Science Foundation of China

Conference

ICMR '15

Sponsor:

SIGMM

ICMR '15: International Conference on Multimedia Retrieval

June 23 - 26, 2015

Shanghai, China

Acceptance Rates

ICMR '15 Paper Acceptance Rate 48 of 127 submissions, 38%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Article Metrics

Total Citations: 4
Total Downloads: 249
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 24 Nov 2024