tutorial

Sparse Kernel Learning for Image Annotation

Authors:

Victor Lavrenko

ICMR '14: Proceedings of International Conference on Multimedia Retrieval

Pages 113 - 120

https://doi.org/10.1145/2578726.2578734

Published: 01 April 2014

Abstract

In this paper we introduce a sparse kernel learning framework for the Continuous Relevance Model (CRM). State-of-the-art image annotation models linearly combine evidence from several different feature types to improve image annotation accuracy. While previous authors have focused on learning the linear combination weights for these features, there has been no work examining the optimal combination of kernels. We address this gap by formulating a sparse kernel learning framework for the CRM, dubbed the SKL-CRM, that greedily selects an optimal combination of kernels. Our kernel learning framework rapidly converges to an annotation accuracy that substantially outperforms a host of state-of-the-art annotation models. We draw two surprising conclusions: firstly, if the kernels are chosen correctly, only a very small number of features are required to achieve superior performance over models that utilise a full suite of feature types; and secondly, the standard default selection of kernels commonly used in the literature is sub-optimal, and it is much better to adapt the kernel choice based on the feature type and image dataset.
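
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what greedy selection of feature and kernel pairs can look like in general. It is not the SKL-CRM algorithm itself. The feature names, the two kernels, the toy data, the leave-one-out score, and the rule that each feature receives at most one kernel are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (feature name, kernel name); both names are hypothetical


def gaussian(x: Sequence[float], y: Sequence[float], bw: float = 1.0) -> float:
    """Gaussian kernel on the squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * bw * bw))


def laplacian(x: Sequence[float], y: Sequence[float], bw: float = 1.0) -> float:
    """Laplacian kernel on the L1 distance."""
    d1 = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-d1 / bw)


KERNELS: Dict[str, Callable[..., float]] = {"gaussian": gaussian, "laplacian": laplacian}


def greedy_select(candidates: Sequence[Pair],
                  score: Callable[[List[Pair]], float],
                  tol: float = 1e-9) -> Tuple[List[Pair], float]:
    """Forward selection: add the pair that most improves `score`, stop when none does."""
    selected: List[Pair] = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        top_score, top = max(((score(selected + [c]), c) for c in remaining),
                             key=lambda t: t[0])
        if top_score <= best + tol:
            break  # no candidate improves the validation score
        selected.append(top)
        best = top_score
        # Assumption: one kernel per feature, so drop that feature's other kernels.
        remaining = [c for c in remaining if c[0] != top[0]]
    return selected, best


def make_toy_score(data):
    """Leave-one-out share of kernel similarity mass that falls on same-label items."""
    def score(pairs: List[Pair]) -> float:
        total = 0.0
        for i, (xi, yi) in enumerate(data):
            same = overall = 0.0
            for j, (xj, yj) in enumerate(data):
                if i == j:
                    continue
                sim = 1.0
                for feat, kern in pairs:  # product of per-feature kernels
                    sim *= KERNELS[kern](xi[feat], xj[feat])
                overall += sim
                if yi == yj:
                    same += sim
            total += same / overall
        return total / len(data)
    return score


# Toy data (invented): "colour" separates the two labels, "texture" does not.
TOY_DATA = [
    ({"colour": [0.0], "texture": [0.0]}, 0),
    ({"colour": [0.1], "texture": [1.0]}, 0),
    ({"colour": [1.0], "texture": [0.1]}, 1),
    ({"colour": [0.9], "texture": [0.9]}, 1),
]
CANDIDATES = [(f, k) for f in ("colour", "texture") for k in KERNELS]

if __name__ == "__main__":
    chosen, value = greedy_select(CANDIDATES, make_toy_score(TOY_DATA))
    print(chosen, round(value, 3))
```

On this toy data the procedure keeps a single feature and prefers one kernel over the other for it, which loosely mirrors the two conclusions stated in the abstract: few features can suffice, and the kernel choice matters per feature. A real system would replace the toy score with annotation accuracy measured on held-out images.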

Cited By

Oussama A, Khaldi B, Kherfi M (2022). A fast weighted multi-view Bayesian learning scheme with deep learning for text-based image retrieval from unlabeled galleries. Multimedia Tools and Applications 82:7 (10795-10812). DOI: 10.1007/s11042-022-13788-x. Online publication date: 17-Sep-2022
https://dl.acm.org/doi/10.1007/s11042-022-13788-x
Pham T, Nguyen H, Phan H, Do T, Nguyen T, Tran T, Le T (2022). Towards a large-scale person search by vietnamese natural language: dataset and methods. Multimedia Tools and Applications 81:19 (27569-27600). DOI: 10.1007/s11042-022-12138-1. Online publication date: 1-Aug-2022
https://dl.acm.org/doi/10.1007/s11042-022-12138-1
Bensaci R, Khaldi B, Aiadi O, Benchabana A (2021). Deep Convolutional Neural Network with KNN Regression for Automatic Image Annotation. Applied Sciences 11:21 (10176). DOI: 10.3390/app112110176. Online publication date: 29-Oct-2021
https://doi.org/10.3390/app112110176
Index Terms

Sparse Kernel Learning for Image Annotation
1. Information systems
  1. Information retrieval
    1. Document representation
  2. Information storage systems

Information & Contributors

Information

Published In

ICMR '14: Proceedings of International Conference on Multimedia Retrieval

April 2014

564 pages

ISBN:9781450327824

DOI:10.1145/2578726

Conference Chairs:
Mohan Kankanhalli (National University of Singapore)
Stefan Rueger (The Open University, UK)
R. Manmatha (A9.com, USA)

General Chairs:
Joemon Jose (University of Glasgow, UK)
Keith van Rijsbergen (University of Glasgow, UK)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

In-Cooperation

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Tutorial
Research
Refereed limited

Conference

ICMR '14

ICMR '14: International Conference on Multimedia Retrieval

April 1 - 4, 2014

Glasgow, United Kingdom

Acceptance Rates

ICMR '14 Paper Acceptance Rate 21 of 111 submissions, 19%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 26
Total Downloads: 281
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 08 Dec 2024

Citations

Cited By

Zhang W, Hu H, Hu H, Yu J (2020). Automatic image annotation via category labels. Multimedia Tools and Applications 79:17-18 (11421-11435). DOI: 10.1007/s11042-019-07929-y. Online publication date: 1-May-2020
https://dl.acm.org/doi/10.1007/s11042-019-07929-y
徐海 (2019). Image Auto-Annotation Using Semantic Link Network. Artificial Intelligence and Robotics Research 08:03 (158-165). DOI: 10.12677/AIRR.2019.83018. Online publication date: 2019
https://doi.org/10.12677/AIRR.2019.83018
Tang C, Liu X, Wang P, Zhang C, Li M, Wang L (2019). Adaptive Hypergraph Embedded Semi-Supervised Multi-Label Image Annotation. IEEE Transactions on Multimedia 21:11 (2837-2849). DOI: 10.1109/TMM.2019.2909860. Online publication date: Nov-2019
https://doi.org/10.1109/TMM.2019.2909860
Jin C, Sun Q, Jin S (2019). A hybrid automatic image annotation approach. Multimedia Tools and Applications 78:9 (11815-11834). DOI: 10.1007/s11042-018-6742-6. Online publication date: 1-May-2019
https://dl.acm.org/doi/10.1007/s11042-018-6742-6
Ji Q, Zhang L, Shu X, Tang J (2019). Image annotation refinement via 2P-KNN based group sparse reconstruction. Multimedia Tools and Applications 78:10 (13213-13225). DOI: 10.1007/s11042-018-5925-5. Online publication date: 1-May-2019
https://dl.acm.org/doi/10.1007/s11042-018-5925-5
Nemade S, Sonavane S (2019). Automatic Feature Extraction for CBIR and Image Annotation Applications. Computing in Engineering and Technology (557-566). DOI: 10.1007/978-981-32-9515-5_53. Online publication date: 17-Oct-2019
https://doi.org/10.1007/978-981-32-9515-5_53
Zhong F, Chen Z, Ning Z, Min G, Hu Y (2018). Heterogeneous visual features integration for image recognition optimization in internet of things. Journal of Computational Science 28 (466-475). DOI: 10.1016/j.jocs.2016.11.002. Online publication date: Sep-2018
https://doi.org/10.1016/j.jocs.2016.11.002