Dual cross-media relevance model for image annotation
Proceedings of the 15th ACM international conference on Multimedia, 2007•dl.acm.org
Image annotation has been an active research topic in recent years due to its potential impact on both image understanding and web image retrieval. Existing relevance-model-based methods perform image annotation by maximizing the joint probability of images and words, which is calculated as an expectation over training images. However, the semantic gap and the dependence on training data restrict their performance and scalability. In this paper, a dual cross-media relevance model (DCMRM) is proposed for automatic image annotation, which instead estimates the joint probability as an expectation over words in a pre-defined lexicon. DCMRM involves two kinds of critical relations in image annotation: the word-to-image relation and the word-to-word relation. Both relations can be estimated by using search techniques on web data as well as on available training data. Experiments conducted on the Corel dataset and a web image dataset demonstrate the effectiveness of the proposed model.
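The idea of taking the expectation over lexicon words can be illustrated with a minimal sketch. It assumes that, given a lexicon word v, the image and the candidate annotation w are conditionally independent, so that a candidate is scored by summing P(v) * P(image | v) * P(w | v) over the lexicon. The function name `annotate`, the three-word lexicon, and all probability values are hypothetical and chosen only for illustration. In the abstract the two relations are estimated with search techniques on web data and training data; here they are simply hard-coded.

```python
from typing import Dict, List


def annotate(
    image_given_word: Dict[str, float],
    word_given_word: Dict[str, Dict[str, float]],
    prior: Dict[str, float],
    top_k: int = 1,
) -> List[str]:
    """Rank candidate words w by sum over v of P(v) * P(image | v) * P(w | v).

    image_given_word[v]   : word-to-image relation, P(image | v)   (assumed given)
    word_given_word[v][w] : word-to-word relation,  P(w | v)       (assumed given)
    prior[v]              : prior over lexicon words, P(v)         (assumed given)
    """
    scores = {}
    for w in prior:
        # Expectation over every lexicon word v, not over training images.
        scores[w] = sum(
            prior[v] * image_given_word[v] * word_given_word[v][w]
            for v in prior
        )
    # Highest-scoring words first.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy lexicon with made-up values (not taken from the paper).
prior = {"tiger": 1 / 3, "grass": 1 / 3, "car": 1 / 3}
image_given_word = {"tiger": 0.6, "grass": 0.3, "car": 0.1}
word_given_word = {
    "tiger": {"tiger": 0.70, "grass": 0.25, "car": 0.05},
    "grass": {"tiger": 0.30, "grass": 0.60, "car": 0.10},
    "car": {"tiger": 0.05, "grass": 0.05, "car": 0.90},
}

top_words = annotate(image_given_word, word_given_word, prior, top_k=2)
print(top_words)
```

In this toy setup the word-to-word relation lets a related word such as "grass" gain score from an image that mainly matches "tiger", which is the kind of effect the two relations are meant to capture.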