DOI: 10.1007/978-3-540-89796-5_64
Article

SemanGist: A Local Semantic Image Representation

Published: 09 December 2008

Abstract

Although many kinds of image features have been proposed, no single feature is optimal enough to replace all the others in multimedia analysis applications such as image annotation. In this paper, we propose a novel image representation, Semantic Gist (SemanGist), that combines the merits of multiple features automatically. Given a local image patch, SemanGist converts multiple low-level features of the patch into compact prediction scores for a few predefined semantic categories, using a discriminative multi-label boosting algorithm. Because the SemanGist output is local, it allows semantic spatial context among adjacent patches to be incorporated. For applications like image annotation, this may further reduce annotation errors by taking label compatibility into account. To enforce label compatibility, the same boosting algorithm is applied to the SemanGist representation together with the low-level features. Experiments on an image annotation task show that SemanGist not only yields a compact representation but also incorporates spatial context at low run-time computational cost.
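
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the decision stumps stand in for whatever weak learners the multi-label boosting algorithm would select, and the names `Stump`, `semangist`, and `context_features`, the 4-neighbour layout, and the zero padding at image borders are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical weak learner: (feature_index, threshold, weight).
# A trained boosting model would supply these; here they are given by hand.
Stump = Tuple[int, float, float]


def stump_score(features: Sequence[float], stumps: Sequence[Stump]) -> float:
    """Additive score: each stump votes +weight or -weight."""
    return sum(w if features[i] > t else -w for i, t, w in stumps)


def semangist(low_level: Sequence[Sequence[float]],
              models: Sequence[Sequence[Stump]]) -> List[float]:
    """Concatenate the low-level feature vectors of one patch and map them
    to one prediction score per semantic category."""
    x = [v for feat in low_level for v in feat]
    return [stump_score(x, m) for m in models]


def context_features(grid: List[List[List[float]]], r: int, c: int) -> List[float]:
    """Stack the scores of the patch at (r, c) with those of its four
    neighbours (up, down, left, right); neighbours outside the image
    contribute zeros (an assumed padding choice)."""
    k = len(grid[r][c])
    out = list(grid[r][c])
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < len(grid) and 0 <= cc < len(grid[0])
        out.extend(grid[rr][cc] if inside else [0.0] * k)
    return out


# Two hypothetical categories, each scored by a small set of stumps.
models = [[(0, 0.5, 1.0), (2, 0.2, 0.5)], [(1, 0.3, 2.0)]]
# One patch described by two low-level features (e.g. colour and texture).
scores = semangist([[0.9, 0.1], [0.4]], models)
# A 1x2 grid of per-patch scores; context vector for the left patch.
context = context_features([[scores, [0.5, 0.5]]], 0, 0)
```

In a second stage of the kind the abstract describes, a vector like `context` would be concatenated with the patch's own low-level features and passed to the same kind of boosted classifier.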

Published In

PCM '08: Proceedings of the 9th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
December 2008
939 pages
ISBN: 9783540897958
Editors: Yueh-Min Ray Huang, Changsheng Xu, Kuo-Sheng Cheng, Jar-Ferr Kevin Yang, M. N. Swamy, Shipeng Li, Jen-Wen Ding

Publisher

Springer-Verlag

Berlin, Heidelberg
