Abstract
This paper presents an approach to high-level feature detection in video documents using a region thesaurus. Each video shot is represented by a single keyframe, and MPEG-7 features are extracted locally from coarsely segmented regions. A clustering algorithm is then applied to the extracted regions, and a region thesaurus is constructed to describe each keyframe at a level higher than the low-level descriptors but lower than the high-level concepts. A model vector representation is formed, and several high-level concept detectors are trained using a global keyframe annotation. The proposed approach is evaluated on the TRECVID 2007 development data for the detection of nine high-level concepts, demonstrating sufficient performance on large data sets.
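The pipeline summarized above (cluster region descriptors, keep the cluster centers as a region thesaurus, then describe each keyframe by a model vector) can be illustrated with a minimal sketch. Several choices here are assumptions and are not taken from the abstract: plain k-means stands in for the unspecified clustering algorithm, random vectors stand in for MPEG-7 region descriptors, and each model vector component is taken to be the smallest Euclidean distance between a region type and any region of the keyframe. The function names `build_thesaurus` and `model_vector` are hypothetical.

```python
import numpy as np


def build_thesaurus(region_features, k, iters=20, seed=0):
    """Cluster region descriptors with plain k-means (an assumed stand-in
    for the paper's clustering step); centroids act as region types."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct regions.
    start = rng.choice(len(region_features), size=k, replace=False)
    centroids = region_features[start].copy()
    for _ in range(iters):
        # Distance of every region to every centroid: shape (n_regions, k).
        dists = np.linalg.norm(
            region_features[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = region_features[labels == j]
            if len(members):  # leave a centroid unchanged if its cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def model_vector(keyframe_regions, thesaurus):
    """One value per region type: the smallest distance between that type
    and any region of the keyframe (an assumed definition)."""
    dists = np.linalg.norm(
        keyframe_regions[:, None, :] - thesaurus[None, :, :], axis=2
    )
    return dists.min(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic stand-in for descriptors of 200 regions, 16 dimensions each.
    all_regions = rng.normal(size=(200, 16))
    thesaurus = build_thesaurus(all_regions, k=10)
    # A keyframe segmented into 5 regions yields a 10-dimensional model vector.
    print(model_vector(all_regions[:5], thesaurus).shape)
```

Under these assumptions, the model vectors of annotated keyframes would serve as fixed-length inputs for training one detector per high-level concept; the choice of classifier is left open here because the abstract does not name one.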
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Spyrou, E., Tolias, G., Avrithis, Y. (2009). Large Scale Concept Detection in Video Using a Region Thesaurus. In: Huet, B., Smeaton, A., Mayer-Patel, K., Avrithis, Y. (eds) Advances in Multimedia Modeling. MMM 2009. Lecture Notes in Computer Science, vol 5371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92892-8_20
DOI: https://doi.org/10.1007/978-3-540-92892-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92891-1
Online ISBN: 978-3-540-92892-8
eBook Packages: Computer Science (R0)