Abstract
In this paper, we present an analysis of an automatic semantic video content indexing and retrieval system based on the fusion of various low-level visual descriptors. Global MPEG-7 features extracted from video shots are described via an IVSM (Image Vector Space Model) signature in order to obtain a compact description of the content. Both static and dynamic feature fusion are introduced to obtain effective signatures. Support Vector Machines (SVMs) are employed for classification, with one classifier per feature; the task of each classifier is to detect the semantic content of the video. The classifier outputs are then fused using a neural network based on evidence theory (NNET) in order to provide a decision on the content of each shot. The experiments are conducted in the framework of the TRECVid feature extraction task.
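To make the evidence-theoretic fusion step more concrete, the following minimal sketch shows Dempster's rule of combination applied to the outputs of two per-feature classifiers. It is not the NNET described above; it only illustrates the combination rule from Dempster-Shafer theory on which such a fusion can build. The function name `dempster_combine`, the two-hypothesis frame (concept present or absent), and the mass values are assumptions chosen for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to its belief mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of both sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalise by 1 - K so that the fused masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment for one semantic concept in one shot.
PRESENT = frozenset({"present"})
ABSENT = frozenset({"absent"})
UNKNOWN = PRESENT | ABSENT  # mass assigned to ignorance

# Illustrative (made-up) masses derived from two per-feature classifiers.
m_color = {PRESENT: 0.6, ABSENT: 0.1, UNKNOWN: 0.3}
m_texture = {PRESENT: 0.5, ABSENT: 0.2, UNKNOWN: 0.3}

fused = dempster_combine(m_color, m_texture)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

In this sketch, two sources that both lean towards "present" yield a fused mass for "present" that is higher than either input, while the mass left on ignorance shrinks. A trained fusion network would learn how much to trust each classifier rather than use fixed masses.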
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Benmokhtar, R., Huet, B. (2008). Multi-level Fusion for Semantic Video Content Indexing and Retrieval. In: Boujemaa, N., Detyniecki, M., Nürnberger, A. (eds) Adaptive Multimedia Retrieval: Retrieval, User, and Semantics. AMR 2007. Lecture Notes in Computer Science, vol 4918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79860-6_13
DOI: https://doi.org/10.1007/978-3-540-79860-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79859-0
Online ISBN: 978-3-540-79860-6
eBook Packages: Computer Science, Computer Science (R0)