Abstract
This paper presents an efficient learning scheme for the automatic annotation of video shot size. Unlike existing methods, which are applied to sports videos and rely on domain knowledge, we aim at a general approach that handles more video genres by using a more general set of low- and mid-level features. A Support Vector Machine (SVM) is adopted for the classification task, and an efficient co-training scheme exploits the information embedded in unlabeled data using two complementary feature sets. Moreover, subjectivity-consistent costs for the different misclassifications are introduced, and final decisions are made by a cost-minimization criterion. Experimental results indicate the effectiveness and efficiency of the proposed scheme for shot size annotation.
This work was performed when the first author was visiting Microsoft Research Asia.
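The cost-minimization criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the cost values, and the function names below are assumptions chosen for illustration, and the posteriors are assumed to come from some calibrated classifier (for example, probabilistic SVM outputs). The sketch picks the label with the lowest expected misclassification cost rather than the label with the highest posterior.

```python
# Hypothetical shot-size classes; the abstract does not list them.
CLASSES = ["close-up", "medium", "long"]

# Illustrative cost matrix (assumed values, not from the paper):
# COST[i][j] is the cost of predicting class j when the true class is i.
# Confusing close-up with long is assumed to be worse than confusing
# either one with medium.
COST = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]


def expected_costs(posteriors, cost=COST):
    """Return the expected cost of predicting each class.

    posteriors[i] is the estimated probability that the true class is i.
    """
    n = len(posteriors)
    return [
        sum(posteriors[i] * cost[i][j] for i in range(n))
        for j in range(n)
    ]


def min_cost_decision(posteriors, cost=COST):
    """Return the index of the class with the lowest expected cost."""
    costs = expected_costs(posteriors, cost)
    return min(range(len(costs)), key=costs.__getitem__)


if __name__ == "__main__":
    # Close-up has the highest posterior, but predicting it risks the
    # expensive close-up/long confusion.
    posteriors = [0.45, 0.15, 0.40]
    print(CLASSES[min_cost_decision(posteriors)])
```

With these assumed costs, a shot whose posterior mass is split between the two extreme classes is labeled with the middle class, because that choice bounds the worst-case error. A plain maximum-posterior rule would not do this.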
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, M., Hua, XS., Song, Y., Lai, W., Dai, LR., Wang, RH. (2006). An Efficient Automatic Video Shot Size Annotation Scheme. In: Cham, TJ., Cai, J., Dorai, C., Rajan, D., Chua, TS., Chia, LT. (eds) Advances in Multimedia Modeling. MMM 2007. Lecture Notes in Computer Science, vol 4351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69423-6_63
DOI: https://doi.org/10.1007/978-3-540-69423-6_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69421-2
Online ISBN: 978-3-540-69423-6
eBook Packages: Computer Science, Computer Science (R0)