Detecting Semantic Concepts from Video Using Temporal Gradients and Audio Classification

  • Conference paper
Image and Video Retrieval (CIVR 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2728)

Abstract

In this paper we describe new methods to detect semantic concepts from digital video based on audible and visual content. The Temporal Gradient Correlogram captures temporal correlations of gradient edge directions from sampled shot frames. Power-related physical features are extracted from short audio samples in video shots. Video shots containing people, cityscape, landscape, speech, or instrumental sound are detected using trained self-organizing maps and kNN classification of the audio samples. Test runs and evaluations in the TREC 2002 Video Track show consistent performance for the Temporal Gradient Correlogram and state-of-the-art precision in audio-based instrumental sound detection.
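The abstract does not give the exact definition of the Temporal Gradient Correlogram, so the following is only a rough sketch of the general idea, not the authors' method. It assumes Prewitt kernels for the gradient, four quantized edge orientations, and an autocorrelogram over axis-aligned pixel offsets that is pooled over the sampled frames of a shot. The function names, the `min_magnitude` threshold, and the choice of offsets are all hypothetical.

```python
import math

# Prewitt kernels (assumed here; the abstract does not name the edge operator).
PREWITT_X = ((-1, 0, 1), (-1, 0, 1), (-1, 0, 1))
PREWITT_Y = ((-1, -1, -1), (0, 0, 0), (1, 1, 1))


def edge_direction_map(frame, n_bins=4, min_magnitude=1.0):
    """Quantize the gradient orientation of each interior pixel into n_bins.

    `frame` is a 2-D list of grey levels. Border pixels and pixels whose
    gradient magnitude is below `min_magnitude` are marked None.
    """
    h, w = len(frame), len(frame[0])
    dirs = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for j in range(3):
                for i in range(3):
                    p = frame[y + j - 1][x + i - 1]
                    gx += PREWITT_X[j][i] * p
                    gy += PREWITT_Y[j][i] * p
            if math.hypot(gx, gy) < min_magnitude:
                continue
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            dirs[y][x] = int(angle / math.pi * n_bins) % n_bins
    return dirs


def temporal_gradient_autocorrelogram(frames, distances=(1, 3), n_bins=4):
    """Estimate, for each (bin, distance), the fraction of in-bounds pixels at
    that axis-aligned distance sharing the same orientation bin, pooled over
    all sampled frames of the shot.
    """
    same = {(b, d): 0 for b in range(n_bins) for d in distances}
    total = {(b, d): 0 for b in range(n_bins) for d in distances}
    for frame in frames:
        dirs = edge_direction_map(frame, n_bins)
        h, w = len(dirs), len(dirs[0])
        for y in range(h):
            for x in range(w):
                b = dirs[y][x]
                if b is None:
                    continue
                for d in distances:
                    for dy, dx in ((0, d), (d, 0), (0, -d), (-d, 0)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            total[(b, d)] += 1
                            same[(b, d)] += dirs[ny][nx] == b
    return {k: (same[k] / total[k] if total[k] else 0.0) for k in same}
```

The resulting dictionary can be flattened into a fixed-length feature vector per shot, which is the kind of input a self-organizing map or a kNN classifier would consume. Pooling the counts over frames is one simple way to make the descriptor depend on how edge structure persists through the shot; other temporal formulations are possible.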

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rautiainen, M., Seppänen, T., Penttilä, J., Peltola, J. (2003). Detecting Semantic Concepts from Video Using Temporal Gradients and Audio Classification. In: Bakker, E.M., Lew, M.S., Huang, T.S., Sebe, N., Zhou, X.S. (eds) Image and Video Retrieval. CIVR 2003. Lecture Notes in Computer Science, vol 2728. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45113-7_26

  • DOI: https://doi.org/10.1007/3-540-45113-7_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40634-1

  • Online ISBN: 978-3-540-45113-6

  • eBook Packages: Springer Book Archive
