DOI: 10.1145/500141.500202

Supporting audiovisual query using dynamic programming

Published: 01 October 2001

Abstract

Query by example is a necessary capability for content-based retrieval, yet most video retrieval systems support queries using image sequences only. We present an algorithm for matching multimodal (audio-visual) patterns for content-based video retrieval. Our approach differs from the state of the art in content-based retrieval in two ways: it uses the information content of multiple media, and it places a strong emphasis on temporal similarity. At the core of the pattern-matching scheme is a dynamic programming algorithm, which leads to a significant improvement in performance. Because it couples audio with video, the algorithm can also be applied to grouping shots by audio-visual similarity. We also support relevance feedback: the user provides feedback by choosing clips that are closer to the desired target, and the system then automatically adjusts the relative weights (relevance) of the media and fetches different sets of target clips accordingly. We observe that a few iterations of such feedback are generally sufficient to retrieve the desired video clips.
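The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a standard dynamic time warping (DTW) recurrence as the dynamic programming step, per-frame feature vectors for each medium, and a weighted sum of audio and video distances. The function names (`frame_distance`, `dtw_cost`, `reweight`), the Euclidean distance, and the inverse-cost reweighting rule are all hypothetical choices made for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(query, target, w_audio=0.5, w_video=0.5):
    """DTW-style alignment cost between two clips (assumed recurrence).

    Each clip is a list of (audio_features, video_features) pairs,
    one pair per frame. Lower cost means more similar clips.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of query[:i] with target[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        q_audio, q_video = query[i - 1]
        for j in range(1, m + 1):
            t_audio, t_video = target[j - 1]
            # Weighted combination of the per-medium frame distances.
            d = (w_audio * frame_distance(q_audio, t_audio)
                 + w_video * frame_distance(q_video, t_video))
            # Allow a match, or a stretch of either sequence in time.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def reweight(query, chosen, w_audio, w_video, eps=1e-9):
    """Hypothetical feedback rule: shift weight toward the medium in
    which the user-chosen clip is closer to the query."""
    c_audio = dtw_cost(query, chosen, 1.0, 0.0)
    c_video = dtw_cost(query, chosen, 0.0, 1.0)
    s_audio = w_audio / (c_audio + eps)
    s_video = w_video / (c_video + eps)
    total = s_audio + s_video
    return s_audio / total, s_video / total


# A query and a time-stretched copy of it (first frame repeated).
query = [([0.0], [0.0]), ([1.0], [1.0]), ([2.0], [2.0])]
stretched = [([0.0], [0.0]), ([0.0], [0.0]), ([1.0], [1.0]), ([2.0], [2.0])]
print(dtw_cost(query, stretched))

# A clip the user picked: same audio as the query, different video.
chosen = [([0.0], [5.0]), ([1.0], [6.0]), ([2.0], [7.0])]
print(reweight(query, chosen, 0.5, 0.5))
```

In this sketch the warping lets a clip match a slower or faster version of itself, which is one way to read the abstract's emphasis on temporal similarity. The reweighting step only illustrates how feedback on a chosen clip could change the relative weights of the media between retrieval rounds.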

Published In

MULTIMEDIA '01: Proceedings of the ninth ACM international conference on Multimedia
October 2001
664 pages
ISBN: 1581133944
DOI: 10.1145/500141
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic programming
  2. nonlinear warping
  3. relevance feedback
  4. video retrieval

Qualifiers

  • Article

Conference

MM01: ACM Multimedia 2001
September 30 - October 5, 2001
Ottawa, Canada

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%

Cited By

  • (2016) Efficient video copy detection using multi-modality and dynamic path search. Multimedia Systems 22:1 (29-39). DOI: 10.1007/s00530-014-0387-8. Online publication date: 1-Feb-2016
  • (2009) Multimodal video copy detection applied to social media. Proceedings of the first SIGMM workshop on Social media (57-64). DOI: 10.1145/1631144.1631157. Online publication date: 23-Oct-2009
  • (2007) Classification of video events using 4-dimensional time-compressed motion features. Proceedings of the 6th ACM international conference on Image and video retrieval (178-185). DOI: 10.1145/1282280.1282311. Online publication date: 9-Jul-2007
  • (2004) On supervision and statistical learning for semantic multimedia analysis. Journal of Visual Communication and Image Representation 15:3 (348-369). DOI: 10.1016/j.jvcir.2004.04.010. Online publication date: 1-Sep-2004
