DOI: 10.1145/2911451.2914765

Which Information Sources are More Effective and Reliable in Video Search

Published: 07 July 2016

Abstract

Users are often interested in finding video segments that provide further information about the content of a segment they are watching. To help users find and browse such related content, video hyperlinking aims to construct links between video segments carrying relevant information within a large video collection. In this study, we explore how various video features affect hyperlinking performance, including subtitles, metadata, content features (i.e., audio and visual), surrounding context, as well as combinations of these features. In addition, we test different search strategies over different types of queries, categorized according to their video content. Comprehensive experiments were conducted on the dataset of the TRECVID 2015 video hyperlinking task. Results show that (1) text features play a crucial role in search performance, and combining audio and visual features does not yield improvements; (2) taking surrounding context into account does not produce better results; and (3) due to the lack of training examples, machine learning techniques do not improve performance.
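To make the feature-combination idea concrete, below is a minimal late-fusion ranking sketch of the kind such experiments typically use: each information source (subtitle, metadata, audio/visual) scores candidate target segments independently, and a weighted sum of normalized scores produces the final ranking. This is an illustrative assumption, not the authors' implementation; all segment names, scores, and weights are made up for the example.

    # Illustrative late-fusion ranking sketch (assumed setup, not the paper's code).
    from typing import Dict, List, Tuple

    def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
        """Scale one source's scores to [0, 1] so different sources are comparable."""
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        if hi == lo:
            return {seg: 0.0 for seg in scores}
        return {seg: (s - lo) / (hi - lo) for seg, s in scores.items()}

    def fuse_rankings(per_source_scores: Dict[str, Dict[str, float]],
                      weights: Dict[str, float]) -> List[Tuple[str, float]]:
        """Weighted late fusion: sum normalized per-source scores for each segment."""
        fused: Dict[str, float] = {}
        for source, scores in per_source_scores.items():
            w = weights.get(source, 0.0)
            for seg, s in min_max_normalize(scores).items():
                fused[seg] = fused.get(seg, 0.0) + w * s
        return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

    if __name__ == "__main__":
        # Hypothetical retrieval scores for three candidate target segments.
        per_source = {
            "subtitle": {"seg_A": 12.3, "seg_B": 8.1, "seg_C": 3.4},
            "metadata": {"seg_A": 2.0, "seg_B": 5.5, "seg_C": 1.1},
            "visual":   {"seg_A": 0.42, "seg_B": 0.38, "seg_C": 0.61},
        }
        # Heavier weight on the text sources, echoing the finding that text
        # features dominate; the exact values here are made up for the example.
        weights = {"subtitle": 0.6, "metadata": 0.3, "visual": 0.1}
        for seg, score in fuse_rankings(per_source, weights):
            print(f"{seg}: {score:.3f}")

Replacing the fixed weights with learned ones would be the natural next step, but, as noted in the abstract, the paper found that machine learning brought no improvement here because training examples were scarce.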

Published In

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN:9781450340694
DOI:10.1145/2911451

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 July 2016

Author Tags

  1. video hyperlinking
  2. video search

Qualifiers

  • Short-paper

Conference

SIGIR '16

Acceptance Rates

SIGIR '16 Paper Acceptance Rate: 62 of 341 submissions, 18%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%


Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 5
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 14 Dec 2024

Citations

Cited By

  • (2024) Video Moment Retrieval With Noisy Labels. IEEE Transactions on Neural Networks and Learning Systems, 35(5), 6779-6791. DOI: 10.1109/TNNLS.2022.3212900. Online publication date: May 2024.
  • (2024) Reference-Aware Adaptive Network for Image-Text Matching. IEEE Transactions on Circuits and Systems for Video Technology, 34(10), 9678-9691. DOI: 10.1109/TCSVT.2024.3392619. Online publication date: Oct 2024.
  • (2024) Improving Image-Text Matching With Bidirectional Consistency of Cross-Modal Alignment. IEEE Transactions on Circuits and Systems for Video Technology, 34(7), 6590-6607. DOI: 10.1109/TCSVT.2024.3369656. Online publication date: Jul 2024.
  • (2023) Focus and Align: Learning Tube Tokens for Video-Language Pre-Training. IEEE Transactions on Multimedia, 25, 8036-8050. DOI: 10.1109/TMM.2022.3231108. Online publication date: 1 Jan 2023.
  • (2023) Debiased Video-Text Retrieval via Soft Positive Sample Calibration. IEEE Transactions on Circuits and Systems for Video Technology, 33(9), 5257-5270. DOI: 10.1109/TCSVT.2023.3248873. Online publication date: 1 Sep 2023.
  • (2021) Video representation and suspicious event detection using semantic technologies. Semantic Web, 12(3), 467-491. DOI: 10.3233/SW-200393. Online publication date: 1 Jan 2021.
  • (2021) Multimodal Activation: Awakening Dialog Robots without Wake Words. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 491-500. DOI: 10.1145/3404835.3462964. Online publication date: 11 Jul 2021.
  • (2020) Wordy: Interactive Word Cloud to Summarize and Browse Online Videos to Enhance eLearning. 2020 IEEE/SICE International Symposium on System Integration (SII), 879-884. DOI: 10.1109/SII46433.2020.9026306. Online publication date: Jan 2020.
  • (2019) Video-Based Cross-Modal Recipe Retrieval. Proceedings of the 27th ACM International Conference on Multimedia, 1685-1693. DOI: 10.1145/3343031.3351067. Online publication date: 15 Oct 2019.
  • (2019) Cross-Modal Video Moment Retrieval with Spatial and Language-Temporal Attention. Proceedings of the 2019 International Conference on Multimedia Retrieval, 217-225. DOI: 10.1145/3323873.3325019. Online publication date: 5 Jun 2019.
