arXiv:2307.01945 (cs)

[Submitted on 4 Jul 2023]

Title: Query-based Video Summarization with Pseudo Label Supervision

Authors: Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring

Abstract: Manually labelled datasets for query-based video summarization are costly to create and therefore small, which limits the performance of supervised deep video summarization models. Self-supervision can address this data sparsity by using a pretext task and defining a method to acquire extra data with pseudo labels for pre-training a supervised deep model. In this work, we introduce segment-level pseudo labels derived from input videos to model both the relationship between the pretext task and the target task and the implicit relationship between the pseudo labels and the human-defined labels. The pseudo labels are generated from existing human-defined frame-level labels. To create more accurate query-dependent video summaries, we propose a semantics booster that generates context-aware query representations. Furthermore, we propose mutual attention to capture the interactive information between the visual and textual modalities. We validate the proposed approach on three commonly used video summarization benchmarks. Experimental results show that the proposed video summarization algorithm achieves state-of-the-art performance.

Comments:	Accepted at the IEEE International Conference on Image Processing (ICIP), 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Cite as:	arXiv:2307.01945 [cs.CV]
	(or arXiv:2307.01945v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2307.01945

Submission history

From: Jia-Hong Huang
[v1] Tue, 4 Jul 2023 22:28:17 UTC (13,456 KB)
