Nothing Special   »   [go: up one dir, main page]

×
Please click here if you are not redirected within a few seconds.
Oct 26, 2020 · In this paper, we address the problem of assigning importance scores to video segments, that is how much information they contain with respect to the overall ...
Our experiments investigate the impact of visual and temporal information, as well as the combination of multimodal features on importance prediction. Keywords.
This paper addresses the problem of assigning importance scores to video segments, that is how much information they contain with respect to the overall topic ...
Oct 26, 2020 · Our experiments investigate the impact of visual and temporal information, as well as the combination of multimodal features on importance ...
We use state-of-the-art pre-trained models to encode each modality in order to extract features.
EDUVSUM (Educational Video Summarization). Introduced by Ghauri et al. in Classification of Important Segments in Educational Videos using Multimodal Features.
Classification of Important Segments in Educational Videos using Multimodal Features. J. Ghauri, S. Hakimov, and R. Ewerth. CIKM (Workshops), volume 2699 of ...
Classification of Important Segments in Educational Videos using Multimodal Features. J. Ghauri, S. Hakimov, and R. Ewerth. CIKM (Workshops), CEUR-WS.org ...
We intro- duce PolyViLT, a novel multimodal transformer trained with a multi-instance learning loss that is more effective than current approaches for retrieval ...
EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important temporal segments in ...