DOI: 10.1145/3552463.3557021

Narrative Dataset: Towards Goal-Driven Narrative Generation

Published: 10 October 2022

Abstract

In this paper, we propose a new dataset, the Narrative dataset, a work in progress towards generating video and text narratives of complex daily events from long videos captured by multiple cameras. Because most existing datasets are collected from publicly available videos such as YouTube videos, no dataset targets the task of narrative summarization of complex videos that contain multiple narratives. Hence, we create story plots and shoot videos with hired actors to produce complex video sets in which 3 to 4 narratives occur in each video. In a story plot, a narrative is composed of multiple events, each corresponding to a video clip of a key human activity. On top of the shot video sets and the story plots, the Narrative dataset contains dense annotations of actors, objects, and their relationships for each frame as the facts of narratives. The dataset thus captures a rich, holistic, and hierarchical structure of facts, events, and narratives. Moreover, we introduce the Narrative Graph, a collection of scene graphs of narrative events together with their causal relationships, to bridge the gap between the collection of facts and the generation of summary sentences for a narrative. Beyond related subtasks such as scene graph generation, the Narrative dataset potentially provides challenging subtasks for bridging human event clips to narratives.
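The hierarchy the abstract describes (per-frame facts, events grounded in clips, and a Narrative Graph with causal edges between events) can be sketched as a minimal data structure. This is a hypothetical illustration of that organization, not the dataset's actual schema; all class and field names here are assumptions.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the facts -> events -> narrative hierarchy.
# Names (Fact, Event, NarrativeGraph, "picks_up", etc.) are invented
# for this example and do not come from the Narrative dataset itself.

@dataclass
class Fact:
    """A per-frame annotation: a subject-predicate-object triple."""
    frame: int
    subject: str     # e.g. an actor ID
    predicate: str   # e.g. a relationship label
    obj: str         # e.g. an object ID

@dataclass
class Event:
    """A key human activity, grounded in a video clip and its facts."""
    event_id: str
    clip: tuple                              # (start_frame, end_frame)
    facts: list = field(default_factory=list)

@dataclass
class NarrativeGraph:
    """Scene graphs of events plus causal links between events."""
    events: dict = field(default_factory=dict)        # event_id -> Event
    causal_edges: list = field(default_factory=list)  # (cause_id, effect_id)

    def add_event(self, event):
        self.events[event.event_id] = event

    def add_causal_edge(self, cause_id, effect_id):
        # Both endpoints must already be known events.
        assert cause_id in self.events and effect_id in self.events
        self.causal_edges.append((cause_id, effect_id))

    def causes_of(self, event_id):
        """Events that causally precede the given event."""
        return [c for c, e in self.causal_edges if e == event_id]

# Example: two events in one narrative, linked causally.
g = NarrativeGraph()
g.add_event(Event("e1", (0, 120), [Fact(10, "actor_1", "picks_up", "bag")]))
g.add_event(Event("e2", (121, 300), [Fact(150, "actor_1", "exits", "room")]))
g.add_causal_edge("e1", "e2")
print(g.causes_of("e2"))  # ['e1']
```

A narrative-level summary would then be generated by traversing such causal chains of events rather than captioning frames independently, which is the gap the Narrative Graph is meant to bridge.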



Published In

NarSUM '22: Proceedings of the 1st Workshop on User-centric Narrative Summarization of Long Videos
October 2022
36 pages
ISBN:9781450394932
DOI:10.1145/3552463

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. datasets
  2. narrative generation
  3. video summarization

Qualifiers

  • Research-article

Conference

MM '22

