WikiGaze: Gaze-based Personalized Summarization of Wikipedia Reading Session

Authors: Amit Arjun Verma, S. R. S. Iyengar

HUMAN '20: Proceedings of the 3rd Workshop on Human Factors in Hypertext

Article No.: 4, Pages 1 - 9

https://doi.org/10.1145/3406853.3432662

Published: 25 November 2020

Abstract

Wikipedia is an open-content encyclopedia that receives billions of page views per month. Users have been observed to visit multiple articles in a single reading session. To reduce information overload and information loss, the research community has shown growing interest in approaches that present only the necessary information to users. Automatic generation of personalized summaries is a proven remedy for the information overload problem. In this paper, we propose a technique to generate personalized summaries of Wikipedia articles by analyzing users' reading patterns. To analyze reading patterns, we track eye gaze during the article reading session. Eye gaze analysis helps identify how a reader's attention is distributed over an article. We extend the proposed approach to generate a summary of the multiple articles visited during a user's Wikipedia reading session. We capture a dataset representing the reading patterns of Wikipedia users and make it publicly available to the research community.
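The abstract does not spell out how gaze data is turned into a summary, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each gaze fixation has already been mapped to a sentence index (the `Fixation` type, the `attention_distribution` and `gaze_summary` names, and the top-k selection rule are all hypothetical). The share of fixation time per sentence stands in for the reader's attention distribution, and the most-attended sentences are returned in reading order.

```python
from typing import List, NamedTuple, Sequence


class Fixation(NamedTuple):
    """One gaze fixation, already mapped to a sentence (hypothetical format)."""
    sentence_id: int    # index of the sentence the fixation landed on
    duration_ms: float  # how long the gaze rested there


def attention_distribution(fixations: Sequence[Fixation], n_sentences: int) -> List[float]:
    """Return the fraction of total fixation time spent on each sentence."""
    totals = [0.0] * n_sentences
    for fix in fixations:
        # Ignore fixations that fall outside the article text.
        if 0 <= fix.sentence_id < n_sentences:
            totals[fix.sentence_id] += fix.duration_ms
    grand_total = sum(totals)
    if grand_total == 0:
        return totals  # no usable gaze data: all weights stay at zero
    return [t / grand_total for t in totals]


def gaze_summary(sentences: Sequence[str], fixations: Sequence[Fixation], k: int = 2) -> List[str]:
    """Pick the k most-attended sentences and return them in reading order.

    Assumption: attention share alone decides what enters the summary.
    """
    weights = attention_distribution(fixations, len(sentences))
    ranked = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
    chosen = sorted(ranked[:k])  # restore original order for readability
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    article = ["Sentence A.", "Sentence B.", "Sentence C.", "Sentence D."]
    gaze = [Fixation(0, 100), Fixation(2, 500), Fixation(1, 50),
            Fixation(2, 150), Fixation(3, 200)]
    print(gaze_summary(article, gaze, k=2))
```

Under the same assumptions, a session covering several articles could be handled by concatenating the sentences of all visited articles and pooling their fixations before ranking.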

Index Terms

WikiGaze: Gaze-based Personalized Summarization of Wikipedia Reading Session
1. Human-centered computing
  1. Collaborative and social computing
    1. Empirical studies in collaborative and social computing
  2. Human computer interaction (HCI)
    1. Interactive systems and tools

Published In

HUMAN '20: Proceedings of the 3rd Workshop on Human Factors in Hypertext

December 2020

25 pages

ISBN: 9781450380584

DOI: 10.1145/3406853

Editors:
Claus Atzenbeck, Institute of Information Systems (iisys), Hof University, Germany
Jessica Rubart, Ostwestfalen-Lippe University of Applied Sciences and Arts, Germany

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

In-Cooperation

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

CSRI, Department of Science and Technology India

Conference

HT '20

Sponsor:

SIGWEB

HT '20: 31st ACM Conference on Hypertext and Social Media

December 4, 2020

Virtual Event, USA

Acceptance Rates

HUMAN '20 Paper Acceptance Rate 3 of 5 submissions, 60%;

Overall Acceptance Rate 6 of 9 submissions, 67%
