short-paper

WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval

Authors:

W. Bruce Croft

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Pages 1165 - 1168

https://doi.org/10.1145/3209978.3210118

Published: 27 June 2018

Abstract

With the rise in mobile and voice search, answer passage retrieval acts as a critical component of an effective information retrieval system for open-domain question answering. Currently, there are no comparable collections that address non-factoid question answering within larger documents while also providing enough examples to train a deep neural network. In this paper, we introduce a new Wikipedia-based collection built specifically for non-factoid answer passage retrieval, containing thousands of questions with annotated answers, and show benchmark results for a variety of state-of-the-art neural architectures and retrieval models. The experimental results demonstrate the unique challenges that answer passage retrieval within topically relevant documents presents for future research.

Index Terms

1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Question answering

Published In

SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

June 2018

1509 pages

ISBN:9781450356572

DOI:10.1145/3209978

General Chairs: Kevyn Collins-Thompson (University of Michigan, United States), Qiaozhu Mei (University of Michigan, United States)

Program Chairs: Brian Davison (Lehigh University, United States), Yiqun Liu (Tsinghua University, China), Emine Yilmaz (University College London, United Kingdom)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

Air Force Research Laboratory
University of Southern California

Conference

SIGIR '18: The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval

Sponsor: SIGIR

July 8-12, 2018

Ann Arbor, MI, USA

Acceptance Rates

SIGIR '18 Paper Acceptance Rate 86 of 409 submissions, 21%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
