DOI: 10.1145/3511808.3557678 · CIKM Conference Proceedings · Short paper · Open access

WDRASS: A Web-scale Dataset for Document Retrieval and Answer Sentence Selection

Published: 17 October 2022 Publication History

Abstract

Open-Domain Question Answering (ODQA) systems generate answers from relevant text returned by search engines, which may be either lexical, e.g., BM25, or embedding-based, e.g., dense passage retrieval (DPR). Few datasets are available for this task: they mainly target QA systems based on the machine reading (MR) approach, and their evaluation is problematic, as it mostly relies on matching uncontextualized short answers. In this paper, we present WDRASS, a dataset for ODQA based on answer sentence selection (AS2) models, which treat sentences as the candidate answers of a QA system. WDRASS consists of ∼64k questions and more than 800k labeled passages and sentences extracted from 30M documents. We evaluate the dataset by training models on it and comparing them with the same models trained on Google NQ. Our experiments show that WDRASS significantly improves the performance of retrieval and reranking models, thus boosting the accuracy of downstream QA tasks. We believe our dataset can have a significant impact in advancing IR research.
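As an illustrative sketch of the lexical retrieval stage the abstract mentions, the snippet below ranks candidate answer sentences for a question with a self-contained BM25 scorer. It is a toy example, not the paper's pipeline: the whitespace tokenization, the constants k1 and b, and the example sentences are all simplifying assumptions.

```python
import math
from collections import Counter

def bm25_rank(question, sentences, k1=1.5, b=0.75):
    """Rank candidate answer sentences for a question by BM25 score (toy sketch)."""
    docs = [s.lower().split() for s in sentences]  # naive whitespace tokenization
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # document frequency: in how many sentences each term occurs
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1

    def score(q_tokens, d):
        tf = Counter(d)
        s = 0.0
        for t in q_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        return s

    q = question.lower().split()
    order = sorted(range(n), key=lambda i: score(q, docs[i]), reverse=True)
    return [sentences[i] for i in order]
```

In an AS2 setting such a lexical ranker would only produce the candidate list; a trained reranking model then rescores the top sentences against the question.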


Cited By

  • (2024) In Situ Answer Sentence Selection at Web-scale. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 4298–4302. https://doi.org/10.1145/3627673.3679946. Online publication date: 21 Oct 2024.
  • (2024) Device-Wise Federated Network Pruning. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12342–12352. https://doi.org/10.1109/CVPR52733.2024.01173. Online publication date: 16 Jun 2024.
  • (2023) Efficient Fine-Tuning Large Language Models for Knowledge-Aware Response Planning. Machine Learning and Knowledge Discovery in Databases: Research Track, 593–611. https://doi.org/10.1007/978-3-031-43415-0_35. Online publication date: 18 Sep 2023.


    Published In

    CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
    October 2022
    5274 pages
    ISBN:9781450392365
    DOI:10.1145/3511808
    • General Chairs:
    • Mohammad Al Hasan,
    • Li Xiong
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. information retrieval
    2. machine learning
    3. question answering

    Qualifiers

    • Short-paper

    Conference

    CIKM '22

    Acceptance Rates

    CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

