DOI: 10.1145/1099554.1099571

Retrieving answers from frequently asked questions pages on the web

Published: 31 October 2005

Abstract

We address the task of answering natural language questions by using the large number of Frequently Asked Questions (FAQ) pages available on the web. The task involves three steps: (1) fetching FAQ pages from the web; (2) automatically extracting question/answer (Q/A) pairs from the collected pages; and (3) answering users' questions by retrieving appropriate Q/A pairs. We discuss our solutions for each of the three steps, and give detailed evaluation results on a collected corpus of about 3.6 GB of text data (293K pages, 2.8M Q/A pairs), with real users' questions sampled from a web search engine log. Specifically, we propose simple but effective methods for Q/A extraction and investigate task-specific retrieval models for answering questions. Our best model finds answers for 36% of the test questions in the top 20 results. Our overall conclusion is that FAQ pages on the web provide an excellent resource for addressing real users' information needs in a highly focused manner.
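
To make step (3) and the "answers in the top 20 results" measure concrete, here is a minimal Python sketch. It is not the retrieval model evaluated in the paper: the function names (`tokenize`, `rank_pairs`, `success_at_k`) and the toy TF-IDF overlap score are assumptions made purely for illustration, and only the question side of each Q/A pair is scored.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_pairs(query, qa_pairs, k=20):
    """Return indices of up to k Q/A pairs, best match first.

    Hypothetical scoring: a toy TF-IDF overlap between the user query and
    the question text of each pair (an assumption, not the paper's model).
    """
    docs = [tokenize(question) for question, _answer in qa_pairs]
    n = len(docs)
    # Document frequency: number of stored questions containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    query_terms = set(tokenize(query))
    scored = []
    for i, doc in enumerate(docs):
        tf = Counter(doc)
        score = sum(tf[t] * math.log(1 + n / df[t]) for t in query_terms if t in tf)
        scored.append((score, i))
    # Highest score first; ties keep the original corpus order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:k] if score > 0]


def success_at_k(ranked_lists, relevant_sets, k=20):
    """Fraction of test questions with at least one relevant pair in the top k."""
    hits = sum(
        1
        for ranked, relevant in zip(ranked_lists, relevant_sets)
        if any(i in relevant for i in ranked[:k])
    )
    return hits / len(ranked_lists)
```

Under this reading, the reported 36% would correspond to `success_at_k(..., k=20)` returning 0.36 over the test questions, given human relevance judgments for the retrieved pairs.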

Published In

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN: 1595931406
DOI: 10.1145/1099554
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. FAQ retrieval
  2. question answering
  3. questions beyond factoids

Qualifiers

  • Article

Conference

CIKM05: Conference on Information and Knowledge Management
October 31 - November 5, 2005
Bremen, Germany

Acceptance Rates

CIKM '05 paper acceptance rate: 77 of 425 submissions (18%)
Overall acceptance rate: 1,861 of 8,427 submissions (22%)

Cited By

  • (2023) HSM-QA: Question Answering System Based on Hierarchical Semantic Matching. IEEE Access, 11 (77826-77839). DOI: 10.1109/ACCESS.2023.3296850. Online publication date: 2023
  • (2021) QuAX. Proceedings of the 30th ACM International Conference on Information & Knowledge Management (1518-1527). DOI: 10.1145/3459637.3482289. Online publication date: 26-Oct-2021
  • (2021) An Ensemble Net of Convolutional Auto-Encoder and Graph Auto-Encoder for Auto-Diagnosis. IEEE Transactions on Cognitive and Developmental Systems, 13:1 (189-199). DOI: 10.1109/TCDS.2020.2984335. Online publication date: Mar-2021
  • (2020) Question retrieval using combined queries in community question answering. Journal of Intelligent Information Systems. DOI: 10.1007/s10844-020-00612-x. Online publication date: 24-Jul-2020
  • (2020) Optimized Transformer Models for FAQ Answering. Advances in Knowledge Discovery and Data Mining (235-248). DOI: 10.1007/978-3-030-47426-3_19. Online publication date: 6-May-2020
  • (2019) A question-entailment approach to question answering. BMC Bioinformatics, 20:1. DOI: 10.1186/s12859-019-3119-4. Online publication date: 22-Oct-2019
  • (2019) FAQ Retrieval Using Attentive Matching. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (929-932). DOI: 10.1145/3331184.3331294. Online publication date: 18-Jul-2019
  • (2018) Document Summarization for Answering Non-Factoid Queries. IEEE Transactions on Knowledge and Data Engineering, 30:1 (15-28). DOI: 10.1109/TKDE.2017.2754373. Online publication date: 1-Jan-2018
  • (2017) Ripple Down Rules for question answering. Semantic Web, 8:4 (511-532). DOI: 10.3233/SW-150204. Online publication date: 1-Jan-2017
  • (2017) Boosting a Rule-Based Chatbot Using Statistics and User Satisfaction Ratings. Artificial Intelligence and Natural Language (27-41). DOI: 10.1007/978-3-319-71746-3_3. Online publication date: 28-Nov-2017
