research-article

From query to question in one click: suggesting synthetic questions to searchers

Authors:

Idan Szpektor

WWW '13: Proceedings of the 22nd international conference on World Wide Web

Pages 391 - 402

https://doi.org/10.1145/2488388.2488423

Published: 13 May 2013

Abstract

In Web search, users may remain unsatisfied for several reasons: the search engine may not be effective enough, or the query might not reflect their intent. Years of research have focused on providing the best user experience for the data available to the search engine. However, little has been done to address the cases in which relevant content for the specific user need has not yet been posted on the Web. One obvious solution is to directly ask other users to generate the missing content using Community Question Answering services such as Yahoo! Answers or Baidu Zhidao. However, formulating a full-fledged question after having issued a query requires some effort. Some previous work proposed to automatically generate natural language questions from a given query, but not for scenarios in which a searcher is presented with a list of questions to choose from. We propose here to generate synthetic questions that the searcher can click so that they are posted directly as questions on a Community Question Answering service. This imposes new constraints, as the questions are actually shown to searchers, who will not appreciate an awkward style or redundancy. To this end, we introduce a learning-based approach that improves not only the relevance of the suggested questions to the original query, but also their grammatical correctness. In addition, since queries are often underspecified and ambiguous, we put a special emphasis on increasing the diversity of suggestions via a novel diversification mechanism. We conducted several experiments to evaluate our approach by comparing it to prior work. The experiments show that our algorithm improves question quality by 14% over prior work and that adding diversification reduces redundancy by 55%.
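
The abstract does not describe how the diversification mechanism works, so the following is only a minimal sketch of the general idea of trading relevance against redundancy when choosing which questions to show. It uses Maximal Marginal Relevance (MMR), a standard diversity-based reranking technique, not the paper's own mechanism. The function names `jaccard` and `mmr_select`, the weight `lam`, the token-overlap similarity, and the example questions and scores are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two questions (an illustrative choice)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def mmr_select(
    candidates: Sequence[str],
    relevance: Sequence[float],
    k: int,
    lam: float = 0.7,
    sim: Callable[[str, str], float] = jaccard,
) -> List[str]:
    """Greedily pick up to k candidates, balancing relevance and redundancy.

    At each step the candidate maximizing
        lam * relevance - (1 - lam) * (max similarity to already chosen)
    is selected. With lam = 1.0 this reduces to ranking by relevance alone.
    """
    remaining = list(range(len(candidates)))
    chosen: List[int] = []
    while remaining and len(chosen) < k:
        def score(i: int) -> float:
            # Redundancy is the similarity to the closest already-chosen question.
            redundancy = max(
                (sim(candidates[i], candidates[j]) for j in chosen),
                default=0.0,
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return [candidates[i] for i in chosen]


# Hypothetical candidate questions for one query, with made-up relevance scores.
questions = [
    "how do i fix a flat bike tire",
    "how do i fix a flat bike tire quickly",
    "where can i buy a bike tire",
]
scores = [0.9, 0.85, 0.6]

diverse = mmr_select(questions, scores, k=2, lam=0.5)
by_relevance = mmr_select(questions, scores, k=2, lam=1.0)
```

In this toy input, ranking by relevance alone keeps the two near-duplicate questions, while the balanced setting replaces the second one with the less similar question about buying a tire.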

Published In

WWW '13: Proceedings of the 22nd international conference on World Wide Web

May 2013

1628 pages

ISBN: 9781450320351

DOI: 10.1145/2488388

General Chairs:
Daniel Schwabe (PUC-Rio, Brazil)
Virgílio Almeida (UFMG, Brazil)
Hartmut Glaser (CGI.br, Brazil)

Program Chairs:
Ricardo Baeza-Yates (Yahoo! Labs, Spain & Chile)
Sue Moon (KAIST, South Korea)

Copyright © 2013. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

NICBR: Núcleo de Informação e Coordenação do Ponto BR
CGIBR: Comitê Gestor da Internet no Brasil

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WWW '13: 22nd International World Wide Web Conference

May 13 - 17, 2013

Rio de Janeiro, Brazil

Acceptance Rates

WWW '13 paper acceptance rate: 125 of 831 submissions (15%)

Overall acceptance rate: 1,899 of 8,196 submissions (23%)

Article Metrics

Total citations: 9
Total downloads: 443
Downloads (last 12 months): 8
Downloads (last 6 weeks): 1

Reflects downloads up to 03 Mar 2025

Cited By

Usuha K, Kato M, Fujita S (2022). Can a Machine Reading Comprehension Model Improve Ad-hoc Document Retrieval? From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries, pages 172-181. Online publication date: 7-Dec-2022.
https://doi.org/10.1007/978-3-031-21756-2_14
Pan B, Yang Y, Zhuang Y, Cai D (2019). Discriminate and Reconstruct: Learning from Language Model to Answer Keyword Questions. 2019 2nd China Symposium on Cognitive Computing and Hybrid Intelligence (CCHI), pages 6-11. Online publication date: Sep-2019.
https://doi.org/10.1109/CCHI.2019.8901922
Ding H, Balog K, Song D, Liu T, Sun L, Bruza P, Melucci M, Sebastiani F, Yang G (2018). Generating Synthetic Data for Neural Keyword-to-Question Models. Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, pages 51-58. Online publication date: 10-Sep-2018.
https://dl.acm.org/doi/10.1145/3234944.3234964
Guy I (2018). The Characteristics of Voice Search. ACM Transactions on Information Systems, 36:3, pages 1-28. Online publication date: 13-Mar-2018.
https://dl.acm.org/doi/10.1145/3182163
Srba I, Bielikova M (2016). A Comprehensive Survey and Classification of Approaches for Community Question Answering. ACM Transactions on the Web, 10:3, pages 1-63. Online publication date: 16-Aug-2016.
https://dl.acm.org/doi/10.1145/2934687
Guy I, Perego R, Sebastiani F, Aslam J, Ruthven I, Zobel J (2016). Searching by Talking. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, pages 35-44. Online publication date: 7-Jul-2016.
https://dl.acm.org/doi/10.1145/2911451.2911525
Loch L, Navrat P, Kovarova A (2015). Answering questions based on gradually learned knowledge from the web using lightweight semantics. Proceedings of the 16th International Conference on Computer Systems and Technologies, pages 192-198. Online publication date: 25-Jun-2015.
https://dl.acm.org/doi/10.1145/2812428.2812435
Wu H, Wu W, Zhou M, Chen E, Duan L, Shum H, Carterette B, Diaz F, Castillo C, Metzler D (2014). Improving search relevance for short queries in community question answering. Proceedings of the 7th ACM international conference on Web search and data mining, pages 43-52. Online publication date: 24-Feb-2014.
https://dl.acm.org/doi/10.1145/2556195.2556239
Kaveh-Yazdy F, Zareh-Bidoki A (2014). Aleph or Aleph-Maddah, that is the question! Spelling correction for search engine autocomplete service. 2014 4th International Conference on Computer and Knowledge Engineering (ICCKE), pages 273-278. Online publication date: Oct-2014.
https://doi.org/10.1109/ICCKE.2014.6993359
