Abstract
People often issue informational queries to search engines to find out more about some entities or events. While a Wikipedia-like summary would be an ideal answer to such queries, not all queries have a corresponding Wikipedia entry. In this work we propose to study query-oriented keyphrase extraction, which can be used to assist search results summarization. We propose a general method for keyphrase extraction for our task, where we consider both phraseness and informativeness. We discuss three criteria for phraseness and four ways to compute informativeness scores. Using a large Wikipedia corpus and 40 queries, our empirical evaluation shows that using a named entity-based phraseness criterion and a language model-based informativeness score gives the best performance on our task. This method also outperforms two state-of-the-art baseline methods.
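The two-part scoring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so everything here is an assumption for illustration. A run of capitalized tokens stands in for the named entity-based phraseness criterion, and a pointwise KL-style comparison between a unigram language model of the search results and a background unigram model stands in for the language model-based informativeness score. The function names (`candidate_phrases`, `informativeness`, `rank_keyphrases`) are hypothetical.

```python
import math
import re
from collections import Counter


def candidate_phrases(text):
    """Phraseness filter (assumed): runs of capitalized tokens,
    a crude stand-in for a real named entity recognizer."""
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", text)


def unigram_counts(docs):
    """Lowercased unigram counts over a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(w.lower() for w in re.findall(r"[A-Za-z]+", doc))
    return counts


def informativeness(phrase, fg, bg, fg_total, bg_total, vocab_size):
    """Assumed score: sum over the phrase's words of
    p_fg(w) * log(p_fg(w) / p_bg(w)), with add-one smoothing."""
    score = 0.0
    for w in phrase.lower().split():
        p_fg = (fg[w] + 1) / (fg_total + vocab_size)
        p_bg = (bg[w] + 1) / (bg_total + vocab_size)
        score += p_fg * math.log(p_fg / p_bg)
    return score


def rank_keyphrases(result_docs, background_docs, top_k=5):
    """Keep candidates that pass the phraseness filter, then rank them
    by informativeness against the background collection."""
    fg = unigram_counts(result_docs)
    bg = unigram_counts(background_docs)
    vocab_size = len(set(fg) | set(bg))
    fg_total, bg_total = sum(fg.values()), sum(bg.values())
    candidates = {p for doc in result_docs for p in candidate_phrases(doc)}
    scored = [
        (informativeness(p, fg, bg, fg_total, bg_total, vocab_size), p)
        for p in candidates
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [p for _, p in scored[:top_k]]
```

Calling `rank_keyphrases(result_docs, background_docs)` with the retrieved documents for a query and a sample of a general corpus returns the top-ranked candidate phrases. Separating the phraseness filter from the informativeness score mirrors the structure the abstract describes, so either part could be swapped for a different criterion without touching the other.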
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qiu, M., Li, Y., Jiang, J. (2012). Query-Oriented Keyphrase Extraction. In: Hou, Y., Nie, JY., Sun, L., Wang, B., Zhang, P. (eds) Information Retrieval Technology. AIRS 2012. Lecture Notes in Computer Science, vol 7675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35341-3_6
DOI: https://doi.org/10.1007/978-3-642-35341-3_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35340-6
Online ISBN: 978-3-642-35341-3
eBook Packages: Computer Science (R0)