Abstract
Keyphrases extracted from news articles can concisely represent the main content of news events. In this paper, we first present several criteria for high-quality news keyphrases. To integrate those criteria into the keyphrase extraction task, we propose a novel formulation that converts the task into a learning-to-rank problem. Our approach involves two phases: selecting candidate keyphrases and ranking all possible sub-permutations among the candidates. Three feature sets (lexical, locality, and coherence) are introduced to rank the candidates, and the best sub-permutation then provides the keyphrases. The proposed method is evaluated on a multi-news collection, and the experimental results verify that it is effective at extracting coherent news keyphrases.
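The two-phase idea can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the candidate phrases, the three toy feature functions, the relatedness table, and the fixed weights are all hypothetical stand-ins, and in a learning-to-rank setting the weights would be learned from labeled data rather than set by hand. The sketch only shows how ordered sub-permutations of a candidate set can be enumerated and scored so that the top-scoring one supplies the keyphrases.

```python
from itertools import permutations

# Hypothetical candidates with toy statistics (not taken from the paper).
tf = {"oil spill": 0.9, "gulf coast": 0.7, "press conference": 0.2}
first_pos = {"oil spill": 0, "gulf coast": 12, "press conference": 40}
relatedness = {
    frozenset(("oil spill", "gulf coast")): 0.8,
    frozenset(("oil spill", "press conference")): 0.1,
    frozenset(("gulf coast", "press conference")): 0.1,
}

def lexical(perm):
    """Toy lexical feature: mean term weight of the selected phrases."""
    return sum(tf[p] for p in perm) / len(perm)

def locality(perm):
    """Toy locality feature: 1.0 if phrases are listed in order of first appearance."""
    positions = [first_pos[p] for p in perm]
    return 1.0 if positions == sorted(positions) else 0.0

def coherence(perm):
    """Toy coherence feature: mean pairwise relatedness (0.0 for a single phrase)."""
    pairs = [(a, b) for i, a in enumerate(perm) for b in perm[i + 1:]]
    if not pairs:
        return 0.0
    return sum(relatedness.get(frozenset(pair), 0.0) for pair in pairs) / len(pairs)

def sub_permutations(candidates, max_len):
    """Yield every ordered selection of 1..max_len distinct candidates."""
    for k in range(1, max_len + 1):
        yield from permutations(candidates, k)

def score(perm, features, weights):
    """Linear scoring function: weighted sum of the feature values."""
    return sum(w * f(perm) for f, w in zip(features, weights))

features = (lexical, locality, coherence)
weights = (1.0, 0.5, 1.0)  # assumed by hand; a ranking model would learn these

best = max(sub_permutations(list(tf), 3), key=lambda p: score(p, features, weights))
print(best)
```

Exhaustive enumeration grows factorially with the number of candidates, so a sketch like this is only practical when the candidate selection phase keeps the set small.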
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ding, Z., Zhang, Q., Huang, X. (2011). Learning to Extract Coherent Keyphrases from Online News. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_43
DOI: https://doi.org/10.1007/978-3-642-25631-8_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25630-1
Online ISBN: 978-3-642-25631-8
eBook Packages: Computer Science, Computer Science (R0)