Abstract
Recognizing geo-entity relations in rich text requires robust and effective keyword extraction. Compared with supervised learning methods, unsupervised methods attract more attention because they can capture dynamic variation of features in text and discover additional relation types. Frequency-based keyword extraction methods have been widely studied, but they are difficult to apply directly to geo-entity keyword extraction because geo-entity relations are sparsely distributed in text. In addition, few studies address Chinese keyword extraction. This paper proposes a context-enhanced keyword extraction method. First, the contexts of geo-entities are enhanced to reduce the sparseness of terms. Second, two well-known frequency-based statistical methods (DF and Entropy) are used to build a large-scale corpus automatically from the enhanced contexts. Third, lexical features and their weights are determined statistically from the corpus to make terms more distinguishable. Finally, all terms in the enhanced contexts are scored with the lexical features, and the most important terms are selected as the keywords of geo-entity pairs. Experiments are conducted on a large collection of real Chinese web texts. Compared with DF and Entropy, the presented method improves precision by 41% and 36%, respectively, in discovering sparsely distributed keywords, and generates an additional 60% of correct keywords for geo-entity relation recognition.
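The two frequency-based baselines named in the abstract can be illustrated with a minimal sketch. It assumes that DF stands for document frequency and uses generic textbook definitions: the number of contexts containing a term, and the Shannon entropy of a term's occurrences across contexts. The paper's exact formulas, its context enhancement step, and its lexical feature weighting are not given in the abstract and are not reproduced here. The function names and the tokenized toy contexts are hypothetical.

```python
import math
from collections import Counter


def document_frequency(contexts):
    """Count, for each term, the number of contexts that contain it."""
    df = Counter()
    for tokens in contexts:
        # set() so a term repeated inside one context is counted once
        df.update(set(tokens))
    return df


def term_entropy(contexts):
    """Shannon entropy (bits) of each term's occurrences across contexts."""
    per_context = [Counter(tokens) for tokens in contexts]
    totals = Counter()
    for counts in per_context:
        totals.update(counts)

    entropy = {}
    for term, total in totals.items():
        h = 0.0
        for counts in per_context:
            c = counts.get(term, 0)
            if c:
                p = c / total  # share of the term's occurrences in this context
                h -= p * math.log2(p)
        entropy[term] = h
    return entropy


# Hypothetical, already tokenized contexts of geo-entity pairs (assumption).
contexts = [
    ["river", "flows", "through", "city"],
    ["river", "crosses", "city", "river"],
    ["bridge", "spans", "river"],
]

df = document_frequency(contexts)
ent = term_entropy(contexts)

# Rank terms by each baseline score, highest first.
by_df = sorted(df, key=df.get, reverse=True)
by_entropy = sorted(ent, key=ent.get, reverse=True)
```

Under these definitions, a term that occurs in only one context gets a DF of 1 and an entropy of 0. This is one way to see why purely frequency-based scores struggle with sparsely distributed relation keywords.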
Acknowledgments
This work was partially supported by the National High-Tech Research and Development Program of China (2013AA120305) and the National Natural Science Foundation of China (41271408).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Yu, L., Lu, F., Zhang, X., Liu, X. (2016). Context Enhanced Keyword Extraction for Sparse Geo-Entity Relation from Web Texts. In: Morishima, A., et al. Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9865. Springer, Cham. https://doi.org/10.1007/978-3-319-45835-9_22
DOI: https://doi.org/10.1007/978-3-319-45835-9_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45834-2
Online ISBN: 978-3-319-45835-9
eBook Packages: Computer Science, Computer Science (R0)