Abstract
A corpus-based Measure of Semantic Relatedness can be calculated for every pair of words occurring in a corpus, but for many word pairs it produces erroneous results because of accidental associations derived from several context features. We propose a partial measure that assigns relatedness values only to word pairs that are sufficiently well supported by corpus data. Three simple implementations of this idea are presented and evaluated on large corpora and wordnets for two languages. Partial Measures of Semantic Relatedness are shown to perform better in tasks focused on wordnet development than a state-of-the-art ‘full’ Measure of Semantic Relatedness. The partial measure is also compared with a globally filtered measure.
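The idea of a partial measure can be illustrated with a minimal Python sketch. It is not one of the three implementations evaluated in the paper, which the abstract does not describe. It assumes that words are represented as sparse feature-weight dictionaries, that relatedness is cosine similarity restricted to shared features, and that "support" means a minimum number of shared features with positive weight. The function name `partial_relatedness` and the parameter `min_shared` are hypothetical.

```python
from math import sqrt
from typing import Dict, Optional

# Sparse context vector: feature name -> association weight (assumed representation).
Vector = Dict[str, float]


def partial_relatedness(u: Vector, v: Vector, min_shared: int = 3) -> Optional[float]:
    """Return a cosine-style relatedness value, or None when the pair is
    supported by fewer than `min_shared` shared positive features.

    Returning None (rather than a low score) is what makes the measure
    partial: weakly supported pairs get no value at all.
    """
    # Features that carry positive weight in both vectors.
    shared = [f for f in u if f in v and u[f] > 0 and v[f] > 0]
    if len(shared) < min_shared:
        return None  # not enough corpus evidence for this pair

    dot = sum(u[f] * v[f] for f in shared)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return None
    return dot / (norm_u * norm_v)
```

A 'full' measure would correspond to `min_shared=0` in this sketch, where every pair with non-zero vectors receives a value, including pairs linked only by one or two accidental features.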
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Piasecki, M., Wendelberger, M. (2014). Partial Measure of Semantic Relatedness Based on the Local Feature Selection. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_41
DOI: https://doi.org/10.1007/978-3-319-10816-2_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science (R0)