An inference-based model of word meaning in context as a paraphrase distribution

Published: 01 July 2013

Abstract

Graded models of word meaning in context characterize the meaning of individual usages (occurrences) without reference to dictionary senses. We introduce a novel approach that frames the task of computing word meaning in context as a probabilistic inference problem. The model represents the meaning of a word as a probability distribution over potential paraphrases, inferred using an undirected graphical model. Evaluated on paraphrasing tasks, the model achieves state-of-the-art performance.
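
The abstract describes representing the meaning of a word occurrence as a probability distribution over paraphrase candidates, inferred with an undirected graphical model (the author tags mention loopy belief propagation). The following is a minimal illustrative sketch, not the paper's actual model: all words, paraphrase candidates, edges, and potential values are invented for illustration. It runs sum-product loopy belief propagation over a tiny pairwise model and reads off a paraphrase distribution for one target word.

```python
# A minimal illustrative sketch, not the paper's implementation: word meaning in
# context modelled as a probability distribution over paraphrase candidates,
# computed by sum-product loopy belief propagation over a small pairwise
# undirected graphical model. All words, candidates, edges, and potential
# values below are invented for illustration only.
from collections import defaultdict

# Variables: words of the sentence; each ranges over paraphrase candidates.
domains = {
    "coach": ["trainer", "bus", "instructor"],
    "run":   ["operate", "sprint", "manage"],
}
# Edges of the model (here a single, hypothetical syntactic dependency).
edges = [("coach", "run")]

# Unary potentials: hypothetical out-of-context paraphrase scores.
unary = {
    "coach": {"trainer": 0.5, "bus": 0.3, "instructor": 0.2},
    "run":   {"operate": 0.3, "sprint": 0.5, "manage": 0.2},
}
# Pairwise potentials: hypothetical compatibility of paraphrase pairs.
pairwise = defaultdict(lambda: 0.1)
pairwise[("trainer", "manage")] = 1.0
pairwise[("instructor", "manage")] = 0.8
pairwise[("bus", "operate")] = 0.9

def compat(a, b):
    """Symmetric pairwise potential."""
    return max(pairwise[(a, b)], pairwise[(b, a)])

def normalize(scores):
    z = sum(scores.values())
    return {k: v / z for k, v in scores.items()}

# Initialise both directed messages on every edge to uniform.
messages = {}
for u, v in edges:
    for src, dst in [(u, v), (v, u)]:
        messages[(src, dst)] = {x: 1.0 for x in domains[dst]}

# Sum-product loopy belief propagation (synchronous updates).
for _ in range(10):
    updated = {}
    for (src, dst) in messages:
        msg = {}
        for x_dst in domains[dst]:
            total = 0.0
            for x_src in domains[src]:
                # Product of incoming messages to src from neighbours other
                # than dst (trivially 1.0 here, since the graph is one edge).
                incoming = 1.0
                for (a, b), m in messages.items():
                    if b == src and a != dst:
                        incoming *= m[x_src]
                total += unary[src][x_src] * incoming * compat(x_src, x_dst)
            msg[x_dst] = total
        updated[(src, dst)] = normalize(msg)
    messages = updated

# Belief for the target word = its paraphrase distribution in this context.
target = "run"
belief = {x: unary[target][x] for x in domains[target]}
for (src, dst), m in messages.items():
    if dst == target:
        for x in belief:
            belief[x] *= m[x]
print(normalize(belief))  # mass shifts toward "manage" given the "coach" context
```

With these invented potentials, the context word shifts probability mass for "run" away from its out-of-context favourite "sprint" and toward "manage", which is the qualitative behaviour a context-sensitive paraphrase distribution is meant to capture.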

Cited By

  • (2016) Generating Incremental Length Summary Based on Hierarchical Topic Coverage Maximization. ACM Transactions on Intelligent Systems and Technology 7(3), 1-33. DOI: 10.1145/2809433. Online publication date: 17-Feb-2016.

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 4, Issue 3
    Special Sections on Paraphrasing; Intelligent Systems for Socially Aware Computing; Social Computing, Behavioral-Cultural Modeling, and Prediction
    June 2013
    435 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/2483669

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 July 2013
    Accepted: 01 December 2011
    Revised: 01 October 2011
    Received: 01 March 2011
    Published in TIST Volume 4, Issue 3

    Author Tags

    1. Semantics
    2. lexical semantics
    3. loopy belief propagation
    4. paraphrases
    5. probabilistic graphical models
    6. probabilistic inference

    Qualifiers

    • Research-article
    • Research
    • Refereed
