An inference-based model of word meaning in context as a paraphrase distribution

Published: 01 July 2013

Abstract

Graded models of word meaning in context characterize the meaning of individual usages (occurrences) without reference to dictionary senses. We introduce a novel approach that frames the task of computing word meaning in context as a probabilistic inference problem. The model represents the meaning of a word as a probability distribution over potential paraphrases, inferred using an undirected graphical model. Evaluated on paraphrasing tasks, the model achieves state-of-the-art performance.
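
The abstract describes representing the meaning of a word occurrence as a probability distribution over paraphrase candidates, inferred with an undirected graphical model (the author tags mention loopy belief propagation). The following is a minimal illustrative sketch, not the paper's actual model: all words, paraphrase candidates, edges, and potential values are invented for illustration. It runs sum-product loopy belief propagation over a tiny pairwise model and reads off a paraphrase distribution for one target word.

```python
# A minimal illustrative sketch, not the paper's implementation: word meaning in
# context modelled as a probability distribution over paraphrase candidates,
# computed by sum-product loopy belief propagation over a small pairwise
# undirected graphical model. All words, candidates, edges, and potential
# values below are invented for illustration only.
from collections import defaultdict

# Variables: words of the sentence; each ranges over paraphrase candidates.
domains = {
    "coach": ["trainer", "bus", "instructor"],
    "run":   ["operate", "sprint", "manage"],
}
# Edges of the model (here a single, hypothetical syntactic dependency).
edges = [("coach", "run")]

# Unary potentials: hypothetical out-of-context paraphrase scores.
unary = {
    "coach": {"trainer": 0.5, "bus": 0.3, "instructor": 0.2},
    "run":   {"operate": 0.3, "sprint": 0.5, "manage": 0.2},
}
# Pairwise potentials: hypothetical compatibility of paraphrase pairs.
pairwise = defaultdict(lambda: 0.1)
pairwise[("trainer", "manage")] = 1.0
pairwise[("instructor", "manage")] = 0.8
pairwise[("bus", "operate")] = 0.9

def compat(a, b):
    """Symmetric pairwise potential."""
    return max(pairwise[(a, b)], pairwise[(b, a)])

def normalize(scores):
    z = sum(scores.values())
    return {k: v / z for k, v in scores.items()}

# Initialise both directed messages on every edge to uniform.
messages = {}
for u, v in edges:
    for src, dst in [(u, v), (v, u)]:
        messages[(src, dst)] = {x: 1.0 for x in domains[dst]}

# Sum-product loopy belief propagation (synchronous updates).
for _ in range(10):
    updated = {}
    for (src, dst) in messages:
        msg = {}
        for x_dst in domains[dst]:
            total = 0.0
            for x_src in domains[src]:
                # Product of incoming messages to src from neighbours other
                # than dst (trivially 1.0 here, since the graph is one edge).
                incoming = 1.0
                for (a, b), m in messages.items():
                    if b == src and a != dst:
                        incoming *= m[x_src]
                total += unary[src][x_src] * incoming * compat(x_src, x_dst)
            msg[x_dst] = total
        updated[(src, dst)] = normalize(msg)
    messages = updated

# Belief for the target word = its paraphrase distribution in this context.
target = "run"
belief = {x: unary[target][x] for x in domains[target]}
for (src, dst), m in messages.items():
    if dst == target:
        for x in belief:
            belief[x] *= m[x]
print(normalize(belief))  # mass shifts toward "manage" given the "coach" context
```

With these invented potentials, the context word shifts probability mass for "run" away from its out-of-context favourite "sprint" and toward "manage", which is the qualitative behaviour a context-sensitive paraphrase distribution is meant to capture.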

Cited By

  • (2016) Generating Incremental Length Summary Based on Hierarchical Topic Coverage Maximization. ACM Transactions on Intelligent Systems and Technology 7(3), 1-33. DOI: 10.1145/2809433. Online publication date: 17-Feb-2016.

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 4, Issue 3
    Special Sections on Paraphrasing; Intelligent Systems for Socially Aware Computing; Social Computing, Behavioral-Cultural Modeling, and Prediction
    June 2013
    435 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/2483669

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 July 2013
    Accepted: 01 December 2011
    Revised: 01 October 2011
    Received: 01 March 2011
    Published in TIST Volume 4, Issue 3

    Author Tags

    1. Semantics
    2. lexical semantics
    3. loopy belief propagation
    4. paraphrases
    5. probabilistic graphical models
    6. probabilistic inference

    Qualifiers

    • Research-article
    • Research
    • Refereed
