Normalizing Complex Functional Expressions in Japanese Predicates: Linguistically-Directed Rule-Based Paraphrasing and Its Application

Published: 01 August 2013

Abstract

Text mining systems, such as opinion mining systems, are increasingly in demand, and they require a deep semantic understanding of the target language. Extracting the semantic information of functional expressions plays a crucial role here, because functional expressions such as “would like to” and “can’t” are key to detecting customers’ needs and wants. In Japanese, however, functional expressions appear as suffixes, and two different types of functional expressions are merged into one predicate: one influences the factual meaning of the predicate, while the other is used merely for discourse purposes. This multiplies the number of surface forms, which hinders information extraction systems. In this article, we present a novel normalization technique that paraphrases complex functional expressions into simplified forms that retain only the crucial meaning of the predicate. We construct paraphrasing rules based on linguistic theories in syntax and semantics. Experimental results indicate that our system achieves an accuracy of 79.7% while reducing the differences in functional expressions by up to 66.7%. The results also show improved predicate extraction performance, providing encouraging evidence that paraphrasing is a usable means of normalizing different language expressions.
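The normalization idea in the abstract can be illustrated with a minimal sketch. It assumes predicates are already segmented into a content word followed by romanized suffixes, and it uses a hypothetical toy inventory (`FACTUAL`, `DISCOURSE`) invented for illustration; these are not the paraphrasing rules constructed in the article, and the names `normalize` and `reduction_rate` are likewise assumptions. The sketch drops suffixes labeled as discourse-only and measures how many distinct suffix sequences remain.

```python
# Toy sketch only: the suffix inventory and labels below are illustrative
# assumptions, not the paraphrasing rules constructed in the article.

# Suffixes assumed to change the factual meaning of the predicate.
FACTUAL = {"tai": "desire", "nai": "negation", "rare": "potential"}
# Suffixes assumed to serve discourse purposes only (politeness, softening).
DISCOURSE = {"masu", "desu", "n", "kedo", "yo", "ne"}


def normalize(segments):
    """Keep the content word and factual suffixes; drop discourse suffixes."""
    head, *suffixes = segments
    kept = []
    for suffix in suffixes:
        if suffix in FACTUAL:
            kept.append(suffix)
        elif suffix not in DISCOURSE:
            raise ValueError(f"unknown functional expression: {suffix!r}")
    return [head, *kept]


def reduction_rate(predicates):
    """Fraction by which distinct suffix sequences shrink after normalizing."""
    before = {tuple(p[1:]) for p in predicates}
    after = {tuple(normalize(p)[1:]) for p in predicates}
    return (len(before) - len(after)) / len(before)


# Pre-segmented, romanized variants of "want to eat" and "cannot eat".
PREDICATES = [
    ["tabe", "tai"],
    ["tabe", "tai", "desu"],
    ["tabe", "tai", "n", "desu", "kedo"],
    ["tabe", "rare", "nai"],
    ["tabe", "rare", "nai", "n", "desu", "yo"],
]

if __name__ == "__main__":
    for predicate in PREDICATES:
        print("-".join(predicate), "->", "-".join(normalize(predicate)))
    print(f"reduction on toy data: {reduction_rate(PREDICATES):.1%}")
```

On this toy data, five distinct suffix sequences collapse to two. The figure describes only the invented examples and says nothing about the article's reported results. A realistic system would also need morphological analysis to segment the input and to restore a well-formed conjugated surface form after suffixes are removed, which this sketch leaves out.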

Cited By

  • (2021) Japanese Chess Commentary Corpus with Named Entity and Modality Annotation (将棋解説文への固有表現・モダリティ情報アノテーション). Journal of Natural Language Processing 28:3 (847-873). DOI: 10.5715/jnlp.28.847. Online publication date: 2021

Information

Published In

ACM Transactions on Asian Language Information Processing  Volume 12, Issue 3
August 2013
76 pages
ISSN:1530-0226
EISSN:1558-3430
DOI:10.1145/2499955
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 2013
Accepted: 01 October 2012
Received: 01 October 2012
Published in TALIP Volume 12, Issue 3

Author Tags

  1. Paraphrasing
  2. factuality analysis
  3. functional expressions
  4. linguistic theories
  5. opinion mining
  6. sentiment analysis
  7. text mining

Qualifiers

  • Research-article
  • Research
  • Refereed
