The Incremental Use of Morphological Information and Lexicalization in Data-Driven Dependency Parsing

Gülşen Eryiğit, Joakim Nivre & Kemal Oflazer

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Included in the following conference series:

International Conference on Computer Processing of Oriental Languages

Abstract

Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.
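
To make the notion of an inflectional group (IG) concrete, the following minimal sketch splits one morphological analysis into IGs. It is an illustration only, not the parser described in the paper. It assumes the notation common in the Turkish Treebank literature, where `^DB` marks a derivation boundary and `+` separates the root and morphological tags. The function names and the example analysis string are hypothetical choices made for this sketch.

```python
# Illustrative sketch: segmenting a morphological analysis into
# inflectional groups (IGs). The "^DB" boundary marker and "+" tag
# separator are assumed notation, not taken from the abstract.

DERIVATION_BOUNDARY = "^DB"  # assumed marker for a derivation boundary


def split_inflectional_groups(analysis: str) -> list[str]:
    """Split one analysis string into its IGs at derivation boundaries."""
    return [ig for ig in analysis.split(DERIVATION_BOUNDARY) if ig]


def ig_features(ig: str) -> list[str]:
    """Return the root (if present) and the tags of a single IG."""
    return [tag for tag in ig.split("+") if tag]


# Hypothetical example analysis of a single, heavily derived word form.
example = (
    "sağlam+Adj^DB+Verb+Become^DB+Verb+Caus+Pos"
    "^DB+Noun+PastPart+A3sg+P1pl+Loc^DB+Adj+Rel"
)

groups = split_inflectional_groups(example)
for index, ig in enumerate(groups, start=1):
    print(index, ig_features(ig))
```

In an IG-based representation, each of these groups, rather than the whole word form, can act as a unit in the dependency structure, which is the contrast with word-based representations that the abstract draws.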

Author information

Authors and Affiliations

Department of Computer Engineering, Istanbul Technical Univ., 34469, Turkey
Gülşen Eryiğit
School of Mathematics and Systems Engineering, Växjö Univ., 35195, Sweden
Joakim Nivre
Faculty of Engineering and Natural Sciences, Sabancı Univ., 34956, Turkey
Kemal Oflazer

Editor information

Editors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan
Yuji Matsumoto
Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA
Richard W. Sproat
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Kam-Fai Wong
State Key Lab of Intelligent Tech. & Sys., Tsinghua University,
Min Zhang

Cite this paper

Eryiğit, G., Nivre, J., Oflazer, K. (2006). The Incremental Use of Morphological Information and Lexicalization in Data-Driven Dependency Parsing. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_53

DOI: https://doi.org/10.1007/11940098_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science, Computer Science (R0)
