Abstract
Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.
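To make the notion of an inflectional group (IG) concrete, the following is a minimal sketch of splitting one morphological analysis of a word into IGs, so that dependencies can be attached to IGs rather than to whole word forms. It assumes analyses are written as `+`-separated tags with `^DB` marking a derivation boundary. The marker, the tag names, the example analysis, and the names `InflectionalGroup` and `split_into_igs` are all illustrative assumptions, not taken from the paper or from the treebank.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed derivation-boundary marker; adjust to the notation of your data.
DERIVATION_BOUNDARY = "^DB"


@dataclass
class InflectionalGroup:
    """One IG: a part-of-speech tag plus inflectional features."""
    root: Optional[str]  # the root is carried only by the first IG of a word
    tags: List[str]


def split_into_igs(analysis: str) -> List[InflectionalGroup]:
    """Split a single morphological analysis into its inflectional groups."""
    groups = []
    for index, chunk in enumerate(analysis.split(DERIVATION_BOUNDARY)):
        # Drop the empty string produced by a leading '+' after a boundary.
        parts = [p for p in chunk.split("+") if p]
        if not parts:
            raise ValueError(f"empty inflectional group in {analysis!r}")
        if index == 0:
            groups.append(InflectionalGroup(root=parts[0], tags=parts[1:]))
        else:
            groups.append(InflectionalGroup(root=None, tags=parts))
    return groups


if __name__ == "__main__":
    # Hypothetical analysis of a derived word, for illustration only.
    for ig in split_into_igs("ev+Noun+A3sg+Pnon+Loc^DB+Adj+Rel"):
        print(ig)
```

Under these assumptions, a word with one derivation boundary yields two IGs, each of which could serve as a separate token for the parser. Lexicalization and the morphological features mentioned in the abstract would then be drawn from the `root` and `tags` fields of each IG.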
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Eryiğit, G., Nivre, J., Oflazer, K. (2006). The Incremental Use of Morphological Information and Lexicalization in Data-Driven Dependency Parsing. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_53
DOI: https://doi.org/10.1007/11940098_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science, Computer Science (R0)