DOI: 10.3115/1220355.1220365

Deterministic dependency parsing of English text

Published: 23 August 2004

Abstract

This paper presents a deterministic dependency parser based on memory-based learning, which parses English text in linear time. When trained and evaluated on the Wall Street Journal section of the Penn Treebank, the parser achieves a maximum attachment score of 87.1%. Unlike most previous systems, the parser produces labeled dependency graphs, using as arc labels a combination of bracket labels and grammatical role labels taken from the Penn Treebank II annotation scheme. The best overall accuracy obtained for identifying both the correct head and the correct arc label is 86.0%, when restricted to grammatical role labels (7 labels), and 84.4% for the maximum set (50 labels).
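
The two kinds of figures quoted in the abstract, the attachment score (correct head) and the overall accuracy (correct head and correct arc label), can be illustrated with a minimal sketch. The function name, the data layout, and the example labels below are assumptions made for illustration; the abstract does not specify the paper's exact evaluation protocol (for instance, how punctuation is treated).

```python
from typing import List, Tuple

# One token's analysis: (index of its head, arc label).
# Head index 0 stands for an artificial root (an assumption of this sketch).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (unlabeled, labeled) scores as fractions of tokens.

    unlabeled: the token received the correct head.
    labeled:   the token received the correct head and the correct label.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")

    n = len(gold)
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    both_hits = sum(g == p for g, p in zip(gold, predicted))
    return head_hits / n, both_hits / n


# Hypothetical four-token sentence: token 3 gets the wrong head,
# token 4 gets the right head but the wrong label.
gold = [(2, "SBJ"), (0, "ROOT"), (4, "NMOD"), (2, "OBJ")]
pred = [(2, "SBJ"), (0, "ROOT"), (2, "NMOD"), (2, "PRD")]

unlabeled, labeled = attachment_scores(gold, pred)
print(f"unlabeled: {unlabeled:.2f}, labeled: {labeled:.2f}")
```

Because a token can only count toward the labeled score if its head is already correct, the labeled figure can never exceed the unlabeled one, which matches the ordering of the numbers reported above.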

Published In

COLING '04: Proceedings of the 20th international conference on Computational Linguistics
August 2004
1411 pages

Publisher

Association for Computational Linguistics

United States
