Abstract
Tree Adjoining Grammar parsers can use a statistical supertagger as a preprocessor to help disambiguate the category of words and thus speed up the parsing phase dramatically. However, since supertagging errors propagate to the parser, it is vital to keep the word error rate of the supertagger reasonably low. With the very large tagsets that come from extracted grammars, this error rate can reach almost 20% using standard Hidden Markov Model techniques (whereas the error rate of part-of-speech tagging is under 5%). To address this problem, we can trade some ambiguity in the supertagger output for higher accuracy. We propose a new approach to introducing ambiguity in the supertags, looking for a suitable trade-off. The method is based on a representation of the supertags as a feature structure, and consists of grouping the values, or some of the values, of certain features.
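The grouping idea can be illustrated with a minimal sketch. It is not the paper's implementation: the supertag names (`t1` to `t4`), the features (`pos`, `subcat`, `voice`, `modif`), their values, and the choice of which values to group are all hypothetical, picked only to show how merging feature values collapses several supertags into one ambiguous class.

```python
from collections import defaultdict

# Hypothetical supertags, each written as a flat feature structure
# (feature name -> value). These are illustrative, not from the paper.
SUPERTAGS = {
    "t1": {"pos": "V", "subcat": "NP0_NP1", "voice": "active", "modif": "none"},
    "t2": {"pos": "V", "subcat": "NP0_NP1", "voice": "passive", "modif": "none"},
    "t3": {"pos": "V", "subcat": "NP0", "voice": "active", "modif": "none"},
    "t4": {"pos": "N", "subcat": "none", "voice": "none", "modif": "NP"},
}

# Hypothetical grouping: for a chosen feature, some of its values are
# mapped to a shared label. Values not listed are left unchanged.
GROUPING = {"voice": {"active": "any_voice", "passive": "any_voice"}}


def coarsen(features, grouping):
    """Return a hashable copy of a feature structure with grouped values."""
    return tuple(sorted(
        (feat, grouping.get(feat, {}).get(val, val))
        for feat, val in features.items()
    ))


def ambiguity_classes(supertags, grouping):
    """Collect the supertags that become indistinguishable after grouping."""
    classes = defaultdict(list)
    for name, feats in supertags.items():
        classes[coarsen(feats, grouping)].append(name)
    return dict(classes)


classes = ambiguity_classes(SUPERTAGS, GROUPING)
# Average number of original supertags per ambiguous class.
average_ambiguity = len(SUPERTAGS) / len(classes)

for members in classes.values():
    print(sorted(members))
print(f"average ambiguity: {average_ambiguity:.2f}")
```

In this toy setting a tagger would predict one of the coarser classes instead of a single supertag, so the tagset is smaller and the parser is left to choose among the members of the predicted class. Grouping more features or more values increases the ambiguity, which is the trade-off the abstract refers to.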
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Toussenel, F. (2004). Ambiguous Supertagging Using a Feature Structure. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_28
DOI: https://doi.org/10.1007/978-3-540-30120-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive