Parsing Ill-Formed Text Using an Error Grammar

Jennifer Foster & Carl Vogel


Abstract

This paper presents a robust parsing approach designed to address syntactic errors in text. The approach is based on the concept of an error grammar, that is, a grammar of ungrammatical sentences. An error grammar is derived from a conventional grammar on the basis of an analysis of a corpus of observed ill-formed sentences. A robust parsing algorithm is presented, which is applied after a conventional bottom-up parsing algorithm has failed. This algorithm combines a rule from the error grammar with rules from the normal grammar to arrive at a parse for an ungrammatical sentence. The algorithm is applied to 50 test sentences, with encouraging results.
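
The two-stage idea in the abstract (try the normal grammar first, then allow a single error-grammar rule) can be illustrated with a small sketch. This is not the authors' algorithm or grammar: the toy lexicon, the binary rules, the repeated-word error rule `Det -> Det Det`, and the function names `parse` and `robust_parse` are all assumptions made for illustration, and a CYK-style chart stands in for whatever bottom-up parser the paper uses.

```python
# Toy grammar in binary (CNF-like) form. Every rule here is hypothetical.
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}
NORMAL_RULES = {
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
}
# Assumed error rule: a repeated determiner, as in "the the dog".
ERROR_RULES = {("Det", ("Det", "Det"))}


def parse(words, max_errors):
    """Bottom-up chart recognition that allows at most max_errors error rules.

    Returns the fewest error rules needed to derive S, or None if no parse.
    """
    n = len(words)
    # chart[(i, j)] maps a category to the fewest error rules spanning words[i:j].
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = {cat: 0 for cat in LEXICON.get(word, ())}
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for rules, cost in ((NORMAL_RULES, 0), (ERROR_RULES, 1)):
                    for parent, (b, c) in rules:
                        if b in left and c in right:
                            total = left[b] + right[c] + cost
                            best = cell.get(parent, max_errors + 1)
                            if total <= max_errors and total < best:
                                cell[parent] = total
            chart[(i, j)] = cell
    return chart.get((0, n), {}).get("S")


def robust_parse(sentence):
    """Try the normal grammar first; fall back to one error rule on failure."""
    words = sentence.split()
    if parse(words, max_errors=0) is not None:
        return "well-formed"
    if parse(words, max_errors=1) is not None:
        return "ill-formed (one error rule used)"
    return "no parse"
```

Under these assumptions, `robust_parse("the dog chased the cat")` succeeds with the normal grammar alone, while `robust_parse("the the dog chased the cat")` fails the first pass and is recovered by the single error rule. A sentence that would need two error rules is rejected, which mirrors the abstract's description of combining one error-grammar rule with normal rules.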




About this article

Cite this article

Foster, J., Vogel, C. Parsing Ill-Formed Text Using an Error Grammar. Artificial Intelligence Review 21, 269–291 (2004). https://doi.org/10.1023/B:AIRE.0000036259.68818.1e

Issue Date: June 2004
DOI: https://doi.org/10.1023/B:AIRE.0000036259.68818.1e
