Abstract
This paper presents a robust parsing approach designed to handle syntactic errors in text. The approach rests on the notion of an error grammar, that is, a grammar of ungrammatical sentences. The error grammar is derived from a conventional grammar by analysing a corpus of observed ill-formed sentences. The robust parsing algorithm is applied only after a conventional bottom-up parsing algorithm has failed: it combines a rule from the error grammar with rules from the normal grammar to arrive at a parse for the ungrammatical sentence. The algorithm is applied to 50 test sentences, with encouraging results.
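To make the idea of combining an error rule with normal rules concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a toy context-free grammar, a toy lexicon and two invented error rules (a missing determiner and a repeated verb); all names, including `robust_parse` and `max_errors`, are hypothetical. Rather than running a conventional parser first and a robust parser second, the sketch folds both passes into a single search for the parse that uses the fewest error rules: a count of 0 corresponds to a conventional parse, and a count of 1 to a parse that uses exactly one error rule.

```python
from functools import lru_cache

# Toy normal grammar (hypothetical, for illustration only).
NORMAL = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V", "NP")],
}
# Toy error grammar: rules that describe ill-formed constituents.
ERROR = {
    "NP": [("N",)],            # missing determiner
    "VP": [("V", "V", "NP")],  # repeated verb
}
LEXICON = {"the": "Det", "a": "Det", "dog": "N", "cat": "N", "chased": "V"}


def robust_parse(tokens, max_errors=1):
    """Return (errors_used, tree) for the parse of `tokens` as S that uses
    the fewest error rules, or None if that needs more than `max_errors`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def best(sym, i, j):
        """Fewest-error analysis of tokens[i:j] as `sym`, or None."""
        candidates = []
        if j - i == 1 and LEXICON.get(tokens[i]) == sym:
            candidates.append((0, (sym, tokens[i])))
        for rules, cost in ((NORMAL, 0), (ERROR, 1)):
            for rhs in rules.get(sym, ()):
                seq = best_seq(rhs, i, j)
                if seq is not None:
                    errors, children = seq
                    # Nodes built by an error rule are marked with "*".
                    label = sym if cost == 0 else sym + "*"
                    candidates.append((errors + cost, (label, *children)))
        return min(candidates, key=lambda c: c[0]) if candidates else None

    @lru_cache(maxsize=None)
    def best_seq(rhs, i, j):
        """Fewest-error match of the symbol sequence `rhs` to tokens[i:j]."""
        if not rhs:
            return (0, ()) if i == j else None
        head, rest = rhs[0], rhs[1:]
        candidates = []
        # `head` takes at least one token and leaves one per remaining symbol.
        for k in range(i + 1, j - len(rest) + 1):
            first = best(head, i, k)
            tail = best_seq(rest, k, j) if first is not None else None
            if tail is not None:
                candidates.append((first[0] + tail[0], (first[1], *tail[1])))
        return min(candidates, key=lambda c: c[0]) if candidates else None

    result = best("S", 0, len(tokens))
    if result is None or result[0] > max_errors:
        return None
    return result


if __name__ == "__main__":
    for sentence in ("the dog chased the cat",
                     "the dog chased cat",
                     "dog chased cat"):
        print(sentence, "->", robust_parse(sentence.split()))
```

With the default `max_errors=1`, a sentence that needs two error rules (here, two missing determiners) is rejected, which mirrors the restriction in the abstract to a single rule from the error grammar. The toy grammar is deliberately non-recursive; a left-recursive grammar would need a proper chart parser in place of this memoised search.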
Foster, J., Vogel, C. Parsing Ill-Formed Text Using an Error Grammar. Artificial Intelligence Review 21, 269–291 (2004). https://doi.org/10.1023/B:AIRE.0000036259.68818.1e