Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/964001.964011acmconferencesArticle/Chapter ViewAbstractPublication PagespoplConference Proceedingsconference-collections
Article

Parsing expression grammars: a recognition-based syntactic foundation

Published: 01 January 2004 Publication History

Abstract

For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ambiguity is crucial to their original purpose of modelling natural languages, but this very power makes it unnecessarily difficult both to express and to parse machine-oriented languages using CFGs. Parsing Expression Grammars (PEGs) provide an alternative, recognition-based formal foundation for describing machine-oriented syntax, which solves the ambiguity problem by not introducing ambiguity in the first place. Where CFGs express nondeterministic choice between alternatives, PEGs instead use prioritized choice. PEGs address frequently felt expressiveness limitations of CFGs and REs, simplifying syntax definitions and making it unnecessary to separate their lexical and hierarchical components. A linear-time parser can be built for any PEG, avoiding both the complexity and fickleness of LR parsers and the inefficiency of generalized CFG parsing. While PEGs provide a rich set of operators for constructing grammars, they are reducible to two minimal recognition schemas developed around 1970, TS/TDPL and gTS/GTDPL, which are here proven equivalent in effective recognition power.

References

[1]
Stephen Robert Adams. Modular Grammars for Programming Language Prototyping. PhD thesis, University of Southampton, 1991.
[2]
Alfred V. Aho. Indexed grammars---an extension of context-free grammars. Journal of the ACM, 15(4):647--671, October 1968.
[3]
Alfred V. Aho and Jeffrey D. Ullman. The Theory of Parsing, Translation and Compiling - Vol. I: Parsing. Prentice Hall, Englewood Cliffs, N.J., 1972.
[4]
Alexander Birman. The TMG Recognition Schema. PhD thesis, Princeton University, February 1970.
[5]
Alexander Birman and Jeffrey D. Ullman. Parsing algorithms with backtrack. Information and Control, 23(1):1--34, August 1973.
[6]
Claus Brabrand, Michael I. Schwartzbach, and Mads Vanggaard. The metafront system: Extensible parsing and transformation. In Third Workshop on Language Descriptions, Tools and Applications, Warsaw, Poland, April 2003.
[7]
Bryan Ford. Packrat parsing: a practical linear-time algorithm with backtracking. Master's thesis, Massachusetts Institute of Technology, Sep 2002.
[8]
Bryan Ford. Packrat parsing: Simple, powerful, lazy, linear time. In Proceedings of the 2002 International Conference on Functional Programming, Oct 2002.
[9]
Dick Grune and Ceriel J.H. Jacobs. Parsing Techniques--A Practical Guide. Ellis Horwood, Chichester, England, 1990.
[10]
J. Heering, P. R. H. Hendriks, P. Klint, and J. Rekers. The syntax definition formalism SDF--reference manual--. SIG-PLAN Notices, 24(11):43--75, 1989.
[11]
Simon Peyton Jones and John Hughes (editors). Haskell 98 Report, 1998. http://www.haskell.org.
[12]
Aravind K. Joshi and Yves Schabes. Tree-adjoining grammars. Handbook of Formal Languages, 3:69--124, 1997.
[13]
C.H.A. Koster. Affix grammars. In J.E.L. Peck, editor, AL-GOL 68 Implementation, pages 95--109, Amsterdam, 1971. North-Holland Publ. Co.
[14]
Lillian Lee. Fast context-free grammar parsing requires fast boolean matrix multiplication. Journal of the ACM, 49(1):1--15, 2002.
[15]
Daan Leijen. Parsec, a fast combinator parser. http://www.cs.uu.nl/daan.
[16]
Sun Microsystems. Java compiler compiler (JavaCC). https://javacc.dev.java.net/.
[17]
Sun Microsystems. JavaCC: LOOKAHEAD minitutorial. https://javacc.dev.java.net/doc/lookahead.html.
[18]
Alexander Okhotin. Conjunctive grammars. Journal of Automata, Languages and Combinatorics, 6(4):519--535, 2001.
[19]
International Standards Organization. Syntactic metalan-guage -- Extended BNF, 1996. ISO/IEC 14977.
[20]
Terence J. Parr and Russell W. Quong. Adding semantic and syntactic predicates to LL(k)--pred-LL(k).InProceedings of the International Conference on Compiler Construction, Edinburgh, Scotland, April 1994.
[21]
Terence J. Parr and Russell W. Quong. ANTLR: A Predicated-LL(k) parser generator. Software Practice and Experience, 25(7):789--810, 1995.
[22]
Daniel J. Salomon and Gordon V. Cormack. Scannerless NSLR(1) parsing of programming languages. In Proceedings of the ACM SIGPLAN'89 Conference on Programming Language Design and Implementation (PLDI), pages 170--178, Jul 1989.
[23]
Amir Shpilka. Lower bounds for matrix product. In IEEE Symposium on Foundations of Computer Science, pages 358--367, 2001.
[24]
Edward Stabler. Derivational minimalism. Logical Aspects of Computational Linguistics, pages 68--95, 1997.
[25]
Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley, 3rd edition, June 1997.
[26]
Kuo-Chung Tai. Noncanonical SLR(1) grammars. ACM Transactions on Programming Languages and Systems, 1(2):295--320, Oct 1979.
[27]
M.G.J. van den Brand, J. Scheerder, J.J. Vinju, and E. Visser. Disambiguation filters for scannerless generalized LR parsers. In Compiler Construction, 2002.
[28]
A. van Wijngaarden, B.J. Mailloux, J.E.L. Peck, C.H.A. Koster, M. Sintzoff, C.H. Lindsey, L.G.L.T. Meertens, and R.G. Fisker. Report on the algorithmic language ALGOL 68. Numer. Math., 14:79--218, 1969.
[29]
Eelco Visser. A family of syntax definition formalisms. Technical Report P9706, Programming Research Group, University of Amsterdam, 1997.
[30]
Niklaus Wirth. What can we do about the unnecessary diversity of notation for syntactic descriptions. Communications of the ACM, 20(11):822--823, November 1977.

Cited By

View all
  • (2025)Hedge Automata Revisited: Transforming Texts to and from XMLAutomated Technology for Verification and Analysis10.1007/978-3-031-78750-8_6(117-136)Online publication date: 12-Feb-2025
  • (2024)Trane: Musical Janet on the WebProceedings of the 12th ACM SIGPLAN International Workshop on Functional Art, Music, Modelling, and Design10.1145/3677996.3678285(30-35)Online publication date: 2-Sep-2024
  • (2024)Daedalus: Safer Document ParsingProceedings of the ACM on Programming Languages10.1145/36564108:PLDI(816-840)Online publication date: 20-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
POPL '04: Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
January 2004
364 pages
ISBN:158113729X
DOI:10.1145/964001
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 39, Issue 1
    POPL '04
    January 2004
    352 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/982962
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 January 2004

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. BNF
  2. GTDPL
  3. TDPL
  4. context-free grammars
  5. lexical analysis
  6. packrat parsing
  7. parsing expression grammars
  8. regular expressions
  9. scannerless parsing
  10. syntactic predicates
  11. unified grammars

Qualifiers

  • Article

Conference

POPL04

Acceptance Rates

POPL '04 Paper Acceptance Rate 29 of 176 submissions, 16%;
Overall Acceptance Rate 860 of 4,328 submissions, 20%

Upcoming Conference

POPL '26

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)141
  • Downloads (Last 6 weeks)18
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Hedge Automata Revisited: Transforming Texts to and from XMLAutomated Technology for Verification and Analysis10.1007/978-3-031-78750-8_6(117-136)Online publication date: 12-Feb-2025
  • (2024)Trane: Musical Janet on the WebProceedings of the 12th ACM SIGPLAN International Workshop on Functional Art, Music, Modelling, and Design10.1145/3677996.3678285(30-35)Online publication date: 2-Sep-2024
  • (2024)Daedalus: Safer Document ParsingProceedings of the ACM on Programming Languages10.1145/36564108:PLDI(816-840)Online publication date: 20-Jun-2024
  • (2024)Paguroidea: Fused Parser Generator with Transparent Semantic ActionsProceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction10.1145/3640537.3641563(39-48)Online publication date: 17-Feb-2024
  • (2024)Efficient Matching of Regular Expressions with Lookaround AssertionsProceedings of the ACM on Programming Languages10.1145/36329348:POPL(2761-2791)Online publication date: 5-Jan-2024
  • (2024)Robust Verification of PEG Parser Interpreters2024 IEEE Security and Privacy Workshops (SPW)10.1109/SPW63631.2024.00022(180-191)Online publication date: 23-May-2024
  • (2024)The MuLeCo Project: A learner corpus of L1 German learners of Romance languagesAmpersand10.1016/j.amper.2024.100200(100200)Online publication date: Oct-2024
  • (2023)Ordered Context-Free Grammars RevisitedElectronic Proceedings in Theoretical Computer Science10.4204/EPTCS.388.13388(140-153)Online publication date: 15-Sep-2023
  • (2023)On Parsing Programming Languages with Turing-Complete ParserMathematics10.3390/math1107159411:7(1594)Online publication date: 25-Mar-2023
  • (2023)Derivatives of Context-free Grammars with LookaheadJournal of Information Processing10.2197/ipsjjip.31.42131(421-431)Online publication date: 2023
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media