-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
List entrypoints reverse parsed lists (Haskell/C) #163
Comments
I think the problem is not so much the order of the rules but the entry points (wich, if unspecified, will be the first category). Many backends don't seem to support having a list as an entry point. |
Maybe entry points could be handled in the general layer rather in the backends. |
In the generated cup file I find:
It seems that the start non-terminals are not translated. |
The botched entrypoint in the cup file is easily fixed. (Use The Haskell and C backends print a parsed top-level list backwards. (Not an issue with Java and CPP.) The reason is that e.g. Haskell generates left-recursive Happy rules for lists, generating snoc-lists. These are reversed when plugged into other AST nodes, but not at the top-level. |
Ocaml: Only the generated Test file was broken. Java: Only the name of the entrypoint needed to put right (identCat instead of show)
Ocaml: Only the generated Test file was broken. Java: Only the name of the entrypoint needed to put right (identCat instead of show)
Reversed list printing in Haskell with this test case:
I could not reproduce the problem in the simpler setting:
It seems that then the right-to-left recursion transformation does not kick in. UPDATE: fixed for Haskell by removing the right-to-left recursion transformation, but still open for C. |
That left recursion for LR parsing is strictly more performant than right recursion is not a categorical truth. Happy uses a heap-allocated parser stack and BNFC-generated parsers produce ASTs, thus, left recursion does not save anything in comparison to right recursion. Bison has hard stack limits (10000 if not set otherwise), thus, left-recursion may be safer. For instance, the
We might thus remove the optimization for the Haskell backend, yet keep it for C. See also section "A few words on performance" of blog http://gallium.inria.fr/blog/lr-lists/ |
Default is only 10.000, which seems an anachronism in 2019. Parsing right-recursive categories needs O(n) stack size, thus, YYMAXDEPTH is a hard limit on the depth of right recursion. BNFC attempts to rewrite right recursion to left recursion, but only when it can be done easily. Thus, we might still have right recursion left in the generated Bison grammar.
For LR parsers that allocate the parse stack in the heap, there is only a minimal difference between left- and right-recursion. Right recursion requires O(n) stack in comparison of O(1) stack of left recursion. However, the ASTs constructed are the same, and their nodes are stored in the data stack, which has thus the same size for left- and right recursion. Only the control stack is different, but the O(n) extra machine words are dominated by the size of the ASTs. Asymptotic space complexity (linear) is certainly the same for both types of recursion.
For LR parsers that allocate the parse stack in the heap, there is only a minimal difference between left- and right-recursion. Right recursion requires O(n) stack in comparison of O(1) stack of left recursion. However, the ASTs constructed are the same, and their nodes are stored in the data stack, which has thus the same size for left- and right recursion. Only the control stack is different, but the O(n) extra machine words are dominated by the size of the ASTs. Asymptotic space complexity (linear) is certainly the same for both types of recursion. The ocamlyacc parser exhibits stack_overflow exceptions for lists of e.g. length 1.000.000, no matter whether left or right recursion.
C++ gets this list reversed:
|
ANTLR seems to have a bug blocking this issue for the Java/ANTLR backend: antlr/antlr4#2689 |
…ormed entrypoints This fixes the problem with reversed printing of list entrypoints which have been subjected to the transformation from right to left recursion.
Since #272 is fixed this issue should be resolved completely. |
BNFC release 2.8, and up to the latest commit.
The Java backend is sensitive to the order of the macros/productions.
For instance the following gives errors, both using CUP and ANTLR4:
Instead
seems to work fine.
I did not try all the other backends, but Haskell seems to work without problem with both formulations.
The text was updated successfully, but these errors were encountered: