List entrypoints reverse parsed lists (Haskell/C) #163

gapag · 2016-02-02T05:51:15Z

BNFC release 2.8, and up to the latest commit.
The Java backend is sensitive to the order of the macros/productions.
For instance the following gives errors, both using CUP and ANTLR4:

separator Exp ",";
List . Something ::=  [Exp];
A . Exp ::= "a";
B . Exp ::= "b";

Instead

List . Something ::=  [Exp];
separator Exp ",";
A . Exp ::= "a";
B . Exp ::= "b";

seems to work fine.
I did not try all the other backends, but Haskell seems to work without problem with both formulations.

gdetrez · 2016-02-04T15:48:08Z

I think the problem is not so much the order of the rules but the entry points (which, if unspecified, default to the first category). Many backends don't seem to support having a list as an entry point.
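To make the point about the default entry point concrete, here is a minimal sketch. It assumes that a `separator Exp ","` pragma introduces rules for the list category `[Exp]` at the position where it appears; the function names `default_entrypoint` and `is_list_category` are hypothetical and not part of BNFC.

```python
def default_entrypoint(rule_categories):
    """Pick the category of the first rule, as described for unspecified entry points."""
    return rule_categories[0]


def is_list_category(cat):
    """A list category is written in brackets, e.g. '[Exp]'."""
    return cat.startswith("[") and cat.endswith("]")


# Categories in order of appearance (assumption: the separator pragma
# contributes rules for [Exp] where it is written).
separator_first = ["[Exp]", "Something", "Exp", "Exp"]   # grammar that fails
something_first = ["Something", "[Exp]", "Exp", "Exp"]   # grammar that works

for name, cats in [("separator first", separator_first),
                   ("Something first", something_first)]:
    entry = default_entrypoint(cats)
    print(name, "->", entry, "(list)" if is_list_category(entry) else "(plain)")
```

Under these assumptions, only the first grammar ends up with a list category as its entry point, which would explain why reordering the rules makes the error disappear.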

andreasabel · 2018-01-01T23:10:40Z

Maybe entry points could be handled in the general layer rather than in the backends.

andreasabel · 2018-01-02T09:11:03Z

In the generated cup file I find:

start with [Exp];

ListExp ::= ...

It seems that the start non-terminals are not translated.

andreasabel · 2019-05-11T18:30:51Z

The botched entrypoint in the cup file is easily fixed. (Use identCat instead of show.)
For the OCaml backend, just the generated Test program was faulty.
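The difference between the two renderings can be sketched as follows. This is an illustration only: `show_cat` and `ident_cat` are hypothetical Python stand-ins for the `show` and `identCat` functions mentioned above, and the handling of nested lists is an assumption.

```python
def show_cat(cat):
    """Display form of a category: list categories keep their brackets."""
    return cat


def ident_cat(cat):
    """Identifier form: strip each pair of brackets and prefix 'List' instead."""
    depth = 0
    while cat.startswith("[") and cat.endswith("]"):
        cat = cat[1:-1]
        depth += 1
    return "List" * depth + cat


# The botched and the repaired start declaration for the cup file.
print("start with " + show_cat("[Exp]") + ";")
print("start with " + ident_cat("[Exp]") + ";")
```

The second line names the same non-terminal that the generated `ListExp ::= ...` production defines, so the start symbol and the rule agree.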

The Haskell and C backends print a parsed top-level list backwards. (Not an issue with Java and CPP.) The reason is that e.g. Haskell generates left-recursive Happy rules for lists, generating snoc-lists. These are reversed when plugged into other AST nodes, but not at the top-level.
A more principled solution may be to have rules for snoc-lists and then rules for the corresponding cons-lists that just apply the reversal function.
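The proposal can be sketched in a few lines. This is a simulation, not generated parser code: `parse_snoc` mimics a left-recursive rule whose action puts each new element in front of the accumulated list, and `parse_list` is the hypothetical wrapper rule that applies the reversal exactly once.

```python
def parse_snoc(tokens):
    """Left-recursive rule  Xs ::= Xs X : the accumulator ends up in reverse input order."""
    acc = []
    for tok in tokens:
        acc = [tok] + acc   # semantic action: put the newest element in front
    return acc


def parse_list(tokens):
    """Cons-list rule on top of the snoc-list rule: reverse once, at every use site."""
    return list(reversed(parse_snoc(tokens)))


print(parse_snoc([1, 2, 3]))   # what a top-level snoc-list entry point would return
print(parse_list([1, 2, 3]))   # what the wrapper rule returns
```

If every reference to the list category, including the entry point, goes through the wrapper, the reversal can no longer be forgotten at the top level.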

andreasabel · 2019-11-23T12:16:18Z

Reversed list printing in Haskell with this test case:

terminator Exp "";
Lit. Exp ::= Integer ;

I could not reproduce the problem in the simpler setting:

terminator Integer "";

It seems that then the right-to-left recursion transformation does not kick in.

UPDATE: fixed for Haskell by removing the right-to-left recursion transformation, but still open for C.

andreasabel · 2019-11-23T12:55:21Z

That left recursion for LR parsing is strictly more performant than right recursion is not a categorical truth.
It is true that left recursion uses O(1) stack and right recursion O(n). However, the AST/parse tree stored in the single cell in case of left recursion is of size O(n), whereas each stack entry in case of right recursion has size O(1). So it is O(n) no matter which one is used.

Happy uses a heap-allocated parser stack and BNFC-generated parsers produce ASTs, thus, left recursion does not save anything in comparison to right recursion.
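The stack argument can be checked with a small shift-reduce simulation. It is a sketch under simplifying assumptions: the grammar is just a list of tokens, the stack holds semantic values only, and the function names are made up for this illustration.

```python
def left_recursive(tokens):
    """Simulate  L ::= (empty) | L x : reduce after every shift."""
    stack = [[]]                      # reduce L ::= (empty)
    max_depth = len(stack)
    for tok in tokens:
        stack.append(tok)             # shift x
        max_depth = max(max_depth, len(stack))
        x = stack.pop()
        xs = stack.pop()
        xs.append(x)                  # reduce L ::= L x
        stack.append(xs)
    return stack[0], max_depth


def right_recursive(tokens):
    """Simulate  L ::= (empty) | x L : all tokens are shifted before any reduction."""
    stack = []
    max_depth = 0
    for tok in tokens:
        stack.append(tok)             # shift x
        max_depth = max(max_depth, len(stack))
    stack.append([])                  # reduce L ::= (empty)
    max_depth = max(max_depth, len(stack))
    while len(stack) > 1:
        xs = stack.pop()
        x = stack.pop()
        stack.append([x] + xs)        # reduce L ::= x L
    return stack[0], max_depth


n = 1000
left_ast, left_depth = left_recursive(range(n))
right_ast, right_depth = right_recursive(range(n))
print(left_depth, right_depth, left_ast == right_ast)
```

Both variants build the same n-element result, so the total memory is linear either way; only the number of stack cells differs, which matters when the stack has a fixed limit.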

Bison has hard stack limits (10000 if not set otherwise), thus, left-recursion may be safer. For instance, the bnfc -c generated bison parser for input separator [Integer] ""; reports

error: 9999,0: memory exhausted at 9999

We might thus remove the optimization for the Haskell backend, yet keep it for C.

See also section "A few words on performance" of blog http://gallium.inria.fr/blog/lr-lists/

The default YYMAXDEPTH is only 10,000, which seems an anachronism in 2019. Parsing right-recursive categories needs O(n) stack size; thus, YYMAXDEPTH is a hard limit on the depth of right recursion. BNFC attempts to rewrite right recursion to left recursion, but only when it can be done easily. Thus, there might still be right recursion left in the generated Bison grammar.

For LR parsers that allocate the parse stack on the heap, there is only a minimal difference between left and right recursion. Right recursion requires O(n) stack, compared with O(1) for left recursion. However, the ASTs constructed are the same, and their nodes are stored in the data stack, which thus has the same size for left and right recursion. Only the control stack differs, but the O(n) extra machine words are dominated by the size of the ASTs. Asymptotic space complexity (linear) is certainly the same for both types of recursion. The ocamlyacc parser exhibits stack_overflow exceptions for lists of e.g. length 1,000,000, no matter whether left or right recursion is used.

andreasabel · 2019-11-24T01:08:46Z

C++ gets this list reversed:

separator Integer "";

andreasabel · 2019-11-24T13:58:00Z

ANTLR seems to have a bug blocking this issue for the Java/ANTLR backend: antlr/antlr4#2689
The .g4 parser specification produced by BNFC looks right.

…ormed entrypoints This fixes the problem with reversed printing of list entrypoints which have been subjected to the transformation from right to left recursion.

andreasabel · 2019-12-30T20:40:31Z

Since #272 is fixed this issue should be resolved completely.

gdetrez added bug C++ Java OCaml C labels Feb 4, 2016

andreasabel mentioned this issue Jan 2, 2018

Lists of lists #221

Open

andreasabel self-assigned this May 11, 2019

andreasabel added this to the 2.8.3 milestone May 11, 2019

andreasabel added the Haskell label May 11, 2019

andreasabel added a commit that referenced this issue May 13, 2019

[ #163 ] ListCat entrypoint: fixed for OCaml and Java

c2c7415

Ocaml: Only the generated Test file was broken. Java: Only the name of the entrypoint needed to put right (identCat instead of show)

andreasabel added a commit that referenced this issue May 24, 2019

[ #163 ] ListCat entrypoint: fixed for OCaml and Java

8c26bc4

Ocaml: Only the generated Test file was broken. Java: Only the name of the entrypoint needed to put right (identCat instead of show)

andreasabel removed C++ Java OCaml labels Aug 27, 2019

andreasabel modified the milestones: 2.8.3, 2.8.4 Aug 27, 2019

andreasabel changed the title ~~Java backend does not want "separator" or "terminator" as first line~~ List entrypoints reverse parsed lists (Haskell/C) Aug 27, 2019

andreasabel added the lists Concerning list categories and separator/terminator/delimiter pragmas label Nov 23, 2019

andreasabel mentioned this issue Nov 23, 2019

More powerful list declarations #268

Open

andreasabel removed the Haskell label Nov 23, 2019

andreasabel added a commit that referenced this issue Nov 23, 2019

[ #163 ] Haskell/Profile: removed unfinished right-to-left recursion …

d1829e3

…"optimization"

andreasabel added the C++ label Nov 24, 2019

andreasabel added Java Java/ANTLR blocked Blocked by some other issue and removed Java C C++ labels Nov 24, 2019

andreasabel added a commit that referenced this issue Nov 24, 2019

[ #163 ] Haskell/GADT: allow empty set of user defined categories

682ef83

andreasabel added a commit that referenced this issue Nov 24, 2019

[ #163 ] Java: allow empty set of user defined categories

6a6c83d

andreasabel added a commit that referenced this issue Nov 24, 2019

[ #163 ] Ocaml: allow entrypoint w/o user defined categories

58447fe

andreasabel added a commit that referenced this issue Nov 24, 2019

[ #163 ] Pygments: work with empty set of keywords

ccc8690

andreasabel added a commit that referenced this issue Nov 24, 2019

[ re #163 ] Haskell/CNF: Test: print also list categories

9a672e2

andreasabel closed this as completed in 5719ad6 Nov 24, 2019

andreasabel mentioned this issue Nov 25, 2019

ANTLR backend: start rules needed #272

Closed

andreasabel removed Java/ANTLR blocked Blocked by some other issue labels Dec 30, 2019

andreasabel added a commit that referenced this issue Dec 30, 2019

[ docs ] left-recursion transformation disabled for happy (#163)

792349f

andreasabel added the entrypoints Concerning entry points for the parser and the `entrypoints` directive label Mar 24, 2021
