Add ANTLR4 Java backend #155

gapag · 2015-10-23T14:15:19Z

I wrote an ANTLR4 backend that will hopefully replace the old J(F)Lex+CUP one.
ANTLR is a lexer and parser generator that originally targeted Java. It has much in common with BNFC, but it deals mainly with parse trees, whereas BNFC focuses on ASTs.

The motivation is that

- J(F)Lex and CUP are old, and at least one of them is no longer supported.
- ANTLR has interesting features for manipulating parse trees, which could be integrated into BNFC.
- ANTLR also has other target languages (e.g. Python).

The message of commit c7ba720 contains further details about what has been done and what remains to be done.

In summary, what I did is

- make ANTLR a non-default option for the Java backend
- CF to ANTLR lexer/parser translation
- Java Makefiles calling ANTLR
- Blacklist Java keywords (abstract, native ...) (ONLY for ANTLR backend)
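A minimal sketch of what blacklisting Java keywords could look like. This is not the code from the PR: the keyword list is a small sample, and both the `sanitize` name and the trailing-underscore renaming scheme are assumptions made for illustration.

```haskell
-- A deliberately incomplete sample of Java reserved words.
javaKeywords :: [String]
javaKeywords = ["abstract", "native", "class", "static", "void"]

-- Hypothetical renaming scheme: append an underscore to any identifier
-- that would collide with a Java keyword, leave everything else alone.
sanitize :: String -> String
sanitize name
  | name `elem` javaKeywords = name ++ "_"
  | otherwise                = name
```

With this sketch, a grammar category named `native` would be emitted as `native_`, while `Expr` stays unchanged.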

System tests pass successfully on my machine.

The .idea directory is created by the IDE I use, and you can of course leave it out of the merge -- unless other people want to use the same project setup.

I did not set ANTLR as the default Java backend because I'd like to have more people trying it before promoting it.

Thanks for your time.

This commit applies the changes without structurally changing the status quo. Currently the JavaLexerParser type defined in BNFC.Options is pushed down to other Java-specific files of this commit. This has to be removed for cleanliness, since the tool-specific features of JLex, JFlex and ANTLR4 can be decided once and for all before code generation begins.

The different backends are being differentiated. The file Java.hs now prepares structures containing parsing and lexing functions contained in the modules relative to the various lexing/parsing tools.
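The idea of deciding the tool once and then passing around a structure of functions can be sketched roughly as below. All names here (`Grammar`, `LexerTool`, `LexerBackend`, `selectLexer`) are hypothetical stand-ins, not the types actually defined in Java.hs or BNFC.Options, and the rendering function is only a placeholder.

```haskell
-- Hypothetical stand-in for BNFC's grammar type.
type Grammar = [String]

-- The tools mentioned in this PR.
data LexerTool = JLex | JFlex | Antlr4
  deriving (Show, Eq)

-- A record bundling everything the generator needs from a lexer tool.
data LexerBackend = LexerBackend
  { toolName    :: String
  , renderLexer :: Grammar -> String
  }

-- The tool is inspected exactly once, here; later code only sees the record.
selectLexer :: LexerTool -> LexerBackend
selectLexer tool = case tool of
  JLex   -> mk "JLex"
  JFlex  -> mk "JFlex"
  Antlr4 -> mk "ANTLR4"
  where
    -- Placeholder renderer; a real backend would emit a lexer specification.
    mk name = LexerBackend name
      (\g -> name ++ " lexer for " ++ show (length g) ++ " rules")
```

Code downstream of `selectLexer` calls `renderLexer backend grammar` and never needs to pattern match on the tool again.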

DONE:
- ANTLR lexer/parser translation
- Makefiles
- Blacklist Java keywords (abstract, native ...) ONLY for ANTLR backend
- System tests pass correctly

ToDo:
- Composable visitors for ASTs
- Make integer and double literals be varsize (issue 59)
- Generate some mapping AST <--> parse trees to take advantage of ANTLR's facilities
- Add position information
- Add layout directives for Java backend
- Implement the following test schema: cup accepts ==> antlr accepts and cup AST equals antlr AST
- Make ANTLR4 support abstract to use it in other backends (C#, Python, JavaScript)

Known bugs: name clashes are not entirely avoided
- Implicit tokens are given the name Surrogate_id_SYMB_<n> where <n> is an increasing integer. The motivation is that ANTLR lexer ids must start with an uppercase letter. The current implementation does nothing to prevent clashes with user-defined tokens. ANTLR will complain about this when trying to compile.

Test.java: The generated Test.java is more complicated than the one for the J(F)lex/CUP backend. This is because ANTLR by default 1) does not terminate parsing/lexing abruptly if an error occurs, but tries to continue, and 2) does not expect to make sense of everything until EOF.
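The known bug about implicit token names can be illustrated with a small sketch. Only the `Surrogate_id_SYMB_<n>` pattern comes from the commit message; the function names, the numbering starting at 0, and the clash check itself are assumptions (the message states that the current implementation performs no such check).

```haskell
import Data.List (intersect)

-- Pair each implicit symbol with a surrogate lexer id.
-- Starting the counter at 0 is an assumption.
surrogateNames :: [String] -> [(String, String)]
surrogateNames symbols =
  zip symbols ["Surrogate_id_SYMB_" ++ show n | n <- [0 :: Int ..]]

-- Hypothetical check: which generated ids collide with user-defined tokens?
clashes :: [String] -> [String] -> [String]
clashes userTokens symbols =
  map snd (surrogateNames symbols) `intersect` userTokens
```

A non-empty result from `clashes` would flag exactly the situation in which ANTLR is reported to complain at compile time.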

gapag · 2015-10-23T17:44:49Z

Oops. Doctests fail. I will fix this on Monday. I will also fix the classpath in the example tests; right now there is no ANTLR library in javatools.jar. Sorry!

gdetrez · 2015-10-29T11:03:53Z

.idea/libraries/org_antlr_antlr4_4_5_1_1.xml

@@ -0,0 +1,12 @@
+<component name="libraryTable">


This seems a bit too IDE-specific to be in the repository.
Although, I like the idea of using some package manager for java (maven?) to get the dependencies instead of the (rather ugly) javatools.jar...

You can remove it; by itself it does not help to automate the dependencies -- it is not a Maven input file AFAIK.

I write the following just so that you can follow my line of thought and see if it is the same as yours:
I guess it would be necessary to have some Maven file with such dependencies and to run Maven when needed while running the tests, which implies you should have Maven on your test server.

I don't have direct experience using Maven anyway, I usually use it through IDEs.

Yes, that's what I was thinking. Maven is usually available as a distribution package so it shouldn't be a problem to have it on the test server.

No problem, I was just thinking out loud :-)
Can you remove the .idea directory though?

Yes, I will make some commits to address all your comments.

Before tonight.

gdetrez · 2015-10-29T12:21:31Z

Hi Gabriele.

Sorry it took me so long to get back to you on this. I think this is definitely a great addition to bnfc as we have sort of fallen behind with respect to the parsing tools available for java.

Globally, this looks good and coherent with the rest of the code. (Thanks in particular for taking the time to add system tests!!). So most of my comments at this point are about polishing.

Can you look at the following in addition to the line comments:

I don't like TODOs and FIXMEs. First they have a tendency to stick around forever and second they make it difficult to have a discussion about the problem. I suggest the following:
- if it's trivial, just fix it
- if it's more work but it doesn't really matter (like potential refactorings), just remove the TODO
- for anything else (future work, questions, potential bug...), open an issue
There are a few places where the code could be simplified a bit, e.g. ( "PARSER", (executable parmake)) -> ( "PARSER", executable parmake). You can run hlint at least on the new files to get reformatting suggestions (you can also run hlint on the files you only modified, but you'll get a lot of suggestions about the old code as well... :-/)

As an FYI, I'm generally trying to follow this as a style guide, although, as with hlint, the legacy code is still a mess at this point...

gapag · 2015-10-29T12:29:47Z

Great, I will try to stick to the style guide and polish the code as much as I can. I will push the commits before tonight. Thanks to you for reviewing the code.

gapag · 2015-11-19T10:43:56Z

I meant haddock not HSpec. Shame on me.

gdetrez · 2015-11-19T16:13:59Z

source/src/BNFC/Backend/Java/CFtoJLex15.hs

+
+cf2jlex' str cf = cf2jlex str cf False
+cf2jflex' str cf = cf2jlex str cf True
+


I still think this needs fixing. I suggest the following solution: reorder the arguments of cf2jlex and use cf2jlex True and cf2jlex False instead.

Or, if you really want two different functions, that's ok, but give them names without quotes.

I'll follow your advice, I'll fix this before next week.
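A rough sketch of the reordering suggested above, assuming the Boolean selects JFlex when True (as the two wrappers in the diff imply). The `CF` type and the function body are placeholders, the real signature in CFtoJLex15.hs is not shown in this thread, and the names `jlexSpec` and `jflexSpec` are hypothetical.

```haskell
-- Hypothetical stand-in for BNFC's grammar type.
type CF = [String]

-- Flag first, so that the function can be partially applied.
-- True selects JFlex, False selects JLex.
cf2jlex :: Bool -> String -> CF -> String
cf2jlex jflex str cf = tool ++ ":" ++ str ++ ":" ++ show (length cf)
  where
    tool = if jflex then "JFlex" else "JLex"

-- If two named functions are still wanted, they need no primes
-- and no explicit arguments.
jlexSpec, jflexSpec :: String -> CF -> String
jlexSpec  = cf2jlex False
jflexSpec = cf2jlex True
```

Call sites can then write `cf2jlex True` or `cf2jlex False` directly, which makes the primed wrappers unnecessary.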

gdetrez · 2015-11-19T16:24:37Z

This is great!

One last thing, if I may: can you write a short help section about the antlr backend in the doc? It doesn't have to be very long, just a quick example of how you use it is enough. You can look at what I wrote for pygment for an example.

gapag · 2015-11-19T16:30:30Z

I will do it, before next week too.

Beginning writing manual page.

Add manual entry comparing ANTLR parse trees and BNFC ASTs.

gdetrez · 2015-12-08T17:14:20Z

Oh, I somehow missed the notification that you updated this branch :-/
Thanks for your fixes, merging now!

Add ANTLR4 Java backend

Gabriele Paganelli added 9 commits October 14, 2015 11:05

Add refactoring structures containing parsing and lexing functions.

b319707

The different backends are being differentiated. The file Java.hs now prepares structures containing parsing and lexing functions contained in the modules relative to the various lexing/parsing tools.

Add untested ANTLR lexer support

f331f86

Link untested Antlr lexer support to Java.hs; add untested Antlr pars…

44495c8

…er support

Link CFtoAntlr4Parser.hs to Java.hs. Parser untested.

cffe593

Add generation of file Test for antlr backend. Add generalized Makefi…

411972b

…le support (works with both (JLex|JFlex->CUP) and AntlrV4)

Refactor Test.java generation function. Add comments. Add TestRunner …

8427941

…class. TestRunner compares the result of the parsing of the two different Java backends.

Cleanup local dev files

efb2bbb

Gabriele Paganelli added 2 commits October 23, 2015 20:01

Fix doctest specification of some functions

24a6b7c

Add ANTLRv4.5.1 to javatools.jar for testing. Update ANTLR4 version i…

f2f20ec

…n the printout

gdetrez reviewed Oct 29, 2015

Remove .idea directory

357de99

Revert README.md from a typo

8b98162

Gabriele Paganelli added 8 commits October 29, 2015 14:20

Remove aliases constructing records

31ea838

Remove commented out code in ANTLRException handlers

661c929

Remove TODOs, FIXMEs and other unnecessary comments

c09a52e

Fix all errors reported by hlint on the ANTLR-related files

e71f656

Fix all errors reported by hlint on Java.hs

55b475a

Fix code style for Java.hs

0733793

Fix some files to follow the Haskell style guide we use

d0f99d7

Fix doctests due to malformed haddock

f72fe68

Gabriele Paganelli and others added 3 commits October 30, 2015 23:51

Fix wrong printouts in generated Test.java, introduced in re-formatting

776d58b

Fix String rendering in Java does not work (BNFC#159)

4008c8b

Fix HSpec for prCat

a96592d

gdetrez reviewed Nov 19, 2015

Gabriele Paganelli added 6 commits November 20, 2015 11:18

Removed functions cf2j(f)lex that had quotes in their names.

e25ab90

Fix comment about alternative entrypoints in generated Test.

e29d46e

Beginning writing manual page.

Fix package handling in ANTLR4.

211e6e1

Rename variable names in Test.java

df64a65

Add manual entry comparing ANTLR parse trees and BNFC ASTs.

Remove hlint errors

bd9db09

Format some comments to be shorter than 80 col

f2eb603

gdetrez added a commit that referenced this pull request Dec 8, 2015

Merge pull request #155 from gapag/master

ecf4fed

Add ANTLR4 Java backend

gdetrez merged commit ecf4fed into BNFC:master Dec 8, 2015

gapag mentioned this pull request Feb 4, 2016

String rendering in Java does not work #159

Closed

andreasabel added this to the 2.8.2 milestone Nov 4, 2018

andreasabel added the Java label Nov 4, 2018
