-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ANTLR4 Java backend #155
Conversation
This commit applies the changes and it does not structurally change the status quo. Currently the JavaLexerParser type defined in BNFC.Options is pushed down to other Java specific files of this commit. This has to be removed for cleansiness, since the tool-specific features of JLex, JFlex and ANTLR4 can be decided for once before the code generation begins.
The different backends are being differentiated. The file Java.hs now prepares structures containing parsing and lexing functions contained in the modules relative to the various lexing/parsing tools.
…le support (works with both (JLex|JFlex->CUP) and AntlrV4)
…class. TestRunner compares the result of the parsing of the two different Java backends.
DONE: - ANTLR lexer/parser translation - Makefiles - Blacklist Java keywords (abstract, native ...) ONLY for ANTLR backend - System tests pass correctly ToDo: - Composable visitors for ASTs - Make integer and double literals be varsize (issue 59) - Generate some mapping AST <--> parse trees to take advantage of ANTLR's facilities - Add position information - Add layout directives for Java backend - Implement the following test schema: cup accepts ==> antlr accepts and cup AST equals antlr AST - Make ANTLR4 support abstract to use it in other backends (C#, Python, JavaScript) Known bugs: name clashes are not entirely avoided - Implicit tokens are given the name Surrogate_id_SYMB_<n> where <n> is an increasing integer. The motivation is that ANTLR lexer ids must start with an uppercase letter. The current implementation does nothing to prevent clashes with user-defined tokens. ANTLR will complain about this when trying to compile. Test.java: The generated Test.java is more complicated than the J(F)lex/CUP backend. This because ANTLR does not by default 1) terminate abruptly parsing/lexing if error occurr, but tries to continue 2) expect to make sense of everything until EOF.
Oops. Doctests fail. I will fix this on Monday. I will also fix the classpath in the example tests, right now there is no antlr library in the javatools.jar. Sorry! |
@@ -0,0 +1,12 @@ | |||
<component name="libraryTable"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit too IDE-specific to be in the repository.
Although, I like the idea of using some package manager for java (maven?) to get the dependencies instead of the (rather ugly) javatools.jar...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove it, by itself it does not help to automate the dependencies -- it is not a maven input file AFAIK.
I write the following just so that you can follow my line of though and see if it is the same of yours:
I guess it should be necessary to have some maven file with such dependencies and run maven if necessary when you run the tests, which implies you should have maven on your test server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have direct experience using Maven anyway, I usually use it through IDEs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's what I was thinking. Maven is usually available as a distribution package so it shouldn't be a problem to have it on the test server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, I was just thinking out loud :-)
Can you remove the .idea directory though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will make some commits to address all your comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before tonight.
Hi Gabriele. Sorry it took me so long to get back to you on this. I think this is definitely a great addition to bnfc as we have sort of fallen behind with respect to the parsing tools available for java. Globally, this looks good and coherent with the rest of the code. (Thanks in particular for taking the time to add system tests!!). So most of my comments at this point are about polishing. Can you look at the following in addition the the line comments:
As a FYI, I'm globally trying to follow this as a style guide, although like for hlint, the legacy code is still a mess at this point... |
Great, I will try to stick to the style guide and polish the code as much as I can. I will push the commits before tonight. Thanks to you for reviewing the code. |
I meant haddock not HSpec. Shame on me. |
|
||
cf2jlex' str cf = cf2jlex str cf False | ||
cf2jflex' str cf = cf2jlex str cf True | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this needs fixing. I suggest the following solution: reorder the argument of cf2jlex
and use cf2jlex True
and cf2jlex False
instead.
Or, if you really want two different functions that's ok but give them names without quotes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll follow your advice, I'll fix this before next week.
This is great! One last thing, if I may: can you write a short help section about the antlr backend in the doc? It doesn't have to be very long, just a quick example of how you use it is enough. You can look at what I wrote for pygment for an example. |
I will do it, before next week too. |
Beginning writing manual page.
Add manual entry comparing ANTLR parse trees and BNFC ASTs.
Oh, I somehow missed the notification that you updated this branch :-/ |
I wrote an ANTLR4 backend hopefully substituting the old J(F)Lex+CUP .
ANTLR is a lexer+parser generator originally targetting Java. It has many commonalities with BNFC, but it deals mainly with parse trees whereas BNFC has its focus on ASTs.
The motivation is that
Commit c7ba720 's message contains further details about what has been done and what is to do.
Summarizing, what I did is
System tests pass successfully on my machine.
The directory .idea is created by the IDE I use and you can of course remove it from the merge -- unless there are other people willing to use my same project.
I did not set ANTLR as the default Java backend because I'd like to have more people trying it before promoting it.
Thanks for your time.