Antlr backend - quotation marks in bracket expressions are escaped when they shouldn't #319
Comments
It seems that the regular expression printer does not apply special printing rules to content inside bracket expressions, though it probably should according to the rules you quoted above.
The problematic lines in BNFC are thus:
bnfc/source/src/BNFC/Backend/Java/RegToAntlrLexer.hs Lines 69 to 72 in 3ca7211
There, instead of calling the prt function recursively, a special print function for content inside brackets should be called.
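A minimal sketch of what such a context-aware printer could look like. This is not BNFC's actual code: the `Reg` type and the names `escapeInLiteral`, `escapeInSet`, and `prt` are hypothetical and cut down for illustration, and control characters such as newlines are not handled. It assumes, per the ANTLR lexer rules documentation, that only `]`, `\`, and `-` need a backslash inside a bracket expression.

```haskell
-- Hypothetical, cut-down regex type; the real AST has more constructors.
data Reg
  = RChar Char      -- a single literal character, printed as 'c'
  | RAlts String    -- one of the listed characters, printed as [...]
  | RSeq Reg Reg    -- concatenation
  | RStar Reg       -- zero or more repetitions

-- Escaping inside a quoted literal: quote and backslash get a backslash.
escapeInLiteral :: Char -> String
escapeInLiteral c
  | c `elem` "'\\" = ['\\', c]
  | otherwise      = [c]

-- Escaping inside a bracket expression: only ], \ and - get a backslash.
escapeInSet :: Char -> String
escapeInSet c
  | c `elem` "]\\-" = ['\\', c]
  | otherwise       = [c]

-- The printer picks the escaping function based on the context.
prt :: Reg -> String
prt (RChar c)  = "'" ++ escapeInLiteral c ++ "'"
prt (RAlts cs) = "[" ++ concatMap escapeInSet cs ++ "]"
prt (RSeq a b) = prt a ++ prt b
prt (RStar r)  = group r ++ "*"
  where
    -- Parenthesise sequences so the star applies to the whole group.
    group s@(RSeq _ _) = "(" ++ prt s ++ ")"
    group s            = prt s
```

With these definitions, `prt (RStar (RAlts "',"))` yields `[',]*`, so the quote inside the brackets is left unescaped, while `prt (RChar '\'')` still yields `'\''`.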
@fonfalleh : Can you test if PR #321 works for you?
Seems to work, thanks! 👍
Great!
My fix wasn't complete, see #329.
It seems the only characters that should be escaped in bracket expressions in regexes are `]`, `\`, and `-`. I'm not sure if this means that there needs to be different escaping in different contexts. https://github.com/antlr/antlr4/blob/master/doc/lexer-rules.md#lexer-rule-elements
Example token rule that generates broken code (not by any means good or correct; I just noticed that the resulting lexer file doesn't work):
token NoteToken ["abcdefgr"]({"es"} | {"is"})*["\',"]*(digit)*["."]* ;
results in the following line in the Lexer.g4 file:
NoteToken : [abcdefgr]('e''s'|'i''s')*[\',]*DIGIT*'.'*;
which generates the following when building:
warning(156): lily/lilyLexer.g4:83:38: invalid escape sequence \'
The build also complains about the following line:
STRINGTEXT : ~[\"\\] -> more;
bnfc/source/src/BNFC/Backend/Java/CFtoAntlr4Lexer.hs
Line 157 in 3ca7211
The build works as expected when removing the extra backslashes as follows:
NoteToken : [abcdefgr]('e''s'|'i''s')*[',]*DIGIT*'.'*;
...
STRINGTEXT : ~["\\] -> more;
Sidenote:
I first thought this could be related to this line, referencing RegToJLex.hs instead of RegToAntlrLexer.hs, but it seems the reference is correct, even if it's confusing naming.
bnfc/source/src/BNFC/Backend/Java/CFtoAntlr4Lexer.hs
Line 150 in 3ca7211
Export from RegToAntlrLexer:
bnfc/source/src/BNFC/Backend/Java/RegToAntlrLexer.hs
Line 1 in 3ca7211