Line comment problem at end of file (antlr4)


George S Cowan

Jan 23, 2013, 8:24:43 PM
to antlr-di...@googlegroups.com
Because our standard way of coding the syntax for line comments has included the newline, i.e.,
  LINE_COMMENT
    : '//' ~[\r\n]*   '\r'? '\n' -> channel(HIDDEN)
    ;
we have not been able to end a file with a line comment unless the comment is followed by a newline.
 
It looks like this issue has been solved before with actions written in the implementation language (http://www.antlr.org/pipermail/antlr-interest/2009-August/035662.html), but we wish to keep grammars free of actions in antlr4. I tried the following, and it seems to work:
  LINE_COMMENT
    : '//' ~[\r\n]*   -> channel(HIDDEN)
    ;
Obviously this depends on '\r' and '\n' being consumed by another rule, such as a whitespace rule.
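A minimal sketch of the pairing this relies on, assuming a whitespace rule named WS (a hypothetical name, not taken from any particular grammar) that skips line endings:

```antlr
// Line comment stops before the line ending, so it can also match at end of file.
LINE_COMMENT
    : '//' ~[\r\n]* -> channel(HIDDEN)
    ;

// Hypothetical companion rule: consumes the '\r' and '\n' that LINE_COMMENT leaves behind.
WS
    : [ \t\r\n]+ -> skip
    ;
```

With both rules present, the newline after a comment is matched by WS; at end of file there is simply nothing left for WS to match.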
 
Does anyone see any problem with this approach? I haven't tried this in antlr3, but I haven't seen it suggested anywhere. Is there some reason that this works in antlr4 but not antlr3?
 
George
 

Terence Parr

Jan 23, 2013, 8:25:34 PM
to antlr-di...@googlegroups.com
Try EOF at the end of the rule as one of the alternatives, not just \r\n :)
T
--
Dictation in use. Please excuse homophones, malapropism, and nonsense.

George S Cowan

Jan 23, 2013, 8:32:12 PM
to antlr-di...@googlegroups.com
Yes, that works, too.
 
LINE_COMMENT
  : '//' ~[\r\n]* (EOF|'\r'? '\n') -> channel(HIDDEN)  
  ;
George
 

Terence Parr

Jan 23, 2013, 9:17:31 PM
to antlr-di...@googlegroups.com
We can thank Sam Harwell for the EOF rigor / functionality in v4. :)
T

Jim Idle

Jan 23, 2013, 10:31:37 PM
to antlr-di...@googlegroups.com
Don't include \n in your token definition. You don't need it, as you can consume it elsewhere and the comment has no meaning to the parser. Don't try to add semantics/syntax into the tokenizer.

LC: '//' ~[\r\n]* -> channel(HIDDEN)  ;

When you are having trouble formulating a rule, always second-guess yourself and ask if you actually need to do what you are doing. As always, move all checking as far down the tool chain as you can.

Jim



Sam Harwell

Jan 23, 2013, 10:40:26 PM
to antlr-di...@googlegroups.com

I agree the best way to handle implicitly-terminated line comments is to not include the terminating newline character(s) in the token.

The EOF handling was to allow additional flexibility for lexing unterminated block comments:

BlockComment
    :   '/*' .*? (EOF | '*/')
    ;

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

Jim Idle

Jan 23, 2013, 11:33:40 PM
to antlr-di...@googlegroups.com
Yes - that is always an issue, though in v3 you could usually (but not always, if you use .* before it) just use ( '*/' | {issue error} ). I think that the analysis is better in v4, though, and that the EOF edge case is much better handled.

Jim



Leo Antoli

Jan 24, 2013, 7:23:20 AM
to antlr-discussion
So which LINE_COMMENT rule are you guys recommending?

Thanks.

Regards,
Leo





George S Cowan

Jan 24, 2013, 8:57:00 AM
to antlr-di...@googlegroups.com
In Java, the newline is ignored as part of the language definition, and we can take the position that conceptually the LINE_COMMENT goes up to the newline but does not include it. So the way that I am going to change the Java.g4 grammar is
 
  LINE_COMMENT
   : '//' ~[\r\n]* -> channel(HIDDEN)
   ;
 
Ter's suggestion includes the newline as part of the LINE_COMMENT:
 
  LINE_COMMENT
   : '//' ~[\r\n]* (EOF|'\r'? '\n') -> channel(HIDDEN)
   ;
 
This is all made possible by Sam Harwell's work on version 4. You can also control error messages when an unexpected EOF is encountered by including an EOF alternative with an error message, for example:
 
  BLOCK_COMMENT
  : '/*' .*? ('*/' | EOF { issue your error message here and let BLOCK_COMMENT syntax succeed } )
  ;
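As a rough sketch of what the placeholder could become, assuming the Java target and an ANTLR release that allows actions inside a lexer rule rather than only at the end of the outermost alternative (4.2 or later, as far as I know). The message text and the use of System.err are illustrative choices, not part of any existing grammar:

```antlr
BLOCK_COMMENT
    : '/*' .*?
      ( '*/'
      // Assumed Java-target action: report the problem, then let the token succeed.
      | EOF { System.err.println("unterminated block comment"); }
      )
      -> channel(HIDDEN)
    ;
```

The token is still produced on the hidden channel in either case, so the parser never sees the unterminated comment; only the message differs.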
 
George
 