Semi-important whitespace in ANTLR4

Maurice

Jan 15, 2014, 2:39:14 PM
Hi there!

I have created a grammar and I am using 'skip' in the lexer to ditch whitespace, like so:

WS  : [ \t\r\n]+ -> skip ;

Other tokens I have (and I would like to keep these, not rewrite):

FOO : 'foo';
STAR : '*';
BLOB: 'blob';

However, there are times in the parser where whitespace is significant for me. I would like to be able to parse:

foo * blob

differently to 

foo *blob

But I do not want to turn off skip (my real grammar is more complex and there are only a few cases where whitespace is significant).

How can I do this in ANTLR4?

Thanks,

Maurice

Terence Parr

Jan 15, 2014, 2:19:11 PM
to antlr-di...@googlegroups.com
Hi Maurice, the lexer will have to rely on context information to know whether to skip whitespace or not. Sometimes the context is obvious, such as nested parentheses: the lexer can simply count opening and closing parentheses with actions and then use a semantic predicate in the whitespace rule to decide what to do. This is how you handle context-sensitive newlines in Python.

Or just use an action:

WS  : [ \t\r\n]+ { if (ignore) skip(); } ;

Does this help?


Ter
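
For readers who want to see the parenthesis-counting idea from the reply above spelled out, here is a minimal sketch, assuming the Java target. The grammar name, the `nesting` counter, and the `LPAREN`, `RPAREN` and `NESTED_WS` rules are all hypothetical names made up for illustration; none of them appear in the thread.

```antlr
lexer grammar NestingSketch;   // hypothetical grammar name

@members {
    // Assumed counter (not from the thread): number of currently open '('.
    int nesting = 0;
}

LPAREN : '(' { nesting++; } ;
RPAREN : ')' { if (nesting > 0) nesting--; } ;

// Listed first so it wins the tie: inside parentheses the predicate holds
// and the whitespace is matched and discarded.
NESTED_WS : [ \t\r\n]+ { nesting > 0 }? -> skip ;

// Outside parentheses the predicate fails, so the same text is emitted
// as a WS token that the parser can see.
WS : [ \t\r\n]+ ;
```

Whether whitespace should be significant inside or outside the parentheses depends on the language. This sketch follows the Python convention mentioned in the reply, where it is ignored inside brackets.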

Maurice

Jan 15, 2014, 2:42:30 PM
to antlr-di...@googlegroups.com
Yup, it did the trick! For future travellers, the solution was:

1. Add a new member to the Lexer:

@lexer::members {
    boolean ignore=true;
}

2. Update the WS token:

WS  : [ \t\r\n]+ { if(ignore) skip(); } ;

3. Add a new token, which could be picked up in the parser:

NEW_TOKEN : STAR { ignore = false; } WS { ignore = true; };

Maurice
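
The thread does not show how the parser picks up the new token. A minimal sketch of one way it might look, assuming the lexer rules from the three steps above: the rule name `expr` and the alternative labels are hypothetical.

```antlr
// Hypothetical parser rule. With the lexer above, "foo * blob" should lex as
// FOO NEW_TOKEN BLOB, because NEW_TOKEN (star plus trailing whitespace) is a
// longer match than STAR alone, while "foo *blob" should lex as FOO STAR BLOB.
expr
    : FOO NEW_TOKEN BLOB   # SpacedStar   // star followed by whitespace
    | FOO STAR BLOB        # TightStar    // star directly before blob
    ;
```

With labeled alternatives, a listener or visitor generated by ANTLR4 gets a separate method for each case, so the two spellings can be handled differently without turning off whitespace skipping elsewhere.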