How to exclude white space from specific rules?

mellifl...@gmail.com

Apr 1, 2013, 6:20:24 PM

to antlr-di...@googlegroups.com

Hi, I'm using the common method of sending white space to the hidden channel, and that's what I want most of the time, but I have a few lexer rules from which I would like to exclude white space. For example, I would like the rule

IP_ADDRESS: INTEGER DOT INTEGER DOT INTEGER DOT INTEGER;

not to allow spaces inside the address. Is there a good way to do that?

Terence Parr

Apr 1, 2013, 7:06:23 PM

to antlr-di...@googlegroups.com

It will already do that. IP_ADDRESS is a lexer rule.

Ter
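
A minimal sketch of why this works, assuming ANTLR 4 syntax. The thread does not show the rest of the grammar, so the INTEGER, DOT, and WS definitions below, and the grammar name IpLexer, are assumptions for illustration only. Lexer rules match raw characters rather than tokens, so the hidden-channel white space rule never gets a chance to apply inside IP_ADDRESS.

```antlr
// Sketch only: INTEGER, DOT, WS and the grammar name are assumed, not taken from the thread.
lexer grammar IpLexer;

// A lexer rule matches characters directly, so "1.2.3.4" is one token
// and "1 . 2 . 3 . 4" cannot match this rule at all.
IP_ADDRESS : INTEGER DOT INTEGER DOT INTEGER DOT INTEGER ;

INTEGER : [0-9]+ ;
DOT     : '.' ;

// White space becomes its own token on the hidden channel.
// Only parser rules skip over it; lexer rules never see it as "ignorable".
WS : [ \t\r\n]+ -> channel(HIDDEN) ;
```

With these rules, the input "1 . 2 . 3 . 4" would come out as separate INTEGER and DOT tokens (with WS tokens on the hidden channel between them) rather than as a single IP_ADDRESS.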


mellifl...@gmail.com

Apr 2, 2013, 4:25:05 AM

to antlr-di...@googlegroups.com

Hello, thank you for your response! In retrospect, though, I asked the wrong question; sorry about that. What I should have asked is: is there a good way to exclude white space from some parser (rather than lexer) rules while still ignoring it in others?

Essentially, I would like to pick apart the components of a string that excludes white space. Returning to the IPv4 address example, perhaps a more accurate version would be something like

ipAddress: byte1=INTEGER DOT byte2=INTEGER DOT byte3=INTEGER DOT byte4=INTEGER;

without allowing white space between the components of the address, while allowing it in other parser rules. I suppose I could use a lexer rule and pass its text through a regular expression afterward to pull out the parts, but it seems strange to resort to a separate regular expression when I'm already using an LL parser. (The text in my actual problem is more complex than an IP address, so I can't just split the string on dots.)

Jim Idle

Apr 2, 2013, 11:04:38 AM

to antlr-di...@googlegroups.com

At the end of the rule, count the number of tokens consumed from
$start. If there are too many, then there was white space and you can
issue a nice semantic error. You could also check the token types in
the off channels if something other than white space can be hidden. I
put such tokens in different channels.

Jim
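
The token-counting check described above might look roughly like the following. This is a sketch under assumptions, not Jim's actual code: it assumes ANTLR 4 with the Java target, a token stream that keeps hidden-channel tokens in the buffer (as CommonTokenStream does), and a grammar with no IP_ADDRESS lexer rule, so that the address arrives as separate INTEGER and DOT tokens. The error message wording is made up for illustration.

```antlr
// Sketch only, assuming ANTLR 4 with the Java target.
ipAddress
    : byte1=INTEGER DOT byte2=INTEGER DOT byte3=INTEGER DOT byte4=INTEGER
      {
        // Token indexes count every token in the stream, hidden-channel ones included.
        // Seven visible tokens with nothing hidden between them span exactly seven indexes.
        int span = _input.LT(-1).getTokenIndex() - $start.getTokenIndex() + 1;
        if (span != 7) {
            // Illustrative message; reported through the parser's error listeners.
            notifyErrorListeners("white space is not allowed inside an IP address");
        }
      }
    ;
```

Here `_input.LT(-1)` is the last token the rule consumed, so the difference in token indexes is larger than expected whenever a hidden token sits between two components. If comments or other tokens can also be hidden, the action would need to inspect the types or channels of the tokens in that index range instead of only counting them.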
