<ExpectingRegExp>
TOKEN:
{
    < RegularExpressionPattern: ~["/", "*"]("\\/" | ~["/"])* > : DEFAULT
}

void Literal() :
{}
{
    <NullLiteral>
  | <BooleanLiteral>
  | <DecimalLiteral>
  | <OctalLiteral>
  | <HexLiteral>
  | <FloatLiteral>
  | <StringLiteral>
  | "/" { token_source.SwitchTo(ParserConstants.ExpectingRegExp); }
    <RegularExpressionPattern>
    { token_source.SwitchTo(ParserConstants.DEFAULT); }
    "/"
}

This doesn't seem to work. I guess it's defeated by lookahead.
Any ideas?
Thanks
Mike
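I try to avoid changing lexer states from the parser code, since the
lexer may be several tokens ahead of the parser: by the time your
SwitchTo() action runs, the text after the "/" may already have been
tokenized in the old state.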
Given your situation, here are some initial thoughts:
(1) Create a single token for a regular expression, including the initial
and final "/". This removes the confusion between "/" as an arithmetic
operator and a regular expression delimiter. You will have to deal with
the fact that your token includes the delimiters.
(2) Alternatively, see if you can determine which of your tokens may be
followed by a regular expression, and which cannot. (This is a more
difficult task, and more error prone. And some grammars may not be
amenable to this.) So rather than 2 states (DEFAULT and
EXPECTING_REG_EXP), you would have at least 3: DEFAULT (where the *lexer*
knows that a regular expression is illegal),
SLASH_MEANS_REGULAR_EXPRESSION, and EXPECTING_REG_EXP. You must think
through each token in your grammar and decide whether you should be in
DEFAULT or SLASH_MEANS_REGULAR_EXPRESSION afterward. (1) is easier.
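
A minimal sketch of the idea behind (2), written as a small hand-rolled
Java lexer rather than as JavaCC lexical states. The class name, the token
labels, and the rule used here (a "/" is division right after an
identifier, a number, ")" or "]", and starts a regular expression
everywhere else) are assumptions made for illustration; a real grammar
would need that decision worked out for every token.

```java
import java.util.ArrayList;
import java.util.List;

final class SlashLexerSketch {
    // Tokenizes src into labelled strings such as "IDENT(a)" or "REGEXP(/ab+c/)".
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        // true  = SLASH_MEANS_REGULAR_EXPRESSION
        // false = DEFAULT, where "/" can only be the division operator
        boolean slashMeansRegExp = true;
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (Character.isLetter(c)) {
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                out.add("IDENT(" + src.substring(start, i) + ")");
                slashMeansRegExp = false; // an operand was just read
            } else if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add("NUMBER(" + src.substring(start, i) + ")");
                slashMeansRegExp = false;
            } else if (c == '/' && slashMeansRegExp) {
                i++; // opening delimiter
                while (i < src.length() && src.charAt(i) != '/') {
                    if (src.charAt(i) == '\\') i++; // skip the escaped character
                    i++;
                }
                if (i >= src.length()) {
                    throw new IllegalArgumentException("unterminated regular expression");
                }
                i++; // closing delimiter
                out.add("REGEXP(" + src.substring(start, i) + ")");
                slashMeansRegExp = false;
            } else {
                i++;
                out.add("OP(" + c + ")");
                // After a closing bracket a "/" is division; after any other
                // operator or punctuation it starts a regular expression.
                slashMeansRegExp = (c != ')' && c != ']');
            }
        }
        return out;
    }
}
```

With this rule, tokenize("a / b / c") yields two division operators, while
tokenize("x = /ab+c/") yields a single REGEXP token. The decision is made
entirely inside the lexer, so it does not matter how far ahead of the
parser the lexer is.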
hth
Eric
I think (1) means that many expressions that are supposed to be
arithmetic would be interpreted as regexps instead.
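
A small sketch of that concern, using java.util.regex to stand in for a
single regexp token that includes both delimiters. The class name and the
pattern (a rough translation of my RegularExpressionPattern with "/" added
at each end) are assumptions for illustration, not the actual JavaCC token.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SingleTokenRegExpSketch {
    // "/", one char that is neither "/" nor "*", then escaped slashes or
    // any non-slash chars, then the closing "/".
    static final Pattern REG_EXP_TOKEN =
            Pattern.compile("/[^/*](?:\\\\/|[^/])*/");

    // Returns the first span this token definition would claim, or null.
    static String firstRegExp(String src) {
        Matcher m = REG_EXP_TOKEN.matcher(src);
        return m.find() ? m.group() : null;
    }
}
```

Here firstRegExp("a / b / c") returns "/ b /", so the two divisions would
be swallowed as one regexp token. A longest-match lexer would presumably
make the same choice, since that token is longer than a lone "/".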
>
> (2) Alternatively, see if you can determine which of your tokens may be
> followed by a regular expression, and which cannot. (This is a more
> difficult task, and more error prone. And some grammars may not be
> amenable to this.) So rather than 2 states (DEFAULT and
> EXPECTING_REG_EXP), you would have at least 3: DEFAULT (where the *lexer*
> knows that a regular expression is illegal),
> SLASH_MEANS_REGULAR_EXPRESSION, and EXPECTING_REG_EXP. You must think
> through each token in your grammar and decide whether you should be in
> DEFAULT or SLASH_MEANS_REGULAR_EXPRESSION afterward. (1) is easier.
>
(2) seems to solve the problem!
Many thanks,
Mike