Implementation for "less" behavior in ANTLR 4

Nikolay Shustov

Oct 12, 2024, 9:50:12 PM

to antlr-discussion

There is a fairly old issue in the ANTLR4 GitHub repository asking for a "less" feature (as a complement to "more"): https://github.com/antlr/antlr4/issues/212

If anybody is interested, this is how I solved it for my case, where I wanted the matched token text to be tokenized again:

In the Lexer.g4 file (Java target), add the following @members block:

@members
{
    // Start position of the current token, saved by prepareLess().
    int storedTokenStartCharIndex;
    int storedTokenStartCharPositionInLine;
    int storedTokenStartLine;

    // Remember where the current token starts (index, column, line).
    void prepareLess()
    {
        storedTokenStartCharIndex = _tokenStartCharIndex;
        storedTokenStartCharPositionInLine = _tokenStartCharPositionInLine;
        storedTokenStartLine = _tokenStartLine;
    }

    // Rewind the input and the line/column counters to the saved token start.
    void less()
    {
        getInputStream().seek(storedTokenStartCharIndex);
        getInterpreter().setCharPositionInLine(storedTokenStartCharPositionInLine);
        getInterpreter().setLine(storedTokenStartLine);
    }
}

Then use it as in the following example:

mode SomeMode;
SEMICOLON_IN_SOME_MODE: { prepareLess(); } ';' { less(); } -> mode(AnotherMode), skip;

This causes the ';' text to be tokenized again in AnotherMode after it has matched the SEMICOLON_IN_SOME_MODE token in SomeMode.

The idea is straightforward: save the token's start index in the input character stream, together with the interpreter's line and column for the token start, and then restore all three when the "less" logic is invoked.
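To illustrate that save-and-rewind idea outside of ANTLR, here is a minimal, self-contained Java sketch. The class LessSketch, its scanAll helper and the one-character "tokens" are made up for illustration and are not part of the ANTLR runtime. It tracks only the character index, whereas the lexer members above also restore the line and column.

```java
import java.util.ArrayList;
import java.util.List;

// Toy two-mode scanner (hypothetical, not ANTLR) that mimics prepareLess()/less().
final class LessSketch {
    private final String input;
    private int index = 0;            // current read position
    private int tokenStart = 0;       // where the current token began
    private int storedTokenStart = 0; // saved by prepareLess()
    private String mode = "SomeMode";

    LessSketch(String input) {
        this.input = input;
    }

    // Save the start of the current token.
    void prepareLess() {
        storedTokenStart = tokenStart;
    }

    // Rewind the read position to the saved token start.
    void less() {
        index = storedTokenStart;
    }

    // Emit one "mode:char" entry per character; ';' in SomeMode is given back.
    List<String> scan() {
        List<String> tokens = new ArrayList<>();
        while (index < input.length()) {
            tokenStart = index;
            char c = input.charAt(index++); // consume one character
            if (mode.equals("SomeMode") && c == ';') {
                prepareLess();
                less();               // un-consume the ';'
                mode = "AnotherMode"; // switch mode and emit nothing (like skip)
                continue;
            }
            tokens.add(mode + ":" + c);
        }
        return tokens;
    }

    static List<String> scanAll(String text) {
        return new LessSketch(text).scan();
    }
}
```

With this sketch, LessSketch.scanAll("a;b") returns [SomeMode:a, AnotherMode:;, AnotherMode:b]: the ';' is seen once in SomeMode, handed back, and then consumed again in AnotherMode. The mode switch is what keeps the rewind from looping forever.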

What I would hope for is that, instead of doing all of this by hand, this simple logic could become part of the out-of-the-box Lexer at some point.
