Implementation for "less" behavior in ANTLR 4

Nikolay Shustov

Oct 12, 2024, 9:50:12 PM
to antlr-discussion
There is a rather old issue on the ANTLR 4 GitHub asking for a "less" feature (as a complement to "more"): https://github.com/antlr/antlr4/issues/212


If anybody is interested, this is how I solved it for my case, where I wanted the matched token text to be lexed again:

In the Lexer.g4 file add the following fragment:

@members
{
// Token start position captured by prepareLess().
int storedTokenStartCharIndex;
int storedTokenStartCharPositionInLine;
int storedTokenStartLine;

// Remember where the current token starts: stream index, column, and line.
void prepareLess()
{
  storedTokenStartCharIndex = _tokenStartCharIndex;
  storedTokenStartCharPositionInLine = _tokenStartCharPositionInLine;
  storedTokenStartLine = _tokenStartLine;
}

// Rewind the input stream and the interpreter's line/column to the saved start.
void less()
{
  getInputStream().seek(storedTokenStartCharIndex);
  getInterpreter().setCharPositionInLine(storedTokenStartCharPositionInLine);
  getInterpreter().setLine(storedTokenStartLine);
}

}


Then use it as in the following example:

mode SomeMode;
SEMICOLON_IN_SOME_MODE: { prepareLess(); } ';' { less(); } -> mode(AnotherMode), skip;


This makes the ';' text get lexed again in AnotherMode after it has matched the SEMICOLON_IN_SOME_MODE rule in SomeMode. Because of "skip", no token is emitted for the first match.

The idea is straightforward: store the token's start position in the input character stream together with the interpreter's line and column data, then restore all three when the "less" logic is invoked.
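
To show the same save-and-rewind idea outside of ANTLR, here is a minimal toy sketch in plain Java. It does not use the ANTLR runtime: the LessDemo class, its Cursor helper, and the "mode:char@line:column" token format are all made up for illustration. It only assumes that a lexer tracks an index, a line and a column, and that "less" means restoring all three before switching mode.

```java
import java.util.ArrayList;
import java.util.List;

class LessDemo {
    // Hypothetical character cursor (NOT ANTLR's CharStream): tracks index, line, column.
    static final class Cursor {
        final String text;
        int index = 0, line = 1, column = 0;

        Cursor(String text) { this.text = text; }

        boolean atEnd() { return index >= text.length(); }

        char consume() {
            char c = text.charAt(index++);
            if (c == '\n') { line++; column = 0; } else { column++; }
            return c;
        }
    }

    // Emits one "mode:char@line:column" entry per character, except that ';'
    // seen in SomeMode is pushed back and handled again in AnotherMode.
    static List<String> tokenize(String text) {
        Cursor in = new Cursor(text);
        List<String> tokens = new ArrayList<>();
        String mode = "SomeMode";
        while (!in.atEnd()) {
            // Counterpart of prepareLess(): remember where this token starts.
            int startIndex = in.index, startLine = in.line, startColumn = in.column;
            char c = in.consume();
            if (mode.equals("SomeMode") && c == ';') {
                // Counterpart of less(): rewind index, line and column,
                // switch mode, and emit nothing (like "skip").
                in.index = startIndex;
                in.line = startLine;
                in.column = startColumn;
                mode = "AnotherMode";
                continue;
            }
            tokens.add(mode + ":" + c + "@" + startLine + ":" + startColumn);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [SomeMode:a@1:0, AnotherMode:;@1:1, AnotherMode:b@1:2]
        System.out.println(tokenize("a;b"));
    }
}
```

The ';' shows up exactly once, attributed to AnotherMode and with its original line and column, which is the effect the grammar snippet above is after.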

What I would hope for is that, instead of doing it all by hand, this simple logic could become part of the out-of-the-box Lexer at some point.