Greetings!
Recall that ANTLR lexers are greedy: they match the longest possible input sequence for each token recognized. Further, when two (or more) lexer rules match exactly the same input sequence, ANTLR resolves the collision by selecting the rule that appears first in the lexer grammar.
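To make those two rules concrete, here is a small Python sketch that imitates them (it is not ANTLR itself). The patterns for OBJECT and LITERAL are guesses for illustration only, since the actual rules are not shown in this thread, and longest_match is a made-up helper name.

```python
import re

def longest_match(rules, text, pos=0):
    """Return (rule_name, lexeme) for the rule that wins at `pos`.

    Imitates the two rules described above: the longest match wins,
    and on a tie the rule listed first wins.
    """
    best = None
    for name, pattern in rules:  # rules are tried in grammar order
        m = re.compile(pattern).match(text, pos)
        if m and m.end() > pos:
            # Strictly longer only, so an equal-length later rule
            # never replaces an earlier one.
            if best is None or len(m.group()) > len(best[1]):
                best = (name, m.group())
    return best

# Hypothetical patterns; the real OBJECT and LITERAL rules are not in the thread.
literal_first = [("LITERAL", r"[a-z0-9_]+"), ("OBJECT", r"[a-z]+")]
object_first = [("OBJECT", r"[a-z]+"), ("LITERAL", r"[a-z0-9_]+")]

print(longest_match(literal_first, "abc"))   # ('LITERAL', 'abc')
print(longest_match(object_first, "abc"))    # ('OBJECT', 'abc')
print(longest_match(object_first, "abc_1"))  # ('LITERAL', 'abc_1')
```

With these assumed patterns, the order of the rules only matters when both match the same text ("abc"); as soon as one rule can match more characters ("abc_1"), it wins regardless of order.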
Move LITERAL to the end of the lexer grammar. (I usually put the WHITESPACE rule at the end as well, but that probably doesn't matter; it's just a habit of mine.)
Delete OBJECT as a token. Have all parser rules recognize LITERAL, and then apply a semantic constraint at those places in the parse where the extra characters in LITERAL are not permitted. (In my opinion, this also has the benefit of possibly giving a more meaningful error message in that case.)
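As a rough illustration of the semantic-constraint idea, the check could look something like the Python sketch below, called from wherever the parse is walked. Everything here is an assumption: the thread does not show which characters OBJECT allowed, so OBJECT_SHAPE is a placeholder, and check_object is a hypothetical helper, not an ANTLR API.

```python
import re

# Hypothetical: the characters allowed where the grammar used to expect OBJECT.
OBJECT_SHAPE = re.compile(r"[a-z]+")

def check_object(lexeme, line, column):
    """Validate a LITERAL token's text in a position that only accepts an object name."""
    if OBJECT_SHAPE.fullmatch(lexeme) is None:
        # The message can say exactly what is wrong, which a generic
        # token-mismatch error from the lexer or parser cannot.
        raise SyntaxError(
            f"line {line}:{column} '{lexeme}' is not a valid object name here; "
            "only lowercase letters are allowed"
        )
    return lexeme

print(check_object("abc", 1, 0))  # abc
```

Calling check_object("abc_1", 1, 0) raises SyntaxError with a message naming the offending text and its position.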
Note that all of the above is UNTESTED; it is just based on my experience with ANTLR.
Hope this helps...
-jbb
--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/antlr-discussion/5af8e44d-92b6-4d99-a2b5-178c1d4fc53b%40googlegroups.com.
TEXT matches just one input character at a time. Since FIELD and/or
NUMBER match multiple characters, the greedy lexer prefers those
over TEXT. Hope this helps.
Thanks, John. Understood.
I changed TEXT to TEXT: ~[ \r\t\n]+; and the tokens are now:
TEXT ("testing12324") AND ("AND") TEXT ("test=1234")
I'm not sure why test=1234 is treated as a TEXT token; I expected it to be split into FIELD, OPERATOR and NUMBER tokens according to the rule order. Is it because of the longest-match rule, if I understand it correctly?
The sequence matched by TEXT in this case, test=1234, is longer than the sequence matched by any of the other tokens individually. So TEXT is the longest match, and that is what the greedy lexer reports.
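The effect can be reproduced with a small maximal-munch tokenizer in Python (a sketch that imitates the longest-match and first-rule-wins behaviour, not ANTLR itself). Only the TEXT rule comes from this thread; the AND, FIELD, OPERATOR, NUMBER and WS patterns are assumptions, because the actual grammar is not shown here.

```python
import re

def tokenize(rules, text):
    """Maximal-munch tokenizer: longest match wins, earlier rule wins ties."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, pattern in rules:
            m = pattern.match(text, pos)
            if m and m.end() > best_end:  # strictly longer, so ties keep the earlier rule
                best_name, best_end = name, m.end()
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is skipped
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens

def build(text_pattern):
    # All rules except TEXT are assumptions; the thread does not show them.
    return [(n, re.compile(p)) for n, p in [
        ("AND", r"AND"),
        ("FIELD", r"[a-z]+"),
        ("OPERATOR", r"="),
        ("NUMBER", r"[0-9]+"),
        ("TEXT", text_pattern),
        ("WS", r"[ \t\r\n]+"),
    ]]

single_char_text = build(r"[^ \r\t\n]")  # TEXT matches one character
greedy_text = build(r"[^ \r\t\n]+")      # TEXT: ~[ \r\t\n]+

source = "testing12324 AND test=1234"
print(tokenize(single_char_text, source))
# [('FIELD', 'testing'), ('NUMBER', '12324'), ('AND', 'AND'),
#  ('FIELD', 'test'), ('OPERATOR', '='), ('NUMBER', '1234')]
print(tokenize(greedy_text, source))
# [('TEXT', 'testing12324'), ('AND', 'AND'), ('TEXT', 'test=1234')]
```

With the `+` version, TEXT covers every run of non-whitespace characters, so it beats every other rule except where another rule matches the whole run and is listed earlier (AND in this sketch).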
Is there any way to treat any string literal as a TEXT token when none of the listed token rules apply?
Not that I know of. In my opinion, you need to be very specific about what TEXT
should match, rather than trying to make it a catch-all rule.
On 24 Jan 2020, at 00:08, Anil Dasari <dasaria...@gmail.com> wrote: