Best practices for parsing minus operators and negative numbers with ANTLR4


Mahae Koh

May 28, 2014, 1:00:04 PM
to antlr-di...@googlegroups.com
Hi all, 

I've been playing with the Java grammar (https://github.com/antlr/grammars-v4/blob/master/java/Java.g4#L497) and I've run into a slight problem. The grammar defines IntegerLiterals as non-negative integers only, and has a unary minus operator. I can use these to parse both "-123" and "-(123)" to -123 just fine. However, when I try to parse "-2147483648", my integer parsing fails because 2147483648 on its own overflows a 32-bit int. If I add '-'? to the integer rule, then actual subtractions (e.g. "-2147483647-1") are not parsed correctly (interestingly, "-2147483647 - 1" is parsed correctly; note the spaces).

I've come up with some workarounds to detect and avoid this, but surely this problem has been encountered before; what are some recommended practices for handling these situations?

Thanks!
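
The difference between "-2147483647-1" and "-2147483647 - 1" described above can be reproduced without ANTLR. The following is only a rough sketch: a hypothetical toy lexer (the class `SignedLiteralLexerDemo` and its regular expression are illustrative assumptions, not part of the Java grammar) that accepts an optional leading '-' on integers, roughly as a '-'? in the lexer rule would. Without spaces, the second minus sign is absorbed into the following number, so no subtraction operator is left for the parser to see.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy lexer (not ANTLR): imitates an integer rule that was
// changed to accept an optional leading '-'.
final class SignedLiteralLexerDemo {
    // A signed integer is tried first; a lone operator or parenthesis otherwise.
    private static final Pattern TOKEN = Pattern.compile("-?\\d+|[-+*/()]");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++; // whitespace separates tokens and is otherwise skipped
                continue;
            }
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            tokens.add(m.group());
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The second '-' is glued to the 1: two literals in a row, no operator.
        System.out.println(tokenize("-2147483647-1"));   // [-2147483647, -1]
        // With spaces, '-' is not followed by a digit, so it stays an operator.
        System.out.println(tokenize("-2147483647 - 1")); // [-2147483647, -, 1]
    }
}
```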

Terence Parr

May 28, 2014, 1:08:35 PM
to antlr-di...@googlegroups.com
Perhaps parse with Long.valueOf()? I'm not sure at what value it will overflow off the top of my head, but if it goes beyond 32-bit two's-complement integer arithmetic, the constant should have a suffix of L for long or something, shouldn't it?
T
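
For reference, a small sketch of where the overflow sits, using only standard `java.lang` calls (the class and method names are made up for illustration). The digits 2147483648 are one past `Integer.MAX_VALUE`, so `Integer.parseInt` rejects them, while `Long.valueOf` accepts them. In Java source itself, 2147483648 without an L suffix is only legal as the operand of unary minus, which is exactly the case in question.

```java
// Illustrative helper names; only Integer.parseInt and Long.valueOf are real API.
final class LiteralOverflowDemo {
    // True when the unsigned digits fit in a 32-bit int on their own.
    static boolean fitsInInt(String digits) {
        try {
            Integer.parseInt(digits);
            return true;
        } catch (NumberFormatException e) {
            return false; // parseInt signals overflow with NumberFormatException
        }
    }

    // Parses the same digits with 64-bit range instead.
    static long parseAsLong(String digits) {
        return Long.valueOf(digits);
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt("2147483647"));   // true  (Integer.MAX_VALUE)
        System.out.println(fitsInInt("2147483648"));   // false (one past the maximum)
        System.out.println(parseAsLong("2147483648")); // 2147483648
    }
}
```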


--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Dictation in use. Please excuse homophones, malapropisms, and nonsense. 

Mahae Koh

May 28, 2014, 1:22:11 PM
to antlr-di...@googlegroups.com
I considered that, but then the minus unary operator would have to know to convert back to an int afterwards, which seems less than ideal.  

Jim Idle

May 28, 2014, 10:42:57 PM
to antlr-di...@googlegroups.com
Nevertheless, you need to use Long for evaluation, and can lower the form to int after range checking (and error output if it is out of range). It is virtually never correct to look for '-' operators within the lexical rule for numbers, as you have already discovered.

Jim
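
One way to read the advice above, as a minimal sketch rather than a definitive implementation: keep the lexer rule unsigned, evaluate the literal and the unary minus in `long`, and narrow to `int` only after a range check. The class `NegatedLiteralEvaluator` and its method names are hypothetical, and how the check is wired into a visitor or listener is left open.

```java
// Hypothetical evaluator helpers; assumes the lexer hands over unsigned digits.
final class NegatedLiteralEvaluator {
    // Parse the digits with 64-bit range. Digit strings too large even for
    // long still throw NumberFormatException here.
    static long parseMagnitude(String digits) {
        return Long.parseLong(digits);
    }

    // Unary minus applied in long arithmetic, so 2147483648 can be negated.
    static long negate(long value) {
        return -value;
    }

    // Narrow to int only after checking the range; report an error otherwise.
    static int toIntChecked(long value) {
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            throw new ArithmeticException("integer value out of int range: " + value);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        // "-2147483648": negate first, then range-check.
        System.out.println(toIntChecked(negate(parseMagnitude("2147483648")))); // -2147483648
        // "-2147483647-1": a real subtraction, evaluated in long, then checked.
        System.out.println(toIntChecked(negate(parseMagnitude("2147483647")) - parseMagnitude("1"))); // -2147483648
    }
}
```

A bare "2147483648" with no minus in front still fails the range check, which is the error case mentioned above.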

Mahae Koh

May 28, 2014, 10:45:44 PM
to antlr-di...@googlegroups.com
Understood. Thanks all!


--
---
mahae koh