non-associative rule?

889 views
Skip to first unread message

Hou Yunqing

unread,
Apr 15, 2016, 1:51:01 AM4/15/16
to antlr-discussion
Hi all,

antlr currently supports <assoc=left> and <assoc=right>. What about <assoc=none>? Unary operators and comparison operators are often non-associative. I think this would be quite useful a feature, wouldn't it?

Apparently this question has been asked before here. But I find the answer less than satisfactory because it produces ugly parse tree that requires manual processing to clean up.

What do you guys think? Am I missing something?

Thanks
Yunqing

rtm...@googlemail.com

unread,
Apr 15, 2016, 6:27:03 AM4/15/16
to antlr-discussion
Hi,
I *suspect* your question is down to unclear thinking rather than a real need. The suggested solution is the traditional one and I can't see a problem (which doesn't mean there isn't one, just that maybe I'm overly familiar with one way of doing things).

So, you say "it produces ugly parse tree that requires manual processing to clean up" - I don't think either are true - what would be a non-ugly parse tree, and why does it need cleaning up? I've never had to (the stackoverflow grammar perhaps isn't the prettiest grammar, but that's not an issue with the parse tree).

Approaching it from the other side, what should

a op b op c

produce as a parse tree if 'op' has no associativity? If it's meant to be an error then the parser picks that up with the stackoverflow answer.

Also unary ops have no need for left/right assoc because they can only be interpreted one way. Suppose a prefix unary op 'puop', then you can only write it one way legally

puop x

-or generally-

puop puop puop x

etc. which can only mean

(puop (puop (puop x)))

so AFAICS associativity doesn't apply as it's unambiguous.

Looking at my ancient yacc book, yacc seems to provide these as a shortcut to avoid writing out the grammar so as to explicity embody precedence. ANTLR does this for free (mostly) by the ordering of alts in an expression, so in antr:

e :
    e * e
    |
    e + e
;

in yacc, according to the book, %left and %right are as much to do with precedence as associativity.

HTH

jan

Hou Yunqing

unread,
Apr 16, 2016, 2:45:06 AM4/16/16
to antlr-discussion
Hi jan, thanks for the answer

I think you probably misunderstood me. Let me clarify. 

In your example, '-' or 'not' can be a puop. Several things:
1. There are times when I want to make "puop puop puop x" illegal. Think about it, is it really okay to write "not not not not my_business"? If this example is not convincing enough, think about "load_address_of_variable", here we shorten as 'la'. Does it make sense to do "la la la x"??
2. For some binary operations such as binary comparison, associativity sometimes does not make sense, or at least we want to make it illegal. The answer given in the stackoverflow post is mutually left-recursive. Using post-processing as suggested in the comments looks like a hack. Now, consider these 2 natural solutions

Method 1:
expr
: ID
| expr '*' expr
| expr '+' expr
| expr '<' expr;   // this is undesirable, since we intend for comparison to be non-associative. But we can recover from this using post-processing

Method 2:
expr1
: ID
| expr '*' expr
| expr '+' expr;

expr2 : expr1 | expr1 '<' expr1; // this introduces a new non-terminal

Method1 is undesirable because it requires extra post-processing. Method2 is also undesirable because for something as simple as ID you get a 2-level parse tree expr2(expr1(ID)) instead of a simple expr(ID). This is what I mean by ugly parse tree: it is unnecessarily deeply nested. Suppose we could use <assoc=none>, then the above grammar could be written as:
Method 3:
expr
: ID
| expr '*' expr
| eppr '+' expr
| <assoc=none> expr '<' expr;

which resolves all our problems. In the example I've given, expr2(expr1(ID)) does not seem that deep. In real, serious grammar, it can be much deeper, with a long chain of precedence like assoc_op > nonassoc_op > assoc_op > nonassoc_op and so on. When we encounter this kind of grammar, deeply-nested parse tree with a single branch that goes all the way down is really ugly. Even though we can produce decent-looking AST with trivial post-processing on the parse tree, the parse tree itself sometimes must be preserved. Some processing are run directly on the parse tree instead of on the AST. I'm doing one of such applications, which is why I am complaining.

I think <assoc=none> is not absolutely essential. However, in some applications, it does get pretty handy if it is supported. I understand supporting it makes recursive-descent parsers a bit trickier to write. But antlr already supports left recursion, so I suppose supporting nonassoc is relatively straightforward.

Please comment.

Thanks,
Yunqing

Jim Idle

unread,
Apr 16, 2016, 2:49:03 AM4/16/16
to antlr-discussion
The better thing is to allow not not not syntactically but reject it semantically. However, not not not is logical and can be useful for machine generated code. 





--
You received this message because you are subscribed to the Google Groups "antlr-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antlr-discussi...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

rtm...@googlemail.com

unread,
Apr 16, 2016, 6:33:31 AM4/16/16
to antlr-discussion
Hi Hou,
Jim Idle said it well - allow syntactically and semantically, report after, and machines write some correct but unpleasant code sometimes.

A bit more (and I'm no expert, please remember).

On Saturday, April 16, 2016 at 7:45:06 AM UTC+1, Hou Yunqing wrote:
Hi jan, thanks for the answer

I think you probably misunderstood me. Let me clarify. 

In your example, '-' or 'not' can be a puop. Several things:
1. There are times when I want to make "puop puop puop x" illegal. Think about it, is it really okay to write "not not not not my_business"?

Yes, it's fine. It's not your business if someone/something writes weird code and it may be happening for reasons you don't anticipate (like they're teaching a beginner that 'not' flips its input so they build a chain of 'flippers'). Also "not not x" is by definition meaningless? It depends, in eg. javascript IIRC it can be used to turn a looks-like-a-false into a genuine boolean false. Warn certainly, but don't constrict what other people do.
 
If this example is not convincing enough, think about "load_address_of_variable", here we shorten as 'la'. Does it make sense to do "la la la x"??


"load address" is procedural not functional, so it's not a full composable expression, arguably. All you can do is assign or dereference it? I can't see that you can build it into a larger expression in eg. C/C++.
 


2. For some binary operations such as binary comparison, associativity sometimes does not make sense, or at least we want to make it illegal. The answer given in the stackoverflow post is mutually left-recursive. Using post-processing as suggested in the comments looks like a hack. Now, consider these 2 natural solutions

Method 1:
expr
: ID
| expr '*' expr
| expr '+' expr
| expr '<' expr;   // this is undesirable, since we intend for comparison to be non-associative. But we can recover from this using post-processing

Method 2:
expr1
: ID
| expr '*' expr
| expr '+' expr;

expr2 : expr1 | expr1 '<' expr1; // this introduces a new non-terminal

Method1 is undesirable because it requires extra post-processing. Method2 is also undesirable because for something as simple as ID you get a 2-level parse tree expr2(expr1(ID)) instead of a simple expr(ID)

Parse trees are ugly. I can't say more than that.
 
This is what I mean by ugly parse tree: it is unnecessarily deeply nested. Suppose we could use <assoc=none>, then the above grammar could be written as:
Method 3:
expr
: ID
| expr '*' expr
| eppr '+' expr
| <assoc=none> expr '<' expr;

which resolves all our problems. In the example I've given, expr2(expr1(ID)) does not seem that deep. In real, serious grammar, it can be much deeper, with a long chain of precedence like assoc_op > nonassoc_op > assoc_op > nonassoc_op and so on.

And see my recent post <https://groups.google.com/forum/#!topic/antlr-discussion/BpNigIH1cBg> for an example of just this. Agreed, but how many layers would you be able to strip off if nonassoc was used?

 
When we encounter this kind of grammar, deeply-nested parse tree with a single branch that goes all the way down is really ugly. Even though we can produce decent-looking AST with trivial post-processing on the parse tree, the parse tree itself sometimes must be preserved. Some processing are run directly on the parse tree instead of on the AST. I'm doing one of such applications, which is why I am complaining.

I do take your point, but I haven't the experience either in antlr or parsing theory to address it.

sorry

jan
 

Mike Cargal

unread,
Apr 16, 2016, 7:13:18 AM4/16/16
to ANTLR List
I don’t think you’ve convinced me that there is any such thing as associativity = none.

In your example, the expression a < b < c is syntactically valid.  You might add semantic logic to disallow it, but the parser has to interpret it as either (a < b) < c or a < (b < c) (i.e. you have to do one comparison before the other, and that order is reflected in the resulting parse tree. (yes, some languages have an “expr < expr < expr” that;s evaluated as a single “between” operation, but that would be correctly handled with a parse rule to bring all three nodes together for evaluation).)

Since “a < b < c” is syntactically valid in your grammar, and you claim assoc=none, what is your third alternative for interpreting that expression?

assoc=left => (a < b) < c
assoc=right => a < (b < c)
assoc=none => ????

 

rtm...@googlemail.com

unread,
Apr 16, 2016, 10:48:44 AM4/16/16
to antlr-di...@googlegroups.com


On Saturday, April 16, 2016 at 12:13:18 PM UTC+1, Mike Cargal wrote:
I don’t think you’ve convinced me that there is any such thing as associativity = none.

In your example, the expression a < b < c is syntactically valid.  

I think his point is that he wishes it to be syntactically invalid ie. caught at parse time by antlr.

jan
 
You might add semantic logic to disallow it, but the parser has to interpret it as either (a < b) < c or a < (b < c) (i.e. you have to do one comparison before the other, and that order is reflected in the resulting parse tree. (yes, some languages have an “expr < expr < expr” that;s evaluated as a single “between” operation, but that would be correctly handled with a parse rule to bring all three nodes together for evaluation).)

[snip]
Reply all
Reply to author
Forward
0 new messages