I'm parsing a C-style language and I'm facing a dangling-else ambiguity
problem:
    if (rel1) if (rel2) stmt1 else stmt2
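To make the two readings concrete, here is a small Python sketch that enumerates every parse of that input under just the two IF productions. It is not my real grammar or parser: the token spelling (no parentheses around the relation) and the names `parses` and `full_parses` are made up for illustration.

```python
def parses(tokens, i=0):
    """Yield (tree, next_index) for every way to read one statement at tokens[i:]."""
    if i >= len(tokens):
        return
    tok = tokens[i]
    if tok == "if":
        if i + 1 >= len(tokens):
            return                      # "if" with no relation after it
        rel = tokens[i + 1]
        for then, j in parses(tokens, i + 2):
            # open form: IF relation statement
            yield ("if", rel, then), j
            # closed form: IF relation statement ELSE statement
            if j < len(tokens) and tokens[j] == "else":
                for other, k in parses(tokens, j + 1):
                    yield ("if", rel, then, other), k
    elif tok != "else":
        yield tok, i + 1                # any other token stands for a simple statement


def full_parses(tokens):
    """Keep only the parses that consume the whole token list."""
    return [tree for tree, end in parses(tokens) if end == len(tokens)]


for tree in full_parses("if rel1 if rel2 stmt1 else stmt2".split()):
    print(tree)
```

This prints two trees: one where the else hangs off the outer if, `('if', 'rel1', ('if', 'rel2', 'stmt1'), 'stmt2')`, and one where it hangs off the inner if, `('if', 'rel1', ('if', 'rel2', 'stmt1', 'stmt2'))`. The second is the one I want.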
My statement nonterminal includes the following productions, which seem
to be the relevant ones here:
    statement
        -> assignexpression SEMICOLON
        | L_BRKT statements? R_BRKT
        | SEMICOLON
        | %prec(CLOSEDIF) IF relation statement ELSE statement
        | %prec(OPENIF) IF relation statement
The key here is that I am trying to resolve the ambiguity by declaring:
    %nonassoc CLOSEDIF OPENIF
so that closely matched if/else pairs take priority over more distant
dangling candidates, as per the example in
[http://lava.net/~newsham/pyggy/html/node29.html].
WITHOUT the precedence rules for the two IF productions, the grammar
behaves exactly as expected, except for the above ambiguity. However,
once I introduce these rules, parsing:
    if (rel1) stmt1 else if (rel2) stmt2
fails on the second if. Other sorts of if/else statements (ones that
don't involve 'ELSE statement' -> 'ELSE IF relation statement') parse
fine, and the dangling-else example above is disambiguated correctly.
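For reference, the behaviour I expect from the precedence rules is what a hand-written recursive-descent parser would do if it always gave an ELSE to the innermost open IF. The following is only a sketch of that expectation, not of how my parser generator works internally; the simplified tokens and the names `parse_statement` and `parse` are hypothetical.

```python
def parse_statement(tokens, i=0):
    """Parse one statement at tokens[i:]; return (tree, next_index)."""
    tok = tokens[i]
    if tok != "if":
        return tok, i + 1                      # simple statement
    rel = tokens[i + 1]                        # assumes a relation follows "if"
    then, j = parse_statement(tokens, i + 2)
    # Greedy step: if an ELSE follows, the innermost open IF takes it.
    if j < len(tokens) and tokens[j] == "else":
        other, k = parse_statement(tokens, j + 1)
        return ("if", rel, then, other), k
    return ("if", rel, then), j


def parse(tokens):
    """Parse a whole token list as a single statement."""
    tree, end = parse_statement(tokens)
    if end != len(tokens):
        raise SyntaxError(f"unexpected token at position {end}")
    return tree


print(parse("if rel1 if rel2 stmt1 else stmt2".split()))
print(parse("if rel1 stmt1 else if rel2 stmt2".split()))
```

Under this scheme the first input comes out as `('if', 'rel1', ('if', 'rel2', 'stmt1', 'stmt2'))` and the "else if" input as `('if', 'rel1', 'stmt1', ('if', 'rel2', 'stmt2'))`, so I would expect both to parse once the ambiguity is resolved this way.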
I fundamentally fail to see why introducing these precedence rules
should change this. Interestingly, swapping the higher and lower
precedence (to effectively group an ambiguous ELSE sub-statement with
the earliest, rather than the latest, unmatched IF) disambiguates in the
opposite way, as expected, *and* the parse then works for the oddly
problematic "else if" inputs.
Any pointers would be much appreciated, as I am very confused. I am
working furiously towards a publication deadline a few days away, and I
had taken for granted that my grammar was fine, since it has worked for
ages. Only this evening did I run into my first input in a very long
time with bracket-less if/else trees, which exposed this common
ambiguity.
Thanks.
-jrk