Error message when generating parser / dangling else problem

183 views
Skip to first unread message

Steve Stewart

unread,
Jul 22, 2012, 6:07:09 PM7/22/12
to sab...@googlegroups.com
Hi there,

I'm looking for some help regarding an error message that I'm receiving when trying to generate a parser using SableCC 3.3.

Here is the error message:

Generating the parser.

.............

.............

.............

..

...java.lang.NullPointerExceptionnull


at org.sablecc.sablecc.ComputeInlining.computeInlining(Unknown Source)

at org.sablecc.sablecc.GenParser.caseStart(Unknown Source)

at org.sablecc.sablecc.node.Start.apply(Unknown Source)

at org.sablecc.sablecc.SableCC.processGrammar(Unknown Source)

at org.sablecc.sablecc.SableCC.processGrammar(Unknown Source)

at org.sablecc.sablecc.SableCC.main(Unknown Source)


I'm working on a 'dangling else' problem, and (for testing) I've restricted the grammar to the relevant productions for handling a version of this problem.

Let me explain. We have expressions of the following form, where E is a non-terminal and '=' separates the left and right side of the following two production rules for E:

E = E => E
E = E => E else E

(The "=>" can be read as "implies.")

Obviously, we want to be able to resolve the inherent ambiguity of the dangling else problem; in other words, we want to match each "else" with the closest preceding "=>". I've mocked this up as follows:

s = {unmatched} u |

{matched} m;

m = e implies [m1]:m else [m2]:m |

{expr} e;

u = {case1} e implies s |

{case2} e implies m else u;

e = s |

{num} number;


I've more or less copied the solution to the dangling else problem from p.67 of Appel's "Modern Compiler Implementation" text book.

The error message only seems to occur in the presence of the "e = s" alternative for production rule "e." 

Perhaps I have not correctly disambiguated the grammar, but (regardless) I'm not sure how to interpret the error message, and so any help would be appreciated. Any thoughts as to why "computeInlining" would be throwing a null pointer exception?

Thanks,
Steven Stewart


Etienne Gagnon

unread,
Jul 22, 2012, 6:53:02 PM7/22/12
to sab...@googlegroups.com
Hi Steven,

You've found a bug (somewhere in the inlining code). Could you provide a minimal grammar causing the null pointer exception?

Thanks for the bug report,

Etienne
Etienne Gagnon, Ph.D.
http://sablecc.org
--
-- You received this message because you are subscribed to the SableCC group. To post to this group, send email to sab...@googlegroups.com. To unsubscribe from this group, send email to sablecc+u...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/sablecc?hl=en
 
 

Steve Stewart

unread,
Jul 23, 2012, 7:07:09 AM7/23/12
to sab...@googlegroups.com
Thanks for taking a look!

Here is my "test.grammar" file that I'm using that leads to the error:

Package ca.stevenstewart;


Helpers

digit = ['0'..'9'];

tab = 9;

cr = 13;

lf = 10;

eol = cr lf | cr | lf;

whitespace = (' ' | tab | eol)+;


Tokens

else = 'else';

implies = '=>' | 'implies';


number = digit+;

whitespace = whitespace;


Ignored Tokens

whitespace;


Productions

s = {unmatched} u |

{matched} m;

m = e implies [m1]:m else [m2]:m |

{expr} e;

u = {case1} e implies s |

{case2} e implies m else u;

e = s |

{num} number;


Here is the complete console output from SableCC:

SableCC version 3.3

Copyright (C) 1997-2012 Etienne M. Gagnon <ega...@j-meg.com> and

others.  All rights reserved.


This software comes with ABSOLUTELY NO WARRANTY.  This is free software,

and you are welcome to redistribute it under certain conditions.


Type 'sablecc -license' to view

the complete copyright notice and license.



 -- Generating parser for test.grammar in /Users/macbookpro/Documents/A4/test/src

Adding productions and alternative of section AST.

Verifying identifiers.

Verifying ast identifiers.

Adding empty productions and empty alternative transformation if necessary.

Adding productions and alternative transformation if necessary.

computing alternative symbol table identifiers.

Verifying production transform identifiers.

Verifying ast alternatives transform identifiers.

Generating token classes.

Generating production classes.

Generating alternative classes.

Generating analysis classes.

Generating utility classes.

Generating the lexer.

 State: INITIAL

 - Constructing NFA.

.......................

 - Constructing DFA.

..............................................

....................

 - resolving ACCEPT states.

Generating the parser.

.............

.............

.............

..

...null

java.lang.NullPointerException

at org.sablecc.sablecc.ComputeInlining.computeInlining(Unknown Source)

at org.sablecc.sablecc.GenParser.caseStart(Unknown Source)

at org.sablecc.sablecc.node.Start.apply(Unknown Source)

at org.sablecc.sablecc.SableCC.processGrammar(Unknown Source)

at org.sablecc.sablecc.SableCC.processGrammar(Unknown Source)

at org.sablecc.sablecc.SableCC.main(Unknown Source)


Steven Stewart

Steve Stewart

unread,
Jul 23, 2012, 8:04:04 AM7/23/12
to sab...@googlegroups.com
Incidentally, I can see that the alternative "e = s" in that grammar would certainly lead to conflicts, but it still highlights the existence of the bug.

-Steven

Etienne Gagnon

unread,
Jul 23, 2012, 9:17:01 AM7/23/12
to sab...@googlegroups.com
Hi Steven,

Thanks a lot. I'll be back with some suggestions.

Etienne

Etienne Gagnon, Ph.D.
http://sablecc.org

On 2012-07-23 07:07, Steve Stewart wrote:
> Here is my "test.grammar" file that I'm using that leads to the error:
> [...]

Etienne Gagnon

unread,
Jul 23, 2012, 11:08:06 PM7/23/12
to sab...@googlegroups.com
Hi Steven,

Thanks to your test case, I was able to locate and fix the bug.

SableCC was trying to inline the implicit Start production (involved in a reduce/accept conflict), causing the null pointer exception.

I have released SableCC 3.4 which fixes the problem. See: http://sablecc.org/wiki/DownloadPage

I'll address your original problem in another message.


Etienne
Etienne Gagnon, Ph.D.
http://sablecc.org

Etienne Gagnon

unread,
Jul 23, 2012, 11:24:02 PM7/23/12
to sab...@googlegroups.com
Hi Steven,

Your little grammar combines both of the two common (and famous) ambiguities:
  1. expression grammar (precedence and associativity)
  2. dangling else (attached to closest enclosing statement)
Your problem was that you only attempted to attack the dangling else problem. In other words, you must also specify how you want the following to be parsed:
 3 => 5 => 6
is it(3 => 5) => 6 or 3 => (5 => 6) ?

I have chosen right associativity in the solution attached to this message.

You'll notice that I isolated the term production. I have also used right recursion to imply right associativity. Finally, I have introduced the e_no_missing_else production for handling the dangling-else problem.

Have fun!

Etienne

On Sunday, July 22, 2012 6:07:09 PM UTC-4, Steve Stewart wrote:

[...]I'm working on a 'dangling else' problem, and (for testing) I've restricted the grammar to the relevant productions for handling a version of this problem.

Let me explain. We have expressions of the following form, where E is a non-terminal and '=' separates the left and right side of the following two production rules for E:

E = E => E
E = E => E else E

(The "=>" can be read as "implies.")

Obviously, we want to be able to resolve the inherent ambiguity of the dangling else problem; in other words, we want to match each "else" with the closest preceding "=>". I've mocked this up as follows:
[...]


I've more or less copied the solution to the dangling else problem from p.67 of Appel's "Modern Compiler Implementation" text book.
[...]
test.grammar

Phuc Luoi

unread,
Jul 24, 2012, 6:51:23 AM7/24/12
to sab...@googlegroups.com
Hi Etienne,

I try to compile the grammar in the test case. The result is an error
message shift/reduce
conflict. It is the expected result, isn't it?

Hong Phuc

Steve Stewart

unread,
Jul 24, 2012, 7:13:26 AM7/24/12
to sab...@googlegroups.com
You've no doubt saved me a few hours as I wasn't thinking about the associativity of the implies operator, having been distracted by the dangling else. It's been a few years since I did these sort of exercises in my undergraduate compilers course.

I understand your explanation perfectly, and thanks for taking the time to help me out, as well as addressing the bug that arose.

-Steve


On Monday, July 23, 2012 11:24:02 PM UTC-4, Etienne Gagnon wrote:
Hi Steven,

Your little grammar combines both of the two common (and famous) ambiguities:
  1. expression grammar (precedence and associativity)
  2. dangling else (attached to closest enclosing statement)
<snip>

Etienne Gagnon

unread,
Jul 24, 2012, 7:14:11 AM7/24/12
to sab...@googlegroups.com
Hi Hong Phuc,

Yes, it is expected. The reduce/accept conflict was causing the bug
during an inlining attempt. SableCC only reports conflicts after it is
done with inlining.

Etienne

Etienne Gagnon, Ph.D.
http://sablecc.org

Phuc Luoi

unread,
Jul 24, 2012, 9:24:51 AM7/24/12
to sab...@googlegroups.com
Hi Etienne.
Thank you.

Steve Stewart

unread,
Dec 1, 2012, 9:25:28 PM12/1/12
to sab...@googlegroups.com
Hi there,

I've returned to look at the dangling else problem that I was encountering a while back, and unfortunately it is not resolved. Referring back to your solution, the problem that I encounter is that a "Term" (in my grammar) can again be another expression of the form "expr => expr" or "expr => expr else expr". This causes a shift/reduce conflict.

Any thoughts/suggestions? (file is attached)

Many thanks,
Steven

--
-- You received this message because you are subscribed to the SableCC group. To post to this group, send email to sab...@googlegroups.com. To unsubscribe from this group, send email to sablecc+u...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/sablecc?hl=en
 
 



--
Steven Stewart
o6...@unb.ca
simple.grammar

Steve Stewart

unread,
Dec 1, 2012, 9:27:10 PM12/1/12
to sab...@googlegroups.com
To clarify, in the attached file, you'll see "expr" in place of term. Then, you'll see a production rule "unOp e" that refers back to the top-level production 'e' because it should be possible to have an "implies expression" following a unOp token. This is just a snippet of the grammar.

Thanks,
Steven

Steve Stewart

unread,
Dec 2, 2012, 4:06:58 PM12/2/12
to sab...@googlegroups.com, steven.t...@unb.ca, o6...@unb.ca
Ok, for anyone interested, I think I may have it worked it out now following an example in Andrew Appel's book on compiler implementation:

start = {matched} matched |

{unmatched} unmatched

;

matched = expr implies [m1]:matched else [m2]:matched |

{other} expr

;

unmatched = expr implies start |

{else} expr implies matched else unmatched

;

expr = {const} const |

{unary} un_op expr |

;

const = {number} number;

un_op = {not} not;

Etienne Gagnon

unread,
Dec 4, 2012, 11:36:35 AM12/4/12
to sab...@googlegroups.com
Hi Steve,

Your new grammar accepts a different language than your original grammar. Is this really what you want?

If you haven't noticed, your new grammar allows for "empty" expressions. So, the following sequence of tokens is accepted by your new grammar, while it was rejected by the previous one: EOF  (i.e. an empty program).

Etienne
P.S. see the red "|" and ";" symbols below.

Etienne Gagnon, Ph.D.
http://sablecc.org
On 2012-12-02 16:06, Steve Stewart wrote:
Ok, for anyone interested, I think I may have it worked it out now following an example in Andrew Appel's book on compiler implementation:
[...]

Etienne Gagnon

unread,
Dec 4, 2012, 12:01:22 PM12/4/12
to sab...@googlegroups.com
Hi Steven,

I think that the biggest difficulty with your original grammar is, actually, the naming you used, which obscures its problem.

An important part of eliminating expression ambiguities is to properly name productions and order them in reverse precedence order, from lowest precedence to highest, where term is the last and atomic (not left nor right-recursive) production.

So, I first renamed the productions of your original grammar:
 e -> expr
 expr -> term

It became obvious that something was not right. Your term production was not atomic, it was (indirectly) right-recursive:
  term =
    {const} const |
    {unary} un_op expr;
To solve this, I split the term production into two distinct productions:
  left_unary =
    {unary} un_op left_unary |
    {simple} term;

  term =
    {const} const;
Of course, I made left_unary directly right-recursive (as usual, the indirect recursion was the source of the ambiguity). The result was accepted by SableC, as expected.

I have attached the original grammar with renaming and the conflict-less version.

Have fun!

Etienne

Etienne Gagnon, Ph.D.
http://sablecc.org
simple-no-conflict.sablecc
simple.sablecc
Reply all
Reply to author
Forward
0 new messages