Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

A non-LL(1) language

50 views

Skip to first unread message

Troy Vasiga

unread,

Nov 5, 2009, 2:34:37 PM11/5/09

I mentioned in class today that the language:

L = {BOF a^n b^m EOF: n >= m >= 0}

is not LL(1) (in fact, it is not LL(k) for any fixed k)

A student thought that he had proved me wrong, since he came up with an
LL(1) grammar for it.

The grammar he came up with was:

S'-> BOF S EOF
S -> aSB
S -> epsilon
B -> b
B -> epsilon

In the follow-up message, I will explain why it is not LL(1). You
should ask yourself, before reading that follow-up, why it is not LL(1).

Troy (t.v.)

Troy Vasiga

unread,

Nov 5, 2009, 10:18:32 PM11/5/09

The problem with the grammar is not with the a's: it is a problem with
the b's.

Notice that b is in the First(B).

Also notice that b is in the Follow(B), and since B can disappear, we do
need to use this rule for a possible expansion of B.

So, both rule B rules (i.e. going to epsilon and going to bB) are valid
rules to expand B by when given a single b.

Troy (t.v.)

Gordon V. Cormack

unread,

Nov 6, 2009, 9:55:42 AM11/6/09

to Troy Vasiga

A more concrete example of essentially the
same language is the "dangling else problem"
that arises in C, C++ and Java (but not WL).

1. stmt -> IF expr THEN stmt
2. stmt -> IF expr THEN stmt ELSE stmt

Clearly, as given, the grammar is not LL(1) because
Predict(stmt,IF) contains rules 1 and 2.

Factoring doesn't help:

1. stmt -> IF expr THEN stmt elsepart
2. elsepart -> <epsilon>
3. elsepart -> ELSE stmt

Here, Predict(elsepart, ELSE) contains rules
2 and 3.

The reason it contains rule 2 is because the
following sentential form can occur:

IF expr THEN IF expr THEN stmt ELSE stmt

When we see the ELSE, we don't know whether
to expand rule 2 or rule 3.

As it happens, this grammar is ambiguous, as
there are two distinct parse trees for the
given sentential form. It is possible to
write an unambiguous grammar for the same
language:

1. stmt -> complete
2. stmt -> incomplete
3. complete -> IF expr THEN complete ELSE complete
4. incomplete -> IF expr THEN stmt
4. incomplete -> IF expr THEN complete ELSE incomplete

This grammar is unambiguous, and LR(1). But it
is not LL(1). Factor as you may, you cannot make
it LL(1).

From the perspective of syntax-directed translation
this grammar does "what you want" -- it matches the
ELSE to the most recent IF statement.

But is "what you want" in fact an artifact of
"what you can easily implement?" In the primordial
days of computing, "the dangling else problem"
caused much consternation and debate.

To muddy the water further, I note that the LL(1)
predictor table can be "tweaked" to accept this
language by simply deleting the rule
"elsepart -> <epsilon>" from Predict(elsepart,ELSE).
When you tweak the Predict table, you forsake the
formal specification afforded by the context-free
grammar, and must find other ways of reasoning
about what language is accepted by your "parser."

0 new messages