How change grammar to equivalent LL(1) ?

Andy

unread,

Dec 22, 2019, 7:06:44 PM12/22/19

to

Obviously if is possible.
In Polish Wikipedia can we read, that even very simple grammar:
expr->number '+' expr
expr->number
is not LL(1) bacause we must see '+' to distinguish

But
is posssible equivalent grammar:
expr -> number optPlusExpr
optPlusExpr -> epsilon
optPlusExpr ->'+' expr

What are general rules to change grammar to equivalent LL(1) grammar if possible?
[This topic is covered in every compiler textbooks. Or you can start
with this Wikipedia article
https://en.wikipedia.org/wiki/Left_recursion#Removing_left_recursion
-John]

Lasse Hillerøe Petersen

unread,

Dec 22, 2019, 10:53:40 PM12/22/19

to

On Sun, 22 Dec 2019 15:55:12 -0800, Andy wrote:
> https://en.wikipedia.org/wiki/Left_recursion#Removing_left_recursion
> -John]

As the WP-article mentions under pitfalls, the change from left- to right-
recursion changes the parse tree, which needs to be dealt with.

I am writing a "toy" LL(1) tool in Javascript/Nodejs, and I have found a
method which I think is quite "clever".

Expr: Term | Expr AddOp Term.
AddOp: "+" | "-".

becomes

Expr: Term Exprtailety '{ $$=$2($1) }'.
Exprtailety: '{ $$=(x)=>(x) }'
| AddOp Term Exprtailety '{ $$=(x)=>($3([$1, x, $2]) }'.
AddOp: "+" '{ $$=$1 }' | "-" '{ $$=$1 }'.

(I have a naming convention I find useful, with "tail" in the name to
denote such tailing productions, and also all nullables end in "ety" as
they may produce "empty".)

Using "Yacc-like" actions, I simply pass a chain of functions up through
the parse, which get called from the Sum rule and constructs a parse tree
turned the right way (that is: the left way.)

/Lasse

Christopher F Clark

unread,

Dec 23, 2019, 9:52:58 PM12/23/19

to

Just a slight comment on what
Lasse =?iso-8859-1?q?Hiller=F8e?= Petersen <lhp+...@toft-hp.dk> wrote:

The technique of changing:
Expr: num | num plus Expr;
to
Expr: num ExprTailOpt; ExprTailOpt: plus Expr | /* empty */;
is called left-factoring.

By the way, "Opt" is the usual suffix for Ety.

--
******************************************************************************
Chris Clark email: christoph...@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------

Hans-Peter Diettrich

unread,

Dec 23, 2019, 9:53:56 PM12/23/19

to

Am 23.12.2019 um 00:55 schrieb Andy:
> Obviously if is possible.
> In Polish Wikipedia can we read, that even very simple grammar:
> expr->number '+' expr
> expr->number
> is not LL(1) bacause we must see '+' to distinguish
>
> But
> is posssible equivalent grammar:
> expr -> number optPlusExpr
> optPlusExpr -> epsilon
> optPlusExpr ->'+' expr
>
> What are general rules to change grammar to equivalent LL(1) grammar if possible?

Use EBNF:
expr -> number { '+' expr }
or
expr -> number { '+' number }

EBNF or "railroad diagrams" typically translate directly into LL
top-down parsers. Many problems vanish when the grammar does not allow
for too much freedom causing inobvious problems. E.g. in EBNF only a
single derivation of a (left hand) nonterminal is allowed.

DoDi

Lasse Hillerøe Petersen

unread,

Apr 24, 2020, 12:42:41 PM4/24/20

to

I know this is a very late reply, however, I sometimes forget to read
Usenet news for a while, I hope the moderator is forgiving.

On Mon, 23 Dec 2019 05:57:50 -0500, Christopher F Clark wrote:

> Just a slight comment on what Lasse Hillerøe Petersen
> <lhp+...@toft-hp.dk> wrote:

> is called left-factoring.

I am aware, however the point was having the refactored action return a
function to adjust the direction of the parse tree. I am sure LISPers and
Schemers wouldn't consider this anything special (so ordinary perhaps
even, that I hadn't been able to find any written mention of it, until
today), but when I wrote it back in 2017 I looked at my code and thought
"hey, that's actually neat and general."

Only today did I actually manage to find a paper, which, although I am
very rusty in the matter of formal proofs and theory, being just an
amateur hacker, to me reads like the theory behind "my" method:

Thielecke, Hayo. (2012). Functional semantics of parsing actions, and
left recursion elimination as continuation passing. PPDP'12 - Proceedings
of the 2012 ACM SIGPLAN Principles and Practice of Declarative
Programming. 91-102. 10.1145/2370776.2370789.

> By the way, "Opt" is the usual suffix for Ety.

Not in the Algol68 Reports IIRC. ;-)

/Lasse
[Your moderator tends to draw the line at replies to messages
posted 15 or 20 years ago, generally via Google Groups. -John]

Kaz Kylheku

unread,

Apr 24, 2020, 2:42:18 PM4/24/20

to

On 2020-04-24, Lasse Hillerøe Petersen <lhp+...@toft-hp.dk> wrote:
> I know this is a very late reply, however, I sometimes forget to read
> Usenet news for a while, I hope the moderator is forgiving.
>
> On Mon, 23 Dec 2019 05:57:50 -0500, Christopher F Clark wrote:
>
>> Just a slight comment on what Lasse Hillerøe Petersen
>> <lhp+...@toft-hp.dk> wrote:
>
>> is called left-factoring.
>
> I am aware, however the point was having the refactored action return a
> function to adjust the direction of the parse tree. I am sure LISPers and
> Schemers wouldn't consider this anything special (so ordinary perhaps
> even, that I hadn't been able to find any written mention of it, until
> today), but when I wrote it back in 2017 I looked at my code and thought
> "hey, that's actually neat and general."
>
> Only today did I actually manage to find a paper, which, although I am
> very rusty in the matter of formal proofs and theory, being just an
> amateur hacker, to me reads like the theory behind "my" method:
>
> Thielecke, Hayo. (2012). Functional semantics of parsing actions, and
> left recursion elimination as continuation passing. PPDP'12 - Proceedings
> of the 2012 ACM SIGPLAN Principles and Practice of Declarative
> Programming. 91-102. 10.1145/2370776.2370789.

Both left and right recursion elimination are related to tail
optimization and continuation passing.

In a shift-reduce LALR(1) parser, right recursion consumes parser stack
space. If you can refactor it to left recursion, it becomes stackless.

There is also a strong connection to the "reduce" or "fold" function in
functional programming. We can linguistically identify the "reduce" in a
LALR(1) parser with "reduce", the function.

"reduce" takes an accumulator, initialized to some value, and then
decimates a sequence by repeatedly passing the accumulator as
the left argument to a function, and the successive items of the
sequence as the right argument. For each successive call, the return
value of the previous call is used as the accumulator.

If we write a left-recursive calculator using a parser generator, say
with addition as the binary op:

expr : expr '+' expr { $$ = $1 + $3; }
| expr { $$ = $1; }
;

expr : initial_value { $$ = $1 }
;

this behaves like an iterative reduce over an input like '1 + 2 + 3 ...'.

The accumulator is seeded with 1, and then threaded through the
successive reductions without consuming parser stack space.

Fold/reduce, grammar reductions, continuations and (tail-)recursion
are all closely related.

If a compiler's target run-time is continuation-based, then stackless
tail calls are trivial to implement. All functions return by invoking
their continuation already, and so to make a tail call to a function,
you just call it, and give it your *own* continuation as the
continuation argument. If that function invokes that continuationm, it
will "return" to wherever you would have returned.
To generate a regular non-tail call which will return back to the
caller, the caller captures a local continuation and gives the callee
that one.

The accumulator object in a reduce is a kind of continuation: it
summarizes everything that has been done so far, so the calculation can
continue withut having to regress anywhere. Old values of the
accumulator are never revisited, so the reduce job can be done
iteratively: by assigning the new accumulator value over the old one.
That is easily achieved without assignment by tail recursion.

--
TXR Programming Lanuage: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

silas poulson

unread,

Nov 11, 2020, 11:51:38 AM11/11/20

to

An even later response, but thought quote from course notes (§5.6.5
available here <http://cs.rhul.ac.uk/courses/CS3470/>) I'm currently
pursuing might be useful.

*LL(1) grammars*
Grammars which admit non-back-tracking top down LL(1) parsers are
precisely the ones which are left factored, follow determined and have
no left recursion.

Thus we have the following definition: A context-free grammar is LL(1) if
for all non-terminals A and productions A ::= α|β we have
1. first(α) ∩ first(β) = ∅
2. If A ∗⇒ ε then first(A) ∩ follow(A) =∅.