So far,
LL(1) = scans input left-to-right, produces output for the
left-most bits first, and looks ahead at most 1 symbol at a time.
LR(1) = scans input left-to-right, produces output for the
right-most bits first (but does so in reverse order for some reason!),
and looks ahead at most 1 symbol at a time.
LALR(1) = Look-Ahead LR(1), but the (1) indicates that it looks
ahead 1 symbol anyway, so what is the difference? I have seen clear
statements that LR(1) is NOT the same as LALR(1).
Then there are SLR and SLALR grammars and quite possibly others that I
have not come across yet. All the books I have read use at least some
of these terms but I can find no systematic attempt to say what the
differences are.
Can anyone help or at least suggest a book with clear definitions?
Thank you.
Joe
> LL(1) = scans input left-to-right, produces output for the
>left-most bits first, and looks ahead at most 1 symbol at a time.
LL(1) - what you can parse when you're using a top-down parser
generator like RDP. Kind of like recursive descent in power.
> LR(1) = scans input left-to-right, produces output for the
>right-most bits first (but does so in reverse order for some reason!),
>and looks ahead at most 1 symbol at a time.
LR(1) - what you can parse when you're using a really good commercial
bottom-up parser generator like Yacc++. Not seen much in the real
world otherwise, because the internal tables tend to get too big.
> LALR(1) = Look-Ahead LR(1), but the (1) indicates that it looks
>ahead 1 symbol anyway, so what is the difference? I have seen clear
>statements that LR(1) is NOT the same as LALR(1).
LALR(1) - what you can parse when you're using Yacc and its children.
Like LR(1), but with some additional restrictions on things that you
can parse. In practice these restriction don't get in your way much,
and LALR(1) grammars can express most normal programming-language
features. (C++ is, of course, abnormal in this area.) Most real-
world compilers are either LALR(1) using something like Yacc, or
hand-coded recursive-descent.
>Then there are SLR and SLALR grammars and quite possibly others that I
>have not come across yet. All the books I have read use at least some
>of these terms but I can find no systematic attempt to say what the
>differences are.
Not of much use, because LALR(1) is more general than SLR and LALR(1)
tools are widely available.
>Can anyone help or at least suggest a book with clear definitions?
You don't really want clear definitions, would be my guess, because
exact specs of these involve learning language and parsing theory.
You'd need to sludge through the swamp like the rest of here have
done: if you really really want to, then get the dragon book
(Compilers: Principles, Techniques, and Tools, by Aho, Sethi, and
Ullman) and allocate a few months to get up to speed. It's fun,
but I only did it because I couldn't not do it.
The other, gentler approach is to get John's book (lex & yacc, by
John Levine) and a copy of byacc and flex off the net and learn
how to write parsers by going through his examples.
Or, you can go get RDP
http://www.cs.rhbnc.ac.uk/research/languages/rdp.shtml
which is a lot easier to learn, and generates C code that you can
understand, instead of a lot of unfathomable tables.
Have fun!
Dwight
LL(k) means a top-down parser can be created for the parser with a max
lookahead of k symbols.
LR(k) means a bottom-up parser can be created for the parser with a max
lookahead of k symbols.
LALR(1) grammars are a subset of LR(1) grammars, requiring smaller
parsing table. (speed up gain vs. complication of error reporting and
recovery).
SLR(1) parsing is a hack of LR(0) parsing: you attempt to construct a
LR(0) parser for it, and if there are only minor glitches, you slap on
some extra control.
Because these parsing processes are rather complex, it's a lot of work
to check if a grammar is LR(k),SLR(1) or LALR(1)
Theory and Practice of Compiler Writing, by Tremblay and Sorenson,
McGraw Hill, is an in-depth work, but not for the technically
challenged.
JPA
Joe Hotchkiss <joe.ho...@gecm.com> writes:
>I have been writing a small recursive descent parser, mostly for my own
>amusement, and have been trying to document it for others to use. In
>doing so I have tried to give definitions of some terms, but I can't
>actually find a clear definition of the various types of grammar.
OK, here goes...
> LL(1) = scans input left-to-right, produces output for the
>left-most bits first, and looks ahead at most 1 symbol at a time.
LL(1) = scans input from left-to-right, produces a leftmost derivation
top-down, 1 symbol of lookahead
LL(k) is precisely the class of grammars which can be deterministically
parsed using a recursive descent parser where parsing decisions are based
on left context and k symbols of lookahead only.
> LR(1) = scans input left-to-right, produces output for the
>right-most bits first (but does so in reverse order for some reason!),
>and looks ahead at most 1 symbol at a time.
LR(1) = scans input from left-to-right, produces a rightmost derivation
in reverse bottom-up, 1 symbol of lookahead
LR(k) is precisely the class of grammars which can be deterministically
parsed using a shift-reduce (or recursive ascent) parser where parsing
decisions are based on left context and k symbols of lookahead only.
> LALR(1) = Look-Ahead LR(1), but the (1) indicates that it looks
>ahead 1 symbol anyway, so what is the difference? I have seen clear
>statements that LR(1) is NOT the same as LALR(1).
LALR(k) is not a simple class of grammars to reason about since its
definition is based on the kind of automaton used to parse it. LALR(k)
is, basically, what you get when you take an LR(k) parser and squish
it down to LR(0) size, possibly losing information along the way.
>Then there are SLR and SLALR grammars and quite possibly others that I
>have not come across yet.
SLR(k) is what you get when you start with an LR(0) parser and approximate
lookahead information by using follow sets. It's rarely used nowadays
since:
- People generally don't use k > 1
- LALR(1) is a superset of SLR(1)
- Modern LALR(1) generators are extremely efficient, and the
time saved generating SLR(1) parsers just isn't worth it.
Again, it's a bit hard to reason about SLR, because its definition is
based on the automaton which parses it. It's a bit easier than LALR,
though...
I don't know about SLALR.
As for a book, Nigel Chapman's "LR Parsing: Theory and Practice" is
a good one if you want to delve into the topic, and the definitions
are presented well.
Cheers,
Andrew Bromage
ANTLR is a recursive descent parser generator from www.antlr.org. You
might find it enlightening to read the draft chapters from Terence's
forthcoming book. One chapter shows exactly what you would write by
hand and then how you can simplify your life by generating that code.
If you download Terence's thesis from that website there are a whole
slew of formal definitions of LL, LR, LALR, etc., mostly skewed to
discuss the issue of linear approximate lookahead.
Monty
One resuurce that you may find useful (but it may also be an overkill) is
http://www.cs.vu.nl/~dick/PTAPG.html which contains an online book about
parsing.
Enjoy!
Ehud Lamm msl...@pluto.mscc.huji.ac.il