LALR parsing

A. Soare

unread,

Dec 4, 2009, 10:26:44 AM12/4/09

to

Hi,

Nowadays, the subject of LALR is well understood. I want to implement
a LALR parser for an arbitrary grammar (many common point to yacc,
however different).

I have read many classical articles and books. Among my studies:

- dragon book
- parsing techniques a practical guide, D Grune (the best that I read)
- many classical articles in the style Frank DeRemer and Thomas
Pennello. Efficient Computation of LALR(1) Look-Ahead Sets, etc.

However, I did read that these methods are not the most fast. The
fastest methods use matrix multiplications, etc. In the book of Grune
is it explained something about the equalence between parsing and
matrix multiplication, but I see nothing useful apart of some vagues
ideas.

My question for you is: what are the most efficient methods of lalr parsing?

I would like ask you to give me some references about.

Thanks in advance,

Alin Soare.

____________________________________________________

Michael Jackson, Susan Boyle, Black Eyed Peas ... Retrouvez leurs derniers titres sur http://musiline.voila.fr

Torben AEgidius Mogensen

unread,

Dec 7, 2009, 5:43:20 AM12/7/09

to

"A. Soare" <alin...@voila.fr> writes:

> I want to implement a LALR parser for an arbitrary grammar (many
> common point to yacc, however different).
>

> However, I did read that these methods are not the most fast. The
> fastest methods use matrix multiplications, etc. In the book of Grune
> is it explained something about the equalence between parsing and
> matrix multiplication, but I see nothing useful apart of some vagues
> ideas.
>
> My question for you is: what are the most efficient methods of lalr parsing?

LALR(1) defined a class of grammars that can be parsed with a
deterministic LALR(1) parser. These can be parsed in something
approaching linear time. But LALR(1) can not deterministically parse
arbitrary grammars. You can add backtracking to allow parsing of
arbitrary grammars, but that can in the worst case give exponential
runtimes. GLR parsers (see http://en.wikipedia.org/wiki/GLR_parser)
handle the multiple parses in a breath-first manner instead of a
depth-first backtracking, so they can merge identical states. This
reduces the worst case to O(n^3).

While the matrix-multiplication methods you refer to have better
worst-case performance, GLR will outperform them in most cases. Matrix
multiplication can be done in O(n^2.376), but the algorithm for this has
a huge constant factor overhead that makes the traditional O(n^3) method
faster except for extremely large matrices. So I doubt the O(n^2.376)
algorithm is of any practical use for parsing.

Note, however, that while it has been proven that the complexity of
parsing is bounded by the complexity of matrix multiplication, the
converse is not true: There may well be general parsers that are faster
than matrix multiplication. IIRC, the lower bound for CF parsing is
still O(n).

Torben

A. Soare

unread,

Dec 9, 2009, 7:17:02 AM12/9/09

to

> > I want to implement a LALR parser for an arbitrary grammar (many
> > common point to yacc, however different).
> >

> > However, I did read that these methods are not the most fast. The
> > fastest methods use matrix multiplications, etc. In the book of Grune
> > is it explained something about the equalence between parsing and
> > matrix multiplication, but I see nothing useful apart of some vagues
> > ideas.
> >
> > My question for you is: what are the most efficient methods of lalr parsing?
>

> LALR(1) defined a class of grammars that can be parsed with a
> deterministic LALR(1) parser. These can be parsed in something
> approaching linear time. But LALR(1) can not deterministically parse
> arbitrary grammars. You can add backtracking to allow parsing of
> arbitrary grammars, but that can in the worst case give exponential
> runtimes. GLR parsers (see http://en.wikipedia.org/wiki/GLR_parser)
> handle the multiple parses in a breath-first manner instead of a
> depth-first backtracking, so they can merge identical states. This
> reduces the worst case to O(n^3).

Thanks. My question is: what implementation of lalr is the most
efficient? There are lots of implementations. For example, Simple
computation of LALR (1) lookahead sets (
http://cat.inist.fr/?aModele=afficheN&cpsidt=6679232 ) presents a
simple method by building a secondary grammar, and computing there the
follow sets. This is one method.

Another question is: is there a method of computing the lalr sets (
starting from SLR ) or whatever else using matrix multiplication?

Another question is : what reference do you recommend me in order for
me to understand the equivalence between parsing and matrix
multiplication?

And, finally, what methods of computing lalr sets are the fastest ?
There are lots of methods...

> While the matrix-multiplication methods you refer to have better
> worst-case performance, GLR will outperform them in most cases. Matrix
> multiplication can be done in O(n^2.376), but the algorithm for this has
> a huge constant factor overhead that makes the traditional O(n^3) method
> faster except for extremely large matrices. So I doubt the O(n^2.376)
> algorithm is of any practical use for parsing.

I know about glr, but I am not interested in the subject of general
grammars, but only in lalr(1) grammars.

Thanks,

Alin

Danny Dubé

unread,

Dec 9, 2009, 8:59:14 PM12/9/09

to

Torben AEgidius Mogensen <tor...@diku.dk> writes:

> Note, however, that while it has been proven that the complexity of
> parsing is bounded by the complexity of matrix multiplication, the
> converse is not true: There may well be general parsers that are faster
> than matrix multiplication. IIRC, the lower bound for CF parsing is
> still O(n).

The following paper explains that a fast parsing algorithm can be
turned into a fast (but not AS fast, though) binary matrix
multiplication algorithm.

@article{505242,
author = {Lee, Lillian},
title = {Fast context-free grammar parsing requires fast
boolean matrix multiplication},
journal = {J. ACM},
volume = {49},
number = {1},
year = {2002},
issn = {0004-5411},
pages = {1--15},
doi = {http://doi.acm.org/10.1145/505241.505242},
publisher = {ACM},
address = {New York, NY, USA},
}

--
Danny DubC), professeur
DC)partement d'informatique et de gC)nie logiciel
UniversitC) Laval

Torben �gidius Mogensen

unread,

Dec 10, 2009, 4:12:35 AM12/10/09

to

Danny...@ift.ulaval.ca (Danny Dubi) writes:

>> There may well be general parsers that are faster than matrix
>> multiplication. IIRC, the lower bound for CF parsing is still O(n).

> The following paper explains that a fast parsing algorithm can be
> turned into a fast (but not AS fast, though) binary matrix
> multiplication algorithm.

> author = {Lee, Lillian},

> title = {Fast context-free grammar parsing requires fast
> boolean matrix multiplication},

I read the abstract of that paper some time ago, and as far as I recall,
the essense is that you can multiply two sqrt(n) x sqrt(n) binary
matrices in the time it takes to parse a length-n string. So if you can
multiply binary matrices in O(n^2) time, you _might_ be able to parse in
linear time. But having only read the abstract, I can't say for sure.

Also, there may be difference between parsing (constructing a
derivation) and recognition (deciding membership). It is plausible that
you can do the latter faster.

Torben

Russ Cox

unread,

Dec 11, 2009, 12:43:37 PM12/11/09

to

>> B author = {Lee, Lillian},
>> B title = {Fast context-free grammar parsing requires fast
>> B B B B B boolean matrix multiplication},

>
> I read the abstract of that paper some time ago, and as far as I recall,
> the essense is that you can multiply two sqrt(n) x sqrt(n) binary

> matrices in the time it takes to parse a length-n string. B So if you can

> multiply binary matrices in O(n^2) time, you _might_ be able to parse in
> linear time.

I think this came across weaker than intended: the _might_ here is
formally a negation. That is, since fast parsing implies fast matrix
multiplication, inability to do fast matrix multiplication implies
inability to do fast parsing. So unless you can multiply binary
matrices in something approaching quadratic time, you _cannot_ parse
in linear time.

The abstract fills in the details better than I can:

In 1975, Valiant showed that Boolean matrix multiplication can be
used for parsing context-free grammars (CFGs), yielding the
asymptotically fastest (although not practical) CFG parsing
algorithm known. We prove a dual result: any CFG parser with time
complexity O(g*n^(3-epsilon)), where g is the size of the grammar
and n is the length of the input string, can be efficiently
converted into an algorithm to multiply m x m Boolean matrices in
time O(m^(3-epsilon/3)). Given that practical, substantially
subcubic Boolean matrix multiplication algorithms have been quite
difficult to find, we thus explain why there has been little
progress in developing practical, substantially subcubic general
CFG parsers. In proving this result, we also develop a
formalization of the notion of parsing.

The more precise version of the statement above is therefore:
Unless you can do matrix multiplication in O(m^(2 1/3)) time,
you cannot do context free parsing in O(g n) time.

Getting back to the original post, the connection between matrix
multiplication and parsing applies to general CFGs, not LALR(1),
which are a significantly restricted subset.

There are many aspects to writing a fast LALR(1) parser, but none of
them derive from the matrix multiplication connection: in this
context, general context-free parsers solve a needlessly hard problem.

LALR(1) parsing runs in linear time in the size of the input already,
and since you have to process the input, you can't do better. Actual
implementations may have different constant factors, of course, but
most papers talking about efficient LALR implementations mean the
preprocessing that converts the grammar into the LALR tables used
during the parse. That step is superlinear in the size of the grammar
and can be significant for large grammars. On the other hand, if
you're writing a tool like yacc, it may not matter at all: in my own
builds, the compilation of the rest of the program always takes much
longer than yacc's table generation. (It does matter a lot if you're
processing new grammars on the fly at run time; the original post
doesn't say.)

Russ

Torben �gidius Mogensen

unread,

Dec 14, 2009, 11:34:29 AM12/14/09

to

Russ Cox <r...@swtch.com> writes:
> We prove a dual result: any CFG parser with time complexity
> O(g*n^(3-epsilon)), where g is the size of the grammar and n is
> the length of the input string, can be efficiently converted into
> an algorithm to multiply m x m Boolean matrices in time
> O(m^(3-epsilon/3)).
>

> The more precise version of the statement above is therefore:
> Unless you can do matrix multiplication in O(m^(2 1/3)) time,

> you cannot do context free parsing in O(g*n) time.

http://en.wikipedia.org/wiki/Matrix_multiplication#Algorithms_for_efficient_matrix_multiplication
states that the asymptoticallty fastest known square matrix
multiplication algorithm is O(m^2.376), which isn't that far from
O(m^2.3333). Since O(m^2.376) is "only" the currently best known
result, it is quite plausible that faster methods exist. So I can't see
the statement itself being a serious impediment to linear-time general
CF parsing.

Personally, I would not look at matrix multiplication for fast general
parsing, but rather look at extensions to deterministic push-down
automata. Two-way deterministic pushdown automata (which are O(n) on a
RAM) can parse surprisingly complex grammars (including many non-CF
grammars), and AFAIK it has not been proven that they can not parse all
CF grammars. So they seem like a good point to start looking for fast
parsing.

Torben

Matthias-Christian ott

unread,

Dec 14, 2009, 3:28:42 PM12/14/09

to

On Mon, Dec 14, 2009 at 05:34:29PM +0100, Torben o?=gidius Mogensen wrote:
> Russ Cox <r...@swtch.com> writes:
> > We prove a dual result: any CFG parser with time complexity
> > O(g*n^(3-epsilon)), where g is the size of the grammar and n is
> > the length of the input string, can be efficiently converted into
> > an algorithm to multiply m x m Boolean matrices in time
> > O(m^(3-epsilon/3)).
> >
> > The more precise version of the statement above is therefore:
> > Unless you can do matrix multiplication in O(m^(2 1/3)) time,
> > you cannot do context free parsing in O(g*n) time.
>
> http://en.wikipedia.org/wiki/Matrix_multiplication#Algorithms_for_efficient_matrix_multiplication
> states that the asymptoticallty fastest known square matrix
> multiplication algorithm is O(m^2.376), which isn't that far from
> O(m^2.3333). Since O(m^2.376) is "only" the currently best known
> result, it is quite plausible that faster methods exist. So I can't see
> the statement itself being a serious impediment to linear-time general
> CF parsing.

This is also the opinion of Grune and Jacobs [1], but just because
there hasn't been a prove that general CF parsing isn't possible in
O(n). So all statements about the possibility of general CF parsing in
O(n) are simply speculations.

They also claim that boolean matrix multiplication requires at least
O(n^2) actions, so such a linear parsing algorithms is probably not
based on it or has correlation with it.

Regards,
Matthias-Christian

[1] Dick Grune, Ceriel J.H. Jacobs (2008), Parsing Techniques.
A Practical Guide (2nd ed.), Springer, New York, p. 100,
ISBN-13 978-0-387-20248-8

Chris F Clark

unread,

Dec 25, 2009, 2:18:23 PM12/25/09

to

First, of all, let me apologise in advance for commenting on things
where I am "out of my depth". In addition, I have not read or studied
this particular aspect in a while, so my failing memory may introduce
additional errors.

You may want to look at the paper by Park, Choe, & Chang. They talk
about efficiently computing the lookahead sets needed in constructing
LALR tables. Googling them I found a related paper by Frank Ives that
may be of interest.

In any case, computing the lookahead sets requires computing
transitive closures. I believe there are some algorithms that are
asymptotitcally efficient that are based upon modifications of matrix
multiplication. (They remind me how certain Galois fields, GF(2**k),
can be mapped to carry-less multiplication using shifts and xors.)

Thus, I think if you read the literature, you will find an algorithm
for efficiently computing LALR tables using matix muliplication over a
Boolean field to compute certain required sets.

However, as others have said, this has no impact on the efficiency of
the generated parser, which is why I haven't studied it in more depth.
Even naive algorithms from computing the lookahead sets (and thus
performing conflict resolution) yield the same LALR tables. In fact,
there is a whole family of tables that are essentially the same (all
based on the SLR(0) tables) and the same size, the only distinction
being what reduce actions are inserted (how conflicts are resolved).
The differences in these tables are not the speed at which they parse,
but the ability to distinguish more subtle language cases (i.e. there
are languages which are LALR that are not SLR, and the only
distinction is that the LALR tables make finer conflict resolutions
than the SLR algorithm does, i.e. SLR declares a conflict and LALR
inserts reduce actions that allow it to parse the language).

If you are interested in making a more efficient parser (rather than a
more efficient parser table generator), it is hard to beat the
aymptoptic efficiency of LR style tables. The resulting parser is
linear in the input size and you can't do better than that, because
one has to look at every input symbol at least once to correctly
distinguish all cases.

That being said, one can affect the constant multiplier, how many
operations are done to process each input symbol. There isn't a lot
of research into this, since it isn't generally a theoretical issue.
Moreover, low level issues often overwhelm the theoretical ones. The
best performancs I've seen mentionied in a paper is from Tom Penello's
work on very fast LR parsing--essentially one generates an assmebly
language program them implements the tables. That's not that
theoretically interesting, but it is a very pragmatic solution.

On a more theoretical bent, one can eliminate many of the push actions
in an LR parser. You don't have to keep all the symbols on the stack.
You still have to read them, but your shift action can simply drop
them on the floor. There was a paper written up on this and the
resulting tool was called either Mule or Ox--I can't recall which for
certain. The other tool adds attribute grammar processing to yacc.

We use a similar technique in Yacc++ to help us solve the ELR problem.
In ELR parsing, one has variable length right-hand-sides (RHSs) and
one needs to pop a variable number of them when one reduces. That are
a variety of solutions to this problem, but the one which we use[d],
is to mark the starts of productions with pushes on a special stack.
This stack has the same flavor as the one mentioned in the preceeding
paragraph.

At the practical level, the speed of the resulting parser isn't a
particularly large issue. It is almost always drawfed by the speed of
the lexer. The reason being obvious, for a parser, you are doing
operations at the token level, at the lexer you are doing operations
at the character level, unless many of your tokens are only 1
character, the lexer speed is magnified in importance by mutliplying
by the average token length over the parser speed.

In either case, the lexer or the parser, you can get the cost down to
a few instructions per symbol, and for most cases get your entire set
of tables into a modern x86 procesors cache, such that you are
effectively bounded only by how fast one can read the input into the
computer.

There are places where automata performance at the low level matters,
such as network security (e.g. virus scanning). However, it isn't
garden variety parsing. We already know how to get the cost down to a
few machine instructions per character.

Moreover, don't be surprised when there are hardware implementations
of various FA algorithms that run at 1 byte per cycle as part of
"standard PCs". These machines are needed for the above mentioned
security applications. Eventually, you will need this security on
every device that can talk to any network. Thus, this hardware will
be everywhere. When that happens, the hardware will be there to do
lexing and parsing "for free"--i.e. as you read the text off the disk
(or network or whatever), the text will be lexed and parsed as fast as
the bytes are streamed in.

Hope this helps,
-Chris

******************************************************************************
Chris Clark email: christoph...@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris