Modified Earley in Haskell

146 views
Skip to first unread message

Aristotle Pagaltzis

unread,
Apr 26, 2015, 2:43:33 AM4/26/15
to marpa-...@googlegroups.com
Hey Jeffrey,

maybe this guy is a potential convert? :-)

https://github.com/ollef/Earley

> The parsing algorithm that this library uses is based on Earley's
> parsing algorithm. The algorithm has been modified to produce online
> parse results, to give good error messages, and to allow garbage
> collection of the item sets. Essentially, instead of storing
> a sequence of sets of items like in the original algorithm, the
> modified algorithm just stores pointers back to sets of reachable
> items.
>
> The worst-case run time performance of the Earley parsing algorithm is
> cubic in the length of the input, but for large classes of grammars it
> is linear. It should however be noted that this library will likely be
> slower than most parser generators and parser combinator libraries.
>
> The parser implements an optimisation similar to that presented in
> Joop M.I.M Leo's paper _A general context-free parsing algorithm
> running in linear time on every LR(k) grammar without using lookahead_
> which removes indirections in sequences of non-ambiguous backpointers
> between item sets.
>
> The implemented algorithm handles all CFGs, with one caveat: any
> production that accepts the empty string (i.e. an epsilon production)
> is only expanded once per position in the input string. This means
> that the parser does not give an infinite number of parse results for
> such grammars. This is usually not a problem.

No idea (and no clue that I can see) whether he is aware of Marpa and/or
your research.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>

Michael Alan Dorman

unread,
Apr 26, 2015, 8:04:34 AM4/26/15
to marpa-...@googlegroups.com
When it was mentioned on Reddit, one or two people brought up Marpa:

http://www.reddit.com/r/haskell/comments/33l9cr/ann_earley_parsing_library/

Mike.

Jeffrey Kegler

unread,
Apr 26, 2015, 12:18:54 PM4/26/15
to Marpa Parser Mailing LIst
Olle mentions in the comments that he is aware of Marpa, but he did not try a full implementation because of the preprocessing required.   I think he avoids extensive pre-processing because his main interest is in the semantics, which can get lost in the pre-processing.  [ Caveat: I have not read Olle's code, just skimmed his docs. ]

Marpa does lots of pre-preprocessing, and in a semantics-safe way.  I've recently generalized my method and should write it up.  It might encourage Olle to be more bold in that respect.

Olle seems not to have used my rewrite of the Earley parse engine to allow events, ruby slippers, etc., or the full details of my handling of nullables.  Since he only used the Leo modification, this may explain why cites Leo's work without mentioning Marpa -- and Leo's work certainly deserves to be much better known.

A final note re "epsilon productions" or nulling productions.  One of Marpa's innnovations, now years old, is to expand only the topmost production of a nulling derivation.  That is, when you encounter an instance of

  A ::= B C; B ::= D; C ::= E; D ::= eps; E::= eps; eps ::= /* empty */

you only get the nulling semantics of the rule A ::= B C -- all the rest is ignored.  Years of experience with Marpa now shows that this is acceptable in practice -- most users are unaware of the pruning.

If anyone both Haskell-fluent and Marpa-fluent has time, I'd be curious to hear their impressions.  Perhaps there are some ideas that are useful for Kollos.

--
You received this message because you are subscribed to the Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to marpa-parser...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages