Blog: Accurate reporting of mismatched delimiters

47 views
Skip to first unread message

Jeffrey Kegler

unread,
Nov 2, 2014, 7:12:42 PM11/2/14
to marpa-...@googlegroups.com
I've just put up a new blog post: -- it describes a new method for reporting mismatched delimiters.

The ability to smoothly switch between left-eidetic, syntax-driven parsing and custom procedural logic, as well as to communicate from one to the other, is quite powerful.  This method for diagnosing mismatched delimiters by repairing them on the fly is just one application.  I think that the reason that there is not more excitement, is that most people believe it's a claim that's simply too good to be true.

-- jeffrey


Ron Savage

unread,
Nov 2, 2014, 7:47:58 PM11/2/14
to marpa-...@googlegroups.com
Marpa is so amazing that just about any claim it makes can be hard to believe until you've tried it yourself.

Sour grapes is different. That's when superficial responses are all of the form: Yes, but it doesn't have an interface in language $X.

Jeffrey Kegler

unread,
Nov 2, 2014, 7:59:55 PM11/2/14
to marpa-...@googlegroups.com
To be fair, a lot of folks are using something other than Perl these days, and if they don't happen to know C, they are stuck waiting for somebody else to do the interface.  And, of course, until Kollos is ready, the interface won't include all the cool stuff in the SLIF layer, unless they rewrite it from scratch.

You and I are of the generation where we had to learn C, whether we liked it or not, in order to get anything done.  These days a lot of very competent programmers are not proficient in C.

-- jeffrey


On 11/02/2014 04:47 PM, Ron Savage wrote:
Marpa is so amazing that just about any claim it makes can be hard to believe until you've tried it yourself.

Sour grapes is different. That's when superficial responses are all of the form: Yes, but it doesn't have an interface in language $X.

--
You received this message because you are subscribed to the Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to marpa-parser...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ron Savage

unread,
Nov 2, 2014, 9:16:34 PM11/2/14
to marpa-...@googlegroups.com
Too true.

Aristotle Pagaltzis

unread,
Nov 3, 2014, 3:20:34 AM11/3/14
to marpa-...@googlegroups.com
* Jeffrey Kegler <jeffre...@jeffreykegler.com> [2014-11-03 01:15]:
> I think that the reason that there is not more excitement, is that
> most people believe it's a claim that's simply too good to be true.

It seems more likely to me that given how universal and long-standing
the status quo is, its failings have become so ingrained as to make it
hard to even imagine what it would be like to be using something better,
in practical terms.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>

Jeffrey Kegler

unread,
Nov 3, 2014, 12:19:21 PM11/3/14
to marpa-...@googlegroups.com
I think you may be right. Perhaps one way to break the cycle is to
point out that, if you're looking to work on a project that will be
remembered for a long time to come, something related to Marpa and/or
Kollos is as good a bet as anything out there.

-- jeffrey

Ron Savage

unread,
Nov 3, 2014, 3:54:19 PM11/3/14
to marpa-...@googlegroups.com
That's close to another possible explanation, what I would call Fatigue with the New, i.e. the dread of the effort to learn something new.

Of course it could just be insecurity which stops people taking a chance on new technology, given there are so many things we could be trying to adopt. Just check any job add for a list of technology-driven buzzwords.

John Alvord

unread,
Nov 3, 2014, 10:41:56 PM11/3/14
to marpa-...@googlegroups.com

Just a thought...

A web page to prove the point.

John Alvord

On Nov 3, 2014 12:54 PM, "Ron Savage" <r...@savage.net.au> wrote:
That's close to another possible explanation, what I would call Fatigue with the New, i.e. the dread of the effort to learn something new.

Of course it could just be insecurity which stops people taking a chance on new technology, given there are so many things we could be trying to adopt. Just check any job add for a list of technology-driven buzzwords.

Ron Savage

unread,
Nov 4, 2014, 2:08:00 AM11/4/14
to marpa-...@googlegroups.com
Did you have a specific web page in mind?

skva...@gmail.com

unread,
Nov 19, 2014, 1:46:28 PM11/19/14
to marpa-...@googlegroups.com
Isn't left-eideticism the same thing as immediate error detection property (as defined by Jacobs&Grune in "Parsing Technique. A Practical Guide", 2nd edition, pages 248, 299, 524, 525, 625, 626)?

Jeffrey Kegler

unread,
Nov 19, 2014, 3:04:14 PM11/19/14
to Marpa Parser Mailing LIst
The immediate error detection property does not even approach the scope of left-eideticism.  The error detection property is only about detecting what the parser thinks is the error, which for a deterninistic parser is often not the actually error.

And left-eideticism is not limited to errors, real or supposed.  Marpa knows *every* rule and symbol recognized or predicted to the left of the current location.  For example, you could, when parsing HTML, stop Marpa and ask it for all the tables nested more than 3 level deep, either complete or in progress.  More often useful is the Ruby Slippers technique, where the parser knows exactly what symbols are needed to continue.  As another example, the technique in this blog post relied on Marpa's ability to tell the application the beginning and end of any rule recognized so far.

To get an idea of the information available, you can look at the POD for the progress reports.  This show the full detail available, which can be helpful for debugging a grammar.  Applications will usually use information in a digested form.  If you've been working within the framework of deterministic parsers, the amount of information available from the Earley tables can be surprising.

Compared to left-eideticism, the immediate error detection property is parsing while blind-folded.

On Wed, Nov 19, 2014 at 10:46 AM, <skva...@gmail.com> wrote:
Isn't left-eideticism the same thing as immediate error detection property (as defined by Jacobs&Grune in "Parsing Technique. A Practical Guide", 2nd edition, pages 248, 299, 524, 525, 625, 626)?

skva...@gmail.com

unread,
Nov 20, 2014, 7:05:08 PM11/20/14
to marpa-...@googlegroups.com
Sorry for a late reply. I had to read your article (the POD for the progress reports, a very concise example) and revisit Earley algorithm. It seems that Earley parsers benefit from what at first sight seems a disadvantage (compared to deterministic context-free parsers): they have to construct Earley items on the fly, so they know exact positions in input string.

I noticed some minor typos in your article:

1. in section "Progress report lines": text 'Here the "R12:2" indicates' should be 'Here the "R11:2" indicates' (and later on)
2. in section "Ok! Now to find the bug": rule 19 in example should actually be rule 18
3. in section "Multiple instances of dotted rules": x13 should be x12
4. in section "progress()": N+X+1 should be N-X+1


Jeffrey Kegler

unread,
Nov 20, 2014, 7:49:45 PM11/20/14
to Marpa Parser Mailing LIst
Thanks for the doc fixes, which are now fixed in the working version.

Your comment on Earley's may clear up what has been kind of a mystery to me.  Much of Marpa is taken from others, but one new feature of the parse engine is definitely mine -- Marpa''s Earley items are created exactly in sync ("on the fly", as you put it) with their occurrence in the parse.  (Previous Earley's implementations has muddled this a bit by working on two Earley sets at once.)  As you can see, it's clearly a useful feature, and there's no efficiency hit.

The mystery for me had been why had nobody else done this before?  Many theoretical descriptions of Earley's are agnostic about the exact order of operations, so they don't exclude this way of doing things, but clearly nobody before me had seen it as an important property of the parse engine.

And the answer may have been that the mindset among parsing theorists that "eager" or "on the fly" or "in sync" recording of what's going on is a disadvantage.  To my mind it was the difference between being accurate or not.  In this connection, it is worth noting that Joop Leo's modification (that one that makes Earley fast for all deterministic grammars) in Joop's 1991 paper has a "lazy" implementation, one that I changed to an "eager" one.

skva...@gmail.com

unread,
Nov 21, 2014, 5:10:46 PM11/21/14
to marpa-...@googlegroups.com
Hmm, when I said "on the fly" I meant "when consuming next input character". And (to the extent of my knowledge) this is exactly how Earley algorithm works: consume next input character, compute Earley item sets (active, predicted, completed), repeat. There're many variations (prediction/reduction lookahead, Aycock & Horspool trick for empty rules, Leo's memoization, LR(0) items), but they don't break general order of things.

In Earley's algorithm most of the work must be done at runtime, as opposed to e.g. LR-family where most of the work can be done ahead of time, and thus no information about input string is encoded in the parser.

What are the other implementations you mentioned?

By the way, do you use lookahead in Marpa?

Jeffrey Kegler

unread,
Nov 21, 2014, 6:06:04 PM11/21/14
to Marpa Parser Mailing LIst
The pseudo-code on the Wikipedia page is a good example.  Note that SCANNER is adding Earley items to one set, while PREDICTOR and COMPLETER are still working on the other.  Every pre-Marpa description I know of is either like this, or agnostic about the order.  It makes a difference.  In the traditional order, you usually start each Earley set before the previous one is complete, which would make things like the Ruby Slippers impossible.

Marpa does *not* as a rule use lookahead, although recent hacks would allow applications to add it.  I follow some suggestions in the literature that lookahead is really about getting around the limitations of deterministic parsing, and not in general helpful for Earley parsing.

-- jeffrey

--

Ron Savage

unread,
Nov 22, 2014, 2:15:16 AM11/22/14
to marpa-...@googlegroups.com, skva...@gmail.com
In addition to Jeffrey's answer, note that by using events, and pause => before (or after) on lexemes, you can stop the parser temporarily at any lexeme.

Then, you can ask the parser where exactly it is in the input stream.

From there is easy to look ahead 1 or more characters yourself (I do it), but it requires you (me) to write the code. And this is without the recent patches Jeffrey mentioned.

So, Marpa gives you all the pieces you need, but it's up to us to combine them for each program.

skva...@gmail.com

unread,
Nov 22, 2014, 6:09:53 AM11/22/14
to marpa-...@googlegroups.com, skva...@gmail.com
Yes, the ability of lexer to use feedback from parser (stop parsing, change lexeme stream, resume parsing) is powerful, though not specific to Earley or Marpa algorithm. It can be embedded in other parser generators as well (I used it for my LALR(1) parser generator to parse context-sensitive parts of Javascript).

And from my observation this technique only requires parser to stop (in case of an error or some other event), save it's inner state and  transfer control to lexer. When lexer is done, parser is resumed, restores its state and continues. Is Marpa's technique somewhat trickier than that?

Anyway, it's great that Marpa does it out of box.

Ron Savage

unread,
Nov 22, 2014, 4:42:54 PM11/22/14
to marpa-...@googlegroups.com, skva...@gmail.com
"Marpa does it out of box". Exactly.

Aristotle Pagaltzis

unread,
Nov 22, 2014, 8:51:19 PM11/22/14
to marpa-...@googlegroups.com, skva...@gmail.com
* skva...@gmail.com <skva...@gmail.com> [2014-11-22 12:10]:
> And from my observation this technique only requires parser to stop
> (in case of an error or some other event), save it's inner state and
> transfer control to lexer. When lexer is done, parser is resumed,
> restores its state and continues. Is Marpa's technique somewhat
> trickier than that?

No. The difference is in the usefulness of the parser state that was
saved before pausing. Particularly when you pause the parser right after
the parse has failed: Marpa knows exactly what would have made the parse
succeed and you can ask it – then give it what it needed and resume.

You might be able to achieve even this with traditional parsers as well…
I think. But not by asking the parser! Instead the custom code would
have to know exactly which conditions might cause a particular parse
failure. I’m unsure right now if that’s even feasible. But even if it
is, the custom code is then tightly coupled to the grammar i.e. it will
have to be checked and sometimes updated to match the grammar whenever
the grammar changes.

With Marpa, this approach is not merely hypothetically possible, it is
practical and trivial – with custom code that is largely decoupled from
the grammar even. Because Marpa actually sees what is happening – how
the input is interacting with the grammar.

Traditional parsers are essentially blind during the parse: they grope
along, committing to the greedily most likely parse at a given point
(when they decide to commit), without any real idea why they are there,
hoping that it all works out in the end. For valid input, this does
eventually work out OK, but if the input isn’t valid, the parser ends
up stranded in the middle of nowhere with no idea what happened.

Marpa knows exactly what happened.

Effectively it can extract the knowledge about possible failures that
is implicit in the grammar and expose it to introspection, whereas in
traditional parsers the user code surrounding the parser would have to
have that knowledge encoded explicitly, redundantly outside the grammar.
Hence the coupling to the grammar.

This also changes how you write grammars. You can write a grammar that
has barely any valid inputs, then introspect its failure conditions to
make wide classes of nominally invalid input parse successfully.

It’s not that known techniques like pausing, lookahead, etc are somehow
trickier with Marpa than with traditional parsers. (Quite the opposite!)
It’s that ongoing insight into the parse makes all these same old useful
techniques far more effective than they could otherwise be.

Jeffrey Kegler

unread,
Nov 23, 2014, 8:22:50 AM11/23/14
to Marpa Parser Mailing LIst
@Aristotle: Very nicely put.  Thanks!

Reply all
Reply to author
Forward
0 new messages