Megaparsack and where errors happen

Matt Jadud

Feb 21, 2020, 9:49:25 AM

to Racket Users

Hi all,

This might be a Lexi question, but perhaps someone else will have some insight as well.

I'm wrestling with how to get errors to propagate down in megaparsack. For example:

(define convert/p
  (do (string/p "convert")
      ...
      [assigns ← (many/p #:min 0 assignment-statement/p)]
      ...
      (pure ...)))

(Assume I have a bunch of other parsing bits around the call to `assignment-statement/p`.)

Currently, if I have a malformed assignment statement, the error is reported at the top level of `convert/p`. `convert/p` is part of a backtracking conditional:

(define conversion/p
  (do [result ← (many/p (or/p (try/p base-type/p)
                              (try/p convert/p)
                              (try/p chain/p))
                        #:sep space0+/p)]
      eof/p
      (pure result)))

What should I do to get the error to report/fail down the parse tree, as opposed to the top? I would rather know that there's something wrong down in my assignment statement, as opposed to getting an error that "c" was unexpected (because the entire conversion/p failed on account of an error somewhere down inside).

I need to give the docs a more careful read, but I thought I'd ask, as it seems both simple and, given the nature of the parsing tools, possibly subtle.

Many thanks,

Matt

Alexis King

Feb 21, 2020, 11:54:16 PM

to Matt Jadud, Racket Users

Hi Matt,

I think you probably want to read this section of the docs: https://docs.racket-lang.org/megaparsack/parsing-branching.html#%28part._.Backtracking_with_caution%29

The core idea is that `try/p` is a heavy hammer. It causes any failure inside its scope to backtrack, so you might end up accidentally ruining your error messages. Usually, what you really want is to do a small amount of lookahead to determine which branch is the correct one to take, then commit to the branch so that future parse failures are reported immediately.

Without seeing your `base-type/p`, `convert/p`, and `chain/p` parsers, it’s hard to suggest a concrete solution. But consider using `try/p` to do only whatever parsing you need to do in order to disambiguate the parsers, then exit the scope of `try/p`. Something like this:

(or/p (do (try/p base-type-initial/p) base-type-remainder/p)
      (do (try/p convert-initial/p) convert-remainder/p)
      ...)
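
To make that pattern concrete, here is a minimal self-contained sketch. The grammar is invented for the example and is not your actual parser: `assignment/p`, the single-letter variable names, and the `int` base type are all assumptions. Only the keyword is wrapped in `try/p`, so once "convert" has matched, a malformed assignment fails inside `convert/p` rather than backtracking out of it. The `conversion-heavy/p` variant wraps the whole branch in `try/p` for comparison.

```racket
#lang racket
(require data/applicative   ; pure
         data/monad         ; do, ←
         megaparsack
         megaparsack/text)

;; Hypothetical grammar, for illustration only:
;;   conversion ::= "int" | "convert" { " " assignment }
;;   assignment ::= letter "=" integer

(define assignment/p
  (do [name ← letter/p]
      (char/p #\=)
      [value ← integer/p]
      (pure (cons name value))))

;; try/p covers only the keyword, which is enough to pick the branch.
(define base-type/p
  (do (try/p (string/p "int"))
      (pure 'int)))

;; After the keyword matches, we are outside try/p, so a failure in an
;; assignment that has already consumed input is not backtracked here.
(define convert/p
  (do (try/p (string/p "convert"))
      (many/p (do (char/p #\space) assignment/p))))

(define conversion/p
  (do [result ← (or/p base-type/p convert/p)]
      eof/p
      (pure result)))

;; For comparison: the whole branch is wrapped in try/p, so any failure
;; inside convert/p backtracks to the start of the alternative.
(define conversion-heavy/p
  (do [result ← (or/p base-type/p (try/p convert/p))]
      eof/p
      (pure result)))

;; Well-formed input parses; the malformed one fails in both versions.
(parse-string conversion/p "convert x=1 y=2")
(parse-string conversion/p "convert x=1 y=oops")
(parse-string conversion-heavy/p "convert x=1 y=oops")
```

Comparing the source locations in the last two results should show whether the failure is attributed to the bad assignment or to the start of the input.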

There are other approaches as well, and if you provide more information I might be able to suggest something better.

This complication is unfortunately fundamental to the Parsec parsing model. The syntax/parse model of tracking “progress” and reporting the error associated with the parse that made it the farthest is much nicer, but syntax/parse has the luxury of parsing a well-known tree structure. A more tractable improvement might be to add some kind of explicit committing construct.

Hope this helps,
Alexis


Matt Jadud

Feb 22, 2020, 8:13:52 AM

to Alexis King, Racket Users

Hi Alexis,

That helps immensely. It was difficult to come up with a "minimal working example" without dumping a large parser, so thank you for being willing to read between the... productions? Lines? *cough*

I have found the library a joy to work with, and as I got deeper into working with it, I suspected I was not using it as well as I might have. You are right that I did not realize what a "hammer" the backtracking was.

This definitely gets me unstuck, and I will see where it takes me.

Many thanks,

Matt
