Parser combinator rewrite status

luser droog

Dec 6, 2020, 4:48:36 AM

I started redesigning the Parser Combinators following the
paper by Hutton where he describes adding error reports by
rewriting everything to operate on a 3-state object. A parser
can result in an [OK ...] state or it can be [Error ...] or [Fail ...].
This way you can record some local state at the point that
an error is discovered and propagate that back down the
call stack wrapping more local knowledge at each step.
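
A minimal sketch of that idea, for illustration only. It is not the code in
pc11a.ps: every name below (ok, fail, char, charmatch, then, alt, thenrun,
altrun) is hypothetical, the input is a plain string rather than a
string-input object, and the report format is made up. A result is a
3-element array [tag value-or-report remainder]. Nothing in the sketch
produces /Error, but then and alt would pass such a result through
untouched, since alt only retries on /Fail.

```postscript
% Hypothetical sketch, not the code in pc11a.ps.
% A parser is a procedure:   string  parser  result
% A result is a 3-element array:  [ tag  value-or-report  remaining-input ]
% where tag is /OK, /Fail or /Error.

/ok   { /OK   3 1 roll 3 array astore } def   % value rest  -> [/OK value rest]
/fail { /Fail 3 1 roll 3 array astore } def   % report rest -> [/Fail report rest]

% string charcode  charmatch  result
/charmatch {
  2 dict begin
  /c exch def  /s exch def
  s length 0 gt { s 0 get c eq } { false } ifelse
  {
    [ s 0 1 getinterval ]             % value: the matched character, in an array
    s 1 s length 1 sub getinterval    % remaining input
    ok
  } {
    (unexpected input) s fail         % report plus the input where it happened
  } ifelse
  end
} def

% charcode  char  parser      (builds the procedure {charcode charmatch})
/char { [ exch /charmatch cvx ] cvx } def

% string p q  thenrun  result
/thenrun {
  5 dict begin
  /q exch def  /p exch def  /s exch def
  s p /r1 exch def
  r1 0 get /OK eq {
    r1 2 get q /r2 exch def           % run q on what p left over
    r2 0 get /OK eq {
      [ r1 1 get aload pop  r2 1 get aload pop ]  r2 2 get  ok
    } {
      % keep the tag, wrap the report with what had already matched
      [ r2 0 get  [ (after) r1 1 get r2 1 get ]  r2 2 get ]
    } ifelse
  } {
    r1                                % p did not succeed: pass it through
  } ifelse
  end
} def

% string p q  altrun  result
/altrun {
  3 dict begin
  /q exch def  /p exch def  /s exch def
  s p
  dup 0 get /Fail eq { pop s q } if   % only a soft Fail tries the alternative
  end
} def

% p q  then  parser        p q  alt  parser
/then { [ 3 1 roll /thenrun cvx ] cvx } def
/alt  { [ 3 1 roll /altrun  cvx ] cvx } def

/abc 97 char 98 char then  99 char then def    % a b c
/abd 97 char 98 char then 100 char then def    % a b d
/either /abd load /abc load alt def

(abcd) either ==    % [/OK [(a) (b) (c)] (d)]
(abed) either ==    % [/Fail [(after) [(a) (b)] (unexpected input)] (ed)]
```

Each combinator allocates a fresh dictionary per call for its locals, which
keeps nested parsers from clobbering each other's names at the cost of some
garbage. The report in the second case is only the last alternative's; a
fuller version would presumably merge the reports from both branches.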

So for the simple tests I've put it to, viz.

0 0 (abcd\ne) string-input
(abc) str exec
report
pc

0 0 (abed\ne) string-input
(abd) str
(abc) str alt exec
report
pc

0 0 (abed\ne) string-input
(a)(c) range
(a)(c) range then
exec
report
pq

That is two success tests and one failure test. I get the following
(promising) output:

$ gsnd -q -dNOSAFER pc11a.ps
OK
[(a) (b) (c)]
remainder:[[(d) [0 3]] {0 4 (\ne) string-input}]
stack:
:stack
Fail
[[(after) [(a) []]] [[(after) [(b) []]] [[{(c) eq} (not satisfied)] [[(e) [0 2]] {0 3 (d\ne) string-input}]]]]
stack:
:stack
OK
[(a) (b)]
remainder:{0 2 (ed\ne) string-input}
stack:
:stack

The "stack:...:stack" parts are just demonstrating that the stacks
are left nice and clean after each test. And the Failure report has
the pieces for a nice error message, but it's not quite there yet IMO.

I suppose the next logical step is to try to build a regex engine or
a tokenizer with it. To be continued...

Jeffrey H. Coffield

Dec 7, 2020, 12:58:21 PM

Not sure what you are trying to accomplish here, but some work was done
parsing PostScript using Antlr which has some interesting error
handling. I have just started using Antlr4 myself and at some point will
probably look at using it to parse PostScript forms.

Jeff Coffield

luser droog

Dec 12, 2020, 4:23:15 AM

On Monday, December 7, 2020 at 11:58:21 AM UTC-6, Jeffrey H. Coffield wrote:
> Not sure what you are trying to accomplish here, but some work was done
> parsing PostScript using Antlr which has some interesting error
> handling. I have just started using Antlr4 myself and at some point will
> probably look at using it to parse PostScript forms.
>
> Jeff Coffield

I want to write a parser *in* PostScript, not just *for* PostScript. And I think
the parser combinators make for a nice way to do it, despite the difficulty in
translating them to a non-lazy language. But I did a whole bunch of work
before realizing that the lack of error messages made the whole thing
rather unusable. It's quite possible I'm overlooking a simpler way to
go about it. It's also possible that PostScript isn't the right tool for
the job.

But in the interim, I've had some success with implementing lazy evaluation
and translating the result to working (complicated) C. So if I can get a
prototype that produces nice error messages easily from some kind of
annotated grammar, I can translate that to C and have a nice interface
for writing parsers in C.
