On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
> luserdroog <mij...@yahoo.com> writes:
>
> > Is there a systematic way to discard the extra noise that can occur
> > when using parser combinators? For example, the `many` combinator
> > which matches zero or more instances of its argument parser.
> > In the case of zero matches, it still needs to return a value.
> >
> > In my situation, I'm simulating everything in PostScript because it's
> > my favorite language. I'm simulating Lisp cons cells as 2-element
> > arrays. So for this JSON string,
> >
> > ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
> >
> > if I make no special effort, I get a resulting value that looks like this:
> >
> > OK
> > [[3 [[4 [[5 [[] []]] []]] []]] []]
> > remainder:[]
> >
> > All those little empty arrays need to just go away, but not any of the
> > important array structure.
> So you want
>
> [[3 [[4 [[5 []]]]]]]
>
> ?
I guess that's the big problem here. I'm not sure what I want. I keep having
to add extra code to clean up and delete the extra stuff. Ultimately the
result should be
[ 3 [ 4 [ 5 ] ] ]
The parser for arrays looks for the left bracket, then ...
    /Jarray //begin-array
        //value executeonly xthen
        //value-separator //value executeonly xthen many then %{ps flatten ps} using
        maybe
        //end-array thenx
        { %filter-zeros first %ps
        } using def
The `executeonly` calls are in there to prevent infinite recursion if the expanded
code ever gets printed (like in a stack dump while debugging). The /Jarray
parser is one of the components of the //value parser.
Hmm. Initially I had the `then` combinator doing a Lisp-style (append)
operation on my simulated lists, so something like
(a) char (b) char then
would -- if matched by the input -- return
[ (a) [ (b) [] ] ]
which I could then easily massage into
[ (a) (b) ]
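A minimal sketch of that massaging step, assuming every cell is a 2-element array and the list ends in an empty array. The name `list-to-array` is made up for illustration and is not taken from the actual library:

```postscript
% Hypothetical helper: [a [b [c []]]]  ->  [a b c]
% Assumes each cell is [car cdr] and the list is terminated by [].
/list-to-array {                 % list  ->  array
    [ exch                       % mark list
    {                            % mark e1 .. en rest
        dup length 0 eq { pop exit } if   % reached [] : drop it and stop
        aload pop                % mark e1 .. en car cdr
    } loop
    ]                            % collect everything above the mark
} def

[ (a) [ (b) [] ] ] list-to-array ==    % Ghostscript shows [(a) (b)]
```

Only the cdr of each cell is inspected, so a car that is itself an array is carried over untouched.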
But that led me into problems when I wanted to use the combinators
`xthen` and `thenx` which discard one of the two pieces. If the results
are just appended together in a list, then I've lost the information to
peel them back apart. So I changed `then` to just (cons) the pieces
together, and now `xthen` and `thenx` have an easy job.
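Looking only at the result values (the parsers themselves are left out), the easy job can be sketched as below. This assumes the `x` marks the side being dropped, and `first` and `second` are illustrative definitions that may differ from the ones in the real code:

```postscript
% With `then` consing its two results, the pair keeps both pieces separate.
/first  { 0 get } def            % [x y] -> x
/second { 1 get } def            % [x y] -> y

% Assumed shape of a `then` result: [ left-result right-result ]
[ ([) [ 3 [] ] ]  second ==      % what xthen would keep: the right piece
[ [ 3 [] ] (]) ]  first  ==      % what thenx would keep: the left piece
```

With an appended list instead of a pair there would be no fixed index at which the left result ends and the right one begins.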
And extra noise values pop up if I use combinators like `maybe`
and `many` which might succeed with zero matches. So, ...
> > `many` and `maybe` seem to be the chief
> > culprits, but then their results are propagated back by `alt`s and
> > `then`s all the way back to the top.
> >
> > Do I need to make some kind of out-of-band signal for these "zeros"
> > that I can filter out later? The obvious problem here is that the array
> > type is being used for too many things. But there's a paucity of
> > types in PostScript, sigh. For the JSON application, I have nametype
> > objects available that don't have a JSON counterpart.
> >
> > Do I need to rewrite all the combinators to filter out noise values at
> > every turn?
> It's odd to call something that you are returning (presumably) as
> noise. Are you using lists as a sort of Maybe monad with [] as Nothing?
>
Yes, I think that's what I'm doing, clumsily.
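One way the out-of-band idea might look, as a rough sketch only: a name object stands in for Nothing so that [] stays free to mean a real empty list. The marker `/nothing` and the procedures `is-zero` and `strip-nothing` are all hypothetical names invented for this example:

```postscript
% Assumption: /nothing is a made-up marker, not part of the original code.
/is-zero {                       % any -> bool
    dup type /nametype eq
    { /nothing eq }              % a name: is it the marker?
    { pop false }                % other types never count, not even the string (nothing)
    ifelse
} def

/strip-nothing {                 % array -> new array, markers removed at every depth
    [ exch
    {
        dup is-zero
        { pop }                                           % drop the marker
        { dup type /arraytype eq { strip-nothing } if }   % recurse into nested arrays
        ifelse
    } forall
    ]
} def

[ 3 /nothing [ 4 /nothing [ 5 ] ] ] strip-nothing ==   % should show [3 [4 [5]]]
```

The type test comes first because `eq` treats a string and a name with the same characters as equal, which would otherwise let a JSON string collide with the marker.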