libmarpa API questions

Andreas Kupries

May 27, 2016, 3:21:51 AM
to marpa parser

For background, I am currently writing tests for my first steps at a
Tcl binding to libmarpa. I am actually far enough along that the code is
able to read a SLIF specification, incidentally that of SLIF itself.

However with the tests I am running the lexer into a wall and I am
quite unsure what I am doing wrong.

Symbols are 0-7, except 2.
Start symbol is 7.
Rules are

7 -> 5 3
7 -> 6 4
3 -> 0
4 -> 1

With a newly created recognizer the session/sequence of operations is

start-input
expected-terminals -> 5 6 /ok
alternative 5
alternative 6
earleme-complete /ok
expected-terminals -> 0 1 /ok
alternative 0 /ok
earleme-complete /err: Parse exhausted

The exhaustion could be ok, as the machine has no way forward after
this point.

latest-earley-set -> 1

That seems to be ok as well, assuming we count from 0.

Trying to create the tree (actually the bocage) at that earley-set I
get an error back: 'No parse'.

Yet I believe that the recognizer should have a parse tree / bocage at
earley-set 1, of the form

    7
   / \
  5   3
       \
        0

So now I am wondering where my reasoning is wrong.

My best speculation at the moment is that the second earleme-complete
should not throw an error, and that the latest-earley-set should be 2.
I.e. that earley-set 0 maps to the empty string, set 1 is for the
first symbol set {5,6}, and set 2 should have been made for the
singleton set {0}.
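
The set numbering in this speculation can be checked against a textbook
Earley recognizer, independent of libmarpa. What follows is a minimal
sketch in Python under that assumption: it is not libmarpa, it does not
model libmarpa's error or event reporting, and the names run,
expected_terminals and has_parse are made up for illustration. It feeds
the same alternatives as the session above.

```python
# Toy Earley recognizer for the grammar in this post. NOT libmarpa.
# An item is (rule index, dot position, origin set).
RULES = [(7, (5, 3)), (7, (6, 4)), (3, (0,)), (4, (1,))]
START = 7


def close_set(sets, current):
    """Close Earley set `current` under prediction and completion.

    Simplification: the grammar has no empty rules, so a completed
    item never has its origin in the set currently being built.
    """
    items = sets[current]
    work = list(items)
    while work:
        rule, dot, origin = work.pop()
        lhs, rhs = RULES[rule]
        if dot < len(rhs):
            # Prediction: start every rule whose LHS follows the dot.
            new = [(r, 0, current)
                   for r, (l, _) in enumerate(RULES) if l == rhs[dot]]
        else:
            # Completion: advance items in the origin set waiting for lhs.
            new = [(r, d + 1, o) for (r, d, o) in list(sets[origin])
                   if d < len(RULES[r][1]) and RULES[r][1][d] == lhs]
        for item in new:
            if item not in items:
                items.add(item)
                work.append(item)


def run(token_sets):
    """Build one Earley set per earleme; each earleme may carry alternatives."""
    sets = [{(r, 0, 0) for r, (lhs, _) in enumerate(RULES) if lhs == START}]
    close_set(sets, 0)
    for alternatives in token_sets:
        # Scanning: every alternative advances the items that expect it.
        scanned = {(r, d + 1, o) for (r, d, o) in sets[-1]
                   if d < len(RULES[r][1]) and RULES[r][1][d] in alternatives}
        sets.append(scanned)
        close_set(sets, len(sets) - 1)
    return sets


def expected_terminals(sets, current):
    """Symbols after a dot that never appear as a rule's LHS."""
    lhs_symbols = {lhs for lhs, _ in RULES}
    return sorted({RULES[r][1][d] for (r, d, _) in sets[current]
                   if d < len(RULES[r][1])
                   and RULES[r][1][d] not in lhs_symbols})


def has_parse(sets):
    """True if a start rule is completed in the latest set with origin 0."""
    return any(RULES[r][0] == START and d == len(RULES[r][1]) and o == 0
               for (r, d, o) in sets[-1])


sets = run([{5, 6}, {0}])
for i in range(len(sets)):
    print(i, expected_terminals(sets, i))
print("parse at latest set:", has_parse(sets))
```

In this model the input {5,6} followed by {0} yields three Earley sets
(0, 1 and 2). The expected terminals are 5 6, then 0 1, then none, and a
completed start rule spanning the whole input sits in set 2. Whether
libmarpa numbers its sets the same way is exactly what still needs to be
confirmed.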

I should say again that the lexer code works fine in a larger context,
a full parser for SLIF specifications.

A difference there seems to be that in that context 'earleme-complete'
never throws a 'Parse exhausted' error, but a following explicit query
about exhaustion (marpa_r_is_exhausted) can return true.

This might fit with my speculation.

It does not explain the difference in behaviour however :(


Time to sleep, maybe I can see it better tomorrow. Otherwise I will work
through the methods to find more introspective calls I could make to
understand the engine's state better.

--
Sincerely,
Andreas Kupries <akup...@shaw.ca>
<http://core.tcl.tk/akupries/>

Tcl'2016, Nov 14-18, Houston, TX, USA. http://www.tcl.tk/community/tcl2016/
-------------------------------------------------------------------------------




Jeffrey Kegler

May 27, 2016, 8:57:43 PM
to Marpa Parser Mailing LIst
Stabbing in the dark a bit:

1.)  Exhaustion is not related to success or failure in any simple & direct way, which sometimes confuses people focusing on a particular subset of languages.  For a balanced parenthesis language, the parse is successful at X if and only if it is exhausted at X.  For typical programming languages, however, exhaustion always means failure.  Of course, by definition, an attempt to *continue* an exhausted parse always fails, regardless of the grammar.  There's more here: <http://search.cpan.org/~jkegl/Marpa-R2-3.000000>.

2.) As a second suggestion, ordinarily I'd suggest using Marpa's progress reports, but you're writing a new interface, so you don't have them unless you write them yourself.  I really suggest you do that ASAP.  Your users are certainly going to need them anyway.  They'll be a great help in your debugging.  And by the time you've debugged your parser, you'll have written and debugged your progress reports. :-)  And they'll be a great help in visualizing what's happening in the parse.  Writing the progress reports, not last, but very early in development, is how I did it.
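
To make the progress-report suggestion concrete, here is a minimal sketch in Python of what such a report could look like for the grammar in the question. Everything in it is an assumption for illustration: the (rule, dot, origin) tuple shape, the function names dotted_rule and progress_report, and the P/R/F labels (predicted, in progress, finished), which only loosely imitate the text reports of Marpa::R2. A real binding would fill the items from the recognizer's progress interface instead of a hand-written list.

```python
# Hypothetical progress-report formatter. NOT part of libmarpa.
# Rules are keyed by rule ID: (LHS symbol, RHS symbols).
RULES = {0: (7, (5, 3)), 1: (7, (6, 4)), 2: (3, (0,)), 3: (4, (1,))}


def dotted_rule(rule_id, dot):
    """Render a rule with a '.' marking how far recognition has got."""
    lhs, rhs = RULES[rule_id]
    symbols = [str(s) for s in rhs]
    symbols.insert(dot, ".")
    return f"{lhs} -> {' '.join(symbols)}"


def progress_report(items, earley_set):
    """Format (rule_id, dot, origin) items for one Earley set."""
    lines = []
    for rule_id, dot, origin in sorted(items):
        _, rhs = RULES[rule_id]
        if dot == 0:
            kind = "P"   # predicted: nothing recognized yet
        elif dot == len(rhs):
            kind = "F"   # finished: the whole RHS was recognized
        else:
            kind = "R"   # in progress: part of the RHS was recognized
        lines.append(
            f"{kind}{rule_id} @{origin}-{earley_set} "
            f"{dotted_rule(rule_id, dot)}"
        )
    return "\n".join(lines)


# Hand-written items for the set after the first earleme (illustrative).
print(progress_report([(0, 1, 0), (1, 1, 0), (2, 0, 1), (3, 0, 1)], 1))
```

Each line shows the kind of item, the rule ID, the span from origin set to current set, and the dotted rule, which is usually enough to see at a glance which rules are still alive at a given location.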

Hope this helps!

Andreas Kupries

May 27, 2016, 10:04:17 PM
to marpa-...@googlegroups.com

> Stabbing in the dark a bit:

> 1.) Exhaustion is not related to success or failure in any simple &
> direct way, which sometimes confuses people focusing on a
> particular subset of languages. For a balanced parenthesis
> language, the parse is successful at X, if and only if it is
> exhausted at X. For typical programming languages, however,
> exhaustion always means failure. Of course, by definition, an
> attempt to *continue* an exhausted parse always fails,
> regardless of the grammar.

True. In this case I _seem_ to get the 'exhausted' one step too early, as I
essentially complete the parse, and I do not try to continue at that point
either. I guess I will see what the reports will tell me.

> There's more here
> <http://search.cpan.org/~jkegl/Marpa-R2-3.000000>.

Thanks.

> 2.) As a second suggestion, ordinarily I'd suggest using Marpa's
> progress reports, but you're writing a new interface, so you
> don't have them unless you write them yourself. I really
> suggest you do that ASAP. Your users are certainly going to
> need them anyway. They'll be a great help in your debugging. And
> by the time you've debugged your parser, you'll have written
> and debugged your progress reports. :-) And they'll be a great
> help in visualizing what's happening in the parse. Writing the
> progress reports, not last, but very early in development, is
> how I did it.

That makes sense. Checking my code I see that I did the low-level
wrapping of the progress interface. I just never made higher-level
accessors for it. That definitely goes at the top of the TODO list now.

> Hope this helps!

Thank you, and at least item (2) did. And hopefully having reports
will also help with (1).

--
So long,