Obtaining value and continuing the parse

Thomas Weigert

Mar 13, 2015, 2:26:40 AM
to marpa-...@googlegroups.com
I am struggling with the following: In my application, I have to perform many parses on the same string. It really drags performance down if I recreate the recognizer on a new string every time.

To solve this problem, I am trying to use the same recognizer and string to perform these parses in a single parse series. I set up the grammar so that an event is triggered once I have recognized enough to constitute a "parse". At this point I handle what I got and then resume parsing at an appropriate point.

The parsing process works fine using this scheme. Instead of performing a new parse on the same string, I perform these parse sections on the same string in the same series.

But here is the rub: I need to get at the syntax tree that is recognized at the intermediate stages. But after I call "value" in the recognizer the first time, I run into trouble: While the parser still is set up and can continue perfectly fine, any subsequent calls to "value" return undef.

This is frustrating, because I can see in the lexeme tracing that the parser continues doing the expected thing, recognizing the fragments I throw at it. But I can no longer obtain the value constructed.

I realize that this does not quite follow the rules, since the document on "Phases" states that the evaluation phase has to come after the reading phase (it even states strongly that if I call "resume" in the evaluation phase the behavior is undefined, which is what I am doing here). However, Marpa is clearly set up to not require this, because the parser can continue after the evaluation phase back in the reading phase. But somehow, I cannot enter any further evaluation phases.

I was thinking that maybe starting another parse series might work. So I do a "series_restart" before I call "resume", but now the evaluation phase gets confused about the packages it is in. It is not clear to me whether this is supposed to be supported because the documentation is confusing. It states that "The series_restart() method must be called before value() when ambiguous() detects an ambiguous parse and the application needs to get the parse values." As I am not dealing with an ambiguous parse, this may not apply but maybe it is meant to be broader.

Does anybody know if this situation is meant to be supported?

Thanks, Th.


P.S. Marpa provides these amazing possibilities, so maybe I am getting greedy with what I am trying to accomplish. But parsing has never been more fun than since I learned about Marpa.

Ruslan Shvedov

Mar 13, 2015, 3:28:05 AM
to marpa-...@googlegroups.com
AFAIK, once you have called value(), the recognizer enters the evaluation phase and cannot be put back into the reading phase. value() returns undef to mark the end of the parse series, because you have already retrieved the parse value (there is only one because there is no ambiguity). After calling series_restart(), you can call value() again to re-retrieve the parse.

However, semantics can be done with events: create completion events for the rules and set up semantic actions in the reading loop to build the parse value. Ron Savage did something like that [1]. Another (simpler) example of a reading loop with events from which you can start is [2].
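
A minimal sketch of such an event-driven reading loop, for illustration only. The toy grammar, the event name 'statement_done', and the @fragments array are assumptions made up for this example, not taken from the examples referenced above. It only records the text of each completed statement; a real application would build whatever value it needs at that point.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical toy grammar: a sequence of "name=number;" statements.
# The completion event fires each time a full statement has been recognized.
my $dsl = <<'END_DSL';
:start ::= statements
statements ::= statement+
statement ::= name '=' number ';'
event 'statement_done' = completed statement
name ~ [a-z]+
number ~ [\d]+
END_DSL

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $recce   = Marpa::R2::Scanless::R->new( { grammar => $grammar } );

my $input  = 'a=1;b=2;';
my $length = length $input;
my @fragments;    # illustrative stand-in for "the parse value so far"

# read() returns as soon as an event pauses it; resume() continues from there.
my $pos = $recce->read( \$input );
READ: while (1) {
    for my $event ( @{ $recce->events() } ) {
        my ($name) = @{$event};
        next if $name ne 'statement_done';

        # Span (in G1 locations) of the most recently completed statement.
        my ( $start, $span ) = $recce->last_completed('statement');
        push @fragments, $recce->substring( $start, $span );
    }
    last READ if $pos >= $length;
    $pos = $recce->resume();
}

print "$_\n" for @fragments;
```

Because value() is never called here, the recognizer stays in the reading phase for the whole input.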

BTW, is it exactly the recognizer construction that is the bottleneck? IMO, once the grammar is precomputed, the recognizer can be created from it rather quickly. I used to create a new recognizer for each of 100+ inputs in a test loop and had no problem.
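
As a rough sketch of that approach, assuming a made-up toy grammar and input list: the grammar object is built once, and only the recognizer is created anew for each input.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical toy grammar; each rule's value is an array of its children.
my $dsl = <<'END_DSL';
:default ::= action => [values]
:start ::= pair
pair ::= name '=' number
name ~ [a-z]+
number ~ [\d]+
END_DSL

# The grammar is precomputed once, outside the loop.
my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );

my @inputs = ( 'a=1', 'b=22', 'c=333' );    # illustrative inputs
my @results;
for my $input (@inputs) {

    # A fresh recognizer per input, sharing the same grammar object.
    my $recce = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
    $recce->read( \$input );
    my $value_ref = $recce->value();
    die "No parse for '$input'" if not defined $value_ref;
    push @results, ${$value_ref};
}
```

Whether this is fast enough depends on the grammar and the inputs, so it is worth timing before ruling it out.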

Hope this helps.

Thomas Weigert

Mar 13, 2015, 11:05:10 AM
to marpa-...@googlegroups.com
Thank you Ruslan.

You mention that I can call value() again after series_restart().

I did try this, but the following was the result. Is there something else one should do? It is as if the semantics package got lost.

Could not resolve rule action named 'simplifyPP'
  Rule was ppIfSection ::= ppIfGroup ppElifGroupsOpt ppElseGroupOpt '#endif'
  Could not fully qualify "simplifyPP": no resolve package
Marpa::R2 exception at parse.pl line 1410.
 at /usr/local/lib/perl/5.18.2/Marpa/R2.pm line 161.
    Marpa::R2::exception('Could not resolve rule action named \'simplifyPP\'\x{a}', '  Rule was ppIfSection ::= ppIfGroup ppElifGroupsOpt ppElseGr...', '  ', 'Marpa::R2::Internal::X=HASH(0x17a1c30)') called at /usr/local/lib/perl/5.18.2/Marpa/R2/Value.pm line 473
    Marpa::R2::Internal::Value::resolve_rule_by_id('Marpa::R2::Recognizer=ARRAY(0x33bd820)', 6) called at /usr/local/lib/perl/5.18.2/Marpa/R2/Value.pm line 589
    Marpa::R2::Internal::Value::resolve_recce('Marpa::R2::Recognizer=ARRAY(0x33bd820)', undef) called at /usr/local/lib/perl/5.18.2/Marpa/R2/Value.pm line 704
    Marpa::R2::Internal::Value::registration_init('Marpa::R2::Recognizer=ARRAY(0x33bd820)', undef) called at /usr/local/lib/perl/5.18.2/Marpa/R2/Value.pm line 1514
    Marpa::R2::Recognizer::value('Marpa::R2::Recognizer=ARRAY(0x33bd820)', 'Marpa::R2::Scanless::R=ARRAY(0x33ba358)', undef) called at /usr/local/lib/perl/5.18.2/Marpa/R2/SLR.pm line 1560
    Marpa::R2::Scanless::R::value('Marpa::R2::Scanless::R=ARRAY(0x33ba358)') called at parse.pl line 1410

Thomas Weigert

Mar 13, 2015, 11:08:55 AM
to marpa-...@googlegroups.com
I worked it out.

When you call series_restart(), you must pass it the semantics_package as an argument; otherwise that setting does not carry over into the new parse series and something else is used.
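
A minimal sketch of that fix, assuming a toy grammar and a hypothetical semantics package named My_Actions (neither is from the actual application):

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical toy grammar with one named rule action.
my $dsl = <<'END_DSL';
:default ::= action => ::first
:start ::= sum
sum ::= number '+' number action => add
number ~ [\d]+
END_DSL

# Hypothetical semantics package holding the action named in the grammar.
sub My_Actions::add {
    my ( undef, $left, undef, $right ) = @_;
    return $left + $right;
}

my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $recce   = Marpa::R2::Scanless::R->new(
    { grammar => $grammar, semantics_package => 'My_Actions' } );

my $input = '1+2';
$recce->read( \$input );

my $first  = $recce->value();    # reference to the parse value
my $second = $recce->value();    # undef: no more parses in this series

# Start a new parse series. The semantics package is named again here,
# because it is not carried over from the previous series.
$recce->series_restart( { semantics_package => 'My_Actions' } );
my $again = $recce->value();     # the parse value is available again
```

Leaving the semantics_package argument out of series_restart() is what produced the "Could not resolve rule action" error in my previous message.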

Now it works as I had hoped. Maybe this would be good to document in this pod:

https://metacpan.org/pod/distribution/Marpa-R2/pod/Scanless/R.pod#series_restart

Th.

Jeffrey Kegler

Mar 13, 2015, 5:01:45 PM
to Marpa Parser Mailing List
Glad things worked out! Re the doc fix: I think it's there already: "If any other recognizer setting is not specified explicitly, it is reset to its default. If an application wants an explicit recognizer setting to persist into a new parse series, it must specify that setting explicitly in the new parse series."
