Are there good strategies for diagnosing performance issues?


mdo...@nvidia.com

unread,
Nov 20, 2017, 4:52:13 PM
to marpa parser

Hey, all,

 

I’ve been hoping for some time to replace an existing parser---implemented using YAPP and a custom lexer---with a parser implemented using Marpa::R2.

 

After a moderate amount of work, I finally got a working parser.

 

The grammar is far easier to understand and work with than the old YAPP parser; because we’re dealing with an indentation-based format, I do have to use a discard event to track indentation depths and emit indentation tokens, and the like.
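To make that concrete, here is a stripped-down sketch of the kind of discard-event setup I mean (symbol names are illustrative, not from my real grammar):

```perl
use Marpa::R2;

# Sketch only: a discard event on whitespace pauses the read so the
# application can inspect each skipped run (e.g. to measure indent depth).
my $grammar = Marpa::R2::Scanless::G->new( { source => \(<<'END_OF_DSL') } );
:default ::= action => [ name, values ]
lexeme default = latm => 1
top  ::= word+
word   ~ [\w]+
:discard ~ ws event => 'ws'
ws     ~ [\s]+
END_OF_DSL

my $recce = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
my $input = "foo bar baz";
my $pos   = $recce->read( \$input );
while ( $pos < length $input ) {
    for my $event ( @{ $recce->events() } ) {
        my ($name) = @{$event};
        # A real indentation tracker would examine the discarded span
        # here and feed INDENT/DEDENT lexemes in via lexeme_read().
    }
    $pos = $recce->resume();
}
my $value_ref = $recce->value();
```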

 

However, it’s disappointingly slow, taking about twice as long as the old parser to process our full set of files.  That was really unexpected.

 

Immediately suspecting my code, I cranked my pathological case through Devel::NYTProf; imagine my surprise when the top entry in the list of ‘top 15 subroutines’ was an xsub: Marpa::R2::Thin::V::stack_step.  And this wasn’t by a small amount: that routine took 556s out of roughly 590s, while the next most expensive routine, Marpa::R2::Thin::SLR::read, took only 12.7s.  These two completely dominated the total runtime.

 

While I will try to sanitize my parser and make it postable so people can perhaps point out problems, I thought I might first just ask: is that sort of high cost in stack_step generally representative of some sort of problem in the grammar?  Is there some well-known construct that leads to a blow-up that is easily eliminated, etc.?  Is there something I could log or examine that might shed some light?

 

Any guidance would be appreciated.

 

Michael Dorman

Jeffrey Kegler

unread,
Nov 20, 2017, 5:09:20 PM
to Marpa Parser Mailing List
I'm very rusty at looking at profiles of Marpa::R2, but I hope some off-the-top-of-my-head remarks might be useful.  Typically, almost all the time of an R2 parse is taken up by the Perl layer, and by the semantics -- the parse itself is almost free.  So looking at how you handle the semantics may help.  In particular, the fewer Perl callbacks you use, the faster you are likely to run.

Marpa::R3, when it's fully developed, will allow more of the semantics to be done inside Marpa itself, without callbacks, and should be faster.

--
You received this message because you are subscribed to the Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to marpa-parser+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael Dorman

unread,
Nov 21, 2017, 9:27:52 AM
to marpa-...@googlegroups.com

Hey, Jeffrey,

 

So just to confirm, generating an event (thus requiring processing and $recce->resume()) is part of this cost, yes?

 

Michael Dorman

 



This email message is for the sole use of the intended recipient(s) and may contain confidential information.  Any unauthorized review, use, disclosure or distribution is prohibited.  If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.

Jeffrey Kegler

unread,
Nov 21, 2017, 10:03:03 AM
to Marpa Parser Mailing List
Yes (still guessing), the discard events are likely to be a major part of the cost.

But don't take my guesses too seriously -- measurement may show something different.


amon

unread,
Nov 21, 2017, 4:06:30 PM
to marpa parser
A high cost in stack_step is not entirely surprising, but if it is subjectively taking too long, there are some points that might be fixable. Note that stack_step() implements the semantics of your parser, which are run only after a successful parse. I don't think using events in the recognizer phase contributes to this cost.

One important strategy is to handle as much of your grammar as possible at the L0 level, not the G1 level. This is not always possible, since moving rules down to L0 can change the language matched when the grammar is locally ambiguous. However, a G1 rule such as "digit ::= '0' | '1' | ... | '9'" is almost certainly wrong: the L0 level handles character-level matching far more efficiently, because it does not go through the stack_step() evaluation.
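To illustrate with a hypothetical SLIF fragment (not your grammar), the slow and fast spellings of the same idea:

```
# Slow: every single digit is a G1 step and a stack_step() evaluation.
digit  ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

# Fast: the whole token is matched at L0; G1 sees a single lexeme.
number ~ [0-9]+
```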

Where possible, prefer built-in semantics such as `::first` over providing your own callbacks. Hiding the values of uninteresting symbols (as in "Rule ::= Interesting (Boring)") may be necessary in order to use the built-in semantics, and is a good idea anyway, since it avoids unnecessary evaluations.
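For instance, with made-up rule names: the parentheses hide the delimiter from the semantics, so the built-in `::first` action can return the interesting child's value with no Perl callback at all.

```
# Hypothetical rules. The parenthesized symbol is hidden from the
# semantics; ::first then simply passes through statement's value.
statement_line ::= statement (terminator)  action => ::first
terminator       ~ ';'
```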

The structure of your grammar should describe the AST you're trying to produce as closely as possible. With Marpa it is not necessary to rewrite your grammar to satisfy the restrictions of the parser (aside from extracting sequence rules), so if you translated your Yapp grammar directly, it may contain unnecessary rules. That said, this is micro-optimization and unlikely to result in a phenomenal speedup.

All in all, I am surprised by your 2x slowdown, since the costs of Yapp's and Marpa's semantics are comparable. Possibly the move from your hand-written lexer to Marpa's scanless interface caused a problem, e.g. suddenly producing an order of magnitude more lexemes per document than before. But this kind of issue can't be discussed productively without seeing the grammar in question.

Jeffrey Kegler

unread,
Nov 21, 2017, 7:23:11 PM
to Marpa Parser Mailing List
@amon: A very nice summary.  Thanks!

@mdorman: At this point many people have more, and more recent, experience with Marpa::R2 applications than I do.  [ Though I still know the internals better than anyone, I guess. :-) ]  amon is an expert, and also the author of some very good tutorials.


Ruslan Shvedov

unread,
Nov 22, 2017, 5:46:10 AM
to marpa-...@googlegroups.com
Amon is right. In my experience, if you want Marpa::R2 to be fast: where you can, don’t use G1 for lexing; omit unnecessary symbols; streamline your grammar to resemble the desired AST as closely as possible; and drop semantic actions altogether, processing the AST instead.

Examples:

https://github.com/rns/Marpa--R2/blob/master/cpan/t/sl_json_ast.t
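A condensed sketch of that style (a made-up grammar, but the real pattern): build the AST with the array-descriptor default action and walk it in plain Perl afterward, instead of attaching per-rule callbacks.

```perl
use Marpa::R2;

# ':default ::= action => [ name, values ]' makes every G1 node an
# arrayref of [ rule-name, child-values... ] -- no Perl callbacks run.
my $grammar = Marpa::R2::Scanless::G->new( { source => \(<<'END_OF_DSL') } );
:default ::= action => [ name, values ]
lexeme default = latm => 1
pair ::= key (':') value
key    ~ [\w]+
value  ~ [\d]+
:discard ~ ws
ws     ~ [\s]+
END_OF_DSL

my $ast = ${ $grammar->parse( \'answer : 42' ) };

# Post-process the AST in ordinary Perl rather than in semantic actions.
sub walk {
    my ($node) = @_;
    my ( $name, @children ) = @{$node};
    return { $children[0] => $children[1] } if $name eq 'pair';
}
```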
