Treetop using extend() constantly breaks optimising Rubies

129 views
Skip to first unread message

Clifford Heath

unread,
Feb 28, 2013, 2:11:04 AM2/28/13
to treet...@googlegroups.com
Folk,

I attended Charles Nutter's JRuby workshop and his talk on High Performance Ruby
during last week's RubyConfAU, and he underlined one really important thing amongst
others…

Using Object#extend flushes the method-lookup cache and kills performance badly.
Worse still, in 1.9.3 it flushes all cached methods for all classes, IIRC.

Treetop uses extend on almost every use of every node, even when the module it's
adding is probably empty. Of course, since a module can have methods added at any
time, you can't be sure an empty module will still be empty… however…

Is there anything we can do about this? Is there any alternative pattern that would work?

Clifford Heath.

Jonathan Stott

unread,
Feb 28, 2013, 12:08:41 PM2/28/13
to treet...@googlegroups.com

Hi Clifford

Would it be possible to include/extend once, as the parser is built? And just hold a cache of the resulting classes to be instantiated as the input is parsed. The brace rules could be defined on anonymous modules to be included.

This would break a parser  which is dynamically constructing modules as it parses input, but that seems crazy anyway. Maybr there could be a syntax addition for the current behaviour when that is essential.

Regards
Jon

--
You received this message because you are subscribed to the Google Groups "Treetop Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to treetop-dev...@googlegroups.com.
To post to this group, send email to treet...@googlegroups.com.
Visit this group at http://groups.google.com/group/treetop-dev?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


markus

unread,
Feb 28, 2013, 12:41:16 PM2/28/13
to treet...@googlegroups.com

> Treetop uses extend on almost every use of every node, even when the module it's
> adding is probably empty. Of course, since a module can have methods added at any
> time, you can't be sure an empty module will still be empty… however…
>
> Is there anything we can do about this? Is there any alternative pattern that would work?

Ha! Great point! I've played with something that touched on this (a
wacky Markus-don't-try-this-at-home stunt) and noticed that the
performance varied much more than I'd expected, but didn't investigate
why.

I think it should be possible to do something to address this, though I
suspect that we'll need something that steady states into a no-extends
running mode once warmed up. Eliminating 90% of them won't (AFAIK) help
much because in a extend-rich sequence of execution the major impact
comes from the first extend (when the cache was populated) and the only
sequence that doesn't have a first member is the empty one).

-- MarkusQ


markus

unread,
Feb 28, 2013, 8:12:30 PM2/28/13
to treet...@googlegroups.com
I was able to squeeze in a bit of time here and there over the course of
my day to sketch how this could work. It's not finished to the point of
being functional, but ~80% is there. I'm going to be away from keyboard
for the next 16 hours, so I thought I'd put what I've got done so far up
where it could be seen in case anyone want to peek.

https://github.com/MarkusQ/treetop/tree/no_extend

WARNING: this code is work-in-progress; at this moment the code can be
used to run things like bin/tt my_test_grammar.treetop which will
produce a my_test_grammar.rb file that is extend-free, but also broken
in various ways (e.g., inline module propagation in choices, among
others).

WARNING II: I intend to force push/delete/rebase/mutilate & fold this
branch, so don't become fond of it. Use tongs to pick it up, and wash
with soap afterwards.

That said, it seems to be going pretty smoothly. I've got some (minor)
concerns about temporary array creation and GC, and I've only done three
of the five cases on my mental list, but it does look like we can get to
static-extend free grammars that steady-state to be dynamic-extend free
as well (that is, we'll pay the cost the first time only).

More tomorrow.

-- MarkusQ



Clifford Heath

unread,
Feb 28, 2013, 8:14:36 PM2/28/13
to treet...@googlegroups.com
Great effort Markus. I don't have time to review it today, but I have no doubt you'll find a workable solution for us all.

Clifford Heath.

Shinji KOBAYASHI

unread,
Mar 1, 2013, 3:08:53 AM3/1/13
to treet...@googlegroups.com
Dear Clifford and folks,

With regards to performance, Treetop parser woks faster in 2.0.0 than
in 1.9.3 on
my grammatical rule. (3332steps)
Enumerable#lazy might be an alternative for memoize.

2013/2/28 Clifford Heath <cliffor...@gmail.com>:

Clifford Heath

unread,
Mar 1, 2013, 3:19:52 AM3/1/13
to treet...@googlegroups.com
On 01/03/2013, at 7:08 PM, Shinji KOBAYASHI <shnj.ko...@gmail.com> wrote:
> With regards to performance, Treetop parser woks faster in 2.0.0 than
> in 1.9.3 on my grammatical rule. (3332steps)

Thanks, that's nice to know, but not really that relevant.

No-one expects a packrat parser to be super fast, given the amount of
memory spent on memoizing things that will never be used :P, but this
is different.

The use of Treetop to parse a single character will make *all* your code
in your entire app run slower, almost as if it the app had just been started
and no method-call lookup had been cached. The problem is less-pronounced
in JRuby than MRI, because it doesn't flush the entire cache, but still.

We need to do this stuff statically, and Markus has made a great start on that.

After Treetop stops whamming the entire process' performance, we might
talk about how to limit memoize, perhaps by instrumenting a test run and
dropping code that memoizes things that never get used. Then Treetop would
really start to fly :)

Clifford Heath.

Shinji KOBAYASHI

unread,
Mar 1, 2013, 4:57:23 AM3/1/13
to treet...@googlegroups.com
Yep, I am just excited and celebrating new Ruby and always appreciate
your efforts.
I had to report treetop works well on Ruby 2.0.0 with no problem, at first.
Travis.CI shows Ruby 2.0.0 is 10% faster in executing RSpecs of my library.
https://travis-ci.org/skoba/ruby-impl-openehr/builds/5032897
This is not only the result of treetop benchmark, but shows some tendency.

2013/3/1 Clifford Heath <cliffor...@gmail.com>:

markus

unread,
Mar 2, 2013, 2:47:05 PM3/2/13
to treet...@googlegroups.com
After testing five possible partial solutions to the extend issue, I've
found a pair that cover all the cases (so that, in steady state, after
the parser is "warmed up" it never extends). I've force pushed a
cleaned up version containing just these two (and the requisite test
changes, etc.) to my branch

https://github.com/MarkusQ/treetop/tree/no_extend

The net performance impact is
negligible on MRI 1.8.7
20%-25% faster on MRI 1.9.3
25%-50% faster* on jruby 1.6.7

There are a few points of concern (notably, users counting on the syntax
tree containing nodes of a specific type extended in a specific way or
other such foolishness** may have to fix their code) and the code
generated to statically create the decorated classes could certainly be
cleaned up*** but overall it appears to do what we wanted.

This isn't a pull request yet, but I'm drifting in that direction. If
anybody wants to poke holes in it, now would be a great opportunity.

Have at it!
-- MarkusQ

* The jruby results are less accurate since the random number seed
setting trick that the benchmark uses to assure that the before and
after runs are both given the same workload fails to do so under jruby.
Still, doing several runs and averaging suggests that it's a lot faster.

** Every change breaks someone's workflow: http://xkcd.com/1172/

*** It produces lovely glop like:

if ArtFoo.is_a?(Class)
class ArtFoo_Statement10 < ArtFoo
include Statement10
end
else
class ArtFoo_Statement10 < SyntaxNode
include ArtFoo
include Statement10
end
end
if SyntaxNode.is_a?(Class)
class SyntaxNode_Statement11 < SyntaxNode
include Statement11
end
else
class SyntaxNode_Statement11 < SyntaxNode
include Statement11
end
end

which just screams out for simplification (it wouldn't make it run any
faster, but then it wouldn't make your eyes bleed as much, which has to
count for something).


Paul Madden

unread,
May 9, 2013, 1:36:45 PM5/9/13
to treet...@googlegroups.com
Is there a ticket assigned for this? I couldn't find one but, if there is, I'd like to track it, as I suspect it may be related to the issue I asked about here.

thanks,
paul

markus

unread,
May 9, 2013, 2:41:58 PM5/9/13
to treet...@googlegroups.com
Paul --

> Is there a ticket assigned for this? I couldn't find one but, if there
> is, I'd like to track it, as I suspect it may be related to the issue
> I asked about here.

After getting my proof of concept branch working I got overtaken by
external events and haven't progressed further. I suspect you are
correct, and the problems are related. If possible, could you try my
experimental branch and see if you see any improvements?

-- Markus


Paul Madden

unread,
May 13, 2013, 7:17:31 PM5/13/13
to treet...@googlegroups.com

If possible, could you try my experimental branch and see if you see any improvements?

 Markus,

Yes, your 'no_extend' branch makes a huge difference. Performance on JRuby closely matched MRI in my tests, and the crazy JRuby parse-time variability disappeared. See attachment. I'll probably wait until this is mainline to rely on it, but it's good to know a solution may be coming.

thanks!
paul
timing3.pdf
Reply all
Reply to author
Forward
0 new messages