Interesting discovery...

Austin Ziegler

unread,

May 31, 2005, 9:41:10 PM5/31/05

to

I had a user report a slowdown in PDF::Writer. I'm pretty certain I
know where the problem is, but I found something very interesting.
About 3% of the profiled performance of the manual generation (with
Shugo Maeda's new profiler) -- and near 20,000,000 (yes twenty
MILLION) calls -- are calls to Kernel.respond_to?.

A little digging later, and it looks like 98% of these calls -- or
more -- are in Marshal.dump. That's ... amazing.

-austin
--
Austin Ziegler * halos...@gmail.com
* Alternate: aus...@halostatue.ca

Ara.T.Howard

unread,

May 31, 2005, 11:02:40 PM5/31/05

to

On Wed, 1 Jun 2005, Austin Ziegler wrote:

> I had a user report a slowdown in PDF::Writer. I'm pretty certain I
> know where the problem is, but I found something very interesting.
> About 3% of the profiled performance of the manual generation (with
> Shugo Maeda's new profiler) -- and near 20,000,000 (yes twenty
> MILLION) calls -- are calls to Kernel.respond_to?.
>
> A little digging later, and it looks like 98% of these calls -- or
> more -- are in Marshal.dump. That's ... amazing.

wow. can you confirm with somthing like

def Kernel::respond_to?(*a, &b)
#
# keep yer own count somehow...
#
__aliased_respond_to?(*a, &b)
end

def Marshal::dump(*a, &b)
#
# keep yer own count somehow...
#
__aliased_dump(*a, &b)
end

and see how they scale (linearly/exponentially/etc)? that's so crazy it seems
like more investigation might be needed...

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================

Austin Ziegler

unread,

Jun 1, 2005, 8:09:20 AM6/1/05

to

On 5/31/05, Ara.T.Howard <Ara.T....@noaa.gov> wrote:
> On Wed, 1 Jun 2005, Austin Ziegler wrote:
>> I had a user report a slowdown in PDF::Writer. I'm pretty certain
>> I know where the problem is, but I found something very
>> interesting. About 3% of the profiled performance of the manual
>> generation (with Shugo Maeda's new profiler) -- and near
>> 20,000,000 (yes twenty MILLION) calls -- are calls to
>> Kernel.respond_to?.
>>
>> A little digging later, and it looks like 98% of these calls --
>> or more -- are in Marshal.dump. That's ... amazing.
> wow. can you confirm with somthing like

I've done quite a bit of confirmation. Much of this has to do with
the unique nature of the object graph in PDF::Writer documents --
there's quite a bit of depth to the object graph.

I was originally investigating Transaction::Simple as the culprit of
my number of calls, but that didn't mesh with what I *did* find. My
final test was to do:

def Kernel.respond_to?(*a)
puts caller[0]
old_respond_to(*a)
end

After a few moments, I literally had tens of *thousands* of calls
indicating that the caller was Marshal.dump.

My guess -- and I haven't looked at the Marshal.dump code to confirm
-- is that what's beeing looked for is a custom dumper, and that
search is faster than assuming the dumper is there (because in most
cases, it won't be, I don't think).

Ara.T....@noaa.gov

unread,

Jun 1, 2005, 10:27:30 PM6/1/05

to

On Wed, 1 Jun 2005, Austin Ziegler wrote:

> On 5/31/05, Ara.T.Howard <Ara.T....@noaa.gov> wrote:
>> On Wed, 1 Jun 2005, Austin Ziegler wrote:
>>> I had a user report a slowdown in PDF::Writer. I'm pretty certain I know
>>> where the problem is, but I found something very interesting. About 3% of
>>> the profiled performance of the manual generation (with Shugo Maeda's new
>>> profiler) -- and near 20,000,000 (yes twenty MILLION) calls -- are calls

>>> to Kernel.respond_to?. =20 A little digging later, and it looks like 98%

>>> of these calls -- or more -- are in Marshal.dump. That's ... amazing.
>> wow. can you confirm with somthing like
>
> I've done quite a bit of confirmation. Much of this has to do with the
> unique nature of the object graph in PDF::Writer documents -- there's quite
> a bit of depth to the object graph.
>
> I was originally investigating Transaction::Simple as the culprit of my
> number of calls, but that didn't mesh with what I *did* find. My final test
> was to do:
>
> def Kernel.respond_to?(*a)
> puts caller[0]
> old_respond_to(*a)
> end
>
> After a few moments, I literally had tens of *thousands* of calls indicating
> that the caller was Marshal.dump.
>
> My guess -- and I haven't looked at the Marshal.dump code to confirm -- is
> that what's beeing looked for is a custom dumper, and that search is faster
> than assuming the dumper is there (because in most cases, it won't be, I
> don't think).

great. thanks for the follow-up. i run into things like that alot and always
wonder if calling a method and rescuing NoMethodError is slower or faster than
checking with respond_to?

harp:~ > cat a.rb
require 'benchmark'

class C
def c; end
def rescue_test n
n.times do
begin
self.a
rescue NoMethodError
begin
self.b
rescue NoMethodError
self.c
end
end
end
end
def respond_to_test n
n.times do
if respond_to? 'a'
self.a
elsif respond_to? 'b'
self.b
else
self.c
end
end
end
end

c = C::new
n = 2 ** 16
GC::disable

Benchmark::bm do |b|
%w(rescue_test respond_to_test).map{|m| b.report(m){c.send m,n}}
end

harp:~ > ruby a.rb
user system total real
rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)

looks like respond_to? is indeed faster.

Joel VanderWerf

unread,

Jun 2, 2005, 3:43:52 PM6/2/05

to

Ara.T....@noaa.gov wrote:

> Benchmark::bm do |b|
> %w(rescue_test respond_to_test).map{|m| b.report(m){c.send m,n}}
> end
>
>
> harp:~ > ruby a.rb
> user system total real
> rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
> respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
>
>
> looks like respond_to? is indeed faster.

But if the method *is* defined, rescue is faster:

class C
def a; end # <== added
def c; end
...

user system total real
rescue_test 0.070000 0.000000 0.070000 ( 0.072631)
respond_to_test 0.130000 0.010000 0.140000 ( 0.140980)

So, if some obejcts have the methods and some dont, the best choice
depends on the balance between the two branches (defined vs. undefined).

Ara.T....@noaa.gov

unread,

Jun 2, 2005, 4:06:12 PM6/2/05

to

very true - i hadn't thought of that...

thanks.

Jim Freeze

unread,

Jun 2, 2005, 5:09:04 PM6/2/05

to

* Joel VanderWerf <vj...@path.berkeley.edu> [2005-06-03 04:43:52 +0900]:

> Ara.T....@noaa.gov wrote:
> > rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
> > respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
> >
> >
> >looks like respond_to? is indeed faster.
>
> But if the method *is* defined, rescue is faster:

I'm not sure I understand. If the method exists, what is the
need for rescue to be called?

Are the options being tested these:

# with respond_to?
if x.respond_to(:meth) x.meth # or x.call(:meth)
# or with rescue
begin
x.meth
rescue
# no meth

So, if meth exists, the only overhead should be the 'if x.respond_to'.
If the method does nothing, then a check of this type may appear expensive,
but if the method does something, then the 'if' should not be too much
of a burden.
I suppose that this could be a problem in a tight loop. Under such
conditions, it may be better to just rescue.

--
Jim Freeze
Theory and practice are the same, in theory. -- Ryan Davis

Joel VanderWerf

unread,

Jun 2, 2005, 5:43:40 PM6/2/05

to

Jim Freeze wrote:
> * Joel VanderWerf <vj...@path.berkeley.edu> [2005-06-03 04:43:52 +0900]:
>
>
>>Ara.T....@noaa.gov wrote:
>>
>>> rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
>>> respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
>>>
>>>
>>>looks like respond_to? is indeed faster.
>>
>>But if the method *is* defined, rescue is faster:
>
>
> I'm not sure I understand. If the method exists, what is the
> need for rescue to be called?

I was unclear: I was hypothesizing a different test, in which a certain
proportion of the objects had the method and the rest did not. It might
be interesting to determine what ratio is the break-even point...

Jim Weirich

unread,

Jun 2, 2005, 5:45:55 PM6/2/05

to

Ara.T....@noaa.gov said:
> On Fri, 3 Jun 2005, Joel VanderWerf wrote:
>> But if the method *is* defined, rescue is faster:
>

> very true - i hadn't thought of that...

Now, if we could only find out if the method is defined, we could choose
the most efficient way of asking if a method is defined.

Or something like that.

--
-- Jim Weirich j...@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)

Jim Freeze

unread,

Jun 2, 2005, 6:30:42 PM6/2/05

to

* Joel VanderWerf <vj...@path.berkeley.edu> [2005-06-03 06:43:40 +0900]:

> Jim Freeze wrote:
> >* Joel VanderWerf <vj...@path.berkeley.edu> [2005-06-03 04:43:52 +0900]:

> >I'm not sure I understand. If the method exists, what is the

> >need for rescue to be called?
>
> I was unclear: I was hypothesizing a different test, in which a certain
> proportion of the objects had the method and the rest did not. It might
> be interesting to determine what ratio is the break-even point...

Yes, but doesn't that ratio depend upon the time taken in each method?

I think the rule of thumb is:

- If a missing method is a rare occurance, use rescue
- If a missing method is common and expected, use #respond_to
- If you don't know, then drop back ten yards and punt.

Eric Hodel

unread,

Jun 2, 2005, 6:58:40 PM6/2/05

to

On 02 Jun 2005, at 12:43, Joel VanderWerf wrote:

> Ara.T....@noaa.gov wrote:
>
>> Benchmark::bm do |b|
>> %w(rescue_test respond_to_test).map{|m| b.report(m){c.send m,n}}
>> end
>> harp:~ > ruby a.rb
>> user system total real
>> rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
>> respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
>> looks like respond_to? is indeed faster.
>>
>
> But if the method *is* defined, rescue is faster:
>
> class C
> def a; end # <== added
> def c; end
>

> So, if some obejcts have the methods and some dont, the best choice
> depends on the balance between the two branches (defined vs.
> undefined).

class C
def a; Object.no_such_method; end
end

rescue NoMethodError is an inappropriate test for whether or not a
method exists. (This may or may not be important depending on what
what else you need to rescue.)

Being faster does not make it the correct solution.

--
Eric Hodel - drb...@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04