A little digging later, and it looks like 98% of these calls -- or
more -- are in Marshal.dump. That's ... amazing.
-austin
--
Austin Ziegler * halos...@gmail.com
* Alternate: aus...@halostatue.ca
> I had a user report a slowdown in PDF::Writer. I'm pretty certain I
> know where the problem is, but I found something very interesting.
> About 3% of the profiled performance of the manual generation (with
> Shugo Maeda's new profiler) -- and near 20,000,000 (yes twenty
> MILLION) calls -- are calls to Kernel.respond_to?.
>
> A little digging later, and it looks like 98% of these calls -- or
> more -- are in Marshal.dump. That's ... amazing.
wow. can you confirm with somthing like
def Kernel::respond_to?(*a, &b)
#
# keep yer own count somehow...
#
__aliased_respond_to?(*a, &b)
end
def Marshal::dump(*a, &b)
#
# keep yer own count somehow...
#
__aliased_dump(*a, &b)
end
and see how they scale (linearly/exponentially/etc)? that's so crazy it seems
like more investigation might be needed...
cheers.
-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================
I've done quite a bit of confirmation. Much of this has to do with
the unique nature of the object graph in PDF::Writer documents --
there's quite a bit of depth to the object graph.
I was originally investigating Transaction::Simple as the culprit of
my number of calls, but that didn't mesh with what I *did* find. My
final test was to do:
def Kernel.respond_to?(*a)
puts caller[0]
old_respond_to(*a)
end
After a few moments, I literally had tens of *thousands* of calls
indicating that the caller was Marshal.dump.
My guess -- and I haven't looked at the Marshal.dump code to confirm
-- is that what's beeing looked for is a custom dumper, and that
search is faster than assuming the dumper is there (because in most
cases, it won't be, I don't think).
> On 5/31/05, Ara.T.Howard <Ara.T....@noaa.gov> wrote:
>> On Wed, 1 Jun 2005, Austin Ziegler wrote:
>>> I had a user report a slowdown in PDF::Writer. I'm pretty certain I know
>>> where the problem is, but I found something very interesting. About 3% of
>>> the profiled performance of the manual generation (with Shugo Maeda's new
>>> profiler) -- and near 20,000,000 (yes twenty MILLION) calls -- are calls
>>> to Kernel.respond_to?. =20 A little digging later, and it looks like 98%
>>> of these calls -- or more -- are in Marshal.dump. That's ... amazing.
>> wow. can you confirm with somthing like
>
> I've done quite a bit of confirmation. Much of this has to do with the
> unique nature of the object graph in PDF::Writer documents -- there's quite
> a bit of depth to the object graph.
>
> I was originally investigating Transaction::Simple as the culprit of my
> number of calls, but that didn't mesh with what I *did* find. My final test
> was to do:
>
> def Kernel.respond_to?(*a)
> puts caller[0]
> old_respond_to(*a)
> end
>
> After a few moments, I literally had tens of *thousands* of calls indicating
> that the caller was Marshal.dump.
>
> My guess -- and I haven't looked at the Marshal.dump code to confirm -- is
> that what's beeing looked for is a custom dumper, and that search is faster
> than assuming the dumper is there (because in most cases, it won't be, I
> don't think).
great. thanks for the follow-up. i run into things like that alot and always
wonder if calling a method and rescuing NoMethodError is slower or faster than
checking with respond_to?
harp:~ > cat a.rb
require 'benchmark'
class C
def c; end
def rescue_test n
n.times do
begin
self.a
rescue NoMethodError
begin
self.b
rescue NoMethodError
self.c
end
end
end
end
def respond_to_test n
n.times do
if respond_to? 'a'
self.a
elsif respond_to? 'b'
self.b
else
self.c
end
end
end
end
c = C::new
n = 2 ** 16
GC::disable
Benchmark::bm do |b|
%w(rescue_test respond_to_test).map{|m| b.report(m){c.send m,n}}
end
harp:~ > ruby a.rb
user system total real
rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
looks like respond_to? is indeed faster.
> Benchmark::bm do |b|
> %w(rescue_test respond_to_test).map{|m| b.report(m){c.send m,n}}
> end
>
>
> harp:~ > ruby a.rb
> user system total real
> rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
> respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
>
>
> looks like respond_to? is indeed faster.
But if the method *is* defined, rescue is faster:
class C
def a; end # <== added
def c; end
...
user system total real
rescue_test 0.070000 0.000000 0.070000 ( 0.072631)
respond_to_test 0.130000 0.010000 0.140000 ( 0.140980)
So, if some obejcts have the methods and some dont, the best choice
depends on the balance between the two branches (defined vs. undefined).
very true - i hadn't thought of that...
thanks.
> Ara.T....@noaa.gov wrote:
> > rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
> > respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
> >
> >
> >looks like respond_to? is indeed faster.
>
> But if the method *is* defined, rescue is faster:
I'm not sure I understand. If the method exists, what is the
need for rescue to be called?
Are the options being tested these:
# with respond_to?
if x.respond_to(:meth) x.meth # or x.call(:meth)
# or with rescue
begin
x.meth
rescue
# no meth
So, if meth exists, the only overhead should be the 'if x.respond_to'.
If the method does nothing, then a check of this type may appear expensive,
but if the method does something, then the 'if' should not be too much
of a burden.
I suppose that this could be a problem in a tight loop. Under such
conditions, it may be better to just rescue.
--
Jim Freeze
Theory and practice are the same, in theory. -- Ryan Davis
I was unclear: I was hypothesizing a different test, in which a certain
proportion of the objects had the method and the rest did not. It might
be interesting to determine what ratio is the break-even point...
Now, if we could only find out if the method is defined, we could choose
the most efficient way of asking if a method is defined.
Or something like that.
--
-- Jim Weirich j...@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)
> Jim Freeze wrote:
> >* Joel VanderWerf <vj...@path.berkeley.edu> [2005-06-03 04:43:52 +0900]:
> >I'm not sure I understand. If the method exists, what is the
> >need for rescue to be called?
>
> I was unclear: I was hypothesizing a different test, in which a certain
> proportion of the objects had the method and the rest did not. It might
> be interesting to determine what ratio is the break-even point...
Yes, but doesn't that ratio depend upon the time taken in each method?
I think the rule of thumb is:
- If a missing method is a rare occurance, use rescue
- If a missing method is common and expected, use #respond_to
- If you don't know, then drop back ten yards and punt.
> Ara.T....@noaa.gov wrote:
>
>> Benchmark::bm do |b|
>> %w(rescue_test respond_to_test).map{|m| b.report(m){c.send m,n}}
>> end
>> harp:~ > ruby a.rb
>> user system total real
>> rescue_test 2.940000 0.320000 3.260000 ( 3.264758)
>> respond_to_test 0.160000 0.050000 0.210000 ( 0.208268)
>> looks like respond_to? is indeed faster.
>>
>
> But if the method *is* defined, rescue is faster:
>
> class C
> def a; end # <== added
> def c; end
>
> So, if some obejcts have the methods and some dont, the best choice
> depends on the balance between the two branches (defined vs.
> undefined).
class C
def a; Object.no_such_method; end
end
rescue NoMethodError is an inappropriate test for whether or not a
method exists. (This may or may not be important depending on what
what else you need to rescue.)
Being faster does not make it the correct solution.
--
Eric Hodel - drb...@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04