Learning Python from the help file and online resources can leave one
with many gaps. Can someone comment on the following:
# ---------
class X:
    T = 1

    def f1(self, arg):
        print "f1, arg=%d" % arg

    def f2(self, arg):
        print "f2, arg=%d" % arg

    def f3(self, arg):
        print "f3, arg=%d" % arg

    # this:
    F = f2
    # versus this:
    func_tbl = { 1: f1, 2: f2, 3: f3 }

    def test1(self, n, arg):
        # why is passing 'self' needed?
        return self.func_tbl[n](self, arg)

    def test2(self):
        f = self.f1
        f(6)
        f = self.F
        # why is passing self not needed?
        f(87)
# ---------
x = X()
x.test1(1, 5)
print '----------'
x.test2()
Why, in test1(), when it uses the class variable func_tbl, do we still
need to pass self, but in test2() we don't?
What is the difference between the references in 'F' and 'func_tbl'?
Thanks,
Elias
I recommend putting print statements into your code like this:
def test1(self, n, arg):
    print "In test1, I'm calling a %s" % self.func_tbl[n]
    return self.func_tbl[n](self, arg)

def test2(self):
    f = self.f1
    print "Now in test2, I'm calling a %s" % f
    f(6)
Bottom line: You're calling different things. Your func_tbl is a dict
of functions, not methods.
-Tom
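To make that concrete, here is a minimal sketch (a cut-down, hypothetical
version of the OP's class, not the exact code) of the functions-vs-methods
difference Tom describes:

```python
class X(object):
    def f2(self, arg):
        return arg * 2
    F = f2                      # plain class attribute referring to f2
    func_tbl = {2: f2}          # the dict values are ordinary functions

x = X()
# Attribute access (x.F) goes through the descriptor protocol, so the
# function is wrapped into a bound method and self is supplied for you:
assert x.F(3) == 6
# A dict lookup is just a dict lookup -- no descriptor machinery runs,
# so the caller must pass the instance explicitly:
assert x.func_tbl[2](x, 3) == 6
```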
To build on that a bit, note that in test2() you are doing:
> > f = self.f1
> > f(6)
> >
> > f = self.F
> > # why passing self is not needed?
> > f(87)
As I understand it, since you obtained the reference to 'f1' from 'self', you
got it as a bound rather than an unbound method. So 'self' is automatically
passed in as the first argument.
----
Rami Chowdhury
"Given enough eyeballs, all bugs are shallow." -- Linus' Law
408-597-7068 (US) / 07875-841-046 (UK) / 01819-245544 (BD)
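A small illustration of that binding behaviour (it also holds in Python 3,
where class-level lookup yields a plain function rather than an "unbound
method"):

```python
class X(object):
    def f1(self, arg):
        return "f1, arg=%d" % arg

x = X()

# Looked up on the class: nothing is bound, so the instance is explicit.
assert X.f1(x, 5) == "f1, arg=5"

# Looked up on the instance: the descriptor protocol binds x in,
# and self is passed automatically.
assert x.f1(5) == "f1, arg=5"
assert x.f1.__self__ is x    # the bound method remembers its instance
```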
(snip code)
> Why in test1() when it uses the class variable func_tbl we still need
> to pass self, but in test2() we don't ?
>
> What is the difference between the reference in 'F' and 'func_tbl' ?
Answer here:
I have a sense I used to know this once upon a time, but the question
came to my mind (possibly again) and I couldn't think of an answer:
Why not create the bound methods at instantiation time, rather than
using the descriptor protocol which has the overhead of creating a new
bound method each time the method attribute is accessed?
Cheers,
Jason.
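For concreteness, the behaviour the question is about -- a fresh
bound-method object is built on every attribute access:

```python
class X(object):
    def f(self):
        pass

x = X()
# Two accesses produce two distinct bound-method objects...
assert x.f is not x.f
# ...although they compare equal (same underlying function, same instance).
assert x.f == x.f
assert x.f.__func__ is X.__dict__['f']
```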
Among all the things in the Python language proper, this is probably
the most confusing thing for new programmers. Heck, even experienced
programmers get tangled up because they project how they think things
should work on to the Python model.
The second most confusing thing is probably how objects get instantiated.
--
Jonathan Gardner
jgar...@jonathangardner.net
Because people wanted it that way. There was once a time when Python
didn't have the descriptor protocol (old-style classes), and many people
felt that a high-level language like Python should provide additional
hooks for customizing attribute access that the existing solutions,
__getattr__ and __setattr__, couldn't easily provide. The result is the
new-style class. Most people will probably never need to use the
descriptor protocol directly, since its immediate benefits are property(),
classmethod(), and the instancemethod() decorator, which, without the
descriptor protocol, would never have been possible.
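As a sketch of one of those benefits: property() is itself a descriptor,
so plain attribute access can run a getter. The class name and conversion
here are made up for illustration:

```python
class Celsius(object):
    def __init__(self, degrees):
        self._degrees = degrees

    @property                      # property objects implement __get__
    def fahrenheit(self):
        return self._degrees * 9.0 / 5.0 + 32

t = Celsius(100)
assert t.fahrenheit == 212.0       # attribute access invokes the getter
assert hasattr(type(t).__dict__['fahrenheit'], '__get__')
```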
> Most people will probably never need to use the descriptor protocol
> directly, since its immediate benefits are property(), classmethod(),
> and the instancemethod() decorator, which, without the descriptor
> protocol, would never have been possible.
There's an instancemethod decorator? Where?
Are you thinking of staticmethod? "instancemethod", if you mean what I
think you mean, doesn't need a decorator because it is the default
behaviour for new-style classes.
--
Steven
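Steven's point can be shown directly: any plain function stored on a
(new-style) class acts as an instance method, with no decorator at all:

```python
def standalone(self, arg):
    return arg + 1

class C(object):
    method = standalone    # no decorator needed

c = C()
assert c.method(1) == 2    # self is bound automatically
```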
Well, for one thing, Python classes are open. They can be added to at
any time. For another thing, you might not ever use most of the
methods of an instance, so it would be a huge waste to create those.
Also, this area has been optimized for normal usage patterns quite
heavily, to the point where attempted "optimizations" can lead to
results that are, on the surface, quite counterintuitive.
For example, if you want to take the length of a lot of different
strings, you might think you could save time by binding a local
variable to str.__len__ and using that on the strings. Here is an
example:
>>> def a(s, count, lenfunc):
...     for i in xrange(count):
...         z = lenfunc(s)
...
>>> a('abcdef', 100000000, len)
>>> a('abcdef', 100000000, str.__len__)
Running cPython 2.6 on my machine, len() runs about 3 times faster
than str.__len__(). The overhead of checking that an object is usable
with a particular class method far outweighs the cost of creating the
bound method!
So, one thought for the OP. Whenever I have a dictionary of methods that
I'm going to use heavily, I often recode it so the dictionary is created
at object creation time, with bound methods as its values.
Regards,
Pat
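Pat's suggestion, sketched with hypothetical names: build the dispatch
dict in __init__ from already-bound methods, so callers never pass self
explicitly and no method object is created per call:

```python
class Dispatcher(object):
    def f1(self, arg):
        return "f1:%d" % arg

    def f2(self, arg):
        return "f2:%d" % arg

    def __init__(self):
        # self.f1 / self.f2 are bound once, at object creation time.
        self.func_tbl = {1: self.f1, 2: self.f2}

    def dispatch(self, n, arg):
        return self.func_tbl[n](arg)   # no explicit self argument

d = Dispatcher()
assert d.dispatch(2, 7) == "f2:7"
```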
A possible optimization would be a simple memoization on first access.
Whoops... yep, sorry about that. Got it all mixed up in my head...
But what if, for example, one uses some descriptor/metaclass magic to
make it so that each subsequent access to the attribute returns a method
bound to different objects?
I do agree that memoization on access is a good pattern, and I use it
frequently. I don't know if I would want the interpreter
automagically doing that for everything, though -- it would require
some thought to figure out what the overhead cost is for the things
that are only used once.
Usually, I will have a slight naming difference for the things I want
memoized, to get the memoization code to run. For example, if you add
an underbar in front of everything you want memoized:
class foo(object):
    def _bar(self):
        pass

    def __getattr__(self, aname):
        if aname.startswith('_'):
            raise AttributeError(aname)
        value = getattr(self, '_' + aname)
        setattr(self, aname, value)   # cache under the requested name
        return value
obj = foo()
So then the first time you look up obj.bar, it builds the bound
method, and on subsequent accesses it just returns the previously
bound method.
Regards,
Pat
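A self-contained version of that pattern (using setattr so the cache
actually lands under the requested name), showing what the caching looks
like from the outside:

```python
class foo(object):
    def _bar(self):
        return 42

    def __getattr__(self, aname):
        # Only called when normal lookup fails, i.e. before caching.
        if aname.startswith('_'):
            raise AttributeError(aname)
        value = getattr(self, '_' + aname)
        setattr(self, aname, value)   # cache the bound method on the instance
        return value

obj = foo()
first = obj.bar           # __getattr__ builds and caches the bound method
assert obj.bar is first   # later accesses hit the instance dict directly
assert obj.bar() == 42
```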
Well, that's the whole problem with dynamism vs optimization...
Wow, this so surprised me that I had to try it with 3.1 (on WinXP), and
got a similar result (about 2.6x longer with str.__len__). This is a
real lesson in "measure, don't guess", and in how premature
"optimization" may be a pessimization. Thanks.
Terry Jan Reedy
Actually, I think I overstated my case -- there is some special logic
for len and built-in objects, I think. I can see the same thing with
normal attributes on subclasses of object(), but not nearly as
dramatic. In any case, your conclusion about this being a lesson in
"measure, don't guess" holds, with the additional caveat that, if it
matters, you need to somehow do some additional measurements to make
sure you are measuring what you think you are measuring!
Pat
Patrick, I was trying to understand the way your code was working, but
I think I'm not getting it.
I tested:
from time import time
class foo1(object):
    def _bar(self):
        pass
    def __getattr__(self, name):
        value = getattr(self, '_' + name)
        self.name = value
        return value

class foo2(object):
    def bar(self):
        pass

def a(klass, count):
    ins = klass()
    for i in xrange(count):
        z = ins.bar()

t0 = time()
a(foo1, 10000000)
t1 = time()
a(foo2, 10000000)
t2 = time()
print t1-t0  # 75 sec
print t2-t1  # 11 sec
foo1 is a lot slower than foo2. I understood that memoization should
optimize attribute calls. Maybe I am putting my foot in my mouth...
Thanks
JA
I don't think you are putting your foot in your mouth. I always have
to test to remember what works faster and what doesn't. Usually when
I memoize as I showed, it is not a simple attribute lookup, but
something that takes more work to create. As I stated in my response
to Terry, I overstated my case earlier, because of some optimizations
in len(), I think. Nonetheless, (at least on Python 2.6) I think the
advice I gave to the OP holds. One difference is that you are doing
an attribute lookup in your inner loop. I do find that performance
hit surprising, but to compare with what the OP is describing, you
either need to look up an unbound function in a dict and call it with
a parameter, or look up a bound method in a dict and call it without
the parameter. Since the dict lookup doesn't usually do anything
fancy and unexpected like attribute lookup, we pull the dict lookup
out of the equation and out of the inner loop, and just do the
comparison like this:
>>> class foo(object):
...     def bar(self):
...         pass
...
>>> x = foo()
>>>
>>> def a(func, count):
...     for i in xrange(count):
...         z = func()
...
>>> def b(func, param, count):
...     for i in xrange(count):
...         z = func(param)
...
>>>
>>> a(x.bar, 100000000)       # 13 seconds
>>> b(foo.bar, x, 100000000)  # 18 seconds
Regards,
Pat
OK, Thanks. Need to play a little bit with it.
Cheers
JA
> Actually, I think I overstated my case -- there is some special logic
> for len and built-in objects, I think.
Yes, len() invokes the C-level sq_len slot of the type object,
which for built-in types points directly to the C function
implementing the len() operation for that type. So len() on
a string doesn't involve looking up a __len__ attribute or
creating a bound method at all.
The only reason a __len__ attribute exists at all for
built-in types is to give Python code the illusion that
C-coded and Python-coded classes work the same way. Most
of the time it's never used.
If you try the same test using a method that doesn't have
a corresponding type slot (basically anything without a
double_underscore name) you will probably see a small
improvement.
--
Greg
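One way to check Greg's explanation on your own interpreter -- a rough
timing sketch with timeit (absolute numbers will vary by machine and
Python version; only the comparison is interesting):

```python
import timeit

# len() dispatches through the C-level length slot of the type; the
# s.__len__() spelling does a full attribute lookup first.
t_len = timeit.timeit('len(s)', setup='s = "abcdef"', number=200000)
t_dunder = timeit.timeit('s.__len__()', setup='s = "abcdef"', number=200000)
print(t_len, t_dunder)
```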