
to pass self or not to pass self


lallous

Mar 15, 2010, 12:39:50 PM
Hello,

Learning Python from the help file and online resources can leave one
with many gaps. Can someone comment on the following:

# ---------
class X:
    T = 1

    def f1(self, arg):
        print "f1, arg=%d" % arg
    def f2(self, arg):
        print "f2, arg=%d" % arg
    def f3(self, arg):
        print "f3, arg=%d" % arg

    # this:
    F = f2
    # versus this:
    func_tbl = { 1: f1, 2: f2, 3: f3 }

    def test1(self, n, arg):
        # why passing 'self' is needed?
        return self.func_tbl[n](self, arg)

    def test2(self):
        f = self.f1
        f(6)

        f = self.F
        # why passing self is not needed?
        f(87)

# ---------
x = X()

x.test1(1, 5)
print '----------'
x.test2()

Why in test1() when it uses the class variable func_tbl we still need
to pass self, but in test2() we don't ?

What is the difference between the reference in 'F' and 'func_tbl' ?

Thanks,
Elias

TomF

Mar 15, 2010, 1:42:41 PM

I recommend putting print statements into your code like this:

def test1(self, n, arg):
    print "In test1, I'm calling a %s" % self.func_tbl[n]
    return self.func_tbl[n](self, arg)

def test2(self):
    f = self.f1
    print "Now in test2, I'm calling a %s" % f
    f(6)


Bottom line: You're calling different things. Your func_tbl is a dict
of functions, not methods.

-Tom
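
A minimal sketch of the distinction Tom describes, written to run on Python 3 as well, so it returns strings instead of using the print statement. The variable names from_dict and from_attr are made up for the illustration.

```python
class X(object):
    def f1(self, arg):
        return "f1, arg=%d" % arg

    # Built while the class body runs, so the values are plain functions.
    func_tbl = {1: f1}

x = X()

from_dict = x.func_tbl[1]   # plain function: nothing is bound to it
from_attr = x.f1            # bound method: x is already attached

print(type(from_dict).__name__)   # function
print(type(from_attr).__name__)   # method (instancemethod on Python 2)

# Both calls reach the same code; only the dict version needs x passed in.
assert from_dict(x, 5) == from_attr(5) == "f1, arg=5"
```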

Rami Chowdhury

Mar 15, 2010, 2:01:00 PM
to pytho...@python.org, TomF

To build on that a bit, note that in test2() you are doing:


> > f = self.f1
> > f(6)
> >
> > f = self.F
> > # why passing self is not needed?
> > f(87)

As I understand it, since you obtained the reference to 'f1' from 'self', you
got it as a bound rather than an unbound method. So 'self' is automatically
passed in as the first argument.

----
Rami Chowdhury
"Given enough eyeballs, all bugs are shallow." -- Linus' Law
408-597-7068 (US) / 07875-841-046 (UK) / 01819-245544 (BD)
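
One way to see the binding Rami describes is to inspect the bound method itself. This is only a sketch: the class is trimmed down to the F = f2 case from the original post, and the method returns a string so the snippet also runs on Python 3.

```python
class X(object):
    def f2(self, arg):
        return "f2, arg=%d" % arg

    F = f2   # a second class attribute naming the same function

x = X()
f = x.F      # attribute lookup through the instance binds the function

assert f.__self__ is x                  # the instance that will be passed as self
assert f.__func__ is X.__dict__["F"]    # the underlying plain function
assert f(87) == f.__func__(x, 87) == "f2, arg=87"
```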

Bruno Desthuilliers

Mar 16, 2010, 5:04:10 AM
lallous wrote:

> Hello,
>
> Learning Python from the help file and online resources can leave one
> with many gaps. Can someone comment on the following:

(snip code)

> Why in test1() when it uses the class variable func_tbl we still need
> to pass self, but in test2() we don't ?
>
> What is the difference between the reference in 'F' and 'func_tbl' ?

Answer here:

http://wiki.python.org/moin/FromFunctionToMethod
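
As a rough sketch of the mechanism that page covers: functions implement __get__, attribute lookup on an instance calls it, and subscripting a dict does not. The class below is a cut-down version of the one in the original post, and the names raw and bound are illustrative only.

```python
class X(object):
    def f1(self, arg):
        return "f1, arg=%d" % arg

    F = f1                 # found by attribute lookup, so it gets bound
    func_tbl = {1: f1}     # found by dict subscript, so it stays a function

x = X()
raw = X.__dict__["f1"]     # the plain function stored on the class

# Roughly what x.f1 does: call the function's __get__ with the instance.
bound = raw.__get__(x, X)
assert bound(6) == x.f1(6) == x.F(6) == "f1, arg=6"

# Subscripting the dict never calls __get__, so self must be passed by hand.
assert x.func_tbl[1] is raw
assert x.func_tbl[1](x, 6) == "f1, arg=6"
```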

Jason Tackaberry

Mar 16, 2010, 2:59:00 PM
to Bruno Desthuilliers, pytho...@python.org
On Tue, 2010-03-16 at 10:04 +0100, Bruno Desthuilliers wrote:
> Answer here:
>
> http://wiki.python.org/moin/FromFunctionToMethod

I have a sense I used to know this once upon a time, but the question
came to my mind (possibly again) and I couldn't think of an answer:

Why not create the bound methods at instantiation time, rather than
using the descriptor protocol which has the overhead of creating a new
bound method each time the method attribute is accessed?

Cheers,
Jason.
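
The per-access overhead Jason mentions can be observed directly, at least on CPython: two lookups of the same method give two distinct bound-method objects that compare equal. A small sketch with made-up names:

```python
class X(object):
    def f(self):
        return 42

x = X()
a = x.f
b = x.f

assert a is not b                 # two distinct bound-method objects
assert a == b                     # equal: same function, same instance
assert a.__func__ is b.__func__   # both wrap the one function on the class
```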


Jonathan Gardner

Mar 16, 2010, 4:15:13 PM
to pytho...@python.org
On Tue, Mar 16, 2010 at 2:04 AM, Bruno Desthuilliers
<bruno.42.de...@websiteburo.invalid> wrote:
> lallous wrote:

>>
>> What is the difference between the reference in 'F' and 'func_tbl' ?
>
> Answer here:
>
> http://wiki.python.org/moin/FromFunctionToMethod
>

Among all the things in the Python language proper, this is probably
the most confusing thing for new programmers. Heck, even experienced
programmers get tangled up because they project how they think things
should work onto the Python model.

The second most confusing thing is probably how objects get instantiated.

--
Jonathan Gardner
jgar...@jonathangardner.net

Lie Ryan

Mar 17, 2010, 12:57:17 AM

Because people wanted it that way. There was once a time when Python
didn't have the descriptor protocol (old-style classes), and many people
felt that a high-level language like Python should provide some
additional hooks for customizing attribute access, which the existing
solutions like __getattr__ and __setattr__ couldn't easily provide. The
result is new-style classes. Most people will probably never need to use
the descriptor protocol directly, since its immediate benefits are the
property(), classmethod(), and instancemethod() decorators, which would
never have been possible without it.
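
For readers who have not written a descriptor by hand, here is a minimal sketch of the hook in question. The Upper and Person classes are hypothetical and exist only for this illustration; property() works along similar lines.

```python
class Upper(object):
    """Hypothetical read-only descriptor: returns an attribute upper-cased."""
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:      # accessed on the class itself
            return self
        return getattr(instance, self.name).upper()

class Person(object):
    shout = Upper("_name")        # lives on the class, like a method does

    def __init__(self, name):
        self._name = name

p = Person("elias")
assert p.shout == "ELIAS"                # __get__ ran on attribute access
assert isinstance(Person.shout, Upper)   # class access returns the descriptor
```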

Steven D'Aprano

Mar 17, 2010, 1:32:05 AM
On Wed, 17 Mar 2010 15:57:17 +1100, Lie Ryan wrote:

> Most people will probably never need to use the descriptor protocol
> directly, since its immediate benefits are the property(),
> classmethod(), and instancemethod() decorators, which would never have
> been possible without it.


There's an instancemethod decorator? Where?

Are you thinking of staticmethod? "instancemethod", if you mean what I
think you mean, doesn't need a decorator because it is the default
behaviour for new-style classes.


--
Steven
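
To make the three behaviours concrete, a short sketch (the method names plain, static, and klass are invented for the example): an undecorated function gets the instance, a staticmethod gets nothing, and a classmethod gets the class.

```python
class X(object):
    def plain(self, arg):          # default: bound to the instance on access
        return (self, arg)

    @staticmethod
    def static(arg):               # never bound: no implicit first argument
        return arg

    @classmethod
    def klass(cls, arg):           # bound to the class instead of the instance
        return (cls, arg)

x = X()
assert x.plain(1) == (x, 1)
assert x.static(1) == 1
assert x.klass(1) == (X, 1)
```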

Patrick Maupin

Mar 17, 2010, 1:35:03 AM
On Mar 16, 1:59 pm, Jason Tackaberry <t...@urandom.ca> wrote:
> Why not create the bound methods at instantiation time, rather than
> using the descriptor protocol which has the overhead of creating a new
> bound method each time the method attribute is accessed?

Well, for one thing, Python classes are open. They can be added to at
any time. For another thing, you might not ever use most of the
methods of an instance, so it would be a huge waste to create those.
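
A small illustration of the first point, with names made up for the example: a method added to the class after an instance exists is still found by that instance, which would not work if bound methods were fixed at instantiation time.

```python
class X(object):
    def f(self):
        return "f"

x = X()                      # instance created before g exists

def g(self):
    return "g"

X.g = g                      # classes are open: add a method afterwards
assert x.g() == "g"          # the existing instance picks it up on lookup
```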

Also, this area has been optimized for normal usage patterns quite
heavily, to the point where attempted "optimizations" can lead to
results that are, on the surface, quite counterintuitive.

For example, if you want to take the length of a lot of different
strings, you might think you could save time by binding a local
variable to str.__len__ and using that on the strings. Here is an
example:

>>> def a(s, count, lenfunc):
...     for i in xrange(count):
...         z = lenfunc(s)
...
>>> a('abcdef', 100000000, len)
>>> a('abcdef', 100000000, str.__len__)

Running CPython 2.6 on my machine, len() runs about 3 times faster
than str.__len__(). The overhead of checking that an object is usable
with a particular class method far outweighs the cost of creating the
bound method!

So, one thought for the OP. Whenever I have a dictionary that
contains class methods in it, if I'm going to use it heavily, I often
recode it to create the dictionary at object creation time with bound
methods in it.
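
A sketch of that recoding applied to the class from the original post, assuming the table only needs f1 and f2 and that returning strings is acceptable so the snippet also runs on Python 3:

```python
class X(object):
    def __init__(self):
        # Built per instance, so the values are bound methods of this object.
        self.func_tbl = {1: self.f1, 2: self.f2}

    def f1(self, arg):
        return "f1, arg=%d" % arg

    def f2(self, arg):
        return "f2, arg=%d" % arg

    def test1(self, n, arg):
        return self.func_tbl[n](arg)   # no explicit self needed

x = X()
assert x.test1(2, 5) == "f2, arg=5"
```

One trade-off of this layout is that each instance now refers to itself through the bound methods in its own dict, which forms a reference cycle that the garbage collector has to clean up.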

Regards,
Pat

Bruno Desthuilliers

Mar 17, 2010, 5:12:54 AM
Patrick Maupin wrote:

> On Mar 16, 1:59 pm, Jason Tackaberry <t...@urandom.ca> wrote:
>> Why not create the bound methods at instantiation time, rather than
>> using the descriptor protocol which has the overhead of creating a new
>> bound method each time the method attribute is accessed?
>
> Well, for one thing, Python classes are open. They can be added to at
> any time. For another thing, you might not ever use most of the
> methods of an instance, so it would be a huge waste to create those.

A possible optimization would be a simple memoization on first access.
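
One way such memoization could look, as a sketch rather than anything the interpreter does: a hypothetical memoized_method descriptor that stores the bound method in the instance dict the first time it is requested.

```python
class memoized_method(object):
    """Hypothetical non-data descriptor that caches the bound method."""
    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self.func
        bound = self.func.__get__(instance, owner)
        # Store on the instance; later lookups find this entry first because
        # a descriptor without __set__ does not override the instance dict.
        instance.__dict__[self.name] = bound
        return bound

class X(object):
    @memoized_method
    def f(self):
        return 42

x = X()
first = x.f          # goes through __get__ and caches
second = x.f         # served straight from x.__dict__
assert first is second
assert x.f() == 42
```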

Lie Ryan

Mar 17, 2010, 6:41:50 AM

Whoops... yep, sorry about that. Got all it up the mixed head in...

Lie Ryan

Mar 17, 2010, 9:16:53 AM

But what if, for example, one uses some descriptor/metaclass magic to
make it so that each subsequent access to the attribute returns a method
bound to different objects?

Patrick Maupin

Mar 17, 2010, 10:43:50 AM
On Mar 17, 4:12 am, Bruno Desthuilliers <bruno.

I do agree that memoization on access is a good pattern, and I use it
frequently. I don't know if I would want the interpreter
automagically doing that for everything, though -- it would require
some thought to figure out what the overhead cost is for the things
that are only used once.

Usually, I will have a slight naming difference for the things I want
memoized, to get the memoization code to run. For example, if you add
an underbar in front of everything you want memoized:

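class foo(object):

    def _bar(self):
        pass

    def __getattr__(self, aname):
        # Only called when normal lookup fails, so cached names skip it.
        if aname.startswith('_'):
            raise AttributeError(aname)
        value = getattr(self, '_' + aname)
        # Cache under the requested name; self.aname would store 'aname'.
        setattr(self, aname, value)
        return value

obj = foo()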

So then the first time you look up obj.bar, it builds the bound
method, and on subsequent accesses it just returns the previously
bound method.

Regards,
Pat

Bruno Desthuilliers

Mar 17, 2010, 10:51:53 AM
Lie Ryan wrote:

Well, that's the whole problem with dynamism vs optimization...

Terry Reedy

Mar 17, 2010, 3:55:35 PM
to pytho...@python.org
On 3/17/2010 1:35 AM, Patrick Maupin wrote:
> >>> def a(s, count, lenfunc):
> ...     for i in xrange(count):
> ...         z = lenfunc(s)
> ...
> >>> a('abcdef', 100000000, len)
> >>> a('abcdef', 100000000, str.__len__)
> Running CPython 2.6 on my machine, len() runs about 3 times faster
> than str.__len__(). The overhead of checking that an object is usable
> with a particular class method far outweighs the cost of creating the
> bound method!

Wow, this surprised me so much that I had to try it with 3.1 (on WinXP),
and I got a similar result (about 2.6x longer with str.__len__). This is
a real lesson in "measure, don't guess", and in how premature
'optimization' may be a pessimization. Thanks.

Terry Jan Reedy

Patrick Maupin

Mar 17, 2010, 5:21:44 PM

Actually, I think I overstated my case -- there is some special logic
for len and built-in objects, I think. I can see the same thing with
normal attributes on subclasses of object(), but not nearly as
dramatic. In any case, your conclusion about this being a lesson in
"measure, don't guess" holds, with the additional caveat that, if it
matters, you need to somehow do some additional measurements to make
sure you are measuring what you think you are measuring!

Pat

Joaquin Abian

Mar 17, 2010, 6:34:53 PM

Patrick, I was trying to understand the way your code was working but
I think I'm not getting it.

I tested:

from time import time

class foo1(object):
    def _bar(self):
        pass
    def __getattr__(self, name):
        value = getattr(self, '_' + name)
        self.name = value
        return value

class foo2(object):
    def bar(self):
        pass

def a(klass, count):
    ins = klass()
    for i in xrange(count):
        z = ins.bar()

t0 = time()
a(foo1, 10000000)
t1 = time()
a(foo2, 10000000)
t2 = time()

print t1-t0 #75 sec
print t2-t1 #11 sec

foo1 is a lot slower than foo2. I understood that memoization should
optimize attribute calls. Maybe I am putting my foot in my mouth...

Thanks
JA

Patrick Maupin

Mar 17, 2010, 7:11:42 PM

I don't think you are putting your foot in your mouth. I always have
to test to remember what works faster and what doesn't. Usually when
I memoize as I showed, it is not a simple attribute lookup, but
something that takes more work to create. As I stated in my response
to Terry, I overstated my case earlier, because of some optimizations
in len(), I think. Nonetheless, (at least on Python 2.6) I think the
advice I gave to the OP holds. One difference is that you are doing
an attribute lookup in your inner loop. I do find that performance
hit surprising, but to compare with what the OP is describing, you
either need to look up an unbound function in a dict and call it with
a parameter, or look up a bound method in a dict and call it without
the parameter. Since the dict lookup doesn't usually do anything
fancy and unexpected like attribute lookup, we pull the dict lookup
out of the equation and out of the inner loop, and just do the
comparison like this:

>>> class foo(object):
...     def bar(self):
...         pass
...
>>> x = foo()
>>>
>>> def a(func, count):
...     for i in xrange(count):
...         z=func()
...
>>> def b(func, param, count):
...     for i in xrange(count):
...         z=func(param)
...
>>>
>>> a(x.bar, 100000000) # 13 seconds
>>> b(foo.bar, x, 100000000) # 18 seconds

Regards,
Pat
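
For anyone who wants to repeat the comparison, the standard timeit module handles the looping and the clock. This is a sketch that runs on Python 2 and 3; the repeat count n is an arbitrary small value chosen for the example, and the numbers will differ by machine and interpreter version.

```python
import timeit

class foo(object):
    def bar(self):
        pass

x = foo()
bound = x.bar        # bound method, looked up once outside the timed code
unbound = foo.bar    # function (unbound method on Python 2): needs x passed

n = 100000           # hypothetical repeat count, far smaller than in the post
t_bound = timeit.timeit(lambda: bound(), number=n)
t_unbound = timeit.timeit(lambda: unbound(x), number=n)

print("bound:   %.4f s" % t_bound)
print("unbound: %.4f s" % t_unbound)
```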

Joaquin Abian

Mar 17, 2010, 8:55:41 PM

OK, Thanks. Need to play a little bit with it.
Cheers
JA

Gregory Ewing

Mar 20, 2010, 3:32:23 AM
Patrick Maupin wrote:

> Actually, I think I overstated my case -- there is some special logic
> for len and built-in objects, I think.

Yes, len() invokes the C-level sq_length slot of the type object,
which for built-in types points directly to the C function
implementing the len() operation for that type. So len() on
a string doesn't involve looking up a __len__ attribute or
creating a bound method at all.

The only reason a __len__ attribute exists at all for
built-in types is to give Python code the illusion that
C-coded and Python-coded classes work the same way. Most
of the time it's never used.

If you try the same test using a method that doesn't have
a corresponding type slot (basically anything without a
double_underscore name) you will probably see a small
improvement.

--
Greg
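
The slot-based lookup can be observed from Python code too. In this sketch (the Box class is hypothetical) an instance attribute named __len__ is ignored by len(), because len() goes through the type and not through ordinary attribute lookup on the instance.

```python
class Box(object):
    def __len__(self):
        return 3

b = Box()
b.__len__ = lambda: 99        # instance attribute shadowing the method name

assert len(b) == 3            # len() consults the type, not the instance
assert b.__len__() == 99      # explicit attribute access sees the instance value
assert type(b).__len__(b) == 3
```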
