Empty list as default parameter

Alex Panayotopoulos

Nov 21, 2003, 7:12:43 AM
Hello all,

Maybe I'm being foolish, but I just don't understand why the following
code behaves as it does:

- = - = - = -

class listHolder:
    def __init__( self, myList=[] ):
        self.myList = myList

    def __repr__( self ): return str( self.myList )

# debug: 'a' should contain 42, 'b' should be empty. But no.
a = listHolder()
a.myList.append( 42 )
b = listHolder()
print a
print b

- = - = - = -

I was expecting to see [42] then [], but instead I see [42] then [42]. It
seems that a and b share a reference to the same list object. Why?

--
<<<Alexspudros Potatopoulos>>>
Defender of Spudkind

anton muhin

Nov 21, 2003, 7:42:14 AM
Alex Panayotopoulos wrote:

Really common mistake: lists are _mutable_ objects and self.myList
references the same object as the default parameter. Therefore
a.myList.append modifies default value as well. Mutable defaults are
better avoided (except for some variants of memo pattern). Standard
trick is:

def __init__(self, myList = None):
    if myList is None:
        self.myList = []
    else:
        self.myList = myList
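
A minimal sketch of the behaviour and of the fix, written for Python 3 (which the thread predates). The class names `ListHolder` and `SafeListHolder` are illustrative only and do not come from the thread:

```python
class ListHolder:
    # The [] below is created once, when this def statement runs.
    def __init__(self, my_list=[]):
        self.my_list = my_list


class SafeListHolder:
    # None is only a marker; a fresh list is built on each call.
    def __init__(self, my_list=None):
        if my_list is None:
            my_list = []
        self.my_list = my_list


a, b = ListHolder(), ListHolder()
a.my_list.append(42)
print(a.my_list is b.my_list)   # True: both instances share one list

c, d = SafeListHolder(), SafeListHolder()
c.my_list.append(42)
print(d.my_list)                # []: d keeps its own empty list
```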

regards,
anton.

Thorsten Pferdekämper

Nov 21, 2003, 7:39:36 AM
>
> class listHolder:
>     def __init__( self, myList=[] ):
>         self.myList = myList
>
>     def __repr__( self ): return str( self.myList )
>
> # debug: 'a' should contain 42, 'b' should be empty. But no.
> a = listHolder()
> a.myList.append( 42 )
> b = listHolder()
> print a
> print b
>
> - = - = - = -
>
> I was expecting to see [42] then [], but instead I see [42] then [42]. It
> seems that a and b share a reference to the same list object. Why?
>

Hi,
AFAIK, the default parameter values are only instantiated once. So, the
default-myList is always the same object. The coding above should be
rewritten like...

class listHolder:
    def __init__( self, myList=None ):
        if myList is None:
            self.myList = []
        else:
            self.myList = myList

Regards,
Thorsten


Fredrik Lundh

Nov 21, 2003, 7:36:50 AM
to pytho...@python.org
Alex Panayotopoulos wrote:

> Maybe I'm being foolish, but I just don't understand why the following
> code behaves as it does:
>
> - = - = - = -
>
> class listHolder:
>     def __init__( self, myList=[] ):
>         self.myList = myList
>
>     def __repr__( self ): return str( self.myList )
>
> # debug: 'a' should contain 42, 'b' should be empty. But no.
> a = listHolder()
> a.myList.append( 42 )
> b = listHolder()
> print a
> print b
>
> - = - = - = -
>
> I was expecting to see [42] then [], but instead I see [42] then [42]. It
> seems that a and b share a reference to the same list object. Why?

the default value expression is evaluated once, when the function
object is created, and the resulting object is bound to the argument.

if you want to create a new object on every call, you have to do
that yourself:

def __init__( self, myList=None):
    if myList is None:
        myList = [] # create a new list
    self.myList = myList

or perhaps:

def __init__( self, myList=None):
    self.myList = myList or []

see the description of the "def" statement for more info:

http://www.python.org/doc/current/ref/function.html

</F>


Alex Panayotopoulos

Nov 21, 2003, 8:26:13 AM
On Fri, 21 Nov 2003, anton muhin wrote:

> Really common mistake: lists are _mutable_ objects and self.myList
> references the same object as the default parameter. Therefore
> a.myList.append modifies default value as well.

It does?!
Ah, I've found it: listHolder.__init__.func_defaults
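
In Python 3 the same tuple is exposed as `__defaults__`; the `func_defaults` name exists only in Python 2. A small sketch, with `append_to` as a made-up function, shows the stored default changing as it is mutated:

```python
def append_to(item, bucket=[]):
    # bucket is the single list stored on the function object.
    bucket.append(item)
    return bucket

print(append_to.__defaults__)   # ([],)
append_to(1)
append_to(2)
print(append_to.__defaults__)   # ([1, 2],)
```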

Hmm... this behaviour is *very* counter-intuitive. I expect that if I were
to define a default to an explicit object...

def process_tree1( start=rootNode )

...then I should indeed be able to process rootNode through manipulating
start. However, if I define a default as a new instance of an object...

def process_tree2( myTree=tree() )

...then the function should, IMHO, create a new object every time it is
entered. (By having func_defaults point to tree.__init__, or summat.)

Was there any reason that this sort of behaviour was not implemented?

> Mutable defaults are better avoided (except for some variants of memo
> pattern). Standard trick is:
>
> def __init__(self, myList = None):
>     if myList is None:
>         self.myList = []
>     else:
>         self.myList = myList

Thank you. I shall use this in my code. (Although I would have preferred a
trick that uses less lines!)

anton muhin

Nov 21, 2003, 9:15:19 AM
Alex Panayotopoulos wrote:

> On Fri, 21 Nov 2003, anton muhin wrote:
>
>
> Was there any reason that this sort of behaviour was not implemented?
>
>

It was discussed several times (search for it, if you're interested),
but I don't remember the details.

>>Mutable defaults are better avoided (except for some variants of memo
>>pattern). Standard trick is:
>>
>> def __init__(self, myList = None):
>>     if myList is None:
>>         self.myList = []
>>     else:
>>         self.myList = myList
>
>
> Thank you. I shall use this in my code. (Although I would have preferred a
> trick that uses less lines!)
>

If you want a one-liner, something like this might work:
    self.myList = myList or []

regards,
anton.

Andrei

Nov 21, 2003, 9:39:46 AM
to pytho...@python.org
Alex Panayotopoulos wrote on Fri, 21 Nov 2003 13:26:13 +0000:

<snip>

> Hmm... this behaviour is *very* counter-intuitive. I expect that if I were
> to define a default to an explicit object...

I think so too. It's documented as a Python pitfall. Here are some more:
http://zephyrfalcon.org/labs/python_pitfalls.html

>> def __init__(self, myList = None):
>>     if myList is None:
>>         self.myList = []
>>     else:
>>         self.myList = myList
>
> Thank you. I shall use this in my code. (Although I would have preferred a
> trick that uses less lines!)

You could use the quasi-ternary trick:

>>> mylist = None
>>> (mylist is None and [[]] or [mylist])[0]
[]
>>> mylist = []
>>> (mylist is None and [[]] or [mylist])[0]
[]
>>> mylist = [3,4]
>>> (mylist is None and [[]] or [mylist])[0]
[3, 4]

Not that big an improvement really :).
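
Python 2.5 and later have a real conditional expression, which covers the same three cases without the list-wrapping trick. A sketch for comparison; the helper name `pick` is invented here:

```python
def pick(mylist):
    # Only None triggers the fallback; an empty list is kept as-is.
    return [] if mylist is None else mylist

empty = []
print(pick(None))            # []
print(pick(empty) is empty)  # True
print(pick([3, 4]))          # [3, 4]
```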

--
Yours,

Andrei

=====
Mail address in header catches spam. Real contact info (decode with rot13):
cebw...@jnanqbb.ay. Fcnz-serr! Cyrnfr qb abg hfr va choyvp cbfgf. V ernq
gur yvfg, fb gurer'f ab arrq gb PP.


Robin Munn

Nov 21, 2003, 10:59:01 AM
Alex Panayotopoulos <A.Panayo...@sms.ed.ac.uk> wrote:
> On Fri, 21 Nov 2003, anton muhin wrote:
>
>> Really common mistake: lists are _mutable_ objects and self.myList
>> references the same object as the default parameter. Therefore
>> a.myList.append modifies default value as well.
>
> It does?!
> Ah, I've found it: listHolder.__init__.func_defaults
>
> Hmm... this behaviour is *very* counter-intuitive. I expect that if I were
> to define a default to an explicit object...
>
> def process_tree1( start=rootNode )
>
> ...then I should indeed be able to process rootNode through manipulating
> start. However, if I define a default as a new instance of an object...
>
> def process_tree2( myTree=tree() )
>
> ...then the function should, IMHO, create a new object every time it is
> entered. (By having func_defaults point to tree.__init__, or summat.)
>
> Was there any reason that this sort of behaviour was not implemented?

The reason for this behavior lies in the fact that Python is an
interpreted language, and that the class and def keywords are actually
considered statements. One creates a class object with a certain name,
the other creates a function object with a certain name. The def (or
class) statement is executed as soon as the entire function code or
class code has been parsed -- in other words, as soon as the interpreter
drops out of the indentation block of the def or class statement.
Therefore, any default values passed to function arguments in a def
statement will be evaluated once and only once, when the def statement
is executed.

Look at this, for example:


n = 5
def f(x = n):
    return x
n = 3

print n # Prints 3
print f() # Prints 5

Note that the default argument to f() is the value of n when the def
statement was executed, not the value of n when f() is called.
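
The same point in Python 3 syntax, extended to show that running the def statement again picks up the new value. `make_f` is a hypothetical helper added only so that the def can be executed twice:

```python
n = 5

def make_f():
    # Each call executes the inner def statement again,
    # so the default expression is re-evaluated at that moment.
    def f(x=n):
        return x
    return f

f_old = make_f()
n = 3
f_new = make_f()

print(f_old())  # 5, bound when the first def ran
print(f_new())  # 3, bound when the second def ran
```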

--
Robin Munn
rm...@pobox.com

Peter Otten

Nov 21, 2003, 12:38:25 PM
anton muhin wrote:

This is dangerous, don't do it.

>>> None or "just a marker"
'just a marker'

seems to work. But:

>>> [] or "just a marker"
'just a marker'

I. e. your one-liner will create a new list when the myList argument is not
provided or is provided and bound to an empty list.

Peter

Peter Otten

Nov 21, 2003, 12:56:18 PM
Fredrik Lundh wrote:

> or perhaps:
>
> def __init__( self, myList=None):
>     self.myList = myList or []

This treats empty and non-empty myList args differently:

>>> mylist = []
>>> (mylist or []) is mylist
False
>>> mylist = [1]
>>> (mylist or []) is mylist
True

To be consistent you could do

self.myList = myList[:] or []

>>> mylist = []
>>> (mylist[:] or []) is mylist
False
>>> mylist = [1]
>>> (mylist[:] or []) is mylist
False

or avoid the above pattern altogether.

Peter

Alex Panayotopoulos

Nov 21, 2003, 1:18:14 PM
On Fri, 21 Nov 2003, anton muhin wrote:
> Alex Panayotopoulos wrote:
>
> > Was there any reason that this sort of behaviour was not implemented?
>
> It was discussed several times (search for it, if you're interested),

I have done so. It seems to be one of the most popular FAQs... 8^)

As I see it, this is not just a 'wart'; it's a major pitfall that seems
entirely out of place in python; everything up til now has made sense.[0]

The argument that this behaviour is useful for counters and such may have
been true in the past, but this was only ever a pseudo-generator hack. Now
we have real generators I don't see when you'd ever want to write
"foo(x=[])".

If it were up to me, I'd have def throw a warning wherever it sees a
square bracket, curly bracket or call to an __init__... But then I'm not a
python developer... 8^P 8^)

Okay, I've whinged enough, I think. No more posts from me on this subject.
8^)

> If you want a one-liner, something like this might work:
>     self.myList = myList or []

Thanks. That's sneaky without being confusing... I like.

--
<<<Alexspudros Potatopoulos>>>
Defender of Spudkind

[0] Though yes, I am happy that int division is to be changed.

Alex Panayotopoulos

Nov 21, 2003, 4:29:07 PM
On Fri, 21 Nov 2003, Peter Otten wrote:

[...]


> To be consistent you could do
>
> self.myList = myList[:] or []

TypeError. You can't slice None.
If you're going to be taking copies on entry, you might as well use

def __init__( self, myList=[] ):
    self.myList = myList[:]

However, copying is inefficient. Not a problem in most cases, but
I've got a lot of objects to initialise. (This is for a GA). Solution: use
"myList or []", but carefully.

Peter Otten

Nov 21, 2003, 7:22:11 PM
Alex Panayotopoulos wrote:

> On Fri, 21 Nov 2003, Peter Otten wrote:
>
> [...]
>> To be consistent you could do
>>
>> self.myList = myList[:] or []
>
> TypeError. You can't slice None.
> If you're going to be taking copies on entry, you might as well use
>
> def __init__( self, myList=[] ):
>     self.myList = myList[:]

That was the idea; if you are copying anyway, the [] default value does
no harm. Shame on me for not making this clear :-(
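
A short Python 3 check of that claim, with `CopyingHolder` as an illustrative name: because the default list is only ever copied, it is never handed out and never mutated.

```python
class CopyingHolder:
    def __init__(self, my_list=[]):
        # The shared default is copied, never stored directly.
        self.my_list = my_list[:]

a = CopyingHolder()
a.my_list.append(42)
b = CopyingHolder()
print(b.my_list)                             # []
print(CopyingHolder.__init__.__defaults__)   # ([],)
```

The cost, as noted in the quoted text that follows, is one copy per call, and a caller's list is no longer shared with the instance.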

> However, copying is inefficient. Not a problem in most cases, but
> I've got a lot of objects to initialise. (This is for a GA). Solution: use
> "myList or []", but carefully.

If you are mutating the myList attribute, the changes propagate to the list
given as a parameter only if that parameter was a non-empty list. Such a
behaviour will probably puzzle anyone but yourself (and yourself in a few
months), and what was gained by the shorter expression will be wasted on
documentation and/or debugging. Personally, I would avoid it even in
throwaway scripts.
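
The inconsistency can be reproduced in a few lines. A sketch in Python 3, with `Holder` as an illustrative class name:

```python
class Holder:
    def __init__(self, my_list=None):
        # Falls back to a new list for None *and* for any empty list.
        self.my_list = my_list or []

empty, full = [], [1]
h_empty, h_full = Holder(empty), Holder(full)

h_empty.my_list.append("x")
h_full.my_list.append("x")

print(empty)  # []: the caller's empty list was silently replaced
print(full)   # [1, 'x']: the caller's non-empty list is shared
```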

Peter

Stian Søiland

Nov 22, 2003, 4:19:58 PM
* Robin Munn spake thusly:

> Look at this, for example:
>
>
> n = 5
> def f(x = n):
>     return x
> n = 3
>
> print n # Prints 3
> print f() # Prints 5
>
> Note that the default argument to f() is the value of n when the def
> statement was executed, not the value of n when f() is called.

Wouldn't it be more logical for a programmer that x should evaluate
to '3' inside f()?

I can't see what is the purpose of binding default variables at
definition time instead of runtime.

I know perfectly well of the "param is None"-trick - and I've used it
far too often. I've never had any use of early binding of default
parameters except when making a remembering-function-for-fun:

>>> def remember(value=None, list=[]):
...     if value is None:
...         return list
...     else:
...         list.append(value)
...
>>> remember(1)
>>> remember("Hello")
>>> remember()
[1, 'Hello']

This example is presented in LISP as a way to do object orientation
without extending LISP. In Python it is an example of when you should
use an object instead.
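
For comparison, a sketch of the object-based version in Python 3. The class name `Memory` is invented for illustration:

```python
class Memory:
    """Keeps the remembered values as ordinary instance state."""

    def __init__(self):
        self.values = []

    def remember(self, value):
        self.values.append(value)

memory = Memory()
memory.remember(1)
memory.remember("Hello")
print(memory.values)  # [1, 'Hello']
```

Each `Memory` instance has its own list, so there is no hidden state attached to a function object.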

Default variables should be meant as local names that can be overridden.
In LISP, default variables are evaluated at runtime:


; This function just returns a new list, but prints
; "New" each time
[58]> (defun newlist () (print 'New ) ())
NEWLIST

; this function takes an optional parameter mylist, if it
; is not supplied, newlist is called and assigned to mylist
; The function returns mylist.
[59]> (defun getlist (&optional (mylist (newlist))) mylist)
GETLIST

; note how newlist() is called
[60]> (getlist)
NEW
NIL

; each time
[61]> (getlist)
NEW
NIL

; but not when the parameter is supplied
[62]> (getlist ())
NIL


This does not work in Python:

>>> def newlist():
...     print "New"
...     return []
...
>>> def getlist(mylist=newlist()):
...     return mylist
...
New
>>> getlist()
[]
>>> getlist()
[]

As one could see, newlist is called at definition time, and only once.

I think that default parameters should be evaluated each time the
function is called and the parameter is not supplied. This is a major
change, so it has to be delayed until 3.0 (possibly enabled by
__future__-imports)
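
Call-time evaluation of this kind can be approximated in current Python with a decorator. This is only a sketch under stated assumptions: `Late` and `late_defaults` are invented names, not part of Python or of any library, and the code targets Python 3 because it relies on `inspect.signature`.

```python
import functools
import inspect

class Late:
    """Hypothetical marker: wraps a zero-argument factory."""
    def __init__(self, factory):
        self.factory = factory

def late_defaults(func):
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            # Fill in only the parameters the caller left out.
            if name not in bound.arguments and isinstance(param.default, Late):
                bound.arguments[name] = param.default.factory()
        return func(*bound.args, **bound.kwargs)
    return wrapper

calls = []

def newlist():
    calls.append("New")
    return []

@late_defaults
def getlist(mylist=Late(newlist)):
    return mylist

first, second = getlist(), getlist()
print(first is second)  # False: newlist ran once per call
print(len(calls))       # 2
getlist([])             # argument supplied, so newlist is not called
print(len(calls))       # 2
```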

--
Stian Søiland               Being able to break security doesn't make
Trondheim, Norway           you a hacker more than being able to hotwire
http://stain.portveien.to/  cars makes you an automotive engineer. [ESR]

Fredrik Lundh

Nov 22, 2003, 7:59:51 PM
to pytho...@python.org
Alex Panayotopoulos wrote:

> However, if I define a default as a new instance of an object...
>
> def process_tree2( myTree=tree() )
>
> ...then the function should, IMHO, create a new object every time it is
> entered. (By having func_defaults point to tree.__init__, or summat.)

point to tree.__init__ ?

you haven't spent much time thinking about this, have you?

</F>


Bengt Richter

Nov 22, 2003, 9:43:23 PM
On Fri, 21 Nov 2003 13:26:13 +0000, Alex Panayotopoulos <A.Panayo...@sms.ed.ac.uk> wrote:

>On Fri, 21 Nov 2003, anton muhin wrote:
>
>> Really common mistake: lists are _mutable_ objects and self.myList
>> references the same object as the default parameter. Therefore
>> a.myList.append modifies default value as well.
>
>It does?!
>Ah, I've found it: listHolder.__init__.func_defaults
>
>Hmm... this behaviour is *very* counter-intuitive. I expect that if I were
>to define a default to an explicit object...
>
> def process_tree1( start=rootNode )
>
>...then I should indeed be able to process rootNode through manipulating
>start. However, if I define a default as a new instance of an object...
>
> def process_tree2( myTree=tree() )
>
>...then the function should, IMHO, create a new object every time it is
>entered. (By having func_defaults point to tree.__init__, or summat.)
>
>Was there any reason that this sort of behaviour was not implemented?

Yes. The bindings of default values for call args are evaluated at define-time,
in the context of the definition, not at execution time, when the function is called.

If you want to specify what to call at execution time, you have to call it at execution time,
so you either have to pass a reference to the thing to call ('tree' in this case) or have it
otherwise visible from inside the function, e.g., the interface could be something like

def process_tree2(treemaker=tree):
    myTree = treemaker()
    ...

Now if you wanted to pass a pre-existing tree to this kind of process_tree2, you'd have to
pass a callable that would execute to produce the tree, e.g., call it like

temp_treemaker = lambda: rootNode
process_tree2(temp_treemaker) # or just put the lambda expression right in the call
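
A runnable Python 3 sketch of this factory-style interface. The `Tree` class and its `visited` attribute are stand-ins invented for the example:

```python
class Tree:
    """Stand-in for the thread's tree class."""
    def __init__(self):
        self.visited = False

def process_tree2(treemaker=Tree):
    # The factory is called here, at call time, so each call
    # without an argument gets its own tree.
    my_tree = treemaker()
    my_tree.visited = True
    return my_tree

root_node = Tree()
print(process_tree2() is process_tree2())             # False
print(process_tree2(lambda: root_node) is root_node)  # True
```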

More likely, you'd want to be able to accept a tree, or make one if nothing was passed. E.g.,

def process_tree3(myTree=None):
    if myTree is None: myTree = tree() # here tree must be visible in an enclosing scope
    ...                                # -- usually global, but not necessarily

a variant would be to have a default value of a single particular tree, as in your first example,
and then test whether something executable, like tree (not tree()) was being passed, e.g.,

def process_tree1( start=rootNode ):
    if callable(start): start = start() # e.g., if called like process_tree1(tree)

Of course, if a node object is also callable, you have to make a different check ;-)



>
>> Mutable defaults are better avoided (except for some variants of memo
>> pattern). Standard trick is:
>>
>> def __init__(self, myList = None):
>>     if myList is None:
>>         self.myList = []
>>     else:
>>         self.myList = myList
>
>Thank you. I shall use this in my code. (Although I would have preferred a
>trick that uses less lines!)
>

How about two less? I usually do it (with no trick ;-) like:

def __init__(self, myList = None):
    if myList is None: myList = []
    self.myList = myList

Regards,
Bengt Richter

Paul Rubin

Nov 22, 2003, 10:05:58 PM
bo...@oz.net (Bengt Richter) writes:
> Yes. The bindings of default values for call args are evaluated at
> define-time, in the context of the definition, not at execution
> time, when the function is called.

That's always seemed like a source of bugs to me, and ugly workarounds
for the bugs. Is there any corresponding disadvantage to eval'ing the
defaults at runtime as is done in other languages?

Bengt Richter

Nov 22, 2003, 11:06:58 PM

What languages are you thinking of? A concrete example for comparison would
clarify things. Would you have default expressions effectively passed in as
the bodies of lambdas (which might mean creating closures, depending on what was
referenced) and then executed to create the local bindings prior to the first line in
a function or method? It would certainly be inefficient for all the cases where you
just wanted a static default (unless you special cased those to work as now -- but
remember, only bare names and constant literals could be special cased that way. An
expression like os.RD_ONLY (yes that is an expression!) would have to be passed as
lambda: os.RDONLY). So you'd have to counter that by making bare-name bindings prior
to calls, like tmp_mode=os.RD_ONLY; os.open('foo.txt', tmp_mode); #etc

I expect the balance of use cases favors the current methodology. A cursory grep through
the defs of /lib/*py shows those with no defaults at all to be the majority (>80%?), and of
the defs with '=' on the same line (983 of 5443), most seem to be =None or =<some int> or
=<some string> to the tiring eyeball. I didn't process it further.
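
A rough version of such a survey can be scripted with the `ast` module instead of grep. This is a Python 3 sketch: the sample source and the `count_defs` name are made up for illustration, and it makes no claim about what the real standard library would report.

```python
import ast

def count_defs(source):
    """Return (total defs, defs with at least one default)."""
    total = with_defaults = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            total += 1
            # kw_defaults holds None for keyword-only args without a default.
            has_kw_default = any(d is not None for d in node.args.kw_defaults)
            if node.args.defaults or has_kw_default:
                with_defaults += 1
    return total, with_defaults

sample = '''
def a(x): pass
def b(x, y=None): pass
def c(*, flag=False): pass
def d(): pass
'''
print(count_defs(sample))  # (4, 2)
```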

Regards,
Bengt Richter

Jay O'Connor

Nov 23, 2003, 12:46:58 AM
Bengt Richter wrote:

>On 22 Nov 2003 19:05:58 -0800, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>
>
>
>>bo...@oz.net (Bengt Richter) writes:
>>
>>
>>>Yes. The bindings of default values for call args are evaluated at
>>>define-time, in the context of the definition, not at execution
>>>time, when the function is called.
>>>
>>>
>>That's always seemed like a source of bugs to me, and ugly workarounds
>>for the bugs. Is there any corresponding disadvantage to eval'ing the
>>defaults at runtime as is done in other languages?
>>
>>
>What languages are you thinking of? A concrete example for comparison would
>clarify things.
>

One example that had not occurred to me until reading this thread has to
do with how Smalltalk methods are built. Since Smalltalk doesn't allow
for default parameters, one very common idiom is for a developer to
create a sort of 'cascade' effect of multiple methods. The full method
will require all the parameters, but one or more of the more common ways
of using the method will be provided which just turn around and call the
main method with some default parameter provided for the parameters not
being specified.

An example:

Canvas>>drawShape: aShape color: aColor pen: aPen

"do all the drawing in this method."
...

Now, if the developer decides that often the color or pen are going to
be pretty standard he may provide some 'convenience methods' that other
developers can call that in turn just call the main method with defaults.

Canvas>>drawShape: aShape

"Use default color of black and a solid pen"
^self drawShape: aShape color: #black pen: #solid

Canvas>>drawShape: aShape color: aColor

^self drawShape: aShape color: aColor pen: #solid

Canvas>>drawShape: aShape pen: aPen

^self drawShape: aShape color: #black pen: aPen.

This can be a bit awkward to write sometimes, but makes life pretty
nice for the other developers. This is not always the case but just
based on what the developer thinks are going to be likely patterns of use.

What came to mind, of course, is that this allows the defaults to be
dynamic.

Canvas>>drawShape: aShape
^self drawShape: aShape color: self currentColor pen: self currentPen

You are still providing defaults, but the defaults are based on the
current state of the system at execution, not at compile time. This is
actually a fairly common idiom in Smalltalk, but Smalltalk's mechanism
for method signatures is fairly unique and happens to support this
approach well.
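
The closest Python counterpart to this idiom is a `None` marker resolved against the object's current state at call time. A sketch in Python 3; the `Canvas` class, its attribute names and the tuple it returns are assumptions made for the example, not a translation of any real Smalltalk class:

```python
class Canvas:
    def __init__(self):
        self.current_color = "black"
        self.current_pen = "solid"

    def draw_shape(self, shape, color=None, pen=None):
        # Defaults are looked up at call time from the canvas state.
        if color is None:
            color = self.current_color
        if pen is None:
            pen = self.current_pen
        return (shape, color, pen)

canvas = Canvas()
print(canvas.draw_shape("circle"))  # ('circle', 'black', 'solid')
canvas.current_color = "red"
print(canvas.draw_shape("circle"))  # ('circle', 'red', 'solid')
```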


Paul Rubin

Nov 23, 2003, 1:06:03 AM
bo...@oz.net (Bengt Richter) writes:
> >That's always seemed like a source of bugs to me, and ugly workarounds
> >for the bugs. Is there any corresponding disadvantage to eval'ing the
> >defaults at runtime as is done in other languages?

> What languages are you thinking of? A concrete example for
> comparison would clarify things.

Common Lisp is the most obvious one.

> Would you have default expressions effectively passed in as the
> bodies of lambdas (which might mean creating closures, depending on
> what was referenced) and then executed to create the local bindings
> prior to the first line in a function or method? It would certainly
> be inefficient for all the cases where you just wanted a static

> default ...

The compiler can do the obvious things to make efficient code in the
normal cases.

> (unless you special cased those to work as now -- but
> remember, only bare names and constant literals could be special
> cased that way. An expression like os.RD_ONLY (yes that is an
> expression!) would have to be passed as lambda: os.RDONLY). So you'd
> have to counter that by making bare-name bindings prior to calls,
> like tmp_mode=os.RD_ONLY; os.open('foo.txt', tmp_mode); #etc

Rather than have the programmer go through such contortions it's better
to fix the compiler to generate the obvious code inline, and then rely
on the compiler to get these things right.

Peter Hansen

Nov 23, 2003, 9:28:04 AM
Stian Søiland wrote:
>
> * Robin Munn spake thusly:
> > Look at this, for example:
> >
> >
> > n = 5
> > def f(x = n):
> >     return x
> > n = 3
> >
> > print n # Prints 3
> > print f() # Prints 5
> >
> > Note that the default argument to f() is the value of n when the def
> > statement was executed, not the value of n when f() is called.
>
> Wouldn't it be more logical for a programmer that x should evaluate
> to '3' inside f()?
>
> I can't see what is the purpose of binding default variables at
> definition time instead of runtime.

Purpose? Who needs a purpose? ... "def" is a *statement* in Python,
so naturally the code in the argument list is most easily handled
at definition time, when the def statement is being executed, rather
than at run time.

It also means that the default arguments don't have to be evaluated
dynamically each time the function is called, which would in some
cases be a performance nightmare...
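
A small Python 3 illustration of the cost argument. `build_table` is a hypothetical stand-in for an expensive default expression, and the counter only records how often it runs:

```python
evaluations = 0

def build_table():
    # Stand-in for an expensive computation.
    global evaluations
    evaluations += 1
    return {i: i * i for i in range(1000)}

def square(i, table=build_table()):
    return table[i]

for i in range(100):
    square(i)

print(evaluations)  # 1: the table was built when def ran, not per call
```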

-Peter

Bengt Richter

Nov 23, 2003, 11:33:37 AM
On 22 Nov 2003 22:06:03 -0800, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

>bo...@oz.net (Bengt Richter) writes:
>> >That's always seemed like a source of bugs to me, and ugly workarounds
>> >for the bugs. Is there any corresponding disadvantage to eval'ing the
>> >defaults at runtime as is done in other languages?
>
>> What languages are you thinking of? A concrete example for
>> comparison would clarify things.
>
>Common Lisp is the most obvious one.

You are referring to initforms, I assume. I wonder how often macros are used
IRL to defeat the deferral and plug in pre-computed static default values that
the compiler can't infer its way to at compile time?

>
>> Would you have default expressions effectively passed in as the
>> bodies of lambdas (which might mean creating closures, depending on
>> what was referenced) and then executed to create the local bindings
>> prior to the first line in a function or method? It would certainly
>> be inefficient for all the cases where you just wanted a static
>> default ...
>
>The compiler can do the obvious things to make efficient code in the
>normal cases.
>
>> (unless you special cased those to work as now -- but
>> remember, only bare names and constant literals could be special
>> cased that way. An expression like os.RD_ONLY (yes that is an
>> expression!) would have to be passed as lambda: os.RDONLY). So you'd
>> have to counter that by making bare-name bindings prior to calls,
>> like tmp_mode=os.RD_ONLY; os.open('foo.txt', tmp_mode); #etc
>
>Rather than have the programmer go through such contortions it's better
>to fix the compiler to generate the obvious code inline, and then rely
>on the compiler to get these things right.

I wasn't really advocating programmer contortions ;-)
But the trouble is that the compiler can't guess what you _mean_, except
for the obvious cases of bare names and constant literals, so otherwise you have
to code explicitly in any case. E.g., is it obvious that getattr(os, 'RD_ONLY')
should be done at call time or optimized away to def time in os.open('foo.txt', os.RD_ONLY) ?
I don't think you can optimize it away without telling the compiler one way or another,
or changing the dynamic nature of the language.
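
The ambiguity can be made concrete. In this Python 3 sketch, `settings` is a hypothetical stand-in for a module such as `os`, and `mode` for an attribute on it (the real read-only flag in the `os` module is spelled `os.O_RDONLY`). The two functions differ only in when the attribute is looked up:

```python
from types import SimpleNamespace

settings = SimpleNamespace(mode="read-only")

def open_early(path, mode=settings.mode):
    # Attribute looked up once, when the def statement ran.
    return (path, mode)

def open_late(path, mode=None):
    # Attribute looked up on every call.
    if mode is None:
        mode = settings.mode
    return (path, mode)

settings.mode = "read-write"
print(open_early("foo.txt"))  # ('foo.txt', 'read-only')
print(open_late("foo.txt"))   # ('foo.txt', 'read-write')
```

Both results are defensible, which is why a compiler cannot pick one without being told.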

In any case it would be a semantic change, and I'd hate to have the job of finding breakage ;-)

Regards,
Bengt Richter

Bengt Richter

Nov 23, 2003, 12:14:34 PM
On 23 Nov 2003 16:33:37 GMT, bo...@oz.net (Bengt Richter) wrote:
[...]

>to code explicitly in any case. E.g., is it obvious that getattr(os, 'RD_ONLY')
>should be done at call time or optimized away to def time in os.open('foo.txt', os.RD_ONLY) ?
I meant during os.open('foo.txt') assuming def open(...) had a default mode expressed as an attribute
expression. But that's bad as a real example. So please assume a customized file opener that uses
os.open and has a default mode parameter os.RD_ONLY ;-)

>I don't think you can optimize it away without telling the compiler one way or another,
>or changing the dynamic nature of the language.
>
>In any case it would be a semantic change, and I'd hate to have the job of finding breakage ;-)
>

I'll leave that as is ;-)

Regards,
Bengt Richter

Paul Rubin

Nov 23, 2003, 8:21:23 PM
bo...@oz.net (Bengt Richter) writes:
> >Common Lisp is the most obvious one.

> You are referring to initforms, I assume. I wonder how often macros
> are used IRL to defeat the deferral and plug in pre-computed static
> default values that the compiler can't infer its way to at compile time?

I can't think of any times I ever did that, but I've never been a real
hardcore CL hacker.

> But the trouble is that the compiler can't guess what you _mean_,
> except for the obvious cases of bare names and constant literals, so
> otherwise you have to code explicitly in any case. E.g., is it
> obvious that getattr(os, 'RD_ONLY') should be done at call time or
> optimized away to def time in os.open('foo.txt', os.RD_ONLY) ? I
> don't think you can optimize it away without telling the compiler
> one way or another, or changing the dynamic nature of the language.

I think some of that dynamicness should be toned down as Python matures.

> In any case it would be a semantic change, and I'd hate to have the
> job of finding breakage ;-)

Better to do it sooner then, so that there's less stuff to break ;-).

Jay O'Connor

Nov 23, 2003, 9:25:26 PM
Paul Rubin wrote:

>bo...@oz.net (Bengt Richter) writes:
>
>
>> changing the dynamic nature of the language.
>>
>>
>
>I think some of that dynamicness should be toned down as Python matures.
>

One aspect of Python's dynamic nature that has always intrigued me is
how it handles instance variables. Python allows you to dynamically
add instance variables as you need, which is pretty cool, but seems to
require a lookup of every instance variable reference, which can be
pretty slow.

Consider the following test case, using Python 2.3 on Win95 in IDLE:
=====================
class A:
    def __init__(self, val=1):
        self.value= val

    def setval (self, v):
        self.value=v

    def getval (self):
        return self.value

    def test(self):
        for x in range (1,1000000):
            self.value = 1
            y = self.value

import time
a = A()
print "start"
t1 = time.time()
a.test()
t2 = time.time()
print t2 - t1
print "done"
=====================

This routinely returns values of over 6.3 seconds.

Switching the implementation of test() to use the accessor methods kicked
the times up to over 20 seconds.
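
For anyone who wants to repeat the comparison on a current interpreter, here is a Python 3 sketch using `timeit`. The split into `test_direct` and `test_accessors` and the loop count are choices made for this example; the timings will differ from the 2003 figures and no particular ratio is implied.

```python
import timeit

class A:
    def __init__(self, val=1):
        self.value = val

    def setval(self, v):
        self.value = v

    def getval(self):
        return self.value

    def test_direct(self, n):
        # Direct attribute access, as in the original test().
        for _ in range(n):
            self.value = 1
            y = self.value

    def test_accessors(self, n):
        # Same loop, routed through the accessor methods.
        for _ in range(n):
            self.setval(1)
            y = self.getval()

a = A()
n = 100_000  # smaller than the post's loop, to keep the run short
direct = timeit.timeit(lambda: a.test_direct(n), number=1)
accessors = timeit.timeit(lambda: a.test_accessors(n), number=1)
print("direct:    %.4f s" % direct)
print("accessors: %.4f s" % accessors)
```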

By contrast, the equivalent Smalltalk* code (VisualWorks 7.1) on the same
machine gave me consistent results of a little over 600 milliseconds
when using direct variable access and roughly 1.5 seconds when using
accessors.

I think a portion of the difference is that in Smalltalk, instance
variables are specified in the class metadata (in terms of ordering)
when the class is compiled, and thus an instance variable reference is
just a pointer offset from the start of the object data in memory. The
compiler can therefore optimize instance variable references by just
compiling the offset into the method code.

This is a case where Python is more dynamic than Smalltalk in that
Python allows easy addition of instance variables to objects, but it
comes at a price in terms of performance. The Smalltalk solution to
adding instance variables dynamically is just to carry around a
dictionary (similar to Python) but most experienced Smalltalkers know
that "instance variables are much faster than dictionary lookups so
consider if this is really the right design and consider the
performance/flexibility tradeoff"
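
Python does offer an opt-in version of this trade-off: a class may declare `__slots__`, which fixes the set of instance attributes and drops the per-instance dictionary. A Python 3 sketch with an illustrative class `Point`; whether this is actually faster for a given workload is not something the example measures.

```python
class Point:
    __slots__ = ("x", "y")   # fixed attribute layout, no __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
print(hasattr(p, "__dict__"))  # False

try:
    p.z = 3                    # not declared in __slots__
except AttributeError as exc:
    print("rejected:", exc)
```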

This is one case where I think Python's flexibility hurts it in the long
run and perhaps as it goes forward, it will adapt to a more rigid style
that will provide for better performance without too much tradeoff.
After all, Python is still very young, Smalltalk has a twenty year head
start on it.


*Smalltalk code
======================

A>>#test

| y |

10000000 timesRepeat: [
self value: 1.
y := self value
]
-----------
| a |

a := A new.
Time millisecondsToRun: [a test].
======================

