
Performance of list comprehensions vs. map


Chris Barker

Sep 5, 2001, 2:12:44 PM
Hi all,

I just took another look at:

http://musi-cal.mojam.com/~skip/python/fastpython.html

specifically, the loops section:

"""
Loops

Python supports a couple of looping constructs. The for statement is
most commonly used. It loops over the elements of a sequence, assigning
each to the loop
variable. If the body of your loop is simple, the interpreter overhead
of the for loop itself can be a substantial amount of the overhead. This
is where the map
function is handy. You can think of map as a for moved into C code. The
only restriction is that the "loop body" of map must be a function call.

Here's a straightforward example. Instead of looping over a list of
words and converting them to upper case:

newlist = []
for word in list:
    newlist.append(word.upper())

you can use map to push the loop from the interpreter into compiled C
code:

import string
newlist = map(string.upper, list)

List comprehensions were added to Python in version 2.0 as well. They
provide a syntactically more compact way of writing the above for loop:

newlist = [s.upper() for s in list]

It's not really any faster than the for loop version, however.
"""

my question is: Why not? I like list comprehensions a lot, and they
would seem to offer a way to get optimization over a for loop, much like
map does, but apparently not. If map can do the loop in C, why can't a
list comprehension?

In general, I like list comprehensions because they make Python a more
"sequence oriented" language: When programming, it is very common to
store a sequence of data of some type, and more often than not, do
something with that whole sequence, rather than individual elements of
that sequence. All that looping just to do the same thing to all the
elements really clutters up the logic of the code. map and list
comprehensions really help (as will element-wise operators, if we ever
get them). However, it seems to me that sequence-oriented structures
could really provide a performance benefit as well, but Python has not
gotten there yet. I think a big stumbling block is that Python has no
concept of the "homogeneous" sequence. There are modules that provide
such a thing (Numeric, array), and indeed, the string type is a
homogeneous sequence, but Python itself does not understand this concept,
so that if a map or list comprehension is using a homogeneous list, it
still has to check the type of every element as it goes through the
list. For example:

[2*a for a in list]

If the interpreter could just check once what the type of all the
elements of list were, it would only have to figure out what 2*a meant
once, and it could whip through the list very quickly. Indeed, if you
do:

[2*a for a in a_string]

The string is a homogeneous sequence by definition (once you check that
it is a string), but the interpreter has no understanding of this
concept.

NOTE: I fully understand that adding a concept like this allows for a
whole set of optimizations that COULD be done, but someone would have to
get around to doing it, which would be a whole lot of work, and might
never happen. That's not a reason not to do it. First of all, the
framework has to be in place for this kind of optimization before anyone
can even start writing the code, and even if there are no optimizations
ever built into the interpreter, the concept of a homogeneous sequence
would be very useful for extension writers. A number of times I have
written an extension that needed a list in which all the elements were
the same type. I had to put a type check inside my loop, which clutters
up the code, and slows things down some. If I'm working with numbers, I
use NumPy, but that's not always possible.

As for what might get added, I'm imagining that it should be possible to
automatically maintain a "homogeneous" flag on a sequence: for an
immutable sequence it would get set when it was built. For a mutable
sequence, each time an element is added, a type check could be done, and
the homogeneous flag turned off if the type didn't match (turning it
back on when that item was removed would probably not be worth it).
Ideally, there would be a way for the programmer to define what was
meant by homogeneous (i.e. a NumPy array of doubles, or any NumPy array).
This would be trickier, but when in doubt, the homogeneous flag would
just be turned off, and we'd be back where we started.

Even if no changes were made to the list type, the concept could be put
in place, and then existing homogeneous sequences could be used more
optimally.
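
(For illustration only: a rough Python-level sketch of the flag idea above. It
assumes Python 2.2, then in alpha, where built-in types can be subclassed; the
class name and attribute are invented, and the real proposal would live in the
C-level list type.)

class HomogeneousList(list):
    # Toy sketch of the "homogeneous flag": set it when the list is built,
    # clear it on a mismatching append, and never bother re-checking.
    def __init__(self, items=()):
        list.__init__(self, items)
        self.homogeneous = 1
        for item in self:
            if type(item) is not type(self[0]):
                self.homogeneous = 0
                break
    def append(self, item):
        if len(self) and type(item) is not type(self[0]):
            self.homogeneous = 0
        list.append(self, item)

A map() or list comprehension implemented in C could then consult the flag
once instead of type-checking every element.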

I've proposed similar ideas in the past, and not gotten much response. I
imagine my ideas are full of holes, but no one has taken the time to
point them out to me yet. I have come to the conclusion that the really
sharp minds on this list (and the folks that might be qualified to
actually implement such a thing) are not really interested in something
like this that is a performance-only improvement. Am I right? Or is the
idea just so stupid that it's not worth commenting on?

-Chris


--
Christopher Barker, Ph.D.
ChrisH...@home.net
http://members.home.net/barkerlohmann
Oil Spill Modeling
Water Resources Engineering
Coastal and Fluvial Hydrodynamics

Paul Winkler

Sep 5, 2001, 2:50:17 PM
On Wed, 05 Sep 2001 11:12:44 -0700, Chris Barker <chrish...@home.net> wrote:
>Hi all,
>
>I just took another look at:
>
>http://musi-cal.mojam.com/~skip/python/fastpython.html
>
>specifically, the loops section:
>
>"""
>Loops

(SNIP)


>List comprehensions were added to Python in version 2.0 as well. They
>provide a syntactically more compact way of writing the above for loop:
>
>newlist = [s.upper() for s in list]
>
>It's not really any faster than the for loop version, however.
>"""

Hmm... I just played around a bit more and it is in fact a bit faster,
but not much; as the size of the list grows, the time to do a list
comprehension vs. the time to do a for loop seems to converge, while
the time to do it via map is always fastest, sometimes by a lot.

>my question is: Why not? I like list comprehensions a lot, and they
>would seem to offer a way to get optimization over a for loop, much like
>map does, but apparently not. If map can do the loop in C, why can't a
>list comprehension?

That's what I want to know, too!

(SNIP description of homogenous lists)

>I've proposed similar ideas in the past, and not gotten much response. I
>imagine my ideas are full of holes, but no one has taken the time to
>point them out to me yet. I have come to the conclusion that the really
>sharp minds on this list (and the folks that might be qualified to
>actually implement such a thing) are not really interested in something
>like this that is a performance-only improvement. Am I right? Or is the
>idea just so stupid that it's not worth commenting on?

Well, I don't think I qualify as among the really sharp minds here,
but I think it's a great idea.

--PW

Skip Montanaro

Sep 5, 2001, 3:21:27 PM
to Chris Barker, pytho...@python.org

Chris> I just took another look at:

Chris> http://musi-cal.mojam.com/~skip/python/fastpython.html

Chris> specifically, the loops section:

fp.html> List comprehensions were added to Python in version 2.0 as
fp.html> well. They provide a syntactically more compact way of writing
fp.html> the above for loop:

fp.html> newlist = [s.upper() for s in list]

fp.html> It's not really any faster than the for loop version, however.

Chris> my question is: Why not?

map() is a single function that doesn't return control to the interpreter
until its work is done. List comprehensions are essentially just syntactic
sugar for the equivalent for loop. Both the for loop and list comprehension
are implemented as loops using multiple virtual machine instructions.
Perhaps the easiest way to see this is to write three equivalent functions
and disassemble them:

def map_str(n):
    return map(str, range(n))

def lc_str(n):
    return [str(i) for i in range(n)]

def for_str(n):
    l = []
    append = l.append
    for i in range(n):
        append(str(i))
    return l

assert map_str(5) == lc_str(5) == for_str(5)

import dis
print "disassembling map_str"
dis.dis(map_str)
print "disassembling lc_str"
dis.dis(lc_str)
print "disassembling for_str"
dis.dis(for_str)
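
(Not part of the original post: a quick way to actually time the three
functions above, using time.clock() as elsewhere in this thread; exact
numbers will vary by machine and Python version.)

import time

def time_one(f, n, repeat=10):
    start = time.clock()
    for i in range(repeat):
        f(n)
    return (time.clock() - start) / repeat

for f in (map_str, lc_str, for_str):
    print "%-8s %.4f seconds" % (f.__name__, time_one(f, 100000))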

Chris> I like list comprehensions a lot, and they would seem to offer a
Chris> way to get optimization over a for loop, much like map does, but
Chris> apparently not. If map can do the loop in C, why can't a list
Chris> comprehension?

In theory, an optimizer could identify and replace some list comprehensions
with calls to map, but they wouldn't be quite equivalent. In the examples
above, there is no guarantee that the definition of the str function doesn't
change during the execution of either the for loop or the list comprehension
(think multiple threads of execution ;-). The call to map, on the other
hand, binds str once before map is called.

Chris> I think a big stumbling block is that Python has no concept of
Chris> the "homogeneous" sequence. There are modules that provide such a
Chris> thing (Numeric, array), and indeed, the string type is a
Chris> homogeneous sequence, but Python itself does not understand this
Chris> concept, so that if a map or list comprehension is using a
Chris> homogeneous list, it still has to check the type of every element
Chris> as it goes through the list. For example:

Chris> [2*a for a in list]

Chris> If the interpreter could just check once what the type of all the
Chris> elements of list were, it would only have to figure out what 2*a
Chris> meant once, and it could whip through the list very
Chris> quickly. Indeed, if you do:

Chris> [2*a for a in a_string]

But there is no guarantee that the list you are operating on isn't modified
by another thread during execution.

I've been thinking of this a little in the context of Ken Simpson's recently
announced PyInline module. The biggest reason I can't use it right now
(ignoring its pre-alpha version number ;-) in my own code is precisely that
I operate on dicts, lists and tuples a lot. I'm sure that's true for other
Python programmers. The problem of easily writing C code that can operate
on lists (for example) boils down to deciding how you will map Python's
lists to something that is efficiently representable in C. Perl's inline
module does this with something called typemaps. (Look in something like
/usr/local/lib/perl5/5.6.1/ExtUtils for a file called "typemap".) I don't
really understand everything that's going on in that file, but it does
provide mappings. For instance, a "char **" argument followed by an
"unsigned int" arg might map to and from a list of strings and a length, an
"int *" followed by "unsigned int" might map to and from a list of ints,
etc. I'm not sure quite how you'd do dicts (or if you could in a reasonable
fashion without listifying them somehow).

I suspect this is about as close as Python will ever come to understanding
homogeneous containers at the language level.

Chris> I've proposed similar ideas in the past, and not gotten much
Chris> response. I imagine my ideas are full of holes, but no one has
Chris> taken the time to point them out to me yet. I have come to the
Chris> conclusion that the really sharp minds on this list (and the
Chris> folks that might be qualified to actually implement such a thing)
Chris> are not really interested in something like this that is a
Chris> performance-only improvement. Am I right? Or is the idea just so
Chris> stupid that it's not worth commenting on?

It's not a matter of lack of interest as much as a desire to keep the
semantics of Python and lack of time. Here are some efforts I'm aware of.
Note that there are lots of ways performance can be addressed, not just by
constraining types.

* Rattlesnake (my thing) - a new register-oriented virtual machine whose
premise is that the current stack-oriented virtual machine does far
too much data shuffling

* Psyco (Armin Rigo) - a run-time specializing virtual machine that sees
what sorts of types are input to a function and compiles a type- or
value-specific version of that function on-the-fly. I believe Armin
is looking at some JIT code generators in addition to or instead of
another virtual machine.

* Static typing SIG (http://www.python.org/sigs/types-sig/) - I'm not
familiar with this SIG's work, so you'll have to browse the archives.

* Python2C (Bill Tutt, Greg Stein) - A "module compiler" that generates
a C version of a Python module. http://www.mudlib.org/~rassilon/p2c/

* PEP 266 - A "pie in the sky" PEP I wrote that proposes that global
data could be accessed like local data if you reverse the roles of who
keeps references up-to-date.

* Various peephole optimizers - I wrote one a few years ago. There's
also Michael Hudson's bytecodehacks project (on Sourceforge). The
xfreeze bytecode freezer (part of the standard distribution?) also
does some peepholing.

I'm sure there are several I'm missing.

--
Skip Montanaro (sk...@pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/

Ype Kingma

Sep 5, 2001, 5:29:45 PM

Chris,

For larger lists you might find that using append in the loop takes
quadratic time and that the list comprehension allows the new list to
be constructed in linear time.

> my question is: Why not? I like list comprehensions a lot, and they
> would seem to offer a way to get optimization over a for loop, much like
> map does, but apparently not. If map can do the loop in C, why can't a
> list comprehension?

Pass.

This might work for sequences containing standard types. Instances of user
classes, however, may be changed while executing the loop over the sequence.
That means that the method that would be determined once to be executed
on all elements would not be the correct one for all elements accessed in
the loop.

Have fun,
Ype

Marcin 'Qrczak' Kowalczyk

Sep 5, 2001, 5:44:45 PM
Wed, 05 Sep 2001 11:12:44 -0700, Chris Barker <chrish...@home.net> writes:

> I think a big stumbling block is that Python has no concept of the
> "homogenous" sequence.

Do you mean a smart Python compiler/interpreter which tries to optimize
sequence operations in existing programs, on the assumption that many
will be homogeneous?

Or do you mean library support for explicit handling of homogeneous
sequences?


In the first case I think it's very hard, nearly impossible, because
of the mutable nature of Python. For example even if an implementation
knew that all elements of a list are the same class, mapping 2*x+1
through the list without method lookup on each element would need to
ensure that:

- The * operation yields the same class for all arguments, so +
doesn't need to be looked up in different classes.

- * and + aren't redefined in the class during the loop.


In the second case in alpha versions of Python you can already try
to map seq[0].__class__.__add__ instead of +, but I don't know if it
would be faster. It would check the type instead of method lookup.
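
(A small illustration of Marcin's second suggestion, assuming Python 2.2 or
later, where the built-in types expose their methods as unbound callables;
whether it is actually faster would need measuring.)

nums = [1, 2, 3, 4]

# look the method up once, through the type of the first element...
mul = nums[0].__class__.__mul__

# ...then map it, instead of evaluating 2*x (and its method lookup)
# separately for every element:
doubled = map(mul, nums, [2] * len(nums))
print doubled    # [2, 4, 6, 8]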

--
__("< Marcin Kowalczyk * qrc...@knm.org.pl http://qrczak.ids.net.pl/
\__/
^^ SYGNATURA ZASTĘPCZA
QRCZAK

Chris Barker

Sep 5, 2001, 8:16:34 PM
Skip Montanaro wrote:

> map() is a single function that doesn't return control to the interpreter
> until its work is done. List comprehensions are essentially just syntactic
> sugar for the equivalent for loop.

This is exactly my question: why is it implemented this way?

Because it was easy: A fine answer

or

Because it is important for the design that it work exactly the
same way: which I don't understand at all.

> In theory, an optimizer could identify and replace some list comprehensions
> with calls to map,

This is what I have in mind.

> but they wouldn't be quite equivalent. In the examples
> above, there is no guarantee that the definition of the str function doesn't
> change during the execution of either the for loop or the list comprehension
> (think multiple threads of execution ;-). The call to map, on the other
> hand, binds str once before map is called.

Again, is this a design decision? That the expression in a list
comprehension should be able to be changed mid-comprehension? In another
thread to boot? This sounds like the worst of confusing and
unpredictable behaviour to me!!!

If it's OK from a design standpoint to require that the function in a
map is bound once, it seems pretty reasonable to have the same
requirement for a list comprehension. In fact, it seems like a darn good
idea! The fact that the current implementation doesn't enforce it is a
pretty poor design justification...or am I missing something?


> But there is no guarantee that the list you are operating on isn't modified
> by another thread during execution.

but there should be !!



> It's not a matter of lack of interest as much as a desire to keep the
> semantics of Python and lack of time.

The lack of time I can certainly understand. As for the semantics, I
think my proposal would make very little difference to the semantics,
except, perhaps, for allowing lists to be altered in the middle of a
list comprehension...which is pretty scary to me anyway. In fact, it
seems there has been recent discussion about removing some of the
extreme dynamism from Python, in the form of being able to re-bind
the __dict__ of a class, when...OK I'll admit it, I couldn't follow the
discussion! Even if you did want to continue to allow this kind of
extreme dynamism, you could simply allow people to turn off the
homogeneous flag for a sequence: bingo! You can change it in another
thread to your heart's content!

My point is that I imagine most sequences used in Python are, in fact,
homogeneous, but there is no way to know this, and it could make a huge
difference in performance. Look at NumPy. I simply could not use Python
without NumPy.

> Here are some efforts I'm aware of.
> Note that there are lots of ways performance can be addressed, not just by
> constraining types.

Certainly. And when any kind of type constraining is suggested, what a
lot of people say is that what we REALLY need is an all-powerful type
inferring engine...talk about time to implement issues! I think one
problem Python development does suffer from is that nothing gets done
that isn't reasonably doable with a fairly small amount of work, but
some things are simply too big to get done: like a type inferring engine.

> * Rattlesnake (my thing) - a new register-oriented virtual machine whose
> premise is that the current stack-oriented virtual machine does far
> too much data shuffling

sounds cool



> * Psyco (Armin Rigo) - a run-time specializing virtual machine that sees
> what sorts of types are input to a function and compiles a type- or
> value-specific version of that function on-the-fly. I believe Armin
> is looking at some JIT code generators in addition to or instead of
> another virtual machine.

Wouldn't homogeneous sequences be a big help for this?


> * Static typing SIG (http://www.python.org/sigs/types-sig/) - I'm not
> familiar with this SIGs work, so you'll have to browse the archives.

This is a pretty quiet SIG these days. I know a while ago Guido said
that there would probably be some kind of static typing in Py3k, but
mostly for compile-time type checking, not performance. I haven't heard
much since then, and it looks to be further off. Personally, I think a
few static type declarations could help Py2C and the like a lot.

> * Python2C (Bill Tutt, Greg Stein) - A "module compiler" that generates
> a C version of a Python module. http://www.mudlib.org/~rassilon/p2c/

I've been keeping my eye on this one, but it seems to be sleeping at the
moment... someone please correct me if I'm wrong. Homogeneous sequences
would be very handy here as well.

> * Various peephole optimizers - I wrote one a few years ago. There's
> also Michael Hudson's bytecodehacks project (on Sourceforge). The
> xfreeze bytecode freezer (part of the standard distribution?) also
> does some peepholing.

Hmm, I have no idea what peepholing is.. I'll have to check that out
when I get the time.

Skip Montanaro

Sep 5, 2001, 9:41:47 PM
to Chris Barker, pytho...@python.org

Chris> Skip Montanaro wrote:
>> map() is a single function that doesn't return control to the
>> interpreter until its work is done. List comprehensions are
>> essentially just syntactic sugar for the equivalent for loop.

Chris> This is exactly my question: why is it implemented this way?

Chris> Because it was easy: A fine answer

Chris> or

Chris> Because it is important for the design that it work exactly the
Chris> same way: which I don't understand at all.

It wasn't important for the design to work exactly like for loops, but given
Python's semantics it is hard to see how to have done it some other way.
Also, it was certainly the easiest route.

>> In theory, an optimizer could identify and replace some list
>> comprehensions with calls to map,

Chris> This is what I have in mind.

>> but they wouldn't be quite equivalent. In the examples above, there
>> is no guarantee that the definition of the str function doesn't
>> change during the execution of either the for loop or the list
>> comprehension (think multiple threads of execution ;-). The call to
>> map, on the other hand, binds str once before map is called.

Chris> Again, is this a design decision? That the expression in a list
Chris> comprehension should be able to be changed mid-comprehension? In
Chris> another thread to boot? This sounds like the worst of confusing
Chris> and unpredictable behaviour to me!!!

No, this is Python's principle of least astonishment at work. Python's
language is dynamic all over the place. Placing arbitrary restrictions on
those semantics because they happen to occur between "[" and "]" would have
been wrong.

Chris> If it's OK from a design standpoint to require that the function
Chris> in a map is bound once, it seems pretty reasonable to have the
Chris> same requirement for a list comprehension. In fact, it seems like
Chris> a darn good idea! The fact that the current implementation
Chris> doesn't enforce it is a pretty poor design justification...or am
Chris> I missing something?

map() is like any other function written in C. It does its thing, then
returns. It would be impossible for map() to check back to see if the name
"str" was rebound during its execution. All it knows is that it is passed a
callable object. It has no idea what that object was bound to in the
calling scope. It calls it once for each element of its various argument
lists. That would be true if map() was written in Python as well.

>> But there is no guarantee that the list you are operating on isn't
>> modified by another thread during execution.

Chris> but there should be !!

Why here and not elsewhere in the language? I'm afraid you're tilting at
windmills.



>> It's not a matter of lack of interest as much as a desire to keep the
>> semantics of Python and lack of time.

Chris> My point is that I imagine most sequences used in Python are, in
Chris> fact, homogeneous, but there is no way to know this, and it could
Chris> make a huge difference in performance. Look at NumPy. I simply
Chris> could not use Python without NumPy.

You're more than welcome to write a PEP. One approach might be to propose
that the array object be "brought into the fold" as a builtin object and
modifications be made to the virtual machine so that homogeneity can be
assumed when operating on arrays.

>> Here are some efforts I'm aware of. Note that there are lots of ways
>> performance can be addressed, not just by constraining types.

Chris> Certainly. And when any kind of type constraining is suggested,
Chris> what a lot of people say is that what we REALLY need is an
Chris> all-powerful type inferring engine...talk about time to implement
Chris> issues! I think one problem Python development does suffer from is
Chris> that nothing gets done that isn't reasonably doable with a fairly
Chris> small amount of work, but some things are simply too big to get
Chris> done: like a type inferring engine.

I don't think you need any sophisticated type inferencing engine. That's
one way to do things, but not the only way.

>> * Psyco (Armin Rigo) - a run-time specializing virtual machine that
>> sees what sorts of types are input to a function and compiles a
>> type- or value-specific version of that function on-the-fly. I
>> believe Armin is looking at some JIT code generators in addition to
>> or instead of another virtual machine.

Chris> Wouldn't homogeneous sequences be a big help for this?

Yes, they could be, but it wouldn't be necessary to get some useful
speedups. The prototype I saw compiled a specialized version of a function
based on its input types. For instance, you might have:

def f(a,b):
    return (a+b)**2

Psyco (as I understand it) would compile a version of f that assumed that a
and b were ints, assuming it saw at runtime that f was being called with two
ints. After that, it might redirect the execution to the special version of
f if a two-int call was made, but revert to the old version if not.

It can go even further though, at least in theory. Suppose Psyco observed
that f was often called with a == 1 and b == 4. In that case it might build
a specialized version of f that simply returns 25.

Psyco could also presumably take advantage of homogeneity. It would just be
a "simple matter of programming".
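
(This is not how Psyco itself works; just a hand-rolled Python illustration of
the payoff Skip describes, using memoization keyed on the argument values.)

def f(a, b):
    return (a + b) ** 2

_seen = {}
def f_cached(a, b):
    # crude stand-in for a specialized version: remember results for
    # argument combinations already observed (e.g. a == 1, b == 4)
    try:
        return _seen[(a, b)]
    except KeyError:
        result = _seen[(a, b)] = f(a, b)
        return result

print f_cached(1, 4)    # computed: 25
print f_cached(1, 4)    # returned from the cache, no recomputation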



>> * Static typing SIG (http://www.python.org/sigs/types-sig/) - I'm not
>> familiar with this SIGs work, so you'll have to browse the archives.

Chris> This is a pretty quiet SIG these days. I know a while ago Guido
Chris> said that there would probably be some kind of static typing in
Chris> Py3k, but mostly for compile-time type checking, not
Chris> performance. I haven't heard much since then, and it looks to be
Chris> further off. Personally, I think a few static type
Chris> declarations could help Py2C and the like a lot.

Sure, but then it wouldn't be Python. ;-)

Of course the presence of types in the language wouldn't preclude using that
information to improve performance.



>> * Python2C (Bill Tutt, Greg Stein) - A "module compiler" that generates
>> a C version of a Python module. http://www.mudlib.org/~rassilon/p2c/

Chris> I've been keeping my eye on this one, but it seems to be sleeping
Chris> at the moment... someone please correct me if I'm wrong.
Chris> Homogeneous sequences would be very handy here as well.

Yes, I believe it has been fairly stable for awhile. I doubt that Bill or
Greg would turn away patches that improve it. ;-)

>> * Various peephole optimizers - I wrote one a few years ago. There's
>> also Michael Hudson's bytecodehacks project (on Sourceforge). The
>> xfreeze bytecode freezer (part of the standard distribution?) also
>> does some peepholing.

Chris> Hmm, I have no idea what peepholing is.. I'll have to check that
Chris> out when I get the time.

Basically, you identify patterns in the instruction sequence and replace
them with more efficient instruction sequences. Here's a paper I wrote for
SPAM-7:

http://musi-cal.mojam.com/~skip/python/spam7/optimizer.html

I updated as many of the links in the paper as I could do easily. Some are
still broken though.
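
(For readers who, like Chris, haven't met the term: a toy constant-folding
pass over a made-up instruction list. Real peephole optimizers, such as the
one described in the paper above, work on actual CPython bytecode.)

def peephole(instructions):
    optimized = []
    i = 0
    while i < len(instructions):
        # pattern: LOAD_CONST x; LOAD_CONST y; BINARY_ADD  ->  LOAD_CONST x+y
        if (i + 2 < len(instructions)
                and instructions[i][0] == 'LOAD_CONST'
                and instructions[i + 1][0] == 'LOAD_CONST'
                and instructions[i + 2] == ('BINARY_ADD',)):
            optimized.append(('LOAD_CONST',
                              instructions[i][1] + instructions[i + 1][1]))
            i += 3
        else:
            optimized.append(instructions[i])
            i += 1
    return optimized

print peephole([('LOAD_CONST', 2), ('LOAD_CONST', 3), ('BINARY_ADD',),
                ('STORE_FAST', 'x')])
# -> [('LOAD_CONST', 5), ('STORE_FAST', 'x')]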

Tim Peters

Sep 6, 2001, 1:12:55 AM
to pytho...@python.org
[Chris Barker]
> ...

> I've proposed similar ideas in the past, and not gotten much response. I
> imagine my ideas are full of holes, but no one has taken the time to
> point them out to me yet. I have come to the conclusion that the really
> sharp minds on this list (and the folks that might be qualified to
> actually implement such a thing) are not really interested in something
> like this that is a performance-only improvement. Am I right? Or is the
> idea just so stupid that it's not worth commenting on?

None of the above, although I'm not sure that's good news <wink>. We're
really not lacking for ideas, what we need is worker bees.

If you have time, here are two achievable (and ancient) ideas for
optimization that would benefit all programs:

1. Remove the need for SET_LINENO opcodes. They're not needed for line
numbers in tracebacks (tracebacks work fine under -O already, which
suppresses SET_LINENO). They're only used for line-by-line tracing
(as, e.g., used to implement debugger breakpoints), and that's a
very rare need.

2. Move PEP 267 ("Optimized Access to Module Namespaces") closer to
reality. More work than #1, but more potential gain too.

That these still haven't been done is simply evidence of how little work
gets volunteered in these areas.

everyone-has-opinions-but-opinions-aren't-executable-ly y'rs - tim


Alex Martelli

Sep 6, 2001, 7:38:07 AM
"Paul Winkler" <slin...@yahoo.com> wrote in message
news:slrn9pct1n....@roaddog.armsnet...
...

> >List comprehensions were added to Python in version 2.0 as well. They
> >provide a syntactically more compact way of writing the above for loop:
> >
> >newlist = [s.upper() for s in list]
> >
> >It's not really any faster than the for loop version, however.
> >"""
>
> Hmm... I just played around a bit more and it is in fact a bit faster,
> but not much; as the size of the list grows, the time to do a list
> comprehension vs. the time to do a for loop seems to converge, while
> the time to do it via map is always fastest, sometimes by a lot.

I think one key issue is: map requires its sequence arguments to
support len(): it pre-allocates the needed list to the longest
len() out of those of its arguments (it can lengthen it later if
the len() were lying, I believe, but it gets its efficiency from
the fact it doesn't normally need to). For loops, and also list
comprehensions, do not require len() support -- they build the
resulting list one append at a time. This makes them more
flexible, but there's a price to pay for this. Plus, of course,
map's loop is a tight C one, and the others are looping in
interpreted Python bytecode.

The effects appear to be of comparable magnitude -- for ex,
consider the little test program pertes.py:

class Evens:
    def __init__(self, n, claim=None):
        self.n = n
        self.len = claim
        if self.len is None: self.len = n
    def __len__(self):
        return self.len
    def __getitem__(self, index):
        if index>=self.n: raise IndexError, index
        return index*2

def funz(x): return 23*x+17

def list_comp(n, claim=None, f=funz):
    seq = Evens(n, claim)
    return [f(x) for x in seq]

def for_loop(n, claim=None, f=funz):
    seq = Evens(n, claim)
    result = []
    for x in seq: result.append(f(x))
    return result

def use_map(n, claim=None, f=funz):
    seq = Evens(n, claim)
    return map(f, seq)

def pre_alloc(n, claim=None, f=funz):
    seq = Evens(n, claim)
    result = n*[0]
    i = 0
    for x in seq: result[i]=f(x); i+=1
    return result

import time

if __name__=='__main__':
    for f in list_comp,for_loop,use_map,pre_alloc:
        print "%s: "%f.__name__,
        n = 100*1000
        start = time.clock()
        result = f(n, n)
        stend = time.clock()
        print "%.2f"%(stend-start),
        start = time.clock()
        result = f(n, 1)
        stend = time.clock()
        print "%.2f"%(stend-start),
        print


On my old box, pretty repeatably:

D:\>python pertes.py
list_comp: 2.68 2.63
for_loop: 2.88 2.86
use_map: 1.49 2.05
pre_alloc: 1.79 1.77

D:\>python pertes.py
list_comp: 2.67 2.63
for_loop: 2.88 2.84
use_map: 1.49 2.05
pre_alloc: 1.79 1.77

I.e., pre-allocation (looping in Python) can't beat
map (looping in C) when the latter also pre-allocs,
but when map's own pre-allocation is defeated (as
the sequence is instructed to lie about its length)
then pre_alloc wins. map is the only one of these
four solutions for which the correctness of seq's
length matters -- and it matters to the tune of
30% or thereabouts. If a list comprehension was
also able to pre-allocate, it might be sped up by
a similar amount -- not all the way to map's speeds
in this case, but maybe to below 2 seconds in this
example. That would require special-casing, with
comparison to the currently produced bytecode:

D:\>python -O
Python 2.1.1c1 (#19, Jul 13 2001, 00:25:06) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
Alternative ReadLine 1.4 -- Copyright 2001, Chris Gonnerman
>>> import dis
>>> import pertes
>>> dis.dis(pertes.list_comp)
0 LOAD_GLOBAL 0 (Evens)
3 LOAD_FAST 0 (n)
6 LOAD_FAST 1 (claim)
9 CALL_FUNCTION 2
12 STORE_FAST 4 (seq)
15 BUILD_LIST 0
18 DUP_TOP
19 LOAD_ATTR 4 (append)
22 STORE_FAST 5 (_[1])
25 LOAD_FAST 4 (seq)
28 LOAD_CONST 1 (0)
>> 31 FOR_LOOP 22 (to 56)
34 STORE_FAST 3 (x)
37 LOAD_FAST 5 (_[1])
40 LOAD_FAST 2 (f)
43 LOAD_FAST 3 (x)
46 CALL_FUNCTION 1
49 CALL_FUNCTION 1
52 POP_TOP
53 JUMP_ABSOLUTE 31
>> 56 DELETE_FAST 5 (_[1])
59 RETURN_VALUE
60 LOAD_CONST 0 (None)
63 RETURN_VALUE

Basically, in situation such as these where the list
comprehension has just one for, and no if, a call to
len() could be attempted on the sequence object in
the for, and, if that is satisfactory, the list could
be built with the indicated length, and instead of
append, a (currently non-existing) method indicating
"store at this index extending if necessary" could be
used. The index would need to be tracked (FOR_LOOP
has it, internally, but doesn't make it available on
the outside -- another long-standing small wish!-),
AND the resulting list trimmed at the end if the
len() lied by excess.
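
(A rough Python-level sketch of that pre-allocate-and-trim idea; the helper
name is invented, and the real change would be special-cased bytecode rather
than a function call.)

def lc_prealloc(f, seq):
    try:
        n = len(seq)
    except (AttributeError, TypeError):
        return [f(x) for x in seq]    # no len(): fall back to appending
    result = n * [None]
    i = 0
    for x in seq:
        if i < n:
            result[i] = f(x)          # fast path: store by index
        else:
            result.append(f(x))       # len() lied low: extend as needed
        i += 1
    del result[i:]                    # len() lied high: trim the excess
    return result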

Looks like a tad too much trouble to me for a pretty
modest performance extra in a specific case -- the
extra ops to set things up and conditionally trim at
the end might also easily worsen the performance for
very small list comprehensions, I suspect. Still,
those who disagree are of course welcome to try such
a patch and measure its performance effects:-).


Alex

Neil Schemenauer

Sep 6, 2001, 11:17:09 AM
to pytho...@python.org
Alex Martelli wrote:
> I think one key issue is: map requires its sequence arguments to
> support len()

It won't in 2.2. Any object that supports the iteration protocol will
do (thanks to Tim).

Neil

Chris Barker

Sep 6, 2001, 5:33:01 PM
Skip Montanaro wrote:

> Chris> Because it is important for the design that it work exactly the
> Chris> same way: which I don't understand at all.
>
> It wasn't important for the design to work exactly like for loops, but given
> Python's semantics it is hard to see how to have done it some other way.
> Also, it was certainly the easiest route.
>

> Chris> Again, is this a design decision? That the expression in a list
> Chris> comprehension should be able to be changed mid-comprehension? In
> Chris> another thread to boot? This sounds like the worst of confusing
> Chris> and unpredictable behaviour to me!!!
>
> No, this is Python's principle of least astonishment at work. Python's
> language is dynamic all over the place. Placing arbitrary restrictions on
> those semantics because they happen to occur between "[" and "]" would have
> been wrong.

but it's not wrong to put in the same restrictions when map is called?
The list comprehension syntax is new... it's not legal in older versions
of Python. Therefore, you could put any restrictions you wanted in
there, and the restriction that the lists being comprehended don't change
in the middle of the process sounds like a pretty darn good one to me.

> Chris> If it's OK from a design standpoint to require that the function
> Chris> in a map is bound once, it seems pretty reasonable to have the
> Chris> same requirement for a list comprehension. In fact, it seems like
> Chris> a darn good idea! The fact that the current implementation
> Chris> doesn't enforce it is a pretty poor design justification...or am
> Chris> I missing something?
>
> map() is like any other function written in C. It does its thing, then
> returns.

The fact that it is written in C should ideally not determine its
behaviour. Ideally the behaviour of map(), and of list comprehensions,
should be determined by some sensible design criteria, including
performance, of course.

> It would be impossible for map() to check back to see if the name
> "str" was rebound during its execution. All it knows is that it is passed a
> callable object. It has no idea what that object was bound to in the
> calling scope. It calls it once for each element of its various argument
> lists. That would be true if map() was written in Python as well.

All I'm saying is that list comprehensions should behave the same way. I
haven't done any multi-threaded programming, but it sounds like all this
re-binding of objects in the middle of function calls could get really
messy.

> >> But there is no guarantee that the list you are operating on isn't
> >> modified by another thread during execution.
>
> Chris> but there should be !!
>
> Why here and not elsewhere in the language?

I thought you said that map() doesn't allow it?

>I'm afraid you're tilting at windmills.

Perhaps I am. It may be that a highly dynamic language isn't well suited
to multi-threading, but this stuff sounds pretty scary to me... I'll
have to remember to be very careful with any multi-threaded code I
write.

> You're more than welcome to write a PEP.

The reason I haven't is that I am in no position to implement even a
prototype of the idea (both time and skills), and unless someone who is
in that position takes some interest, there really isn't any point. It
also may be a just plain bad idea, so I would like more feedback before
I went that far. Thanks for your input, I have certainly learned from
this discussion.

> One approach might be to propose
> that the array object be "brought into the fold" as a builtin object

This is in PEP 209 (with NumPy arrays, which are much better than the
library array). I'm looking forward to it, I wish I could do more to
help out. Note that PEP 211 and 225 are relevant too. The three together
would do a lot to make Python more "array-oriented" which I think would
be wonderful.

> and
> modifications be made to the virtual machine so that homogeneity can be
> assumed when operating on arrays.

That would be a great next step.... it's not in the PEP yet, maybe I'll
send a note to authors.


> Chris> further off. Personally, I think a few static type
> Chris> declarations could help Py2C and the like a lot.
>
> Sure, but then it wouldn't be Python. ;-)

I'm not so sure about that... by definition any change or enhancement
makes something that is not Python. As long as full dynamic typing is
the default, it would still be Python.

Skip Montanaro

Sep 6, 2001, 6:39:14 PM
to Chris Barker, pytho...@python.org

>> No, this is Python's principle of least astonishment at work.
>> Python's language is dynamic all over the place. Placing arbitrary
>> restrictions on those semantics because they happen to occur between
>> "[" and "]" would have been wrong.

Chris> but it's not wrong to put in the same restrictions when map is
Chris> called? The list comprehension syntax is new... it's not legal in
Chris> older versions of Python. Therefore, you could put any
Chris> restrictions you wanted in there, and the restriction that the
Chris> lists being comprehended don't change in the middle of the process
Chris> sounds like a pretty darn good one to me.

I think you're still missing what I was saying. map() *can't* possibly have
the same semantics as for loops precisely because it doesn't return control
to the interpreter where the name->object lookup takes place. Could list
comprehensions behave that way? Yes, but in my opinion (and apparently in
the opinion of the various powers that be), that would be wrong. If you
don't like all that dynamism, maybe Python's not the language for you. I
like it and am interested in speeding up the language as it exists, not some
other language with Python's syntax but less dynamic semantics.

Chris> If it's OK from a design standpoint to require that the function
Chris> in a map is bound once, it seems pretty reasonable to have the
Chris> same requirement for a list comprehension. In fact, it seems like
Chris> a darn good idea! The fact that the current implementation
Chris> doesn't enforce it is a pretty poor design justification...or am
Chris> I missing something?

>> map() is like any other function written in C. It does its thing,
>> then returns.

Chris> The fact that it is written in C should ideally not determine
Chris> its behaviour. Ideally the behaviour of map(), and of list
Chris> comprehensions, should be determined by some sensible design
Chris> criteria, including performance, of course.

This is not an ideal world. The fact that map() is a function (defined in C
or Python makes no difference) determines that. The mapping from the name
"str" to the function object that performs stringification takes place one
time only, when map()'s input parameters are evaluated. As far as map() is
concerned the name of the function may as well be
"the_moon_is_made_of_green_cheese".

>> >> But there is no guarantee that the list you are operating on isn't
>> >> modified by another thread during execution.
>>
Chris> but there should be !!
>>
>> Why here and not elsewhere in the language?

Chris> I thought you said that map() doesn't allow it?

map() gets a snapshot at the time it's called. The binding of the name
"str" can change during map()'s execution. It just won't notice. For loops
and list comprehensions reevaluate the name-to-object mapping each time
around the loop. Let me see if I can cook up a simple example... Okay,
toss this in a .py file and run it from the command line:

def a(i):
    global c
    c = b
    return "a"

def b(i):
    global c
    c = a
    return "b"

def pymap(f, l):
    newl = []
    for elt in l:
        newl.append(f(elt))
    return newl

n = 5

c = a
print "maploop:", pymap(c, range(n))

c = a
l = []
for i in range(n):
    l.append(c(i))
print "forloop:", l

c = a
print "lcloop:", [c(i) for i in range(n)]

and execute it. You should see

maploop: ['a', 'a', 'a', 'a', 'a']
forloop: ['a', 'b', 'a', 'b', 'a']
lcloop: ['a', 'b', 'a', 'b', 'a']

Note that I don't call the real builtin map(), but a pseudo-map written in
Python. It still doesn't notice the change in binding of the name "c",
because what c referred to at the point it was called was bound to pymap's
formal parameter "f". The same thing happens with the real map() or with
any other function. In the for loop and list comprehension examples, c is
not a parameter, so they look it up each time around the loop.

>> I'm afraid you're tilting at windmills.

Chris> Perhaps I am. It may be that a highly dynamic language isn't well
Chris> suited to multi-threading, but this stuff sounds pretty scary to
Chris> me... I'll have to remember to be very careful with any
Chris> multi-threaded code I write.

Python is as well-suited to multi-threading as any other scripting language
I'm at all familiar with. Actually, this discussion we're having doesn't
really have anything to do with threads at all. I suggested threads as a
way that object bindings could change during the execution of a chunk of
code, but the example above demonstrates it just as well without using
threads.

>> One approach might be to propose that the array object be "brought
>> into the fold" as a builtin object

Chris> This is in PEP 209 (with NumPy arrays, which are much better than
Chris> the library array). I'm looking forward to it, I wish I could do
Chris> more to help out. Note that PEP 211 and 225 are relevant too. The
Chris> three together would do a lot to make Python more
Chris> "array-oriented" which I think would be wonderful.

I'm not familiar with PEP 209 (one can't read everything). In scanning the
abstract it looks like the authors propose a redesign of Numeric, not
incorporation of Numeric into the language. I was suggesting that perhaps
the array object could be elevated to the same status as lists, tuples and
dicts, with appropriate changes made to the virtual machine to take
advantage of arrays' homogeneity. Moving array into the language would
probably be a lot more modest bit of work than moving Numeric into the
language. Of course, if that step is to be taken, the array object could
probably benefit from experience with Numeric.

>> and modifications be made to the virtual machine so that homogeneity
>> can be assumed when operating on arrays.

Chris> That would be a great next step.... it's not in the PEP yet,
Chris> maybe I'll send a note to authors.

Chris> further off. Personally, I think a few static type
Chris> declarations could help Py2C and the like a lot.
>>
>> Sure, but then it wouldn't be Python. ;-)

Chris> I'm not so sure about that... by definition any change or
Chris> enhancement makes something that is not Python. As long as full
Chris> dynamic typing is the default, it would still be Python.

If I wrote a module in Python then compiled it to C, I would sure want it to
behave the same. Otherwise, the compilation process would probably
introduce subtle bugs into the code. I'd hate to have to debug the
generated C code.

Chris Barker

Sep 6, 2001, 8:30:49 PM
to Skip Montanaro
Skip Montanaro wrote:

> I think you're still missing what I was saying. map() *can't* possibly have
> the same semantics as for loops precisely because it doesn't return control
> to the interpreter where the name->object lookup takes place.

I get that now.

> Could list
> comprehensions behave that way? Yes,

OK, at least I have that right.

> but in my opinion (and apparently in
> the opinion of the various powers that be), that would be wrong.

I guess I am having a hard time seeing why that would be wrong. I just
took a look at the PEP, and having been implemented already, it is
pretty sparse, but I see:

"""
Rationale

List comprehensions provide a more concise way to create lists in
situations where map() and filter() and/or nested loops would
currently be used.
"""

except that map() and filter() work differently from nested loops, as we
have been discussing here. Which functionality should be emulated?

and:

"""
- The form [... for x... for y...] nests, with the last index
varying fastest, just like nested for loops.
"""

This seems to be referring only to the order of the indexing.

So the PEP just doesn't answer the question. From the reference manual:

"""
When a list comprehension is supplied, it consists of a single
expression followed by at least one for clause and zero or more for or
if clauses. In this case, the elements of the new list are those that
would be produced by considering each of the for or if clauses a block,
nesting from left to right, and evaluating the expression to produce a
list element each time the innermost block is reached.
"""

This does say that the expression is evaluated "each time", which does
mean the current bindings of the objects in the expression will be used.
Without the PEP details, it's hard for me to know if this was a conscious
design choice, or an artifact of easy implementation.


>If you don't like all that dynamism, maybe Python's not the language for you.

I like most of that dynamism, but maybe not all... I do like Python a
whole lot, however.

Your example did make it clear how that kind of re-binding can be used,
and I can imagine it would be a useful feature (though I sure can't
figure out a use now!). However, you could always use for loops, if you
want that, and it would be more clear. Having the bindings of the
expression in a list comprehension fixed when the expression is
evaluated the first time would provide a lot less potential for
confusion, and also provide optimization options. Having it work EXACTLY
like a nested for loop makes it only syntactic sugar, and therefore less
useful. I do like the sugar however, so I'm still glad it's there, and
it's probably too late to change it anyway.

To use your example, I would like to see the following list
comprehension:

[c(i) for i in range(n)]


behave like:

l = []
fun = lambda i, f=c: f(i)
for i in range(n):
    l.append(fun(i))

instead of:

l = []
for i in range(n):
    l.append(c(i))
print "forloop:", l

i.e. bind the generating expression once at the beginning.


> I like it and am interested in speeding up the language as it exists, not some
> other language with Python's syntax but less dynamic semantics.

Me too, honestly. I am hoping that homogeneous sequences could help
achieve that.



> This is not an ideal world. The fact that map() is a function (defined in C
> or Python makes no difference) determines that.

OK. I was confused a bit there. The issue is that map is a function, and
list comprehensions are not. They are, however, new, and could have
behaved either way.

> >> >> But there is no guarantee that the list you are operating on isn't
> >> >> modified by another thread during execution.
> >>
> Chris> but there should be !!
> >>
> >> Why here and not elsewhere in the language?
>
> Chris> I thought you said that map() doesn't allow it?
>
> map() gets a snapshot at the time it's called. The binding of the name
> "str" can change during map()'s execution.

What about the input list? If I call:

l2 = map(function, l1)

and another thread is altering l1 at the same time, will that affect l2?
This is what is scary to me.

>I suggested threads as a
> way that object bindings could change during the execution of a chunk of
> code, but the example above demonstrates it just as well without using
> threads.

well, it demonstrates re-binding of the generating function, not the
source list being altered... the latter I find a lot more scary, and can't
imagine how it could happen in a single thread.


> I'm not familiar with PEP 209 (one can't read everything). In scanning the
> abstract it looks like the authors propose a redesign of Numeric, not
> incorporation of Numeric into the language.

You're right, but I'm pretty sure making it a part of the standard
library is one of the goals.

> I was suggesting that perhaps
> the array object could be elevated to the same status as lists, tuples and
> dicts, with appropriate changes made to the virtual machine to take
> advantage of arrays' homegeneity.

Yes, that is not proposed. I've posted a note to the NumPy list to see
if anyone thinks this is worthwhile.

> Moving array into the language would
> probably be a lot more modest bit of work than moving Numeric into the
> language.

Yes, but Numeric arrays are SO much more. The multidimensionality, the
ufuncs, the array-oriented math, ... I don't think it would be worth it
without all that. In fact, a lot of performance is gained by using NumPy
arrays, so there would be less need for VM changes to get performance.

> If I wrote a module in Python then compiled it to C, I would sure want it to
> behave the same. Otherwise, the compilation process would probably
> introduce subtle bugs into the code. I'd hate to have to debug the
> generated C code.

Sure. That's why it would need to be part of Python, not just extra
directives to Py2C. (This is not just my twisted, make-Python-static
idea: Guido presented a proposal quite a while ago:
http://www.python.org/~guido/static-typing/. I don't know if he has
changed his mind, but it doesn't seem to be at the top of the list
either.) However, even if it were extra directives, I'm not sure that
would be worse than what you have to do when you write your own C
extension. (Except debugging generated code rather than your own: good
point!)

Well, no one else seems to be paying any attention to our jabbering here,
so I don't think we need to continue.

Thanks for your thoughts, I have learned a lot.

Aahz Maruch

Sep 6, 2001, 10:43:57 PM
In article <3B97EB8D...@home.net>,
Chris Barker <chrish...@home.net> wrote:

>Skip Montanaro wrote:
>>
>> You're more than welcome to write a PEP.
>
>The reason I haven't is that I am in no position to implement even a
>prototype of the idea (both time and skills), and unless someone who is
>in that position takes some interest, there really isn't any point. It
>also may be a just plain bad idea, so I would like more feedback before
>I went that far. Thanks for your input, I have certainly learned from
>this discussion.

You may want to bring this up on python-dev; most of the core developers
(except for Tim) pay little attention to c.l.py these days. Also, the
whole point of a PEP is to formalize the feedback process; you do *not*
need to be in a position to implement it.
--
--- Aahz <*> (Copyright 2001 by aa...@pobox.com)

Hugs and backrubs -- I break Rule 6 http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

"Plus ca change, plus c'est la meme chose."

Alex Martelli

Sep 7, 2001, 4:28:11 AM
"Chris Barker" <chrish...@home.net> wrote in message
news:3B981539...@home.net...
...

> What about the input list? If I call:
>
> l2 = map(function, l1)
>
> and another thread is altering l1 at the same time, will that affect l2?

Definitely! It would be horrible for map to take a snapshot "behind
your back" of l1 -- it's trivially easy for YOU to explicitly make
a temporary copy, just map(function, l1[:]), if you want that, while,
if map violated all normal Pythonic expectations by taking a snapshot,
it would be very hard for you to get back to normal Pythonic behavior.

There doesn't have to be any threading involved for the effect
to surface...:

>>> l1=range(10)
>>> def function(x):
... return x+l1.pop()
...
>>> map(function,l1)
[9, 9, 9, 9, 9]
>>> l1=range(10)
>>> map(function,l1[:])
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

As function modifies global modifiable object l1, it makes a
big difference whether map is using l1 itself or a snapshot
copy thereof. If you want a copy, ask for a copy -- exactly
as you would, e.g., in a for loop whose body may need to
modify the sequence on which the for is looping, you then
wouldn't write:
for x in thelist:
    maychange(x, thelist)

but rather:

for x in thelist[:]:
    maychange(x, thelist)

or, for maximum clarity/explicitness:

for x in copy.copy(thelist):
    maychange(x, thelist)

> This is what is scary to me.

I don't really understand why. It's *normal* to take a
copy/snapshot of a modifiable container in a threaded
environment, so you can release locks fast AND not have
the container change behind your back -- and here you
don't even need the locking since thelist[:] is atomic.
(It's less normal to have a function modify global
state, as it ain't in general good programming practice,
but we all fall from grace from time to time, no?-).
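
(A minimal sketch of the copy-then-work pattern being described; the names are
invented, and since a plain thelist[:] is atomic in CPython, the explicit lock
only matters when there are larger invariants to protect.)

import threading

thelist = range(10)
thelist_lock = threading.Lock()

def snapshot():
    thelist_lock.acquire()
    try:
        return thelist[:]    # copy quickly while holding the lock
    finally:
        thelist_lock.release()

for x in snapshot():         # iterate over the snapshot, lock already released
    pass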


> >I suggested threads as a
> > way that object bindings could change during the execution of a chunk of
> > code, but the example above demonstrates it just as well without using
> > threads.
>
> well, it demonstrates re-binding of the generating function, not the
> source list being altered... the latter I find a lot more scary, and can't
> imagine how it could happen in a single thread.

Let's not confuse *re-binding* of the sequence (which would have no effect
on any of map, filter, reduce, list comprehension, for loops - each
takes a *reference* to the sequence at the start, of course) with
*alteration* (modification) of the sequence object.


Alex

Chris Barker

Sep 7, 2001, 4:03:34 PM
Chris Barker wrote:

> and can't imagine how it could happen in a single thread.

Clearly my imagination was very lame when I wrote that!!!

Alex Martelli wrote:

> Let's not confuse *re-binding* of the sequence (which would have no effect
> on any of map, filter, reduce, list comprehension, for loops - each
> takes a *reference* to the sequence at the start, of course) with
> *alteration* (modification) of the sequence object.

Good distinction.

I am starting to see a big hole in my idea for homogeneous sequences...
Even if the sequence is homogeneous when a function or something is called
(like map()), it may not remain homogeneous, so the interpreter would
have to check that there was only one reference to the object in order
to perform the optimisations I'm imagining. This wouldn't happen very
often. In fact, it could only work if you made the copy in the call:
map(fun, list[:]). That would kind of kill the auto-optimizing
I'm imagining.

The only way for it to work would be to define a given sequence as
statically homogeneous. These do exist: strings, for instance, and, in
fact, a homogeneous tuple could not be modified either. As well as NumPy
arrays. (actually, I think you can change the type of a NumPy array on
the fly, at least with a C extension, but I don't think it can be done
in Python).

So the idea of keeping a dynamic homogeneous flag with lists and
dictionaries is probably pretty useless, but it may still make sense to
have such a flag for other sequences: ones that are either immutable, or
defined to be homogeneous. With the native types, I suppose there is no
need for a flag, but with extension types, it could still be useful.

One more question: is it possible for a function to be mutable?? Skip
Montanaro gave a good example of how a function could be re-bound in the
middle of a for loop or list comprehension, but is there any way a
function could be redefined without re-binding, as in:

map(fun, t)

when this is called, the function bound to the "fun" name is passed to
map, as is the tuple bound to the name "t". The tuple is immutable, and
let's pretend it were also marked as homogenous. If the function is also
immutable, then map could conceivably compile a version of fun that
worked on the type of the items in t, and then apply that function to
all the elements in t, without type checking, and looping at the speed
of C. This is the kind of optimization I am imagining.
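
A purely hypothetical Python-level sketch of that shape (specialize() is
an invented stand-in; nothing like this exists in the interpreter):

def specialize(fun, elem_type):
    # stand-in: pretend a type-specialized version of fun was compiled here
    return fun

def specialized_map(fun, t):
    if not t:
        return []
    elem_type = type(t[0])
    for item in t:                       # check homogeneity once, up front
        if type(item) is not elem_type:
            return map(fun, t)           # not homogenous: ordinary path
    fast = specialize(fun, elem_type)
    # a real implementation would run fast in a tight C loop with no
    # per-element type checks; map here is just a placeholder for that
    return map(fast, t)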

Granted, someone would have to write the code to do this, but at the
moment I'm trying to determine whether it is even possible. Also,
apparently there are folks working on some kinds of JIT compilation that
could be used here.

Even in a for loop or list comprehension, you would have to do the extra
check to see if the function had been re-bound, but that's the only
check that would have to be done.

I'm also seeing this as something that could be used by Py2C-type
approaches, or hand-written extensions. You could write a function that
operates on a homogenous sequence of strings, for instance. Then the
only type checking you would have to do is check whether you got such a
sequence. If the sequence was mutable, you could make a copy in your
function (a lot of NumPy extensions do that: they build a PyArray from
whatever is input, and if a PyArray is input, it doesn't need to be
changed). You would have a call analogous to NumPy's
PyArray_FromObject(), maybe PyHomogenousSequenceFromObject(), that would
build it for you if it was not what was passed in.
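
A rough Python-level sketch of that coercion pattern, using the standard
array module (which is real) as a stand-in for the imagined
PyHomogenousSequenceFromObject(), with doubles as the concrete case:

import array

def as_homogenous_doubles(seq):
    # already the right kind of homogenous sequence: use it as-is, no copy
    if isinstance(seq, array.ArrayType) and seq.typecode == 'd':
        return seq
    # otherwise build a homogenous copy (raises TypeError if it can't)
    return array.array('d', seq)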

This is not the least bit trivial to implement, but it could be done bit
by bit....

Paul Winkler

unread,
Sep 7, 2001, 4:17:12 PM9/7/01
to
On Thu, 6 Sep 2001 13:38:07 +0200, Alex Martelli <al...@aleax.it> wrote

(snip - a very long and detailed discussion of map, list
comprehensions, and for/append loops)

Thanks very much, Alex, now I understand what's happening under the
covers! I guess that optimizing list comprehensions in this way is
probably not worth the trouble, given the factors you mention. The
optional "if" clause of list comprehensions also complicates things.

I tried your test script, and got comparable results to yours, with
the exception that on my machine, use_map always beats pre_alloc even
when len() is "wrong" (although not by much). I tried it with
n=1000*1000 and n = 100 * 100 and got similarly proportioned results.

So for the foreseeable future, looks like map() is likely to remain
the fastest way to generate lists.

-- Paul Winkler

Skip Montanaro

unread,
Sep 7, 2001, 4:33:13 PM9/7/01
to Chris Barker, pytho...@python.org

Chris> One more question: is it possible for a function to be mutable??

Yes, in theory, from C code. You could create a new code object and change
the function's func_code attribute. I don't think you can do that from
Python code, even using the new module. My rattlesnake stuff purports to do
just that. It generates a new code object that uses the rattlesnake VM and
replaces the function's old code object with it. For the purposes of
testing, I am currently using a setcode builtin function written by Vladimir
Marangazov so that I can swap a function's code object from Python.

but-then-just-about-anything's-possible-in-C-ly y'rs,

Skip

Marcin 'Qrczak' Kowalczyk

unread,
Sep 7, 2001, 4:49:29 PM9/7/01
to
Fri, 07 Sep 2001 20:17:12 GMT, Paul Winkler <slin...@yahoo.com> writes:

> So for the foreseeable future, looks like map() is likely to remain
> the fastest way to generate lists.

As long as the function to map is ready. If a separate function must
be created just for mapping (it doesn't matter if by 'def' or 'lambda'),
I would expect a list comprehension with the function's body inserted
to be faster than map, which must call it.
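
That is, the comparison is roughly between forms like these (an
illustrative pair; the numbers are arbitrary):

seq = range(1000)
doubled = map(lambda x: x * 2, seq)   # a throwaway function, called per item
doubled = [x * 2 for x in seq]        # the body is inlined; no call overhead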

Tim Peters

unread,
Sep 7, 2001, 5:08:19 PM9/7/01
to pytho...@python.org
[Chris Barker]

> One more question: is it possible for a function to be
> mutable??

[Skip Montanaro]


> Yes, in theory, from C code. You could create a new code object
> and change the function's func_code attribute. I don't think you
> can do that from Python code, even using the new module.

func_code has been writable for some time. Here under 2.2a3:

>>> def f():
...     pass
...
>>> def g():
...     return 666
...
>>> f.func_code = g.func_code
>>> f()
666
>>>

> ...
> but-then-just-about-anything's-possible-in-C-ly y'rs,

ya-but-it's-easier-in-python<wink>-ly y'rs - tim


Skip Montanaro

unread,
Sep 7, 2001, 5:42:42 PM9/7/01
to Tim Peters, pytho...@python.org

Tim> func_code has been writable for some time. Here under 2.2a3:

Whoops! Thanks for the correction. I'm still working from a 1.5.2 codebase
in the rattlesnake stuff. That's the only context in which I've ever wanted
to modify a function's code object, and, lo and behold, it's possible there
as well...

Skip

Chris Barker

unread,
Sep 7, 2001, 6:33:24 PM9/7/01
to
Tim Peters wrote:
> > One more question: is it possible for a function to be
> > mutable??
> func_code has been writable for some time. Here under 2.2a3:
>
> >>> def f():
> ...     pass
> ...
> >>> def g():
> ...     return 666
> ...
> >>> f.func_code = g.func_code
> >>> f()
> 666

YOW!

this really makes optimizing even harder than I thought, and I always
thought it was hard.

Thanks, folks, I know I've learned a lot.

-Chris

Delaney, Timothy

unread,
Sep 9, 2001, 9:02:25 PM9/9/01
to pytho...@python.org
> From: Chris Barker [mailto:chrish...@home.net]

> Tim Peters wrote:
> > > One more question: is it possible for a function to be
> > > mutable??
> > func_code has been writable for some time. Here under 2.2a3:
> >
> > >>> def f():
> > ...     pass
> > ...
> > >>> def g():
> > ...     return 666
> > ...
> > >>> f.func_code = g.func_code
> > >>> f()
> > 666
>
> YOW!
>
> this really makes optimizing even harder than I thought, and I always
> thought it was hard.

Yep - now you know why so many people have *not* put the effort into
optimising Python. Dynamism pervades almost every part of the language.

The only optimisations I've seen proposed in recent times which truly look
interesting are

PEP 266: Optimizing Global Variable/Attribute Access
http://python.sourceforge.net/peps/pep-0266.html

PEP 267: Optimized Access to Module Namespaces
http://python.sourceforge.net/peps/pep-0267.html

both of which need implementing and testing to determine (a) whether they
are indeed an improvement, and (b) whether they are enough of an improvement
when balanced by how much they complicate the Python core code.

Personally, I'm in favour of PEP 266, which should provide an
across-the-board improvement without overly complicated code (it doesn't
have any special cases, whereas PEP 267 does). However, PEP 267 would do
much of the work as well - both would optimise getting references from other
modules, which currently usually requires looking up the name of the module
(usually local namespace, then module namespace, possibly with __builtin__
namespace) then the attribute in the found module's namespace - i.e. at
least 2 dictionary lookups + (normally) one local lookup. These PEPs would
reduce this to two local lookups - a huge improvement.
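
The hand-optimization people reach for today to dodge those repeated
lookups is to bind the attribute to a local name once, outside the loop;
a small illustrative example (the function and names are made up):

import math

def norms(points):
    sqrt = math.sqrt             # global + attribute lookup, done once
    result = []
    append = result.append      # same trick for the bound method
    for x, y in points:
        append(sqrt(x*x + y*y))  # inside the loop: fast local lookups only
    return result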

Tim Delaney
Cross Avaya R&D
+61 2 9352 9079

Skip Montanaro

unread,
Sep 9, 2001, 9:51:16 PM9/9/01
to Delaney, Timothy, pytho...@python.org

>> > func_code has been writable for some time. Here under 2.2a3:
>> >
>> > >>> def f():
>> > ...     pass
>> > ...
>> > >>> def g():
>> > ...     return 666
>> > ...
>> > >>> f.func_code = g.func_code
>> > >>> f()
>> > 666
>>
>> this really makes optimizing even harder than I thought, and I always
>> thought it was hard.

Tim> Yep - now you know why so many people have *not* put the effort
Tim> into optimising Python. Dynamism pervades almost every part of the
Tim> language.

I don't see how substituting one code object for another makes optimizing a
function any harder or easier.

Chris Barker

unread,
Sep 10, 2001, 2:37:07 PM9/10/01
to
"Delaney, Timothy" wrote:

> Yep - now you know why so many people have *not* put the effort into
> optimising Python. Dynamism pervades almost every part of the language.
>
> The only optimisations I've seen proposed in recent times which truly look
> interesting are
>
> PEP 266: Optimizing Global Variable/Attribute Access
> http://python.sourceforge.net/peps/pep-0266.html
>
> PEP 267: Optimized Access to Module Namespaces
> http://python.sourceforge.net/peps/pep-0267.html
>
> both of which need implementing and testing to determine (a) whether they
> are indeed an improvement, and (b) whether they are enough of an improvement
> when balanced by how much they complicate the Python core code.

These both do look good, but let's face it: It's not even clear that
they will help at all, and certainly not enough to make a huge
difference.

My issue with the dynamism of Python is that it is wonderful and
powerful, but frankly, not used in a great deal of code: how many people
even knew that you could change the code of a function on the fly, let
alone actually used the feature? And of those few that have used the
feature, how often have they used it?

What I am searching for in this discussion are ways that Python can take
advantage of the fact that much code is not, in fact, all that dynamic.
I suspect that most of the time that a list comprehension or map is
called, the author of the code fully expects the expression (or
function) to do the same thing for the duration of the call, and that
the sequence is more often than not homogenous. There must be some way
to take advantage of that circumstance. Granted, we could speed up map
and list comprehensions a whole lot, and that wouldn't speed up the
language as a whole all that much, but this is just an example.

Perhaps the only way to do it would be to sprinkle a few "static" (or
similar) statements in. Frankly, I think this is ugly, but the current
answer to a few core functions in a Python program being too slow is to
re-code them in C. Surely that is ugly too!


I-know-there-are-no-easy-answers-but-I-wish-there-were-ly yours,

Chris Barker

unread,
Sep 10, 2001, 2:26:44 PM9/10/01
to
Skip Montanaro wrote:
> Tim> Yep - now you know why so many people have *not* put the effort
> Tim> into optimising Python. Dynamism pervades almost every part of the
> Tim> language.
>
> I don't see how substituting one code object for another makes optimizing a
> function any harder or easier.

I'm not sure he was necessarily referring to that particular example,
but I do think he's right:

Let's say that my homogenous sequence proposal were in place, and you
had the line of code:

map(func,sequence)

A mythical JIT compiler could compile a version of func that worked on
the type of object in sequence, and then apply it to all the elements.
Once all the checking and compiling was done, that would run as fast as
a static, compiled language, and therefore give a major speed-up (if
sequence was long enough). However, since the code object in func could
be changed while the map loop was running, there would have to be a
check to make sure neither the code nor the sequence had changed. This
would mean that the inner loop could NEVER be as fast as compiled
static code.
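
A rough sketch of the guards such a compiler would need
(compile_for_type() is invented, and here it is just a do-nothing
stand-in):

def compile_for_type(func, elem_type):
    # stand-in: a real JIT would emit type-specialized code here
    return func

def guarded_map(func, sequence):
    if not sequence:
        return []
    code = func.func_code             # remember the current code object...
    elem_type = type(sequence[0])     # ...and the element type
    fast = compile_for_type(func, elem_type)
    out = []
    for item in sequence:
        # the per-iteration checks described above
        if func.func_code is not code or type(item) is not elem_type:
            out.append(func(item))    # a guard failed: take the slow path
        else:
            out.append(fast(item))
    return out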

Anyone know how LISP deals with this? It is certainly a very dynamic
language.

Paul Winkler

unread,
Sep 12, 2001, 11:46:03 AM9/12/01
to
On 7 Sep 2001 20:49:29 GMT, Marcin 'Qrczak' Kowalczyk <qrc...@knm.org.pl> wrote:
>Fri, 07 Sep 2001 20:17:12 GMT, Paul Winkler <slin...@yahoo.com> writes:
>
>> So for the foreseeable future, looks like map() is likely to remain
>> the fastest way to generate lists.
>
>As long as the function to map is ready. If a separate function must
>be created just for mapping (it doesn't matter if by 'def' or 'lambda'),
>I would expect a list comprehension with the function's body inserted
>to be faster than map which must call it.

Interesting idea. I just tried ... Yes, *if* what you're doing is
practical to inline in the list comprehension, that can beat map().
And of course if what you're doing is calling, say, a function or
method from the standard library, you're not going to inline that, so
map() still looks like a good idea.

Example:

test = xrange(100000)

def multiplier(x):
    return x * 0.99

def mapper(m=multiplier, f=test):
    return map(m, f)

def comprehender_inline(f=test):
    return [x * 0.99 for x in f]

def comprehender(m=multiplier, f=test):
    return [m(x) for x in f]

# end of example

In this case, comprehender_inline wins by a very large margin,
followed by mapper, which is in turn faster than comprehender.
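
For reference, a simple way to time functions like these (not the script
used above -- just a minimal harness in the same Python 2 spirit):

import time

def time_it(fn, repeats=10):
    start = time.clock()
    for i in range(repeats):
        fn()
    return (time.clock() - start) / repeats

for fn in (mapper, comprehender, comprehender_inline):
    print fn.__name__, time_it(fn)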

But here's the counterexample:

import string
test = # a 100000-character random string

def mapper(m=string.upper, f=test):
    return map(m, f)

def comprehender(m=string.upper, f=test):
    return [m(x) for x in f]

su = string.upper
def comprehender_inline(f=test):
    su = string.upper
    return [su(x) for x in f]


# end example

In this case, comprehender_inline is about the same speed as
comprehender, no surprise since they're almost identical. Map wins by
a large margin. If you rewrite comprehender_inline so it returns
[x.upper() for x in f], it speeds up quite a lot, but still loses to
map, and we're not really comparing apples to apples anymore.

