
generated comprehensions


Garth T Kidd

May 14, 2002, 2:31:10 AM
I'm a little worried about getting into the habit of using list
comprehensions because I'll have to re-write the comprehensions back
in "normal" Python whenever someone tries to shove a lot of data
through them.

If we're talking normal sequences, of course, it's not that much of a
problem. If it fits in memory, it fits in memory. It's when you start
using generators because you need to that suddenly comprehensions look
a little brittle.

def printOdds(upto):
    for odd in [num for num in xrange(upto) if num%2]:
        print odd

... works fine if upto is 5, but just sits there chewing up memory if
upto is 10**9, at which point you abandon comprehensions and do it
properly:

def printOdds(upto):
    for num in xrange(upto):
        if num%2:
            print num

Other comprehension party tricks suffer similarly:

[a for a in xrange(upto//2) if sys.stdout.write('%s\n' % (a*2+1))]

I'm sure I'll figure out a decent rule of thumb (say, "unit test with
the biggest practical number, and get rid of comprehensions if they
turn out to be a problem", or "don't use comprehensions with
generators"), but the fracture still worries me.

Has anyone tried to figure out a generator variant on list
comprehensions? Should all list comprehensions return generators?
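
A minimal sketch of the lazy variant being asked about, written for a current Python 3 interpreter (an assumption; the code in this thread is Python 2.2, where `xrange` and the `print` statement would be used instead). The names `odds`, `print_odds`, `lazy_odds` and `first_three` are made up for illustration. A generator function keeps the filter in one place without building a list; the parenthesised generator expression at the end is syntax that later Python releases added, not something available in 2.2.

```python
from itertools import islice


def odds(upto):
    """Yield the odd numbers below `upto` one at a time (hypothetical helper)."""
    for num in range(upto):  # range is lazy in Python 3, like xrange in 2.x
        if num % 2:
            yield num


def print_odds(upto):
    # Consumes the generator item by item; no intermediate list is built.
    for odd in odds(upto):
        print(odd)


# Generator expression: comprehension-like syntax, evaluated lazily.
# Nothing is computed until values are requested, so 10**9 is harmless here.
lazy_odds = (num for num in range(10**9) if num % 2)
first_three = list(islice(lazy_odds, 3))
```

Memory use stays flat no matter how large `upto` is, because only one value exists at a time.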

Regards,
Garth.

PS: Eh? ::

>>> xrange(0,5000000000)[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int

Michael Gilfix

May 14, 2002, 2:49:02 AM
On Mon, May 13 @ 23:31, Garth T Kidd wrote:
> Has anyone tried to figure out a generator variant on list
> comprehensions? Should all list comprehensions return generators?

.. or rather, why shouldn't they? Aren't generators much more memory
efficient? Perhaps someone could enlighten me on the downside of
generators since they seem to accomplish everything a pure iterator
can do but in a more memory efficient way.

-- Mike

--
Michael Gilfix
mgi...@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


Oren Tirosh

May 14, 2002, 4:09:53 AM
On Mon, May 13, 2002 at 11:31:10PM -0700, Garth T Kidd wrote:
> Has anyone tried to figure out a generator variant on list
> comprehensions? Should all list comprehensions return generators?

PEP 289 by Raymond Hettinger. http://www.python.org/peps/pep-0289.html

"""
Specification for Generator Comprehensions :

If a list comprehension starts with a 'yield' keyword, then
express the comprehension with a generator. For example:

g = [yield (len(line),line) for line in file if len(line)>5]
...

BDFL Pronouncements

Generator comprehensions are REJECTED. The rationale is that the
benefits are marginal since generators can already be coded
directly and the costs are high because implementation and
maintenance require major efforts with the parser.
"""

RTFP'ly yours

Oren
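
The quoted rationale that generators "can already be coded directly" can be sketched as follows, using the PEP's own example. This is written for a current Python 3 interpreter (an assumption; Python 2.2 would also need `from __future__ import generators`). The name `long_lines` and the sample data are made up for illustration.

```python
import io


def long_lines(lines):
    """Hypothetical generator equivalent of the PEP's example:
    [yield (len(line), line) for line in file if len(line) > 5]
    """
    for line in lines:
        if len(line) > 5:
            yield (len(line), line)


# An in-memory stand-in for an open file; any iterable of lines would do.
sample = io.StringIO("hi\nlonger line\nok\nanother one\n")
g = long_lines(sample)
result = list(g)
```

Lines are read and filtered one at a time, so the whole file never has to be held in memory unless the caller asks for a list, as the last line does here.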


John La Rooy

May 14, 2002, 6:08:23 AM
Maybe we need an 'x' comprehension? x[...] anyone?

I think 1e9 is getting toward the exceptionally large. Sure, if you think
someone is going to plug it into your program someday, you should work around it
*if* it is practical to do so. If that someday isn't for 5 or 10 years, that
comprehension or whatever is going to be umpteen times faster.

Best to keep things in perspective. We generally expect to need to maintain
programs and raise those builtin limits as users expect/require it.

You have a good point. There is an area where comprehensions aren't ideal, and
in a perfect world programmers would never need to worry about the
underlying implementation.

This did occur to me on reading your post, and the result did startle me a little:

Python 2.2.1 (#2, Apr 21 2002, 22:22:55)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> list(xrange(1e9))
Segmentation fault



bru...@tbye.com

May 14, 2002, 11:02:09 AM
On 13 May 2002, Garth T Kidd wrote:

> I'm a little worried about getting into the habit of using list
> comprehensions because I'll have to re-write the comprehensions back
> in "normal" Python whenever someone tries to shove a lot of data
> through them.

Who is 'someone' and why would they shove 'a lot' of (too much) data through
your list comprehensions? (i.e. - are you actually writing code that has
to behave properly for unknown and possibly malicious users and/or handle
huge amounts of data or are you just musing over the theoretical limits of
what Python can handle?)

> If we're talking normal sequences, of course, it's not that much of a
> problem. If it fits in memory, it fits in memory. It's when you start
> using generators because you need to that suddenly comprehensions look
> a little brittle.
>
> def printOdds(upto):
>     for odd in [num for num in xrange(upto) if num%2]:
>         print odd
>
> ... works fine if upto is 5, but just sits there chewing up memory if
> upto is 10**9,

What are the odds of actually being bitten by this "problem"? Assuming you
have a real program, it'll be much more complex than this single function,
and I'd expect that it's likely to break down in other ways, not just list
comprehensions. For example, maybe you should avoid using strings as much
as possible too (and just do all your data processing by reading and
writing to an open file or a memory-mapped file) <0.7 wink>.

IOW, if your program will realistically need to handle a billion of
something, there's no point in singling out list comprehensions because
you'll need to be careful every step of the way.

> I'm sure I'll figure out a decent rule of thumb (say, "unit test with
> the biggest practical number, and get rid of comprehensions if they
> turn out to be a problem", or "don't use comprehensions with
> generators")

How about "don't cross the bridge til you get to it"? If you are writing a
program that needs to handle large amounts of data, that's something
you'll have to keep in mind every step of the way. Outside of that very
narrow domain, however, it's usually a waste of time to worry about it.
For example, the other day I wrote a quick utility to add line numbers to
source code files for annotation - theoretically I can get into big
trouble if the source code file is too big to fit into memory or if it is
more than 2.1 billion lines long (I'm using Python 2.1), but realistically
it's a waste of time to be concerned with that.

-Dave

Raymond Hettinger

May 14, 2002, 12:42:19 PM
> This did occur to me on reading your post, and the result did startle me a little:
>
> Python 2.2.1 (#2, Apr 21 2002, 22:22:55)
> [GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> list(xrange(1e9))
> Segmentation fault

Hmm. That's not cool.
Added bug report: www.python.org/sf/556025 .


Raymond Hettinger


Greg Ewing

May 14, 2002, 10:27:03 PM
Garth T Kidd wrote:
>
> def printOdds(upto):
>     for odd in [num for num in xrange(upto) if num%2]:
>         print odd
>
> ... works fine if upto is 5, but just sits there chewing up memory if
> upto is 10**9, at which point you abandon comprehensions and do it
> properly:

I don't think I would have used a comprehension there in
the first place, because building a list isn't needed.
i.e. I would have "done it properly" the first time,
particularly since the comprehension only saves 2 lines
of code.

> I'm sure I'll figure out a decent rule of thumb (say, "unit test with
> the biggest practical number, and get rid of comprehensions if they
> turn out to be a problem", or "don't use comprehensions with
> generators"),

The rule of thumb you want is: If you need the result as
a list, use a comprehension, otherwise don't.
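
That rule of thumb can be illustrated with a short sketch, written for a current Python 3 interpreter (an assumption; under Python 2.2 `xrange` would stand in for `range`). The variable names are made up for illustration.

```python
# Case 1: the result is needed as a list (length, indexing, reuse),
# so a comprehension is the natural fit.
odds = [num for num in range(20) if num % 2]
count = len(odds)
last = odds[-1]

# Case 2: each value is used once and then discarded, so a plain loop
# does the job without building a list at all.
total = 0
for num in range(20):
    if num % 2:
        total += num
```

In the second case the memory cost does not grow with the size of the range, which is the situation the original `printOdds` example was in.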

--
Greg Ewing, Computer Science Dept, University of Canterbury,
Christchurch, New Zealand
To get my email address, please visit my web page:
http://www.cosc.canterbury.ac.nz/~greg
