Python 'for' loop is memory inefficient

Dr. Phillip M. Feldman

unread,

Aug 14, 2009, 9:25:45 PM8/14/09

to pytho...@python.org

I wrote the following correct but inefficient test of primality for purposes
of demonstrating that the simplest algorithm is often not the most
efficient. But, when I try to run the following code with a value of n that
is large enough to produce a significant amount of running time, I get an
out-of-memory error!

def is_prime(n):
for j in range(2,n):
if (n % j) == 0: return False
return True

It seems as though Python is actually expanding range(2,n) into a list of
numbers, even though this is incredibly wasteful of memory. There should be
a looping mechanism that generates the index variable values incrementally
as they are needed.
--
View this message in context: http://www.nabble.com/Python-%27for%27-loop-is-memory-inefficient-tp24980842p24980842.html
Sent from the Python - python-list mailing list archive at Nabble.com.

Mark Lawrence

unread,

Aug 14, 2009, 9:38:18 PM8/14/09

to pytho...@python.org

Dr. Phillip M. Feldman wrote:
> I wrote the following correct but inefficient test of primality for purposes
> of demonstrating that the simplest algorithm is often not the most
> efficient. But, when I try to run the following code with a value of n that
> is large enough to produce a significant amount of running time, I get an
> out-of-memory error!
>
> def is_prime(n):
> for j in range(2,n):
> if (n % j) == 0: return False
> return True
>
> It seems as though Python is actually expanding range(2,n) into a list of
> numbers, even though this is incredibly wasteful of memory. There should be
> a looping mechanism that generates the index variable values incrementally
> as they are needed.

I have a strong suspicion that you will find hints in the Python
documentation that this has already been addressed. Perhaps you could
try reading before posting?

--
Kindest regards.

Mark Lawrence.

Ethan Furman

unread,

Aug 14, 2009, 9:43:42 PM8/14/09

to pytho...@python.org

Dr. Phillip M. Feldman wrote:
> I wrote the following correct but inefficient test of primality for purposes
> of demonstrating that the simplest algorithm is often not the most
> efficient. But, when I try to run the following code with a value of n that
> is large enough to produce a significant amount of running time, I get an
> out-of-memory error!
>
> def is_prime(n):
> for j in range(2,n):
> if (n % j) == 0: return False
> return True
>
> It seems as though Python is actually expanding range(2,n) into a list of
> numbers, even though this is incredibly wasteful of memory. There should be
> a looping mechanism that generates the index variable values incrementally
> as they are needed.

You already have an answer to the range issue, so I will only add that
putting a loop inside another loop is a decent way to expand the time taken.

I will also observe that if you were to stop programming whatever
language you are more familiar with in Python, and start programming
Python in Python, you'll have an easier time of it.

The Dive Into Python is an excellent start for that.

~Ethan~

r

unread,

Aug 14, 2009, 10:01:26 PM8/14/09

to

On Aug 14, 8:25 pm, "Dr. Phillip M. Feldman" <pfeld...@verizon.net>
wrote:

> I wrote the following correct but inefficient test of primality for purposes
> of demonstrating that the simplest algorithm is often not the most
> efficient. But, when I try to run the following code with a value of n that
> is large enough to produce a significant amount of running time, I get an
> out-of-memory error!

I don't think Python was created to keep mathematician's happy.
Although strangely enough, the creator holds a masters degree in
mathematics *and* computer science. Just another one of life's little
ironies ;)

Rascal

unread,

Aug 15, 2009, 1:39:08 AM8/15/09

to

look at xrange -- http://docs.python.org/library/functions.html#xrange

Hendrik van Rooyen

unread,

Aug 15, 2009, 5:21:12 AM8/15/09

to Dr. Phillip M. Feldman, pytho...@python.org

On Saturday 15 August 2009 03:25:45 Dr. Phillip M. Feldman wrote:

> It seems as though Python is actually expanding range(2,n) into a list of
> numbers, even though this is incredibly wasteful of memory. There should be
> a looping mechanism that generates the index variable values incrementally
> as they are needed.

There is.

Use xrange instead of range, and try again.

And while you are about it, you may as well teach them that it is much better
to do a multiplication than a division.

-Hendrik

John Machin

unread,

Aug 15, 2009, 5:20:12 AM8/15/09

to

Alternative hypothesis: the good doctor read the Python 3.x
documentation but absent-mindedly ran Python 2.x

Steven D'Aprano

unread,

Aug 15, 2009, 8:25:00 AM8/15/09

to

On Fri, 14 Aug 2009 18:25:45 -0700, Dr. Phillip M. Feldman wrote:

> It seems as though Python is actually expanding range(2,n) into a list
> of numbers, even though this is incredibly wasteful of memory. There
> should be a looping mechanism that generates the index variable values
> incrementally as they are needed.

Others have already pointed out to you that xrange() (for Python 2.x)
does what you want. In Python 3.x, the old range() is gone for good, and
xrange() renamed to just range().

However, I'd like to point out that your subject line is fundamentally
incorrect. What you've discovered has *nothing* to do with for-loops: the
for-loop will happily iterate over whatever object you pass to it (or at
least try to, because not all objects are iterable). You might be
thinking that Python requires you to write for-loops as

for i in range(...):

but that's not correct. The for-loop doesn't care what object comes after
the "in" and before the ":", so long as it can iterate over it one item
at a time:

for i in [1, 3, 4, 5, 2, 0, -1, 8, 7, 6]:
print i

for c in "abcd":
print c

and many other variations will work perfectly fine.

The memory-consumption you have discovered is a property of the range()
function, not the for-loop:

>>> L = range(2, 20)
>>> L
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

Compare to xrange:

>>> L = xrange(2, 20)
>>> L
xrange(2, 20)
>>> list(L) # expand out into a list
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

By the way, both range() and xrange() take an optional 'stepsize'
argument, defaulting to 1:

>>> range(3, 20, 2)
[3, 5, 7, 9, 11, 13, 15, 17, 19]

help(range) and help(xrange) in the interactive interpreter are your
friends.

--
Steven

John Nagle

unread,

Aug 15, 2009, 3:34:15 PM8/15/09

to

Actually, division speed hasn't been much of an issue in years. Arithmetic
has been faster than memory access since CPUs went superscalar. In CPython,
with all the interpreter overhead, you'll probably never notice the difference.

I'd thought "xrange" had already been obsoleted, and hadn't realized the
2.3 - 2.6 versions of CPython were still doing "range" the hard way, not
as a generator.

John Nagle

MRAB

unread,

Aug 15, 2009, 5:18:24 PM8/15/09

to pytho...@python.org

It would've broken some existing code, hence it was left until Python 3.

Emmanuel Surleau

unread,

Aug 16, 2009, 2:30:54 AM8/16/09

to pytho...@python.org

> Dr. Phillip M. Feldman wrote:

[snip]

> > def is_prime(n):
> > for j in range(2,n):
> > if (n % j) == 0: return False
> > return True
> >

> > It seems as though Python is actually expanding range(2,n) into a list of
> > numbers, even though this is incredibly wasteful of memory. There should
> > be a looping mechanism that generates the index variable values
> > incrementally as they are needed.
>

> You already have an answer to the range issue, so I will only add that
> putting a loop inside another loop is a decent way to expand the time
> taken.
>
> I will also observe that if you were to stop programming whatever
> language you are more familiar with in Python, and start programming
> Python in Python, you'll have an easier time of it.

I don't see what's particularly un-Pythonic with this code. Not using xrange()
is a mistake, certainly, but it remains clear, easily understandable code
which correctly demonstrates the naive algorithm for detecting whether n is a
prime. It doesn't call for condescension

> The Dive Into Python is an excellent start for that.

I never read it, but I was under the impression that the tutorial on
python.org was geared toward programmers moving to Python from other
languages. It was also my impression that Dive Into Python is rather outdated
and occasionally inaccurate (based on comments on this mailing list).

Cheers,

Emm

Steven D'Aprano

unread,

Aug 16, 2009, 4:32:35 AM8/16/09

to

On Sun, 16 Aug 2009 08:30:54 +0200, Emmanuel Surleau wrote:

[...]

>> I will also observe that if you were to stop programming whatever
>> language you are more familiar with in Python, and start programming
>> Python in Python, you'll have an easier time of it.
>
> I don't see what's particularly un-Pythonic with this code. Not using
> xrange() is a mistake, certainly, but it remains clear, easily
> understandable code which correctly demonstrates the naive algorithm for
> detecting whether n is a prime. It doesn't call for condescension

It's a particular unfair criticism because the critic (Ethan Furman)
appears to have made a knee-jerk reaction. The "some language in Python"
behaviour he's reacting to is the common idiom:

for i in range(len(seq)):
do_something_with(seq[i])

instead of the "Python in Python" idiom:

for obj in seq:
do_something_with(obj)

That's a reasonable criticism, but *not in the case*. Ethan appears to
have made the knee-jerk reaction "for i in range() is Bad" without
stopping to think about what this specific piece of code is actually
doing.

(Replace 'obj' with 'j', 'seq' with 'range(2, n)', and
'do_something_with' with 'if (n % j) == 0: return False', and you have
the exact same code pattern.)

--
Steven

Emmanuel Surleau

unread,

Aug 16, 2009, 9:34:06 AM8/16/09

to pytho...@python.org

> It's a particular unfair criticism because the critic (Ethan Furman)
> appears to have made a knee-jerk reaction. The "some language in Python"
> behaviour he's reacting to is the common idiom:
>
> for i in range(len(seq)):
> do_something_with(seq[i])
>
>
> instead of the "Python in Python" idiom:
>
> for obj in seq:
> do_something_with(obj)
>
>
> That's a reasonable criticism, but *not in the case*. Ethan appears to
> have made the knee-jerk reaction "for i in range() is Bad" without
> stopping to think about what this specific piece of code is actually
> doing.
>
> (Replace 'obj' with 'j', 'seq' with 'range(2, n)', and
> 'do_something_with' with 'if (n % j) == 0: return False', and you have
> the exact same code pattern.)

Fair enough. But as far as I know, for i in (x)range(num) is the canonical way
to iterate over numbers in Python.

Another case of lack of RTFM* before answering, I suppose.

Cheers,

Emm

*Read The Fine Mail

Benjamin Kaplan

unread,

Aug 16, 2009, 11:41:21 AM8/16/09

to pytho...@python.org

On Sun, Aug 16, 2009 at 2:30 AM, Emmanuel Surleau
<emmanuel...@gmail.com> wrote:
>
> I don't see what's particularly un-Pythonic with this code. Not using xrange()
> is a mistake, certainly, but it remains clear, easily understandable code
> which correctly demonstrates the naive algorithm for detecting whether n is a
> prime. It doesn't call for condescension

It's not that the code is bad, but too many people coming from Java
and C keep thinking of for loops like they're using Java or C and
therefore that "for i in range(a,b)" is identical to "for(int i = a; i
< b; i++)". It's not and, for the most part, you shouldn't code like
that. Since you're using numbers, range/xrange is the appropriate
idiom but you still have to remember that a for loop in python doesn't
loop until a condition is met, it loops through an iterator until the
interator says it's done.

bartc

unread,

Aug 16, 2009, 2:45:27 PM8/16/09

to

"Steven D'Aprano" <st...@REMOVE-THIS-cybersource.com.au> wrote in message
news:02969972$0$20647$c3e...@news.astraweb.com...

> On Fri, 14 Aug 2009 18:25:45 -0700, Dr. Phillip M. Feldman wrote:
>
>> It seems as though Python is actually expanding range(2,n) into a list
>> of numbers, even though this is incredibly wasteful of memory. There
>> should be a looping mechanism that generates the index variable values
>> incrementally as they are needed.
>
>
> Others have already pointed out to you that xrange() (for Python 2.x)
> does what you want. In Python 3.x, the old range() is gone for good, and
> xrange() renamed to just range().

It does seem rather worrying that whoever originally thought up Python
decided it would be a good idea to implement a simple 1 to N (or 0 to N-1)
for-loop by first creating an array of N consecutive integers!

They must have known Python was going to be slow anyway, but this wouldn't
have helped the speed any. Nor the memory consumption.

A for-loop, for iterating over a simple sequence, should be one of the
fastest things in the language.

[Presumably the internal code that created those consecutive integers used a
more conventional looping method...]

--
Bartc

MRAB

unread,

Aug 16, 2009, 4:00:27 PM8/16/09

to pytho...@python.org

bartc wrote:
>
> "Steven D'Aprano" <st...@REMOVE-THIS-cybersource.com.au> wrote in
> message news:02969972$0$20647$c3e...@news.astraweb.com...
>> On Fri, 14 Aug 2009 18:25:45 -0700, Dr. Phillip M. Feldman wrote:
>>
>>> It seems as though Python is actually expanding range(2,n) into a list
>>> of numbers, even though this is incredibly wasteful of memory. There
>>> should be a looping mechanism that generates the index variable values
>>> incrementally as they are needed.
>>
>>
>> Others have already pointed out to you that xrange() (for Python 2.x)
>> does what you want. In Python 3.x, the old range() is gone for good, and
>> xrange() renamed to just range().
>
> It does seem rather worrying that whoever originally thought up Python
> decided it would be a good idea to implement a simple 1 to N (or 0 to
> N-1) for-loop by first creating an array of N consecutive integers!
>

Whoever? I am shocked! ;-)

sturlamolden

unread,

Aug 16, 2009, 4:41:07 PM8/16/09

to

On 16 Aug, 11:45, "bartc" <ba...@freeuk.com> wrote:

> A for-loop, for iterating over a simple sequence, should be one of the
> fastest things in the language.

Anyone experienced with interpreted high-level languages knows this is
not true. Not because iterating a sequence is expensive, but because
the interpreter is constantly executing the body of the loop. There is
a reason why Matlab and Python/NumPy programmers rely heavily on
vectorized array expressions for efficient numerics.

The only thing more evil than a for-loop is recursive function calls.

This does not mean Python is slow. It just means you have to program
differently.

Message has been deleted

sturlamolden

unread,

Aug 16, 2009, 6:35:26 PM8/16/09

to

On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:

> Well, the alternative would be to have two keywords for looping: one
> for your "simple" incrementing integer loop, and another for a loop that
> operates over the elements of some collection type.

A compiler could easily recognise a statement like

for i in range(n):

as a simple integer loop. In fact, Cython is able to do this.

John Machin

unread,

Aug 16, 2009, 9:23:23 PM8/16/09

to

On Aug 17, 8:35 am, sturlamolden <sturlamol...@yahoo.no> wrote:

> A compiler could easily recognise a statement like
> for i in range(n):
> as a simple integer loop. In fact, Cython is able to do this.

Extremely easy, once users relinquish the right to replace built-in
"range" with their own concoctions ...

Benjamin Kaplan

unread,

Aug 16, 2009, 9:23:20 PM8/16/09

to pytho...@python.org

On Sun, Aug 16, 2009 at 6:35 PM, sturlamolden <sturla...@yahoo.no> wrote:
>
> A compiler could easily recognise a statement like
>
> for i in range(n):
>
> as a simple integer loop. In fact, Cython is able to do this.

but special cases aren't special enough to break the rules.

exa...@twistedmatrix.com

unread,

Aug 16, 2009, 9:28:40 PM8/16/09

to Benjamin Kaplan, pytho...@python.org

On 01:23 am, benjami...@case.edu wrote:
>On Sun, Aug 16, 2009 at 6:35 PM, sturlamolden <sturla...@yahoo.no>
>wrote:
>>

>>A compiler could easily recognise a statement like
>>
>> for i in range(n):
>>
>>as a simple integer loop. In fact, Cython is able to do this.
>

>but special cases aren't special enough to break the rules.

Although I think PyPy also recognizes this case and makes it as
efficient as using xrange, and does so without breaking any rules.

There *are* _some_ legitimate complaints to be made about the CPython
runtime. :)

Jean-Paul

Paul Rubin

unread,

Aug 16, 2009, 9:44:33 PM8/16/09

to

exa...@twistedmatrix.com writes:
> Although I think PyPy also recognizes this case and makes it as
> efficient as using xrange, and does so without breaking any rules.

How can pypy possibly know that the user hasn't assigned some other
value to "range"?

Carl Banks

unread,

Aug 16, 2009, 9:53:10 PM8/16/09

to

On Aug 16, 6:28 pm, exar...@twistedmatrix.com wrote:
> On 01:23 am, benjamin.kap...@case.edu wrote:
>
> >On Sun, Aug 16, 2009 at 6:35 PM, sturlamolden <sturlamol...@yahoo.no>

> >wrote:
>
> >>A compiler could easily recognise a statement like
>
> >> for i in range(n):
>
> >>as a simple integer loop. In fact, Cython is able to do this.
>
> >but special cases aren't special enough to break the rules.
>
> Although I think PyPy also recognizes this case and makes it as
> efficient as using xrange, and does so without breaking any rules.

PyPy uses a JIT compiler (which is still slower than CPython,
suggesting that perhaps they should have spent more time optimizing
the general case than optimizing for an easily avoided special case).

> There *are* _some_ legitimate complaints to be made about the CPython
> runtime. :)

This isn't one of them.

xrange has been part of Python for 10 years.

If there are any complaints to be made about this situation it's that
there are any 2.x learning materials anythere that continue to use
range() and not xrange() in this context.

Carl Banks

unread,

Aug 16, 2009, 10:12:20 PM8/16/09

to

It would be a simple to do if you were writing it for a different
langauge was a lot less dynamic than Python is. It'd be quite a
complex hack to add it to CPython's compiler while maintaing the
current highly dynamic runtime semantics and backwards compatibility,
which is a design constraint of Python whether you like it or not.

And all this complaining about an issue that can be worked around by
xrange instead of range. Sheesh.

> In fact, Cython is able to do this.

Cython can do this easily because it is a different language that is a
lot less dynamic than Python.

If you don't care about the dynamic stuff why don't you just use
Cython? Or quit complaining and just use xrange.

Carl Banks

Nobody

unread,

Aug 16, 2009, 11:10:17 PM8/16/09

to

On Sun, 16 Aug 2009 11:41:21 -0400, Benjamin Kaplan wrote:

> It's not that the code is bad, but too many people coming from Java
> and C keep thinking of for loops like they're using Java or C and
> therefore that "for i in range(a,b)" is identical to "for(int i = a; i
> < b; i++)". It's not and, for the most part, you shouldn't code like
> that. Since you're using numbers, range/xrange is the appropriate
> idiom but you still have to remember that a for loop in python doesn't
> loop until a condition is met, it loops through an iterator until the
> interator says it's done.

Java also has iterators; it's more a case of people coming from C and BASIC.

Although, some of those may have come *through* Java without abandoning
old habits. You see the same thing with people coming from BASIC to C and
writing:

#define NUM_DATES 50
int day[NUM_DATES], month[NUM_DATES], year[NUM_DATES];

rather than defining a "struct".

Sometimes referred to as "I know ten languages and can write in BASIC in
all of them".

exa...@twistedmatrix.com

unread,

Aug 17, 2009, 7:40:36 AM8/17/09

to Carl Banks, pytho...@python.org

On 02:12 am, pavlove...@gmail.com wrote:
>On Aug 16, 3:35 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
>>On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>>
>> > Well, the alternative would be to have two keywords for
>>looping: one
>> > for your "simple" incrementing integer loop, and another for a loop
>>that
>> > operates over the elements of some collection type.
>>
>>A compiler could easily recognise a statement like
>>
>> for i in range(n):
>>
>>as a simple integer loop.
>
>It would be a simple to do if you were writing it for a different
>langauge was a lot less dynamic than Python is. It'd be quite a
>complex hack to add it to CPython's compiler while maintaing the
>current highly dynamic runtime semantics and backwards compatibility,
>which is a design constraint of Python whether you like it or not.

In your other message, you said this wasn't a legitimate CPython
complaint. Here, you say that it would be a "complex hack" to implement
this in CPython. "complex hack" has negative connotations in my mind.
This seems contradictory to me.

>
>And all this complaining about an issue that can be worked around by
>xrange instead of range. Sheesh.

Sure. The specific issue of range vs xrange is quite a small one.
There is a slightly larger one regarding the flexibility (or relative
lack thereof) of the CPython runtime, though.

Jean-Paul

exa...@twistedmatrix.com

unread,

Aug 17, 2009, 7:43:15 AM8/17/09

to Carl Banks, pytho...@python.org

On 01:53 am, pavlove...@gmail.com wrote:
>On Aug 16, 6:28 pm, exar...@twistedmatrix.com wrote:
>>On 01:23 am, benjamin.kap...@case.edu wrote:
>>
>> >On Sun, Aug 16, 2009 at 6:35 PM, sturlamolden <sturlamol...@yahoo.no>
>> >wrote:
>>
>> >>A compiler could easily recognise a statement like
>>
>> >> for i in range(n):
>>
>> >>as a simple integer loop. In fact, Cython is able to do this.
>>
>> >but special cases aren't special enough to break the rules.
>>
>>Although I think PyPy also recognizes this case and makes it as
>>efficient as using xrange, and does so without breaking any rules.
>
>PyPy uses a JIT compiler (which is still slower than CPython,
>suggesting that perhaps they should have spent more time optimizing
>the general case than optimizing for an easily avoided special case).

It's true that PyPy has a JIT compiler (which is still slower than
CPython). However, this isn't where the range speedup comes from. The
range optimization is handled specifically in the implementation of
range (or possibly of list, or a combination of the two, I can't
remember exactly). It's effective even when the JIT compiler is
disabled.

>
>>There *are* _some_ legitimate complaints to be made about the CPython
>>runtime. :)
>
>This isn't one of them.
>
>xrange has been part of Python for 10 years.
>
>If there are any complaints to be made about this situation it's that
>there are any 2.x learning materials anythere that continue to use
>range() and not xrange() in this context.

Jean-Paul

exa...@twistedmatrix.com

unread,

Aug 17, 2009, 7:44:35 AM8/17/09

to pytho...@python.org

It doesn't really need to. The optimization isn't applied when the
compiler sees the name "range" being called. It's applied after the
object the default builtin name "range" is bound to is called.

Jean-Paul

sturlamolden

unread,

Aug 17, 2009, 8:37:29 AM8/17/09

to

On 16 Aug, 19:12, Carl Banks <pavlovevide...@gmail.com> wrote:

> If you don't care about the dynamic stuff why don't you just use
> Cython? Or quit complaining and just use xrange.

I think you are the only one complaining here.

MRAB

unread,

Aug 17, 2009, 9:01:19 AM8/17/09

to pytho...@python.org

exa...@twistedmatrix.com wrote:
> On 02:12 am, pavlove...@gmail.com wrote:

>> On Aug 16, 3:35�pm, sturlamolden <sturlamol...@yahoo.no> wrote:
>>> On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>>>
>>> > � � � � Well, the alternative would be to have two keywords for
>>> looping: one
>>> > for your "simple" incrementing integer loop, and another for a loop
>>> that
>>> > operates over the elements of some collection type.
>>>
>>> A compiler could easily recognise a statement like
>>>
>>> � �for i in range(n):
>>>
>>> as a simple integer loop.
>>
>> It would be a simple to do if you were writing it for a different
>> langauge was a lot less dynamic than Python is. It'd be quite a
>> complex hack to add it to CPython's compiler while maintaing the
>> current highly dynamic runtime semantics and backwards compatibility,
>> which is a design constraint of Python whether you like it or not.
>

> In your other message, you said this wasn't a legitimate CPython
> complaint. Here, you say that it would be a "complex hack" to implement
> this in CPython. "complex hack" has negative connotations in my mind.
> This seems contradictory to me.
>>

>> And all this complaining about an issue that can be worked around by
>> xrange instead of range. Sheesh.
>

> Sure. The specific issue of range vs xrange is quite a small one. There
> is a slightly larger one regarding the flexibility (or relative lack
> thereof) of the CPython runtime, though.
>

A possible solution would be to make 'range' return an instance of, say,
'RangeList', a subclass of list which would behave like 'list'
externally but create the list of values internally only when needed.

Stefan Behnel

unread,

Aug 17, 2009, 9:20:31 AM8/17/09

to

Carl Banks wrote:
> On Aug 16, 3:35 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
>> On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>>
>>> Well, the alternative would be to have two keywords for looping: one
>>> for your "simple" incrementing integer loop, and another for a loop that
>>> operates over the elements of some collection type.
>> A compiler could easily recognise a statement like
>>
>> for i in range(n):
>>
>> as a simple integer loop.
>>

>> In fact, Cython is able to do this.
>
> Cython can do this easily because it is a different language that is a
> lot less dynamic than Python.

Actually, Cython is able to do this because it knows the global scope of a
module. The only thing that this optimisation makes impossible when you
compile your module using Cython is to inject builtins into the module
namespace *after* the compilation, either by assigning module attributes or
by importing the module into a custom namespace.

Given that both use cases are extremely rare, it was decided that
optimisations like this are more important than the ability to redefine the
most common builtins (such as 'range' or 'enumerate'). So, in a way, Cython
really makes them "builtins".

Stefan

Steven D'Aprano

unread,

Aug 17, 2009, 9:23:14 AM8/17/09

to

"Easily" huh? Would you like to put a small wager on that?

Here is an unedited copy-and-paste of the last few lines of an
interactive Python session:

>>> for i in range(5):
... print i
...
0
1
2
Surprise!
4
>>> __builtins__.range is range
True

Can you determine how I did this? How would the compiler avoid this? If
you can find a method for the compiler to avoid surprises like this, why
do you think it would be valuable to add all that extra complexity to the
language? (As opposed to an external tool like Cython.)

--
Steven

Stefan Behnel

unread,

Aug 17, 2009, 9:24:17 AM8/17/09

to

Cython allows you to do that. You just have to do it inside the module. If
Cython determines that "range" refers the global builtin name, it will
enable the loop optimisation. If you assign anything to that global name
inside your module (even the range function itself), this will disable the
optimisation.

Stefan

David Robinow

unread,

Aug 17, 2009, 10:04:14 AM8/17/09

to pytho...@python.org

On Sun, Aug 16, 2009 at 11:10 PM, Nobody<nob...@nowhere.com> wrote:
> Java also has iterators; it's more a case of people coming from C and BASIC.
>
> Although, some of those may have come *through* Java without abandoning
> old habits. You see the same thing with people coming from BASIC to C and
> writing:
>
> #define NUM_DATES 50
> int day[NUM_DATES], month[NUM_DATES], year[NUM_DATES];
>
> rather than defining a "struct".
>
> Sometimes referred to as "I know ten languages and can write in BASIC in
> all of them".

Ha, ha. I learned that pattern in Fortran. I confess to having written
code like that in C. I think I've gotten over it.

Ethan Furman

unread,

Aug 17, 2009, 12:24:26 PM8/17/09

to pytho...@python.org

Emmanuel Surleau wrote:
>>Dr. Phillip M. Feldman wrote:
>
>

> [snip]
>
>
>>>def is_prime(n):
>>> for j in range(2,n):

>>> if (n % j) == 0: return False

>>> return True

>>>
>>>It seems as though Python is actually expanding range(2,n) into a list of
>>>numbers, even though this is incredibly wasteful of memory. There should
>>>be a looping mechanism that generates the index variable values
>>>incrementally as they are needed.
>>

[snip]

>>
>>I will also observe that if you were to stop programming whatever
>>language you are more familiar with in Python, and start programming
>>Python in Python, you'll have an easier time of it.
>
>

> I don't see what's particularly un-Pythonic with this code. Not using xrange()
> is a mistake, certainly, but it remains clear, easily understandable code
> which correctly demonstrates the naive algorithm for detecting whether n is a
> prime. It doesn't call for condescension
>

[snip]
>
> Cheers,
>
> Emm

My comment about programming Python in Python was geared more towards
the subject line than the actual code, and the biases evident in his
comments in both this thread and earlier ones.

~Ethan~

Carl Banks

unread,

Aug 17, 2009, 2:32:29 PM8/17/09

to

On Aug 17, 4:40 am, exar...@twistedmatrix.com wrote:

> On 02:12 am, pavlovevide...@gmail.com wrote:
>
>
>
> >On Aug 16, 3:35 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
> >>On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>
> >> > Well, the alternative would be to have two keywords for
> >>looping: one
> >> > for your "simple" incrementing integer loop, and another for a loop
> >>that
> >> > operates over the elements of some collection type.
>
> >>A compiler could easily recognise a statement like
>
> >> for i in range(n):
>
> >>as a simple integer loop.
>
> >It would be a simple to do if you were writing it for a different
> >langauge was a lot less dynamic than Python is. It'd be quite a
> >complex hack to add it to CPython's compiler while maintaing the
> >current highly dynamic runtime semantics and backwards compatibility,
> >which is a design constraint of Python whether you like it or not.
>
> In your other message, you said this wasn't a legitimate CPython
> complaint. Here, you say that it would be a "complex hack" to implement
> this in CPython. "complex hack" has negative connotations in my mind.
> This seems contradictory to me.

Well, you missed the point, chief.

It's not a legitimate complaint because you can use xrange, you don't
need compiler magic to recognize and optimize range.

Carl Banks

exa...@twistedmatrix.com

unread,

Aug 17, 2009, 3:41:55 PM8/17/09

to Carl Banks, pytho...@python.org

There's a lot of things in Python that I don't strictly *need*. That
doesn't mean that they wouldn't be welcome if I could have them.
Getting rid of the range/xrange dichotomy would improve things. Yes, I
can work around it until the runtime is good enough to let me think
about an *interesting* problem instead. That makes it a legitimate
complaint in my eyes. You're welcome to disagree, of course, but do you
have an argument more compelling than the one you give here? It seems
to me one could use it to argue a lot of Python out of existence.
Chief. (Seriously, "chief"? What are you going for? It sounds like
condescension to me; what's the point of that? I hope I'm just
misreading you.)

Jean-Paul

Carl Banks

unread,

Aug 17, 2009, 3:59:43 PM8/17/09

to

On Aug 17, 12:41 pm, exar...@twistedmatrix.com wrote:
> There's a lot of things in Python that I don't strictly *need*. That
> doesn't mean that they wouldn't be welcome if I could have them.
> Getting rid of the range/xrange dichotomy would improve things. Yes, I
> can work around it until the runtime is good enough to let me think
> about an *interesting* problem instead.

You don't have to think about using xrange in a for loop, you just
always use it.

> That makes it a legitimate
> complaint in my eyes. You're welcome to disagree, of course, but do you
> have an argument more compelling than the one you give here?

I am not arguing in favor of range/xrange, I am saying that it's silly
to complain that the compiler isn't a whole lot more complex than it
is just so it can implemnent a semantically-diconnected special case
just so that you can avoid typing an extra "x". The cost doesn't even
remotely justify it.

Carl Banks

unread,

Aug 17, 2009, 4:00:44 PM8/17/09

to

On Aug 17, 12:59 pm, Carl Banks <pavlovevide...@gmail.com> wrote:
> The cost doesn't even remotely justify it.

I mean, it doesn't even remotely justify the cost.

Carl Banks

Terry Reedy

unread,

Aug 17, 2009, 11:56:58 PM8/17/09

to pytho...@python.org

exa...@twistedmatrix.com wrote:

> There's a lot of things in Python that I don't strictly *need*. That
> doesn't mean that they wouldn't be welcome if I could have them. Getting
> rid of the range/xrange dichotomy would improve things.

The developers agreed a couple of years ago. Starting using 3.1 if you
want this.

Since 'range' could refer to a user-defined object, rather than the
builtin function, there is no way the interpreter should substitute
'xrange'.

tjr

exa...@twistedmatrix.com

unread,

Aug 18, 2009, 7:45:18 AM8/18/09

to Terry Reedy, pytho...@python.org

On 03:56 am, tjr...@udel.edu wrote:
>exa...@twistedmatrix.com wrote:
>>There's a lot of things in Python that I don't strictly *need*. That
>>doesn't mean that they wouldn't be welcome if I could have them.
>>Getting rid of the range/xrange dichotomy would improve things.
>
>The developers agreed a couple of years ago. Starting using 3.1 if you
>want this.

And there was much rejoicing, et cetera.

>Since 'range' could refer to a user-defined object, rather than the
>builtin function, there is no way the interpreter should substitute
>'xrange'.

See the earlier parts of this thread for the reasons this isn't true. :)

Jean-Paul