I wrote the following correct but inefficient test of primality for purposes of demonstrating that the simplest algorithm is often not the most efficient. But, when I try to run the following code with a value of n that is large enough to produce a significant amount of running time, I get an out-of-memory error!
def is_prime(n): for j in range(2,n): if (n % j) == 0: return False return True
It seems as though Python is actually expanding range(2,n) into a list of numbers, even though this is incredibly wasteful of memory. There should be a looping mechanism that generates the index variable values incrementally as they are needed. -- View this message in context: http://www.nabble.com/Python-%27for%27-loop-is-memory-inefficient-tp2... Sent from the Python - python-list mailing list archive at Nabble.com.
> I wrote the following correct but inefficient test of primality for purposes > of demonstrating that the simplest algorithm is often not the most > efficient. But, when I try to run the following code with a value of n that > is large enough to produce a significant amount of running time, I get an > out-of-memory error!
> def is_prime(n): > for j in range(2,n): > if (n % j) == 0: return False > return True
> It seems as though Python is actually expanding range(2,n) into a list of > numbers, even though this is incredibly wasteful of memory. There should be > a looping mechanism that generates the index variable values incrementally > as they are needed.
I have a strong suspicion that you will find hints in the Python documentation that this has already been addressed. Perhaps you could try reading before posting?
> I wrote the following correct but inefficient test of primality for purposes > of demonstrating that the simplest algorithm is often not the most > efficient. But, when I try to run the following code with a value of n that > is large enough to produce a significant amount of running time, I get an > out-of-memory error!
> def is_prime(n): > for j in range(2,n): > if (n % j) == 0: return False > return True
> It seems as though Python is actually expanding range(2,n) into a list of > numbers, even though this is incredibly wasteful of memory. There should be > a looping mechanism that generates the index variable values incrementally > as they are needed.
You already have an answer to the range issue, so I will only add that putting a loop inside another loop is a decent way to expand the time taken.
I will also observe that if you were to stop programming whatever language you are more familiar with in Python, and start programming Python in Python, you'll have an easier time of it.
The Dive Into Python is an excellent start for that.
On Aug 14, 8:25 pm, "Dr. Phillip M. Feldman" <pfeld...@verizon.net> wrote:
> I wrote the following correct but inefficient test of primality for purposes > of demonstrating that the simplest algorithm is often not the most > efficient. But, when I try to run the following code with a value of n that > is large enough to produce a significant amount of running time, I get an > out-of-memory error!
I don't think Python was created to keep mathematician's happy. Although strangely enough, the creator holds a masters degree in mathematics *and* computer science. Just another one of life's little ironies ;)
On Saturday 15 August 2009 03:25:45 Dr. Phillip M. Feldman wrote:
> It seems as though Python is actually expanding range(2,n) into a list of > numbers, even though this is incredibly wasteful of memory. There should be > a looping mechanism that generates the index variable values incrementally > as they are needed.
There is.
Use xrange instead of range, and try again.
And while you are about it, you may as well teach them that it is much better to do a multiplication than a division.
> Dr. Phillip M. Feldman wrote:> I wrote the following correct but inefficient test of primality for purposes > > of demonstrating that the simplest algorithm is often not the most > > efficient. But, when I try to run the following code with a value of n that > > is large enough to produce a significant amount of running time, I get an > > out-of-memory error!
> > def is_prime(n): > > for j in range(2,n): > > if (n % j) == 0: return False > > return True
> > It seems as though Python is actually expanding range(2,n) into a list of > > numbers, even though this is incredibly wasteful of memory. There should be > > a looping mechanism that generates the index variable values incrementally > > as they are needed.
> I have a strong suspicion that you will find hints in the Python > documentation that this has already been addressed. Perhaps you could > try reading before posting?
Alternative hypothesis: the good doctor read the Python 3.x documentation but absent-mindedly ran Python 2.x
On Fri, 14 Aug 2009 18:25:45 -0700, Dr. Phillip M. Feldman wrote:
> It seems as though Python is actually expanding range(2,n) into a list > of numbers, even though this is incredibly wasteful of memory. There > should be a looping mechanism that generates the index variable values > incrementally as they are needed.
Others have already pointed out to you that xrange() (for Python 2.x) does what you want. In Python 3.x, the old range() is gone for good, and xrange() renamed to just range().
However, I'd like to point out that your subject line is fundamentally incorrect. What you've discovered has *nothing* to do with for-loops: the for-loop will happily iterate over whatever object you pass to it (or at least try to, because not all objects are iterable). You might be thinking that Python requires you to write for-loops as
for i in range(...):
but that's not correct. The for-loop doesn't care what object comes after the "in" and before the ":", so long as it can iterate over it one item at a time:
for i in [1, 3, 4, 5, 2, 0, -1, 8, 7, 6]: print i
for c in "abcd": print c
and many other variations will work perfectly fine.
The memory-consumption you have discovered is a property of the range() function, not the for-loop:
Hendrik van Rooyen wrote: > On Saturday 15 August 2009 03:25:45 Dr. Phillip M. Feldman wrote: > And while you are about it, you may as well teach them that it is much better > to do a multiplication than a division.
Actually, division speed hasn't been much of an issue in years. Arithmetic has been faster than memory access since CPUs went superscalar. In CPython, with all the interpreter overhead, you'll probably never notice the difference.
I'd thought "xrange" had already been obsoleted, and hadn't realized the 2.3 - 2.6 versions of CPython were still doing "range" the hard way, not as a generator.
John Nagle wrote: > Hendrik van Rooyen wrote: >> On Saturday 15 August 2009 03:25:45 Dr. Phillip M. Feldman wrote:
>> And while you are about it, you may as well teach them that it is much >> better to do a multiplication than a division.
> Actually, division speed hasn't been much of an issue in years. > Arithmetic > has been faster than memory access since CPUs went superscalar. In > CPython, > with all the interpreter overhead, you'll probably never notice the > difference.
> I'd thought "xrange" had already been obsoleted, and hadn't realized the > 2.3 - 2.6 versions of CPython were still doing "range" the hard way, not > as a generator.
It would've broken some existing code, hence it was left until Python 3.
> > def is_prime(n): > > for j in range(2,n): > > if (n % j) == 0: return False > > return True
> > It seems as though Python is actually expanding range(2,n) into a list of > > numbers, even though this is incredibly wasteful of memory. There should > > be a looping mechanism that generates the index variable values > > incrementally as they are needed.
> You already have an answer to the range issue, so I will only add that > putting a loop inside another loop is a decent way to expand the time > taken.
> I will also observe that if you were to stop programming whatever > language you are more familiar with in Python, and start programming > Python in Python, you'll have an easier time of it.
I don't see what's particularly un-Pythonic with this code. Not using xrange() is a mistake, certainly, but it remains clear, easily understandable code which correctly demonstrates the naive algorithm for detecting whether n is a prime. It doesn't call for condescension
> The Dive Into Python is an excellent start for that.
I never read it, but I was under the impression that the tutorial on python.org was geared toward programmers moving to Python from other languages. It was also my impression that Dive Into Python is rather outdated and occasionally inaccurate (based on comments on this mailing list).
On Sun, 16 Aug 2009 08:30:54 +0200, Emmanuel Surleau wrote:
[...]
>> I will also observe that if you were to stop programming whatever >> language you are more familiar with in Python, and start programming >> Python in Python, you'll have an easier time of it.
> I don't see what's particularly un-Pythonic with this code. Not using > xrange() is a mistake, certainly, but it remains clear, easily > understandable code which correctly demonstrates the naive algorithm for > detecting whether n is a prime. It doesn't call for condescension
It's a particular unfair criticism because the critic (Ethan Furman) appears to have made a knee-jerk reaction. The "some language in Python" behaviour he's reacting to is the common idiom:
for i in range(len(seq)): do_something_with(seq[i])
instead of the "Python in Python" idiom:
for obj in seq: do_something_with(obj)
That's a reasonable criticism, but *not in the case*. Ethan appears to have made the knee-jerk reaction "for i in range() is Bad" without stopping to think about what this specific piece of code is actually doing.
(Replace 'obj' with 'j', 'seq' with 'range(2, n)', and 'do_something_with' with 'if (n % j) == 0: return False', and you have the exact same code pattern.)
> It's a particular unfair criticism because the critic (Ethan Furman) > appears to have made a knee-jerk reaction. The "some language in Python" > behaviour he's reacting to is the common idiom:
> for i in range(len(seq)): > do_something_with(seq[i])
> instead of the "Python in Python" idiom:
> for obj in seq: > do_something_with(obj)
> That's a reasonable criticism, but *not in the case*. Ethan appears to > have made the knee-jerk reaction "for i in range() is Bad" without > stopping to think about what this specific piece of code is actually > doing.
> (Replace 'obj' with 'j', 'seq' with 'range(2, n)', and > 'do_something_with' with 'if (n % j) == 0: return False', and you have > the exact same code pattern.)
Fair enough. But as far as I know, for i in (x)range(num) is the canonical way to iterate over numbers in Python.
Another case of lack of RTFM* before answering, I suppose.
> I don't see what's particularly un-Pythonic with this code. Not using xrange() > is a mistake, certainly, but it remains clear, easily understandable code > which correctly demonstrates the naive algorithm for detecting whether n is a > prime. It doesn't call for condescension
It's not that the code is bad, but too many people coming from Java and C keep thinking of for loops like they're using Java or C and therefore that "for i in range(a,b)" is identical to "for(int i = a; i < b; i++)". It's not and, for the most part, you shouldn't code like that. Since you're using numbers, range/xrange is the appropriate idiom but you still have to remember that a for loop in python doesn't loop until a condition is met, it loops through an iterator until the interator says it's done.
> On Fri, 14 Aug 2009 18:25:45 -0700, Dr. Phillip M. Feldman wrote:
>> It seems as though Python is actually expanding range(2,n) into a list >> of numbers, even though this is incredibly wasteful of memory. There >> should be a looping mechanism that generates the index variable values >> incrementally as they are needed.
> Others have already pointed out to you that xrange() (for Python 2.x) > does what you want. In Python 3.x, the old range() is gone for good, and > xrange() renamed to just range().
It does seem rather worrying that whoever originally thought up Python decided it would be a good idea to implement a simple 1 to N (or 0 to N-1) for-loop by first creating an array of N consecutive integers!
They must have known Python was going to be slow anyway, but this wouldn't have helped the speed any. Nor the memory consumption.
A for-loop, for iterating over a simple sequence, should be one of the fastest things in the language.
[Presumably the internal code that created those consecutive integers used a more conventional looping method...]
>>> It seems as though Python is actually expanding range(2,n) into a list >>> of numbers, even though this is incredibly wasteful of memory. There >>> should be a looping mechanism that generates the index variable values >>> incrementally as they are needed.
>> Others have already pointed out to you that xrange() (for Python 2.x) >> does what you want. In Python 3.x, the old range() is gone for good, and >> xrange() renamed to just range().
> It does seem rather worrying that whoever originally thought up Python > decided it would be a good idea to implement a simple 1 to N (or 0 to > N-1) for-loop by first creating an array of N consecutive integers!
On 16 Aug, 11:45, "bartc" <ba...@freeuk.com> wrote:
> A for-loop, for iterating over a simple sequence, should be one of the > fastest things in the language.
Anyone experienced with interpreted high-level languages knows this is not true. Not because iterating a sequence is expensive, but because the interpreter is constantly executing the body of the loop. There is a reason why Matlab and Python/NumPy programmers rely heavily on vectorized array expressions for efficient numerics.
The only thing more evil than a for-loop is recursive function calls.
This does not mean Python is slow. It just means you have to program differently.
On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> Well, the alternative would be to have two keywords for looping: one > for your "simple" incrementing integer loop, and another for a loop that > operates over the elements of some collection type.
A compiler could easily recognise a statement like
for i in range(n):
as a simple integer loop. In fact, Cython is able to do this.
exar...@twistedmatrix.com writes: > Although I think PyPy also recognizes this case and makes it as > efficient as using xrange, and does so without breaking any rules.
How can pypy possibly know that the user hasn't assigned some other value to "range"?
> >On Sun, Aug 16, 2009 at 6:35 PM, sturlamolden <sturlamol...@yahoo.no> > >wrote:
> >>A compiler could easily recognise a statement like
> >> for i in range(n):
> >>as a simple integer loop. In fact, Cython is able to do this.
> >but special cases aren't special enough to break the rules.
> Although I think PyPy also recognizes this case and makes it as > efficient as using xrange, and does so without breaking any rules.
PyPy uses a JIT compiler (which is still slower than CPython, suggesting that perhaps they should have spent more time optimizing the general case than optimizing for an easily avoided special case).
> There *are* _some_ legitimate complaints to be made about the CPython > runtime. :)
This isn't one of them.
xrange has been part of Python for 10 years.
If there are any complaints to be made about this situation it's that there are any 2.x learning materials anythere that continue to use range() and not xrange() in this context.
On Aug 16, 3:35 pm, sturlamolden <sturlamol...@yahoo.no> wrote:
> On 16 Aug, 14:57, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> > Well, the alternative would be to have two keywords for looping: one > > for your "simple" incrementing integer loop, and another for a loop that > > operates over the elements of some collection type.
> A compiler could easily recognise a statement like
> for i in range(n):
> as a simple integer loop.
It would be a simple to do if you were writing it for a different langauge was a lot less dynamic than Python is. It'd be quite a complex hack to add it to CPython's compiler while maintaing the current highly dynamic runtime semantics and backwards compatibility, which is a design constraint of Python whether you like it or not.
And all this complaining about an issue that can be worked around by xrange instead of range. Sheesh.
> In fact, Cython is able to do this.
Cython can do this easily because it is a different language that is a lot less dynamic than Python.
If you don't care about the dynamic stuff why don't you just use Cython? Or quit complaining and just use xrange.
On Sun, 16 Aug 2009 11:41:21 -0400, Benjamin Kaplan wrote: > It's not that the code is bad, but too many people coming from Java > and C keep thinking of for loops like they're using Java or C and > therefore that "for i in range(a,b)" is identical to "for(int i = a; i > < b; i++)". It's not and, for the most part, you shouldn't code like > that. Since you're using numbers, range/xrange is the appropriate > idiom but you still have to remember that a for loop in python doesn't > loop until a condition is met, it loops through an iterator until the > interator says it's done.
Java also has iterators; it's more a case of people coming from C and BASIC.
Although, some of those may have come *through* Java without abandoning old habits. You see the same thing with people coming from BASIC to C and writing:
#define NUM_DATES 50 int day[NUM_DATES], month[NUM_DATES], year[NUM_DATES];
rather than defining a "struct".
Sometimes referred to as "I know ten languages and can write in BASIC in all of them".