Thanks in advance,
Richard Bow
======================================================================
# This finds integers n for which there are 2 pairs of positive integers,
#(a,b) and (c,d) such that (a**3) + (b**3) = n, and (c**3) + (d**3) = n.
# see http://www.mathpages.com/home/kmath028.htm
start = 1000000
end = 2000000
import time
t1= time.time()
for n in range(start, end + 1):
crn = int(n**(1/3.0)) + 2 # because of float probs, add 2 to be sure
list = [] # crn is large enough
for a in range(1,crn):
a3 = a ** 3
for b in range(a,crn):
b3 = b ** 3
if a3 + b3 == n:
list = list + [(n,a,b)]
if len(list) >= 2:
# e.g. [(1729, 1, 12), (1729, 9, 10)] and
# [(994688, 29, 99), (994688, 60, 92)]
print list
print
t2 = time.time()
print "Time's up!", t2 - t1, "seconds have elapsed"
print "EEEEEEEEEnnnnnnnnnnddddddddd"
##total for n = 1 thru n = 1,000,000 is 43 integers
==========================================================
> I'm hoping to get some hints for significantly speeding up the below
> script. I've found 43 integers between 1 and 1,000,000 that satisfy
> the
> condition, and would like to push on toward 10,000,000 in hopes of
> finding
> an integer for which there are 3 pairs. Of course, another reason for
> asking is to learn more Python from you guys.
One obvious optimization is that you're using range with huge values,
which is going to be expensive -- range returns a list, so calling it
with large values will create an enormous list, which is wasteful if all
you plan on doing is enumerating it. Try using xrange, or iterating
manually instead.
--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/ \ The doors of Heaven and Hell are adjacent and identical.
\__/ Nikos Kazantzakis
Physics reference / http://www.alcyone.com/max/reference/physics/
A physics reference.
> I'm hoping to get some hints for significantly speeding up the below
> script. I've found 43 integers between 1 and 1,000,000 that satisfy the
> condition, and would like to push on toward 10,000,000 in hopes of finding
> an integer for which there are 3 pairs. Of course, another reason for
> asking is to learn more Python from you guys.
hi,
without looking at your algorithme, it's typical think where psyco will
help you :
without psyco :
[(13832, 2, 24), (13832, 18, 20)]
Time's up! 16.6103349924 seconds have elapsed
EEEEEEEEEnnnnnnnnnnddddddddd
with psyco :
[(13832, 2, 24), (13832, 18, 20)]
Time's up! 2.99753201008 seconds have elapsed
EEEEEEEEEnnnnnnnnnnddddddddd
i try with python2.2 and last psyco CVS on my pII350
import psyco
start = 10000
end = 20000
def f():
for n in range(start, end + 1):
crn = int(n**(1/3.0)) + 2 # because of float probs, add 2 to be sure
list = [] # crn is large enough
for a in range(1,crn):
a3 = a ** 3
for b in range(a,crn):
b3 = b ** 3
if a3 + b3 == n:
list = list + [(n,a,b)]
if len(list) >= 2:
# e.g. [(1729, 1, 12), (1729, 9, 10)] and
# [(994688, 29, 99), (994688, 60, 92)]
print list
print
psyco.bind(f)
import time
t1= time.time()
f()
t2 = time.time()
print "Time's up!", t2 - t1, "seconds have elapsed"
print "EEEEEEEEEnnnnnnnnnnddddddddd"
--
William Dodé - flibuste.net
http://wikipython.tuxfamily.org
> One obvious optimization is that you're using range with huge values,
> which is going to be expensive -- range returns a list, so calling it
> with large values will create an enormous list, which is wasteful if all
> you plan on doing is enumerating it. Try using xrange, or iterating
> manually instead.
Well, range does return a list, but the list is reinitialized to [] for
each n, so it never has more than 2 (and hoping for 3) objects (so far,
i.e, up to n = 1,200,000).
Could you explain what you mean by "iterating manually"?
I'll take a look at xrange.
Thanks,
Richard Bow
Well, looking at the algorithm, there's no reason to loop on values of c,
then loop on values of a, then loop on values of b. Why not loop on c,
then on a, then find whether there's a value of b that will satisfy the
condition?
Your program on my system, no psyco:
Time's up! 6.24721693993 seconds have elapsed
My program on my system, no psyco:
(13832, [(2, 24), (18, 20)])
1
real 0m1.645s
(both on 10000-20000)
from __future__ import generators
r = range
def f(rr, n):
for c in rr:
l = []
for a in r(1, int(c**(1/3.0))+2):
a3 = a**3
if a3 > c: break
b = int((c - a**3)**(1./3)+.5)
if b < a: break
#print a, b, c
if a**3 + b**3 == c:
l.append((a, b))
if len(l) == n:
yield c, l
break
def main():
count = 0
for item in f(r(10000, 20000), 2):
count = count + 1
print item
print count
main()
IMO this program is the kind of inner loop that should be written in C
(or another typed language with no overhead on arithmetic) for efficiency.
My earlier algorithm (the one that doesn't loop on 'b'):
My program on my system, no psyco:
(13832, [(2, 24), (18, 20)])
1
real 0m1.645s
The same algorithm, implemented in C:
13832 (2 24) (18 20)
real 0m0.101s
psyco gets you closer (probably), but it still doesn't offer a factor of 16.
I'll admit that the "cubert" function makes me a bit uneasy (as its
corresponding part did in the python version) but it's a small matter of
making that function correct for all integer values...
#include <math.h>
#include <stdio.h>
#define N 2
unsigned cubert(unsigned int u) {
return (unsigned)(pow(u, 1./3.)+.5);
}
unsigned f(unsigned low, unsigned high) {
unsigned pairs[N][2];
unsigned n, a, b, c, cu;
for(c=low; c<high; c++) {
n = 0;
cu = cubert(c)+2;
for(a=1; a<cu; a++) {
b = cubert(c-a*a*a);
if (b < a) break;
if (a*a*a+b*b*b == c) {
pairs[n][0] = a;
pairs[n][1] = b;
n++;
if(n == N) {
printf("%u", c);
for(n=0; n<N; n++) {
printf(" (%u %u)", pairs[n][0], pairs[n][1]);
}
printf("\n");
break;
}
}
}
}
}
int main(int argc, char **argv) {
f(10000, 20000);
//f(1729, 1730);
}
First, precalculate a list of cubes instead of cubing over and over.
Next, calculate sums within particular range (small enough to not
oerflow memory).
Store as dict(sum) = (a,b) or list.append( (sum, a, b)).
If dict, easy to spot duplicate sums.
If list, sort and scan for dups.
Terry J. Reedy
def foo(end, min_pairs=2):
cubes = [0]
i = 1
while i**3 < end:
cubes.append(i**3)
i = i + 1
table = {}
for j in range(1, len(cubes)):
for k in range(j, len(cubes)):
sum = cubes[j] + cubes[k]
if not table.has_key(sum):
table[sum] = []
table[sum].append((j, k))
for sum in table.keys():
if len(table[sum]) >= min_pairs:
print sum, table[sum]
Regarding the question of three pairs,
>>> foo(10000000,3)
>>> foo(100000000,3)
87539319 [(167, 436), (228, 423), (255, 414)]
martin
I'm going to cheat by using a better algorithm. This finds all the results
up to 10,000,000 in about half a second on my box (under current CVS Python,
which is a bit zippier). You'll find going up to 100,000,000 more
interesting (which takes longer than an eyeblink, but a lot less time than
it's taking me to add this gratuitous parenthetical comment <wink>).
The key gimmick: The cube root of N is a *hell* of a lot smaller than N.
So rather than look at each N and then look for all ways of expressing N as
the sum of two cubes, instead loop over all cubes and figure out all the N
that can be gotten via adding two cubes.
This is Rule #1 of Python optimization:
Any Python program using a dict is 1000x faster than any C program.
There are no exceptions to this rule <ahem>.
def drive(start, end):
largest_root = int(end ** 0.333)
while (largest_root + 1) ** 3 < end:
largest_root += 1
cubes = [i**3 for i in xrange(1, largest_root + 1)]
ncubes = len(cubes)
n2cubes = {} # map n to list of (cube1, cube2)
for i in xrange(ncubes):
cube1 = cubes[i]
for j in xrange(i, ncubes):
cube2 = cubes[j]
n = cube1 + cube2
if n > end:
break
n2cubes.setdefault(n, []).append((cube1, cube2))
for n, pairs in n2cubes.iteritems():
if len(pairs) >= 2:
print n, pairs
For those who no longer have this thread on their newsgroup servers, please
see http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-
8&safe=off&th=8f0d333d0a2ed43b&rnum=2
or http://makeashorterlink.com/?L25C25732
Richard Bow
Well, brute-force method is not that bad.
Since you are calculating the pairs
a^3 + b^3 = n, for n=[1,2000000]
where
a=1, b=1,2,3,...,126
a=2, b=2,3,4,...,126
a=3, b=3,4,5,...,126
...
anyways, write a Python script to print them out. Then, 'sort' and 'uniq'.
For example, consider script 'abn.py',
maxn=2e6
maxi=math.ceil(maxn**(1/3.0))
for a in range(1, maxi+1):
for b in range(a, maxi+1):
n = a**3 + b**3
if n > maxn: break
print a, b, n
then,
python abn.py | sort -n -k 3 | uniq -dc -f 2 > abn
spits out 64 integers in 0.23 second on my dual-P3/800. And, going to
maxn=10e6
gives 150 integers in 0.65 second, and
maxn=1e9
gives 1554 integers in 14.7 seconds.
You can search for 3 pairs just as easily. There 8 integers with 3 pairs
for n < 1e9.
--
William Park, Open Geometry Consulting, <openge...@yahoo.ca>
Linux solution for data management and processing.
1986 VW Jetta, 350000km :-))