improving a huge double-for cycle

Alexzive

unread,

Sep 18, 2008, 8:25:02 AM9/18/08

to

Hello there :) ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:
SN.append(IN[i].label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!! :(

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
if IN[i].coordinates[1] == IN[j].coordinates[1]:

but no improvements.

Many thanks, Alex

sk...@pobox.com

unread,

Sep 18, 2008, 8:44:29 AM9/18/08

to Alexzive, pytho...@python.org

Alex> Unfortunately my len(IN) is about 100.000 and the running time
Alex> about 15h !!!! :(

Alex> Any idea to improve it?

numpy?

http://numpy.scipy.org/
http://www.scipy.org/Numpy_Example_List

More immediately, note that you are building a list of len(IN) ints every
time through the inner loop. A quick hit might be this simple change:

indexes = range(len(IN))
for i in indexes: #scan all elements of the list IN
for j in indexes:
if i != j:
if (IN[i].coordinates[0] == IN[j].coordinates[0] and
IN[i].coordinates[1] == IN[j].coordinates[1]):
SN.append(IN[i].label)

Skip

Peter Otten

unread,

Sep 18, 2008, 8:45:34 AM9/18/08

to

Alexzive wrote:

When you're looking for duplicates an efficient solution is likely to be
based on a set or dict object.

# untested
from collections import defaultdict

groups = defaultdict(list)
for item in IN:
c = item.coordinates
groups[c[0], c[1]].append(item.label)
SN = []
for labels in groups.itervalues():
if len(labels) > 1:
SN.extend(labels) # or labels[1:] if you want to keep one item

Peter

Tim Chase

unread,

Sep 18, 2008, 8:57:16 AM9/18/08

to Alexzive, pytho...@python.org

> Code: Select all
> for i in range(len(IN)): #scan all elements of the list IN
> for j in range(len(IN)):
> if i <> j:
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)
>
>
> Unfortunately my len(IN) is about 100.000 and the running time about
> 15h !!!! :(
>
> Any idea to improve it?

[snip]

> I have already tried to group the "if statements" in a single one:
>
> Code: Select all
> if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
>
> but no improvements.

It's like rearranging deck-chairs on the Titanic :) Yes, it may
give a speed up, but what's 3 seconds when you're waiting 15hr :)

Not knowing the len(IN[x].coordinates) or their structure, if
it's a list of len==2, you should be able to just do

if i <> j and IN[i].coordinates == IN[j].coordinates

or

if i <> j and IN[i].coordinates[:2] == IN[j].coordinates[:2]

However, again, this is just polish. The big problem is that you
have an O(N^2) algorithm that's killing you.

1) use xrange instead of range to save eating memory with a huge
unneeded array.

2) unless you need to append duplicate labels, you know that when
I and J are swapped, you'll reach the same condition again, so it
might be worth writing the outer loops to eliminate this
scenario, and in the process, but just starting at i+1 rather
than i, you can forgo the check if "i<>j".

Such changes might look something like

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN[i].coordinates == IN[j].coordinates:
SN.append(IN[i].label)

If my college algorithms memory serves me sufficiently, this
reduces your O(N^2) to O(N log N) which will garner you some
decent time savings.

-tkc

bearoph...@lycos.com

unread,

Sep 18, 2008, 9:13:39 AM9/18/08

to

Skip:

> indexes = range(len(IN))
> for i in indexes: #scan all elements of the list IN
> for j in indexes:

Nope, use xrange in both situations, and save a list.

Tim Chase:

> for i in xrange(len(IN)):
> for j in xrange(i+1, len(IN)):
> if IN[i].coordinates == IN[j].coordinates:
> SN.append(IN[i].label)
>
> If my college algorithms memory serves me sufficiently, this
> reduces your O(N^2) to O(N log N) which will garner you some
> decent time savings.

That's O(n^2) still, it's just half matrix, a triangle.

Bye,
bearophile

Steven D'Aprano

unread,

Sep 18, 2008, 9:30:47 AM9/18/08

to

On Thu, 18 Sep 2008 05:25:02 -0700, Alexzive wrote:

> Hello there :) ,
>
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a certain
> geometrical model.
>
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
>
> Code: Select all
> for i in range(len(IN)): #scan all elements of the list IN
> for j in range(len(IN)):
> if i <> j:
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)

Here's a better version of your algorithm, one which avoids the minor
inefficiencies but keeps the huge inefficiency:

for node1 in IN:
for node2 in IN:
if node1 is not node2:
if node1.coordinates == node2.coordinates:
SN.append(node1.label)

This assumes that node.coordinates is a type where equality is defined.
If they are a tuple or list, that should work fine.

But the huge inefficiency is that you are checking each node not once,
not twice, but 100,000 times! So you have to iterate 10,000,000,000
times, which is going to be slow no matter what you do. Especially in
pure Python.

Here's a better idea: iterate over the list once only:

seen = set()
for node in IN:
coords = tuple(node.coordinates)
if coords in seen:
SN.append(node.label)
else:
seen.add(coords)

Hope this helps.

--
Steven

Tim Chase

unread,

Sep 18, 2008, 9:39:51 AM9/18/08

to bearoph...@lycos.com, pytho...@python.org

> Tim Chase:
>> for i in xrange(len(IN)):
>> for j in xrange(i+1, len(IN)):
>> if IN[i].coordinates == IN[j].coordinates:
>> SN.append(IN[i].label)
>>
>> If my college algorithms memory serves me sufficiently, this
>> reduces your O(N^2) to O(N log N) which will garner you some
>> decent time savings.
>
> That's O(n^2) still, it's just half matrix, a triangle.

Ah, good catch as I'm thinking about it more, you're right...it's
O((N^2)/2) which is just O(N^2). Sigh. I'm not awake enough
yet. However, dividing the time by 2 (from 15hr to 7.5hr) is
still a significant savings in my book :)

However, if list-comprehensions are faster as well, you might be
able to do something like

SN = [d.label
for (i,d) in enumerate(IN)

for j in xrange(i+1, len(IN))

if d.coordinates == IN[j].coordinates
]

or the slightly more verbose (but closer to my original code) version

SN = [IN[i].label

for i in xrange(len(IN))
for j in xrange(i+1, len(IN))
if IN[i].coordinates == IN[j].coordinates

]

To the OP: As always, throw some timing code on the various
samples you get back from the list and see what works for you.

-tkc

J. Cliff Dyer

unread,

Sep 18, 2008, 10:25:07 AM9/18/08

to Tim Chase, pytho...@python.org

On Thu, 2008-09-18 at 07:57 -0500, Tim Chase wrote:
> > Code: Select all
> > for i in range(len(IN)): #scan all elements of the list IN
> > for j in range(len(IN)):
> > if i <> j:
> > if IN[i].coordinates[0] == IN[j].coordinates[0]:
> > if IN[i].coordinates[1] == IN[j].coordinates[1]:
> > SN.append(IN[i].label)
> >
> >

> Such changes might look something like
>

> for i in xrange(len(IN)):
> for j in xrange(i+1, len(IN)):
> if IN[i].coordinates == IN[j].coordinates:
> SN.append(IN[i].label)

If you aren't checking j values less than i, you might want to do both

SN.append(IN[i].label)
SN.append(IN[j].label)

on the same pass.

> If my college algorithms memory serves me sufficiently, this
> reduces your O(N^2) to O(N log N) which will garner you some
> decent time savings.
>

> -tkc
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>

prue...@latinmail.com

unread,

Sep 18, 2008, 11:18:34 AM9/18/08

to

dup=set()
SN=[]
for item in IN:
c=item.coordinates[0], item.coordinates[1]
if c in dup:
SN.append(item.label)
else:
dup.add(c)

Jake Anderson

unread,

Sep 18, 2008, 11:42:45 AM9/18/08

to Alexzive, pytho...@python.org

psyco might help a fair bit (10x-40x) here ;->
perhaps look at dumping the data into sqlite then pulling it back out. It (or the other databases) are designed for tossing around large lumps of data.
Alexzive wrote:

Hello there :) ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
    for i in range(len(IN)): #scan all elements of the list IN
      for j in range(len(IN)):
        if i <> j:
         if IN[i].coordinates[0] == IN[j].coordinates[0]:
           if IN[i].coordinates[1] == IN[j].coordinates[1]:
              SN.append(IN[i].label)



Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!! :(

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
    if i <> j and if IN[i].coordinates[0] == IN[j].coordinates[0] and
if IN[i].coordinates[1] == IN[j].coordinates[1]:


but no improvements.

Many thanks, Alex
--
http://mail.python.org/mailman/listinfo/python-list

--

Vapour Forge

Jake Anderson

Project Manager

Mobile: 0412 897 125

Email: ja...@vapourforge.com

Web Page: www.vapourforge.com

Your source for custom IT services

Tino Wildenhain

unread,

Sep 18, 2008, 11:44:21 AM9/18/08

to Alexzive, pytho...@python.org

Hi,

Alexzive wrote:
> Hello there :) ,
>
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a
> certain geometrical model.
>
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
>
> Code: Select all
> for i in range(len(IN)): #scan all elements of the list IN
> for j in range(len(IN)):
> if i <> j:
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)
>

data=dict()
for item in IN: # (what a name! ;)
data.setdefault(item.coordinates,[]).append(item)
# provided coordinates is a tuple

dupes=[items for items in data.values() if len(items)>1]

should give you a resulting list of lists of elements in IN
which occur more then once per coordinates.
You can then decide which ones to remove or whatever.

If you want to have the output only containing nodes to remove,
the following would work:

dupes=[]
for items in data.values():
dupes.extend(items[1:])

HTH
Tino

Tino Wildenhain

unread,

Sep 18, 2008, 11:53:38 AM9/18/08

to Alexzive, pytho...@python.org

Tino Wildenhain wrote:
> Hi,

>
> Alexzive wrote:
>> Hello there :) ,
>>
>> I am a python newbie and need to run following code for a task in an
>> external simulation programm called "Abaqus" which makes use of python
>> to access the mesh (ensamble of nodes with xy coordinates) of a
>> certain geometrical model.
>>
>> [IN is the starting input containing the nodes to be check, there are
>> some double nodes with the same x and y coordinates which need to be
>> removed. SN is the output containing such double nodes]
>>
>> Code: Select all
>> for i in range(len(IN)): #scan all elements of the list IN
>> for j in range(len(IN)):
>> if i <> j:
>> if IN[i].coordinates[0] == IN[j].coordinates[0]:
>> if IN[i].coordinates[1] == IN[j].coordinates[1]:
>> SN.append(IN[i].label)
>>
>

> data=dict()
> for item in IN: # (what a name! ;)
> data.setdefault(item.coordinates,[]).append(item)
> # provided coordinates is a tuple
>
> dupes=[items for items in data.values() if len(items)>1]
>
> should give you a resulting list of lists of elements in IN
> which occur more then once per coordinates.
> You can then decide which ones to remove or whatever.
>
> If you want to have the output only containing nodes to remove,
> the following would work:
>
> dupes=[]
> for items in data.values():
> dupes.extend(items[1:])
>

I did a small test - I don't know if it reflects
the actual problem good enough but it seems it
all runs below 1 sec each so this would be very
fast compared to 15h:

>>> import random
>>> class Node(object):
... def __init__(self,x,y):
... self.coordinates=(x,y)

>>> IN=[Node(random.randint(0,100),random.randint(0,100)) for i in
xrange(100000)]

>>> data=dict()
>>> for item in IN: data.setdefault(item.coordinates,[]).append(item)
...

>>> dupes=[items for items in data.values() if len(items)>1]

>>> len(dupes)
10190
>>>

Cheers
Tino

gil...@gmail.com

unread,

Sep 18, 2008, 1:26:49 PM9/18/08

to

On Sep 18, 11:18 am, prueba...@latinmail.com wrote:
> dup=set()
> SN=[]
> for item in IN:
> c=item.coordinates[0], item.coordinates[1]
> if c in dup:
> SN.append(item.label)
> else:
> dup.add(c)

+1 for O(N)

If item.coordinates is just an (x, y) pair, you can skip building c
and save a little memory:

seen_coords = set()

for node in IN:
if node.coordinates in seen_coords:
SN.append(node.label)
else:
seen_coords.add(node.coordinates)

seen_coords gets populated with references to the existing
node.coordinates objects, instead of new tuples.

Geoff G-T

Harald Luessen

unread,

Sep 18, 2008, 2:10:01 PM9/18/08

to

On Thu, 18 Sep 2008 Alexzive wrote:

>I am a python newbie and need to run following code for a task in an
>external simulation programm called "Abaqus" which makes use of python
>to access the mesh (ensamble of nodes with xy coordinates) of a
>certain geometrical model.
>
>[IN is the starting input containing the nodes to be check, there are
>some double nodes with the same x and y coordinates which need to be
>removed. SN is the output containing such double nodes]
>
>Code: Select all
> for i in range(len(IN)): #scan all elements of the list IN
> for j in range(len(IN)):
> if i <> j:
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)

I did not test the syntax, but here is an idea with sorted lists.
It should be O(NlogN).

def sk(x):
return x.coordinates[0]

IN.sort(key=sk)

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):

if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:
SN.append(IN[i].label)

else:
break

Harald

Paul Hankin

unread,

Sep 18, 2008, 4:03:38 PM9/18/08

to

A simple O(N) algorithm:

from collections import defaultdict

d = defaultdict(int)
for x in IN:
d[x] += 1

SN = [x for (x, c) in d.iteritems() if c > 1]

--
Paul Hankin

Paul Hankin

unread,

Sep 18, 2008, 4:14:50 PM9/18/08

to

On Sep 18, 2:25 pm, Alexzive <zasaconsult...@gmail.com> wrote:

> Hello there :) ,
>
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a
> certain geometrical model.
>
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
>
> Code: Select all
> for i in range(len(IN)): #scan all elements of the list IN
> for j in range(len(IN)):
> if i <> j:
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)

Using only a little extra storage to compute IN (but O(N log N)
time complexity):

import itertools

IN.sort()
SN = [k for k, v in itertools.groupby(IN) if len(list(v)) > 1]

--
Paul Hankin

Bruno Desthuilliers

unread,

Sep 18, 2008, 2:16:25 PM9/18/08

to

Harald Luessen a écrit :
(snip)

> I did not test the syntax,
> but here is an idea with sorted lists.
> It should be O(NlogN).
>
> def sk(x):
> return x.coordinates[0]
>
> IN.sort(key=sk)
> for i in xrange(len(IN)):
> for j in xrange(i+1, len(IN)):
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)
> else:
> break

The syntax is ok. Not the results, or so it seems (cf my other post in
this thread). But you still get a good point for noticing the redundant
tests in the inner loop !-)

Bruno Desthuilliers

unread,

Sep 18, 2008, 2:43:00 PM9/18/08

to

Alexzive a écrit :

> Hello there :) ,
>
> I am a python newbie and need to run following code for a task in an
> external simulation programm called "Abaqus" which makes use of python
> to access the mesh (ensamble of nodes with xy coordinates) of a
> certain geometrical model.
>
> [IN is the starting input containing the nodes to be check, there are
> some double nodes with the same x and y coordinates which need to be
> removed. SN is the output containing such double nodes]
>
> Code: Select all
> for i in range(len(IN)): #scan all elements of the list IN
> for j in range(len(IN)):
> if i <> j:
> if IN[i].coordinates[0] == IN[j].coordinates[0]:
> if IN[i].coordinates[1] == IN[j].coordinates[1]:
> SN.append(IN[i].label)
>
>
>
> Unfortunately my len(IN) is about 100.000 and the running time about
> 15h !!!! :(
>
> Any idea to improve it?

A couple ones have been submitted. Harald gets a point about the
redundant tests (even if his solution seems to be broken, cf below) -
your inner loop should have looked like :

for j in xrange(i+1, len(IN))

Now the obvious winner is pruebono - even unoptimized, using sets seems
to be *way* faster than even the most optimized corrected version of
your algorithm.

Here's a quick bench - please everyone doublecheck it to make sure it's ok:

from timeit import Timer
import random

class Node(object):
def __init__(self, x, y):
self.coordinates = (x, y)
def __repr__(self):
return repr(self.coordinates)

# how to build src:
# src = [(random.randrange(10),random.randrange(10)) for x in xrange(100)]
src = [
(4, 9), (5, 0), (6, 6), (7, 2), (3, 6), (9, 6), (0, 1), (1, 6),
(0, 5), (1, 2), (8, 9), (5, 4), (1, 6), (7, 6), (9, 1), (7, 6),
(0, 1), (7, 4), (7, 4), (8, 4), (8, 4), (3, 5), (9, 6), (6, 1),
(3, 4), (4, 5), (0, 5), (6, 3), (2, 4), (1, 6), (9, 5), (1, 2),
(5, 8), (8, 5), (3, 1), (9, 4), (9, 4), (3, 3), (4, 8), (9, 7),
(8, 4), (6, 2), (1, 5), (5, 8), (8, 6), (0, 8), (5, 2), (3, 4),
(0, 5), (4, 4), (2, 9), (7, 7), (1, 0), (4, 2), (5, 7), (0, 4),
(2, 5), (0, 8), (7, 3), (9, 1), (0, 4), (5, 0), (4, 9), (0, 6),
(3, 0), (3, 0), (3, 9), (8, 3), (7, 9), (8, 5), (7, 6), (1, 5),
(0, 6), (5, 9), (6, 8), (0, 0), (4, 1), (3, 3), (5, 4), (5, 3),
(6, 1), (5, 4), (4, 5), (5, 8), (4, 1), (3, 6), (1, 9), (0, 5),
(6, 5), (5, 5), (6, 0), (0, 9), (2, 6), (0, 7), (5, 9), (7, 3),
(7, 9), (5, 4), (4, 9), (2, 9)
]
IN = [Node(x, y) for x, y in src]

def doubles0():
SN = []

for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):

#print i, j

if i <> j:
if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:

SN.append(IN[i])
return SN

def doubles1():
"first solve an algoritmic problem"
SN = []
for i in range(len(IN)):
for j in range(i+1, len(IN)):
#print i, j
#if i != j:

if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:

SN.append(IN[i])
return SN

def doubles2():
"then avoid buildin useless lists"
SN = []

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):

if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:

SN.append(IN[i])
return SN

def doubles3():
"then avoid uselessly calling a constant operation"
SN = []
in_len = len(IN)
for i in xrange(in_len):
for j in xrange(i+1, in_len):

if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:

SN.append(IN[i])
return SN

def doubles4():
"then alias commonly used methods"
SN = []
sn_append = SN.append
in_len = len(IN)
for i in xrange(in_len):
for j in xrange(i+1, in_len):

if IN[i].coordinates[0] == IN[j].coordinates[0]:
if IN[i].coordinates[1] == IN[j].coordinates[1]:

sn_append(IN[i])
return SN

def doubles5():
"then alias a couple other things"
SN = []
sn_append = SN.append
in_len = len(IN)
for i in xrange(in_len):
node_i = IN[i]
coords_i = node_i.coordinates
for j in xrange(i+1, in_len):
if coords_i[0] == IN[j].coordinates[0]:
if coords_i[1] == IN[j].coordinates[1]:
sn_append(node_i)
return SN

def doubles6():
"then simplify tests"
SN = []
sn_append = SN.append
in_len = len(IN)
for i in xrange(in_len):
node_i = IN[i]
coords_i = node_i.coordinates
for j in xrange(i+1, in_len):
if coords_i == IN[j].coordinates:
sn_append(node_i)
return SN

# Harald : uncomment this and run test_results. As far as I can tell, it
# doesn't yields the expected results

## IN7 = IN[:]
## def sortk7(n):
## return n.coordinates[0]

## def doubles7():
## "is ordering better ? - Nope Sir, it's broken..."
## IN7.sort(key=sortk)
## SN = []
## sn_append = SN.append
## in_len = len(IN)
## for i in xrange(in_len):
## node_i = IN[i]
## coords_i = node_i.coordinates
## for j in xrange(i+1, in_len):
## if coords_i[0] == IN[j].coordinates[0]:
## if coords_i[1] == IN[j].coordinates[1]:
## sn_append(node_i)
## else:
## break

## return SN

def doubles8():
"Using a set ?"
dup = set()
SN = []

for item in IN:
c = item.coordinates

if c in dup:
SN.append(item)
else:
dup.add(c)
return SN

def doubles9():
"Using a set is way faster - but can still be improved a bit"
dup = set()
dup_add = dup.add
SN = []
sn_append = SN.append

for item in IN:
c = item.coordinates

if c in dup:
sn_append(item)
else:
dup_add(c)
return SN

# need this for tests:
names_funcs = sorted(
(n, f) for n, f in locals().iteritems()
if n.startswith('doubles')
)

def test_results():
"""
make sure all solutions give correct results,
assuming the OP solution is at least correct !-)
"""
results = [
(n, set(o.coordinates for o in f()))
for n, f in names_funcs
]
_, correct = results[0]
ok = True
for n, r in results[1:]:
if r != correct:
print "error on %s ?\n expected %s\n, got %s" \
% (n, correct, r)
ok = False
return ok

def test_times():
" And the winner is..."
results = [
(n, Timer(
'%s()' % n,
'from __main__ import %s' % n
).timeit(100)
)
for n, _ in names_funcs
]
print "\n".join("%s : %s" % r for r in results)

Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):

>>> test_results()
True
>>> test_times()
doubles0 : 1.55667901039
doubles1 : 0.719144105911
doubles2 : 0.703393936157
doubles3 : 0.700654983521
doubles4 : 0.706257104874
doubles5 : 0.528184890747
doubles6 : 0.461633205414
doubles8 : 0.0134379863739
doubles9 : 0.0108540058136
>>>

Not surprisingly, half less iterations makes for half less time.
Aliasing, as often, proves to be a good optimization too. But obviously,
using the correct data structure / algorithm combo is the key : simpler
code, and 115 times faster (143 times with aliasing). If pruebono
solution's is correct (and it as AFAICT), your 15 hours computation
should by now take less than 10 minutes...

bearoph...@lycos.com

unread,

Sep 18, 2008, 7:20:20 PM9/18/08

to

Bruno Desthuilliers:
> def doubles9():
> ...

> SN = []
> sn_append = SN.append

Few more:

import psyco

def doubles10():

dup = set()
SN = []
for item in IN:
c = item.coordinates
if c in dup:
SN.append(item)
else:
dup.add(c)
return SN

psyco.bind(doubles10)

from collections import deque

def doubles11():

dup = set()
dup_add = dup.add

SN = deque()
sn_append = SN.append

for item in IN:
c = item.coordinates
if c in dup:
sn_append(item)
else:
dup_add(c)
return SN

def doubles12():
dup = set()
SN = deque()

for item in IN:
c = item.coordinates
if c in dup:
SN.append(item)
else:
dup.add(c)
return SN

psyco.bind(doubles12)

Timings:

doubles00 : 0.522365288653
doubles01 : 0.247219812198
doubles02 : 0.237889823898
doubles03 : 0.238638921389
doubles04 : 0.23821698217
doubles05 : 0.177042495425
doubles06 : 0.13166199162
doubles08 : 0.00569725197252
doubles09 : 0.00418566685667
doubles10 : 0.00192086920869
doubles11 : 0.00403324533245
doubles12 : 0.00184026840268

Hopefully that's fast enough :-)

Bye,
bearophile

Steven D'Aprano

unread,

Sep 18, 2008, 7:42:51 PM9/18/08

to

On Thu, 18 Sep 2008 20:43:00 +0200, Bruno Desthuilliers wrote:

> Now the obvious winner is pruebono - even unoptimized, using sets seems
> to be *way* faster than even the most optimized corrected version of
> your algorithm.

I'm not being snarky about losing priority here, but I submitted
essentially the same solution two hours earlier than pruebono.

Are people not seeing my posts? Have I been kill-filed by everyone except
Mensator? I also asked a question about HTTPError and I haven't seen any
responses at all.

--
Steven

bearoph...@lycos.com

unread,

Sep 18, 2008, 10:08:27 PM9/18/08

to

bearophile:
> doubles12 : 0.00184026840268

For fun, and to have an almost baseline reference, I have done a
little benchmark in D too:

import std.stdio: writefln;

// Using Tango std lib you don't need all this
version (Win32) {
import std.c.windows.windows;
double clock() {
long t;
QueryPerformanceCounter(&t);
return cast(double)t / queryPerformanceFrequency;
}

long queryPerformanceFrequency;
static this() {
QueryPerformanceFrequency(&queryPerformanceFrequency);
}
}

union N {
struct { int x, y; }
long xy;
}

auto IN = [
N(4, 9), N(5, 0), N(6, 6), N(7, 2), N(3, 6), N(9, 6), N(0, 1), N(1,
6),
N(0, 5), N(1, 2), N(8, 9), N(5, 4), N(1, 6), N(7, 6), N(9, 1), N(7,
6),
N(0, 1), N(7, 4), N(7, 4), N(8, 4), N(8, 4), N(3, 5), N(9, 6), N(6,
1),
N(3, 4), N(4, 5), N(0, 5), N(6, 3), N(2, 4), N(1, 6), N(9, 5), N(1,
2),
N(5, 8), N(8, 5), N(3, 1), N(9, 4), N(9, 4), N(3, 3), N(4, 8), N(9,
7),
N(8, 4), N(6, 2), N(1, 5), N(5, 8), N(8, 6), N(0, 8), N(5, 2), N(3,
4),
N(0, 5), N(4, 4), N(2, 9), N(7, 7), N(1, 0), N(4, 2), N(5, 7), N(0,
4),
N(2, 5), N(0, 8), N(7, 3), N(9, 1), N(0, 4), N(5, 0), N(4, 9), N(0,
6),
N(3, 0), N(3, 0), N(3, 9), N(8, 3), N(7, 9), N(8, 5), N(7, 6), N(1,
5),
N(0, 6), N(5, 9), N(6, 8), N(0, 0), N(4, 1), N(3, 3), N(5, 4), N(5,
3),
N(6, 1), N(5, 4), N(4, 5), N(5, 8), N(4, 1), N(3, 6), N(1, 9), N(0,
5),
N(6, 5), N(5, 5), N(6, 0), N(0, 9), N(2, 6), N(0, 7), N(5, 9), N(7,
3),
N(7, 9), N(5, 4), N(4, 9), N(2, 9)
];

N[] doubles13() {
size_t[N] dup; // used as set
N[] SN;
foreach (item; IN) {
if (item in dup)
SN ~= item;
else
dup[item] = 0;
}
return SN;
}

N[] doubles0() {
N[] SN;
for (int i; i < IN.length; i++)
for (int j; j < IN.length; j++)
if (i != j)
if (IN[i] == IN[j])
SN ~= IN[i];
return SN;
}

void test_results() {
size_t[N] set1, set2;
foreach (n; doubles0())
set1[n] = 0;
foreach (n; doubles13())
set2[n] = 0;
if (set1.keys.sort != set2.keys.sort)
throw new Error("ERROR");
}

void main() {
int n = 150_000;
test_results();

int count = n;
auto t0 = clock();
while (count--)
doubles13();
auto t1 = clock();

writefln("%0.10f", cast(double)(t1 - t0) / n);
}

doubles13() needs 0.000027-0.000037 seconds (*), about 60-75 times
faster than doubles12, this means about 3-4 seconds instead of 15h (on
the original computer).
Using C++ with GCC (using a <hash_map> for dup and a <vector> for SN)
you can probably go 10-40% faster still :-)

(*) Probably there's such variability of timings because the current
DMD compiler I have used doesn't add "align" instructions in the asm
it produces as GCC does. With modern CPUs very sensitive to code
alignment this leads to even 20-30% runtime differences in small tight
loops.

Bye,
bearophile

Terry Reedy

unread,

Sep 19, 2008, 12:46:02 AM9/19/08

to pytho...@python.org

Yes, after figuring out what to do from the original post, I saw yours
and then Pruebono's and decided that since two people had submitted the
jackpot algorithm, I need not say more. I will say this: this solution
amounts to finding equivalence classes (the sets of items with a given
'key') and then finding the classes (sets) with more than one member.
Defaultdict is great for this.

tjr

Tino Wildenhain

unread,

Sep 19, 2008, 1:11:51 AM9/19/08

to Terry Reedy, pytho...@python.org

Hi,

Terry Reedy wrote:
...

> Yes, after figuring out what to do from the original post, I saw yours
> and then Pruebono's and decided that since two people had submitted the
> jackpot algorithm, I need not say more. I will say this: this solution
> amounts to finding equivalence classes (the sets of items with a given
> 'key') and then finding the classes (sets) with more than one member.
> Defaultdict is great for this.

I must say mine works with at least similar performance. Maybe its a
timezone issue but I also provided a simple test to my solution.

Also I never saw a list where the threading often goes wrong like
this here - is there any special setup or is it just peoples MUA
which screws up?

T.

Gabriel Genellina

unread,

Sep 19, 2008, 5:08:32 AM9/19/08

to pytho...@python.org

En Fri, 19 Sep 2008 02:11:51 -0300, Tino Wildenhain <ti...@wildenhain.de>
escribió:

> Also I never saw a list where the threading often goes wrong like
> this here - is there any special setup or is it just peoples MUA
> which screws up?

Perhaps it's due to the newsgroup/list duality...

--
Gabriel Genellina

Jake Anderson

unread,

Sep 19, 2008, 10:30:53 AM9/19/08

to Bruno Desthuilliers

> Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):
>
> >>> test_results()
> True
> >>> test_times()
> doubles0 : 1.55667901039
> doubles1 : 0.719144105911
> doubles2 : 0.703393936157
> doubles3 : 0.700654983521
> doubles4 : 0.706257104874
> doubles5 : 0.528184890747
> doubles6 : 0.461633205414
> doubles8 : 0.0134379863739
> doubles9 : 0.0108540058136
> >>>
>
> Not surprisingly, half less iterations makes for half less time.
> Aliasing, as often, proves to be a good optimization too. But obviously,
> using the correct data structure / algorithm combo is the key : simpler
> code, and 115 times faster (143 times with aliasing). If pruebono
> solution's is correct (and it as AFAICT), your 15 hours computation
> should by now take less than 10 minutes...
>
>

Ubuntu 8.04 core2 2.6(i think)
without psycho
doubles0 : 0.610555171967
doubles1 : 0.29314494133
doubles2 : 0.286273956299
doubles3 : 0.281984090805
doubles4 : 0.28240609169
doubles5 : 0.207377910614
doubles6 : 0.156388044357
doubles8 : 0.00533080101013
doubles9 : 0.00458884239197

with psycho
doubles0 : 0.127684116364
doubles1 : 0.069571018219
doubles2 : 0.064826965332
doubles3 : 0.0702300071716
doubles4 : 0.0647261142731
doubles5 : 0.0522589683533
doubles6 : 0.0437579154968
doubles8 : 0.00190806388855
doubles9 : 0.00214099884033

On this small test its a variance between ~6x to 2X still its basically
free so why not ;->

gil...@gmail.com

unread,

Sep 19, 2008, 11:47:03 AM9/19/08

to

On Sep 18, 7:42 pm, Steven D'Aprano <st...@REMOVE-THIS-

cybersource.com.au> wrote:
> I'm not being snarky about losing priority here, but I submitted
> essentially the same solution two hours earlier than pruebono.

My apologies (seriosuly). In this case, I think it might just be
haste. For what it's worth, I saw your post (on Google Groups), but I
skipped over it. You wrote two solutions, one slow and one fast (the
latter being the same as pruebono's). You put the slow one at the
top, I saw

for ...
for ...

and went straight to the next message without reading the better
solution. I knew that there was only one for loop necessary, so I
didn't bother reading on. Actually, I missed pruebono's post, too,
until after I figured it out myself (but before I posted).

That several people came up with the nigh exact same solution, modulo
variable names only, says something about the Zen of Python.

Geoff G-T

Harald Luessen

unread,

Sep 19, 2008, 12:46:01 PM9/19/08

to

On Thu, 18 Sep 2008 Bruno Desthuilliers wrote:

># Harald : uncomment this and run test_results. As far as I can tell, it
># doesn't yields the expected results
>
>## IN7 = IN[:]
>## def sortk7(n):
>## return n.coordinates[0]
>
>## def doubles7():
>## "is ordering better ? - Nope Sir, it's broken..."
>## IN7.sort(key=sortk)
>## SN = []
>## sn_append = SN.append
>## in_len = len(IN)
>## for i in xrange(in_len):
>## node_i = IN[i]
>## coords_i = node_i.coordinates
>## for j in xrange(i+1, in_len):
>## if coords_i[0] == IN[j].coordinates[0]:
>## if coords_i[1] == IN[j].coordinates[1]:
>## sn_append(node_i)
>## else:
>## break
>## return SN

...

>Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):
>
> >>> test_results()
>True
> >>> test_times()
>doubles0 : 1.55667901039
>doubles1 : 0.719144105911
>doubles2 : 0.703393936157
>doubles3 : 0.700654983521
>doubles4 : 0.706257104874
>doubles5 : 0.528184890747
>doubles6 : 0.461633205414
>doubles8 : 0.0134379863739
>doubles9 : 0.0108540058136

When you change my code then do it right. :-)
You forgot to change the IN to IN7 at _every_ place.
And the sortk should be sortk7 in _both_ places.

I never let the code run before myself. I just wrote it
in the newsreader. But now i did and I have a second
version as bonus.

IN7 = IN[:]

def sortk7(n):
return n.coordinates[0], n.coordinates[1]

def doubles7():
IN7.sort(key=sortk7)

SN = []
sn_append = SN.append

in_len = len(IN7)
for i in xrange(in_len):
node_i = IN7[i]

coords_i = node_i.coordinates
for j in xrange(i+1, in_len):

if coords_i[0] == IN7[j].coordinates[0]:
if coords_i[1] == IN7[j].coordinates[1]:
sn_append(node_i)
else:
break
return SN

def comp7( x, y ):
return cmp( x.coordinates, y.coordinates )

def doubles7a():
IN7.sort( comp7 )

SN = []
sn_append = SN.append

in_len = len(IN7)
for i in xrange(in_len):
node_i = IN7[i]

for j in xrange(i+1, in_len):

node_j = IN7[j]
if comp7( node_i, node_j ) == 0:
sn_append(node_i)
else:
break
return SN

Here are the results. (py2.5, WindowsXP, Pentium4, 2.6GHz, 1.5GB):
My version is not so bad.

doubles0 : 1.03830598582
doubles1 : 0.47943719104
doubles2 : 0.487412506338
doubles3 : 0.475924733451
doubles4 : 0.466548681466
doubles5 : 0.340487967046
doubles6 : 0.278480365521
doubles7 : 0.0953190978183
doubles7a : 0.0784233750379
doubles8 : 0.010236496538
doubles9 : 0.00742803903848

Harald

prue...@latinmail.com

unread,

Sep 19, 2008, 1:25:41 PM9/19/08

to

On Sep 18, 7:42 pm, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:

I see your post now, but I didn't notice it at the time I posted.
Could be because I am using the noticeable bad Google groups interface
that is known for being unreliable. Duplicate solutions on Usenet are
almost a given and I consider duplicate solutions a good thing, it
means that other people will be able to understand that code.

In any case I am not here for glory I am posting under a pseudonym so
nobody discovers that I am slacking off at work reading Usen[carrier
lost]

Terry Reedy

unread,

Sep 19, 2008, 1:41:04 PM9/19/08

to pytho...@python.org

Actually, the situation is a c.l.p <==> python-list <==>
gmane.c.p.general triality.

Bruno Desthuilliers

unread,

Sep 20, 2008, 1:01:42 PM9/20/08

to

Steven D'Aprano a écrit :

> On Thu, 18 Sep 2008 20:43:00 +0200, Bruno Desthuilliers wrote:
>
>> Now the obvious winner is pruebono - even unoptimized, using sets seems
>> to be *way* faster than even the most optimized corrected version of
>> your algorithm.
>
> I'm not being snarky about losing priority here, but I submitted
> essentially the same solution two hours earlier than pruebono.

My bad... Now that you mention it, I see you effectively did. My fault,
I did a too-fast screening of submitted solutions, and hereby declare
you Steven as the winner.

> Are people not seeing my posts? Have I been kill-filed by everyone except
> Mensator?

I do see your posts. Whether I read them entirely is another question -
and in this concrete case, the answer is obviously 'nope'. Put the blame
on me.

> I also asked a question about HTTPError and I haven't seen any
> responses at all.

This is another problem - it happens that no one has a clue, or that the
one having a clue lack time (or willingness) to anwer. Once again, sorry
if me missing your correct answer drives you paranoid :(

Bruno Desthuilliers

unread,

Sep 20, 2008, 1:11:35 PM9/20/08

to

Bruno Desthuilliers a écrit :

> Alexzive a écrit :
>> Hello there :) ,
>>

(snip)

>
> Now the obvious winner is pruebono

Correction (my bad) : Steven D'Aprano submitted the set-based solution
first. So the winners are Steven and pruebono.

Bruno Desthuilliers

unread,

Sep 20, 2008, 1:22:35 PM9/20/08

to

Harald Luessen a écrit :

doh :(

Sorry Harald, my fault, you're right...

Point taken. Now there's one thing I find questionnable with your
solution (which is probably why I didn't bother double-checking it when
it *appeared* to be broken) : you assume that it's ok to loose original
ordering, which I strongly suspect is not the case for the OP use case,
and given the OP use case list's size, working on a copy might not be an
acceptable solution.

But anyway: this is not an excuse for me having broken your code. My
apologies.

Steven D'Aprano

unread,

Sep 20, 2008, 10:20:33 PM9/20/08

to

On Sat, 20 Sep 2008 19:01:42 +0200, Bruno Desthuilliers wrote:

> Once again, sorry
> if me missing your correct answer drives you paranoid :-)

What do you mean by that? How many other people have been talking about
me?

*wink*

--
Steven

Aaron "Castironpi" Brady

unread,

Sep 20, 2008, 10:51:10 PM9/20/08

to

On Sep 20, 9:20 pm, Steven D'Aprano <st...@REMOVE-THIS-

Why, no fewer than usual!

*wink*

Tim Chase

unread,

Sep 23, 2008, 10:45:12 AM9/23/08

to km, pytho...@python.org

km wrote:
> how abt this ?
>
> N = len(IN)
> for k in range(N):
> for j in range(N):
> if j >= k: # or k <= j
> doSomething()

This has the root problem that the "if" statement is evaluated
N*N times, which is ugly/slow O(N^2) behavior. My solution
managed to reduce it by a constant multiplier, but several folks
proposed a more elegant O(N) solution which was leaps & bounds
faster.

-tkc