Thank you,
Alan Isaac
===================================================================
import time, random
from itertools import groupby
from collections import defaultdict
class Pass:
pass
# arbitrary setup
keys = [Pass() for i in range(10)]*3
vals = [random.random() for i in range(30)]
kv = zip(keys,vals)
random.shuffle(kv)
#OBJECTIVE:
# find minimum val associated with each "key" in kv
print "method 1: brute force"
t=time.clock()
d = dict()
for k,v in kv:
if k in d:
if d[k] > v:
d[k] = v
else:
d[k] = v
print time.clock()-t
print d
print
print "method 2: groupby"
t=time.clock()
d = dict()
kv_sorted = sorted(kv, key=lambda x: id(x[0]))
for k, g in groupby( kv_sorted, key=lambda x: x[0] ):
d[k] = min(gi[1] for gi in g)
print time.clock()-t
print d
print
print "method 3: defaultdict"
t=time.clock()
d = defaultdict(list)
for k,v in kv:
d[k].append(v)
for k in d:
d[k] = min(d[k])
print time.clock()-t
print d
This is definitely the approach I'd use. Seems "pretty" enough to me. ;-)
STeVe
I find it most attractive of the lot.
But its costs would rise if the number
of values per object were large.
Anyway, I basically agree.
Thanks,
Alan
How about something like:
kv_sorted = sorted(kv, key=lambda x: (id(x[0]), x[1]))
Now do your groupby and the first element of each group is the minimum
for that group.
Bye,
bearophile
The first solution, once one changes "d = dict()" to "d = {}", is
quite free of any impediments to ease of reading, uses the minimum
necessary memory to build the required dict [and it's quite obvious by
inspection that this is so], and is the easiest to "translate" to
earlier versions of Python.
You mean like this?
#sort by id and then value
kv_sorted = sorted(kv, key=lambda x: (id(x[0]),x[1]))
#groupby: first element in each group is object and its min value
d =dict( g.next() for k,g in groupby( kv_sorted, key=lambda x: x[0] ) )
Yes, that appears to be fastest and is
pretty easy to read.
Thanks,
Alan
On average.
For the specified problem.
;-)
It can be speeded up a bit with
a try/except:
for k,v in kv:
try:
if d[k] > v:
d[k] = v
except KeyError:
d[k] = v
Cheers,
Alan Isaac