What's about l.index(max(l)) ?
> Oops, gmail has keyboard shortcuts apparently, to continue:
>
> def maxi(l):
> m = max(l)
> for i, v in enumerate(l):
> if m == v:
> return i
>
> But it seems like something that should be built in - or at least I should
> be able to write a lambda function for it, but I'm not sure how to do that
> either...Suggestions are very much welcome!
>
I really think this is a good candidate for a builtin. I suggest:
max2 (x):
""" return (minvalue,minindex)"""
This allows efficient usage in all cases, including iterators (not just
sequences).
from itertools import izip, count
answer = max(izip(l, count()))[1]
Raymond
Both of these methods scan the list twice. The two given by RH and SDD do
so just once. Both of those will give the index of the of the last maximum
value. If you want the index of the first max value (you did not specify
;-), write an explicit loop.
max((v,-i) for i,v in enumerate(l))
How about (corrected but still untested):
-max((v,-i) for i,v in enumerate(l))[1]
I may have still missed something.
The version I use:
def posmax(seq, key=None):
"""Return the position of the first maximum item of a sequence.
It accepts the usual key parameter too."""
if key:
return max(enumerate(seq), key=lambda k: key(k[1]))[0]
else:
return max(enumerate(seq), key=itemgetter(1))[0]
Bye,
bearophile
def posmax(seq, key=lambda x:x):
Is the Python max able to tell that's the identity function? I don't
think so, so your version may be slower and more memory hungry in the
common situation where you don't have a key function. So I think my
version is better :-)
Bye,
bearophile
I don't think there will be a noticable memory increase. It's not a
DSU situation like in sorting, it's just a small constant parameter.
Yes there might be a small speed hit. The compiler in principle could
convert the (lambda k: lambda x: k[1]) to something like
operator.itemgetter(1), but I doubt that it actually does so.
Actually there is a big speed hit. Until such time that Python has a fast
identity function, and lambda x:x isn't it, your version is about three
times slower than Bearophile's.
Much to my surprise, the fastest solution I've tried appears to be a pure
Python version not even using max() at all.
Here are the three versions:
def posmax1(seq, key=None): # bearophile's version
"""Return the position of the first maximum item of a sequence.
It accepts the usual key parameter too."""
if key:
return max(enumerate(seq), key=lambda k: key(k[1]))[0]
else:
return max(enumerate(seq), key=itemgetter(1))[0]
def posmax2(seq, key=lambda x:x): # Paul Rubin's version
return max(enumerate(seq), key=lambda k: key(k[1]))[0]
def posmax3(seq, key=None): # Pure Python version
if key is None:
m = seq[0]
index = 0
for i, x in enumerate(seq):
if x > m:
m = x
index = i
return index
else:
return NotImplemented
And my timing results:
>>> alist = [3]*1000 + [5] + [3]*1000
>>> assert alist.index(5) == 1000
>>> assert posmax1(alist) == posmax2(alist) == posmax3(alist) == 1000
>>>
>>> min(timeit.Timer('posmax1(alist)',
... 'from __main__ import posmax1, alist').repeat(number=1000))/1000
0.00074387979507446289
>>> min(timeit.Timer('posmax2(alist)',
... 'from __main__ import posmax2, alist').repeat(number=1000))/1000
0.0022983889579772949
>>> min(timeit.Timer('posmax3(alist)',
... 'from __main__ import posmax3, alist').repeat(number=1000))/1000
0.00063846302032470701
--
Steven
That version is easy to translate to other languages and you can
probably find that Psyco makes it much faster still.
Bye,
bearophile
Interesting. Fixing that speed hit probably needs compiler attention.
> Much to my surprise, the fastest solution I've tried appears to be a pure
> Python version not even using max() at all.
Could you try the l.index version? That avoids creating all those
intermediate tuples.
That operation is quite common, so it deserves a bit more work:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/543271
(I can show you the D/C code if you want it. The C code is just for
testing, while the D code is usable).
The final sum: the Psyco version is almost 40 times faster than the
Python version, and just 2.8 times slower than the D version, and 3.4
times slower than a C version (that doesn't use too many pointer
tricks). I think this is good enough.
Bye,
bearophile
I can't help wishing that psyco could do as good a job with the
simpler expressions of that function. I'll have to see what JHC does.
I don't think one should dismiss the simplest solution l.index(max(l))
out of hand (benchmark function borrowed from bearophile):
=================== posmax.py ==================
from operator import itemgetter
def simple_posmax(l, key=None):
if key:
return l.index(max(l, key=key))
else:
return l.index(max(l))
def clever_posmax(seq, key=None):
if key:
return max(enumerate(seq), key=lambda k: key(k[1]))[0]
else:
return max(enumerate(seq), key=itemgetter(1))[0]
def posmax_benchmark(posmax):
from time import clock
alist = [3]*1000 + [5] + [3]*1000
t = clock()
for _ in xrange(6000):
r = posmax(alist)
print posmax.__name__, round(clock() - t, 2), r
if __name__ == "__main__":
posmax_benchmark(clever_posmax)
posmax_benchmark(simple_posmax)
=============================================
marigold:python arno$ python posmax.py
clever_posmax 3.25 1000
simple_posmax 0.95 1000
simple_posmax is more than 3x faster on my machine. It's not
surprising as even though the list is walked twice, it is all done in
C and no new objects have to be created. Then only non-C bit is when
the result of max(l) is fed to l.index().
--
Arnaud
> def simple_posmax(l, key=None):
> if key:
> return l.index(max(l, key=key))
> else:
> return l.index(max(l))
def simple_posmax(l, **kw):
return l.index(max(l, **kw))
> simple_posmax is more than 3x faster on my machine. It's not
> surprising as even though the list is walked twice, it is all done in
> C and no new objects have to be created. Then only non-C bit is when
> the result of max(l) is fed to l.index().
It does expose a slight danger in that the list might somehow
self-mutate during the first traversal.
Yes. I wanted to make simple_posmax and clever_posmax as identical as
possible to make the comparison fair.
>
> > simple_posmax is more than 3x faster on my machine. It's not
> > surprising as even though the list is walked twice, it is all done in
> > C and no new objects have to be created. Then only non-C bit is when
> > the result of max(l) is fed to l.index().
>
> It does expose a slight danger in that the list might somehow
> self-mutate during the first traversal.
I suppose. Although you'd have to be mad to have comparisons mutate
stuff...
Also, it only works with sequences.
--
Arnaud
[...]
>
> simple_posmax is more than 3x faster on my machine. It's not
> surprising as even though the list is walked twice, it is all done in
> C and no new objects have to be created. Then only non-C bit is when
> the result of max(l) is fed to l.index().
Of course that's incorrect in general: if comparison functions between
objects in l are python functions then some bytecode will be run and
some new objects may be created. But in most cases I think it stands
true.
--
Arnaud