Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: Index of maximum element in list

2 views
Skip to first unread message

Hexamorph

unread,
Jan 25, 2008, 4:47:37 PM1/25/08
to pytho...@python.org
Henry Baxter wrote:
> Oops, gmail has keyboard shortcuts apparently, to continue:
>
> def maxi(l):
> m = max(l)
> for i, v in enumerate(l):
> if m == v:
> return i
>

What's about l.index(max(l)) ?

Neal Becker

unread,
Jan 25, 2008, 4:47:22 PM1/25/08
to pytho...@python.org
Henry Baxter wrote:

> Oops, gmail has keyboard shortcuts apparently, to continue:
>
> def maxi(l):
> m = max(l)
> for i, v in enumerate(l):
> if m == v:
> return i
>

> But it seems like something that should be built in - or at least I should
> be able to write a lambda function for it, but I'm not sure how to do that
> either...Suggestions are very much welcome!
>

I really think this is a good candidate for a builtin. I suggest:

max2 (x):
""" return (minvalue,minindex)"""

This allows efficient usage in all cases, including iterators (not just
sequences).

Raymond Hettinger

unread,
Jan 25, 2008, 11:16:50 PM1/25/08
to

from itertools import izip, count
answer = max(izip(l, count()))[1]


Raymond

Scott David Daniels

unread,
Jan 26, 2008, 12:52:40 AM1/26/08
to
Hexamorph wrote:
> ...

> What's about l.index(max(l)) ?
What about
max((x,i) for i,x in enumerate(lst))[1]

Terry Reedy

unread,
Jan 26, 2008, 2:32:48 AM1/26/08
to pytho...@python.org

"Henry Baxter" <henry....@gmail.com> wrote in message
news:8d04436f0801251519g482...@mail.gmail.com...
| Thanks Hexamorph and Neal. Somehow I didn't make the connection with
using
| 'index', but I'm all sorted out now :)

|
| On Jan 25, 2008 1:47 PM, Hexamorph <hexa...@gmx.net> wrote:
|
| > Henry Baxter wrote:
| > > Oops, gmail has keyboard shortcuts apparently, to continue:
| > >
| > > def maxi(l):
| > > m = max(l)
| > > for i, v in enumerate(l):
| > > if m == v:
| > > return i
| > >
| >
| > What's about l.index(max(l)) ?

Both of these methods scan the list twice. The two given by RH and SDD do
so just once. Both of those will give the index of the of the last maximum
value. If you want the index of the first max value (you did not specify
;-), write an explicit loop.

Paul Rubin

unread,
Jan 26, 2008, 2:36:17 AM1/26/08
to
"Terry Reedy" <tjr...@udel.edu> writes:
> | > What's about l.index(max(l)) ?
>
> Both of these methods scan the list twice. The two given by RH and SDD do
> so just once. Both of those will give the index of the of the last maximum
> value. If you want the index of the first max value (you did not specify
> ;-), write an explicit loop.

max((v,-i) for i,v in enumerate(l))

Paul Rubin

unread,
Jan 26, 2008, 2:40:28 AM1/26/08
to
"Terry Reedy" <tjr...@udel.edu> writes:
> | > What's about l.index(max(l)) ?
>
> Both of these methods scan the list twice. The two given by RH and SDD do
> so just once. Both of those will give the index of the of the last maximum
> value. If you want the index of the first max value (you did not specify
> ;-), write an explicit loop.

How about (corrected but still untested):

-max((v,-i) for i,v in enumerate(l))[1]

I may have still missed something.

bearoph...@lycos.com

unread,
Jan 26, 2008, 4:51:51 AM1/26/08
to
> Henry Baxter wrote:
> > def maxi(l):
> > m = max(l)
> > for i, v in enumerate(l):
> > if m == v:
> > return i
>
> What's about l.index(max(l)) ?

The version I use:

def posmax(seq, key=None):
"""Return the position of the first maximum item of a sequence.
It accepts the usual key parameter too."""
if key:
return max(enumerate(seq), key=lambda k: key(k[1]))[0]
else:
return max(enumerate(seq), key=itemgetter(1))[0]

Bye,
bearophile

Paul Rubin

unread,
Jan 26, 2008, 5:25:22 AM1/26/08
to
bearoph...@lycos.com writes:
> def posmax(seq, key=None):
> """Return the position of the first maximum item of a sequence.
> It accepts the usual key parameter too."""
> if key:
> return max(enumerate(seq), key=lambda k: key(k[1]))[0]
> else:
> return max(enumerate(seq), key=itemgetter(1))[0]

def posmax(seq, key=lambda x:x):

bearoph...@lycos.com

unread,
Jan 26, 2008, 11:28:12 AM1/26/08
to
Paul Rubin:

> def posmax(seq, key=lambda x:x):
> return max(enumerate(seq), key=lambda k: key(k[1]))[0]

Is the Python max able to tell that's the identity function? I don't
think so, so your version may be slower and more memory hungry in the
common situation where you don't have a key function. So I think my
version is better :-)

Bye,
bearophile

Paul Rubin

unread,
Jan 26, 2008, 3:40:26 PM1/26/08
to

I don't think there will be a noticable memory increase. It's not a
DSU situation like in sorting, it's just a small constant parameter.
Yes there might be a small speed hit. The compiler in principle could
convert the (lambda k: lambda x: k[1]) to something like
operator.itemgetter(1), but I doubt that it actually does so.

Steven D'Aprano

unread,
Jan 26, 2008, 5:07:37 PM1/26/08
to

Actually there is a big speed hit. Until such time that Python has a fast
identity function, and lambda x:x isn't it, your version is about three
times slower than Bearophile's.

Much to my surprise, the fastest solution I've tried appears to be a pure
Python version not even using max() at all.

Here are the three versions:

def posmax1(seq, key=None): # bearophile's version


"""Return the position of the first maximum item of a sequence.
It accepts the usual key parameter too."""
if key:

return max(enumerate(seq), key=lambda k: key(k[1]))[0]

else:
return max(enumerate(seq), key=itemgetter(1))[0]


def posmax2(seq, key=lambda x:x): # Paul Rubin's version


return max(enumerate(seq), key=lambda k: key(k[1]))[0]


def posmax3(seq, key=None): # Pure Python version
if key is None:
m = seq[0]
index = 0
for i, x in enumerate(seq):
if x > m:
m = x
index = i
return index
else:
return NotImplemented

And my timing results:

>>> alist = [3]*1000 + [5] + [3]*1000
>>> assert alist.index(5) == 1000
>>> assert posmax1(alist) == posmax2(alist) == posmax3(alist) == 1000
>>>
>>> min(timeit.Timer('posmax1(alist)',
... 'from __main__ import posmax1, alist').repeat(number=1000))/1000
0.00074387979507446289
>>> min(timeit.Timer('posmax2(alist)',
... 'from __main__ import posmax2, alist').repeat(number=1000))/1000
0.0022983889579772949
>>> min(timeit.Timer('posmax3(alist)',
... 'from __main__ import posmax3, alist').repeat(number=1000))/1000
0.00063846302032470701

--
Steven

bearoph...@lycos.com

unread,
Jan 26, 2008, 5:12:37 PM1/26/08
to
Steven D'Aprano:

> Much to my surprise, the fastest solution I've tried appears to be a pure
> Python version not even using max() at all.

That version is easy to translate to other languages and you can
probably find that Psyco makes it much faster still.

Bye,
bearophile

Paul Rubin

unread,
Jan 26, 2008, 5:18:04 PM1/26/08
to
Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:
> Actually there is a big speed hit. Until such time that Python has a fast
> identity function, and lambda x:x isn't it, your version is about three
> times slower than Bearophile's.

Interesting. Fixing that speed hit probably needs compiler attention.

> Much to my surprise, the fastest solution I've tried appears to be a pure
> Python version not even using max() at all.

Could you try the l.index version? That avoids creating all those
intermediate tuples.

bearoph...@lycos.com

unread,
Jan 27, 2008, 5:49:25 AM1/27/08
to
bearophile:

> That version is easy to translate to other languages and you can
> probably find that Psyco makes it much faster still.

That operation is quite common, so it deserves a bit more work:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/543271

(I can show you the D/C code if you want it. The C code is just for
testing, while the D code is usable).

The final sum: the Psyco version is almost 40 times faster than the
Python version, and just 2.8 times slower than the D version, and 3.4
times slower than a C version (that doesn't use too many pointer
tricks). I think this is good enough.

Bye,
bearophile

Paul Rubin

unread,
Jan 27, 2008, 6:01:05 AM1/27/08
to
bearoph...@lycos.com writes:
> The final sum: the Psyco version is almost 40 times faster than the
> Python version, and just 2.8 times slower than the D version, and 3.4
> times slower than a C version (that doesn't use too many pointer
> tricks). I think this is good enough.

I can't help wishing that psyco could do as good a job with the
simpler expressions of that function. I'll have to see what JHC does.

Arnaud Delobelle

unread,
Jan 27, 2008, 6:32:12 AM1/27/08
to
On Jan 26, 10:07 pm, Steven D'Aprano <st...@REMOVE-THIS-

I don't think one should dismiss the simplest solution l.index(max(l))
out of hand (benchmark function borrowed from bearophile):

=================== posmax.py ==================
from operator import itemgetter

def simple_posmax(l, key=None):
if key:
return l.index(max(l, key=key))
else:
return l.index(max(l))

def clever_posmax(seq, key=None):


if key:
return max(enumerate(seq), key=lambda k: key(k[1]))[0]
else:
return max(enumerate(seq), key=itemgetter(1))[0]

def posmax_benchmark(posmax):
from time import clock


alist = [3]*1000 + [5] + [3]*1000

t = clock()
for _ in xrange(6000):
r = posmax(alist)
print posmax.__name__, round(clock() - t, 2), r

if __name__ == "__main__":
posmax_benchmark(clever_posmax)
posmax_benchmark(simple_posmax)
=============================================

marigold:python arno$ python posmax.py
clever_posmax 3.25 1000
simple_posmax 0.95 1000

simple_posmax is more than 3x faster on my machine. It's not
surprising as even though the list is walked twice, it is all done in
C and no new objects have to be created. Then only non-C bit is when
the result of max(l) is fed to l.index().

--
Arnaud


Paul Rubin

unread,
Jan 27, 2008, 6:42:33 AM1/27/08
to
Arnaud Delobelle <arn...@googlemail.com> writes:

> def simple_posmax(l, key=None):
> if key:
> return l.index(max(l, key=key))
> else:
> return l.index(max(l))

def simple_posmax(l, **kw):
return l.index(max(l, **kw))

> simple_posmax is more than 3x faster on my machine. It's not
> surprising as even though the list is walked twice, it is all done in
> C and no new objects have to be created. Then only non-C bit is when
> the result of max(l) is fed to l.index().

It does expose a slight danger in that the list might somehow
self-mutate during the first traversal.

Arnaud Delobelle

unread,
Jan 27, 2008, 7:03:08 AM1/27/08
to
On Jan 27, 11:42 am, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:

> Arnaud Delobelle <arno...@googlemail.com> writes:
> > def simple_posmax(l, key=None):
> >     if key:
> >         return l.index(max(l, key=key))
> >     else:
> >         return l.index(max(l))
>
> def simple_posmax(l, **kw):
>    return l.index(max(l, **kw))

Yes. I wanted to make simple_posmax and clever_posmax as identical as
possible to make the comparison fair.

>
> > simple_posmax is more than 3x faster on my machine.  It's not
> > surprising as even though the list is walked twice, it is all done in
> > C and no new objects have to be created. Then only non-C bit is when
> > the result of max(l) is fed to l.index().
>
> It does expose a slight danger in that the list might somehow
> self-mutate during the first traversal.

I suppose. Although you'd have to be mad to have comparisons mutate
stuff...
Also, it only works with sequences.
--
Arnaud

Arnaud Delobelle

unread,
Jan 27, 2008, 8:02:38 AM1/27/08
to
On Jan 27, 11:32 am, Arnaud Delobelle <arno...@googlemail.com> wrote:

[...]


>
> simple_posmax is more than 3x faster on my machine.  It's not
> surprising as even though the list is walked twice, it is all done in
> C and no new objects have to be created. Then only non-C bit is when
> the result of max(l) is fed to l.index().

Of course that's incorrect in general: if comparison functions between
objects in l are python functions then some bytecode will be run and
some new objects may be created. But in most cases I think it stands
true.

--
Arnaud

0 new messages