sorting a dictionary

Yuval

unread,

Feb 4, 2003, 1:56:20 AM2/4/03

to

I'm looking for an efficient way to get the key of the highest value
in a dictionary.
For example, for dict = {"a":10, "b":5, "c":15}, I want to write a
function get_highest(dict), that will return c as the result (since c
is the key of the highest value in the dict.

Thanks.

dsavitsk

unread,

Feb 4, 2003, 3:03:07 AM2/4/03

to

def get_highest(d): # don't use the name 'dict'
l = d.keys()
l.sort()
return l[-1]

-d

"Yuval" <yuva...@hotmail.com> wrote in message
news:6ca96053.03020...@posting.google.com...

Andrew Bennetts

unread,

Feb 4, 2003, 2:55:05 AM2/4/03

to

def get_highest(d):
return max([(x[1], x[0]) for x in d.items()])[1]

Tested:

>>> def get_highest(d):
... return max([(x[1], x[0]) for x in d.items()])[1]
...
>>> get_highest({"a":10, "b":5, "c":15})
'c'

-Andrew.

Laura Creighton

unread,

Feb 4, 2003, 2:42:30 AM2/4/03

to

> def get_highest(d): # don't use the name 'dict'
> l = d.keys()
> l.sort()
> return l[-1]
>
> -d

If you give this a dict of d1={"a":10, "b":5, "c":15, "d":12}
this will return 'd' because 'd' sorts after 'c'. I don't
think that this is what the OP wanted.

This will work, but there may be faster ways.

>>> def get_highest(d):
... l = []
... for k in d.keys():
... l.append([d[k], k])
... l.sort()
... return l[-1][1]
...
>>> get_highest(d1)
'c'

Laura

>
> "Yuval" <yuva...@hotmail.com> wrote in message
> news:6ca96053.03020...@posting.google.com...
> > I'm looking for an efficient way to get the key of the highest value
> > in a dictionary.
> > For example, for dict = {"a":10, "b":5, "c":15}, I want to write a
> > function get_highest(dict), that will return c as the result (since c
> > is the key of the highest value in the dict.
> >
> > Thanks.
>
>

> --
> http://mail.python.org/mailman/listinfo/python-list

Alex Martelli

unread,

Feb 4, 2003, 4:27:19 AM2/4/03

to

dsavitsk wrote:

[about getting the largest key -- the "sorting" in the subject
is a bit of involuntary misdirection by the OP...:-)]

> def get_highest(d): # don't use the name 'dict'
> l = d.keys()
> l.sort()
> return l[-1]

This is good, but it's O(N logN) -- if the dictionary is
huge, you'll be hurting. max(d) is faster, and follows
the good rule of not reimplementing something that Python
already has as a built-in.

Alex

Alex Martelli

unread,

Feb 4, 2003, 4:30:48 AM2/4/03

to

Alex Martelli wrote:

Heh -- sorry, I see the OP wanted the *key of the highest
value*, NOT the highest key, which is what dsavitsk's
solution, and max(d), would give. For the OP's problem,
you can't get any faster than O(N logN).

Alex

Alex Martelli

unread,

Feb 4, 2003, 4:36:16 AM2/4/03

to

Laura Creighton wrote:
...

> This will work, but there may be faster ways.
>
>>>> def get_highest(d):
> ... l = []
> ... for k in d.keys():
> ... l.append([d[k], k])
> ... l.sort()
> ... return l[-1][1]
> ...

There's an O(N) solution -- sorting is O(N log N),
therefore the O(N) solution WILL be faster for
large-enough dictionaries. For example:

def get_highest_in_linear_time(d):
iterator = d.iteritems()
maxkey, maxval = iterator.next()
for curkey, curval in iterator:
if curval>maxval: maxkey, maxval = curkey, curval
return maxkey

Alex

Alex Martelli

unread,

Feb 4, 2003, 4:43:47 AM2/4/03

to

Alex Martelli wrote:
...

> Heh -- sorry, I see the OP wanted the *key of the highest
> value*, NOT the highest key, which is what dsavitsk's
> solution, and max(d), would give. For the OP's problem,
> you can't get any faster than O(N logN).

Silly me, of COURSE you can -- see my other post on this
thread. Here's an alternate implementation as penance;

def get_highest_in_linear_time_without_explicit_iterators(d):
for k in d:
maxkey, maxval = k, d[k]
break
else:
raise ValueError, "empty dictionary has no highest key"
for k in d:
v = d[k]
if v>maxval: maxkey, maxval = k, v
return maxkey

Point is, of course, that given any sequence (and a dict
is easy to treat as such, either with explicit iterators
or for statements), finding its maximum is O(N), while
sorting it is O(N logN). If all you need is the maximum,
sorting first is sensible only for SMALL sequences.

While one wouldn't notmally worry about performance, O()
issues may be different, IMHO -- there's nearly no upper
bound to the performance one may lose by choosing the
wrong-big-O kind of algorithm somewhere; and they're
tricky in that profiling may NOT show the bottleneck if
you profile with too-small data (it's normal to profile
with small-ish inputs, since profiling slows a program
down SO much -- although hotshot helps with that!) --
but when the program is run on data sets that are much
larger, the wrong-big-O algorithm may easily become the
performance bottleneck. Which is why I'm so keen about
this...

Alex

Andrew Bennetts

unread,

Feb 4, 2003, 4:53:33 AM2/4/03

to

Sure you can... what's wrong with the solution I posted earlier:

max([(x[1], x[0]) for x in d.items()])[1]

?

[Besides being a gratuitous one-liner <wink>]

-Andrew.

Harvey Thomas

unread,

Feb 4, 2003, 4:48:57 AM2/4/03

to

> Alex Martelli wrote:
>
> > dsavitsk wrote:
> >
> > [about getting the largest key -- the "sorting" in the subject
> > is a bit of involuntary misdirection by the OP...:-)]
> >
> >> def get_highest(d): # don't use the name 'dict'
> >> l = d.keys()
> >> l.sort()
> >> return l[-1]
> >
> > This is good, but it's O(N logN) -- if the dictionary is
> > huge, you'll be hurting. max(d) is faster, and follows
> > the good rule of not reimplementing something that Python
> > already has as a built-in.
>
> Heh -- sorry, I see the OP wanted the *key of the highest
> value*, NOT the highest key, which is what dsavitsk's
> solution, and max(d), would give. For the OP's problem,
> you can't get any faster than O(N logN).
>
>

> Alex
>

isn't

max([(v,k) for k,v in d.items()])[1]

which I think does the required job O(N)?

_____________________________________________________________________
This message has been checked for all known viruses by the MessageLabs Virus Scanning Service.

Paul Rubin

unread,

Feb 4, 2003, 5:31:52 AM2/4/03

to

Andrew Bennetts <andrew-p...@puzzling.org> writes:
> Sure you can... what's wrong with the solution I posted earlier:
>
> max([(x[1], x[0]) for x in d.items()])[1]
>
> ?
>
> [Besides being a gratuitous one-liner <wink>]

That uses linear extra storage vs. Alex's which uses constant space.

Alex Martelli

unread,

Feb 4, 2003, 5:43:09 AM2/4/03

to

Andrew Bennetts wrote:
...

> Sure you can... what's wrong with the solution I posted earlier:
>
> max([(x[1], x[0]) for x in d.items()])[1]

This may fail if the dictionary has a complex-number key:

>>> d={1j:23, 12:15, 7:23}

>>> max([(x[1], x[0]) for x in d.items()])[1]

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=
>>>

I have already posted (although not tested) two solutions
that should be immune from this problem because they never
try comparing keys for inequality (only values). Comparing
items unfortunately may mean comparing keys.

You can fix that in various ways, e.g.:

max([(d[k], i, k) for k, i in zip(d,range(len(d))) ])[-1]

but I think this is a bit too complicated for a 1-liner (your
code wouldn't be, Andrew, if among the specifications there
was one about keys being guaranteed to be comparable for
inequality), and thus I think the functions I posted earlier
for this task may be preferable.

Alex

Alex Martelli

unread,

Feb 4, 2003, 5:45:02 AM2/4/03

to

Harvey Thomas wrote:
..

> max([(v,k) for k,v in d.items()])[1]
>
> which I think does the required job O(N)?

Yes -- indeed I was wrong about the O(N) part -- your code
has a potential subtle bug (may compare keys thus crash
when one of equal values has a complex key, see my reply
to A. Bennett), but I've already posted several ways to
get solid O(N) solutions to atone for my error;-).

Alex

Andrew Bennetts

unread,

Feb 4, 2003, 6:05:24 AM2/4/03

to

On Tue, Feb 04, 2003 at 10:43:09AM +0000, Alex Martelli wrote:
> Andrew Bennetts wrote:
> ...
> > Sure you can... what's wrong with the solution I posted earlier:
> >
> > max([(x[1], x[0]) for x in d.items()])[1]
>
> This may fail if the dictionary has a complex-number key:

Heh. That's a sneaky requirement -- I didn't think of that :)

> You can fix that in various ways, e.g.:
>
> max([(d[k], i, k) for k, i in zip(d,range(len(d))) ])[-1]
>
> but I think this is a bit too complicated for a 1-liner (your

Indeed :)

-Andrew.

Andrew Bennetts

unread,

Feb 4, 2003, 5:58:51 AM2/4/03

to

Fair enough -- but I was more asking why it didn't qualify as O(N). Alex
later replied to himself to explain that of course it can be done in O(N),
so we're all happy now :)

It's a good point, though... space requirements of algorithms are easily
forgotten.

-Andrew.

Anton Vredegoor

unread,

Feb 4, 2003, 6:35:42 AM2/4/03

to

On Tue, 4 Feb 2003 20:53:33 +1100, Andrew Bennetts
<andrew-p...@puzzling.org> wrote:

>Sure you can... what's wrong with the solution I posted earlier:
>
> max([(x[1], x[0]) for x in d.items()])[1]

It creates many temporary objects. This one doesn't, and it can remain
at a higher level of ambiguity.

def get_highest(D):
def kmax(a,b):
if D.get(a)<D.get(b): return b
return a
return reduce(kmax,D)

Regards,
Anton.

Anton Vredegoor

unread,

Feb 4, 2003, 7:22:03 AM2/4/03

to

On Tue, 04 Feb 2003 12:35:42 +0100, an...@vredegoor.doge.nl (Anton
Vredegoor) wrote:

the last line of the function should be:

def get_highest(D):
def kmax(a,b):
if D.get(a)<D.get(b): return b
return a

return reduce(kmax,D,None)

Because now it gives a traceback if called with for example [1] as
input. Lose ambiguity only as much as necessary, anything more would
be aji-keshi :-)

Regards,

Anton.

Alex Martelli

unread,

Feb 4, 2003, 7:56:54 AM2/4/03

to

Andrew Bennetts wrote:

> On Tue, Feb 04, 2003 at 10:43:09AM +0000, Alex Martelli wrote:
>> Andrew Bennetts wrote:
>> ...
>> > Sure you can... what's wrong with the solution I posted earlier:
>> >
>> > max([(x[1], x[0]) for x in d.items()])[1]
>>
>> This may fail if the dictionary has a complex-number key:
>
> Heh. That's a sneaky requirement -- I didn't think of that :)

I was so scatterminded myself on this thread (and on essentialia,
not marginalia...) that I can hardly criticize others!-)

>> You can fix that in various ways, e.g.:
>>
>> max([(d[k], i, k) for k, i in zip(d,range(len(d))) ])[-1]
>>
>> but I think this is a bit too complicated for a 1-liner (your
>
> Indeed :)

Still, the idea is worth noticing: if you want to avoid further
items in a tuple being compared (in a sort, max, etc, on a list
of tuples), process the list to insert an index at the position
right before the further items you don't want to be compared.

The new enumerate built-in in Python 2.3 helps with this, btw:
enumerate(d) does the same job as zip(d, range(len(d))), but
in a faster and more readable way. Still, I believe it _would_
be still a bit too complicated for a 1-liner, even so.

Alex

Giles Brown

unread,

Feb 4, 2003, 8:34:24 AM2/4/03

to

Andrew Bennetts <andrew-p...@puzzling.org> wrote in message news:<mailman.104434470...@python.org>...

> On Mon, Feb 03, 2003 at 10:56:20PM -0800, Yuval wrote:
> > I'm looking for an efficient way to get the key of the highest value
> > in a dictionary.
>

> def get_highest(d):
> return max([(x[1], x[0]) for x in d.items()])[1]
>
> Tested:
>
> >>> def get_highest(d):
> ... return max([(x[1], x[0]) for x in d.items()])[1]
> ...
> >>> get_highest({"a":10, "b":5, "c":15})
> 'c'

I like this approach. It seems very clear to me.
Just for curiosity I though I might write a version using generators.
(Playing with new toys I guess.)

from __future__ import generators

def iteritemsvk(d):
iterator = d.iteritems()
while 1:
k, v = iterator.next()
yield (v, k)

def get_highest(d):
return max(iteritemsvk(d))[1]

d = { "a" : 10, "b" : 5, "c" : 15 }
print get_highest(d)

Less compact than Andrew's answer, the only possible advantage I
could see to this approach is avoiding creating a full list of
key, value pairs (which might be a problem if the dictionary had
a lot in it). Am I right in thinking this avoids creating a list?

Cheers,
Giles

dsavitsk

unread,

Feb 4, 2003, 1:22:25 PM2/4/03

to

"Laura Creighton" <l...@strakt.com> wrote in message
news:mailman.1044345130...@python.org...

> I don't
> think that this is what the OP wanted.
>

You are correct, I read it too fast. :(

dsavitsk

unread,

Feb 4, 2003, 1:29:33 PM2/4/03

to

There is an irony to my responding too fast and incorrectly as I asked the
same question here

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&th=2f70b8176dd6dd00&seekm
=mailman.1016053606.7082.python-list%40python.org&frame=off

"Laura Creighton" <l...@strakt.com> wrote in message
news:mailman.1044345130...@python.org...

Nick Vargish

unread,

Feb 4, 2003, 12:42:34 PM2/4/03

to

yuva...@hotmail.com (Yuval) writes:

def key_of_highest(d):
mv = max(d.values())
for k in d:
if d[k] == mv:
break
return k

Well, it may not be O(N), but it avoids spurious assignments. :^)

Nick

--
# sigmask.py || version 0.2 || 2003-01-07 || Feed this to your Python.
print reduce(lambda x,y:x+chr(ord(y)-1),'Ojdl!Wbshjti!=obwAqbusjpu/ofu?','')

Thanos Vassilakis

unread,

Feb 4, 2003, 2:21:50 PM2/4/03

to

Instead of using a dict you could use maplist - a kind of psychotic map
that thinks it is a list:
It is a dict that remembers the order of insert, and can be sorted.

http://py.vaults.ca/apyllo.py?find=maplist

thanos

-----------------------------------------
This message and its attachments may contain privileged and confidential information. If you are not the intended recipient(s), you are prohibited from printing, forwarding, saving or copying this email. If you have received this e-mail in error, please immediately notify the sender and delete this e-mail and its attachments from your computer.

John La Rooy

unread,

Feb 4, 2003, 3:30:51 PM2/4/03

to

On 04 Feb 2003 12:42:34 -0500
Nick Vargish <n...@adams.patriot.net> wrote:

> yuva...@hotmail.com (Yuval) writes:
>
> > I'm looking for an efficient way to get the key of the highest value
> > in a dictionary.
> > For example, for dict = {"a":10, "b":5, "c":15}, I want to write a
> > function get_highest(dict), that will return c as the result (since c
> > is the key of the highest value in the dict.
>
> def key_of_highest(d):
> mv = max(d.values())
> for k in d:
> if d[k] == mv:
> break
> return k
>
> Well, it may not be O(N), but it avoids spurious assignments. :^)
>
> Nick
>

It is O(N)

def key_of_highest(d):
mv = max(d.values())

return [k for k in d if d[k]==mv][0]

John

Terry Reedy

unread,

Feb 4, 2003, 3:57:44 PM2/4/03

to

"Nick Vargish" <n...@adams.patriot.net> wrote in message
news:yyyof5s...@adams.patriot.net...

> yuva...@hotmail.com (Yuval) writes:
>
> > I'm looking for an efficient way to get the key of the highest
value
> > in a dictionary.
> > For example, for dict = {"a":10, "b":5, "c":15}, I want to write a
> > function get_highest(dict), that will return c as the result
(since c
> > is the key of the highest value in the dict.
>
> def key_of_highest(d):
> mv = max(d.values())
> for k in d:
> if d[k] == mv:
> break
> return k
>
> Well, it may not be O(N), but it avoids spurious assignments. :^)

It is O(n) (assuming that d.values() is); there is only the matter of
the constant, and the space for d.values() (versus the iterator
version).

TJR

Paul Rubin

unread,

Feb 4, 2003, 4:23:50 PM2/4/03

to

"Terry Reedy" <tjr...@udel.edu> writes:
> It is O(n) (assuming that d.values() is); there is only the matter of
> the constant, and the space for d.values() (versus the iterator version).

The reason we have a lot of threads about native compilers and other
types of Python speedups: constant factors do matter.

Giovanni Bajo

unread,

Feb 4, 2003, 8:41:06 PM2/4/03

to

"Alex Martelli" <al...@aleax.it> ha scritto nel messaggio
news:nXL%9.174528$AA2.6...@news2.tin.it...

> def get_highest_in_linear_time_without_explicit_iterators(d):
> for k in d:
> maxkey, maxval = k, d[k]
> break
> else:
> raise ValueError, "empty dictionary has no highest key"
> for k in d:
> v = d[k]
> if v>maxval: maxkey, maxval = k, v
> return maxkey

Doesn't d[k] require O(logN) too? Or is Python smart enough to optimize this
away since we are going through the dictionary?
Your previous solution with explicit iterators didn't have this "issue".

Giovanni Bajo

Paul Rubin

unread,

Feb 4, 2003, 8:53:14 PM2/4/03

to

"Giovanni Bajo" <no...@sorry.com> writes:
> Doesn't d[k] require O(logN) too? Or is Python smart enough to
> optimize this away since we are going through the dictionary?

Dictionary access is supposed to be constant time (hash lookup).

Nick Vargish

unread,

Feb 5, 2003, 7:48:01 AM2/5/03

to

John La Rooy <nospam...@doctor.com> writes:

> It is O(N)

Oh. Cool. I thought the "max()" call would count against me,
somehow... Is it too late for this old dog to learn O(n) metrics? :^)

> def key_of_highest(d):
> mv = max(d.values())
> return [k for k in d if d[k]==mv][0]

It's terser, but lacks the potential early escape that the explicit
for/test/break approach gives you.

Alex Martelli

unread,

Feb 5, 2003, 9:20:14 AM2/5/03

to

Giovanni Bajo wrote:

>
> "Alex Martelli" <al...@aleax.it> ha scritto nel messaggio
> news:nXL%9.174528$AA2.6...@news2.tin.it...
>
>> def get_highest_in_linear_time_without_explicit_iterators(d):
>> for k in d:
>> maxkey, maxval = k, d[k]
>> break
>> else:
>> raise ValueError, "empty dictionary has no highest key"
>> for k in d:
>> v = d[k]
>> if v>maxval: maxkey, maxval = k, v
>> return maxkey
>
> Doesn't d[k] require O(logN) too? Or is Python smart enough to optimize
> this away since we are going through the dictionary?

d[k] is roughly constant time (dictionaries are hash tables) in
terms of number of entries in the dictionary. If the KEYS take
non-O(1) time to hash, as I believe strings do [hash(x) is O(len(x))
if x is a string -- I *think*], then that might be an issue, yes.

So, this is O(N) where N is the number of entries in d, but only
assuming each of d's keys is of bounded length (or some other kind
of data type which takes bounded time per hash computation, i.e.,
_most_ of them but I believe not all).

However, I think this issue (which often does get ignored when
analyzing algorithms, said he to lighten the burden of his guilt;-)
applies to other "atoms" here. For example, the v>maxval comparison
is ASSUMED to be O(1) -- but it MIGHT depend on the v and maxval
values and types; if v and maxval are strings of len K that only
differs towards the end, then the comparison takes O(K) [and as
far as I can see there is no way to safeguard against this by
changing algorithms, since the comparisons _will_ need to be done].

> Your previous solution with explicit iterators didn't have this "issue".

Yes, that solution never indexed the dictionary (but it did still
compare the values, inevitably). Here's another O(N) solution,
similar to this one and not using explicit iterators (works in
old Python versions too):

def get_highest_in_linear_time_with_items(d):
dit = d.items()
maxkey, maxval = dit[0]
for k, v in dit:

if v>maxval: maxkey, maxval = k, v
return maxkey

This does require extra *space* that is O(N), though (to store
dit). This doesn't _directly_ translate to using iteritems,
because it does one access via dit[0]. Avoiding that access
can bring us back closer to the previous solution:

def get_highest_in_linear_time_with_iteritems(d):
idit = d.iteritems()
for maxkey, maxval in idit:

break
else:
raise ValueError, "empty dictionary has no highest key"

for k, v in idit:

if v>maxval: maxkey, maxval = k, v
return maxkey

However, getting maxkey and maxval with a call to idit.next()
is clearly a more direct approach than that "for which runs
only once because it immediately breaks" -- and that in turn
gets us all the way around to "my previous solution" you mention.

Alex

Duncan Booth

unread,

Feb 5, 2003, 9:33:15 AM2/5/03

to

Alex Martelli <al...@aleax.it> wrote in
news:y490a.135838$0v.38...@news1.tin.it:

> d[k] is roughly constant time (dictionaries are hash tables) in
> terms of number of entries in the dictionary. If the KEYS take
> non-O(1) time to hash, as I believe strings do [hash(x) is O(len(x))
> if x is a string -- I *think*], then that might be an issue, yes.

If I remember correctly, python stores the hash in each string, so if you
have already used that string as a dictionary key the hash isn't
recalculated. In this case the 'for k in d:' produces a sequence of keys
that are guaranteed already pre-hashed.

--
Duncan Booth dun...@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?

Christopher A. Craig

unread,

Feb 5, 2003, 9:41:30 AM2/5/03

to

Paul Rubin <phr-n...@NOSPAMnightsong.com> writes:

Dictionary access is constant time on average, but big O doesn't
denote average. Dictionary lookup is O(N), and those lookups are not
optimized out.

--
Christopher A. Craig <list-...@ccraig.org>
"The problem with X is that it's overadequate" Dennis Ritchie

Alex Martelli

unread,

Feb 5, 2003, 10:09:14 AM2/5/03

to

Terry Reedy wrote:
...

>> def key_of_highest(d):
>> mv = max(d.values())
>> for k in d:
>> if d[k] == mv:
>> break
>> return k
>>
>> Well, it may not be O(N), but it avoids spurious assignments. :^)
>
> It is O(n) (assuming that d.values() is); there is only the matter of
> the constant, and the space for d.values() (versus the iterator
> version).

max(d.itervalues()) should solve the space problem. If you want
to determine the times, of course, you need to run the various
possible solutions *on dictionaries that are representative of
problems your application will really solve* -- running on other
dictionaries whose contents are not representative of your
problems will give you little practical help.

Alex

Peter Abel

unread,

Feb 5, 2003, 5:31:10 PM2/5/03

to

Alex Martelli <al...@aleax.it> wrote in message news:<OQM%9.128856$0v.36...@news1.tin.it>...

> (may compare keys thus crash
> when one of equal values has a complex key, see my reply

> to A. Bennett) <snip>

In deed I could find this:
>>> d={'a': 41, 'b': 3, 'd': (1+2j)}
>>> max(d.values())

Traceback (most recent call last):

File "<interactive input>", line 1, in ?

TypeError: cannot compare complex numbers using <, <=, >, >=

But what I do not understand is why the following doesn't crash:
>>> d={'a': 41, 'b': 'X', 'd': (1+2j)}
>>> max(d.values())
'X'

Coming back once again to the basic problem. What about the
following concerning space, speed etc. when we assume
that values are only numbers.
>>> d={'a': 15, 'b': 12, 'd': 9}
>>> d.keys()[d.values().index(max(d.itervalues()))]
'a'
>>>

I-like-1-liners-Regards
Peter

Alex Martelli

unread,

Feb 5, 2003, 7:18:32 PM2/5/03

to

Peter Abel wrote:
...

> But what I do not understand is why the following doesn't crash:
>>>> d={'a': 41, 'b': 'X', 'd': (1+2j)}
>>>> max(d.values())
> 'X'

>>> 2j>'X'
0
>>> 2j<'X'
1
>>> 2j>43

Traceback (most recent call last):

File "<stdin>", line 1, in ?

TypeError: cannot compare complex numbers using <, <=, >, >=
>>>

you CAN compare a complex number to a string -- just not to
any other number. Now, that is a "feature" I'm not going to
try to defend (can anybody think of a GOOD reason, as opposed
to an implementation detail leaking out...?).

Alex

Carlos Ribeiro

unread,

Feb 5, 2003, 8:27:51 PM2/5/03

to

On Thursday 06 February 2003 12:18 am, Alex Martelli wrote:
> you CAN compare a complex number to a string -- just not to
> any other number. Now, that is a "feature" I'm not going to
> try to defend (can anybody think of a GOOD reason, as opposed
> to an implementation detail leaking out...?).

Oops. It looks like a bug. It smells like a bug. It moves like a bug. But such
trivial bugs do not exist, so it must be a feature :-)

Carlos Ribeiro

Erik Max Francis

unread,

Feb 5, 2003, 9:24:30 PM2/5/03

to

Alex Martelli wrote:

> you CAN compare a complex number to a string -- just not to
> any other number. Now, that is a "feature" I'm not going to
> try to defend (can anybody think of a GOOD reason, as opposed
> to an implementation detail leaking out...?).

It's a tip of the hat to the mathematical fact that comparisons between
complex numbers is undefined. I agree that in light of Python's
generality of comparisons, it's a little weird to restrict the case
where a complex number is being compared to another numeric value, but
to someone with a mathematics background the reason is obvious. I've
never understood why this "feature" generates such discord.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/ \ Does the true light / Of love come in flashes
\__/ Sandra St. Victor
Rules for Buh / http://www.alcyone.com/max/projects/cards/buh.html
The official rules to the betting card game, Buh.

Chad Netzer

unread,

Feb 5, 2003, 9:46:15 PM2/5/03

to

On Wed, 2003-02-05 at 18:24, Erik Max Francis wrote:

> It's a tip of the hat to the mathematical fact that comparisons between
> complex numbers is undefined.

I think this was understood by Peter Abel (who demonstrated a point
about comparing complex numbers with floats and strings). He ultimately
had a different point (I am assuming).

I believe what he found odd was that, this doesn't throw an exception:

>>> max([41, 'X', (1+2j)])
'X'

whereas this does:
>>> max([41, (1+2j), 'X'])

Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: cannot compare complex numbers using <, <=, >, >=

So the order of the list of values can determine whether max()/min()
will raise a TypeError. And given that the original post was about
sorting dictionary keys, which have no defined ordering, this could
(possibly) bite someone who isn't prepared to catch the exception
(because the issue may not arise in testing, depending on what order the
test data is returned from dict.items())

> I've never understood why this "feature" generates such discord.

Granted. But, things like the above don't help. From a mathematician's
point of view, I'd want max to ALWAYS throw an exception on this data
set (because I'd want to think of the problem as being "find the largest
value in the set of values", where ordering of comparisons doesn't even
come into it. Of course, I probably wouldn't be comparing strings
either. :) )

--
Bay Area Python Interest Group - http://www.baypiggies.net/

Chad Netzer
(any opinion expressed is my own and not NASA's or my employer's)

Alex Martelli

unread,

Feb 6, 2003, 4:31:26 AM2/6/03

to

Erik Max Francis wrote:

> Alex Martelli wrote:
>
>> you CAN compare a complex number to a string -- just not to
>> any other number. Now, that is a "feature" I'm not going to
>> try to defend (can anybody think of a GOOD reason, as opposed
>> to an implementation detail leaking out...?).
>
> It's a tip of the hat to the mathematical fact that comparisons between
> complex numbers is undefined. I agree that in light of Python's

And comparing a complex number with a string IS defined...?

> generality of comparisons, it's a little weird to restrict the case
> where a complex number is being compared to another numeric value, but
> to someone with a mathematics background the reason is obvious. I've

It's not obvious to me why Python claims that 2j<'1' is defined while 2j<1
is not. If the '<' indicates an arbitrary total ordering between values,
as the former would suggest, then there's no obvious reason the ordering
is made non-total, when it's obviously trivial to define orderings
among complex numbers that are just as arbitrary as orderings BETWEEN
complex numbers and strings (e.g., Python 1.5.2 had no problem there).

> never understood why this "feature" generates such discord.

In my case, it's the inconsistency between 2j<'1' working while 2j<1
doesn't -- if both raised (or neither raised, as in 1.5.2), I could
see the sense of it. I'd prefer the 1.5.2 behavior (so the subtle
distinguos I had to draw in this thread were not needed) but I could
accept "no complex number can EVER be compared with ANYTHING" as
having some use (and no practical downside wrt the alternative of
"complex numbers can be compared with strings but not with numbers").

But the current neither-fish-nor-fowl situation is just silly.

Alex

Ian Bicking

unread,

Feb 6, 2003, 4:43:39 AM2/6/03

to

On Thu, 2003-02-06 at 03:31, Alex Martelli wrote:
> In my case, it's the inconsistency between 2j<'1' working while 2j<1
> doesn't -- if both raised (or neither raised, as in 1.5.2), I could
> see the sense of it. I'd prefer the 1.5.2 behavior (so the subtle
> distinguos I had to draw in this thread were not needed) but I could
> accept "no complex number can EVER be compared with ANYTHING" as
> having some use (and no practical downside wrt the alternative of
> "complex numbers can be compared with strings but not with numbers").

It would be better if there was an alternative to < (or cmp) that was
potentially arbitrary but also stable and would not raise exceptions.
Like maybe "order".

Then, would an entirely arbitrary function be appropriate? Like:

def order(a, b):
return cmp(id(a), id(b))

--
Ian Bicking ia...@colorstudy.com http://colorstudy.com
4869 N. Talman Ave., Chicago, IL 60625 / 773-275-7241
"There is no flag large enough to cover the shame of
killing innocent people" -- Howard Zinn

Alex Martelli

unread,

Feb 6, 2003, 5:19:01 AM2/6/03

to

Ian Bicking wrote:
...

> It would be better if there was an alternative to < (or cmp) that was
> potentially arbitrary but also stable and would not raise exceptions.
> Like maybe "order".
>
> Then, would an entirely arbitrary function be appropriate? Like:
>
> def order(a, b):
> return cmp(id(a), id(b))

I cannot see any practical use for this -- it seems to give
far too few "guarantees" one could base any use on. E.g.:

>>> def order(a, b):
... return cmp(id(a), id(b))
...
>>> x=11
>>> order(x,11)
0
>>> x=1111
>>> order(x,1111)
1
>>>

How would you use an "ordering" which can see "equality"
for very small integers but not for somewhat-larger ones,
etc, etc?

I think at the very least any ordering should cooperate
with == to the extent that x==y --> order(x,y)==0 (maybe
not the reverse, if the ordering is looking at fewer
aspects of x and y than the equality-comparison is --
but such an ordering wouldn't be suitable for general
use, I think).

Alex

Mel Wilson

unread,

Feb 6, 2003, 9:38:09 AM2/6/03

to

In article <OXp0a.139082$0v.39...@news1.tin.it>,

Alex Martelli <al...@aleax.it> wrote:
>In my case, it's the inconsistency between 2j<'1' working while 2j<1
>doesn't -- if both raised (or neither raised, as in 1.5.2), I could
>see the sense of it. I'd prefer the 1.5.2 behavior (so the subtle
>distinguos I had to draw in this thread were not needed) but I could
>accept "no complex number can EVER be compared with ANYTHING" as
>having some use (and no practical downside wrt the alternative of
>"complex numbers can be compared with strings but not with numbers").

With our penchant for dynamic typing we want to be able
to do

some_list_of_things.sort()

and not just get a mess of exceptions.

>But the current neither-fish-nor-fowl situation is just silly.

I guess it helps people who *are* doing mathematical
computing and have said `if a <= rj:` unaware that one of
the operands is no longer real.

I guess if you simply want to put a bunch of things into
a convenient arbitrary order, and you know there are complex
numbers around, you're going to have to write a custom
compare function and use the dire RTTI.

I guess it's a bit of a wart.

Regards. Mel.

What would that look like? (Python 2.1.3 dies
comparing a complex against a string.)

def square_peg_vs_round_hole (a, b):
## print a, type(a), b, type(b)
cplx_type = type (1+1j)
numeric_types = (type (0), type (0L), type (0.0))
if type (a) is cplx_type:
if type (b) is cplx_type:
return cmp (abs(a), abs(b)) # my choice of tactics
elif type (b) in numeric_types:
return 1 # sort complex after reals
else:
return -1 # complex before non-numeric

elif type (b) is cplx_type:
if type (a) in numeric_types:
return -1 # sort reals before complex
else:
return 1 # non-numeric after complex

else:
return cmp (a, b)
# end square_peg_vs_round_hole

Feeling of impending doom here if there is some type
`some_type` where I sort reals before complex, and complex
before some_type and the standard cmp wants to put some_type
before reals. That could be trouble.

Peter Abel

unread,

Feb 6, 2003, 5:29:07 PM2/6/03

to

Chad Netzer <cne...@mail.arc.nasa.gov> wrote in message news:<mailman.104449969...@python.org>...

> On Wed, 2003-02-05 at 18:24, Erik Max Francis wrote:
>
> > It's a tip of the hat to the mathematical fact that comparisons between
> > complex numbers is undefined.
>
> I think this was understood by Peter Abel (who demonstrated a point
> about comparing complex numbers with floats and strings). He ultimately
> had a different point (I am assuming).
>
> I believe what he found odd was that, this doesn't throw an exception:
>
> >>> max([41, 'X', (1+2j)])
> 'X'
>
> whereas this does:
> >>> max([41, (1+2j), 'X'])
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> TypeError: cannot compare complex numbers using <, <=, >, >=
>
> So the order of the list of values can determine whether max()/min()

> will raise a TypeError. <snip>

OK the first thing I wondered about was what you describe above.
But thinking more about this stuff I asked myself why complex
crashes against integer and not against char.
But searching an answer to this questions and also after reading
the following posts and remembering Alex Martelli who taught me
something about lexical order in an other thread I found some
explanatiions for myself.
1. As you already explained comparing int and complex makes no
sense from the mathematical point of view. Int is a scalar
and complex is something like a vector with length and direction.
So which property should be compared.
2. Comparing complex and char is not quite clear at the first view.
But comparing it lexically - I think - can make sense.
The same way int or float and char does.

If it is as simple as I described above, I have no problems with
that stuff.
Regards
Peter

Mike Meyer

unread,

Feb 6, 2003, 8:11:24 PM2/6/03

to

Alex Martelli <al...@aleax.it> writes:

> It's not obvious to me why Python claims that 2j<'1' is defined while 2j<1
> is not.

Of course, you *can* write comparisons that certainly imply that you
can cmopare complex numbers. For instance,

2j < a < 1

won't raise an exception if a is a non-numeric type. If that makes
sense, then 2j < 1 ought to make sense.

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Delaney, Timothy C (Timothy)

unread,

Feb 6, 2003, 9:09:47 PM2/6/03

to

> From: Mike Meyer [mailto:m...@mired.org]

>
> Of course, you *can* write comparisons that certainly imply that you
> can cmopare complex numbers. For instance,
>
> 2j < a < 1
>
> won't raise an exception if a is a non-numeric type. If that makes
> sense, then 2j < 1 ought to make sense.

Damn - that's absolutely correct. Care to raise a bug report on SF?

Tim Delaney

Alex Martelli

unread,

Feb 7, 2003, 2:53:48 AM2/7/03

to

Mel Wilson wrote:
...

> What would that look like? (Python 2.1.3 dies
> comparing a complex against a string.)
>
>
> def square_peg_vs_round_hole (a, b):

This implementation raises an exception when you try
to compare (e.g.) lists whose first items are a
complex and any number respectively (because it falls
back to cmp for this case and cmp cannot know it
needs to call square_etc recursively for lexical
comparisons of sequences). Reimplementing all of
cmp is quite a bother...

Alex

Mel Wilson

unread,

Feb 7, 2003, 11:41:27 AM2/7/03

to

In article <gCJ0a.142939$0v.40...@news1.tin.it>,
Alex Martelli <al...@aleax.it> wrote:

>Mel Wilson wrote:
>> def square_peg_vs_round_hole (a, b):
>
>This implementation raises an exception

Hmmm.. even more impending doom than I thought.

> when you try
>to compare (e.g.) lists whose first items are a
>complex and any number respectively (because it falls
>back to cmp for this case and cmp cannot know it
>needs to call square_etc recursively for lexical
>comparisons of sequences). Reimplementing all of
>cmp is quite a bother...

It isn't simple to satisfy, at the same time, two groups
who need opposite things. Maybe the cmp we have, doing the
right thing quite often, and doing something not insane most
of the time, is the most useful kludge we can get.

I could extend square_peg_... to recurse down lists, but
that way there's the question of what's meant to be discrete
and what's meant to be a list, and we've been there already.
And classes generally -- exactly whose __cmp__ method am
I going to ignore?

Even worse ideas are:

system.__cmp__ = square_peg_vs_round_hole

because it makes the behaviour of the whole interpreter
depend on the prowess of J. Random Coder (e.g. me) and

pragma no_fault_complex_compares

to micro-mamage cmp behaviour over a code suite. Bad
because there are a gazillion things that 'need' to be
micro-managed. Python would immediately cease to be a
simple language. And the stacking and unstacking... You
won't see a PEP from me anythime soon.

Regards. Mel.

Mike Meyer

unread,

Feb 7, 2003, 8:53:49 PM2/7/03

to

I should file a bug report that, if acted upon, would result in
further spreading the broken behavior of complex numbers in
comparisons? That seems counterproductive.