Incomparable abominations (was: python-dev Summary)

David Mertz, Ph.D.

unread,

Mar 19, 2003, 2:12:53 PM3/19/03

to

I don't usually read Python-Dev, although I suppose I should more (time,
time, time). The latest summary prompted a look, which made me notice
Alex Martelli's well-founded complaint about the breakage in sorting
list with complex numbers in them:

http://mail.python.org/pipermail/python-dev/2003-March/034065.html

Guido subsequently defended the erratic and unintuitive behavior of
Python comparisons. Andrew Koenig made an excellent observation in the
thread:

In the first case, we have a total ordering; in the second, we have
what C++ calls a "strict weak ordering", which is really an ordering
of equivalence classes.

I completely DO NOT BUY Guido's point in the thread. The failure of
.sort() depending on obscure combinations of list contents is, to my
mind, the BIGGEST WART that Python has.

The argument is sometimes made that it is a programming mistake to try
to compare items of incomensurable types. But current Python is very
happy to compare a string to a number, for example. It would take a
radical overhaul to get rid of all the arbitrary (but stable)
comparisons. In any case, when I see .sort(), I really want a (stable)
total ordering, even understanding that the ordering is arbitrary across
types. It is quite sensible, and not uncommon, to collect various
types of things within dict.keys(), for example.

Moreover, from my perspective, the same applies to incommensurabilities
involving Unicode strings. I've had it explained to me before that I
should believe this doesn't really count because I need to specify an
encoding. Nonsense! I want the ability to perform a total ordering on
Python objects, at least within a sort.

Here is some erratic behavior (most of it introduced, BTW, in *Python
2.1*; even 2.0 is OK with complex numbers). Try to guess when an
exception will be raised (and in which Python version):

['x','y','z'].sort()
['x','y','z', 1j].sort()
['x','y','z', 1j, 1].sort()
['x','y','z', 1, 2].sort()
[0j, 1j, 2j].sort() # An obvious "natural" order
[0j, 1, 2].sort()
[0, 1, 2].sort() # Notice that 0==0j --> True
[chr(120), chr(240)].sort()
[chr(120), chr(240), 'x'].sort()
[chr(120), chr(240), u'x'].sort() # Notice u'x'=='x' --> True
[chr(120), u'x'].sort()
[chr(120), 'x', u'x'].sort()
[chr(120), 'x', u'x', 1j].sort()

A custom sorting function can handle these inconsistencies--very slowly.
What would be MUCH nicer would be to have a Python 2.0-style
".stablesort()" method (almost, Unicode "broke" then already). Ideally,
this method would do something -sensible- within types, and something
-consistent- between types (at least consistent within one run of the
interpreter, not necessarily any more than that).

Yours, David...

Andrew Koenig

unread,

Mar 19, 2003, 5:18:57 PM3/19/03

to

David> I completely DO NOT BUY Guido's point in the thread. The
David> failure of The argument is sometimes made that it is a
David> programming mistake to try to compare items of incomensurable
David> types.

Actually, the problem is somewhat worse than that, because at present,
comparison between long integers and floating-point numbers is done by
converting the integer to floating-point and comparing the converted
result. One problem with this approach is that the conversion might
fail. So, for example, [0.0,1L<<10000].sort() typically causes a
run-time error.

More subtle problems can occur because of loss of information in
conversions. For example:

>>> x = 10000000000000000000000000000000000
>>> y = x+1
>>> z = float(x)
>>> x == z
1
>>> y == z
1
>>> x == y
0

See SF bug report #513866 for more detail.

--
Andrew Koenig, a...@research.att.com, http://www.research.att.com/info/ark

Greg Ewing (using news.cis.dfn.de)

unread,

Mar 19, 2003, 6:11:58 PM3/19/03

to

David Mertz, Ph.D. wrote:
> I want the ability to perform a total ordering on
> Python objects, at least within a sort.

In a recent discussion on python-dev, Guido has expressed
favourable thoughts on introducing a before(x,y) operation
for arbitrary ordering. So there is hope that we may get
back the ability to sort arbitrary lists in some future
version.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Jeremy Fincher

unread,

Mar 19, 2003, 6:43:54 PM3/19/03

to

me...@gnosis.cx (David Mertz, Ph.D.) wrote in message news:<mailman.104810142...@python.org>...
> [snip]
> Yours, David...

I can't say much more than, "I concur." It seems completely
un-Pythonic to me to restrict comparisons to happen only between
"compatible" types.

Jeremy

Lulu of the Lotus-Eaters

unread,

Mar 19, 2003, 7:04:29 PM3/19/03

to

|> I want the ability to perform a total ordering on
|> Python objects, at least within a sort.

"Greg Ewing" <m...@privacy.net> wrote previously:

|In a recent discussion on python-dev, Guido has expressed
|favourable thoughts on introducing a before(x,y) operation
|for arbitrary ordering. So there is hope that we may get
|back the ability to sort arbitrary lists in some future
|version.

That would be great. Hopefully this operation also rolled into a
.sort()-like method that avoids the function call overhead of using a
custom comparison function.

Or better still, the total ordering behavior can be the behavior of
[].sort() itself... while the current behavior can be retained under the
name:

[].kinda_like_sort_but_crash_at_the_slightest_whiff_of_trouble()

Yours, Lulu...

--
mertz@ | The specter of free information is haunting the `Net! All the
gnosis | powers of IP- and crypto-tyranny have entered into an unholy
.cx | alliance...ideas have nothing to lose but their chains. Unite
| against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------

Mike Meyer

unread,

Mar 20, 2003, 9:28:53 AM3/20/03

to

me...@gnosis.cx (David Mertz, Ph.D.) writes:

> I completely DO NOT BUY Guido's point in the thread. The failure of
> .sort() depending on obscure combinations of list contents is, to my
> mind, the BIGGEST WART that Python has.

I'm hoping this is a sign that what I think of as one of Python's
warts - the near-total ordering of objects - is going to be repaired.

Given the number of people who think that complex objects ought to
compare, maybe I ought to explain that.

One of the things Python does that I like is it doesn't try and guess
what I mean. I noticed this coming from scripting in Perl, where doing
stupid things gives stupid results instead of an error. I feel much
safer using Python to manipulate my data than I do using Perl because
of this.

Right now, when I compare incomparable types, I get a result. Worse
yet, I get one that can vary from Python implementation to Python
implementation. To get results I can trust, I have to convert one of
the two objects to something that is comparable to the object in
question.

I claim I should get an error when I compare incomparable types,
because explicit beats implicit, and the current behavior has no
practical value.

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

David Eppstein

unread,

Mar 20, 2003, 10:21:40 AM3/20/03

to

In article <x78yvan...@guru.mired.org>, Mike Meyer <m...@mired.org>
wrote:

> I claim I should get an error when I compare incomparable types,
> because explicit beats implicit, and the current behavior has no
> practical value.

Sure it does, it has the value that (if L is a list) L.sort() always
puts it in a canonical ordering (same list produces the same ordering,
within runs from the same version of Python) so e.g. you can compare
sets of items for equality without worrying about them being in
different orders. That is, it works if you're not so foolish as to put
complex numbers into your sets...

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

Jeremy Fincher

unread,

Mar 20, 2003, 1:41:37 PM3/20/03

to

"Greg Ewing (using news.cis.dfn.de)" <m...@privacy.net> wrote in message news:<b5at5f$21vsdt$1...@ID-169208.news.dfncis.de>...

> David Mertz, Ph.D. wrote:
> > I want the ability to perform a total ordering on
> > Python objects, at least within a sort.
>
> In a recent discussion on python-dev, Guido has expressed
> favourable thoughts on introducing a before(x,y) operation
> for arbitrary ordering. So there is hope that we may get
> back the ability to sort arbitrary lists in some future
> version.

If something like this was added, so there was a total ordering on
Python objects during sorts, I'd be completely happy with the
comparison operators </>/<=/>= being changed to only work between
consistent types. In fact, I'd prefer it that way.

Of course, == and != would still work between inconsistent types.

Jeremy

John Roth

unread,

Mar 20, 2003, 8:49:55 PM3/20/03

to

"Jeremy Fincher" <tweed...@hotmail.com> wrote in message
news:698f09f8.03032...@posting.google.com...

I think that's probably the way is *should* have been from the
beginning. However, I'd be against changing it at this point; it would
break programs. Save the change for Python 3000.

John Roth

>
> Jeremy

Jeremy Fincher

unread,

Mar 21, 2003, 12:58:58 AM3/21/03

to

"John Roth" <john...@ameritech.net> wrote in message news:<v7krnnm...@news.supernews.com>...

> "Jeremy Fincher" <tweed...@hotmail.com> wrote in message
> news:698f09f8.03032...@posting.google.com...

> > If something like this was added, so there was a total ordering on
> > Python objects during sorts, I'd be completely happy with the
> > comparison operators </>/<=/>= being changed to only work between
> > consistent types. In fact, I'd prefer it that way.
> >
> > Of course, == and != would still work between inconsistent types.
>
> I think that's probably the way is *should* have been from the
> beginning. However, I'd be against changing it at this point; it would
> break programs. Save the change for Python 3000.

If the sort() method on lists was changed to use this total ordering
on Python objects (using before()), then changing the relational
operators to raise exceptions on type-invalid code wouldn't break any
of *my* code, so I'd be fine with the change :)

(And what code it did break, mine or not, would almost certainly be
either a bug or a place that would need to use before() instead of <;
I can't imagine the latter case occuring all that often.)

Jeremy

John Roth

unread,

Mar 21, 2003, 8:29:34 AM3/21/03

to

But that's not the issue. Suggested changes to Python won't happen
if there's any likelihood of breaking people's code. Incompatible
changes need a lot of justification and quite a bit of work to show
that the error cases are unlikely. Assertions don't do it; actual
surveys
of working code are needed.

Even then it would have to be transitioned in over several releases,
with warning messages for illegal usage.

John Roth

>
> Jeremy

Jp Calderone

unread,

Mar 21, 2003, 1:23:37 PM3/21/03

to

Nah. All that's needed is a BDFL pronouncement. PEP 238, for example.

>
> Even then it would have to be transitioned in over several releases, with
> warning messages for illegal usage.

That much is certain.

Jp

--
Examinations are formidable even to the best prepared, for
even the greatest fool may ask more the the wisest man can answer.
-- C.C. Colton
--
up 1 day, 13:58, 2 users, load average: 0.18, 0.34, 0.36

Lulu of the Lotus-Eaters

unread,

Mar 21, 2003, 1:10:25 PM3/21/03

to

"John Roth" <john...@ameritech.net> wrote previously:

|But that's not the issue. Suggested changes to Python won't happen
|if there's any likelihood of breaking people's code. Incompatible
|changes need a lot of justification

Except, of course, in the case of Python 2.1 breaking lots of working
code by making complex numbers incomparable to numbers (but happily
comparable to other types). :-(

Tim Peters

unread,

Mar 22, 2003, 6:16:55 PM3/22/03

to

[Mike Meyer]

> I claim I should get an error when I compare incomparable types,
> because explicit beats implicit, and the current behavior has no
> practical value.

I think it's approximately impossible to explain why

1 + "2"

raises an exception now, but

1 < "2"

doesn't. In the earliest Pythons, for obscure technical reasons it wasn't
*possible* for a comparison operation to raise an exception, and I expect
the current inconsistency was driven at least as much by that technical
glitch as by deliberate choice.

[David Eppstein]

> Sure it does, it has the value that (if L is a list) L.sort() always
> puts it in a canonical ordering (same list produces the same ordering,
> within runs from the same version of Python)

It's not easy to spell out what Python guarantees here today, in part
because phrases like "same list" could mean all sorts of things when
platforms, releases, or program runs separate the two lists claimed to be
"the same".

Under any reasonable meaning I can dream up,

same list produces the same ordering, within runs from the same
version of Python

is too strong a claim, because Python's comparison sometimes resorts to
comparing objects' memory addresses. The most notorious case of that is
instances of a common class, where the class doesn't override the default
comparison behavior:

class C:
pass

a = C()
b = C()
print cmp(a, b)

The result printed depends entirely on where the memory allocated for a
happens to land relative to the memory allocated for b. This can and does
differ across:

+ Platforms, keeping the release of Python fixed.

+ Platform C library linked with, keeping the platform and Python
release fixed.

+ Program runs, keeping the platform and Python binary fixed.

The last point is platform- and build-dependent, for the code as given,
depending on how the platform C and OS interact to pass out address space.
On any platform, the result can differ across runs if the Python code is
more complicated (for example, if some other part of the program keys off
time of day, or a random.random() result, or thread timing -- anything that
can change the exact sequence of malloc() calls made).

This kind of thing is a common error when programming with persistent
databases. For example, ZODB uses persistent BTrees extensively, and BTrees
need a consistent total ordering of key values *across* program runs (and
across Python releases, for that matter). If key comparison ever falls back
to comparing object memory addresses, then a BTree created and used within a
single program run works fine, but when the next program run retrieves the
BTree from the database, the relative ordering of object addresses can be
any permutation of the original ordering. BTree invariants are violated
left and right then, but the database code has no clear and efficient way to
know that they are. The result is crazy-- and seemingly random --behavior,
instead of exceptions.

Applications get in real trouble from that, and also from letting 1 < "2"
pass silently.

> so e.g. you can compare sets of items for equality without worrying
> about them being in different orders.

For that purpose, it's usually more efficient to populate two dicts with the
list elements (mapping to counts if they're multisets), and then use dict1
== dict2. That's expected-case linear-time, in the number of elements.
Sorting has worse expected-case O() behavior.

I've used mixed_type_list.sort() on occasion just to bring objects of the
same type next to each other, but that could be done more efficiently using
a dict too (mapping type object to list of objects of that type).

I really don't know of a good use for mixed_type_list.sort(). If I had a
good use and Python griped, then DSU of the form

helper = [(type(obj), obj) for obj in mixed_type_list]
helper.sort()
list = [obj for t, obj in helper]

would transform it easily enough into a sort that didn't compare objects of
different types. I have to admit I'm not even tempted to sort complex
numbers, though.

looks-like-an-old-bug-to-me-ly y'rs - tim

Donn Cave

unread,

Mar 22, 2003, 7:38:54 PM3/22/03

to

Quoth "Tim Peters" <tim...@email.msn.com>:

| I think it's approximately impossible to explain why
|
| 1 + "2"
|
| raises an exception now, but
|
| 1 < "2"
|
| doesn't. In the earliest Pythons, for obscure technical reasons it wasn't
| *possible* for a comparison operation to raise an exception, and I expect
| the current inconsistency was driven at least as much by that technical
| glitch as by deliberate choice.

What about

1 == "2"

, should it also be cause for an exception? Sorry if I missed the
earlier posts in this thread that would explain how == differs from
< in this application.

Donn Cave, do...@drizzle.com

Lulu of the Lotus-Eaters

unread,

Mar 22, 2003, 9:17:23 PM3/22/03

to

"Tim Peters" <tim...@email.msn.com> wrote previously:

|Python's comparison sometimes resorts to comparing objects' memory
|addresses. The most notorious case of that is instances of a common
|class

I have no particular desire to have sorting work identically between
platforms, versions, or even different interpreter runs. As I wrote
before, I just want it consistent *within* an interpreter run.

That is to say, I want the following function always to return True:

def WantTrue(*l):
m = list(l)
n = list(l)
m.sort()
n.sort()
return m == n

My desire was fulfilled for Python 1.5.2 and every earlier version, btw
(the only caveat I can think of is if the builtin 'list' was
overridden). Python 2.0 broke the desideratum for the new Unicode
objects (sometimes)[*]. Python 2.1 broke it worse for complex numbers.

Yours, Lulu...

[*] I'm not positive whether 1.6 has Unicode, actually. I don't have
that version installed on my current system (only 1.5.1, 1.5.2, 2.0,
2.0-stackless, 2.1, 2.2.0, 2.2.2, 2.3a2, and Jython 2.0 :-)).

--
mertz@ _/_/_/_/_/_/_/ THIS MESSAGE WAS BROUGHT TO YOU BY:_/_/_/_/ v i
gnosis _/_/ Postmodern Enterprises _/_/ s r
.cx _/_/ MAKERS OF CHAOS.... _/_/ i u
_/_/_/_/_/ LOOK FOR IT IN A NEIGHBORHOOD NEAR YOU_/_/_/_/_/ g s

Tim Peters

unread,

Mar 22, 2003, 11:17:08 PM3/22/03

to

[Lulu of the Lotus-Eaters]

> I have no particular desire to have sorting work identically between
> platforms, versions, or even different interpreter runs. As I wrote
> before, I just want it consistent *within* an interpreter run.

Yes, I saw that before, and it's unclear why you're repeating it. As I
explained in the post to which you're replying here, inconsistency across
runs creates real problems for real apps. What I haven't seen from you is
*why* you want sorting "to work" even when comparison results are pulled out
of thin air. David Eppstein gave one use case (comparing sets represented
as lists for equality), almost certainly better done a different way (with
dicts, as explained before). What's your use case?

> That is to say, I want the following function always to return True:
>
> def WantTrue(*l):
> m = list(l)
> n = list(l)
> m.sort()
> n.sort()
> return m == n
>
> My desire was fulfilled for Python 1.5.2 and every earlier version, btw
> (the only caveat I can think of is if the builtin 'list' was
> overridden). Python 2.0 broke the desideratum for the new Unicode
> objects (sometimes) [*].

It would be hard to pick a worse example than that, don't you think?
Comparing regular strings to Unicode strings is comparing two kinds of
strings, and if a regular string can't be *meaningfully* converted to
Unicode, making up a nonsense result based on the memory addresses of the
strings seems plain harmful -- the result, and the sort, has nothing to do
with the string contents then. Python wasn't designed on the principle of
maximal surprise <0.5 wink>.

> Python 2.1 broke it worse for complex numbers.

Yes, but what of it? Do you actually sort lists containing complex numbers?
If so, I do believe you'll be the first to claim such a hobby.

> Yours, Lulu...
>
> [*] I'm not positive whether 1.6 has Unicode, actually.

1.6 was the first version with builtin Unicode support, but 2.0 was released
less than 24 hours later.

Tim Peters

unread,

Mar 22, 2003, 11:44:24 PM3/22/03

to

[Donn Cave]

> What about
>
> 1 == "2"
>
> , should it also be cause for an exception? Sorry if I missed the
> earlier posts in this thread that would explain how == differs from
> < in this application.

It's arguable both ways, of course. After rich comparisons appeared, people
played with both approaches in their own types. What Guido seems happiest
with now (and which I not coincidentally implemented for the 2.3 datetime
module's types, after real-world problems with "always raise an exception"
on meaningless comparisons):

class SomeType:
def __eq__(self, other):
if not_a_type_i_know_about(type(other)):
return NotImplemented
return True or False appropriately

and similarly for __ne__. The others raise TypeError instead if they don't
know about other.

The implications for __eq__: if type(other) is not one SomeType knows
about, returning NotImplemented gives other a chance at providing a
meaningful result. If other.__eq__(self) also returns NotImplemented,
Python falls back to comparing self and other by memory address. Since self
is not other, this must return False.

Similarly, if __ne__ falls back to comparing by memory address, that must
return True.

In a world where this was consistently applied,

1 == "2"

would return False, and

1 != "2"

True, and

1 <= "2"
1 < "2"
1 > "2"
1 >= "2"

would raise TypeError.

Dan Bishop

unread,

Mar 23, 2003, 3:46:06 AM3/23/03

to

"Donn Cave" <do...@drizzle.com> wrote in message news:<1048379933.586672@yasure>...

> What about
>
> 1 == "2"
>

> should it also be cause for an exception?

Imho, no. It should simply have a value of False.

> Sorry if I missed the
> earlier posts in this thread that would explain how == differs from
> < in this application.

Because it makes intuitive sense that x != y when x and y have
completely different types. The same cannot be said about "3" < 1.

David Mertz, Ph.D.

unread,

Mar 23, 2003, 2:52:49 AM3/23/03

to

|[Lulu of the Lotus-Eaters]
|> I have no particular desire to have sorting work identically between
|> platforms, versions, or even different interpreter runs. As I wrote
|> before, I just want it consistent *within* an interpreter run.

Tim Peters <tim...@comcast.net> wrote previously:

|Yes, I saw that before, and it's unclear why you're repeating it. As I
|explained in the post to which you're replying here, inconsistency across
|runs creates real problems for real apps.

You seemed to be answering Mike Meyer's stronger statement that the same
*version* should always sort the same, which I agree is too strong.

Looking at your first note again, I see what you were getting at with
ZODB persistence as being a case where differences between *runs* could
cause problems. The point was not clear (to me) at first read.

|*why* you want sorting "to work" even when comparison results are

|pulled out of thin air. ...Comparing regular strings to Unicode...

|Python wasn't designed on the principle of maximal surprise <0.5 wink>.

But the current list sorting behavior is BY FAR the most surprising
thing in Python! I would have a big challenge explaining it to a new
user... heck, I can hardly remember the border cases myself.

The examples I gave in my first post for this thread show how surprising
the exceptions are. I really think that more than half of highly
experienced Python programmers would have gotten at least one of them
wrong in their intuition (not you, not Alex... maybe not Guido :-)...
but past that, mistakes).

I guess part of what goes into this is different intuitions about the
meaning of sorting. I put sorting a list into a different conceptual
category than comparing individual items (while knowing, of course, the
underlying relatedness of the concepts). Sorting, to me, just means a
stable (but arbitrary) ordering. Comparison with "<" and friends
implie, to me, a "natural" ordering. I wouldn't mind if "<" stopped
comparing anything without an obviously correct answer.

Even then, no matter how you cut it, the relation "1j < 2j" is
self-evident and natural. How can I explain to anyone why that is an
error rather than a True result?!

Yours, David...

--
---[ to our friends at TLAs (spread the word) ]--------------------------
Echelon North Korea Nazi cracking spy smuggle Columbia fissionable Stego
White Water strategic Clinton Delta Force militia TEMPEST Libya Mossad
---[ Postmodern Enterprises <me...@gnosis.cx> ]--------------------------

John Roth

unread,

Mar 23, 2003, 7:13:17 AM3/23/03

to

"David Mertz, Ph.D." <me...@gnosis.cx> wrote in message
news:mailman.1048407038...@python.org...

>
> Even then, no matter how you cut it, the relation "1j < 2j" is
> self-evident and natural. How can I explain to anyone why that is an
> error rather than a True result?!

That is, however, a border case. Is 1+2j < 2+1j true or false?
Why? Support your answer.

John Roth

Mike Meyer

unread,

Mar 23, 2003, 11:04:33 AM3/23/03

to

me...@gnosis.cx (David Mertz, Ph.D.) writes:

> You seemed to be answering Mike Meyer's stronger statement that the same
> *version* should always sort the same, which I agree is too strong.

I never made any claims about sorting. What I did claim was that
Python would be better off if comparisons made sense or raised
exceptions. This does mean that sorting a list will give the same
results - but that's not the point. I don't think I've ever had a
reason to sort a heterogenous list.

Paul Moore

unread,

Mar 23, 2003, 3:28:27 PM3/23/03

to

me...@gnosis.cx (David Mertz, Ph.D.) writes:

> Even then, no matter how you cut it, the relation "1j < 2j" is
> self-evident and natural.

Indeed. It's self-evidently undefined.

> How can I explain to anyone why that is an error rather than a True
> result?!

You start by explaining what 1j and 2j mean. Once you've done that
(and if you've done it properly, you've covered the fact that there is
no defined ordering on the complex numbers), it's self-evident.

Anyone who doesn't understand the properties of complex numbers
doesn't have much justification for using them. Many people who do
understand them probably *still* don't have a need to use them in
Python.

But having said all this, I think you have a point that people can
(not all do, though) have different intuitions about sorting and
comparison operations - probably because when sorting "real world"
items (books on shelves for example) we don't explicitly think in
terms of comparisons. But I'm not sure it's possible to formalise such
a distinction in a way which is implementable, or of practical use in
a computer program.

Paul.
--
This signature intentionally left blank

Greg Ewing (using news.cis.dfn.de)

unread,

Mar 23, 2003, 11:51:45 PM3/23/03

to

Dan Bishop wrote:
> Because it makes intuitive sense that x != y when x and y have
> completely different types. The same cannot be said about "3" < 1.

Well, it *could* be argued that, using the same kind of
reasoning, "3" < 1 and "3" > 1 should both be False.

Not that I'm arguing that myself!

Lulu of the Lotus-Eaters

unread,

Mar 23, 2003, 11:58:31 PM3/23/03

to

|> the relation "1j < 2j" is self-evident and natural.

"John Roth" <john...@ameritech.net> wrote previously:

|That is, however, a border case. Is 1+2j < 2+1j true or false?

True that the latter case has no natural order. But the former case
does. Likewise, I find this order natural:

"A" < "B"

And this order is completely arbitrary:

u"A" < unicodedata.lookup('HEBREW LETTER ALEF')

In what sense is the Roman alphabet "less than" the Hebrew alphabet?...
I have no hunch at all about where the "A"-like letter in Arabic, Greek,
Cyrillic, etc. would fall in the sequence, FWIW. Nor for various
A-diacritics that occur in Romanesque and Cryillicish alphabets. I
expect the order to be stable, but there is no "natural" answer to the
order.

And yet, both the character/unicode comparisons provide answers, while
the the complex comparisons raise exceptions.

To paraphrase the Timbot, Python seems to be aiming for a "principle of
maximum surprise!"

Yours, Lulu...

--
_/_/_/ THIS MESSAGE WAS BROUGHT TO YOU BY: Postmodern Enterprises _/_/_/
_/_/ ~~~~~~~~~~~~~~~~~~~~[me...@gnosis.cx]~~~~~~~~~~~~~~~~~~~~~ _/_/
_/_/ The opinions expressed here must be those of my employer... _/_/
_/_/_/_/_/_/_/_/_/_/ Surely you don't think that *I* believe them! _/_/

Jp Calderone

unread,

Mar 24, 2003, 12:24:32 AM3/24/03

to

On Sun, Mar 23, 2003 at 11:58:31PM -0500, Lulu of the Lotus-Eaters wrote:
> |> the relation "1j < 2j" is self-evident and natural.
>
> "John Roth" <john...@ameritech.net> wrote previously:
> |That is, however, a border case. Is 1+2j < 2+1j true or false?
>
> True that the latter case has no natural order. But the former case
> does. Likewise, I find this order natural:
>
> "A" < "B"
>
> And this order is completely arbitrary:
>
> u"A" < unicodedata.lookup('HEBREW LETTER ALEF')
>
> In what sense is the Roman alphabet "less than" the Hebrew alphabet?...

In the sense (of which I am certain you are aware) that letters in the
Roman alphabet have been assigned numeric values which are less than those
assigned to letters in the Hebrew alphabet; so dictates conventional
programming practices.

In the same way, conventional mathematical practices dictate that complex
numbers not be compared with < and >.

Is it unfortunate that following two separate conventions sometimes leads
to apparently (superficial) conflicting ideas and behaviors? Yes.

Is it inexplicable to people without a PhD in computer science? No.

Does it prevent us from writing useful programs? No.

Is it an unsurmountable flaw in the language? No.

> [snip]

>
> To paraphrase the Timbot, Python seems to be aiming for a "principle of
> maximum surprise!"

Just the opposite, imho.

Jp

--
Lowery's Law:
If it jams -- force it. If it breaks, it needed replacing anyway.
--
up 4 days, 1:59, 2 users, load average: 0.00, 0.02, 0.00

Roy Smith

unread,

Mar 24, 2003, 7:52:00 AM3/24/03

to

Lulu of the Lotus-Eaters <me...@gnosis.cx> wrote:
> And yet, both the character/unicode comparisons provide answers, while
> the the complex comparisons raise exceptions.

The difference is that character sets are a tool invented by computer
types for their own use, and thus they get to make up the rules as they
go along.

Complex numbers were invented by math people. The onus is on us to
implement them according to the math rules. It may seem "natural" to
you that 1j < 2j, but the math people say it's not, so it's not.

If Bob has three oranges and Jane has 5 oranges, who has more apples?
That may be an absurd question, but it's no more or less absurd than
trying to evaluate 1j < 2j.

It certainly seems natural to some people that 0/0 should return 1.
Yet, it raises an exception. Why? Because the rules of math say it
should.

John Roth

unread,

Mar 24, 2003, 12:54:55 PM3/24/03

to

"Lulu of the Lotus-Eaters" <me...@gnosis.cx> wrote in message
news:mailman.1048482165...@python.org...

> |> the relation "1j < 2j" is self-evident and natural.
>
> "John Roth" <john...@ameritech.net> wrote previously:
> |That is, however, a border case. Is 1+2j < 2+1j true or false?
>
> True that the latter case has no natural order. But the former case
> does. Likewise, I find this order natural:
>
> "A" < "B"
>
> And this order is completely arbitrary:
>
> u"A" < unicodedata.lookup('HEBREW LETTER ALEF')
>
> In what sense is the Roman alphabet "less than" the Hebrew
alphabet?...
> I have no hunch at all about where the "A"-like letter in Arabic,
Greek,
> Cyrillic, etc. would fall in the sequence, FWIW. Nor for various
> A-diacritics that occur in Romanesque and Cryillicish alphabets. I
> expect the order to be stable, but there is no "natural" answer to the
> order.

This isn't a new problem with character string comparisons. It's been
that way as long as we've had computers that could compare characters!

Different character sets do it differently, even for 8 bit characters.
For example, in EBCDIC, lower case letters compare before
upper case letters, while in ASCII the reverse is the case. The
character encoding that IBM invented for Stretch (7030) had
the lower case letters interleaved with the upper case letters so that
comparisons would come out "right."

If you want a "real world" ordering out of characters, you have to
use the comparison functions that go along with the appropriate locale.
That uses the established ordering for each language.

Bengt Richter

unread,

Mar 24, 2003, 5:52:55 PM3/24/03

to

On Sun, 23 Mar 2003 07:13:17 -0500, "John Roth" <john...@ameritech.net> wrote:

>
>"David Mertz, Ph.D." <me...@gnosis.cx> wrote in message
>news:mailman.1048407038...@python.org...
>>
>> Even then, no matter how you cut it, the relation "1j < 2j" is
>> self-evident and natural. How can I explain to anyone why that is an
>> error rather than a True result?!
>
>That is, however, a border case. Is 1+2j < 2+1j true or false?
>Why? Support your answer.
>

I guess you could define an ordering of all the complex points by
sorting on (magnitude, phase) -- or (magnitude**2, phase) -- tuples.
In which case 1j < 2j would be the tuples

>>> (0**2+1**2, math.atan2(1,0)) , (0**2+2**2, math.atan2(2,0))
((1, 1.5707963267948966), (4, 1.5707963267948966))

compared:
>>> (0**2+1**2, math.atan2(1,0)) < (0**2+2**2, math.atan2(2,0))
1

and 1+2j < 2+1j would be the tuples

>>> (1**2+2**2, math.atan2(2,1)) , (2**2+1**2, math.atan2(1,2))
((5, 1.1071487177940904), (5, 0.46364760900080609))

compared:
>>> (1**2+2**2, math.atan2(2,1)) < (2**2+1**2, math.atan2(1,2))
0

At least an ordering would be defined, and IWG a fair proportion of
results would ring intuitively true ;-) (I.e., equality would
mean the identical complex point, and differing magnitudes would sort
by magnitude, and equal magnitude numbers would compare by positions
on their equal-magnitude-radius circles).

Regards,
Bengt Richter

Tim Peters

unread,

Mar 24, 2003, 10:58:05 PM3/24/03

to

[David Mertz, Ph.D.]
> ...

> But the current list sorting behavior is BY FAR the most surprising
> thing in Python! I would have a big challenge explaining it to a new
> user... heck, I can hardly remember the border cases myself.
>
> The examples I gave in my first post for this thread show how surprising
> the exceptions are. I really think that more than half of highly
> experienced Python programmers would have gotten at least one of them
> wrong in their intuition (not you, not Alex... maybe not Guido :-)...
> but past that, mistakes).

I expect I would have gotten most of them wrong, and I confess I wasn't
interested enough to guess at what would happen. Why would I? I asked last
time what your use case is: for what purpose do you think you need to be
able to sort lists with a mix of element types for which Python can't dream
up a better ordering than comparing the objects' memory addresses? If you
don't have a real use case, I don't care about this.

> I guess part of what goes into this is different intuitions about the
> meaning of sorting. I put sorting a list into a different conceptual
> category than comparing individual items (while knowing, of course, the
> underlying relatedness of the concepts). Sorting, to me, just means a
> stable (but arbitrary) ordering. Comparison with "<" and friends
> implie, to me, a "natural" ordering. I wouldn't mind if "<" stopped
> comparing anything without an obviously correct answer.

What's your use case?

> Even then, no matter how you cut it, the relation "1j < 2j" is

> self-evident and natural. How can I explain to anyone why that is an
> error rather than a True result?!

It's like faith: if they know anything about complex numbers, no
explanation is necessary; if they don't know anything about complex numbers,
not only is no explanation possible, but they don't have a use case for it
either so it's not a real problem for them <0.5 wink>.

Since the members of a complex numbers are floats, then if 1j < 2j is
self-evident, what's the self-evident result for 1e-300+1j < 2j? Is one
self-evident and the other arbitrary? Etc. There are *useful* orderings on
complex numbers, but no "most useful" ordering, and no ordering at all is
conventionally defined on them.

David Mertz

unread,

Mar 24, 2003, 11:59:25 PM3/24/03

to

"Tim Peters" <tim...@email.msn.com> wrote previously:

|what your use case is: for what purpose do you think you need to be
|able to sort lists with a mix of element types for which Python can't
|dream up a better ordering than comparing the objects' memory addresses?

I fairly often use lists to collect heterogeneous objects together.
This might often be in an introspective context: what attributes (and
attribute values) does an object have? But sometimes I want to be
forgiving in the calls made to my functions with various arguments
(maybe arguments that are not wholly proper, but I still want to do
something slightly reasonable). Or I parse values of various datatypes
out of a textual report. Or I get an XML-RPC or CORBA call with
heterogeneous parameters.

Admittedly, sorting those heterogeneous collections is probably not the
-most- common thing to do with them. And sometimes dictionaries are
a better data structure--and Sets, especially, will be useful once I
start really using 2.3. But there is something so simple an elegant
about throwing 'collection.append(...)' statements around liberally.

When I do want to sort a heterogenous collection, often what I want to
do with the ordered collection is extract subsequences that have a
"natural" order. Like loop through for a while, dealing with numbers;
then when I start seeing strings, deal with those in a different way
(but still within ordered subsequences). In such a story, I probably
would more-or-less ignore any objects that aren't of "interesting"
types, whenever I got to them.

Unlike almost any other operation, [...].sort() cannot be saved by any
simple try/except block. It is extremely difficult to accurately
specify in a program just when it will fail, and harder still to decide
exactly what to do to remedy the situation. I could run the list
through filter() to try to get rid of everything I don't like (and that
might not sort). But it's not obvious what would need to be filtered.
Complex numbers are OK sometimes, but not others. Unicode strings are
OK sometimes, but not others. Not depending on the individual object,
but upon what else might happen to be in the list. And even upon what
order the list started in, contrast:

Python 2.3a2 (#0, Feb 21 2003, 19:35:57)
...
>>> [u'x', 'x', chr(255)].sort()
>>> ['x', chr(255), u'x'].sort()
Traceback (most recent call last):
...

This is seriously pathological! Btw. something like chr(255) has a
genuine value without an encoding. For example, I was recently doing
some stuff with cryptographic key material that looks at raw byte
values--the question of charset just isn't relevant there.

There IS NOT "use case" in the sense that I simply cannot do something
like GUI callbacks or matrix manipulation or whatever because of the
current [].sort() behavior. I can always find SOME way around the
inconvenience. But it's one of those things that comes back to bite you
at unexpected times, and for no good reason. That's why it bugs me so
much.

Yours, David...