Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Python 3.0 - is this true?

19 views
Skip to first unread message

walterbyrd

unread,
Nov 8, 2008, 1:20:31 PM11/8/08
to
I have read that in Python 3.0, the following will raise an exception:

>>> [2, 1, 'A'].sort()

Will that raise an exception? And, if so, why are they doing this? How
is this helpful? Is this new "enhancement" Pythonic?

Peter Otten

unread,
Nov 8, 2008, 1:42:01 PM11/8/08
to
walterbyrd wrote:

> I have read that in Python 3.0, the following will raise an exception:
>
>>>> [2, 1, 'A'].sort()
>
> Will that raise an exception?

Yes.

>>> [2, 1, "a"].sort()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()

> And, if so, why are they doing this? How
> is this helpful? Is this new "enhancement" Pythonic?

Is 1 > "A"? Is ord("B") > "A", "11" > 10?
What happens for sorted([datetime.time(), "now"])?

As the Zen puts it:

"In the face of ambiguity, refuse the temptation to guess."

So yes, I think this is an enhancement, and a pythonic one.

Peter

Arnaud Delobelle

unread,
Nov 8, 2008, 2:02:28 PM11/8/08
to
walterbyrd <walte...@iname.com> writes:

> I have read that in Python 3.0, the following will raise an exception:
>
>>>> [2, 1, 'A'].sort()
>
> Will that raise an exception?

Yes. In fact, plenty of objects of different types aren't comparable
anymore.

> And, if so, why are they doing this?

How is it helpful to be able to sort things which have no natural order?

> How is this helpful?

It goes well with duck typing. It lets you know when you things happen
that you don't mean to happen.

> Is this new "enhancement" Pythonic?

By definition!

--
Arnaud

Steven D'Aprano

unread,
Nov 8, 2008, 9:44:31 PM11/8/08
to
On Sat, 08 Nov 2008 19:02:28 +0000, Arnaud Delobelle wrote:

>> And, if so, why are they doing this?
>
> How is it helpful to be able to sort things which have no natural order?

Assuming you need to sort arbitrary types, then you have to choose an
order, even if it is arbitrary, so long as it's consistent.

I agree that Python shouldn't try to guess how to order incomparable
types, nor how to order unorderable types, but I'm pretty sure that by
using the key argument to sort you can specify your own ordering. I don't
have Python 3 installed here, but some variation on this will probably
work:

>>> alist = [2+3j, -4+5j, 8+2j, 1-7j, 6]
>>> sorted(alist, key=str)
[(-4+5j), (1-7j), (2+3j), (8+2j), 6]

Define your own ordering if you need to sort incomparable types.

--
Steven

walterbyrd

unread,
Nov 8, 2008, 10:06:14 PM11/8/08
to
On Nov 8, 7:44 pm, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:

> Define your own ordering if you need to sort incomparable types.

If you starting new, I suppose you can always work around this new
enhancement. But, couldn't this cause a lot of backward compatibility
issues?

Also, I thought that part of the python philosophy was to allow any
sort of object in a list, and to allow the same methods to work with
whatever was in list.

walterbyrd

unread,
Nov 8, 2008, 10:17:11 PM11/8/08
to
On Nov 8, 12:02 pm, Arnaud Delobelle <arno...@googlemail.com> wrote:

> It goes well with duck typing.  It lets you know when you things happen
> that you don't mean to happen.

But doesn't that also make the language less flexible?

For example, if I used C, I would never have to worry about assigning
a float to an integer variable. The language will not allow it. I
thought that python's flexibility, in regard to that sort of thing,
was supposed to be one of python's great strengths.

Would it be better if python lists only accepted one type of data?
Wouldn't that even go further to let you know when things happen, that
you don't mean to happen?

Cameron Simpson

unread,
Nov 8, 2008, 10:32:35 PM11/8/08
to pytho...@python.org
On 08Nov2008 19:17, walterbyrd <walte...@iname.com> wrote:
| On Nov 8, 12:02 pm, Arnaud Delobelle <arno...@googlemail.com> wrote:
| > It goes well with duck typing.  It lets you know when you things happen
| > that you don't mean to happen.
|
| But doesn't that also make the language less flexible?

No. All you need to do is define the comparison criterion so Python
doesn't make an arbitrary guess.

| For example, if I used C, I would never have to worry about assigning
| a float to an integer variable. The language will not allow it.

Hmm. I presume you've never used unions in C then?
Or cast to void* then back to another pointer?
C code can be written to pay a fair amount of attention to types
and protect (or warn, depending) about a lot of typing issues.
But you can also, either deliberately or through sloppiness,
do all sorts of nasty type mangling, including assigning a float
into space used elsewhere as an integer variable.

| I
| thought that python's flexibility, in regard to that sort of thing,
| was supposed to be one of python's great strengths.
| Would it be better if python lists only accepted one type of data?

No. That would be less flexible.

| Wouldn't that even go further to let you know when things happen, that
| you don't mean to happen?

If I meant to put different types in a list, then this would be a
serious problem.
--
Cameron Simpson <c...@zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Terry Reedy

unread,
Nov 8, 2008, 11:00:27 PM11/8/08
to pytho...@python.org

Yes, key= lets you sort anything anyway you want.
>>> l=[1, '2', 3j]
>>> sorted(l, key = str)
[1, '2', 3j]
>>> sorted(l, key = id)
['2', 3j, 1]


Terry Reedy

unread,
Nov 8, 2008, 11:04:45 PM11/8/08
to pytho...@python.org
walterbyrd wrote:

Guido and the developers changed the behavior of order comparisons, and
hence of sorts, because they agreed, on the basis of person-decades of
experience, with no dissent that I know of, that the new behavior would
be better.

Have you written any Python code where you really wanted the old,
unpredictable behavior?

> Would it be better if python lists only accepted one type of data?

Take a look at the array module.

Kay Schluehr

unread,
Nov 8, 2008, 11:36:59 PM11/8/08
to
On 9 Nov., 05:04, Terry Reedy <tjre...@udel.edu> wrote:

> Have you written any Python code where you really wanted the old,
> unpredictable behavior?

Sure:

if len(L1) == len(L2):
return sorted(L1) == sorted(L2) # check whether two lists contain
the same elements
else:
return False

It doesn't really matter here what the result of the sorts actually is
as long as the algorithm leads to the same result for all permutations
on L1 ( and L2 ).

Alex_Gaynor

unread,
Nov 8, 2008, 11:49:39 PM11/8/08
to

that same thing could be done with a multiset type, which would also
have better performance(O(n) vs. O(nlogn)).

Steven D'Aprano

unread,
Nov 9, 2008, 12:08:45 AM11/9/08
to
On Sat, 08 Nov 2008 19:06:14 -0800, walterbyrd wrote:

> On Nov 8, 7:44 pm, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>
>> Define your own ordering if you need to sort incomparable types.
>
> If you starting new, I suppose you can always work around this new
> enhancement. But, couldn't this cause a lot of backward compatibility
> issues?

Which is why the change has only been introduced into Python 3, and not
Python 2.x.


> Also, I thought that part of the python philosophy was to allow any sort
> of object in a list, and to allow the same methods to work with whatever
> was in list.

You wouldn't expect this to work would you?

sum( [1, 2, None, "a string", {'x': 5}, []] )

You can only sum objects which are compatible. You can only sort objects
which define an order: you can't sort lists containing complex numbers
because "greater than" and "less than" aren't defined for complex
numbers, and in Python 3 you will no longer be able to order mixed
objects unless they are intrinsically comparable (e.g. floats and ints),
unless you define your own order.

--
Steven

Kay Schluehr

unread,
Nov 9, 2008, 12:14:18 AM11/9/08
to

I guess building a multiset is a little more expensive than just O(n).
It is rather like building a dict from a list which is O(k*n) with a
constant but small factor k. The comparison is of the same order. To
enable the same behavior as the applied sorted() a multiset must
permit unhashable elements. dicts don't do the job.

Steven D'Aprano

unread,
Nov 9, 2008, 1:06:40 AM11/9/08
to

How often do you care about equality ignoring order for lists containing
arbitrary, heterogeneous types?

In any case, the above doesn't work now, since either L1 or L2 might
contain complex numbers. The sorted() trick only works because you're
making an assumption about the kinds of things in the lists. If you want
to be completely general, the above solution isn't guaranteed to work.

If you're prepared to assume hashability, then the obvious Multiset
solution is probably better even than the above. If you want complete
generality, you can't even assume that items in the list can be ordered
at all, so you need something like this:


def collate_and_sort(L):
D = {}
for item in L:
t = type(item)
x = D.setdefault(t, [])
x.append(item)
for x in D.values():
try:
x.sort()
except TypeError:
try:
x.sort(key=str)
except TypeError:
x.sort(key=id)
return D


def unordered_equals(L1, L2):
if len(L1) != len(L2):
return False
try:
return sorted(L1) == sorted(L2)
except TypeError:
return collate_and_sort(L1) == collate_and_sort(L2)


But note that even this solution isn't perfect, since it will fail to do
the right thing for L1=[1, {}, 2] and L2=[1, {}, 2.0]. Here is a way to
solve the problem assuming only that the items support equality:

def unordered_equals2(L1, L2):
if len(L1) != len(L2):
return False
for item in L1:
if L1.count(item) != L2.count(item):
return False
return True


--
Steven

Kay Schluehr

unread,
Nov 9, 2008, 1:53:14 AM11/9/08
to
On 9 Nov., 07:06, Steven D'Aprano <st...@REMOVE-THIS-

cybersource.com.au> wrote:
> On Sat, 08 Nov 2008 20:36:59 -0800, Kay Schluehr wrote:
> > On 9 Nov., 05:04, Terry Reedy <tjre...@udel.edu> wrote:
>
> >> Have you written any Python code where you really wanted the old,
> >> unpredictable behavior?
>
> > Sure:
>
> > if len(L1) == len(L2):
> > return sorted(L1) == sorted(L2) # check whether two lists contain
> > the same elements
> > else:
> > return False
>
> > It doesn't really matter here what the result of the sorts actually is
> > as long as the algorithm leads to the same result for all permutations
> > on L1 ( and L2 ).
>
> How often do you care about equality ignoring order for lists containing
> arbitrary, heterogeneous types?

A few times. Why do you care, Steven?

> In any case, the above doesn't work now, since either L1 or L2 might
> contain complex numbers.
> The sorted() trick only works because you're
> making an assumption about the kinds of things in the lists. If you want
> to be completely general, the above solution isn't guaranteed to work.

You are right. I never used complex numbers in Python so problems were
not visible. Otherwise the following comp function in Python 2.X does
the job:

def comp(x1, x2):
try:
if x1<x2:
return -1
else:
return 1
except TypeError:
if str(x1)<str(x2):
return -1
else:
return 1

Not sure how to transform it into a search key that is efficient and
reliable.

[...]

> Here is a way to
> solve the problem assuming only that the items support equality:
>
> def unordered_equals2(L1, L2):
> if len(L1) != len(L2):
> return False
> for item in L1:
> if L1.count(item) != L2.count(item):
> return False
> return True

Which is O(n**2) as you might have noted.


Steven D'Aprano

unread,
Nov 9, 2008, 2:32:59 AM11/9/08
to
On Sat, 08 Nov 2008 22:53:14 -0800, Kay Schluehr wrote:

>> How often do you care about equality ignoring order for lists
>> containing arbitrary, heterogeneous types?
>
> A few times. Why do you care, Steven?

I'm a very caring kind of guy.


>> In any case, the above doesn't work now, since either L1 or L2 might
>> contain complex numbers.
>> The sorted() trick only works because you're making an assumption about
>> the kinds of things in the lists. If you want to be completely general,
>> the above solution isn't guaranteed to work.
>
> You are right. I never used complex numbers in Python so problems were
> not visible. Otherwise the following comp function in Python 2.X does
> the job:

[...]


> Not sure how to transform it into a search key that is efficient and
> reliable.

Yes, that's a general problem. It's not always easy to convert a sort
comparison function into a key-based sort. I know that 99% of the time
key is the right way to do custom sorts, but what about the other 1%?


> [...]
>
>> Here is a way to
>> solve the problem assuming only that the items support equality:
>>
>> def unordered_equals2(L1, L2):
>> if len(L1) != len(L2):
>> return False
>> for item in L1:
>> if L1.count(item) != L2.count(item):
>> return False
>> return True
>
> Which is O(n**2) as you might have noted.

Yes, I did notice. If you can't assume even ordering, then you need to do
a lot more work.


--
Steven

Arnaud Delobelle

unread,
Nov 9, 2008, 3:19:26 AM11/9/08
to
Kay Schluehr <kay.sc...@gmx.net> writes:

> On 9 Nov., 07:06, Steven D'Aprano <st...@REMOVE-THIS-

[...]


>> In any case, the above doesn't work now, since either L1 or L2 might
>> contain complex numbers.
>> The sorted() trick only works because you're
>> making an assumption about the kinds of things in the lists. If you want
>> to be completely general, the above solution isn't guaranteed to work.
>
> You are right. I never used complex numbers in Python so problems were
> not visible. Otherwise the following comp function in Python 2.X does
> the job:
>
> def comp(x1, x2):
> try:
> if x1<x2:
> return -1
> else:
> return 1
> except TypeError:
> if str(x1)<str(x2):
> return -1
> else:
> return 1
>

Sadly it fails on transitivity:

>>> comp(2, 3j)
-1
>>> comp(3j, True)
-1
>>> comp(True, 2)
-1

--
Arnaud

Rhamphoryncus

unread,
Nov 9, 2008, 3:26:38 AM11/9/08
to
On Nov 8, 10:14 pm, Kay Schluehr <kay.schlu...@gmx.net> wrote:
> I guess building a multiset is a little more expensive than just O(n).
> It is rather like building a dict from a list which is O(k*n) with a
> constant but small factor k. The comparison is of the same order. To
> enable the same behavior as the applied sorted() a multiset must
> permit unhashable elements. dicts don't do the job.

Although it has a runtime of k*n, it is still O(n). big-O notation
ignores constant factors, dealing only with scalability.

Stefan Behnel

unread,
Nov 9, 2008, 5:44:00 AM11/9/08
to
Kay Schluehr wrote:
> On 9 Nov., 07:06, Steven D'Aprano <st...@REMOVE-THIS-
> cybersource.com.au> wrote:
>> How often do you care about equality ignoring order for lists containing
>> arbitrary, heterogeneous types?
>
> A few times. Why do you care, Steven?

"I miss this feature" is an unqualified statement, just like "I'll miss George
W. Bush". That does not necessarily imply that I want him back in the position
that the had for the last eight years. It might just mean that it was fun to
see him on telly (although I bet his entertainment show was more expensive
than the average Marx-Brothers sketch).

Stefan

Kay Schluehr

unread,
Nov 9, 2008, 6:51:42 AM11/9/08
to

You are right. I remembered this short after my posting was sent.

Duncan Booth

unread,
Nov 9, 2008, 7:09:16 AM11/9/08
to
Steven D'Aprano wrote:

>> Not sure how to transform it into a search key that is efficient and
>> reliable.
>
> Yes, that's a general problem. It's not always easy to convert a sort
> comparison function into a key-based sort. I know that 99% of the time
> key is the right way to do custom sorts, but what about the other 1%?
>

You can create a key object which holds onto the original object and
calls the comparison function each time two keys are compared.
See
http://groups.google.com/group/comp.lang.python/browse_thread/thread/fc2bcc7f93a4dd1a/141be6780acd99d9

Roy Smith

unread,
Nov 9, 2008, 7:57:28 AM11/9/08
to
In article <mailman.3701.1226203...@python.org>,
Terry Reedy <tjr...@udel.edu> wrote:

> Yes, key= lets you sort anything anyway you want.
> >>> l=[1, '2', 3j]
> >>> sorted(l, key = str)
> [1, '2', 3j]

The problem here is that str() is degenerate, i.e. a != b does not imply
str(a) != str(b).

> >>> sorted(l, key = id)
> ['2', 3j, 1]

And of course, this has to opposite problem, a == b does not imply id(a) ==
id(b). Whether either of these "problems" is really a problem obviously
depends on what you're trying to do.

In 3.0, can you still order types? In 2.x, you can do:

>>> t1 = type(1)
>>> t2 = type(1j)
>>> t1 < t2
False

If this still works in 3.0, then you can easily do something like:

def total_order(o1, o2):
"Compare any two objects of arbitrary types"
try:
return o1 <= o2
except UncomparableTypesError: # whatever the right name is
return type(o1) <= type(o2)

and get the same effect as you had in 2.x.

Diez B. Roggisch

unread,
Nov 9, 2008, 9:01:09 AM11/9/08
to
>
> Also, I thought that part of the python philosophy was to allow any
> sort of object in a list, and to allow the same methods to work with
> whatever was in list.

Not really. When the usual argument about the existence (and
justification) of lists & tuples comes along, one common distinction is
that

- tuples contain arbitrary object of varying types, so they are kind
of "records"
- lists should contain uniform objects.

None of this is enforced, but it's good customs to do so & to put it
frankly: if I came along a piece of code that created a list like the
one you presented, I'd say it's a code-smell.

Diez

Roy Smith

unread,
Nov 9, 2008, 9:25:39 AM11/9/08
to
In article <6no8p6F...@mid.uni-berlin.de>,

"Diez B. Roggisch" <de...@nospam.web.de> wrote:

> >
> > Also, I thought that part of the python philosophy was to allow any
> > sort of object in a list, and to allow the same methods to work with
> > whatever was in list.
>
> Not really. When the usual argument about the existence (and
> justification) of lists & tuples comes along, one common distinction is
> that
>
> - tuples contain arbitrary object of varying types, so they are kind
> of "records"
> - lists should contain uniform objects.

I see absolutely nothing wrong with lists of heterogenous types. Or, for
that matter, iterators which generate heterogeneous types. Here's some
perfectly reasonable examples (equally applicable to lists or iterators):

* The tokens parsed out of a file (ints, floats, identifiers, keywords,
various kinds of punctuation, etc)

* The entries in a unix directory (plain files, directories, symlinks,
special files, named sockets, etc)

* The vehicles going through a toll booth (cars, trucks, motorcycles)

I don't see any reason you shouldn't be able to build lists of those things.

Duncan Booth

unread,
Nov 9, 2008, 9:40:22 AM11/9/08
to
Roy Smith wrote:

> In 3.0, can you still order types? In 2.x, you can do:
>
>>>> t1 = type(1)
>>>> t2 = type(1j)
>>>> t1 < t2
> False
>
> If this still works in 3.0, then you can easily do something like:
>
> def total_order(o1, o2):
> "Compare any two objects of arbitrary types"
> try:
> return o1 <= o2
> except UncomparableTypesError: # whatever the right name is
> return type(o1) <= type(o2)
>
> and get the same effect as you had in 2.x.

No, that won't work. You can compare types for equality/inequality, but
they are not orderable:

>>> type(1)==type('a')
False
>>> sorted([1, 'a'], key=lambda x:(type(x),x))


Traceback (most recent call last):
File "<stdin>", line 1, in <module>

TypeError: unorderable types: type() < type()

Diez B. Roggisch

unread,
Nov 9, 2008, 10:17:46 AM11/9/08
to
Roy Smith schrieb:

When I wrote "uniform" I meant objects of the same kind. So for example
subclasses are of course ok. And all of your examples are these: I want
a Token-object, keeping file-location and possibly original string
representation. The same goes for files - they are simply strings, their
properties determined using stat-calls. And vehicles are... vehicles. So
I'd create a common base-class for them, or made them at least behave
proper through duck-typing. Which - for the case at hand - might include
creating a __cmp__-method, based on horsepower or price-tag or...

Diez

Stefan Behnel

unread,
Nov 9, 2008, 10:41:04 AM11/9/08
to

However, for an arbitrary ordering of types, id(type(o1)) <= id(type(o2)) will
work well enough.

Stefan

Roy Smith

unread,
Nov 9, 2008, 10:45:31 AM11/9/08
to
In article <6nod8uF...@mid.uni-berlin.de>,

Maybe, but only if the logic of my program says that's the right way to do
it. If I decide that the appropriate way to return an integer is by
returning something of type(int), that's my business. Why should I have to
define a Token class if using the native Python types works just as well
for what I'm doing? I'll write class Token when there's some added value
I get from it which I can't get with raw types. I don't want to be forced
into it just because a container doesn't like what I'm doing.

I spend too much time in C++ pleading with the compiler to allow me to do
what I want. I come to Python to get away from all that type bondage.

As another example, consider a list of items being juggled:

[RubberChicken(), ChainSaw(), Canteloupe()]

I could go through contortions to find some common subclass for these
items, but the whole *point* is that they're not of the same type. And
making a list of them is a perfectly reasonable thing to do.

Roy Smith

unread,
Nov 9, 2008, 10:49:04 AM11/9/08
to
In article <6nob2mF...@mid.individual.net>,
Duncan Booth <duncan...@invalid.invalid> wrote:

Sigh. So if I really wanted to do that, I'd be forced to write
"str(type(o1)) < str(type(o2))"? To me, that sounds suspiciously like
"sudo type(o1) < type(o2)" :-)

Marc 'BlackJack' Rintsch

unread,
Nov 9, 2008, 11:09:51 AM11/9/08
to
On Sun, 09 Nov 2008 10:45:31 -0500, Roy Smith wrote:

> In article <6nod8uF...@mid.uni-berlin.de>,
> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>> Roy Smith schrieb:
>>> In article <6no8p6F...@mid.uni-berlin.de>,
>>> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>>>
>> When I wrote "uniform" I meant objects of the same kind. So for example
>> subclasses are of course ok. And all of your examples are these: I want
>> a Token-object, keeping file-location and possibly original string
>> representation.
>
> Maybe, but only if the logic of my program says that's the right way to
> do it. If I decide that the appropriate way to return an integer is by
> returning something of type(int), that's my business. Why should I have
> to define a Token class if using the native Python types works just as
> well for what I'm doing? I'll write class Token when there's some
> added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a
`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.

> I don't want to be forced into it just because a container doesn't like
> what I'm doing.

The container doesn't say what *you* can do with the content, it is just
a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.

> As another example, consider a list of items being juggled:
>
> [RubberChicken(), ChainSaw(), Canteloupe()]
>
> I could go through contortions to find some common subclass for these
> items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a
method on them they have to implement that. Not necessarily through
inheritance.

> And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch

Marc 'BlackJack' Rintsch

unread,
Nov 9, 2008, 11:10:36 AM11/9/08
to
On Sun, 09 Nov 2008 10:45:31 -0500, Roy Smith wrote:

> In article <6nod8uF...@mid.uni-berlin.de>,
> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>> Roy Smith schrieb:
>>> In article <6no8p6F...@mid.uni-berlin.de>,
>>> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>>>
>> When I wrote "uniform" I meant objects of the same kind. So for example
>> subclasses are of course ok. And all of your examples are these: I want
>> a Token-object, keeping file-location and possibly original string
>> representation.
>
> Maybe, but only if the logic of my program says that's the right way to
> do it. If I decide that the appropriate way to return an integer is by
> returning something of type(int), that's my business. Why should I have
> to define a Token class if using the native Python types works just as
> well for what I'm doing? I'll write class Token when there's some
> added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a

`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.

> I don't want to be forced into it just because a container doesn't like
> what I'm doing.

The container doesn't say what *you* can do with the content, it is just

a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.

> As another example, consider a list of items being juggled:


>
> [RubberChicken(), ChainSaw(), Canteloupe()]
>
> I could go through contortions to find some common subclass for these
> items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a

method on them they have to implement that. Not necessarily through
inheritance.

> And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch

Marc 'BlackJack' Rintsch

unread,
Nov 9, 2008, 11:11:21 AM11/9/08
to
On Sun, 09 Nov 2008 10:45:31 -0500, Roy Smith wrote:

> In article <6nod8uF...@mid.uni-berlin.de>,
> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>> Roy Smith schrieb:
>>> In article <6no8p6F...@mid.uni-berlin.de>,
>>> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>>>
>> When I wrote "uniform" I meant objects of the same kind. So for example
>> subclasses are of course ok. And all of your examples are these: I want
>> a Token-object, keeping file-location and possibly original string
>> representation.
>
> Maybe, but only if the logic of my program says that's the right way to
> do it. If I decide that the appropriate way to return an integer is by
> returning something of type(int), that's my business. Why should I have
> to define a Token class if using the native Python types works just as
> well for what I'm doing? I'll write class Token when there's some
> added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a

`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.

> I don't want to be forced into it just because a container doesn't like
> what I'm doing.

The container doesn't say what *you* can do with the content, it is just

a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.

> As another example, consider a list of items being juggled:


>
> [RubberChicken(), ChainSaw(), Canteloupe()]
>
> I could go through contortions to find some common subclass for these
> items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a

method on them they have to implement that. Not necessarily through
inheritance.

> And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch

Marc 'BlackJack' Rintsch

unread,
Nov 9, 2008, 11:23:53 AM11/9/08
to
On Sun, 09 Nov 2008 10:45:31 -0500, Roy Smith wrote:

> In article <6nod8uF...@mid.uni-berlin.de>,
> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>> Roy Smith schrieb:
>>> In article <6no8p6F...@mid.uni-berlin.de>,
>>> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>>>
>> When I wrote "uniform" I meant objects of the same kind. So for example
>> subclasses are of course ok. And all of your examples are these: I want
>> a Token-object, keeping file-location and possibly original string
>> representation.
>
> Maybe, but only if the logic of my program says that's the right way to
> do it. If I decide that the appropriate way to return an integer is by
> returning something of type(int), that's my business. Why should I have
> to define a Token class if using the native Python types works just as
> well for what I'm doing? I'll write class Token when there's some
> added value I get from it which I can't get with raw types.

One added value in Python 3.0 would be that they are sortable in a

`list`. But seriously, your token example should work fine with
Python 3.0 because the wish to sort tokens is a bit exotic IMHO.

> I don't want to be forced into it just because a container doesn't like
> what I'm doing.

The container doesn't say what *you* can do with the content, it is just

a bit picky about what the container itself can do. If a function or
method doesn't like something about the arguments it's perfectly okay to
raise a `TypeError` or a `ValueError`.

> As another example, consider a list of items being juggled:


>
> [RubberChicken(), ChainSaw(), Canteloupe()]
>
> I could go through contortions to find some common subclass for these
> items, but the whole *point* is that they're not of the same type.

They are of the same duck type "things to be juggled". And if you use a

method on them they have to implement that. Not necessarily through
inheritance.

> And making a list of them is a perfectly reasonable thing to do.

And possible with current Python and Python 3.0.

Ciao,
Marc 'BlackJack' Rintsch

Terry Reedy

unread,
Nov 9, 2008, 11:49:54 AM11/9/08
to pytho...@python.org
Kay Schluehr wrote:
> On 9 Nov., 05:04, Terry Reedy <tjre...@udel.edu> wrote:
>
>> Have you written any Python code where you really wanted the old,
>> unpredictable behavior?
>
> Sure:

I was asking the OP ;-)

>
> if len(L1) == len(L2):
> return sorted(L1) == sorted(L2) # check whether two lists contain
> the same elements
> else:
> return False
>
> It doesn't really matter here what the result of the sorts actually is
> as long as the algorithm leads to the same result for all permutations
> on L1 ( and L2 ).

Leaving aside the O(n) alternative for hashable items, which only
matters for 'long' lists....
If you literally mean 'the same elements', then key=id would work.

For some mixtures, such as strings and numbers, key=id will work better
than in 2.x since complex numbers can be included.

If you want to duplicate 2.x behavior, which does *not* work for all
types...

def py2key(item): return (str(type(item)), item)

Guido is aware that universal compare was occasionally useful, but
decided that it also caused problems. So he broke universal compare
when he introduced complex numbers without arbitrary comparisons.
Decimal numbers are even worse. So anyone who objects to the change in
3.0 *should* have objected when complex numbers were introduced.

Hallvard B Furuseth

unread,
Nov 9, 2008, 12:01:41 PM11/9/08
to
Steven D'Aprano writes:
> How often do you care about equality ignoring order for lists containing
> arbitrary, heterogeneous types?

Arbitrary, I never have. Different types of my choice, a few times.
I was only interested in there being some _sort_ order (and the same in
different runs of the program), not in what the sort order was.

However string sorting is partly arbitrary too. It depends on the
code points in the string's character set. u'\u00c5' < u'\u00d8'
(u'Å' < u'Ø') is wrong in Norwegian. '0' < 'B' < 'a' < '~' can be
wrong depending on the application, as can '10' < '5'.

So I don't quite buy the argument about arbitrary sorting. If you have
different types, at least you likely know that you are doing something
weird. OTOH it's quite common - even normal - to produce poor sort
results because one depends on the built-in arbitrary string sort order.


In any case, would it be possible to add a cmp= function which did more
or less what the current default cmp does now? Then it's available when
someone wants it, and the "arbitrariness" of different types is not
inflicted on people who don't. (In my case I _think_ it's just a "nice
to have" to me and nothing more, but I don't remember for sure.)

--
Hallvard

Hallvard B Furuseth

unread,
Nov 9, 2008, 12:10:15 PM11/9/08
to
Terry Reedy writes:
> If you want to duplicate 2.x behavior, which does *not* work for all
> types...
>
> def py2key(item): return (str(type(item)), item)

Nope.
sorted((-1, 2, True, False)) == [-1, False, True, 2]
sorted((-1, 2, True, False), key=py2key) == [False, True, -1, 2]
Might often be good enough though. But uses more memory.

--
Hallvard

Arnaud Delobelle

unread,
Nov 9, 2008, 12:31:40 PM11/9/08
to
Roy Smith <r...@panix.com> writes:

> I spend too much time in C++ pleading with the compiler to allow me to do
> what I want. I come to Python to get away from all that type bondage.
>
> As another example, consider a list of items being juggled:
>
> [RubberChicken(), ChainSaw(), Canteloupe()]
>
> I could go through contortions to find some common subclass for these
> items, but the whole *point* is that they're not of the same type. And
> making a list of them is a perfectly reasonable thing to do.

You do realise that all these objects (like any object in Python) are
all instances of the object type, don't you?

--
Arnaud

Aahz

unread,
Nov 9, 2008, 12:55:06 PM11/9/08
to
In article <roy-6896DF.0...@news.panix.com>,

Roy Smith <r...@panix.com> wrote:
>In article <6no8p6F...@mid.uni-berlin.de>,
> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>>attribution missing:

Overall agreed, but I think your reasoning breaks down when you talk
about sorting such a list: generally speaking, when you create a list
like that, it's because you specifically want to preserve ordering.

The case where it's mostly like you'd want to sort (entries in a Unix
directory), you can arguably reduce the discussion to string sorting
quite easily.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

"Martin v. Löwis"

unread,
Nov 9, 2008, 1:09:35 PM11/9/08
to Roy Smith
>> Yes, key= lets you sort anything anyway you want.
>> >>> l=[1, '2', 3j]
>> >>> sorted(l, key = str)
>> [1, '2', 3j]
>
> The problem here is that str() is degenerate, i.e. a != b does not imply
> str(a) != str(b).

So why is this a problem? The sort algorithm is stable, so it gives a
deterministic result. Not a particularly useful one - but I don't see
that any order on these three values could be considered "useful".

>> >>> sorted(l, key = id)
>> ['2', 3j, 1]
>
> And of course, this has to opposite problem, a == b does not imply id(a) ==
> id(b). Whether either of these "problems" is really a problem obviously
> depends on what you're trying to do.

And that's Terry's message entirely. Is is YOU who decides how you want
these things sorted (if you want to sort them). Python refuses the
temptation to guess.

> In 3.0, can you still order types? In 2.x, you can do:
>
>>>> t1 = type(1)
>>>> t2 = type(1j)
>>>> t1 < t2
> False

By what criteria? When is a type smaller than another one?

> def total_order(o1, o2):
> "Compare any two objects of arbitrary types"
> try:
> return o1 <= o2
> except UncomparableTypesError: # whatever the right name is
> return type(o1) <= type(o2)
>
> and get the same effect as you had in 2.x.

You can still do that - but you also need to define the order that
types have in 2.x (namely, by name).

Completely emulating the sorting of 2.x is nearly impossible,
considering how complex that is.

Regards,
Martin

"Martin v. Löwis"

unread,
Nov 9, 2008, 1:18:02 PM11/9/08
to Kay Schluehr
> Sure:

>
> if len(L1) == len(L2):
> return sorted(L1) == sorted(L2) # check whether two lists contain
> the same elements
> else:
> return False
>
> It doesn't really matter here what the result of the sorts actually is
> as long as the algorithm leads to the same result for all permutations
> on L1 ( and L2 ).

Unfortunately, for many releases, the list's sort algorithm would not
provide that property. Release for release, new cases where found where
the builtin ordering was not transitive (i.e. you get a < b and b < c,
but not a < c). With funny definitions of __cmp__ in some classes, you
can still get this today.

Regards,
Martin

"Martin v. Löwis"

unread,
Nov 9, 2008, 1:29:27 PM11/9/08
to Kay Schluehr
> def comp(x1, x2):
> try:
> if x1<x2:
> return -1
> else:
> return 1
> except TypeError:
> if str(x1)<str(x2):
> return -1
> else:
> return 1
>

Please correct me if I'm wrong, but I think this is not transitive. If
strings and ints are uncomparable, this will give you 20 < "5" and
"5" < 8, but not 20 < 8. As a result, quicksort will do nonsense
with that comparison function (i.e. it won't guarantee that things
are sorted in increasing order)

> Not sure how to transform it into a search key that is efficient and
> reliable.

Depending on what your notion of "sameness of lists" is, the following
might do:

def key(o):
return id(type(o)),o

This only requires that all objects of the same type in the list can
be sorted. If this is not guaranteed, then

def key(o)
try:
o<o
return id(type(o)),o
except TypeError:
return id(type(o)),id(o)

might do.

Regards,
Martin

"Martin v. Löwis"

unread,
Nov 9, 2008, 1:42:46 PM11/9/08
to
> In any case, would it be possible to add a cmp= function which did more
> or less what the current default cmp does now?

As somebody pointed out, it's possible to create a key= function that
provides arbitrary ordering: just return an object custom type whose
__lt__ never fails:

class AlwaysOrdered:
def __init__(self, o):
self.o = o
def __lt__(self, other):
o1 = self.o
o2 = other.o
try:
return o1 < o2
except TypeError:
return cmp((type(o1).__name__, id(o1)),
(type(o2).__name__, id(o2)))<0

L.sort(key=AlwaysOrdered)

(This is not quite the current implementation, and not even
transitive, but you get the idea)

Regards,
Martni

"Martin v. Löwis"

unread,
Nov 9, 2008, 1:46:51 PM11/9/08
to

Compared to what? To the current 2.x default implementation? Or compared
to an emulation of that, in 2.x, in a pure-Python function, passed
as a cmp= argument?

I think this one is *more* memory-efficient than a cmp function. Calling
a cmp function will always allocate a frame object, for each comparison,
giving you O(n log n) frame objects being created. OTOH, the key
function will only be called O(n) times, causing the allocation of only
O(n) objects. Of course, the peak memory usage is higher with the key
function (O(n), compared to O(1) for the cmp= function, assuming the
frame objects get deallocated and reused immediately).

Regards,
Martin

Arnaud Delobelle

unread,
Nov 9, 2008, 1:45:57 PM11/9/08
to
"Martin v. Löwis" <mar...@v.loewis.de> writes:

>> def comp(x1, x2):
>> try:
>> if x1<x2:
>> return -1
>> else:
>> return 1
>> except TypeError:
>> if str(x1)<str(x2):
>> return -1
>> else:
>> return 1
>>
>
> Please correct me if I'm wrong, but I think this is not transitive. If
> strings and ints are uncomparable, this will give you 20 < "5" and
> "5" < 8, but not 20 < 8. As a result, quicksort will do nonsense
> with that comparison function (i.e. it won't guarantee that things
> are sorted in increasing order)

Even in 2.x it doesn't work (I think I posted this earlier but I'm not
sure anymore) as this example shows:

2 < 3j and 3j < True, but True < 2

--
Arnaud

"Martin v. Löwis"

unread,
Nov 9, 2008, 2:24:27 PM11/9/08
to
> Even in 2.x it doesn't work (I think I posted this earlier but I'm not
> sure anymore) as this example shows:
>
> 2 < 3j and 3j < True, but True < 2

What specific value of x have you been trying? For x=4,5,6, I get

py> 2 < 3j and 3j < True


Traceback (most recent call last):

File "<stdin>", line 1, in ?
TypeError: no ordering relation is defined for complex numbers

Regards,
Martin

Arnaud Delobelle

unread,
Nov 9, 2008, 2:39:25 PM11/9/08
to
"Martin v. Löwis" <mar...@v.loewis.de> writes:

My example was using the comp function defined by Kay:

>> def comp(x1, x2):
>> try:
>> if x1<x2:
>> return -1
>> else:
>> return 1
>> except TypeError:
>> if str(x1)<str(x2):
>> return -1
>> else:
>> return 1

It just shows that it doesn't define an order for builtin types.

--
Arnaud

Kay Schluehr

unread,
Nov 9, 2008, 3:53:53 PM11/9/08
to
On 9 Nov., 17:49, Terry Reedy <tjre...@udel.edu> wrote:

> I was asking the OP ;-)

Thank you for the discussion.

Roy Smith

unread,
Nov 9, 2008, 5:21:39 PM11/9/08
to
In article <6nogbsF...@mid.uni-berlin.de>,

Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:

> One added value in Python 3.0 would be that they are sortable in a
> `list`. But seriously, your token example should work fine with
> Python 3.0 because the wish to sort tokens is a bit exotic IMHO.

Hmmm, I seem to have engaged in a bit of topic drift, for which I
apologize. I was commenting specifically on the issue of lists holding
heterogeneous types, not on heterogeneous types being sortable.

"Martin v. Löwis"

unread,
Nov 9, 2008, 5:49:08 PM11/9/08
to
> Hmmm, I seem to have engaged in a bit of topic drift, for which I
> apologize. I was commenting specifically on the issue of lists holding
> heterogeneous types, not on heterogeneous types being sortable.

Still, I don't think this is a valid counter-example: I claim that the
data in the list of tokens is homogeneous (them all being tokens),
despite the individual objects having different types, and potentially
not even a common base class but object.

The "homogeneous vs. heterogeneous" discussion is not that much about
type, but more about semantics of the individual values. If you
represent (firstname, lastname, age), you use tuples - three different
(and orthogonal) pieces of information. There is no canonical way of
continuing that sequence (hence the fixed-size nature of the tuple is
no concern). OTOH, the token list has the structure [token, token, ...].
The third element is semantically not different from the first element,
hence a list is appropriate.

Regards,
Martin

Roy Smith

unread,
Nov 9, 2008, 7:26:16 PM11/9/08
to
In article <491768e5$0$2554$9b62...@news.freenet.de>,

"Martin v. Löwis" <mar...@v.loewis.de> wrote:

> The "homogeneous vs. heterogeneous" discussion is not that much about
> type, but more about semantics of the individual values. If you
> represent (firstname, lastname, age), you use tuples - three different
> (and orthogonal) pieces of information. There is no canonical way of
> continuing that sequence (hence the fixed-size nature of the tuple is
> no concern). OTOH, the token list has the structure [token, token, ...].
> The third element is semantically not different from the first element,
> hence a list is appropriate.

That is a much better way to describe it than talking about homogeneity,
except that it's not so much the semantics of the individual values, but
the semantics of the container. The unfortunate thing about tuples vs.
lists is that there are two completely orthogonal concepts in play.

On the one hand, you have an inherently fixed-sized collection of objects,
which represent specific attributes. A good analogy would be a C/C++
struct. On the other hand, you have an sequence. Something which might
logically have more or fewer elements at any given time. Something where
it makes sense to talk about "the next item", or "the previous item", or
"the last item".

In the other dimension, you have mutable vs. immutable. Or, more to the
point, hashable vs. non-hashable. Which, for most people means, "can be
used as a dictionary key" vs. "cannot be used as a dictionary key"

The problem is, we only have two types of containers to cover the four
possible combinations of these two orthogonal concepts. If you want to
make something a dictionary key, you have no choice; it's got to be a
tuple. In many cases, that trumps all other considerations.

Consider this problem:

At run time, you will construct an instance of class FrameCounter, fc. The
constructor takes a single integer, n. You call fc.next_item(i), at least
n times. Then, calling, fc.get_frames() will return a dict whose keys are
sub-sequences of length n, and whose values are a count of how many times
that sub-sequence appeared in the input.

Exmaple:

fc = FrameCounter(3)
for i in [1, 2, 3, 1, 2, 3]:
fc.next_item(i)
d = fc.get_frames()

will result in:

d[(1, 2, 3)] -> 2
d[(2, 3, 1)] -> 1
d[(3, 1, 2)] -> 1
d[(4, 5, 6)] -> raises KeyError

This is fairly simple thing to code. You accumulate items in a list by
appending each time next_item() is called, and either using pop(0) or
slicing to keep the last n items. Then, you construct a tuple from the
list and use that as the dictionary key to count how many times you've seen
that sub-sequence.

The point is that you're forced to use lists to compute the sub-sequences.
This makes sense, because lists fit the "indefinite length sequence" idea.
Then, you're forced to use tuples as the dictionary keys, because tuples
are immutable/hashable, while lists are not. This use of tuples doesn't
fit the "inherently fixed-sized collection of heterogeneous objects" idea
of a tuple. In this case, a tuple really is just an immutable list. Your
choice of containers is not based on any theoretical arguments of what each
type was intended to represent, but the cold hard reality of what
operations they support.

"Martin v. Löwis"

unread,
Nov 10, 2008, 4:10:19 AM11/10/08
to
Roy Smith wrote:
> Your
> choice of containers is not based on any theoretical arguments of what each
> type was intended to represent, but the cold hard reality of what
> operations they support.

Right. What seems missing is a "frozen list" type - the list needs to be
frozen in order to be used as a dictionary key (or any other lookup
structure). Fortunately, as you say, tuples already fill that role, so
one could write

frozenlist = tuple
...
sequence = frozenlist(items)
d[sequence] = d.get(sequence,0)+1

to make it explicit that here, the tuple has a different role.

Regards,
Martin

Steven D'Aprano

unread,
Nov 10, 2008, 4:52:04 AM11/10/08
to
On Sun, 09 Nov 2008 10:45:31 -0500, Roy Smith wrote:

> As another example, consider a list of items being juggled:
>
> [RubberChicken(), ChainSaw(), Canteloupe()]
>
> I could go through contortions to find some common subclass for these
> items, but the whole *point* is that they're not of the same type. And
> making a list of them is a perfectly reasonable thing to do.

Absolutely, and Python 3 does not prevent you from making a list of them.

However, sorting them is *not* a reasonable thing to do, just like
summing them is not a reasonable thing to do, and Python 3 does prevent
you from sorting (or summing) them without extra work.

If you've been relying on Python 2.x's "take a guess and hope it works"
heuristic for sorting uncomparable types, this will come across as a
nuisance. Sorry. I'm afraid you only have two choices in Python 3:

(1) find some way to avoid sorting such data; or

(2) bite the bullet and spend the time required to define your own
ordering relationships between types. After all, you can sort by
*anything* once you give Python an appropriate key function.


--
Steven

Duncan Grisby

unread,
Nov 10, 2008, 7:51:51 AM11/10/08
to
In article <mailman.3702.1226203...@python.org>,
Terry Reedy <tjr...@udel.edu> wrote:

>Have you written any Python code where you really wanted the old,
>unpredictable behavior?

I have an object database written in Python. It, like Python, is
dynamically typed. It heavily relies on being able to sort lists where
some of the members are None. To some extent, it also sorts lists of
other mixed types. It will be very hard to migrate this aspect of it
to Python 3.

Cheers,

Duncan.

--
-- Duncan Grisby --
-- dun...@grisby.org --
-- http://www.grisby.org --

Aahz

unread,
Nov 10, 2008, 9:40:05 AM11/10/08
to
In article <roy-6D96B6.1...@news.panix.com>,

Roy Smith <r...@panix.com> wrote:
>
>The point is that you're forced to use lists to compute the sub-sequences.
>This makes sense, because lists fit the "indefinite length sequence" idea.
>Then, you're forced to use tuples as the dictionary keys, because tuples
>are immutable/hashable, while lists are not. This use of tuples doesn't
>fit the "inherently fixed-sized collection of heterogeneous objects" idea
>of a tuple. In this case, a tuple really is just an immutable list. Your
>choice of containers is not based on any theoretical arguments of what each
>type was intended to represent, but the cold hard reality of what
>operations they support.

Note that one can, of course, write Python classes that do exactly what
you want.

Terry Reedy

unread,
Nov 10, 2008, 10:16:28 AM11/10/08
to pytho...@python.org
Duncan Grisby wrote:
> In article <mailman.3702.1226203...@python.org>,
> Terry Reedy <tjr...@udel.edu> wrote:
>
>> Have you written any Python code where you really wanted the old,
>> unpredictable behavior?
>
> I have an object database written in Python. It, like Python, is
> dynamically typed. It heavily relies on being able to sort lists where
> some of the members are None. To some extent, it also sorts lists of
> other mixed types. It will be very hard to migrate this aspect of it
> to Python 3.

Very Hard? Several key functions have been suggested on this thread.
Given that 2.x only sorts most but not all types and that the sort is
only guaranteed to be consistent within a session, as I remember, I
suspect you can choose or write something at least as good for your
purposes.

Robin Becker

unread,
Nov 10, 2008, 10:25:41 AM11/10/08
to pytho...@python.org
.......

In old style python there was a sort of standard behaviour whereby None was
comparable with most of the other primitive types. That at least allowed us to
performs various stupid tricks with data. Unfortunately it seems that None is no
longer orderable even against itself.

Is there any advice on how to create N/A float or integer or string values? I
assume the DB api will continue to return None for null columns no matter what
the column type.

Presumably I can always define my own comparator which makes None < x for all
x!=None.
--
Robin Becker

Steve Holden

unread,
Nov 10, 2008, 10:57:34 AM11/10/08
to pytho...@python.org
Robin Becker wrote:
> .......
>
> In old style python there was a sort of standard behaviour whereby None
> was comparable with most of the other primitive types. That at least
> allowed us to performs various stupid tricks with data. Unfortunately it
> seems that None is no longer orderable even against itself.
>
> Is there any advice on how to create N/A float or integer or string
> values? I assume the DB api will continue to return None for null
> columns no matter what the column type.
>
> Presumably I can always define my own comparator which makes None < x
> for all x!=None.

Yes, you can (though that will mean subtyping the standard Python types
and ensuring you use only the subtypes, not always easy to maintain).

Of course, using SQL against a traditional RDBMS will not return rows
with NULL values for salary in a query such as

SELECT name, address WHERE salary < 10000

precisely *because* NULL (absence of value) does not compare with any
value. So you could say that 3.0 is forcing us to acknowledge database
reality ;-)

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

George Sakkis

unread,
Nov 10, 2008, 11:05:40 AM11/10/08
to

If list grew a frozenlist superclass (just move up all the non-
mutating methods of list) and had a new freeze method, there would be
no need to copy the sequence:

def freeze(self):
self.__class__ = frozenlist

George

Robin Becker

unread,
Nov 10, 2008, 11:32:47 AM11/10/08
to pytho...@python.org
Steve Holden wrote:
.........intain).

>
> Of course, using SQL against a traditional RDBMS will not return rows
> with NULL values for salary in a query such as
>
> SELECT name, address WHERE salary < 10000
>
> precisely *because* NULL (absence of value) does not compare with any
> value. So you could say that 3.0 is forcing us to acknowledge database
> reality ;-)
>
> regards
> Steve
on the other hand I find it odd that

cmp(None,None) fails in Python 3 when None==None returns True.

In fact it seems that because None is non-comparable I need to write at least
three of the comparisons at least as two only leads to errors. So now even
though I can sort my objects with None I still cannot sort [None,None]

class A:
def __init__(self,val):
self.val=val
def __lt__(self,other):
if other is None: return False
return self.val<other.val
def __eq__(self,other):
if other is None:
return False
return self.val==other.val
def __gt__(self,other):
if other is None: return True
return self.val>other.val
def __repr__(self):
return 'A(%s)' % self.val

a=A(1)
b=A(2)
print('cmp(a,b) --> %s'%(cmp(a,b)))
print('a<b --> %s'%(a<b))
print('a>b --> %s'%(a>b))
print('a==b --> %s'%(a==b))
print('a<None --> %s'%(a<None))
print('a>None --> %s'%(a>None))
print('a==None --> %s'%(a==None))
print('None<a --> %s'%(None<a))
print('None>a --> %s'%(None>a))
print('cmp(a,None) --> %s'%(cmp(a,None)))

L=[b,a]
L.sort()
print('L --> %s'%(L))
L=[b,a,None]
L.sort()
print('L --> %s'%(L))
--
Robin Becker

Robin Becker

unread,
Nov 10, 2008, 11:41:58 AM11/10/08
to pytho...@python.org
Robin Becker wrote:
> Steve Holden wrote:
> .........intain).
>>
>> Of course, using SQL against a traditional RDBMS will not return rows
>> with NULL values for salary in a query such as
>>
>> SELECT name, address WHERE salary < 10000
>>
>> precisely *because* NULL (absence of value) does not compare with any
>> value. So you could say that 3.0 is forcing us to acknowledge database
>> reality ;-)
>>
>> regards
>> Steve
> on the other hand I find it odd that
>
> cmp(None,None) fails in Python 3 when None==None returns True.
>
> In fact it seems that because None is non-comparable I need to write at
> least three of the comparisons at least as two only leads to errors. So
> now even though I can sort my objects with None I still cannot sort
> [None,None]
>
........
In fact I'm probably being over optimistic here as even though my silly
[a,None].sort() works it's unlikely in general that an arbitrary list of Nones &
A()s will sort. I think a single None will sort in a list of arbitrary length,
but not two or more. How's that for special cases?
--
Robin Becker

Marc 'BlackJack' Rintsch

unread,
Nov 10, 2008, 12:33:40 PM11/10/08
to
On Mon, 10 Nov 2008 16:32:47 +0000, Robin Becker wrote:

> on the other hand I find it odd that
>
> cmp(None,None) fails in Python 3 when None==None returns True.

That's because there is no order defined for `NoneType` but equality is.

Ciao,
Marc 'BlackJack' Rintsch

ru...@yahoo.com

unread,
Nov 10, 2008, 2:21:18 PM11/10/08
to
On Nov 10, 8:57 am, Steve Holden <st...@holdenweb.com> wrote:
> Robin Becker wrote:
...snip...

>> In old style python there was a sort of standard behaviour whereby None
>> was comparable with most of the other primitive types. That at least
>> allowed us to performs various stupid tricks with data. Unfortunately it
>> seems that None is no longer orderable even against itself.
>>
>> Is there any advice on how to create N/A float or integer or string
>> values? I assume the DB api will continue to return None for null
>> columns no matter what the column type.
>>
>> Presumably I can always define my own comparator which makes None < x
>> for all x!=None.
>
> Yes, you can (though that will mean subtyping the standard Python types
> and ensuring you use only the subtypes, not always easy to maintain).
>
> Of course, using SQL against a traditional RDBMS will not return rows
> with NULL values for salary in a query such as
>
> SELECT name, address WHERE salary < 10000
>
> precisely *because* NULL (absence of value) does not compare with any
> value.

Huh? Thats like saying it's ok if cmp raises an error
when comparing negative numbers because "abs(x)" always
return positive ones. You will find plenty of cases
when db apps return NULL, e.g.:

SELECT name, salary WHERE name LIKE 'Steven %'

So you could say that 3.0 is forcing us to acknowledge database
> reality ;-)

(Again) huh?
Reality in databases is that NULL *is* comparable.
"NULL==something" returns False, it doesn't raise an error.

Arnaud Delobelle

unread,
Nov 10, 2008, 2:26:46 PM11/10/08
to
Robin Becker <ro...@reportlab.com> writes:

> Steve Holden wrote:
> .........intain).
>>
>> Of course, using SQL against a traditional RDBMS will not return rows
>> with NULL values for salary in a query such as
>>
>> SELECT name, address WHERE salary < 10000
>>
>> precisely *because* NULL (absence of value) does not compare with any
>> value. So you could say that 3.0 is forcing us to acknowledge database
>> reality ;-)
>>
>> regards
>> Steve
> on the other hand I find it odd that
>
> cmp(None,None) fails in Python 3 when None==None returns True.
>
> In fact it seems that because None is non-comparable I need to write
> at least three of the comparisons at least as two only leads to
> errors. So now even though I can sort my objects with None I still
> cannot sort [None,None]

Using the key argument you don't have to do all this:

Python 3.0rc1+ (py3k:66521, Sep 21 2008, 07:58:29)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def none_first(x):
... return (0,) if x is None else (1, x)
...
>>> sorted([5, 4, None, 8, None], key=none_first)
[None, None, 4, 5, 8]

--
Arnaud

George Sakkis

unread,
Nov 10, 2008, 2:39:52 PM11/10/08
to
On Nov 10, 2:21 pm, ru...@yahoo.com wrote:

> So you could say that 3.0 is forcing us to acknowledge database
>
> > reality ;-)
>
> (Again) huh?
> Reality in databases is that NULL *is* comparable.
> "NULL==something" returns False, it doesn't raise an error.

Given that in SQL "NULL `op` something" is False for all comparison
operators (even NULL=NULL), raising an error seems a much lesser evil

George

Rhamphoryncus

unread,
Nov 10, 2008, 2:43:59 PM11/10/08
to
On Nov 10, 8:16 am, Terry Reedy <tjre...@udel.edu> wrote:
> Duncan Grisby wrote:
> > In article <mailman.3702.1226203508.3487.python-l...@python.org>,

> >  Terry Reedy  <tjre...@udel.edu> wrote:
>
> >> Have you written any Python code where you really wanted the old,
> >> unpredictable behavior?
>
> > I have an object database written in Python. It, like Python, is
> > dynamically typed. It heavily relies on being able to sort lists where
> > some of the members are None. To some extent, it also sorts lists of
> > other mixed types. It will be very hard to migrate this aspect of it
> > to Python 3.
>
> Very Hard? Several key functions have been suggested on this thread.
> Given that 2.x only sorts most but not all types and that the sort is
> only guaranteed to be consistent within a session, as I remember, I
> suspect you can choose or write something at least as good for your
> purposes.

It's actually worse than that: A different input within a single
session will produce different results. Sorting relies on
transitivity, and comparing such types doesn't provide it.

You might as well comment out the sort and call it good. That's what
you really had in 2.x. It was close enough most of the time to *look*
right, yet in truth it silently failed. 3.0 makes it an explicit
failure.

(There were some situations that worked, but they're exceptional. You
can still do them now, but you need to be explicit (via a key function
or a special singleton.))

Steve Holden

unread,
Nov 10, 2008, 3:21:20 PM11/10/08
to pytho...@python.org
I'm not saying an RDBMS can't return NULL values. I am saying that
comparisons with NULL return NULL, not true or false. SQL uses
three-valued logic.

> So you could say that 3.0 is forcing us to acknowledge database
>> reality ;-)
>
> (Again) huh?
> Reality in databases is that NULL *is* comparable.
> "NULL==something" returns False, it doesn't raise an error.

That's at best misleading and at worst just plain wrong. If I have the
following table T:

+-------+-------+
| a | b |
+-------+-------+
| 1 | 1 |
+-------+-------+
| 2 | NULL |
+-------+-------+

you appear to be telling me that

SELECT * FROM T WHERE b <> 1

will return (2, NULL), whereas in fact it returns the empty set, since
the tests NULL = something, and NULL <> something both in fact return NULL.

You can't do an equality or inequality comparison between NULL and
anything else - even another NULL - and get anything but a NULL result.

You have to explicitly test for NULLs using IS NULL.

Russ P.

unread,
Nov 10, 2008, 3:24:28 PM11/10/08
to
On Nov 8, 10:20 am, walterbyrd <walterb...@iname.com> wrote:
> I have read that in Python 3.0, the following will raise an exception:
>
> >>> [2, 1, 'A'].sort()
>
> Will that raise an exception? And, if so, why are they doing this? How
> is this helpful? Is this new "enhancement" Pythonic?


I realize that I am late to this discussion, but I would like to add
something. I did not read all the replies, so please forgive me if the
following points have already been made.

This new behavior enhances the dynamic typing of Python and is very
helpful for preventing hard-to-detect bugs.

Let me give you an example of one such bug that cost me a significant
amount of time a while back. I had a function that returned a number,
and I meant to use it in a comparison test like this:

if myfunc() < 0: ...

Well, I mistakenly left off the parens:

if myfunc < 0: ...

Python just went ahead and did the comparison anyway. This sort of
behavior is more like Perl, with its weak typing. It is not "Pythonic"
in the sense of strong dynamic typing, without which Python would be
much less powerful than it is. This kind of bug can be hard to detect,
and I am glad that Python 3.0 prevents them. I wonder how much
existing Python code has these kinds of bugs lurking.

ru...@yahoo.com

unread,
Nov 10, 2008, 3:45:19 PM11/10/08
to

Yes, it's just plain wrong. :-( You are correct that it
returns NULL not False. Nevertheless, that typo does
not change my point that NULLs are comparable to other
values in SQL, in contrast to your original post that
seemed to be using SQLs NULL behavior as justification
for Py3K's making None not comparable to anything.

ru...@yahoo.com

unread,
Nov 10, 2008, 3:47:47 PM11/10/08
to
On Nov 10, 12:39 pm, George Sakkis <george.sak...@gmail.com> wrote:

s/False/NULL/.
Why is that evil? It is logically consistent, and more importantly,
useful.

In Python, the logically consistent argument is a little weaker (not
having tri-state logic) but the useful argument certainly still seems
true.

Steve Holden

unread,
Nov 10, 2008, 4:38:09 PM11/10/08
to pytho...@python.org

Well I guess now I understand your real point that's fair enough.

Steven D'Aprano

unread,
Nov 10, 2008, 8:10:00 PM11/10/08
to
On Mon, 10 Nov 2008 12:51:51 +0000, Duncan Grisby wrote:

> I have an object database written in Python. It, like Python, is
> dynamically typed. It heavily relies on being able to sort lists where
> some of the members are None. To some extent, it also sorts lists of
> other mixed types. It will be very hard to migrate this aspect of it to
> Python 3.

No, it is "very hard" to sort *arbitrary* objects consistently. If it
appears to work in Python 2.x that's because you've been lucky to never
need to sort objects that cause it to break.

But if you have a list consisting of only a few types, then it is not
that hard to come up with a strategy for sorting them. All you need is a
key function. Suppose that you want to sort a list of numbers and None.
Then this should do what you expect:

# untested
alist.sort(key=lambda x: (0, -99) if x is None else (1, x))


Another suggestion would be a key function that wraps the objects in a
"compare anything" proxy class. This is very hard to get right for
arbitrary types, which is why sorting in Python 2.x apparently contains
subtle bugs. But if you control the types that can appear in your list,
it's much simpler. I leave the full details as an exercise, but the heart
of it will be something like this:

class SortableProxy(object):
# define the order of types
types = [NoneType, str, int, MyWackyObjectClass]
def __lt__(self, other):
if type(self.proxy) == type(other.proxy):
return self.proxy < other.proxy
p = self.types.index(type(self.proxy)
q = self.types.index(type(other.proxy)
return p < q

I leave it to you to sort out the remaining details (pun intended).

--
Steven

Steven D'Aprano

unread,
Nov 10, 2008, 8:25:01 PM11/10/08
to
On Mon, 10 Nov 2008 11:43:59 -0800, Rhamphoryncus wrote:

> You might as well comment out the sort and call it good. That's what
> you really had in 2.x. It was close enough most of the time to *look*
> right, yet in truth it silently failed. 3.0 makes it an explicit
> failure.

I don't doubt that this is correct, but I think the argument that sorting
in Python 2.x has silent bugs would be much stronger if somebody could
demonstrate arrays that sort wrongly.

A shiny wooden nickel for the first person to show such an example!

--
Steven

Rhamphoryncus

unread,
Nov 10, 2008, 11:31:25 PM11/10/08
to
On Nov 10, 6:25 pm, Steven D'Aprano

>>> sorted([2, 1.5, Decimal('1.6'), 2.7, 2])
[1.5, 2.7000000000000002, Decimal("1.6"), 2, 2]

Where's my nickel? :P

Rhamphoryncus

unread,
Nov 10, 2008, 11:36:27 PM11/10/08
to

Ahh, I knew I had a copy of the time machine keys burried in that
drawer..

http://mail.python.org/pipermail/python-dev/2005-December/059166.html

Tim Roberts

unread,
Nov 11, 2008, 2:07:13 AM11/11/08
to
Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:

>On Sat, 08 Nov 2008 22:53:14 -0800, Kay Schluehr wrote:
>
>>> How often do you care about equality ignoring order for lists
>>> containing arbitrary, heterogeneous types?
>>
>> A few times. Why do you care, Steven?
>
>I'm a very caring kind of guy.

That's the best answer to that question that I've seen in a long time. I'll
remember that one.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Duncan Grisby

unread,
Nov 11, 2008, 5:15:09 AM11/11/08
to
In article <mailman.3754.1226330...@python.org>,
Terry Reedy <tjr...@udel.edu> wrote:

Yes, very hard. There are only ever simple types in the lists --
strings, integers, Nones, very occasionally floats, and lists of those
things. The sort is always predictable with those types. Just because
you can contrive situations to demonstrate unpredictable sorts doesn't
mean that all sorts with mixed types are unpredictable.

The sorting is in a performance-critical part of the system, so the
overhead of evaluating a key function is not insignificant. A key
function that returns objects that contrive to emulate the
functionality of a comparison function is definitely not appropriate.
That area of the system already builds the lists using C++ for speed,
so if we ever migrate to Python 3 it will probably be easier to do the
whole thing in C++ rather than jump through hoops to make the Python
sort work efficiently enough.

M.-A. Lemburg

unread,
Nov 11, 2008, 8:02:59 AM11/11/08
to Steven D'Aprano, pytho...@python.org
On 2008-11-11 02:10, Steven D'Aprano wrote:
> On Mon, 10 Nov 2008 12:51:51 +0000, Duncan Grisby wrote:
>
>> I have an object database written in Python. It, like Python, is
>> dynamically typed. It heavily relies on being able to sort lists where
>> some of the members are None. To some extent, it also sorts lists of
>> other mixed types. It will be very hard to migrate this aspect of it to
>> Python 3.
>
> No, it is "very hard" to sort *arbitrary* objects consistently. If it
> appears to work in Python 2.x that's because you've been lucky to never
> need to sort objects that cause it to break.

If you read Duncan's email, he isn't talking about arbitrary objects
at all. He's just referring to being able to sort lists that contain
None elements.

That's far from arbitrary and does work consistently in Python 2.x -
simply because None is a singleton which is special cased in Python:
None compares smaller to any other object in Python.

I'm not sure why this special case was dropped in Python 3.0. None
is generally used to be a place holder for a n/a-value and as
such will pop up in lists on a regular basis.

I think the special case for None should be readded to Python 3.0.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Nov 11 2008)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::


eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

George Sakkis

unread,
Nov 11, 2008, 10:59:19 AM11/11/08
to
On Nov 11, 8:02 am, "M.-A. Lemburg" <m...@egenix.com> wrote:
> On 2008-11-11 02:10, Steven D'Aprano wrote:
>
> > On Mon, 10 Nov 2008 12:51:51 +0000, Duncan Grisby wrote:
>
> >> I have an object database written in Python. It, like Python, is
> >> dynamically typed. It heavily relies on being able to sort lists where
> >> some of the members are None. To some extent, it also sorts lists of
> >> other mixed types. It will be very hard to migrate this aspect of it to
> >> Python 3.
>
> > No, it is "very hard" to sort *arbitrary* objects consistently. If it
> > appears to work in Python 2.x that's because you've been lucky to never
> > need to sort objects that cause it to break.
>
> If you read Duncan's email, he isn't talking about arbitrary objects
> at all. He's just referring to being able to sort lists that contain
> None elements.
>
> That's far from arbitrary and does work consistently in Python 2.x -
> simply because None is a singleton which is special cased in Python:
> None compares smaller to any other object in Python.
>
> I'm not sure why this special case was dropped in Python 3.0. None
> is generally used to be a place holder for a n/a-value and as
> such will pop up in lists on a regular basis.
>
> I think the special case for None should be readded to Python 3.0.

On python-ideas I proposed adding two new builtin singletons instead,
Smallest and Largest, since the behavior of None wrt comparisons was
never officially part of the language.

George

Robin Becker

unread,
Nov 11, 2008, 11:16:03 AM11/11/08
to pytho...@python.org
M.-A. Lemburg wrote:
> On 2008-11-11 02:10, Steven D'Aprano wrote:
>> On Mon, 10 Nov 2008 12:51:51 +0000, Duncan Grisby wrote:
>>
>>> I have an object database written in Python. It, like Python, is
>>> dynamically typed. It heavily relies on being able to sort lists where
>>> some of the members are None. To some extent, it also sorts lists of
>>> other mixed types. It will be very hard to migrate this aspect of it to
>>> Python 3.
>> No, it is "very hard" to sort *arbitrary* objects consistently. If it
>> appears to work in Python 2.x that's because you've been lucky to never
>> need to sort objects that cause it to break.
>
> If you read Duncan's email, he isn't talking about arbitrary objects
> at all. He's just referring to being able to sort lists that contain
> None elements.
>
> That's far from arbitrary and does work consistently in Python 2.x -
> simply because None is a singleton which is special cased in Python:
> None compares smaller to any other object in Python.
>
> I'm not sure why this special case was dropped in Python 3.0. None
> is generally used to be a place holder for a n/a-value and as
> such will pop up in lists on a regular basis.
>
> I think the special case for None should be readded to Python 3.0.
>
I agree here, it seems strange that cmp(None,None) is exceptional. Clearly the
is relation applies to None so does ==. Do we not have a sorting order for sets
with one element? My maths is now shot, but I seem to remember there are
automatic orders for such simple sets.
--
Robin Becker

Terry Reedy

unread,
Nov 11, 2008, 1:40:13 PM11/11/08
to pytho...@python.org
Duncan Grisby wrote:

> Yes, very hard.

There is a difference between 'very hard' (to get 'right') and 'to slow'
(for a particular application). I accept the latter.

> There are only ever simple types in the lists --
> strings, integers, Nones, very occasionally floats, and lists of those
> things. The sort is always predictable with those types. Just because
> you can contrive situations to demonstrate unpredictable sorts doesn't
> mean that all sorts with mixed types are unpredictable.

The 2.5 manual (and I sure before that) *intentially* defines the
default cross-type comparisons as unreliable.

"(This unusual definition of comparison was used to simplify the
definition of operations like sorting and the in and not in operators.
In the future, the comparison rules for objects of different types are
likely to change.)"

They have changed in the past and now they change again (yes, a little
more drastically this time, but as expected for some years).

> The sorting is in a performance-critical part of the system, so the
> overhead of evaluating a key function is not insignificant. A key
> function that returns objects that contrive to emulate the
> functionality of a comparison function is definitely not appropriate.
> That area of the system already builds the lists using C++ for speed,
> so if we ever migrate to Python 3 it will probably be easier to do the
> whole thing in C++ rather than jump through hoops to make the Python
> sort work efficiently enough.

Assuming the premises, agreed. No hurry, but you can even pull
timsort() out of the source, if you otherwise like its large-list
behavior, and hardcode the comparison function.

Terry Jan Reedy

Terry Reedy

unread,
Nov 11, 2008, 1:52:25 PM11/11/08
to pytho...@python.org
M.-A. Lemburg wrote:

>
> I think the special case for None should be readded to Python 3.0.

Quick summary of thread that MAL started on Python-3000-dev list:

Once upon a time, 0 < None was true.

When rich comparisons were added, None < 0 (and *most* other things)
become true as an intentionally undocumented implementation detail.

The None rule only applies for sure when None controls the comparison:
ob < None is true or undefined if type(ob) says so.

Guido's pronouncement: "In short, I'll have None of it."
summarizing

We're not going to add the "feature" back that None compares smaller
than everything. It's a slippery slope that ends with all operations
involving None returning None -- I've seen a proposal made in all
earnestness requesting that None+42 == None, None() == None, and so
on. This Nonesense was wisely rejected; a whole slew of
early-error-catching would have gone out of the window. It's the same
with making None smaller than everything else. For numbers, you can
already use -inf; for other types, you'll have to invent your own
Smallest if you need it.

tjr

"Martin v. Löwis"

unread,
Nov 11, 2008, 3:21:05 PM11/11/08
to Duncan Grisby
> The sorting is in a performance-critical part of the system, so the
> overhead of evaluating a key function is not insignificant.

Can you easily produce an example? It doesn't have to be real data,
but should have the structure (typewise) of the real data. I would
like to perform some measurements. For example, I could imagine that

l = []
for i in range(1000):
x = random.randint(0,100)
if x < 4: l.append(None)
else: l.append(x)

might adequately model your problem.

Regards,
Martin

Robin Becker

unread,
Nov 11, 2008, 4:23:06 PM11/11/08
to
This still doesn't explain why None should not be comparable to itself.

--
Robin Becker

Ben Finney

unread,
Nov 11, 2008, 6:17:05 PM11/11/08
to
Terry Reedy <tjr...@udel.edu> writes:

> We're not going to add the "feature" back that None compares smaller
> than everything. It's a slippery slope that ends with all operations
> involving None returning None -- I've seen a proposal made in all
> earnestness requesting that None+42 == None, None() == None, and so
> on. This Nonesense was wisely rejected

I agree with that decision. However, the behaviour you specify *is*
useful (though I don't think ‘None’ should have that behaviour). It is
the “Null object” design pattern, and may be familiar to many
readers in its SQL implementation as the ‘NULL’ non-value.

In fact, there is a Python Cookbook recipe implementing a ‘Null’
object <URL:http://code.activestate.com/recipes/68205/> that also
features in the O'Reilly _Python Cookbook, second edition_.

--
\ “This world in arms is not spending money alone. It is spending |
`\ the sweat of its laborers, the genius of its scientists, the |
_o__) hopes of its children.” —Dwight Eisenhower, 1953-04-16 |
Ben Finney

Terry Reedy

unread,
Nov 11, 2008, 7:00:49 PM11/11/08
to pytho...@python.org
Ben Finney wrote:
> Terry Reedy <tjr...@udel.edu> writes:
>
>> We're not going to add the "feature" back that None compares smaller
>> than everything. It's a slippery slope that ends with all operations
>> involving None returning None -- I've seen a proposal made in all
>> earnestness requesting that None+42 == None, None() == None, and so
>> on. This Nonesense was wisely rejected
>
> I agree with that decision. However, the behaviour you specify *is*

For the record, I was quoting Guido there.

Robin Becker

unread,
Nov 12, 2008, 4:33:23 PM11/12/08
to
Ben Finney wrote:
> Terry Reedy <tjr...@udel.edu> writes:
>
>> We're not going to add the "feature" back that None compares smaller
>> than everything. It's a slippery slope that ends with all operations
>> involving None returning None -- I've seen a proposal made in all
>> earnestness requesting that None+42 == None, None() == None, and so
>> on. This Nonesense was wisely rejected
>
> I agree with that decision. However, the behaviour you specify *is*
> useful (though I don't think ‘None’ should have that behaviour). It is
> the “Null object” design pattern, and may be familiar to many
> readers in its SQL implementation as the ‘NULL’ non-value.
>
> In fact, there is a Python Cookbook recipe implementing a ‘Null’
> object <URL:http://code.activestate.com/recipes/68205/> that also
> features in the O'Reilly _Python Cookbook, second edition_.
>
the difficulty here is that everybody will implement different Null
objects and lead to unwanted fragmentation.
--
Robin Becker

Duncan Grisby

unread,
Nov 19, 2008, 6:33:32 AM11/19/08
to
In article <4919E931...@v.loewis.de>,

Sorry for the delay in replying. Yes, that's not far off. Most of the
time the lists contain strings, though. A better approximation might
be to read lines from a file and randomly replace them with Nones:

l = []
for line in open("bigfile.txt"):


x = random.randint(0,100)
if x < 4: l.append(None)

else: l.append(line)

And maybe once in a while you end up with something not dissimilar to:

l = []
for line in open("bigfile.txt"):


x = random.randint(0,100)
if x < 4: l.append(None)

elif x < 5: l.append([line,line])
else: l.append(line)

In that kind of case it doesn't really matter what happens to list
items in the sort order, but it's important it doesn't fail to sort
the ones that are strings.

Terry Reedy

unread,
Nov 19, 2008, 11:49:00 AM11/19/08
to pytho...@python.org
Duncan Grisby wrote:


> Sorry for the delay in replying. Yes, that's not far off. Most of the
> time the lists contain strings, though. A better approximation might
> be to read lines from a file and randomly replace them with Nones:
>
> l = []
> for line in open("bigfile.txt"):
> x = random.randint(0,100)
> if x < 4: l.append(None)
> else: l.append(line)

So use '' or '\0' instead of None for null lines. Or replace None for
the sort.

> And maybe once in a while you end up with something not dissimilar to:
>
> l = []
> for line in open("bigfile.txt"):
> x = random.randint(0,100)
> if x < 4: l.append(None)
> elif x < 5: l.append([line,line])
> else: l.append(line)
>
> In that kind of case it doesn't really matter what happens to list
> items in the sort order, but it's important it doesn't fail to sort
> the ones that are strings.

Replace the sublists with a coded string, such as '\0'+line.
If sorting is time (and space) critical, you want a straight string sort
without key *or* cmp function.

tjr

Duncan Grisby

unread,
Nov 24, 2008, 5:42:41 AM11/24/08
to
In article <mailman.4252.1227113...@python.org>,
Terry Reedy <tjr...@udel.edu> wrote:

[...]


>> l = []
>> for line in open("bigfile.txt"):
>> x = random.randint(0,100)
>> if x < 4: l.append(None)
>> else: l.append(line)
>
>So use '' or '\0' instead of None for null lines. Or replace None for
>the sort.

Both '' and '\0' could conceivably be valid data values. The problem
is that the data comes from a dynamically-typed database, so there is
no guarantee what the data values might be. This for loop with a file
is just an example that generates similar-looking data. It bears no
relation to the way the data is actually acquired.

>> And maybe once in a while you end up with something not dissimilar to:
>>
>> l = []
>> for line in open("bigfile.txt"):
>> x = random.randint(0,100)
>> if x < 4: l.append(None)
>> elif x < 5: l.append([line,line])
>> else: l.append(line)
>>
>> In that kind of case it doesn't really matter what happens to list
>> items in the sort order, but it's important it doesn't fail to sort
>> the ones that are strings.
>
>Replace the sublists with a coded string, such as '\0'+line.

Again, this is just an example. As I say, in the real application, the
data has come from a dynamically-typed database. There's no easy way
to coerce all the items to the same type, because the application
doesn't know up-front what the data is going to look like. In some
cases, it may be that all the data items are lists of strings. In
other cases, all or most of the data items will be integers, for
example.

>If sorting is time (and space) critical, you want a straight string sort
>without key *or* cmp function.

That's exactly my point. Currently, the application just builds a list
of values retrieved from the database and asks Python to sort it. No
key or cmp function. With Python 3.0, it's going to require a lot more
work than that.

For my application, Python 3's comparison behaviour is a backwards
step. You can argue all you like that the new behaviour is the "right"
thing to do, and I'd be inclined to agree with you from a
philosophical point of view, but the fact is that it _will_ cause
problems for existing real code. The particularly ironic thing is that
the database's dynamic typing is closely modelled on Python, including
it's ability to gracefully handle mixed-type lists.

Aahz

unread,
Nov 24, 2008, 10:16:18 AM11/24/08
to
In article <ByvWk.21081$oK4....@newsfe18.ams2>,

Duncan Grisby <dunca...@grisby.org> wrote:
>In article <mailman.4252.1227113...@python.org>,
>Terry Reedy <tjr...@udel.edu> wrote:
>>
>>Replace the sublists with a coded string, such as '\0'+line.
>
>Again, this is just an example. As I say, in the real application, the
>data has come from a dynamically-typed database. There's no easy way
>to coerce all the items to the same type, because the application
>doesn't know up-front what the data is going to look like. In some
>cases, it may be that all the data items are lists of strings. In
>other cases, all or most of the data items will be integers, for
>example.
>
>>If sorting is time (and space) critical, you want a straight string sort
>>without key *or* cmp function.
>
>That's exactly my point. Currently, the application just builds a list
>of values retrieved from the database and asks Python to sort it. No
>key or cmp function. With Python 3.0, it's going to require a lot more
>work than that.

Unless I misunderstand you, your application is broken already. Complex
numbers have never been sortable. If your application works more like a
SQL database, then any given column will have only one datatype and as
long as you avoid types that cannot be compared against themselves
you're safe.

(I'll agree that from some perspectives the new behavior of None is a
wart but I think that in the end I agree with people who say that
preventing None from being sorted except intentionally will trap more
bugs earlier.)

>For my application, Python 3's comparison behaviour is a backwards
>step. You can argue all you like that the new behaviour is the "right"
>thing to do, and I'd be inclined to agree with you from a
>philosophical point of view, but the fact is that it _will_ cause
>problems for existing real code. The particularly ironic thing is that
>the database's dynamic typing is closely modelled on Python, including
>it's ability to gracefully handle mixed-type lists.

What I think people are arguing about is your use of the word "backward".
I don't think anyone claims that fixing your application will be trivial,
but your application appears to be already broken, and Python 3.0 is
simply forcing you to confront it head-on.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

Tim Rowe

unread,
Nov 24, 2008, 12:33:50 PM11/24/08
to pytho...@python.org
2008/11/24 Aahz <aa...@pythoncraft.com>:

> (I'll agree that from some perspectives the new behavior of None is a
> wart but I think that in the end I agree with people who say that
> preventing None from being sorted except intentionally will trap more
> bugs earlier.)

So will Python be introducing strong type checking and early binding,
so we can catch more bius earlier (compile rather than run time?) ;-)

--
Tim Rowe

Duncan Grisby

unread,
Nov 24, 2008, 12:42:43 PM11/24/08
to
In article <ggegg2$ji8$1...@panix3.panix.com>, Aahz <aa...@pythoncraft.com> wrote:

>In article <ByvWk.21081$oK4....@newsfe18.ams2>,
>Duncan Grisby <dunca...@grisby.org> wrote:

[...]


>>That's exactly my point. Currently, the application just builds a list
>>of values retrieved from the database and asks Python to sort it. No
>>key or cmp function. With Python 3.0, it's going to require a lot more
>>work than that.
>
>Unless I misunderstand you, your application is broken already.

You misunderstand me.

My application is not broken. It handles a wide range of data types,
_but they are all sortable_. It doesn't involve any complex numbers.
There is no way a complex number could be stored in the database, so
that is not an issue.

> Complex
>numbers have never been sortable. If your application works more like a
>SQL database, then any given column will have only one datatype and as
>long as you avoid types that cannot be compared against themselves
>you're safe.

My database isn't like a SQL database. It's dynamically typed, just
like Python. The range of types is limited to things that can be
marshalled into the underlying storage, so I can absolutely guarantee
that they are all sortable in Python 2, even though the permitted
range or types is quite wide.

[...]


>>For my application, Python 3's comparison behaviour is a backwards
>>step. You can argue all you like that the new behaviour is the "right"
>>thing to do, and I'd be inclined to agree with you from a
>>philosophical point of view, but the fact is that it _will_ cause
>>problems for existing real code. The particularly ironic thing is that
>>the database's dynamic typing is closely modelled on Python, including
>>it's ability to gracefully handle mixed-type lists.
>
>What I think people are arguing about is your use of the word "backward".
>I don't think anyone claims that fixing your application will be trivial,
>but your application appears to be already broken, and Python 3.0 is
>simply forcing you to confront it head-on.

The post you've quoted was the first time I used the word "backwards".
I didn't start this thread, I merely chipped in when people claimed
there were no real applications that would be affected by this change
to Python. I do have a real, widely deployed application that will be
affected. Claiming that it is not affected, or that it is already
"broken" does not change that fact.

Several people _have_ in fact claimed that I can't possibly have a
hard problem to solve in this respect. In general, the responses can
be grouped into those people claiming that it's trivial to fix, and
those (like you) claiming that it's ok that it's hard to fix because
it's already broken. Neither are true.

This issue is not impossible to deal with, of course, it's just one of
several things that will mean it's hard for us to migrate to Python 3.
What I find disturbing is the attitude that this change to Python's
comparison behaviour can't possibly have any downsides and that anyone
claiming it does is wrong.

"Martin v. Löwis"

unread,
Nov 24, 2008, 5:19:08 PM11/24/08
to
> Sorry for the delay in replying.

Thanks, and I'm in turn sorry myself, too. Here is my experiment:

I only looked at the first test case, where a bigfile.txt has
occasionally None lines. To run this test, I created a file with
33079296 and 704512 lines, by repeatedly cat'ing /etc/passwd.

I then ran this program:

import time,random
l = []

t=time.time()
for line in open("x"):


x = random.randint(0,100)
if x < 4: l.append(None)
else: l.append(line)

print(time.time()-t)

t=time.time()
for i in range(10):
L = l[:]
L.sort()
print(time.time()-t)

def key(n):
return n or ""

t=time.time()
for i in range(10):
L = l[:]
L.sort(key=key)
print(time.time()-t)

This key function is based on the assumption that
there won't be any strings in l, which is reasonable,
because they are all lines read from the input file
(i.e. contain \n at least). If truly empty strings
where possible, I'd map them to chr(0), which I would
assume couldn't occur elsewhere in the input.

With 2.5, I get these results (rounded)

3.8s
7.6s
14.3s

So with the key function, the overhead appears to have doubled.
I think the ratio should decrease as lists grow larger (because
the key function is linear, and the sort is nlogn, assuming the
overhead is in computing the key function), but I haven'd done
any series varying the file size.

With 3.0(rc3), I get (omitting the key-less sorting, as it won't
work)

24.8s
15.9s

So now the IO is much larger than the actual sorting, which is
about the same in 2.5 and 3.0. If you wonder where the overhead
of IO comes from: part of it probably from the UTF-8 decoding,
that the IO library now does transparently.

I don't know whether this overhead qualifies as "not
insignificant", most people probably would claim that a doubling
in time is indeed significant. OTOH, not knowing what the original
problem is (e.g. whether it's interactive or in a batch mode, and
whether bigfile.txt is much smaller or much larger than 32MiB),
I would probably
a) evaluate whether the increase in IO time is acceptable, and
b) if it is, just go with 3.0, as the increase in sorting time
is much smaller than the increase in IO time.

Regards,
Martin

Aahz

unread,
Nov 25, 2008, 9:39:34 PM11/25/08
to
In article <nIBWk.1525$sn1...@newsfe20.ams2>,

Duncan Grisby <dunca...@grisby.org> wrote:
>In article <ggegg2$ji8$1...@panix3.panix.com>, Aahz <aa...@pythoncraft.com> wrote:
>>In article <ByvWk.21081$oK4....@newsfe18.ams2>,
>>Duncan Grisby <dunca...@grisby.org> wrote:
>>>
>>>For my application, Python 3's comparison behaviour is a backwards
>>>step. You can argue all you like that the new behaviour is the "right"
>>>thing to do, and I'd be inclined to agree with you from a
>>>philosophical point of view, but the fact is that it _will_ cause
>>>problems for existing real code. The particularly ironic thing is that
>>>the database's dynamic typing is closely modelled on Python, including
>>>it's ability to gracefully handle mixed-type lists.
>>
>>What I think people are arguing about is your use of the word "backward".
>>I don't think anyone claims that fixing your application will be trivial,
>>but your application appears to be already broken, and Python 3.0 is
>>simply forcing you to confront it head-on.
>
>The post you've quoted was the first time I used the word "backwards".
>I didn't start this thread, I merely chipped in when people claimed
>there were no real applications that would be affected by this change
>to Python. I do have a real, widely deployed application that will be
>affected. Claiming that it is not affected, or that it is already
>"broken" does not change that fact.

Fair enough; I've been mostly skimming the thread and only your use of
"backward" prompted me to chime in. ;-)

>This issue is not impossible to deal with, of course, it's just one of
>several things that will mean it's hard for us to migrate to Python 3.
>What I find disturbing is the attitude that this change to Python's
>comparison behaviour can't possibly have any downsides and that anyone
>claiming it does is wrong.

Well, I agree with you that it does make things more difficult in some
respects. I also think that it's an aggregate improvement that is
similar to the way L.sort() returns None.

Steven D'Aprano

unread,
Nov 25, 2008, 10:08:28 PM11/25/08
to

Python already has strong type checking.

You're confusing it with static types, which Python doesn't have.


--
Steven

Steven D'Aprano

unread,
Nov 25, 2008, 10:19:41 PM11/25/08
to
On Mon, 24 Nov 2008 10:42:41 +0000, Duncan Grisby wrote:

> Again, this is just an example. As I say, in the real application, the
> data has come from a dynamically-typed database. There's no easy way to
> coerce all the items to the same type, because the application doesn't
> know up-front what the data is going to look like. In some cases, it may
> be that all the data items are lists of strings. In other cases, all or
> most of the data items will be integers, for example.

...


> For my application, Python 3's comparison behaviour is a backwards step.
> You can argue all you like that the new behaviour is the "right" thing
> to do, and I'd be inclined to agree with you from a philosophical point
> of view, but the fact is that it _will_ cause problems for existing real
> code. The particularly ironic thing is that the database's dynamic
> typing is closely modelled on Python, including it's ability to
> gracefully handle mixed-type lists.

If I have understood you correctly, essentially you just need a variation
on the Null object pattern, one that accepts comparisons with any other
type. That's easy, because you don't have to worry about mutual
comparisons of multiple arbitrary types (sorting mixed strings and
numbers and null). All you need is something that sorts below everything
else: something just like None, except sortable.

Unfortunately you can't inherit from NoneType (that I know of...) but
inheriting from object should be good enough. If it isn't, you can use
delegation instead of inheritance. I leave that as an exercise for the
reader.

# not tested in Python 3
class SortableNone(object):
def __lt__(self, other):
return True

Null = SortableNone()

Now just use Null instead of None in your database.


--
Steven

0 new messages