Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

python 3: sorting with a comparison function

1 view
Skip to first unread message

Thomas Heller

unread,
Oct 9, 2008, 3:32:09 PM10/9/08
to
Does Python 3 have no way anymore to sort with a comparison function?

Both [].sort() and sorted() seem to accept only 'key' and 'reverse' arguments,
the 'cmp' argument seems to be gone. Can that be?

Thomas

bearoph...@lycos.com

unread,
Oct 9, 2008, 4:36:58 PM10/9/08
to
Thomas Heller:

> the 'cmp' argument seems to be gone. Can that be?

Yes, that's a wonderful thing, because from the code I see around
99.9% of people see the cmp and just use it, totally ignoring the
presence of the 'key' argument, that allows better and shorter
solutions of the sorting problem. So removing the cmp is the only way
to rub the nose of programmers on the right solution, and it goes well
with the Python "There should be one-- and preferably only one --
obvious way to do it.".

For most of very uncommon situations where key isn't the right thing,
you can use this code by Hettinger:

def cmp2key(mycmp):
"Converts a cmp= function into a key= function"
class K:
def __init__(self, obj, *args):
self.obj = obj
def __cmp__(self, other):
return mycmp(self.obj, other.obj)
return K
s.sort(key=cmp2key(lambda p, q: cmp(p.lower(), q.lower())))

That code can't be used in one situation: when the array to be sorted
is huge, that situation can be handled by the original true cmp
function, but not by that cmp2key(). But I have met such situation so
far. When you have an huge amount of data, use an external sort, even
Windows has one, even if its usage is a bit tricky (linux sort command
is safer).

Bye,
bearophile

Terry Reedy

unread,
Oct 9, 2008, 5:44:56 PM10/9/08
to pytho...@python.org

Yes. When this was discussed, no one could come up with an actual use
case in which the compare function was not based on a key function.
Calling the key function n times has to be faster than calling a compare
function n to O(nlogn) times with 2 keys computed for each call. The
main counter argument would be if there is no room in memory for the
shadow array of key,index pairs. And that can be at least sometimes
handled by putting the original on disk and sorting an overt key,index
array. Or by using a database.

Thomas Heller

unread,
Oct 10, 2008, 6:06:37 AM10/10/08
to
> Thomas Heller wrote:
>> Does Python 3 have no way anymore to sort with a comparison function?
>>
>> Both [].sort() and sorted() seem to accept only 'key' and 'reverse' arguments,
>> the 'cmp' argument seems to be gone. Can that be?

Terry Reedy schrieb:

> Yes. When this was discussed, no one could come up with an actual use
> case in which the compare function was not based on a key function.
> Calling the key function n times has to be faster than calling a compare
> function n to O(nlogn) times with 2 keys computed for each call. The
> main counter argument would be if there is no room in memory for the
> shadow array of key,index pairs. And that can be at least sometimes
> handled by putting the original on disk and sorting an overt key,index
> array. Or by using a database.
>

bearoph...@lycos.com schrieb:

> Yes, that's a wonderful thing, because from the code I see around
> 99.9% of people see the cmp and just use it, totally ignoring the
> presence of the 'key' argument, that allows better and shorter
> solutions of the sorting problem. So removing the cmp is the only way
> to rub the nose of programmers on the right solution, and it goes well
> with the Python "There should be one-- and preferably only one --
> obvious way to do it.".


Thanks, I got it now.

Thomas

Kay Schluehr

unread,
Oct 10, 2008, 8:35:18 AM10/10/08
to
On 9 Okt., 22:36, bearophileH...@lycos.com wrote:

> Yes, that's a wonderful thing, because from the code I see around
> 99.9% of people see the cmp and just use it, totally ignoring the
> presence of the 'key' argument, that allows better and shorter
> solutions of the sorting problem.

Me too because I don't get this:

"key specifies a function of one argument that is used to extract a
comparison key from each list element: key=str.lower. The default
value is None."

Kay

prue...@latinmail.com

unread,
Oct 10, 2008, 1:22:28 PM10/10/08
to

Don't know if further explanation is needed, but here is the deal:

cmp is a function that receives two values and you return -1, 0 or 1
depending if the first is smaller, equal or bigger. 99% of the time
you will do some operation on the values that come in and then do a if
statement with ">" or "<" and return -1,0,1.

key is a function that receives one value and you return the value
that you would normally compare against.

Let me show an example:

>>> data=[(4,'v'),(2,'x'),(1,'a')]
>>> sorted(data)
[(1, 'a'), (2, 'x'), (4, 'v')]

OK, we sorted the data, but What if we want to sort by the letter
instead of the number? Let's use cmp:

>>> def comp(x, y):
key_of_x=x[1]
key_of_y=y[1]
if key_of_x < key_of_y:
return -1
elif key_of_x > key_of_y:
return 1
else:
return 0 #key_of_x == key_of_y

>>> sorted(data,cmp=comp)
[(1, 'a'), (4, 'v'), (2, 'x')]

Very well, so how do we do this using key?

>>> def keyfunc(x):
key_of_x=x[1]
return key_of_x

>>> sorted(data,key=keyfunc)
[(1, 'a'), (4, 'v'), (2, 'x')]


Same output. Very good.

(Of course a smart python developer would use the operator module so
he doesn't even have to write keyfunc but this was just an example)

In summary to transform most cmp functions to a key function you just
take the code that calculates the first value to be compared and leave
out the rest of the logic.

Hope that was helpful.

Kay Schluehr

unread,
Oct 10, 2008, 2:03:40 PM10/10/08
to

Yes, thanks a lot. In essence the "key" is a function that maps each
list element onto a value of a type for which a known order is defined
e.g. an integer. Applying sorted() sorts the list elements according
to the list of those values.

Sometimes it helps when people just make clear how they use technical
terms instead of invoking vague associations.

bearoph...@lycos.com

unread,
Oct 10, 2008, 2:38:18 PM10/10/08
to
Kay Schluehr:

> Sometimes it helps when people just make clear how they use technical
> terms instead of invoking vague associations.

And generally Python docs can enjoy growing few thousands examples...

Bye,
bearophile

Paul McGuire

unread,
Oct 10, 2008, 2:55:28 PM10/10/08
to
IIRC, the return values are not limited to -1, 0, and 1, but are more
like "any value less than 0", 0, and "any value greater than 0". This
allows you to implement numeric cmp routines as:

def cmp(x,y):
return x-y

or just:

cmp = lambda x,y: x-y

-- Paul

Kay Schluehr

unread,
Oct 10, 2008, 3:47:45 PM10/10/08
to

Cleaning up and extending documentation is a large community effort
that requires an informational PEP for guidelines and management
support by the python-dev leads. The official documentation is ad hoc
and just about better than nothing. A Python documentation guideline
might also have positive impact on 3rd party package authors like us.

Generally Python has become a very well managed project. I hope the
docs as well as the stdlib will become the major concerns of Python
3.1.

Thomas Heller

unread,
Oct 10, 2008, 4:28:45 PM10/10/08
to
bearoph...@lycos.com schrieb:

Well, that may not be necessary. But I think that a clear example how to use
the 'key=' parameter in the sort() and sorted() method/function is badly needed.

Thomas

Terry Reedy

unread,
Oct 10, 2008, 5:04:14 PM10/10/08
to pytho...@python.org
Kay Schluehr wrote:

> Me too because I don't get this:
>
> "key specifies a function of one argument that is used to extract a
> comparison key from each list element: key=str.lower. The default
> value is None."

I am not sure what you do not get, but it should say 'for example,
key=str.lower." None is the default value of key.

Kay Schluehr

unread,
Oct 10, 2008, 6:38:41 PM10/10/08
to

See the discussion above.

0 new messages