Go to Google Groups Home    comp.lang.python
mini-howto: sorting

Michael Scharf <michael.sch...@gmx.de>

Andrew Dalke wrote:

> I know, it's somewhat silly, but I went through a little tutorial
> here showing a half dozen ways to sort a list with the built-in
> "sort" method.  It was short so I wrote it up in case it is
> deserving of a mini HOWTO.  (Yeah, I still haven't written up the
> exception classes HOWTO ):  I think it's deserving because there's
> not one spot in the manual or Lutz' book talking about the different
> ways.
[snip]
> >>> class CmpAttr:
> >>>     def __init__(self, attr):
> >>>         self.attr = attr
> >>>     def __call__(self, x, y):
> >>>         return cmp(getattr(x, self.attr), getattr(y, self.attr))

[snip]

I would like to add yet another way of sorting: a little
sorting framework using design patterns [1]. A few days
ago, a colleague asked me how to sort tables on different
columns (some of them might be sorted inverse). In a
few minutes I created a little framework in python
(which he then implemented in C++). This is yet another
example of python as prototyping language. But I think,
the framework can also be used in python directly.

Let's take the Spam class:

>>> class Spam:
>>>     def __init__(self, spam, eggs):
>>>         self.spam = spam
>>>         self.eggs = eggs
>>>     def __repr__(self):
>>>         return 'Spam(s=%s,e=%s)' %(repr(self.spam),repr(self.eggs))
>>> a = [Spam(1, 4), Spam(9, 3), Spam(4,6),Spam(4,4)]

If you want to sort on two fields (e.g columns in tables or
two attributes) you might use the composite comparer:

>>> class CmpComposite:
>>>     """Takes a list of compare functions and sorts in that order."""
>>>     def __init__(self,*comparers):
>>>         self.comparers=comparers
>>>     def __call__(self,a,b):
>>>         for cmp in self.comparers:
>>>             c=cmp(a,b)
>>>             if c:
>>>                 return c
>>>         return 0

Now you can sort on Spam.spam, eggs.spam:

>>> a.sort(CmpComposite(CmpAttr('spam'),CmpAttr('eggs')))
>>> print a

[Spam(s=1,e=4), Spam(s=4,e=4), Spam(s=4,e=6), Spam(s=9,e=3)]

Or you can sort on Spam.eggs, Spam.spam:

>>> a.sort(CmpComposite(CmpAttr('eggs'),CmpAttr('spam')))
>>> print a

[Spam(s=9,e=3), Spam(s=1,e=4), Spam(s=4,e=4), Spam(s=4,e=6)

But what if you want to sort inverse on spam? Define a CompInverse:

>>> class CmpInverse:
>>>     """Inverses the effect of a cmp."""
>>>     def __init__(self,cmp):
>>>         self.cmp=cmp
>>>     def __call__(self,a,b):
>>>         return -self.cmp(a,b)

Let's try it:

>>> a.sort(CmpInverse(CmpAttr('spam')))

[Spam(s=9,e=3), Spam(s=4,e=6), Spam(s=4,e=4), Spam(s=1,e=4)

OK, and now sort on eggs and inverse on spam:

>>> a.sort(CmpComposite(CmpAttr('eggs'),CmpInverse(CmpAttr('spam'))))

[Spam(s=9,e=3), Spam(s=4,e=4), Spam(s=1,e=4), Spam(s=4,e=6)]

We can use our little framework to sort tables as well:

>>> class CmpColumn:
>>>     """Sorts on an index of a sequence."""
>>>     def __init__(self,column):
>>>         self.column=column
>>>     def __call__(self,a,b):
>>>         return cmp(a[self.column],b[self.column])

>>> table=[(1,2), (1,3), (1,0), (2,0), (0,3), (3,2)]

Let's now sort our table on column 0 and inverse on column 1:

>>> table.sort(CmpComposite(CmpInverse(CmpColumn(1)),CmpColumn(0)))
>>> print table

[(0, 3), (1, 3), (1, 2), (1, 0), (2, 0), (3, 2)]

In terms of design patterns, we can identify CmpInverse
as a Decorator and CmpComposite as Composite pattern.
CmpColumn and CmpAttr are Strategy patterns. It is now
relatively easy to sort on arbitrary combinations of
attributes. The disadvantage is: it can become quite
slow for big lists.

[1] Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides:
    Design Patterns: Elements of Reusable Object-Oriented Software.
    Addison-Wesley Pub Co; ISBN: 0201633612; 1994.
    http://www.amazon.com/exec/obidos/ASIN/0201633612

Michael
--
     ''''\     Michael Scharf
    ` c-@@     TakeFive Software
    `    >     http://www.TakeFive.com
     \_ V      mailto:Michael_Sch...@TakeFive.co.at

[ Comparers.py 2K ]
"""A set of comparer classes."""

class CmpComposite:
    """Takes a list of compare functions and sorts in that order."""
    def __init__(self,*comparers):
        self.comparers=comparers
    def __call__(self,a,b):
        for cmp in self.comparers:
            c=cmp(a,b)
            if c:
                return c
        return 0

class CmpInverse:
    """Inverses the effect of a cmp."""
    def __init__(self,cmp):
        self.cmp=cmp
    def __call__(self,a,b):
        return -self.cmp(a,b)

class CmpColumn:
    """Sorts on an index of a sequence."""
    def __init__(self,column):
        self.column=column
    def __call__(self,a,b):
        return cmp(a[self.column],b[self.column])
class CmpAttr:
    """Sorts on an attribute."""
    def __init__(self, attr):
        self.attr = attr
    def __call__(self, x, y):
        return cmp(getattr(x, self.attr), getattr(y, self.attr))

def test():

    l=[3,5,6,71,2,3,4,5]
    print 'sort inv'
    s=CmpInverse(cmp)
    l.sort(s)
    print ' ',l

    table=[
        (1,2),
        (1,3),
        (1,0),
        (2,0),
        (0,3),
        (3,2)
    ]

    print 'sort l[0], inv l[1]'
    table.sort(CmpComposite(CmpColumn(0),CmpInverse(CmpColumn(1))))
    print ' ',table

    print 'sort inv l[1], l[0]'
    table.sort(CmpComposite(CmpInverse(CmpColumn(1)),CmpColumn(0)))
    print ' ',table

    class Spam:
        def __init__(self, spam, eggs):
            self.spam = spam
            self.eggs = eggs
        def __repr__(self):
            return 'Spam(s=%s,e=%s)' %(repr(self.spam),repr(self.eggs))
    a = [Spam(1, 4), Spam(9, 3), Spam(4,6),Spam(4,4)]

    print "sort spam:"
    a.sort(CmpAttr('spam'))
    print ' ',a

    print "sort inv spam:"
    a.sort(CmpInverse(CmpAttr('spam')))
    print ' ', a

    print "sort spam, eggs:"
    a.sort(CmpComposite(CmpAttr('spam'),CmpAttr('eggs')))
    print ' ',a

    print "sort eggs, spam:"
    a.sort(CmpComposite(CmpAttr('eggs'),CmpAttr('spam')))
    print ' ',a

    print "sort eggs, inv spam:"
    a.sort(CmpComposite(CmpAttr('eggs'),CmpInverse(CmpAttr('spam'))))
    print ' ',a

if __name__=='__main__':
    test()