sort one list using the values from another list

Brian Blais

Feb 26, 2006, 11:04:21 AM

to pytho...@python.org

Hello,

I have two lists, one with strings (filenames, actually), and one with a real-number
rank, like:

A=['hello','there','this','that']
B=[3,4,2,5]

I'd like to sort list A using the values from B, so the result would be in this example,

A=['this','hello','there','that']

The sort method on lists does in-place sorting. Is there a way to do what I want here?

thanks,

Brian Blais

--
-----------------

bbl...@bryant.edu
http://web.bryant.edu/~bblais

Kent Johnson

Feb 26, 2006, 11:30:33 AM

Brian Blais wrote:
> Hello,
>
> I have two lists, one with strings (filenames, actually), and one with a
> real-number rank, like:
>
> A=['hello','there','this','that']
> B=[3,4,2,5]
>
> I'd like to sort list A using the values from B, so the result would be
> in this example,
>
> A=['this','hello','there','that']

Here are two ways:

>>> A=['hello','there','this','that']
>>> B=[3,4,2,5]

>>> zip(*sorted(zip(B,A)))[1]
('this', 'hello', 'there', 'that')

>>> [a for b,a in sorted(zip(B,A))]
['this', 'hello', 'there', 'that']

I prefer the second one: I think it is more readable, it should use less
memory (it doesn't have to build a sorted copy of B in the final step), and
it's even faster on my computer:

D:\Projects\CB>python -m timeit -s "A=['hello','there','this','that'];B=[3,4,2,5]" "zip(*sorted(zip(B,A)))[1]"
100000 loops, best of 3: 6.29 usec per loop

D:\Projects\CB>python -m timeit -s "A=['hello','there','this','that'];B=[3,4,2,5]" "[a for b,a in sorted(zip(B,A))]"
100000 loops, best of 3: 5.53 usec per loop

(I'm bored this morning :-)
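
The same comparison can also be scripted with the timeit module instead of the command line. This is only a minimal sketch, assuming Python 3 (where zip returns an iterator, so the first variant needs list() before it can be indexed). The function names via_unzip and via_comprehension are illustrative and not from the thread:

```python
import timeit

A = ['hello', 'there', 'this', 'that']
B = [3, 4, 2, 5]

def via_unzip(a, b):
    # Pair up, sort by rank, then unzip and keep the column that came from a.
    return list(zip(*sorted(zip(b, a))))[1]

def via_comprehension(a, b):
    # Pair up, sort by rank, then pull the second element out of each pair.
    return [x for _, x in sorted(zip(b, a))]

unzip_result = via_unzip(A, B)          # a tuple
comp_result = via_comprehension(A, B)   # a list

# Timings depend on the machine and the Python version;
# no particular numbers are implied here.
for fn in (via_unzip, via_comprehension):
    seconds = timeit.timeit(lambda: fn(A, B), number=10000)
    print(fn.__name__, seconds)
```

Both variants return the same ordering; only the container type (tuple versus list) differs.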

There's probably a clever way to do it using the key parameter to sort
but I can't think of it...

>
> The sort method on lists does in-place sorting. Is there a way to do
> what I want here?

The examples above do not sort in place. If you want A itself to be
modified, use slice assignment:
A[:] = [a for b,a in sorted(zip(B,A))]
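
To see what the slice assignment buys you, compare it with plain rebinding. A small sketch in Python 3 syntax; the names alias, C and other are illustrative:

```python
A = ['hello', 'there', 'this', 'that']
B = [3, 4, 2, 5]
alias = A  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list,
# so every reference to that list sees the new order.
A[:] = [a for b, a in sorted(zip(B, A))]
same_object = alias is A

# Plain assignment binds the name to a brand-new list instead
# and leaves the original object untouched.
C = ['hello', 'there', 'this', 'that']
other = C
C = [a for b, a in sorted(zip(B, C))]
rebound = other is C
```

After this runs, alias shows the sorted order, while other still holds the original order because C was rebound to a different list.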

Kent

Jeffrey Schwab

Feb 26, 2006, 11:31:19 AM

Brian Blais wrote:
> Hello,
>
> I have two lists, one with strings (filenames, actually), and one with a
> real-number rank, like:
>
> A=['hello','there','this','that']
> B=[3,4,2,5]
>
> I'd like to sort list A using the values from B, so the result would be
> in this example,
>
> A=['this','hello','there','that']
>
> The sort method on lists does in-place sorting. Is there a way to do
> what I want here?

If A has no duplicate elements, you could create a hash mapping A's
elements to their respective precedences, then provide a sort criterion
that accessed the hash. Alternatively, you could do something like this:

from operator import itemgetter
result = map(itemgetter(0), sorted(zip(A, B), key=itemgetter(1)))
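
The first suggestion, a mapping from A's elements to their precedences, might look like the sketch below. It assumes Python 3, and that A has no duplicates and its elements are hashable. The name rank is illustrative:

```python
A = ['hello', 'there', 'this', 'that']
B = [3, 4, 2, 5]

# Map each element of A to its rank in B (needs unique, hashable elements).
rank = dict(zip(A, B))

# Use the mapping as the sort criterion; sorted() returns a new list.
result = sorted(A, key=rank.__getitem__)

# list.sort() does the same in place. The key function reads only the
# dict, never the list that is being sorted.
A.sort(key=rank.__getitem__)
```

If A could contain repeated strings with different ranks, the dict would keep only the last rank for each string, which is why the no-duplicates condition matters.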

Alex Martelli

Feb 26, 2006, 11:33:43 AM

Brian Blais <bbl...@bryant.edu> wrote:

> Hello,
>
> I have two lists, one with strings (filenames, actually), and one with a
> real-number rank, like:
>
> A=['hello','there','this','that']
> B=[3,4,2,5]
>
> I'd like to sort list A using the values from B, so the result would be in
> this example,
>
> A=['this','hello','there','that']
>
> The sort method on lists does in-place sorting. Is there a way to do what
> I want here?

Sure, many ways, mostly ones based on the "Decorate-Sort-Undecorate"
idiom and its incarnation as the key= optional argument to function
sorted and list method sort. I believe that a more explicit DSU is
ideal (clearer, though maybe a tad slower) in your case:

_aux = zip(B, A)
_aux.sort()
A[:] = [a for b, a in _aux]

Twisting things to use the speedy key= approach seems more "contorted"
and less general in this case, and it also needs an auxiliary structure
(to avoid slowing things down by repeated .index calls). However, things
may be different if you need to consider the possibility that B has some
duplicate items, and care deeply that such a case must NOT result in
comparisons of items of A. The above approach would then have to be
changed, e.g., to:

_aux = zip(B, enumerate(A))
_aux.sort()
A[:] = [a for (b, (i, a)) in _aux]

where I've also added a pair of redundant parentheses to make the
nesting structure of items of _aux more obvious. Of course, each of
these has a variant using sorted instead of sort, and for those you
could use izip from itertools rather than built-in zip, and do
everything within one single statement, etc, etc.
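
To make the duplicate-rank case concrete, here is a sketch under stated assumptions: Python 3, and a hypothetical class NoOrder whose instances define no ordering. With plain (b, a) pairs, a tie in B falls through to comparing items of A and raises TypeError; with the enumerate index in between, ties are settled on the index first.

```python
class NoOrder:
    """Hypothetical item type that defines no ordering."""
    def __init__(self, name):
        self.name = name

A = [NoOrder('x'), NoOrder('y'), NoOrder('z')]
B = [2, 1, 2]  # the two 2s tie

# Plain decorate-sort-undecorate: the tie forces a comparison
# of two NoOrder objects, which fails.
try:
    sorted(zip(B, A))
    plain_dsu_failed = False
except TypeError:
    plain_dsu_failed = True

# Decorating with (rank, (index, item)) settles ties on the index,
# so items of A are never compared and tied items keep their
# original relative order.
aux = sorted(zip(B, enumerate(A)))
A[:] = [a for (b, (i, a)) in aux]
names = [item.name for item in A]
```

The index also makes the result deterministic for ties: 'x' (index 0) comes before 'z' (index 2) because both have rank 2.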

Alex

Steven Bethard

Feb 26, 2006, 12:00:37 PM

Brian Blais wrote:
> Hello,
>
> I have two lists, one with strings (filenames, actually), and one with a
> real-number rank, like:
>
> A=['hello','there','this','that']
> B=[3,4,2,5]
>
> I'd like to sort list A using the values from B, so the result would be
> in this example,
>
> A=['this','hello','there','that']

Here's a solution that makes use of the key= argument to sorted():

>>> A = ['hello','there','this','that']
>>> B = [3,4,2,5]
>>> indices = range(len(A))
>>> indices.sort(key=B.__getitem__)
>>> [A[i] for i in indices]
['this', 'hello', 'there', 'that']

Basically, it sorts the indices to A -- [0, 1, 2, 3] -- in the order
given by B, and then selects the items from A in the appropriate order.

Duncan Booth

Feb 26, 2006, 12:27:04 PM

Steven Bethard wrote:

> Here's a solution that makes use of the key= argument to sorted():
>
> >>> A = ['hello','there','this','that']
> >>> B = [3,4,2,5]
> >>> indices = range(len(A))
> >>> indices.sort(key=B.__getitem__)
> >>> [A[i] for i in indices]
> ['this', 'hello', 'there', 'that']
>
> Basically, it sorts the indices to A -- [0, 1, 2, 3] -- in the order
> given by B, and then selects the items from A in the appropriate order.
>
>

That's impressive. I'm sure like a lot of other people I looked at the
question and thought it ought to be possible to use key to get this result
without a load of zipping and unzipping but just couldn't quite see it.

If you combine your technique with 'sorted', you get the one line version:

>>> [A[i] for i in sorted(range(len(A)), key=B.__getitem__)]
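
Wrapped in a small function, the one-liner also runs unchanged on Python 3, where range() is lazy and has no sort method. This is a sketch; the name sort_by and the tie example are illustrative additions:

```python
def sort_by(items, ranks):
    """Return a new list of items ordered by ascending rank."""
    # Sort the positions 0..n-1 by the rank found at each position...
    indices = sorted(range(len(items)), key=ranks.__getitem__)
    # ...then pick the items in that order. The items themselves are
    # never compared, and sorted() is stable, so tied ranks keep their
    # original relative order.
    return [items[i] for i in indices]

A = ['hello', 'there', 'this', 'that']
B = [3, 4, 2, 5]
result = sort_by(A, B)

tied = sort_by(['a', 'b', 'c'], [1, 0, 1])
```

Here result is ['this', 'hello', 'there', 'that'], and tied is ['b', 'a', 'c'] because 'a' and 'c' share a rank and stay in their input order.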

bearoph...@lycos.com

Feb 26, 2006, 2:10:38 PM

Steven Bethard, your solution looks very intelligent. Here is a small
speed test, because sorting one list according to another is quite a
common operation.
(Not all solutions are really the same, as Alex has shown.)

from itertools import izip
from operator import itemgetter
from random import choice, randint, shuffle
from string import lowercase
from time import clock

def timeit(function, *arg1, **arg2):
    """timeit(function, *arg1, **arg2): given a function and its
    parameters, calls it and computes its running time, in seconds.
    Its result is discarded."""
    t1 = clock()
    function(*arg1, **arg2)
    t2 = clock()
    return round(t2-t1, 2)

# Plain decorate-sort-undecorate with sorted()
def psort1(s1, s2):
    s1[:] = [a for b,a in sorted(zip(s2, s1))]

# DSU with an index, so items of s1 are never compared
def psort2(s1, s2):
    aux = zip(s2, enumerate(s1))
    aux.sort()
    s1[:] = [a for b, (i, a) in aux]

# Sort the indices with key=, then pick the items
def psort3(s1, s2):
    _indices = range(len(s1))
    _indices.sort(key=s2.__getitem__)
    s1[:] = [s1[i] for i in _indices]

# Same as psort3, but with map instead of a list comprehension
def psort4(s1, s2):
    _indices = range(len(s1))
    _indices.sort(key=s2.__getitem__)
    s1[:] = map(s1.__getitem__, _indices)

def psort5(s1, s2):
    s1[:] = zip(*sorted(zip(s2, s1)))[1]

def psort6(s1, s2):
    s1[:] = map(itemgetter(0), sorted(zip(s1, s2), key=itemgetter(1)))

# psort7 to psort9 repeat psort1, psort5 and psort6 with izip
def psort7(s1, s2):
    s1[:] = [a for b,a in sorted(izip(s2, s1))]

def psort8(s1, s2):
    s1[:] = zip(*sorted(izip(s2, s1)))[1]

def psort9(s1, s2):
    s1[:] = map(itemgetter(0), sorted(izip(s1, s2), key=itemgetter(1)))

n = 100000
s1 = ["".join(choice(lowercase) for i in xrange(randint(2,8)))
      for k in xrange(n)]
s2 = range(n)
shuffle(s2)

# All nine variants, in the order of the timings reported below
sorts = [psort1, psort2, psort3, psort4, psort5,
         psort6, psort7, psort8, psort9]

for psort in sorts:
    s1c = list(s1)
    print timeit(psort, s1c, s2), "s"

Timings on my PC, Python 2.4, PIII 500 MHz:
psort1: 2.87 s
psort2: 3.82 s
psort3: 1.6 s
psort4: 1.56 s
psort5: 4.35 s
psort6: 2.49 s
psort7: 2.75 s
psort8: 4.29 s
psort9: 2.35 s

psort4 is my variant of your solution, and it seems to be the fastest one.
Note: one-liners are bad, and especially bad in Python, which is designed
to be a readable language, so let's avoid them as much as possible. I
have used one of them to generate s1, but I won't use them in
production code or in my libraries, etc.
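
For anyone rerunning this on a current interpreter: time.clock was removed in Python 3.8, izip and xrange are gone, and string.lowercase became string.ascii_lowercase. Below is a reduced sketch of the harness under those assumptions. It times only psort1 and psort4, uses a smaller n, and the helper name timed is an illustrative stand-in for the timeit function above:

```python
from random import choice, randint, shuffle
from string import ascii_lowercase
from time import perf_counter

def timed(function, *args):
    """Call function(*args) and return the elapsed wall-clock seconds."""
    t1 = perf_counter()
    function(*args)
    return perf_counter() - t1

def psort1(s1, s2):
    s1[:] = [a for b, a in sorted(zip(s2, s1))]

def psort4(s1, s2):
    # range() has no sort method in Python 3, so use sorted() instead.
    indices = sorted(range(len(s1)), key=s2.__getitem__)
    s1[:] = list(map(s1.__getitem__, indices))

n = 10000
s1 = ["".join(choice(ascii_lowercase) for _ in range(randint(2, 8)))
      for _ in range(n)]
s2 = list(range(n))
shuffle(s2)

results = {}
for psort in (psort1, psort4):
    s1c = list(s1)  # each function sorts its own copy
    print(psort.__name__, round(timed(psort, s1c, s2), 4), "s")
    results[psort.__name__] = s1c
```

Because s2 is a permutation with no repeated ranks, both functions must produce the same ordering, which makes a handy sanity check before trusting any timing.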

Bye,
bearophile

Ron Adam

Feb 26, 2006, 4:33:32 PM

bearoph...@lycos.com wrote:
> Your solution Steven Bethard looks very intelligent, here is a small
> speed test, because sorting a list according another one is a quite
> common operation.
> (Not all solutions are really the same, as Alex has shown).

Try this one.

def psort10(s1, s2):
    d = dict(zip(s2,s1))
    s1[:] = (d[n] for n in sorted(d.keys()))

It's faster on my system because d.keys() happens to come back already
sorted, so sorted() has very little work to do. But that may not be the
case on other versions of Python.
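
One caveat worth checking before using the dict version: it keys the dict on the ranks, so it assumes s2 has no duplicates. A repeated rank overwrites the earlier entry and the result comes out shorter. A short sketch in Python 3 showing both behaviours; the data is made up for illustration:

```python
def psort10(s1, s2):
    d = dict(zip(s2, s1))
    # sorted() does the ordering; the dict's own key order is not relied on.
    s1[:] = [d[n] for n in sorted(d)]

# Unique ranks: same answer as the other approaches.
A = ['hello', 'there', 'this', 'that']
psort10(A, [3, 4, 2, 5])

# Duplicate ranks: 'x' and 'z' share rank 1, so 'x' is overwritten and lost.
C = ['x', 'y', 'z']
psort10(C, [1, 0, 1])
```

After this runs, A is fully sorted, but C has shrunk to two items because only the last value stored under rank 1 survives.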

Ron