[proposal]sort.Array wrapper

624 views
Skip to first unread message

Ivan Krasin

unread,
Dec 1, 2011, 5:09:43 PM12/1/11
to golang-nuts
Hi go nuts!

I have recently discovered the following Python/Go comparison:

Python:

sort(a, key=lambda x: x[1] / x[2], reverse=True)

Go:

type Entry struct {
u float64
v float64
}

type EntrySlice []Entry

func (p EntrySlice) Len() int {
return len(p)
}

func (p EntrySlice) Less(i, j int) bool {
return p[i].u / p[i].v > p[j].u / p[j].v
}

func (p EntrySlice) Swap(i, j int) {
p[i], p[j] = p[j], p[i]
}

// a []Entry
sort.Sort(EntrySlice(a))

The proposal:

Introduce sort.Array helper function:

sort.Array(a, func(i,j int) bool { return a[i].u / a[i].v > a[j].u / a[j].v})

Currently, it's impossible to implement such a helper, because
reflect.Value has func Index(i int) Value to access an element in the
array, but there's no way to set the element of the array.

Thus, sort.Array needs reflect.Value.SetIndex(i int, ev Value) to be
implemented.

Two possible improvements:

1. put sort.Array into sort/util package, because it introduces
dependency on reflect package.
2. implement reflect.Value.SwapIndex(i,j int) which will speed up the
sort, because no intermediate values are created.

Comments?

Ivan Krasin

Russ Cox

unread,
Dec 1, 2011, 5:24:57 PM12/1/11
to Ivan Krasin, golang-nuts
A sort implementation that uses reflect but has a1-line API is going
to be significantly slowerthan one that requires 5 lines but does not
use reflect.A reflect-based sort does not have a place in the standard
library.
There will always be examples where one-liners
in one language turn into more code in other languages.
The response does not have to be to compete on
those terms.

However, I would at least point out that the comparison
you posted is flawed. You can't charge the struct definition
to the sorting (it would have existed anyway), and most
people would use 1-line method implementations:

Python:

sort(a, key=lambda x: x[1] / x[2], reverse=True)

Go:

type ByRatioReverse []Entry
func (b ByRatioReverse) Swap(i, j int) { b[i], b[j] = b[j], b[i] }
func (b ByRatioReverse) Len() int { return len(b) }
func (b ByRatioReverse) Less(i, j int) bool { return b[i].u/b[i].v >
b[j].u/b[j].v }

sort.Sort(ByRatioReverse(a))

That's really not a big deal in the grand scheme of
things. If all you do is sort things, and you never
use the same sort order or container type twice,
then yes you have a 5x code blowup. Maybe you
should use Python in that case. That's okay.

Russ

Reinhard

unread,
Dec 23, 2011, 5:43:16 PM12/23/11
to golang-nuts
I think you know that if many different "key functions" (in the sense
of
Python) are used, one can reduce the code size:

// fixed part start ------------
type Keyer interface {
Key() int
}

type Keylist []Keyer

func (lst Keylist) Len() int { return len(lst) }
func (lst Keylist) Swap(i, j int) { lst[j], lst[i] = lst[i], lst[j] }
func (lst Keylist) Less(i, j int) bool {
return lst[i].key() < lst[j].key()
}

// define sort() on Keylist

func (lst Keylist) sort() { sort.Sort(lst) }
// fixed part end -------------

To apply, one has to define a Key() method for list elements and then

list := make([]Keyer, len(...))

for ... {
list[i] = ...
}

Keylist(list).sort()

The list elements have to be converted back then.

This costs more memory (extra list of interfaces) and some time in pre
and postprocessing, but say if you iterate over many key functions,
this
saves code. A performance problem could be the method call in Less()
(Go
uses quicksort). I looked in the Python source and observed (as
expected) that they form a new list of pairs with key values and
pointers to list elements. So the number of calls of key() increases
linearly with the list length. For very long lists and many algorithms
(or
an expensive sort function), one should implement sorting this way, I
think. BTW, Python implements very sophisticated sorting algorithms.

Reinhard

Kevin Gillette

unread,
Jan 11, 2012, 8:08:06 PM1/11/12
to golang-nuts
Python's sort is a variation of mergesort that handles runs of already
sorted data efficiently (you can wikipedia timsort to read about it).
I'm guessing (hoping) go uses a stable sort?

At any rate, while I come from (and am still in) the python camp,
while I'm wearing my go hat, I'll agree with Russ, that a key func is
more a detriment than a benefit in go (in terms of performance and
readability). Reinhard really came up with one of the best/only ways
to deal with making a key-based sorter given the restrictions of
comparability, but it has the issue that, in the general case, items
can't necessarily be cleanly reduced to an int -- key would
essentially have to be a hashing function, and even hash map
implementations expect collisions [this could not tolerate while still
producing an accurate sort]). Therefore, I think the 3 liner Russ
showed is good enough for most purposes.

Kyle Lemons

unread,
Jan 11, 2012, 11:19:11 PM1/11/12
to Kevin Gillette, golang-nuts
I'm guessing (hoping) go uses a stable sort?

I believe sort.Sort uses quicksort, and is therefore unstable. 

Maxim Pimenov

unread,
Jan 13, 2012, 11:42:39 AM1/13/12
to Kyle Lemons, Kevin Gillette, golang-nuts
It should be documented, I think.
Also, there was a discussion about adding SortInReverse or something similar 
to sort package a couple of weeks ago. The implementation suggested then
was unstable regardless of stability of sort.Sort (it was not added, though).
Reply all
Reply to author
Forward
0 new messages