sort.go uses SymMerge from 2004, but the authors came up with a better algorithm in 2008

756 views
Skip to first unread message

mi...@bonzaiapps.com

unread,
Mar 21, 2014, 10:26:53 AM3/21/14
to golan...@googlegroups.com
I just came across the sort.go file:
http://golang.org/src/pkg/sort/sort.go?m=text

It uses SymMerge for its stable in-place sorting, which was created in 2004 by Pok-Son Kim and Arne Kutzner, but those authors came up with a superior algorithm in 2008 called ratio-based merging. I made an implementation based on that paper here:

Nicolas Hillegeer

unread,
Mar 22, 2014, 6:25:33 AM3/22/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
It would indeed be nice to have this improvement incorporated in the stdlib. I see there's no Go implementation in your github repo though?

mi...@bonzaiapps.com

unread,
Mar 22, 2014, 8:30:26 AM3/22/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
Hi Nicolas,

That's true, there's no Go version right now. It actually hadn't crossed my mind to mention anything to people working on languages other than C++, but I was doing research on the history of in-place merge algorithms (I promised the authors of ratio-based merging that I'd write them a Wikipedia article) and the sort.go file was the first result for "merge_without_buffer" and "Dudzinski and Dydek". Once I saw it used SymMerge I realized I should probably let the Go guys know about the superior paper those same authors had published.

- Mike

Brad Fitzpatrick

unread,
Mar 22, 2014, 3:58:42 PM3/22/14
to mi...@bonzaiapps.com, golang-nuts
Please file a bug: http://golang.org/issue/new


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Volker Dobler

unread,
Mar 23, 2014, 7:04:25 PM3/23/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
Am Freitag, 21. März 2014 15:26:53 UTC+1 schrieb mi...@bonzaiapps.com:
Thanks for implementing it in C! 

I was aware of the paper (and the 2006 SOFSEM paper)
but both looked rather complicated in comparison to SymMerge
and SplitMerge and didn't find simple enough (for me) descriptions
of Hwang and Lin’s algorithm. So I went with SymMerge.

I'll have a look at your C code and may port it to Go if I
can spare some hours.

V.


mi...@bonzaiapps.com

unread,
Mar 23, 2014, 9:02:15 PM3/23/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
Haha, I had the same problem with the Hwang and Lin algorithm. The current version doesn't even use it, because I found it was faster (and far simpler) to just use a repeated binary search and insert. Basically once I implemented the algorithm from the paper and verified that it worked, I went back and wrote a new and simplified version that sorta does the same thing.

If you make a "port" that is just a literal translation of the C code for now, I can maintain it for you while I finish up the remaining bits and pieces – it's only "99%" finished right now. Then you could finish turning it into proper Go code from there. Or if you don't have enough free time soon I'll probably have it finished up soon anyway.

Gerard

unread,
Mar 24, 2014, 4:46:45 AM3/24/14
to golan...@googlegroups.com
A stupid question probably, but what about patents?

Volker Dobler

unread,
Mar 24, 2014, 4:52:58 AM3/24/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
Am Montag, 24. März 2014 02:02:15 UTC+1 schrieb mi...@bonzaiapps.com:
Haha, I had the same problem with the Hwang and Lin algorithm. The current version doesn't even use it, because I found it was faster (and far simpler) to just use a repeated binary search and insert. Basically once I implemented the algorithm from the paper and verified that it worked, I went back and wrote a new and simplified version that sorta does the same thing.
Does the simplified version still have the same complexity?
 
If you make a "port" that is just a literal translation of the C code for now, I can maintain it for you while I finish up the remaining bits and pieces – it's only "99%" finished right now. Then you could finish turning it into proper Go code from there. Or if you don't have enough free time soon I'll probably have it finished up soon anyway.
Go ahead. I can't find my remaining supply of "instant time: just add hot water (TM)".

V. 

mi...@bonzaiapps.com

unread,
Mar 24, 2014, 6:12:10 AM3/24/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
Asymptotic complexity, or code complexity? The code is a lot simpler (no "rotation-based variant of the Hwang and Lin algorithm", no BLOCKMERGE, no movement imitation buffer, no 10-step cleanup at the end...), and I'm pretty sure it has the same asymptotic complexity but with lower hidden constant factors. I can't say for sure, though – I just know it holds up well in the tests I've tried.

- Mike

mi...@bonzaiapps.com

unread,
Mar 24, 2014, 6:13:51 AM3/24/14
to golan...@googlegroups.com
I contacted the authors of the paper for permission to release it to the public domain before telling anyone about the project.

mi...@bonzaiapps.com

unread,
Apr 4, 2014, 12:41:09 AM4/4/14
to golan...@googlegroups.com, mi...@bonzaiapps.com
Just wanted to let you know that the sorting algorithm should be finished, and I have a full analysis of the asymptotic complexity up on Wikipedia:


I ended up not using Hwang and Lin anywhere, since the overhead of using it always exceeded any savings from the fewer comparisons it offered – at least in the places where it would have been used in this algorithm. It does however use a very similar algorithm when extracting and redistributing the buffers.



On Monday, March 24, 2014 4:52:58 AM UTC-4, Volker Dobler wrote:
Reply all
Reply to author
Forward
0 new messages