Sorting an array of string with ASM

König

unread,

Jun 7, 2002, 4:05:36 AM6/7/02

to

Hi!

I have a huge array of string. Lets just say I have 1 million strings.
I want to sort the alphabetically (obviously), but the quickest
sorting algorithm I’ve stumbled across lately is QuickSort, (the
one in the thread example that comes with Delphi). But it’s not
that “quick” when it has to deal with over 10-20 thousand
elements.

Therefore I want to have a procedure written in asm so I really can
gain some speed. But here is the catch………..I suck at
asm! I have little, well actually almost NO EXPERIENCE with asm.
That’s why I’m asking for help!

...(I need that damn array sorted FAST)

König

unread,

Jun 7, 2002, 4:08:11 AM6/7/02

to

Hi

I have a huge array of string. Lets just say I have 1 million strings.
I want to sort the alphabetically (obviously), but the quickest

sorting algorithm I've stumbled across lately is QuickSort, (the one
in the thread example that comes with Delphi). But it's not that
"quick" when it has to deal with over 10-20 thousand elements.

Therefore I want to have a procedure written in asm so I really can

gain some speed. But here is the catch....I suck at asm! I have
little, well actually almost NO EXPERIENCE with asm. That's why I'm

Denis Jeanroy

unread,

Jun 7, 2002, 9:39:37 AM6/7/02

to

I'm not sure that a ASM function will improve it a lot. I think that most
delphi functions are already optimized, some of them using ASM (you will see
that looking into delphi source files).

have you considered using third party libraries ?
Is it possible to handle the sorting when elements are added to the list,
and not afterwards on the whole array ?
if the list is also built by your application, it may be possible to process
strings one by one when they have to be inserted. Time is generally less
critical at this stage. At the end you get a sorted list (a list and an
index).
if you get the list from an external file, this will not help.

Denis

fat G

unread,

Jun 7, 2002, 9:09:05 PM6/7/02

to

QuickSort shouldn't have no problem with 10-20 thousand elements.
You sure you're using QuickSort, not some selection or bubble sort? :)
It's the fastest there is, so you should maybe speed up your comparation
function.

"König" <el...@mailme.dk> wrote in message
news:a2937919.02060...@posting.google.com...

Jan Philips

unread,

Jun 7, 2002, 9:50:35 PM6/7/02

to

On 7 Jun 2002 01:05:36 -0700, el...@mailme.dk (König) wrote:

>I have a huge array of string. Lets just say I have 1 million strings.
>I want to sort the alphabetically (obviously), but the quickest
>sorting algorithm I’ve stumbled across lately is QuickSort, (the
>one in the thread example that comes with Delphi). But it’s not
>that “quick” when it has to deal with over 10-20 thousand
>elements.

>...(I need that damn array sorted FAST)

First, how fast does it need to be? In my test, Quicksort sorts
20,000 random 20-character strings in less than 0.05 seconds (Delphi
6, O+,R-,Q-, 1.3 GHz Athlon). It sorts 1,000,000 of them in 5.6
seconds.

Secondly, I doubt an asm version will be much faster.

Ethar Alali

unread,

Jun 7, 2002, 11:12:53 PM6/7/02

to

The quicksort functions and sort methods in TList objects are indeed the
fastest there are in Delphi. It is unlikely that he is using anything else
because the sort methods all indirectly call quicksort unless they have
written their own sort routine.

In terms of it being the fastest there is, lets see how many other people in
this newsgroup know this, the complexity of a quicksort function is
something in the region of O(n log n ) this is exactly the same as merge
sort (the different being the position at which the sorting is done,
quicksort on the way down, mergesort on the way up)

I came accross an algorithm for sorting numbers (given enough storage, which
for a million is more than plentyful) that can sort descrete numbers in
O( n ) time complexity and hence is significantly faster than quick-sort,
merge-sort and other O(n log n) based methods. But as you have already
gathered the fact that it was designed to sort integers it a major
limitation.

I did adapt this algorithm successfully to sort strings using a hash code
and it worked well. That was some time ago, and it was used to sort fixed
length fields (unique offline identifiers). Variable length items are
another matter. I am sure it is possible but I have not spent enough time
thinking about it to work it out.

Anyway, enough of my rabbiting. Assembler should not be needed in this case
20,000 records is nothing.

Ethar

"fat G" <fa...@NO-SPAMemail.si> wrote in message
news:tScM8.346$w5.1...@news.siol.net...

Graham Parsons

unread,

Jun 8, 2002, 6:10:51 PM6/8/02

to

Just an observation. I find that sorting when writing the file (i.e.
inputting the data) is better than trying to sort it quickly just before you
need it. As our friends here have said, to sort 20,000 in half a second is
no problem, especially if that half a second is incorporated into the
initial input time - you'd barely notice it - and the results are there when
you need them.
Just my 10c. worth...
Graham.

"König" <el...@mailme.dk> wrote in message
news:a2937919.02060...@posting.google.com...

fat G

unread,

Jun 9, 2002, 10:47:16 AM6/9/02

to

I don't think merge sort would prove faster in this case. He clearly needs
an internal sort.
One way I can think of, is using your optimized fixed-length string sort.
With a few adaptations, it would probably be able to sort all strings by
first, let's say, 4 characters,
thus making it a fixed-length data sort. Then performing it again on smaller
subarrays, which have
the first 4 characters equal. Repeating the last step until each string is
on it's own (or strings are completely equal).
But in the worst case, I don't think it would prove much faster..

"Ethar Alali" <som...@dsl.pipex.com> wrote in message
news:3d0176a2$0$238$cc9e...@news.dial.pipex.com...

König

unread,

Jun 9, 2002, 1:34:33 PM6/9/02

to

Hi again!

To follow up on the discussion, I'll give you a little more
information: (sorry, should have done that to begin with)

It's NOT possible to sort when elements are added to the list. I have
a complete list and then I want to sort it.

I mentioned that it was an array of string, but really it's an array
of "ElementInfo"

...ElementInfo??? What??!

Well....

type
ElementInfo = record
s1,s2,s3: string;
end;

I want to sort the list alphabetically according to s1.

I've tried the QuickSort that is in the threads example, but it sorts
an array [0..25000] of ElementInfo in about 5-6 seconds. Why is that?

This is how QuickSort looks:

procedure QuickSort(var A: array of ElementInfo; iLo, iHi: Integer);
var
Lo, Hi, Mid: Integer;
T: ElementInfo;
begin
Lo := iLo;
Hi := iHi;
Mid := A[(Lo + Hi) div 2].s1;
repeat
while A[Lo].s1 < Mid do Inc(Lo);
while A[Hi].s1 > Mid do Dec(Hi);
if Lo <= Hi then
begin
T := A[Lo];
A[Lo] := A[Hi];
A[Hi] := T;
Inc(Lo);
Dec(Hi);
end;
until Lo > Hi;
if Hi > iLo then QuickSort(A, iLo, Hi);
if Lo < iHi then QuickSort(A, Lo, iHi);
end;

P.S.
Sorry for last time posting the same message twice! I don't know what
happened!!!

Maarten Wiltink

unread,

Jun 9, 2002, 2:28:01 PM6/9/02

to

König wrote in message ...

>Hi again!
>
>To follow up on the discussion, I'll give you a little more
>information: (sorry, should have done that to begin with)
>
>It's NOT possible to sort when elements are added to the list. I have
>a complete list and then I want to sort it.
>
>I mentioned that it was an array of string, but really it's an array
>of "ElementInfo"
>
>...ElementInfo??? What??!
>
>Well....
>
>type
> ElementInfo = record
> s1,s2,s3: string;
>end;
>
>I want to sort the list alphabetically according to s1.

So for purposes of sorting, it's still an array of string.

>I've tried the QuickSort that is in the threads example, but it sorts
>an array [0..25000] of ElementInfo in about 5-6 seconds. Why is that?

Probably because when switching records, it's switching both
the whole records. Try sorting a list of pointers to every record,
so it switches only the pointers.

Groetjes,
Maarten Wiltink

Jan Philips

unread,

Jun 9, 2002, 3:49:01 PM6/9/02

to

On Sun, 9 Jun 2002 20:28:01 +0200, "Maarten Wiltink"
<maa...@kittensandcats.net> wrote:

>Probably because when switching records, it's switching both
>the whole records. Try sorting a list of pointers to every record,
>so it switches only the pointers.

Even when switching records, my quicksort is nearly 100 times faster.

Jan Philips

unread,

Jun 9, 2002, 4:00:52 PM6/9/02

to

On 9 Jun 2002 10:34:33 -0700, el...@mailme.dk (König) wrote:

>type
> ElementInfo = record
> s1,s2,s3: string;
>end;

How long are the strings? When I run it with random 20 char strings
it takes a small fraction of a second.

Jan Philips

unread,

Jun 9, 2002, 4:03:59 PM6/9/02

to

On 9 Jun 2002 10:34:33 -0700, el...@mailme.dk (König) wrote:

>procedure QuickSort(var A: array of ElementInfo; iLo, iHi: Integer);
>var
> Lo, Hi, Mid: Integer;
> T: ElementInfo;

Mid should be a string.

Jan Philips

unread,

Jun 9, 2002, 4:20:38 PM6/9/02

to

On 9 Jun 2002 10:34:33 -0700, el...@mailme.dk (König) wrote:

>I've tried the QuickSort that is in the threads example, but it sorts
>an array [0..25000] of ElementInfo in about 5-6 seconds. Why is that?

Are yuo sure that this is the time to sort only, not reading from disk
or anything else? What speed CPU are you using? How long are the
strings?

Michael Davis

unread,

Jun 10, 2002, 3:52:18 PM6/10/02

to

el...@mailme.dk (=?ISO-8859-1?Q?K=F6nig?=) wrote in message news:<a2937919.02060...@posting.google.com>...

Is your array by any chance already partly sorted? QuickSort is very
quick if the list to be sorted is initially in a random order, but it
has one serious drawback: if the list to be sorted is already almost
sorted into order, then QuickSort becomes very slow. In fact it is
slowest of all if you use it on a list which is in fact already
completely in order. This is very odd behaviour: I know of no other
sorting algorithm that behaves like this, and some are quicker on an
almost sorted list than on a random one, which seems much more in
keeping with common-sense: there is less work to do!

If this is what is happening in your case the best solution is to use
a different sorting algorithm. ShellSort works at a speed of the same
order of magnitude as QuickSort on a random list, and does not slow
down on a partly sorted list. Perhaps a web search for 'ShellSort'
will prvide you with something usable. Failing that, you could first
use a random number generator to 'unsort' your array, and then sort it
with QuickSort. This is a weird idea, but it actually can sometimes
result in a speed increase.

Michael Davis.

Ethar Alali

unread,

Jun 10, 2002, 4:20:59 PM6/10/02

to

void rant_and_rave( void )
{
The reason why Quicksort is slow when sorting pre-sorted lists is that it's
AVERAGE is bound from above by O( n log n ) it has to do ALL comparisons in
order to realise that it has nothign to do. The remaining time (when the
list is not completely sorted) it is sorting in a time period of less than
or equal to O( n log n ), as an average case. Note that the assymptotic
concept of O(...) is the equivalent of the WORST case scenario average.

Quite incorrectly, ShellSort is nowhere near the speed of QuickSort. The
shell sort has a time complexity set of O( n sqrt( n ) ) which means that
for 20,000 records, we are looking at something in the order of 2828427
comparisons (and hence multiply with the time for one comparison) whilst
quicksort sorts in the 285754 comparisons, an order of magnitude of almost
10 times faster in worst case. This of course is true only for large sorts,
for small sorts it is unlikely to show up as a significant performance
difference in application programs or for small amounts in the practical
scheme of things. If you need to sort a large array you should not consider
ShellSort etc.

There is good reason to use (or learn to use, which unfortunately most
programmers dont know) asymptotic notation and Algorithmic methods (no I
have not mis-spelled that, I do not mean algorithms, I mean the analysis of
the algorithm for performance predictability. If you thought I was stupid,
who's laughing now!). Mathematicians can teach software developers a lot and
they should listen!! After all we should be pushing for engineering
principles here.

... and breathe, excuse me while I go count to ten.

}// End function rant_and_rave

Ethar

P.S. Something I forgot to ask, why are you trying to write your own
quicksort when there exists one in the VCL?

"Michael Davis" <miki...@yahoo.co.uk> wrote in message
news:62f60a1.02061...@posting.google.com...

Jan Philips

unread,

Jun 10, 2002, 4:58:12 PM6/10/02

to

On Mon, 10 Jun 2002 21:20:59 +0100, "Ethar Alali"
<som...@dsl.pipex.com> wrote:

>Quite incorrectly, ShellSort is nowhere near the speed of QuickSort. The
>shell sort has a time complexity set of O( n sqrt( n ) ) which means that
>for 20,000 records, we are looking at something in the order of 2828427
>comparisons (and hence multiply with the time for one comparison) whilst
>quicksort sorts in the 285754 comparisons, an order of magnitude of almost
>10 times faster in worst case.

In practice it is a lot closer. On my tests with sorting 20,000
integers and with 20-char strings, Quicksort is 2.6 to 2.7 times
faster than Shellsort. We're talking about milliseconds too.

Ethar Alali

unread,

Jun 10, 2002, 5:49:02 PM6/10/02

to

That is fine, doesn't surprise me, the thing I highlighted in my previous
e-mail is that the
notation referres to the average speed of the algorithm being bound from
above by O(n log n) an from below by something else. As an example, for
bubble sort, if the list is already completely sorted, it is bound from
below as O( n ) - nothing changes, yet for quicksort, you aealways comparing
to the end of the list, thus you are limited from below by n log n, THAT is
why it seems slower for already sorted lists, you can start to see my point
of view I am sure. The 'Big-O' denotes the WORST case scenario.

The second thing you must bear in mind, is that the O() functions do not
care about the scalar coefficient of any polynomial value. They care about
the order because it is a set of possible values of result time. Hence 2n
log n, 3n log n and 4n log n are all in O( n log n ). The scalar factors can
appear to speed it up by a certain amount but it is nothing compared to the
increase due to an order of power. e.g. 30 * 20000 <<< 20000 * 20000.

>We're talking about milliseconds too

This is right, but I was talking about microseconds. One thing you have to
look at is that due to the fact that the ShellSort algo. is a O( n
sqrt( n ) ) gradient curves upwards whilst O( n log n ) curves downwards,
thus there is a point where O( n sqrt( n ) ) member 'a' and O( n
sqrt( n ) ) variable 'b' are equal. draw the graphs of these two functions
superimposed on each other and you will see what I mean. Test it out with
smaller values and you may find a situation where given a certain value of n
and an implimentation of shell sort, the ShellSort algo actually outperforms
quicksort, for a given constant c. As an example, if n = 2 and c = 4 then:

cn log2 n is member of O( n log2 n ) = 8 log2 4 = 16

n sqrt( n ) = 2 sqrt( 4 ) = 4

This is obviously a trivial example, but it makes the point. The scalar
factors depend on the implimentation of the algorithm, processor power,
compilers etc. and is only one member of the set of O( n log n ) for
example. SO your sort of 20000 producing 2.6 - 2.7 times is NOT outside the
bounds of the theoretical, it is a member of it, just like a lot of things
in the practical world that practitioners assume theorists are wasting their
time on.

Thanx for you comments. Much appreciated.

Ethar

"Jan Philips" <jud.mc...@mindspring.com> wrote in message
news:dh4aguofdbn2f9u95...@4ax.com...

Michael Davis

unread,

Jun 11, 2002, 4:28:55 AM6/11/02

to

"Ethar Alali" <som...@dsl.pipex.com> wrote in message news:<3d050a9d$0$233$cc9e...@news.dial.pipex.com>...
> void rant_and_rave( void )>

> Quite incorrectly, ShellSort is nowhere near the speed of QuickSort. The
> shell sort has a time complexity set of O( n sqrt( n ))

I was under the impression that ShellSort was of order O(n log n), but
I am willing to accept that I was wrong. It is actually some years
since I have had any reason to go into these details. However,
someone else has pointed out in a message on another thread that
MergeSort is of order O(n log n), so that might be a solution to the
original problem.

> > P.S. Something I forgot to ask, why are you trying to write your own
> quicksort when there exists one in the VCL?
>

The original message says "I've stumbled across lately is QuickSort,
(the
one in the thread example that comes with Delphi)." That doesn't
sound to me like someone trying to write their own QuickSort!

Michael Davis

Ethar Alali

unread,

Jun 11, 2002, 4:53:57 AM6/11/02

to

Probably, sorry about last night, had a bad day =8¬)

No not trying to wrte their own quicksrt, but it is not the same as the VCL
quicksort (at least not exactly)

Anyway, public apology. Sorry. I make a habit of that. Too much time in
front of a computer, thats my problem. This is not a reflection on yourself
or other programmers. It can get a little frustrating at time, but I have to
realise that I have been there too. But I also learned that just coding will
get nowhere. Have to plan what you want to do. Design and engineering are
ways meeting constraints (amongst other things of course)

The whole asymtotic notation for Shellsort can be derived form the algorithm
itself. I use algorithmics on a day to day basis to meet the constraints
specified in the requirements given several dimensionos such as time to
impliment, space requirements, performance etc. and linear programming
becomes a mighty useful tool.

Again sorry for the rant and rave function.

Ethar

"Michael Davis" <miki...@yahoo.co.uk> wrote in message
news:62f60a1.02061...@posting.google.com...

Jan Philips

unread,

Jun 11, 2002, 8:10:38 PM6/11/02

to

On Mon, 10 Jun 2002 22:49:02 +0100, "Ethar Alali"
<som...@dsl.pipex.com> wrote:

>That is fine, doesn't surprise me, the thing I highlighted in my previous
>e-mail is that the
>notation referres to the average speed of the algorithm being bound from
>above by O(n log n) an from below by something else.

Quicksort is not bounded above by O(n log n), it is O(n^2) in the
worst case. The average is O(n log n) though.

>This is right, but I was talking about microseconds. One thing you have to
>look at is that due to the fact that the ShellSort algo. is a O( n
>sqrt( n ) )

Shell sort is (much) better than that. Depending on the
implementation, it can be n * (log n)^2, n^(5/4), n^(7/6) for
practical versions, or asymptotically n^(1+epsilon) for any epsilon >
0.

>gradient curves upwards whilst O( n log n ) curves downwards,

I'm not sure what you mean here. Do you mean like the slope or
derrivative (gradient)?

>This is obviously a trivial example, but it makes the point. The scalar
>factors depend on the implimentation of the algorithm, processor power,
>compilers etc. and is only one member of the set of O( n log n ) for
>example. SO your sort of 20000 producing 2.6 - 2.7 times is NOT outside the
>bounds of the theoretical,

For 20,000 random data items, Shell sort actually performs fewer
comparisons (but more data moves) than Quicksort.

>Thanx for you comments. Much appreciated.

Nice talking to you.

Jan Philips

unread,

Jun 11, 2002, 8:16:34 PM6/11/02

to

On 11 Jun 2002 01:28:55 -0700, miki...@yahoo.co.uk (Michael Davis)
wrote:

>I was under the impression that ShellSort was of order O(n log n), but
>I am willing to accept that I was wrong.

Not quite, see my other message.

fat G

unread,

Jun 11, 2002, 9:57:14 PM6/11/02

to

If the array is nearly sorted, then why not use heapsort (or bubblesort for
that matter :)?
Heapsort is the only one that can compare to quicksort in speed. Both are
definetly faster than others, in a very big majority of cases.
We're told on our faculty never never never to use Shellsort (we have to
know all about it, though). They say it's only important in a historical
sense - the first sorting algorithm to break O( sqr(n) ). I don't think
they're talking nonsense, when they say 'don't use it'. Have no arguments of
my own, still learning this area.

"Michael Davis" <miki...@yahoo.co.uk> wrote in message
news:62f60a1.02061...@posting.google.com...

Ethar Alali

unread,

Jun 12, 2002, 9:34:37 AM6/12/02

to

Just to clarify a few things.

> >That is fine, doesn't surprise me, the thing I highlighted in my previous
> >e-mail is that the
> >notation referres to the average speed of the algorithm being bound from
> >above by O(n log n) an from below by something else.
>
> Quicksort is not bounded above by O(n log n), it is O(n^2) in the
> worst case. The average is O(n log n) though.
>

Reading the last sentence of my part you see that we agree on this =8¬) The
average case of quicksort IS bounded from above by O( n log n ), indeed the
top is bounded by n^2. We are agreeing on this one.

> >This is right, but I was talking about microseconds. One thing you have
to
> >look at is that due to the fact that the ShellSort algo. is a O( n
> >sqrt( n ) )
>
> Shell sort is (much) better than that. Depending on the
> implementation, it can be n * (log n)^2, n^(5/4), n^(7/6) for
> practical versions, or asymptotically n^(1+epsilon) for any epsilon >
> 0.
>

True, the implimentation of shell sort is dependant on the sequences used to
compare elements in the array/list, and indeed if the result is
n^(1+epsilon) it is still slower than quicksort.

> >gradient curves upwards whilst O( n log n ) curves downwards,
>
> I'm not sure what you mean here. Do you mean like the slope or
> derrivative (gradient)?
>

Yep

>
> >This is obviously a trivial example, but it makes the point. The scalar
> >factors depend on the implimentation of the algorithm, processor power,
> >compilers etc. and is only one member of the set of O( n log n ) for
> >example. SO your sort of 20000 producing 2.6 - 2.7 times is NOT outside
the
> >bounds of the theoretical,
>
> For 20,000 random data items, Shell sort actually performs fewer
> comparisons (but more data moves) than Quicksort.
>

You are right, I miscommunicated here, when I talked about comparisons I
meant operations on data whch encompas a comparison or a move. My bad.
Should have specified that in more detail.

Ethar

Ethar Alali

unread,

Jun 12, 2002, 9:39:24 AM6/12/02

to

Well the problem from my perspective is that the implimentaiton is very much
a big factor in the algorithm for Shellsort, the complexity analysis will
show various values depending on it, (hence resulting in more operations on
the data). I would not go so far as to say never use it, pick your tool is
what I would say.

I dont believe you should assume that the data is almost always already
sorted, dangerous assumption that. But indeed there is a point to what you
have said. If it is nearly sorted (or fully sorted) then a bubblesort algo
is bounded from BELOW (Big-Ohmega) by O( n ) believe it or not (given the
list may be already sorted) whilst quicksort would still perform a O( n log
n ) sort.

Ethar

"fat G" <fa...@NO-SPAMemail.si> wrote in message

news:HXxN8.1527$w5.2...@news.siol.net...

Beer Hunter

unread,

Jun 12, 2002, 7:40:18 PM6/12/02

to

miki...@yahoo.co.uk (Michael Davis) wrote in message
news:<62f60a1.02061...@posting.google.com>...

> Is your array by any chance already partly sorted? QuickSort is very
> quick if the list to be sorted is initially in a random order, but it
> has one serious drawback: if the list to be sorted is already almost
> sorted into order, then QuickSort becomes very slow. In fact it is
> slowest of all if you use it on a list which is in fact already
> completely in order.

This was only the case for the initial implementation of the
quicksort, where the first element was always chosen as the pivot.
Shortly after that problem was discovered, people came up with all
sorts of solutions. A popular one is to take the median of the first,
middle, and last element of the current section, then use that as the
pivot.

> This is very odd behaviour: I know of no other

> sorting algorithm that behaves like this...

The initial implementation of the treesort behaved that way too,
although you can avoid the problem by using a red-black or AVL tree.

Jan Philips

unread,

Jun 12, 2002, 8:32:29 PM6/12/02

to

On Wed, 12 Jun 2002 03:57:14 +0200, "fat G" <fa...@NO-SPAMemail.si>
wrote:

>We're told on our faculty never never never to use Shellsort

That's nonsense. Perhaps they meant never use Bubble sort. I use
Shell sort a lot.

> I don't think
>they're talking nonsense, when they say 'don't use it'.

I do think it is nonense to not use Shell.

Jan Philips

unread,

Jun 12, 2002, 8:34:17 PM6/12/02

to

On Wed, 12 Jun 2002 14:34:37 +0100, "Ethar Alali"
<som...@dsl.pipex.com> wrote:

>True, the implimentation of shell sort is dependant on the sequences used to
>compare elements in the array/list, and indeed if the result is
>n^(1+epsilon) it is still slower than quicksort.

Yes, but it isn't 10 times slower on 20,000 items, as was claimed.

fat G

unread,

Jun 12, 2002, 8:38:08 PM6/12/02

to

Well... it might be a matter of opinion, but the people that told me that
are proffesors on our faculty of CS,
and are supposed to know these things. I'ma stick with them until proven
wrong :).
If I may ask, why _are_ you using it? Heapsort and quicksort are both faster
(... in most cases), not to mention
much simplier to implement?

"Jan Philips" <jud.mc...@mindspring.com> wrote in message

news:4kpfgu81vbst74oj4...@4ax.com...

fat G

unread,

Jun 12, 2002, 8:42:13 PM6/12/02

to

> >We're told on our faculty never never never to use Shellsort
>
> That's nonsense. Perhaps they meant never use Bubble sort. I use
> Shell sort a lot.

They didn't mean bubblesort. It was definetely shellsort, I've looked it up.
Hope this doesn't sound too aggressive, but just because you are using
it a lot, it doesn't mean it's good, does it?
Not trying to be offensive or anything.

fat G

unread,

Jun 12, 2002, 9:02:30 PM6/12/02

to

How about using iterative quicksort, anyone tried that?
In theory, it's faster than recursive version.

"Beer Hunter" <beer_...@swirve.com> wrote in message
news:95849100.02061...@posting.google.com...

Ethar Alali

unread,

Jun 12, 2002, 9:40:26 PM6/12/02

to

It is not non-sense, your profs. have a point in certain circumstances, but
because of the choice of sequence that has to be made for the sort there is
an elements of risk that the chosen method is slower than other methods.

If I remember rightly the Shell sort is used as a Genetic algorithms test
(well task actually) for optimisation, the idea being to optimise the sort
of a particular set or 20 numbers. There are various hill climbing algo's
that have hit an optimized point.

heap sort and bubble sort are indeed very easy to impliment, but I cannot
advocate using them unless you know for sure that the algo's will be in
their element (i.e. they will be performing sorts on nearly sorted data, to
get as close to O( n ) as possible) if that is the case then use them, else
look at a more geeric approach.

The one thing with QuickSort, Radix sort etc. is that you have a fairly
predictable average complexity from each, within a defined range. This does
not apply to Bubble sort which has a complexity between O( n ) and O( n^2 )
which is pretty significant, thus making it sometimes (not all the time)
pretty inconsistent between implimentations.

The Heap and Bubble sorts are not faster than any other algorithm unless
they are working on data that is nearly already sorted. Most of the time the
averages are inthe O( n^2 ) mark hence way below the merge and quicksorts of

O( n log n )

It is amazing to see how heated a discussion on sorting can get =8¬) I am
not even sure that we answered te question that was asked, indeed I am not
sure I remember the original question.

Ethar

"fat G" <fa...@NO-SPAMemail.si> wrote in message

news:FTRN8.1573$w5.2...@news.siol.net...

Jan Philips

unread,

Jun 12, 2002, 9:44:57 PM6/12/02

to

On Thu, 13 Jun 2002 02:42:13 +0200, "fat G" <fa...@NO-SPAMemail.si>
wrote:

>They didn't mean bubblesort. It was definetely shellsort, I've looked it up.

>Hope this doesn't sound too aggressive, but just because you are using
>it a lot, it doesn't mean it's good, does it?

It isn't good because I use it, I use it because it is good.

Jan Philips

unread,

Jun 12, 2002, 9:45:05 PM6/12/02

to

On Thu, 13 Jun 2002 02:38:08 +0200, "fat G" <fa...@NO-SPAMemail.si>
wrote:

>Well... it might be a matter of opinion, but the people that told me that

>are proffesors on our faculty of CS,
>and are supposed to know these things. I'ma stick with them until proven
>wrong :).

Well I know something too. I've been studying algorithms for more
than 22 years and I've never heard "don't use Shell sort" before, but
I HAVE often heard it about bubblesort.

>If I may ask, why _are_ you using it? Heapsort and quicksort are both faster
>(... in most cases), not to mention
>much simplier to implement?

If you are sorting a few thousand records, the execution time
difference is very small. For sorting 1000 records, Quicksort is on
the order of 0.0001 second faster. For 10,000 records, the difference
is on the order of 0.001 second. For sorting hundreds of thousands or
millions of records, then Quicksort is naturally better.

Shell sort is simplier to implement than heap sort or quick sort. And
good implementations of quicksort get a little trickier. One good
implementation that I sometimes use requires adding a sentinal record
in order to run correctly. If I forget that, I get wrong answers very
quickly...

Jan Philips

unread,

Jun 12, 2002, 9:50:39 PM6/12/02

to

On Thu, 13 Jun 2002 02:40:26 +0100, "Ethar Alali"
<som...@dsl.pipex.com> wrote:

>The Heap and Bubble sorts are not faster than any other algorithm unless
>they are working on data that is nearly already sorted. Most of the time the
>averages are inthe O( n^2 ) mark hence way below the merge and quicksorts of
>O( n log n )

Actually, heap sort is O(n log n) on the average and in the worst
case. In fact, it was the first sort proven to be O(n log n) in the
worst case. I think heap sort takes about twice as long as quicksort
on the average, but it is guaranteed to be O(n log n), so it could be
a good choice when the data may be pathological.

Jan Philips

unread,

Jun 12, 2002, 9:56:14 PM6/12/02

to

On Thu, 13 Jun 2002 03:02:30 +0200, "fat G" <fa...@NO-SPAMemail.si>
wrote:

>How about using iterative quicksort, anyone tried that?

>In theory, it's faster than recursive version.

In my test the non-recursive quicksort is about 0.0001 second faster
than the recursive version when sorting 10,000 records.

Jan Philips

unread,

Jun 12, 2002, 10:21:32 PM6/12/02

to

On Thu, 13 Jun 2002 02:42:13 +0200, "fat G" <fa...@NO-SPAMemail.si>
wrote:

>They didn't mean bubblesort. It was definetely shellsort, I've looked it up.

You shouldn't believe everything they say. You should use your own
common sense. In "Algorithms", by Robert Sedgewick, first edition
page 98, "Shellsort is the method of choice for many sorting
applications because it has acceptable running time for even
moderately large files and requires only a very small amount of code,
which is easy to get working. There are more efficient methods, but
they're perhaps only twice as fast (if that much) except for large
files, and significantly more complicated. In short, if you have a
sorting problem, use Shellsort, then determine whether the extra
effort required to replace it with a sophisticated method will be
worthwhile."

And that corresponds to common sense and actual practice.

Bruce Roberts

unread,

Jun 13, 2002, 12:18:30 AM6/13/02

to

"fat G" <fa...@NO-SPAMemail.si> wrote in message
news:FTRN8.1573$w5.2...@news.siol.net...

> Well... it might be a matter of opinion, but the people that told me that
> are proffesors on our faculty of CS,
> and are supposed to know these things. I'ma stick with them until proven
> wrong :).

Regardless of academic qualification I would be leary of anyone who says
that one should never use a particular sort algorithm. One of the reasons
that there are so many sort algorithms is that each one of them has a set of
characteristics that make it ideal for a certain set of circumstances.
Knowing the nature of the data being sorted is a requirement of choosing the
optimal sort algorithm. If one doesn't have this knowledge then choosing
shell sort makes a great deal of sense.

> If I may ask, why _are_ you using it? Heapsort and quicksort are both
faster
> (... in most cases), not to mention
> much simplier to implement?

One of my corollaries to Murphy's law is that if there is a worst case
scenario, it will only appear after a product goes live. IMO "in most cases"
just doesn't cut it.

Ethar Alali

unread,

Jun 13, 2002, 2:57:47 AM6/13/02

to

Good point, I stand corrected, Heap is indeed O( n log n ) making it a good
choice for speed actually. Sounds good to me. Good call Jan. Being twice as
slow in terms of complexity is nothing, as long as we are not looking at
orders of magnitude then fine.

Ethar

"Jan Philips" <jud.mc...@mindspring.com> wrote in message

news:lbufguklvk5is3ptf...@4ax.com...

Ethar Alali

unread,

Jun 13, 2002, 3:01:26 AM6/13/02

to

I agree with Jan, as I mensioned before, the whole principle of selection
should be based on several factors, including speed, performance, space and
difficulty. This corresponds to a linear programming problem and hence will
allow you to find the optimal algo. for your problem. Never using it is a
bad recommendation, but even Bubblesort has it's good points, given the
right circumstances. Wriete down the variables, and select the one which
gives an optimal solution given the time and space constraints that you
have.

Ethar

"fat G" <fa...@NO-SPAMemail.si> wrote in message

news:tXRN8.1574$w5.2...@news.siol.net...

König

unread,

Jun 13, 2002, 6:40:26 AM6/13/02

to

I can see that the discussion has gotten a life of it's own. ;-)

If some of you don't remember me or even know who I am at all, let me
fill you in.... I'm the one posting the original question.

I've been following the discussion and I must admit that I thought it
would bee way simpler to "just sort a list". You really have to know
something about the data in the list, and how it is structured.

I was using Quick Sort and I couldn't figure out why it was kinda
"slow". But then I looked at my list and I discovered that large
sections of it already were sorted!! Well I guess that means bye bye
Quick Sort then!?

Then I can see that some of you suggest the Shell Sort and the Merge
Sort for a list that is partially sorted, as an alternative to, or
even a better solution than Quick Sort. But what about posting some
usable code so I could test it myself? That would really bee nice!!

Also...I've been searching the web for different sorting algorithms,
and I've (once again) stumbled across something interesting. If you'd
take the time to check out this website:
http://www.informatik.uni-stuttgart.de/ifi/ps/Ploedereder/sorter/sortanimation2.html
and try the animation. Could someone please explain what the
distribution sort is and how it works?! It looks fast!! ...and if you
have some code for that one too, you are more than welcome to post
it!!

...thanks for the great discussion, I've really learned something.

PhilO...@pgrahams.com

unread,

Jun 13, 2002, 11:26:54 AM6/13/02

to

"fat G" <fa...@NO-SPAMemail.si> wrote:

|Hope this doesn't sound too aggressive, but just because you are using
|it a lot, it doesn't mean it's good, does it?

Hope this doesn't sound too aggressive, but just because a professor
says something, doesn't make it so! (except in his class)

Phil

fat G

unread,

Jun 15, 2002, 7:15:33 AM6/15/02

to

True, I shouldn't believe everything. But it's a starting point to start
building my beliefs. After all, I've just learned all about these sorts,
while you and many others have years of experience with them.
Had no _real_ problem to solve yet with this knowledge.
And I agree that "never never never" is too radical, too.

"Jan Philips" <jud.mc...@mindspring.com> wrote in message

news:guvfgukrecuau31h1...@4ax.com...

Jan Philips

unread,

Jun 15, 2002, 6:38:20 PM6/15/02

to

On Thu, 13 Jun 2002 08:01:26 +0100, "Ethar Alali"
<som...@dsl.pipex.com> wrote:

>I agree with Jan, as I mensioned before, the whole principle of selection
>should be based on several factors, including speed, performance, space and
>difficulty.

Yes, I routinely use 4 or 5 sorting algorithms, depending on the
situation.

Jan Philips

unread,

Jun 15, 2002, 7:04:41 PM6/15/02

to

On Sat, 15 Jun 2002 13:15:33 +0200, "fat G" <fa...@NO-SPAMemail.si>
wrote:

>True, I shouldn't believe everything. But it's a starting point to start

>building my beliefs. After all, I've just learned all about these sorts,
>while you and many others have years of experience with them.

Even the lowly Bubble Sort has some uses, and it is certainly the
least well regarded of the mainstream sorting methods. It is OK on
nearly sorted data, and that comes up in practice sometimes.

Jan Philips

unread,

Jun 15, 2002, 7:06:21 PM6/15/02

to

On Thu, 13 Jun 2002 07:57:47 +0100, "Ethar Alali"
<som...@dsl.pipex.com> wrote:

>Good point, I stand corrected, Heap is indeed O( n log n ) making it a good
>choice for speed actually. Sounds good to me. Good call Jan. Being twice as
>slow in terms of complexity is nothing, as long as we are not looking at
>orders of magnitude then fine.

The Algoritmn Design Manual says that a poorly written Quicksort can
be slower than a poorly written Heap sort. I've never experienced
that, but it could be.

Beer Hunter

unread,

Jun 16, 2002, 1:02:28 AM6/16/02

to

"fat G" <fa...@NO-SPAMemail.si> wrote in message

news:<ueSN8.1575$w5.2...@news.siol.net>...

> How about using iterative quicksort, anyone tried that?
> In theory, it's faster than recursive version.

Do you mean replacing the second recursive call with iteration, or
both? Some compilers already replace tail recursion with iteration -
if you want to see a difference, you might need to replace both.

fat G

unread,

Jun 16, 2002, 7:01:33 AM6/16/02

to

both, of course
The stack implementation needs to be quite fast, though.
Otherwise there's probably no difference.

"Beer Hunter" <beer_...@swirve.com> wrote in message
news:95849100.02061...@posting.google.com...