What is the advantage of using a single linked list? Either way, if
your array contains N elements, you will need a total of N nodes.
Whether they're in one linked list or several does not seem to matter.
I don't see how it makes anything faster. Whenever you add a node, you
have to make some other node point to it, and you have to make it point
somewhere. Whether it points to null or to another node doesn't seem
like it would make any performance difference.
>What do you mean by sub allocator?
Often when you deal with lots of small allocations, you can increase
the speed of your program by requesting large chunks of memory from the
OS and then managing the allocation of those large chunks into small
chunks yourself.
For example, in your case, let's assume you have 4 byte pointers and
you want to sort elements of an array that are also 4 bytes. Then each
node requires 8 bytes. You will have N nodes, so you can just call
malloc() (or whatever) one time and allocate 8*N bytes of memory. Then
set a pointer to the beginning of that region of memory. Call it
next_available. Whenever you need the address of a new (blank, unused)
node, take the value of next_available as that address and then update
next_available by adding 8 to it. This way, you end up calling the
operating system only one time for memory allocation instead of N
times. Since memory allocation can be slow (as can any system call),
this could make your sort go much faster.
- Logan
--
"I'll tell you something. Luxury disgusts me." Giorgio Armani, Jan 17, 2002
( http://dailynews.yahoo.com/h/nm/20020117/re/life_fashion_armani_dc_1.html )
And the algorithm I posted does not need to allocate [besides the static
array] any memory and makes exactly one pass over the data. Unfortunately,
both of our algorithms are fairly useless for general purpose sorting. They
work only for a special case.
> Can
> you point me to a book which implements Radix with a single linked
> list and no pre calculation of identical keys?
No book needed. I posted an algorithm that does exactly that. Too bad it
isn't worth the powder to blow it up with.
> What do you mean by sub allocator?
You allocate memory in large blocks (say one megabyte) and then suballocate
it yourself. It turns out that malloc() on many systems is incredibly slow.
As an illustration, I once wrote a sorting algorithm that allocated strings
as it read them, one at a time, using malloc(). It turns out that it was
faster to physically read the file twice. Of course, I know better now.
--
C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
"The C-FAQ Book" ISBN 0-201-84519-9
C.A.P. FAQ: ftp://cap.connx.com/pub/Chess%20Analysis%20Project%20FAQ.htm
Well that is without identical keys, is it not?
>
> > Can
> > you point me to a book which implements Radix with a single linked
> > list and no pre calculation of identical keys?
>
> No book needed. I posted an algorithm that does exactly that. Too bad it
> isn't worth the powder to blow it up with.
>
Why would you say that? It's already 2x as fast as the current Radix
using Knuth's best solution. It's certainly more efficient then
allocating memory (even if you sub allocate) and the amount of memory
needed is always the same: N linked list elements.
Sean
Well, allocation takes time, even sub-allocation. This algo would be
faster. But perhaps you are right, this could be nit-picking.
Sean
The algorithm I posted (here, I will refresh your memory) can have any
number of duplicate keys, up to the size of an unsigned short. You could
use it for larger types, but then the static array becomes prohibitive:
#include <limits.h>
unsigned long buckets[USHRT_MAX] = {0}; /* I like to be explicit */
void sort(unsigned short *list, unsigned count)
{
unsigned i;
for (i = 0; i < count; i++)
buckets[list[i]]++;
}
Here is data output that will show the data in order:
void spew(void)
{
unsigned i,j;
for (i = 0; i < count; i++)
if (buckets[i])
for (j = 0; j < buckets[i]; j++)
printf("%u ", i);
}
> > > Can
> > > you point me to a book which implements Radix with a single linked
> > > list and no pre calculation of identical keys?
> >
> > No book needed. I posted an algorithm that does exactly that. Too bad
it
> > isn't worth the powder to blow it up with.
> >
> Why would you say that? It's already 2x as fast as the current Radix
> using Knuth's best solution. It's certainly more efficient then
> allocating memory (even if you sub allocate) and the amount of memory
> needed is always the same: N linked list elements.
Your algorithm doesn't even work for general purpose sorting. If you think
it is dandy, why not write a utility and sell it. There is a world's
fastest sorting contest -- why don't you enter it?
Here's another world's fastest sort guy:
http://members.aol.com/mckyyy/sbcntsrt.htm
And another:
http://www.rrsd.com/
Both of those guys did just what you did: reinvent a well-known sort.
Actual record holder (at one time anyway):
http://now.cs.berkeley.edu/NowSort/index.html
Once you can beat the benchmarks for NowSort, then you'll have something.
Bon Chance.
Let me clarify this sentence.
You can have up to ULONG_MAX duplicate keys. They keys are up integers
between 0 USHORT_MAX.
> You could
> use it for larger types, but then the static array becomes prohibitive:
>
> #include <limits.h>
> unsigned long buckets[USHRT_MAX] = {0}; /* I like to be explicit */
> void sort(unsigned short *list, unsigned count)
> {
> unsigned i;
> for (i = 0; i < count; i++)
> buckets[list[i]]++;
> }
>
> Here is data output that will show the data in order:
>
> void spew(void)
> {
> unsigned i,j;
> for (i = 0; i < count; i++)
> if (buckets[i])
> for (j = 0; j < buckets[i]; j++)
> printf("%u ", i);
> }
This is really a fast way to sort. Unfortunately, it's completely useless
for all intents and purposes.
But they are good sorts. With certain caveats distribution sorts are
superior to comparison sorts. The basis for this superiority is that
distribution sorts can, in effect, capture several bits of comparison
in a single pass.
>
>Actual record holder (at one time anyway):
>http://now.cs.berkeley.edu/NowSort/index.html
>
>Once you can beat the benchmarks for NowSort, then you'll have something.
>Bon Chance.
There is a problem with benchmarks for sorting - they presume a
particular environment. How much memory, how many processors, how
many mass storage devices, the performance characteristics of the mass
storage device, the amount of data, the format of the data, the size
of data elements, the size of keys, et cetera, are all determinants as
to what sort algorithm will be optimal. But then, you knew that.
>--
>C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
> "The C-FAQ Book" ISBN 0-201-84519-9
>C.A.P. FAQ: ftp://cap.connx.com/pub/Chess%20Analysis%20Project%20FAQ.htm
>
>
Richard Harter, c...@tiac.net,
http://www.tiac.net/users/cri, http://www.varinoma.com
Love, no matter how pure, is the most selfish of gifts.
For that reason it is the one gift that must be given.
> Actual record holder (at one time anyway):
> http://now.cs.berkeley.edu/NowSort/index.html
>
> Once you can beat the benchmarks for NowSort, then you'll have something.
> Bon Chance.
Yeah, like any of us actually have 32-node clusters or 95 Ultra1
machines sitting around so that we could even try out our
algorithms in an environment comparable to the ones used for
benchmarking NOW-Sort. I couldn't even afford the electricity to
power that many machines. I mean, I could sit around saying,
"Hey I bet this would be nice and fast" and that'd be fun for a
while, but I'd never get a chance to find out how well it works,
if at all. I'd be a lot more impressed by something done on a
single node.
--
"Long noun chains don't automatically imply security."
--Bruce Schneier
I prefer to say that a distribution sort (under ideal conditions)
categorizes the data into n distinct buckets in a single pass.
Sample problematic set:
100 character string with the first 99 characters identical.
A small modification to LSB radix sort can still sort this data in two
passes.
You simply count all of the buckets at each level on the first pass, then
skip all unnecessary distribution passes.
Amusingly, a little known fact is that distribution sorts can be made fully
general purpose with the addition of a callback function. The callback
function does not compare. Rather, it hands back block_of_bits[i] such that
this bucket contains the bits from most significant to least significant for
chunk_of_key[i] {and obviously chunk_of_key[0] is the most significant and
the last chunk is the least significant}. For instance, if the key is
unsigned char and the buckets are CHAR_BIT bits wide, then you simply return
the ith character. The function is obvious for various integers, but it can
easily be applied to floating point types and any other data type as well.
If you build it, they will come[1]
[1] Provided that it is actually faster.
> "Dann Corbit" <dco...@connx.com> writes:
>
> > Actual record holder (at one time anyway):
> > http://now.cs.berkeley.edu/NowSort/index.html
> >
> > Once you can beat the benchmarks for NowSort, then you'll have something.
> > Bon Chance.
>
> Yeah, like any of us actually have 32-node clusters or 95 Ultra1
> machines sitting around so that we could even try out our
> algorithms in an environment comparable to the ones used for
> benchmarking NOW-Sort.
Some of us have access to one at work, though... ;-)
And setting up a cluster on a set of networked machines is not that hard,
these days. It'll be slower than a purpuse-built cluster with a dataswitch,
but it should be good enough for proof-of-principle...
-- Gordon D. Pusch
perl -e '$_ = "gdpusch\@NO.xnet.SPAM.com\n"; s/NO\.//; s/SPAM\.//; print;'
Hey, this is just a discussion group, why are you so angry? If
everyone who thinks they want to share something (for free) that may
(a long shot) be new, you should encourage that kind of thing, not
beat them down. I you some kind of idiot?
Sean
It's totally pointless to store that. If you just want to sort integers,
all you need to know is how many of each type there are. Period.
> > Your algorithm doesn't even work for general purpose sorting. If you
think
> > it is dandy, why not write a utility and sell it. There is a world's
> > fastest sorting contest -- why don't you enter it?
>
> Hey, this is just a discussion group, why are you so angry? If
> everyone who thinks they want to share something (for free) that may
> (a long shot) be new, you should encourage that kind of thing, not
> beat them down. I you some kind of idiot?
I don't mean to beat you down and I am not angry at you. I only mean to say
that there is more to sorting than meets the eye. Making something that can
separate a single data type rapidly has very limited usefulness. Though in
the case of character strings (which are probably most of data) it is still
pretty useful.
I think your research should continue. But I also think that you should
look at what others have done. When you come up with something better,
everyone will benefit. You don't understand why your algorithm isn't
superior. I have tried to show you. I'm tired of it now.
Happy sorting.
>"Richard Harter" <c...@tiac.net> wrote in message
>news:3c5f6d16...@news.SullyButtes.net...
>> On Mon, 4 Feb 2002 20:08:01 -0800, "Dann Corbit" <dco...@connx.com>
>[snip]
>> But they are good sorts. With certain caveats distribution sorts are
>> superior to comparison sorts. The basis for this superiority is that
>> distribution sorts can, in effect, capture several bits of comparison
>> in a single pass.
>
>I prefer to say that a distribution sort (under ideal conditions)
>categorizes the data into n distinct buckets in a single pass.
That's the natural way to think of it; however if you think in terms
of bits you can account for non-ideal distributions, i.e., for some
buckets getting more than others.
>
>Sample problematic set:
>100 character string with the first 99 characters identical.
>A small modification to LSB radix sort can still sort this data in two
>passes.
>You simply count all of the buckets at each level on the first pass, then
>skip all unnecessary distribution passes.
That comes under the category of special case code that is only useful
if you know in advance that you have that special case.
>
>Amusingly, a little known fact is that distribution sorts can be made fully
>general purpose with the addition of a callback function. The callback
>function does not compare. Rather, it hands back block_of_bits[i] such that
>this bucket contains the bits from most significant to least significant for
>chunk_of_key[i] {and obviously chunk_of_key[0] is the most significant and
>the last chunk is the least significant}. For instance, if the key is
>unsigned char and the buckets are CHAR_BIT bits wide, then you simply return
>the ith character. The function is obvious for various integers, but it can
>easily be applied to floating point types and any other data type as well.
Cute. That works with the usual provisos about efficiency.
A game that can be played when sorting strings is to compress them
using a compression scheme that preserves order. Presumably
compressing and decompressing would be prohibitively expensive if they
were part of the sorting process but not if the compressed form of the
keys were stored with the data.
Not at all!
With LSD radix sort, you will always have to count every bin, and so you may
as well count them all in one pass. In fact, if you do it the standard way,
there are exactly 2*n passes over the data. One distribution pass and one
counting pass for each bin. If you do all the counts at once, then there
are exactly n+1 passes over the data. There is one counting pass and one
distribution pass for the n buckets. No matter what, it is always better to
do it this way for LSD radix sort. Also, by doing it this way, you can do a
quick statistical check and decide if a different algorithm would be better
or if you can skip any of the bins.
LSD radix sort should *ALWAYS* proceed in exactly that manner. It is clear
than as the key gets longer and longer the modified algorithm approximates
twice the speed of the standard algorithm. If you are still doing it the
old way, you should fix it right away.
[snip]
>"Richard Harter" <c...@tiac.net> wrote in message
>news:3c664357...@news.SullyButtes.net...
>> On Mon, 4 Feb 2002 23:42:23 -0800, "Dann Corbit" <dco...@connx.com>
>[snip]
>> >Sample problematic set:
>> >100 character string with the first 99 characters identical.
>> >A small modification to LSB radix sort can still sort this data in two
>> >passes.
>> >You simply count all of the buckets at each level on the first pass, then
>> >skip all unnecessary distribution passes.
>>
>> That comes under the category of special case code that is only useful
>> if you know in advance that you have that special case.
>
>Not at all!
>
>With LSD radix sort, you will always have to count every bin, and so you may
>as well count them all in one pass.
It would be better not to make statements such as "you will always
have to count every bin". The reason for counting is to establish the
bin sizes; this is only relevant if you need to know the bin sizes.
There are several alternatives where you don't need to know the bin
sizes, e.g., when the bins are maintained as linked lists. It is
true, however, that radix sorts are commonly done in a way that
involves counting and preassignment of bins.
>In fact, if you do it the standard way,
>there are exactly 2*n passes over the data. One distribution pass and one
>counting pass for each bin.
>If you do all the counts at once, then there
>are exactly n+1 passes over the data.
Ah, but that one counting pass involves looking at all of the bytes of
the key where the distribution passes involve looking at only byte of
the key. Either way each byte (digit) of the key has to be examined
twice, once for counting and once for distribution. The advantage
(and I assume that there is one) lies in the superior tightness of the
counting loop in your suggested scheme.
>There is one counting pass and one
>distribution pass for the n buckets. No matter what, it is always better to
>do it this way for LSD radix sort.
Have you tested this? That is, have you benchmarked the two methods
and have you profiled the benchmarks.
>Also, by doing it this way, you can do a
>quick statistical check and decide if a different algorithm would be better
>or if you can skip any of the bins.
That is indeed an advantage.
>LSD radix sort should *ALWAYS* proceed in exactly that manner. It is clear
>than as the key gets longer and longer the modified algorithm approximates
>twice the speed of the standard algorithm.
Actually it is not clear at all. In as LSD radix sort with fixed
record length and fixed key length of size m and n records with the
sorting being done into preallocated continguous bins we have
m' * n record moves
m' * n examination of keys
m * n updates of count vectors
K record accesses
where m' is the number of levels where there is more than one value to
be distributed. The reordering that you suggest does not change any
of these numbers (in the standard method you can may make the same
skip after counting trick) except the number of record accesses where
the ratio is (n+1)/2n. Whether different numbers of record accesses
matter depends on the memory organization. If accesses are cheap and
constant cost the amortized cost of record accesses may be small
compared to the other factors. In particular one expects moving
records to be the dominant cost.
This is not to say that your trick is not a good thing to do; it
clearly is and should provide some improvement over the "standard"
across the board. I rather doubt that it is anything like a 2 to 1
improvement.
>If you are still doing it the
>old way, you should fix it right away.
When I create a special purpose sort it is almost never an LSD; it
isn't hard to create an efficient MSD and MSD's are much more
flexible.
I have never seen a linked list radix sort that was as fast as one where the
counts happen first. That is due to the very high cost of memory
allocation, especially of many small chunks. A suballocator will help, but
you also have a higher cost because of the memory used by the links. If you
hold the whole record in memory, it is not terribly consequential. But
usually, you just have the key in memory. If the key is ten bytes (for
example) and a pointer is 4 bytes, then you have reduced your sortable size
for one block of records by 40%.
Doesn't mean that a linked list radix sort can't be faster. Just that I
have never seen one (and I have tried that route and failed to create
improvements).
> >In fact, if you do it the standard way,
> >there are exactly 2*n passes over the data. One distribution pass and
one
> >counting pass for each bin.
>
> >If you do all the counts at once, then there
> >are exactly n+1 passes over the data.
>
> Ah, but that one counting pass involves looking at all of the bytes of
> the key where the distribution passes involve looking at only byte of
> the key. Either way each byte (digit) of the key has to be examined
> twice, once for counting and once for distribution. The advantage
> (and I assume that there is one) lies in the superior tightness of the
> counting loop in your suggested scheme.
>
> >There is one counting pass and one
> >distribution pass for the n buckets. No matter what, it is always better
to
> >do it this way for LSD radix sort.
>
> Have you tested this? That is, have you benchmarked the two methods
> and have you profiled the benchmarks.
Yes I have. Sometimes, it is enormously faster. I attribute this to having
to load distant objects into memory almost twice as many times, which is
very expensive. I expect for different architectures, the profile would be
different. Even going from a P4 to an Athlon to an Alpha will change things
a lot. Often, a cache fault is as expensive as a missed branch prediction.
Quite painful in other words. Sorting large files (out of core) is a
typical job for radix sort. After all, why are we bothering with radix sort
if we only have a couple million items. Hence we are talking about disk
passes in that situation. (It is true that radix sort is purely a serial
read and easy to predict and hence disk cache works quite well on it, but
there is still a large penalty for each additional pass.)
One thing for sure, it will definitely be faster than the extra passes. Try
it yourself. You can find code for it on the C Unleashed CD.
I also skip distribution steps if either all elements are the same. Happens
more often than you might imagine. Besides, it is only a single comparision
for an entire pass.
> This is not to say that your trick is not a good thing to do; it
> clearly is and should provide some improvement over the "standard"
> across the board. I rather doubt that it is anything like a 2 to 1
> improvement.
Try it. You'll be surprised. If there are only two buckets, you might not
see quite that much. But try with strings of unsigned char at least ten
long. You have to have a lot of data before the effect shows up most
strongly. A few hundred thousands strings and it won't be very noticeable.
But a few hundred megabytes and you will see a dramatic difference.
> >If you are still doing it the
> >old way, you should fix it right away.
>
> When I create a special purpose sort it is almost never an LSD; it
> isn't hard to create an efficient MSD and MSD's are much more
> flexible.
I probably do half and half. If I know enough about the data to know that I
will mostly have to traverse the whole key anyway, I will use LSD. If the
keys are long and have enough initial entropy to quickly separate them, then
MSD. [If, indeed, a radix sort is wanted at all.]
In a database, you typically house table statistics in the system tables.
This information can be used to increase the speed of sorting even more.
Is not the remainder of my point that you no longer need to scan for
bin sizes? It can be saved in a single linked list N elements in
size. I have tested this 1000's of times, it works properly..
Sean
What I meant is, all bins can be stored in a single linked list and it
is not necessary to count them. Each individual bin has a pointer to
the first and last occurence of its index within the linked list. It
simply drops the occurence into the bin, picks up the last occurence,
and links that to the next free link. When sorting is complete it
follows the first link to the last link of that bin index. How does
that sound?
Sean
Yes. This result is well known. It does work, without question. The only
remaining question is:
"Does it work better in real life situations?"
> Well, allocation takes time, even sub-allocation. This algo would be
> faster. But perhaps you are right, this could be nit-picking.
>
> Sean
Not necessarily. In a large number of situations, you know in advance your pattern of allocation
and de-allocation, and so you actually don't have to keep track of these things at runtime. This
does not work for general programs, but for some algorithms may be a plus.
David Feuer
I do not know which book you are referencing, I have never seen this
implemented before. Please point me in the right direction if you
could.
Sean
You've never seen a linked list version of radix sort?
A Web search will turn up thousands of examples.
Dan,
I don't know how many times I'm going to say the same thing, but I
have seen these examples and not one has implemented the Radix with a
single linked list for multiple keys. They must count first, then
allocate, then sort. Please point me towards 1 of these examples if
you could.
Sean
Sorry, I'm not your web search engine. If you are curious enough, you will
find them. If not, I don't care.
That's because there are none. You simply can't admit that it's new.
You *are* an idiot.
> I don't know how many times I'm going to say the same thing, but I
> have seen these examples and not one has implemented the Radix with a
> single linked list for multiple keys. They must count first, then
> allocate, then sort. Please point me towards 1 of these examples if
> you could.
Most radix sort implementations use multiple lists, but they do NOT
"count first". They simply keep individual lists -- since you have
pointers into your single list for each key (essentially head and tail
pointers for each key), what you're doing is precisely the same
thing. It's just that your lists are chained together. Other than
that (and I don't really see the advantage of that), this is a very
run-of-the-mill radix sort....
--
Steve Tate --- srt[At]cs.unt.edu | Gratuitously stolen quote:
Dept. of Computer Sciences | "The box said 'Requires Windows 95, NT,
University of North Texas | or better,' so I installed Linux."
Denton, TX 76201 |
Ok, but how do you know how much memory to allocate for each list? I
suppose you could just allocate a large enough chunk and take from
that. Anyway, could you point me towards this type of implimentation.
And Dan, you are still a weasel.
Sean
This sorting algorhithm is unique as it makes no comparisons of
elements and does not move elements. It utilizes a bin system to
“drop” numbers into order. It also uses a linked list to
store duplicate numbers. It completely sorts an array in 1 pass.
This seems amazing to me as well, but that’s the way it works.
The sorting portion of the algorhithm is straight forward (at least to
me). There are 2 separate operations that create the sort. The first
operation is retrieving the random number from Unsorted_array() and
placing it in the appropriate “Bin”. This is done by
indexing sequentially thru Unsorted_array(). If for example index=12
and Unsorted_array(index) = 1532 then Bin_Data_Structure(1532) is
where it places information about this occurrence of value 1532. This
information is the index into the Unsorted_array, which is 12. Since
there is a corrosponding Bin_Data_Structure element for each value of
Unsorted_array(), Bin_Data_Structure() must be dimensioned to the
highest value of the elements in Unsorted_array(). For example if
Unsorted_array() contained only 10 elements and the highest value was
1,000,100 then Bin_Data_Structure() would need to dimensioned to
1,000,100 elements to sort 10 items! Fortunately, I believe in this
situation it would possible to map the values to the first 10
index’s and convert back after the sort. This way, we would
only have to dimension Bin_Data_Structure to 10, rather then the
highest value in Unsorted_array(). This demo simply dimension B_D_S()
to the maximum range of random numbers, which is why I limit it to 1
million.
The second operation is necessary because it is possible to have
multiple instances of the same number. If Unsorted_array(12)=1532 and
Unsorted_array(19) =1532, then B_D_S(Unsorted_array(19)) would
overwrite the original contents of B_D_S(1532).
B_D_S(1532).index_value was 12 and is now 19! This was no good. The
first solution that came to mind is to dimension
B_D_S(Largest_Unsorted_List_Value, Number_Of_Elements). So if by
chance all of the elements were the same value you would simply index
to the next free position in the B_D_S (1532,0 for index 12; 1532,1
for index 19). Of course this solution is not workable. As the
values of the B_D_M() subscripts increase, so does the memory
consumption, but I was starting to get the right idea. This is when I
decided to use a linked list to store each occurrence of a number and
its index into the Unsorted_array(). B_D_S() would now hold not the
index of the Unsorted_array() but The First Occurrence of
Unsorted_array(index) and the Last Position of Unsorted_array(index)
within the linked list. The linked list would now hold the
occurrences of the Unsorted_array() values and the original index into
Unsorted_array(). Since there is only 1 linked list there is a need
to chain the identical numbers, so the list is 2 directional with the
back pointer pointing to the last occurrence of the same number. The
forward pointer is updated at the previous last occurrence to point at
the new occurrence. B_D_S(Unsorted_array(index)) is updated with the
last occurrence of its slot so as to avoid having to search out the
last occurrence of that bin in the link list, which saves plenty of
time. The other half of B_D_S(), the first occurrence of the number
never changes. So, after this process is complete (all index’s
of Unsorted_array() have been read and then stored in the linked list
utilizing the B_D_S() information), all that is necessary to do is
read the information back out into an ouput file or another array.
This is a simple process of going thru B_D_S() in ascending or
desending order picking up the B_D_S().First_element and following
the forward pointers within the linked list to the next occurrence of
the number. It terminates when forward pointer is null. Then,
proceed to the next B_D_S(index). It might have been simpler to
create a linked list for each bin location, but then I realized that
the 1 linked list was better because I would not know how large to
make each linked list for each bin and would have to dynamically
allocate for each new occurrence which would cost time. With 1 list I
always know the size, its Unsorted_array(number_of_elements) long.
One interesting attribute of this sorting system is it can be
processed in parallel with no wasted cycles. If you have say, 4
processors, you could simply divide Number_of_elements/4 and use this
as a starting pointer for each processor. For example, 100 elements
would mean processor 1 would start at 0 and end at 24, processor 2 at
25, processor 3 at 50, etc. Each processor could have its own linked
list so it wouldn’t have to wait while another processor updated
the linked list. When assembling the finished sorted array, you could
simply go to the bin(index) of each processor and combine the
occurences of the index. No wait cycles!!
Another attribute is that it works at the same speed on all data. It
needs to process N items as a minimum and maximum.
I have run some tests of the speed of this algorithm but I’m not
sure I’m getting optimal results as I have programmed and
compiled this in Visual Basic 6 without using pointers which I’m
sure would be more efficient. As it stands, it will sort 1,000,000 8
bit numbers in .16 seconds and 1,000,000 16 bit numbers in .6 seconds
on an 1.1 Ghz PC. If someone could implement this in C or assembler
to get a real benchmark it would probably make more sense (seems a
little slow to me considering the benchmarks I’ve been seeing in
the newsgroups). However, there is no reason this should not be
faster then any sort out there as it is 1 pass with 2 if statements in
between.
If you need further explanation about how this sort works, please
e-mail me. I need to write a paper regarding the code with benchmark
and comparison statistics and if you would like to help, I would
appreciate it.
Complete source code, results, and demo application are available at:
Not yet available because I need someone to host it for me. About 2
mb in size. Please e-mail me at lorbe...@hotmail.com
Here is a short report with without any clock.
Sort array length: 50
Random number maximum value: 25
Unsorted list:
17 6 22 14 9 5 3 15 6 2 9 15 4 13 13 15 20 22 21
20 3 16 21 10 24 10 17 16 5 2 20 2 18 5 17 2 1 14
24 14 24 3 18 1 10 6 3 22 22 22 4
Sort value: 1
Index in original array: 36
Index in original array: 43
Occurence Count: 2
Sort value: 2
Index in original array: 9
Index in original array: 29
Index in original array: 31
Index in original array: 35
Occurence Count: 4
Sort value: 3
Index in original array: 6
Index in original array: 20
Index in original array: 41
Index in original array: 46
Occurence Count: 4
Sort value: 4
Index in original array: 12
Index in original array: 50
Occurence Count: 2
Sort value: 5
Index in original array: 5
Index in original array: 28
Index in original array: 33
Occurence Count: 3
Sort value: 6
Index in original array: 1
Index in original array: 8
Index in original array: 45
Occurence Count: 3
Sort value: 9
Index in original array: 4
Index in original array: 10
Occurence Count: 2
Sort value: 10
Index in original array: 23
Index in original array: 25
Index in original array: 44
Occurence Count: 3
Sort value: 13
Index in original array: 13
Index in original array: 14
Occurence Count: 2
Sort value: 14
Index in original array: 3
Index in original array: 37
Index in original array: 39
Occurence Count: 3
Sort value: 15
Index in original array: 7
Index in original array: 11
Index in original array: 15
Occurence Count: 3
Sort value: 16
Index in original array: 21
Index in original array: 27
Occurence Count: 2
Sort value: 17
Index in original array: 0
Index in original array: 26
Index in original array: 34
Occurence Count: 3
Sort value: 18
Index in original array: 32
Index in original array: 42
Occurence Count: 2
Sort value: 20
Index in original array: 16
Index in original array: 19
Index in original array: 30
Occurence Count: 3
Sort value: 21
Index in original array: 18
Index in original array: 22
Occurence Count: 2
Sort value: 22
Index in original array: 2
Index in original array: 17
Index in original array: 47
Index in original array: 48
Index in original array: 49
Occurence Count: 5
Sort value: 24
Index in original array: 24
Index in original array: 38
Index in original array: 40
Occurence Count: 3
Sorted list:
1 1 2 2 2 2 3 3 3 3 4 4 5 5 5 6 6 6 9 9 10 10 10 13 13 14 14 14 15 15
15 16 16 17 17 17 18 18 20 20 20 21 21 22 22 22 22 22 24 24
And here is the source code which is probably easier to understand
then my explanation!
Global Array_Length As Long
Global Random_Number_Max As Long
Global Larger_Of_2_Arrays As Long
Global SOI_Flag As Boolean
Dim To_Sort_List() As Long
Dim Sorted_Array() As Long
Dim Nxt_Free_Lnk As Long
Dim Tick_Off(1000000) As Boolean
Dim Start_Time As Single
Dim End_Time As Single
Dim The_Time As Single
Dim Calc_Avg As Double
Type Drp_Bin_Struct
First_In_Linked_List As Long
Last_In_Linked_List As Long
End Type
Dim Drp_Bin_Pntr() As Drp_Bin_Struct
Type Multiple_Number_Occurence_Linked_List_Struct
Forward_Pntr As Long
Back_Pntr As Long
Index_In_To_Sort_List As Long
End Type
Dim Multiple_Number_Occurence_Linked_List() As
Multiple_Number_Occurence_Linked_List_Struct
Public Sub t_main()
Dim xq As Long, The_nmbr As Long
Installed_path = CurDir
Crlf = Chr$(13) + Chr$(10)
Sort_Form_Main.Label6.Caption = ""
Open Installed_path + "\Report.txt" For Output Access Write As #1
Print #1, "START" + Crlf
Close #1
Open Installed_path + "\Report.txt" For Append As #1
Call i_init
Print #1, "Sort array length: " + Str$(Array_Length)
Print #1, "Random number maximum value: " +
Str$(Random_Number_Max) + Crlf
' create random list of numbers
Sort_Form_Main.Label1.Caption = "Writing Random Numbers To
Report": DoEvents
Randomize Timer
Print #1, "Unsorted list: "
Calc_Avg = 0
For xq = 0 To Array_Length
Rnd_num = Int(Random_Number_Max * Rnd(1) + 1)
To_Sort_List(xq) = Rnd_num
Calc_Avg = Calc_Avg + Rnd_num
Print #1, Str$(Rnd_num); " ";
Next xq
Calc_Avg = Int(Calc_Avg / Array_Length)
Print #1, Crlf
Sort_Form_Main.Label1.Caption = "Writing Random Numbers To Report
- Done!": DoEvents
Sort_Form_Main.Label6.Caption = "Average Random #: " +
Str$(Calc_Avg)
' SORT
' drop in bin and link to next occurence of same number
Sort_Form_Main.Label2.Caption = "Sorting": DoEvents
Start_Time = Timer
Nxt_Free_Lnk = 0: Erase Tick_Off
For xq = 0 To Array_Length
The_nmbr = To_Sort_List(xq)
If Tick_Off(The_nmbr) = False Then
Drp_Bin_Pntr(The_nmbr).First_In_Linked_List = Nxt_Free_Lnk
Multiple_Number_Occurence_Linked_List(Nxt_Free_Lnk).Back_Pntr
= Larger_Of_2_Arrays + 1
End If
Previous_last = Drp_Bin_Pntr(The_nmbr).Last_In_Linked_List
Drp_Bin_Pntr(The_nmbr).Last_In_Linked_List = Nxt_Free_Lnk
If Previous_last <> Larger_Of_2_Arrays + 1 Then
Multiple_Number_Occurence_Linked_List(Previous_last).Forward_Pntr
= Drp_Bin_Pntr(The_nmbr).Last_In_Linked_List
End If
Multiple_Number_Occurence_Linked_List(Nxt_Free_Lnk).Back_Pntr =
Previous_last
Tick_Off(The_nmbr) = True
Multiple_Number_Occurence_Linked_List(Nxt_Free_Lnk).Index_In_To_Sort_List
= xq
Nxt_Free_Lnk = Nxt_Free_Lnk + 1
Next xq
End_Time = Timer
The_Time = End_Time - Start_Time
Sort_Form_Main.Label2.Caption = "Sorting - Done!
CLOCK:" + Str$(The_Time) + " Seconds": DoEvents
' print sorted list
Sort_Form_Main.Label3.Caption = "Writing Final Results": DoEvents
Strt_sort_cnt = 0
For xq = 0 To Larger_Of_2_Arrays
If Tick_Off(xq) = True Then
Ocr_cnt = 0
If SOI_Flag = True Then
Print #1, "Sort value: " + Str$(xq)
End If
Pntr_in_linked_list = Drp_Bin_Pntr(xq).First_In_Linked_List
Do
Sorted_Array(Strt_sort_cnt) = xq: Strt_sort_cnt =
Strt_sort_cnt + 1
If SOI_Flag = True Then
Print #1, " Index in original array: " +
Str$(Multiple_Number_Occurence_Linked_List(Pntr_in_linked_list).Index_In_To_Sort_List)
End If
Pntr_in_linked_list =
Multiple_Number_Occurence_Linked_List(Pntr_in_linked_list).Forward_Pntr
Ocr_cnt = Ocr_cnt + 1
If Pntr_in_linked_list = Larger_Of_2_Arrays + 1 Then Exit
Do
Loop
If SOI_Flag = True Then
Print #1, " Occurence Count: " + Str$(Ocr_cnt) + Crlf
End If
End If
Next xq
Print #1, Crlf + "Sorted list: ": DoEvents
xq = 0
Do Until xq = Strt_sort_cnt - 1
Print #1, Str$(Sorted_Array(xq));
xq = xq + 1
Loop
Close #1
Sort_Form_Main.Label3.Caption = "Writing Final Results - Done!":
DoEvents
MsgBox "Results has been saved to " + Installed_path + "The File
Name Is REPORT.txt", 0
Sort_Form_Main.Command1.Enabled = True
End Sub
Public Sub i_init()
Sort_Form_Main.Label1.Caption = ""
Sort_Form_Main.Label2.Caption = ""
Sort_Form_Main.Label3.Caption = ""
DoEvents
For xq = 0 To Larger_Of_2_Arrays
Drp_Bin_Pntr(xq).First_In_Linked_List = Larger_Of_2_Arrays + 1
Drp_Bin_Pntr(xq).Last_In_Linked_List = Larger_Of_2_Arrays + 1
Multiple_Number_Occurence_Linked_List(xq).Forward_Pntr =
Larger_Of_2_Arrays + 1
Next xq
End Sub
Public Sub i_set_arrays(rslt)
Ran_Max = Val(Sort_Form_Main.Random_Number_Max_Input.Text)
Array_L = Val(Sort_Form_Main.Array_Length_Input.Text)
SOI_Flag = Sort_Form_Main.Check1.Value
No_soi_flag = False
If SOI_Flag = True And Array_L > 5000 Then
No_soi_flag = True
End If
If Array_L > 0 And Array_L <= 1000000 And Ran_Max > 0 And Ran_Max
<= 1000000 And Ran_Max >= -1000000 Then
ReDim To_Sort_List(Array_L) As Long
ReDim Sorted_Array(Array_L) As Long
Larg_Of_2 = 0
If Ran_Max > Array_L Then
Larg_Of_2 = Ran_Max
Else
Larg_Of_2 = Array_L
End If
'ReDim Tick_Off(Larg_Of_2) As boolean
ReDim Drp_Bin_Pntr(Larg_Of_2) As Drp_Bin_Struct
ReDim Multiple_Number_Occurence_Linked_List(Larg_Of_2) As
Multiple_Number_Occurence_Linked_List_Struct
Array_Length = Array_L
Random_Number_Max = Ran_Max
Larger_Of_2_Arrays = Larg_Of_2
rslt = True
End If
If Array_L <= 0 Or Array_L > 1000000 Then
MsgBox "Array Length Must Be Greater Then 0 and Less Or Equal
to 1,000,000.", 0
rslt = False
End If
If Ran_Max = 0 Or Ran_Max > 1000000 Or Ran_Max < -1000000 Then
MsgBox "The Range Of Random Number May Not Be 0 or greater
then +/-1,000,000.", 0
rslt = False
End If
If No_soi_flag = True Then
MsgBox "Array Lengh May Not Be Greater Then 5000 to show
indexes (generates too large a report). DISABLEING SHOW INDEXES", 0
Sort_Form_Main.Check1.Value = False
SOI_Flag = Sort_Form_Main.Check1.Value
End If
End Sub
Thanks very much.
Sean Lorber
lorbe...@hotmail.com
I hate to disappoint you, but you're not the first to invent this
algorithm.
Given your level enthusiasm for discovering a new algorithm, you might
really enjoy what you'd learn in the process of getting a computer
science degree. (By coincidence, essentially the same algorithm
you describe happens to be the first thing my teacher talked about
on the first day of my first computer science course in college.)
By the way, if you'd like to read about it on your own, try
searching the web for "bucket sort". Or, you might start at
http://hissa.nist.gov/dads/HTML/bucketsort.html .
- Logan
--
Diamonds may be "forever", but graphite is more thermodynamically favorable.
You are quite correct. I am self learner, not a computer scientist
with a degree. However, my algorhithm uses 1 linked list instead of a
linked list for each bin (for duplicate numbers) so it is essentially
2x as fast (and probably faster because you don't have to dynamically
create the linked list for each new occurence) as the current
bucketsort.
Sean
TAOCP, volume 3, "Sorting and Searching", Program 5.2.5R., pages 168-177.
Mine is the 3rd edition, but it was originally published 30 years ago.
Want to see some nifty, blazing fast linear sorting algorithms to bench
against?
Look here:
http://www.nada.kth.se/~snilsson/
If you have a list of small integers, it is possible to sort them in exactly
N operations, where n is the number of elements. Here is the entire
algorithm:
#include <limits.h>
unsigned long buckets[USHRT_MAX] = {0}; /* I like to be explicit */
void sort(unsigned short *list, unsigned count)
{
unsigned i;
for (i = 0; i < count; i++)
buckets[list[i]]++;
}
Do you know how to use it?
Think about long keys...
There is a radix sort which takes n*(m+1) passes over a list of n items
where the keys require m chunks to be processed.
It takes a lot of work to reinvent the wheel. It can be great fun, but then
you discover that your wooden wagon wheel gets outperformed by a Michelin or
Pirelli.
On 17 Jan 2002 13:46:52 -0800, lorbe...@hotmail.com (Sean Lorber)
wrote:
On 17 Jan 2002 13:46:52 -0800, lorbe...@hotmail.com (Sean Lorber)
wrote:
>This algorhithm is PUBLIC DOMAIN. I require no compensation for this
>and if you really want to sort, I suggest 'quicksort' - it beats the
>running time of most algorithms - it's O(n log n). Worst case is n^2,
>but I think you'd have to give it like a sorted array in order to get
>the worst running time (if I remember correctly)
>
Let me rephrase that, if I may: "If you want to sort and know nothing
about sorting, and performance is not a critical concern, use
quicksort."
> On Thu, 24 Jan 2002 12:03:42 GMT, M.N. <balkan...@yahoo.com> wrote:
>
>>and if you really want to sort, I suggest 'quicksort' - it beats the
>>running time of most algorithms - it's O(n log n). Worst case is n^2,
>>but I think you'd have to give it like a sorted array in order to get
>>the worst running time (if I remember correctly)
>>
>
> Let me rephrase that, if I may: "If you want to sort and know nothing
> about sorting, and performance is not a critical concern, use
> quicksort."
I'm just curious... what algorithm do you use if you know something about
sorting? :-))
Cheers,
Lada
--
Ladislav Strojil
MFF, Charles University in Prague
What data type are you sorting?
If non-numeric, how expensive is it to compare two elements?
If it's numeric, what is the distribution?
Is it already partially sorted? Is it at all likely that it's fully
sorted?
If integers, how many bits?
Does all the stuff you're sorting fit in RAM? (If not, how many disks
do you have?)
Does it fit in the processor's cache?
And by the way, how many processors do you have?
- Logan
--
"I'll tell you something. Luxury disgusts me." Giorgio Armani, Jan 17, 2002
( http://dailynews.yahoo.com/h/nm/20020117/re/life_fashion_armani_dc_1.html )
> In article <a2pidj$ag2$2...@crax.cesnet.cz>,
> Ladislav Strojil <bw...@seznam.cz> wrote:
>>I'm just curious... what algorithm do you use if you know something about
>>sorting? :-))
>
> What data type are you sorting?
> If non-numeric, how expensive is it to compare two elements?
> If it's numeric, what is the distribution?
> Is it already partially sorted? Is it at all likely that it's fully
> sorted?
> If integers, how many bits?
> Does all the stuff you're sorting fit in RAM? (If not, how many disks
> do you have?)
> Does it fit in the processor's cache?
> And by the way, how many processors do you have?
I get the point. :-)) Thanks.
And this algorithmic weakness can be offset with a hybrid QuickSort,
which basically drops to something like a bubble sort for when the
recursive call reaches the point of a small list which is nearly
sorted. This keeps the algorithm closer to O(n log n) rather than the
worst case O(n^2). Again, standard CS text book stuff.
Lee.
Just because the sub-lists are small does not mean that they are nearly
sorted. For instance, if you start with distinct elements, the sublists
tend to be unordered even when small.
Jon Bentley has given a formal proof that *every* quicksort algorithm goes
perverse O(n^2) for some particular input. Actually, it's obvious when you
think about it. Using bubble-sort for the cutoff algorithm is so horrid
that it makes me ill. Surely you mean insertion sort or shellsort.
> This keeps the algorithm closer to O(n log n) rather than the
> worst case O(n^2). Again, standard CS text book stuff.
You are referring to Richard Singleton's ACM Algorithm 347 (from 1965). But
he certainly had the good sense not to use bubble sort.
Another variant is introspective sort, which switches to heapsort if it sees
that it is going bad.
Some interesting research on minimal time sorting has come up with some
interesting new algorithms. In particular, relaxed heap sort and quick heap
sort.
Does this include variants of quicksort that try very hard to choose
the pivot so that the partitions are always chosen well? I haven't
managed to find a good description of it, but there seems to be an
algorithm that can find the median of a set in linear time. If such an
algorithm exists, it seems you should be able to always have optimal
partitioning and thus be able to guarantee O(n log n).
It exists. Its called quickselect, and it was invented by (get this)
Blum, Floyd, Pratt, Rivest, and Tarjan. Its chapter 10 in CLR.
Sorry about not getting back sooner.
Are you saying the single linked list implementation is present in
this book? Foiled once more! lol
> Want to see some nifty, blazing fast linear sorting algorithms to bench
> against?
>
> Look here:
> http://www.nada.kth.se/~snilsson/
>
> If you have a list of small integers, it is possible to sort them in exactly
> N operations, where n is the number of elements. Here is the entire
> algorithm:
> #include <limits.h>
> unsigned long buckets[USHRT_MAX] = {0}; /* I like to be explicit */
> void sort(unsigned short *list, unsigned count)
> {
> unsigned i;
> for (i = 0; i < count; i++)
> buckets[list[i]]++;
> }
>
> Do you know how to use it?
>
> Think about long keys...
>
> There is a radix sort which takes n*(m+1) passes over a list of n items
> where the keys require m chunks to be processed.
>
> It takes a lot of work to reinvent the wheel. It can be great fun, but then
> you discover that your wooden wagon wheel gets outperformed by a Michelin or
> Pirelli.
Agreed. There is the problem of the linked list with identical keys.
I read however that it is possible to count the occurences of
identical keys prior to placing them in bins thus allowing you to
allocate the size of the linked list. The 1 difference I seem to of
come up with is the Single Linked List which would essentially make my
algo 2x as fast as the present Radix. Sean
That one is only average case linear time. There is another algorithm that
finds the pivot in worst case linear time.
At any rate, either of those methods is a terrible way to choose the pivot.
Randomization using the median of a sample set of possible medians based
upon the log of the set size is far better.
If you use one of those sophisticated median selection algorithms, qsort()
does indeed become deterministically O(n*log(n)).
Unfortunately, it is also slower than heapsort.
With the randomized median selection, it is much faster than heapsort, on
average.
Yes. In fact, that is the usual implementation. If you don't own a copy of
"The Art of Computer Programming" by Donald Knuth, you should definitely get
one. Sedgewick, Weiss and Skienna are also MUST HAVES.
If you want to become a valuable programming asset for the company you work
for, understanding algorithms is the best way to do that.
A lot of small memory allocations will beat the tar out of sorting speed.
For algorithms that have to be fast and which also will do a very large
number of allocations, I generally write my own suballocater.
> Yes. In fact, that is the usual implementation. If you don't own a copy of
> "The Art of Computer Programming" by Donald Knuth, you should definitely get
> one. Sedgewick, Weiss and Skienna are also MUST HAVES.
I have Knuth and Sedgewick, but I haven't heard of Weiss and
Skienna. Which book or books in particular?
For Steven S. Skiena (oops, I spelled it wrong), I have "The Algorithm
Design Manual" which is not only the best algorithm cookbook around, but it
is full of wonderful "war stories" that are definitely eye-openers. Very
similar to the style of Jon Bentley (ANOTHER must have -- anything he
writes), it is a great overview of the basic algorithm techniques.
ISBN: 0-387-94860-0.
For Mark Allen Weiss, I have a big pile of them. One which happens to be
handy is:
"Data Structures and Algorithm Analysis in C"
ISBN 0-201-49840-5
Interesting connection:
Knuth taught Sedgewick who taught Weiss (if I have my story straight).
Also, all of the above are very nice people, and I have exchanged emails
with all of them.
I have read "The Art of Computer Programming" by Donald Knuth and
unless I missed something, he suggests counting identical keys to get
the bin size prior to sorting. My algo does not need to do this. Can
you point me to a book which implements Radix with a single linked
list and no pre calculation of identical keys?
What do you mean by sub allocator?
Sean