Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

why insertsort stub so long?

50 views
Skip to first unread message

fir

unread,
Jan 13, 2020, 2:10:07 PM1/13/20
to
i was writing and testing warious versions of quicksort

basic quicksort took for example 150 ms for 1e6 ints

c++ std::sort took for comparsion 111 ms

one that has one recursion call out of its two
get down to 140 ms

one with insertion sort stub for shorter partitions took a record as so far about 114 ms, (code below) but what is surprising is it works best with p[artition sizes like 70, 100 matybe more (i not tested yet very much) on shorter (like 50) it gets slowly worse and really worse for short like 10

is it proper behaviour? 70 seem a lot to me, do insertion sort should win with quicksort in that range?

my code


void QuicksortInts7(int* table, int lo, int hi)
{

if(hi-lo+1<=60) // Insertion sort when subarray small enough
{
int a, i;

for(int j=lo+1; j<=hi; j++)
{
a=table[j];

for(i=j-1; i>=0; i--)
{
if(table[i] <= a) break;
table[i+1]=table[i];
}

table[i+1]=a;
}

return;
}
// [3 5 7] 6 4 2
// 3 5 7 7
// 3 5 6 7

int i = lo;
int j = hi;

while (i < hi) {
int pivot = table[(lo+hi)/2];

while (i <= j) // Partition
{
while ( table[i] < pivot ) i=i+1;
while ( table[j] > pivot ) j=j-1;
if (i <= j)
{
int t = table[i]; table[i] = table[j]; table[j] = t;
i=i+1; j=j-1;
}
}
// Recursion for quicksort
if (lo < j) QuicksortInts7(table, lo, j);

lo=i; j=hi;
}
}

fir

unread,
Jan 13, 2020, 2:23:13 PM1/13/20
to
btw this is the best i get at the moment for sorting ints (neer much interested in sorting, also generally dont need it in codes) have you got something better (i mean piece of c source)? you may post it i can compile and test if its better

fir

unread,
Jan 13, 2020, 5:17:37 PM1/13/20
to
W dniu poniedziałek, 13 stycznia 2020 20:23:13 UTC+1 użytkownik fir napisał:
> btw this is the best i get at the moment for sorting ints (neer much interested in sorting, also generally dont need it in codes) have you got something better (i mean piece of c source)? you may post it i can compile and test if its better

by adding


if(hi-lo+1==2)
{
if(table[lo]>table[hi]) { int t=table[lo]; table[lo]=table[hi]; table[hi]=t; return; }
else return;
}

atthe begininh of insert sort part i obtained idential result as c++ std:sort (111 ms)

so it seems now it is only needed to found something that can beat it, what it could be?

jak

unread,
Jan 14, 2020, 3:48:30 AM1/14/20
to
I didn't measure, but you can try:

table[lo] ^= table[hi] ^= table[lo] ^= table[hi];

fir

unread,
Jan 14, 2020, 4:08:10 AM1/14/20
to
unfortunatelly its slower rises from 111-112 to 120 ms

as to this size some felow on stack answerd "Depending on cache size, the range is usually 32 to 96, Around 128, it's break even"

fir

unread,
Jan 14, 2020, 4:29:21 AM1/14/20
to
note btw the myth quicksort is so fast
(good example how myths work, then people like bonita or others live in his word of myths and spread lies) (hovever in thsi case i was also on pressure of this myth slighty, slightly as i would not enter the discussion "quicksort is fastest" but simply not knowing answer to the myth)

if sole quicksort takes 150 ms (basic) and 140 ms this optimised and very simple addition putting small insertsort makes it 111-112 means quicksort is not fastest

it also seems that this insertsort part (which probably is the heaviest worker here) can be improved by trying sort networks

https://stackoverflow.com/questions/4770651/what-is-the-fastest-sorting-algorithm-for-a-small-number-of-integers

i not yet tried it, maybe i will try (im not sure as sorting is a bit boring and in my case rarely used, sometimes i like to work on something that i see gives me the framerate increase (frametime drop) but in that applications i generally dont use soriting as it generally slow, but maybe i will find some )

fir

unread,
Jan 14, 2020, 6:40:20 AM1/14/20
to
i also tried radix sort from this page

https://stackoverflow.com/questions/29019318/optimizing-qsort

maybe i repeste it in case this page will be down

typedef int I32;
typedef unsigned int UI32;

I32 * RadixSort(I32 * pData, I32 * pTemp, size_t count)
{
size_t mIndex[4][256] = {0}; // index matrix
UI32 *pDst, *pSrc, *pTmp;
size_t i,j,m,n;
UI32 u;

for(i = 0; i < count; i++){ // generate histograms
u = pData[i];
for(j = 0; j < 4; j++){
if(j != 3) // signed integer handling
mIndex[j][(size_t)(u & 0xff)]++;
else
mIndex[j][(size_t)((u^0x80) & 0xff)]++;
u >>= 8;
}
}
for(j = 0; j < 4; j++){ // convert to indices
n = 0;
for(i = 0; i < 256; i++){
m = mIndex[j][i];
mIndex[j][i] = n;
n += m;
}
}
pDst = (UI32 *)pTemp; // radix sort
pSrc = (UI32 *)pData;
for(j = 0; j < 4; j++){
for(i = 0; i < count; i++){
u = pSrc[i];
if(j != 3) // signed integer handling
m = (size_t)(u >> (j<<3)) & 0xff;
else
m = (size_t)((u >> (j<<3))^0x80) & 0xff;
pDst[mIndex[j][m]++] = u;
}
pTmp = pSrc;
pSrc = pDst;
pDst = pTmp;
}
return((I32 *)pSrc);
}

it takes 27.5 ms when the unsigned version is even faster and takes stable 26 ms (and frame in my testing enironment takes 3.5 ms byself as im clear frame and draw plots of times when i test, i also each frame copy the generated random filled input table into the input again and again for sort to sort it, this radix sort dont need it so it is even yet slightly faster) so in fact the times are
24 ms and 22.5 ms,
now tat is real improvement, especially i gues integer sorting is quite usable and most critical.. (dat is my oldskul style of coding (natural c low style) i would say, when i was not so burnt out) this case makes it usable in some cases when sorting was not in play at all

0 new messages