
C++ threshold for "stupid" sorting algorithms (O(n^2))


Soviet_Mario

Jan 1, 2020, 7:17:20 AM
In your experience, what is (roughly) a reasonable threshold
(n) in terms of convenience of passing from stupid
algorithms (bubblesort, gnomesort) to better and more
complex ones ?

I know this depends a lot also on HW.

But to write down an order of magnitude, I'm talking about
small data sets (some tens to at worst some hundreds, never
above 1000).

With such small figures, I was thinking that even the most
simple O(n^2) bubble or gnome could be acceptable.
But I've no experience in real comparison.

The application is not performance intensive, at least for
these operations, which would be performed only a few times


I did not mention that the algorithm HAS TO BE "stable" :
not to move equally ranked items (in order to allow tables
to be sorted by multiple keys)


--
1) Resist, resist, resist.
2) If everyone pays taxes, the taxes are paid by everyone
Soviet_Mario - (aka Gatto_Vizzato)

Bonita Montero

Jan 1, 2020, 7:25:32 AM
> In your experience, what is (roughly) a reasonable threshold (n) in
> terms of convenience of passing from stupid algorithms (bubblesort,
> gnomesort) to better and more complex ones ?

Don't use the simpler algorithms. Although they might be faster
on a few elements, their runtime doesn't count then because there
are just a few elements.

Bo Persson

Jan 1, 2020, 7:35:01 AM
On 2020-01-01 at 13:17, Soviet_Mario wrote:
> In your experience, what is (roughly) a reasonable threshold (n) in
> terms of convenience of passing from stupid algorithms (bubblesort,
> gnomesort) to better and more complex ones ?
>
> I know this depends a lot also on HW.
>

It also depends on the data that is to be sorted.

For example, if the data is already almost correctly sorted, that will
be the best case for bubblesort but the worst case for quicksort.

So, it depends.


Bo Persson



Soviet_Mario

Jan 1, 2020, 8:09:49 AM
I won't even be able to write a stable version of quick sort :\


Bo Persson

Jan 1, 2020, 9:16:48 AM
On 2020-01-01 at 14:09, Soviet_Mario wrote:
> On 01/01/2020 13:34, Bo Persson wrote:
>> On 2020-01-01 at 13:17, Soviet_Mario wrote:
>>> In your experience, what is (roughly) a reasonable threshold (n) in
>>> terms of convenience of passing from stupid algorithms (bubblesort,
>>> gnomesort) to better and more complex ones ?
>>>
>>> I know this depends a lot also on HW.
>>>
>>
>> It also depends on the data that is to be sorted.
>>
>> For example, if the data is already almost correctly sorted, that will
>> be the best case for bubblesort but the worst case for quicksort.
>
> I won't even be able to write a stable version of quick sort :\
>

No, of course not.

Just saying that there are cases where a "stupid" algorithm can be O(n)
and a "smart" one be O(n^2). And that it can be data dependent, and not
just depend on the size.

That might make it hard to find a useful general threshold.


Bo Persson

Robert Wessel

Jan 1, 2020, 10:24:33 AM
On Wed, 1 Jan 2020 13:17:09 +0100, Soviet_Mario <Sovie...@CCCP.MIR>
wrote:

>In your experience, what is (roughly) a reasonable threshold
>(n) in terms of convenience of passing from stupid
>algorithms (bubblesort, gnomesort) to better and more
>complex ones ?
>
>I know this depends a lot also on HW.
>
>But to write down an order of magnitude, I'm talking about
>small data sets (some tens to at worst some hundreds, never
>above 1000).
>
>With such small figures, I was thinking that even the most
>simple O(n^2) bubble or gnome could be acceptable.
>But I've no experience in real comparison.
>
>The application is not performance intensive, at least for
>these operations, which would be performed only a few times
>
>
>I did not mention that the algorithm HAS TO BE "stable" :
>not to move equally ranked items (in order to allow tables
>to be sorted by multiple keys)


Generally the O(n**2) sorts are considered usable up to a few tens of
items. And use insertion sort, not bubble sort (simpler algorithm,
twice as fast on average).

But you're posting this in a C++ forum. std::stable_sort should be
your first choice for such a requirement. The first question is why
*not* use the library function? If you think it's only for
containers, that's wrong, you can use std::begin and std::end to
process a normal array.
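
For instance, a minimal sketch of that (array contents made up for
illustration):

#include <algorithm>
#include <iterator>

int main()
{
    int a[] = { 3, 1, 2, 1 };
    // std::begin/std::end also work on a plain built-in array
    std::stable_sort( std::begin( a ), std::end( a ) );
}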

Soviet_Mario

Jan 1, 2020, 11:13:02 AM
On 01/01/2020 16:24, Robert Wessel wrote:
> On Wed, 1 Jan 2020 13:17:09 +0100, Soviet_Mario <Sovie...@CCCP.MIR>
> wrote:
>
>> In your experience, what is (roughly) a reasonable threshold
>> (n) in terms of convenience of passing from stupid
>> algorithms (bubblesort, gnomesort) to better and more
>> complex ones ?
>>
>> I know this depends a lot also on HW.
>>
>> But to write down an order of magnitude, I'm talking about
>> small data sets (some tens to at worst some hundreds, never
>> above 1000).
>>
>> With such small figures, I was thinking that even the most
>> simple O(n^2) bubble or gnome could be acceptable.
>> But I've no experience in real comparison.
>>
>> The application is not performance intensive, at least for
>> these operations, which would be performed only a few times
>>
>>
>> I did not mention that the algorithm HAS TO BE "stable" :
>> not to move equally ranked items (in order to allow tables
>> to be sorted by multiple keys)
>
>
> Generally the O(n**2) sorts are considered usable up to a few tens of
> items. And use insertion sort, not bubble sort (simpler algorithm,
> twice as fast on average).
>
> But you're posting this in a C++ forum. std::stable_sort should be

gulp ! I did not even know it existed ... I will try to see
if I'm able to apply it to struct data (where the sorting is to
be done by a specialized comparison operator or function)

yes a built-in function would be a lot appreciated.

I used to use vector<type> in some IDEs ... but in some
others, Qt for example, now I only find another kind of
vector template, requiring not only a type but also an
"allocator" (and I don't know what to pass as the argument;
I'd just prefer the general-purpose allocator). Dunno why
only these new vectors are provided.

> your first choice for such a requirement. The first question is why
> *not* use the library function?

I did not know about :)

> If you think it's only for
> containers, that's wrong, you can use std::begin and std::end to
> process a normal array.

I find it more readable to be able to use
operator [] (int Index)
actually
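
For what it's worth, a minimal sketch of sorting struct data with
std::stable_sort through a comparison function; the Row type and its
key field here are made up for illustration:

#include <algorithm>
#include <string>
#include <vector>

struct Row                      // hypothetical record type
{
    std::string name;
    int         key;
};

int main()
{
    std::vector<Row> rows = { { "b", 2 }, { "a", 1 }, { "c", 2 } };
    // sort by key only; stable_sort keeps equal-keyed rows in their
    // old order, which is what makes multi-key sorting in passes work
    std::stable_sort( rows.begin(), rows.end(),
        []( Row const &l, Row const &r ){ return l.key < r.key; } );
}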

Ben Bacarisse

Jan 1, 2020, 12:33:33 PM
But there might be billions of such small sorts. It's a rare situation,
but it shows that the logic you used is not sound.

--
Ben.

Christian Gollwitzer

Jan 2, 2020, 5:29:22 AM
On 01.01.20 at 13:17, Soviet_Mario wrote:
> In your experience, what is (roughly) a reasonable threshold (n) in
> terms of convenience of passing from stupid algorithms (bubblesort,
> gnomesort) to better and more complex ones ?
>
> I know this depends a lot also on HW.
> I did not mention that the algorithm HAS TO BE "stable" : not to move
> equally ranked items (in order to allow tables to be sorted by multiple
> keys)

Do mergesort: it is inherently stable, rather simple and scales well for
large sorts, too.

Christian
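
For reference, a minimal top-down sketch of the idea (buffer handling
kept deliberately naive; a serious version would reuse one scratch
buffer):

#include <algorithm>
#include <cstddef>
#include <vector>

// sorts v[lo, hi) ascending; stable because the merge takes from the
// left run on ties
void merge_sort( std::vector<int> &v, std::size_t lo, std::size_t hi )
{
    if( hi - lo < 2 )
        return;
    std::size_t mid = lo + (hi - lo) / 2;
    merge_sort( v, lo, mid );                    // sort left half
    merge_sort( v, mid, hi );                    // sort right half
    std::vector<int> tmp;
    tmp.reserve( hi - lo );
    std::size_t l = lo, r = mid;
    while( l < mid && r < hi )                   // merge the two runs
        tmp.push_back( v[r] < v[l] ? v[r++] : v[l++] );
    while( l < mid ) tmp.push_back( v[l++] );
    while( r < hi )  tmp.push_back( v[r++] );
    std::copy( tmp.begin(), tmp.end(), v.begin() + lo );
}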

Soviet_Mario

Jan 2, 2020, 1:46:34 PM
I'll look for it, tnx

Bonita Montero

Jan 2, 2020, 2:01:11 PM
>> Do mergesort: it is inherently stable, rather simple and scales well
>> for large sorts, too.
>> Christian

> I'll look for it, tnx

And there's std::stable_sort which is usually implemented with mergesort.
With parallel_policy or parallel_unsequenced_policy it even exploits all
cores.
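
Roughly like this, assuming a C++17 standard library that ships the
parallel algorithms (with GCC that also means linking against TBB):

#include <algorithm>
#include <execution>
#include <vector>

int main()
{
    std::vector<int> v = { 5, 3, 4, 1, 2 };
    // parallel and stable; std::execution::par_unseq is the
    // parallel_unsequenced_policy instance
    std::stable_sort( std::execution::par, v.begin(), v.end() );
}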

Öö Tiib

Jan 2, 2020, 6:11:52 PM
On Wednesday, 1 January 2020 18:13:02 UTC+2, Soviet_Mario wrote:
>
> I used to use vector<type> in some IDEs ... but in some
> others, Qt for example, now I only find another kind of
> vector template, requiring not only a type but also an
> "allocator" (and I don't know what to pass as the argument;
> I'd just prefer the general-purpose allocator). Dunno why
> only these new vectors are provided.

Vector has always been like that.

In the usual case (over 99% of code) we do not provide that second
argument, since it has a default value that is very rarely worth
changing. Allocators are a complex topic, so they can be learned later.

That default value is std::allocator<T> in the case of std::vector
and std::pmr::polymorphic_allocator<T> in the case of std::pmr::vector
(which has been in the standard since C++17).
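
In other words, these two declarations name the same type:

#include <memory>
#include <type_traits>
#include <vector>

std::vector<int>                      a;  // allocator defaulted
std::vector<int, std::allocator<int>> b;  // allocator spelled out

static_assert( std::is_same_v<decltype( a ), decltype( b )> );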

It is odd when an IDE somehow requires entry of an allocator; maybe
find out what you did in the settings that caused it and try to turn
the behavior off.


Bonita Montero

Jan 3, 2020, 1:37:26 AM
Here's a simple parallel mergesort in C++.
Use it as a temporary object to sort anything.

#include <algorithm>
#include <array>
#include <cassert>
#include <exception>
#include <functional>
#include <iterator>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

template<typename T>
struct invoke_on_destruct
{
private:
    T   &m_t;
    bool m_enabled;

public:
    invoke_on_destruct( T &t ) :
        m_t( t ), m_enabled( true )
    {
    }

    ~invoke_on_destruct()
    {
        if( m_enabled )
            m_t();
    }

    void invoke_and_disable()
    {
        m_t();
        m_enabled = false;
    }
};

template<typename It,
         typename Cmp = std::less<typename std::iterator_traits<It>::value_type>,
         typename Allocator = std::allocator<typename std::iterator_traits<It>::value_type>>
class merge_sort
{
private:
    using ep_allocator = typename std::allocator_traits<Allocator>::
        template rebind_alloc<std::exception_ptr>;
    using exceptions_vector = std::vector<std::exception_ptr, ep_allocator>;

public:
    merge_sort( It start, It end, Cmp const &cmp = Cmp(),
                unsigned nThreads = 0, Allocator const &alloc = Allocator() );

    struct par_exception : public std::exception
    {
        using iterator = typename exceptions_vector::iterator;
        // there's no copy-constructor because of the internal vector,
        // so catch par_exception only via reference
        iterator begin();
        iterator end();
    private:
        friend class merge_sort;
        par_exception( exceptions_vector &&exceptions );
        exceptions_vector m_exceptions;
    };

private:
    Cmp       m_cmp;
    Allocator m_alloc;
    // mutex that protects the exception-array
    std::mutex        m_excMtx;
    exceptions_vector m_exceptions;
    template<typename UpIt>
    void parRecursion( UpIt start, UpIt end, unsigned nThreads );
    template<typename UpIt, typename BufIt>
    void recursion( UpIt start, UpIt end, BufIt buf );
    template<typename UpIt, typename BufIt>
    void merge( UpIt start, BufIt leftBuf, BufIt rightBuf, BufIt bufEnd );
};

template<typename It, typename Cmp, typename Allocator>
merge_sort<It, Cmp, Allocator>::merge_sort( It start, It end, Cmp const &cmp,
    unsigned nThreads, Allocator const &alloc ) :
    m_cmp( cmp ),
    m_alloc( alloc ),
    m_exceptions( ep_allocator( alloc ) )
{
    using namespace std;
    // threads == 0 -> number of threads = hardware-threads
    unsigned hwThreads = thread::hardware_concurrency();
    hwThreads += hwThreads == 0;
    nThreads = nThreads == 0 ? hwThreads : nThreads <= hwThreads ? nThreads : hwThreads;
    // reserve number-of-threads elements in the exception_ptr-vector
    // so that there will be no exception when we do emplace_back
    m_exceptions.reserve( nThreads );
    try
    {
        parRecursion( start, end, nThreads );
    }
    catch( ... )
    {
        if( m_exceptions.size() )
        {
            // additional exception caught: throw par_exception
            m_exceptions.emplace_back( current_exception() );
            throw par_exception( move( m_exceptions ) );
        }
        else
            // single exception: rethrow it
            throw;
    }
    if( m_exceptions.size() )
        if( m_exceptions.size() > 1 )
            // multiple exceptions from threads: throw par_exception
            throw par_exception( move( m_exceptions ) );
        else
            // only one exception from threads: rethrow it
            rethrow_exception( m_exceptions[0] );
}

template<typename It, typename Cmp, typename Allocator>
template<typename UpIt>
void merge_sort<It, Cmp, Allocator>::parRecursion( UpIt start, UpIt end,
    unsigned nThreads )
{
    using namespace std;
    using T = typename iterator_traits<It>::value_type;
    size_t n = end - start;
    if( nThreads <= 1 )
    {
        size_t bs = 0;
        // calculate buffer-/stack-size
        for( size_t split = n; split > 1; bs += split, split -= split / 2 );
        vector<T, Allocator> buf( bs, T(), m_alloc );
        recursion( start, end, buf.begin() );
    }
    else
    {
        if( n < 2 )
            return;
        // split-buffer
        vector<T, Allocator> buf( start, end, m_alloc );
        // array for left-recursion and right-recursion thread
        array<thread, 2> threads;
        // automatically join threads when an exception is thrown
        auto joinThreads = [&threads]()
        {
            for( thread &thr : threads )
                if( thr.get_id() != thread::id() )
                    // try to join infinitely because the thread
                    // will continue to access our buffer
                    for( ; ; )
                        try
                        {
                            thr.join();
                            break;
                        }
                        catch( ... )
                        {
                        }
        };
        invoke_on_destruct iodJoin( joinThreads );
        // iterator-type for our split-buffer
        using BufIt = typename vector<T, Allocator>::iterator;
        // proxy thread-lambda for our new threads
        auto prProxy = [this]( BufIt start, BufIt end, unsigned nThreads )
        {
            try
            {
                parRecursion( start, end, nThreads );
            }
            catch( ... )
            {
                // remember exception in our exception-array
                unique_lock<mutex> excLock( m_excMtx );
                m_exceptions.emplace_back( current_exception() );
            }
        };
        unsigned rightThreads = nThreads / 2,
                 leftThreads  = nThreads - rightThreads;
        // if the number of threads is odd, give the left threads more input
        size_t left  = (size_t)(n * ((double)leftThreads / nThreads)),
               right = n - left;
        BufIt leftBuf    = buf.begin(),
              leftBufEnd = buf.begin() + left,
              rightBuf   = leftBufEnd,
              bufEnd     = buf.begin() + n;
        // start left thread
        threads[0] = thread( prProxy, leftBuf, leftBufEnd, leftThreads );
        if( rightThreads > 1 )
            // start right thread
            threads[1] = thread( prProxy, rightBuf, bufEnd, rightThreads );
        else
            // there's only one thread right, so we do it on our own
            parRecursion( rightBuf, bufEnd, 1 );
        // join threads
        iodJoin.invoke_and_disable();
        // stop if there are any exceptions from the threads
        unique_lock<mutex> excLock( m_excMtx );
        if( m_exceptions.size() )
            return;
        excLock.unlock();
        // merge split-buffer back into input-buffer
        merge( start, leftBuf, rightBuf, bufEnd );
    }
}

template<typename It, typename Cmp, typename Allocator>
template<typename UpIt, typename BufIt>
void merge_sort<It, Cmp, Allocator>::recursion( UpIt start, UpIt end, BufIt buf )
{
    using namespace std;
    if( end - start < 2 )
        return;
    copy( start, end, buf );
    size_t n     = end - start,
           right = n / 2,
           left  = n - right;
    BufIt leftBuf    = buf,
          leftBufEnd = buf + left,
          rightBuf   = leftBufEnd,
          bufEnd     = buf + n;
    // the sub-recursions use the scratch space that starts right
    // after this level's buffer
    recursion( leftBuf, leftBufEnd, bufEnd );
    recursion( rightBuf, bufEnd, bufEnd );
    merge( start, leftBuf, rightBuf, bufEnd );
}

template<typename It, typename Cmp, typename Allocator>
template<typename UpIt, typename BufIt>
void merge_sort<It, Cmp, Allocator>::merge( UpIt start, BufIt leftBuf,
    BufIt rightBuf, BufIt bufEnd )
{
    assert(leftBuf < rightBuf && rightBuf < bufEnd);
    BufIt leftBufEnd = rightBuf;
    for( UpIt wrtBack = start; ; )
        // take from the left run unless the right element is strictly
        // smaller; taking left on ties is what keeps the sort stable
        if( !m_cmp( *rightBuf, *leftBuf ) )
        {
            *wrtBack++ = *leftBuf;
            if( ++leftBuf == leftBufEnd )
            {
                // faster for a small number of elements than std::copy
                do
                    *wrtBack++ = *rightBuf;
                while( ++rightBuf != bufEnd );
                return;
            }
        }
        else
        {
            *wrtBack++ = *rightBuf;
            if( ++rightBuf == bufEnd )
            {
                do
                    *wrtBack++ = *leftBuf;
                while( ++leftBuf != leftBufEnd );
                return;
            }
        }
}

Soviet_Mario

Jan 3, 2020, 2:06:26 PM
do you happen to remember in which header is that
stable_sort declared ? I can't find it under namespace std::
without some proper #include ...

Öö Tiib

Jan 3, 2020, 2:12:46 PM
On Friday, 3 January 2020 21:06:26 UTC+2, Soviet_Mario wrote:
> On 02/01/20 20:00, Bonita Montero wrote:
> >>> Do mergesort: it is inherently stable, rather simple and
> >>> scales well for large sorts, too.
> >>> Christian
> >
> >> I'll look for it, tnx
> >
> > And there's std::stable_sort which is usually implemented
> > with mergesort.
> > With parallel_policy or parallel_unsequenced_policy it even
> > exploits all
> > cores.
>
>
> do you happen to remember in which header is that
> stable_sort declared ? I can't find it under namespace std::
> without some proper #include ...

It's in <algorithm>. You also need to compile with the compiler set
to C++17. That perhaps means adding

CONFIG += c++17

to your qtcreator .pro project.

David Brown

Jan 3, 2020, 3:25:32 PM
On 03/01/2020 20:06, Soviet_Mario wrote:
> On 02/01/20 20:00, Bonita Montero wrote:
>>>> Do mergesort: it is inherently stable, rather simple and scales well
>>>> for large sorts, too.
>>>> Christian
>>
>>> I'll look for it, tnx
>>
>> And there's std::stable_sort which is usually implemented with mergesort.
>> With parallel_policy or parallel_unsequenced_policy it even exploits all
>> cores.
>
>
> do you happen to remember in which header is that stable_sort declared ?
> I can't find it under namespace std:: without some proper #include ...
>

<https://en.cppreference.com/w/cpp/algorithm/stable_sort>

That site is an excellent reference for this sort of thing.

Melzzzzz

Jan 3, 2020, 4:00:44 PM
and pops up on every google search mentioning cpp.


--
press any key to continue or any other to quit...
I enjoy nothing as much as my INVALID status -- Zli Zec
We are all witnesses - about 3 years of intensive propaganda is enough to drive a nation mad -- Zli Zec
In the wild west there actually wasn't that much violence, precisely because
everyone was armed. -- Mladen Gogala

Soviet_Mario

Jan 4, 2020, 11:58:12 AM
tnx to you both

I'd have to verify whether the installed GCC supports such a recent
standard.
But I guess a proper version of <algorithm> exists also in less
recent versions, or at least I hope so.




Bonita Montero

Jan 4, 2020, 9:54:34 PM
>>> In your experience, what is (roughly) a reasonable threshold (n) in
>>> terms of convenience of passing from stupid algorithms (bubblesort,
>>> gnomesort) to better and more complex ones ?

>> Don't use the simpler algorithms. Although they might be faster
>> on a few elements, their runtime doesn't count then because there
>> are just a few elements.

> But there might be billions of such small sorts. It's a rare situation,
> but it shows that the logic you used is not sound.

Try this on your computer.

#include <iostream>
#include <chrono>
#include <iterator>
#include <functional>
#include <algorithm>
#include <vector>
#include <random>
#include <limits>
#include <cstdint>

using namespace std;
using namespace chrono;

template<typename It, typename Pred = less<typename iterator_traits<It>::value_type>>
void bubblesort( It begin, It end, Pred const &p = Pred() );

int main()
{
    using timestamp = time_point<high_resolution_clock>;
    size_t const SIZE   = 20,
                 ROUNDS = 100'000;
    vector<int> vRef( SIZE, 0 ),
                vSort( SIZE, 0 );
    mt19937_64 rg;
    uniform_int_distribution<int> uid( numeric_limits<int>::min(),
                                       numeric_limits<int>::max() );
    timestamp start;
    uint64_t bsNs, sNs;
    for( int &v : vRef )
        v = uid( rg );
    for( size_t c = 2; c <= SIZE; ++c )
    {
        start = high_resolution_clock::now();
        for( size_t r = ROUNDS; r != 0; --r )
            copy( vRef.begin(), vRef.begin() + c, vSort.begin() ),
            bubblesort( vSort.begin(), vSort.end() );
        bsNs = (uint64_t)duration_cast<nanoseconds>(
            high_resolution_clock::now() - start ).count();
        start = high_resolution_clock::now();
        for( size_t r = ROUNDS; r != 0; --r )
            copy( vRef.begin(), vRef.begin() + c, vSort.begin() ),
            sort( vSort.begin(), vSort.end() );
        sNs = (uint64_t)duration_cast<nanoseconds>(
            high_resolution_clock::now() - start ).count();
        cout << "size:\t" << c << "\tbs:\t" << bsNs / 1.0E6 << "\ts:\t"
             << sNs / 1.0E6 << endl;
    }
}

template<typename It, typename Pred>
void bubblesort( It begin, It end, Pred const &p )
{
    bool swapped = true;
    It up;
    for( ; swapped && end - begin >= 2; --end )
    {
        swapped = false;
        up = begin;
        do
            if( p( up[1], up[0] ) )
                swap( up[1], up[0] ),
                swapped = true;
        while( ++up != end - 1 );
    }
}

With this code and a current MSVC on my Ryzen 7 1800X,
bubblesort is even slower with _2_ elements!

Bonita Montero

Jan 4, 2020, 10:04:09 PM
> ...
> With this code and a current MSVC on my Ryzen 7 1800X,
> bubblesort is even slower with _2_ elements!

These are the MSVC results:

size: 2 bs: 4.3066 s: 3.5168
size: 3 bs: 3.4546 s: 3.371
size: 4 bs: 7.6116 s: 3.9207
size: 5 bs: 7.1022 s: 3.7874
size: 6 bs: 10.2507 s: 4.1151
size: 7 bs: 11.1817 s: 4.1515
size: 8 bs: 13.1037 s: 4.386
size: 9 bs: 13.4265 s: 4.63
size: 10 bs: 13.5443 s: 4.531
size: 11 bs: 19.1778 s: 5.0473
size: 12 bs: 18.9487 s: 5.2043
size: 13 bs: 17.51 s: 4.7317
size: 14 bs: 16.1673 s: 5.6115
size: 15 bs: 23.015 s: 5.9018
size: 16 bs: 24.3316 s: 6.3325
size: 17 bs: 23.1535 s: 6.5593
size: 18 bs: 24.1078 s: 6.9179
size: 19 bs: 20.7311 s: 7.6469
size: 20 bs: 21.4417 s: 7.5721

Mr Flibble

Jan 4, 2020, 10:59:40 PM
But we already know that bubblesort is slow (who doesn't) so what is your fucking point?

/Flibble

--
"Snakes didn't evolve, instead talking snakes with legs changed into snakes." - Rick C. Hodgin

“You won’t burn in hell. But be nice anyway.” – Ricky Gervais

“I see Atheists are fighting and killing each other again, over who doesn’t believe in any God the most. Oh, no..wait.. that never happens.” – Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are confronted by God," Byrne asked on his show The Meaning of Life. "What will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a world that is so full of injustice and pain. That's what I would say."

Bonita Montero

Jan 4, 2020, 11:03:48 PM
>> With this code and a current MSVC on my Ryzen 7 1800X,
>> bubblesort is even slower with _2_ elements!

> But we already know that bubblesort is slow (who doesn't) so what is
> your fucking point?

Soviet Mario said that bubblesort might be faster for a few elements.
I said that doesn't count because such small sorts have a low runtime
anyway. But Ben said that this might be relevant because there might
be billions of such sorts. And I measured that bubblesort is even
slower for a few elements; that isn't obvious.

Ben Bacarisse

Jan 4, 2020, 11:21:05 PM
Bonita Montero <Bonita....@gmail.com> writes:

>>>> In your experience, what is (roughly) a reasonable threshold (n) in
>>>> terms of convenience of passing from stupid algorithms (bubblesort,
>>>> gnomesort) to better and more complex ones ?
>
>>> Don't use the simpler algorithms. Although they might be faster
>>> on a few elements, their runtime doesn't count then because there
>>> are just a few elements.
>
>> But there might be billions of such small sorts. It's a rare situation,
>> but it shows that the logic you used is not sound.
>
> Try this on your computer.

First, I was commenting on your logic, not your conclusion. Your
conclusion -- don't use simple but slow sorts -- might be right but you
can't get to that conclusion using the logic you used.

Second, the results from your timing code will depend on the data as
well as on the algorithm. For example, on my laptop, with sorted data,
your code (slightly modified to use non-random numbers) shows bubble
sort as faster (by a factor of at least 2) for all sizes tested (you go
up to 20). I don't know if the code is measuring this correctly -- I am
just reporting on the output.

Third, the OP was not just about bubble sort. To get a sound answer
from measurements, the best simple sort, insertion sort, should be used.

<cut timing code>
--
Ben.

Bonita Montero

Jan 4, 2020, 11:26:59 PM
>> Try this on your computer.

> Second, the results from your timing code will depend on the data as
> well as on the algorithm. For example, on my laptop, with sorted data,
> your code (slightly modified to use non-random numbers) shows bubble
> sort as faster (by a factor of at least 2) for all sizes tested (you
> go up to 20). ...

That's not true. You're a liar.

> Third, the OP was not just about bubble sort.

> ... (bubblesort, gnomesort) ...

Bonita Montero

Jan 4, 2020, 11:50:49 PM
Little bug:

>     for( size_t c = 2; c <= SIZE; ++c )
>     {
>         start = high_resolution_clock::now();
>         for( size_t r = ROUNDS; r != 0; --r )
>             copy( vRef.begin(), vRef.begin() + c, vSort.begin() ),
>             bubblesort( vSort.begin(), vSort.end() );
bubblesort( vSort.begin(), vSort.begin() + c );
>         bsNs = (uint64_t)duration_cast<nanoseconds>(
> high_resolution_clock::now() - start ).count();
>         start = high_resolution_clock::now();
>         for( size_t r = ROUNDS; r != 0; --r )
>             copy( vRef.begin(), vRef.begin() + c, vSort.begin() ),
>             sort( vSort.begin(), vSort.end() );
sort( vSort.begin(), vSort.begin() + c );
>         sNs = (uint64_t)duration_cast<nanoseconds>(
> high_resolution_clock::now() - start ).count();
>         cout << "size:\t" << c << "\tbs:\t" << bsNs / 1.0E6 << "\ts:\t"
> << sNs / 1.0E6 << endl;
>     }

Before, I always measured the same block size.
With the fixed code, bubblesort is now faster up to 6 elements.

size: 2 bs: 0.3522 s: 0.9208
size: 3 bs: 0.4927 s: 1.0144
size: 4 bs: 0.9208 s: 1.6702
size: 5 bs: 1.2458 s: 1.6793
size: 6 bs: 1.6039 s: 2.1342
size: 7 bs: 2.0537 s: 1.9543
size: 8 bs: 2.4752 s: 2.6737
size: 9 bs: 3.1595 s: 3.0331
size: 10 bs: 3.7429 s: 3.187
size: 11 bs: 4.4821 s: 3.4102
size: 12 bs: 5.5326 s: 4.0687
size: 13 bs: 7.0794 s: 4.2507
size: 14 bs: 7.0796 s: 4.6087
size: 15 bs: 9.5706 s: 5.3354
size: 16 bs: 10.4491 s: 5.8877
size: 17 bs: 13.0148 s: 5.9372
size: 18 bs: 13.8979 s: 7.4023
size: 19 bs: 16.1734 s: 7.5393
size: 20 bs: 18.0571 s: 8.0887

Ben Bacarisse

Jan 5, 2020, 8:33:19 AM
Bonita Montero <Bonita....@gmail.com> writes:

>>> Try this on your computer.
>
>> Second, the results from your timing code will depend on the data as
>> well as on the algorithm. For example, on my laptop, with sorted data,
>> your code (slightly modified to use non-random numbers) shows bubble
>> sort as faster (by a factor of at least 2) for all sizes tested (you
>> go up to 20). ...
>
> That's not true. You're a liar.

That's a nasty thing to say. If I thought you were a serious
contributor, I'd be offended.

$ g++ -std=c++17 -O2 -o s s.cc
$ ./s
size: 2 bs: 3.08614 s: 8.22362
size: 3 bs: 3.13931 s: 7.59102
size: 4 bs: 3.34163 s: 8.23757
size: 5 bs: 3.09223 s: 8.09352
size: 6 bs: 3.04952 s: 7.66801
size: 7 bs: 3.59429 s: 7.60449
size: 8 bs: 3.0702 s: 7.82664
size: 9 bs: 3.06544 s: 7.61816
size: 10 bs: 3.37942 s: 7.87771
size: 11 bs: 3.06233 s: 7.57085
size: 12 bs: 3.06663 s: 7.67827
size: 13 bs: 3.35755 s: 7.61874
size: 14 bs: 3.05564 s: 7.66383
size: 15 bs: 3.06742 s: 7.66818
size: 16 bs: 3.12856 s: 7.84636
size: 17 bs: 3.0493 s: 7.60936
size: 18 bs: 3.29187 s: 7.81111
size: 19 bs: 3.07461 s: 7.88316
size: 20 bs: 3.05721 s: 7.47266

This is with an already sorted array. s.cc has this change from your
code

v = 1; //uid( rg );

BTW, I make no comment on what these number mean -- your program may be
wrong for all I know.

--
Ben.

Bonita Montero

Jan 5, 2020, 9:01:28 AM
>>> Second, the results from your timing code will depend on the data as
>>> well as on the algorithm. For example, on my laptop, with sorted data,
>>> your code (slightly modified to use non-random numbers) shows bubble
>>> sort as faster (by a factor of at least 2) for all sizes tested (you
>>> go up to 20). ...

>> That's not true. You're a liar.

> That's a nasty thing to say. If I thought you were a serious
> contributer, I'd be offended.

Try the fixed code. The fixed code gives this with gcc on my computer:

size: 2 bs: 0.7988 s: 1.2735
size: 3 bs: 0.9466 s: 1.5443
size: 4 bs: 1.2005 s: 1.6128
size: 5 bs: 1.5233 s: 1.9473
size: 6 bs: 2.353 s: 1.9982 ---
size: 7 bs: 2.2377 s: 2.365
size: 8 bs: 4.0274 s: 2.727
size: 9 bs: 3.4135 s: 3.2835
size: 10 bs: 4.171 s: 3.269
size: 11 bs: 4.7766 s: 3.7635
size: 12 bs: 5.6038 s: 4.1902
size: 13 bs: 7.9656 s: 4.0113
size: 14 bs: 8.9014 s: 4.5925
size: 15 bs: 9.9181 s: 4.7311
size: 16 bs: 11.1119 s: 4.9711
size: 17 bs: 12.4162 s: 6.0172
size: 18 bs: 14.4748 s: 6.85
size: 19 bs: 15.8686 s: 7.1291
size: 20 bs: 20.8648 s: 7.2139

Above 5 elements quicksort is faster.

Melzzzzz

Jan 5, 2020, 10:16:16 AM
I use quicksort, then insertion sort when the partition falls below 15
elements...
I think this is common knowledge.
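
A sketch of that kind of hybrid (the threshold and the naive
middle-element pivot are illustrative, not tuned):

#include <algorithm>
#include <utility>

void insertion_sort( int *lo, int *hi )
{
    for( int *i = lo + 1; i < hi; ++i )
        for( int *j = i; j > lo && j[-1] > j[0]; --j )
            std::swap( j[-1], j[0] );
}

void hybrid_sort( int *lo, int *hi )
{
    while( hi - lo > 15 )                        // threshold from above
    {
        int pivot = lo[(hi - lo) / 2];
        // three-way partition: < pivot, == pivot, > pivot
        int *mid  = std::partition( lo, hi,
                        [pivot]( int x ){ return x < pivot; } );
        int *mid2 = std::partition( mid, hi,
                        [pivot]( int x ){ return x == pivot; } );
        hybrid_sort( lo, mid );                  // recurse on the left part
        lo = mid2;                               // loop on the right part
    }
    insertion_sort( lo, hi );                    // finish small partitions
}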

Soviet_Mario

Jan 5, 2020, 3:13:39 PM
If I understood right, Ben was saying that a very short
sort could be repeated thousands of times (not meaning
sorting thousands of items, but some tens of items on
thousands of occasions, giving a multiplier on the
advantage or disadvantage).


Anyway, to sum up this thread:

I've found stable_sort in the header <algorithm>.
I did not fully understand its internals (a template with too
many parameters is obscure to me), so I have decided to recode
the ten-line Gnome Sort example (optimized version)
from Wikipedia.
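
The optimized Gnome Sort he refers to is roughly this; it is stable,
since it only swaps strictly out-of-order neighbours:

#include <cstddef>
#include <utility>

void gnome_sort( int a[], std::size_t n )
{
    // remember the position reached instead of walking all the
    // way back up after each descent
    for( std::size_t i = 1; i < n; ++i )
        for( std::size_t j = i; j > 0 && a[j - 1] > a[j]; --j )
            std::swap( a[j - 1], a[j] );
}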


Moreover, I have to adapt it to sorting rows/columns of a
rectangular matrix, where the key field is a parameter and
whole columns/rows have to be moved, not just the key field.

As I did not understand the standard algorithm version well,
I gave up on it for this more customized sorting.

Tnx to all

Ben Bacarisse

Jan 5, 2020, 5:26:17 PM
Bonita Montero <Bonita....@gmail.com> writes:

>>>> Second, the results from your timing code will depend on the data as
>>>> well as on the algorithm. For example, on my laptop, with sorted data,
>>>> your code (slightly modified to use non-random numbers) shows bubble
>>>> sort as faster (by a factor of at least 2) for all sizes tested (you
>>>> go up to 20). ...
>
>>> That's not true. You're a liar.
>
>> That's a nasty thing to say. If I thought you were a serious
>> contributer, I'd be offended.
>
> Try the fixed code. The fixed code gives this with gcc on my computer:

No apology for accusing me of lying I see.

The fixed code does not alter what I said: the simple sort is faster for
all your test sizes for some data sets -- specifically when the data are
already sorted. The factor is now smaller (no longer always greater
than 2) but still between 1.5 and 2.

--
Ben.

Bonita Montero

Jan 5, 2020, 11:40:02 PM
>> Try the fixed code. The fixed code gives this with gcc on my computer:

> No apology for accusing me of lying I see.
> The fixed code does not alter what I said: the simple sort is faster for
> all your test sizes for some data sets -- specifically when the data are
> already sorted. The factor is now smaller (no longer always greater
> than 2) but still between 1.5 and 2.

The discussion wasn't about presorted datasets and that's not usual.

Bonita Montero

Jan 5, 2020, 11:50:10 PM
> I've found stable_sort in the header <algorithm>.
> I did not fully understand its internals ...

stable_sort is usually mergesort because mergesort is usually
the fastest stable sort algorithm.

Bonita Montero

Jan 6, 2020, 12:55:34 AM
> I've found stable_sort in the header <algorithm>.
> I did not fully understand its internals ...

As I said, stable_sort is usually mergesort.
I developed a parallel mergesort some months ago.
I stripped it down to single-threaded and here's the code:

#include <iostream>
#include <iterator>
#include <vector>
#include <random>
#include <functional>
#include <memory>
#include <algorithm>
#include <limits>
#include <cassert>

template<typename It,
         typename Cmp = std::less<typename std::iterator_traits<It>::value_type>,
         typename Allocator = std::allocator<typename std::iterator_traits<It>::value_type>>
class merge_sort
{
public:
    merge_sort( It start, It end, Cmp const &cmp = Cmp(),
                Allocator const &alloc = Allocator() );

private:
    Cmp m_cmp;
    template<typename UpIt, typename BufIt>
    void recursion( UpIt start, UpIt end, BufIt buf );
    template<typename UpIt, typename BufIt>
    void merge( UpIt start, BufIt leftBuf, BufIt rightBuf, BufIt bufEnd );
};

template<typename It, typename Cmp, typename Allocator>
merge_sort<It, Cmp, Allocator>::merge_sort( It start, It end, Cmp const &cmp,
    Allocator const &alloc ) :
    m_cmp( cmp )
{
    using namespace std;
    using T = typename iterator_traits<It>::value_type;
    size_t bs = 0;
    // calculate buffer-/stack-size
    for( size_t split = end - start; split > 1; bs += split, split -= split / 2 );
    vector<T, Allocator> buf( bs, T(), alloc );
    recursion( start, end, buf.begin() );
}

template<typename It, typename Cmp, typename Allocator>
template<typename UpIt, typename BufIt>
void merge_sort<It, Cmp, Allocator>::recursion( UpIt start, UpIt end, BufIt buf )
{
    using namespace std;
    size_t n = end - start;
    if( n < 2 )
        return;
    copy( start, end, buf );
    size_t right = n / 2,
           left  = n - right;
    BufIt leftBuf    = buf,
          leftBufEnd = buf + left,
          rightBuf   = leftBufEnd,
          bufEnd     = buf + n;
    recursion( leftBuf, leftBufEnd, bufEnd );
    recursion( rightBuf, bufEnd, bufEnd );
    merge( start, leftBuf, rightBuf, bufEnd );
}

template<typename It, typename Cmp, typename Allocator>
template<typename UpIt, typename BufIt>
void merge_sort<It, Cmp, Allocator>::merge( UpIt start, BufIt leftBuf,
    BufIt rightBuf, BufIt bufEnd )
{
    assert(leftBuf < rightBuf && rightBuf < bufEnd);
    BufIt leftBufEnd = rightBuf;
    for( UpIt wrtBack = start; ; )
        // taking from the left run on ties keeps the sort stable
        if( !m_cmp( *rightBuf, *leftBuf ) )
        {
            *wrtBack++ = *leftBuf;
            if( ++leftBuf == leftBufEnd )
            {
                do
                    *wrtBack++ = *rightBuf;
                while( ++rightBuf != bufEnd );
                return;
            }
        }
        else
        {
            *wrtBack++ = *rightBuf;
            if( ++rightBuf == bufEnd )
            {
                do
                    *wrtBack++ = *leftBuf;
                while( ++leftBuf != leftBufEnd );
                return;
            }
        }
}

int main()
{
    using namespace std;
    size_t const SIZE = 1'000'000;
    vector<int> vSort( SIZE, 0 );
    mt19937_64 rg;
    uniform_int_distribution<int> uid( numeric_limits<int>::min(),
                                       numeric_limits<int>::max() );
    for( int &v : vSort )
        v = uid( rg );
    merge_sort( vSort.begin(), vSort.end() );
}

Maybe you would be able to understand what's going on.

Öö Tiib

Jan 6, 2020, 2:54:43 AM
Already sorted or close to sorted data tends to be more likely than
other permutations. But the main issue is: why do you constantly write
such blatant falsehoods? Ben specifically wrote "with sorted data":

Bonita Montero wrote:
> Ben Bacarisse wrote:
>> Ben Bacarisse wrote:
>>
>>> Try this on your computer.
>
>> Second, the results from your timing code will depend on the data as
>> well as on the algorithm. For example, on my laptop, with sorted data,
>> your code (slightly modified to use non-random numbers) shows bubble
>> sort as faster (by a factor of at least 2) for all sizes tested (you
>> go up to 20). ...
>
> That's not true. You're a liar.

You are clearly a liar yourself; Ben did not lie.

Bonita Montero

Jan 6, 2020, 2:59:18 AM
>> The discussion wasn't about presorted datasets and that's not usual.

> Already sorted or close to sorted data tends to be more likely than
> other permutations.

Give me a URL to a paper that proves this.

Öö Tiib

Jan 6, 2020, 4:02:11 AM
Miserable attempt at hiding that you were caught lying, noted.

It is just quite common with real world data that it was entered in sorted order
or that some other part of system already sorted it.
<https://en.wikipedia.org/wiki/Sorting_algorithm#Efficient_sorts>
"Second, the algorithms often perform poorly on already sorted data or almost
sorted data – these are common in real-world data, and can be sorted in O(n)
time by appropriate algorithms."

Jorgen Grahn

Jan 6, 2020, 4:35:06 AM
On Mon, 2020-01-06, Öö Tiib wrote:
> On Monday, 6 January 2020 09:59:18 UTC+2, Bonita Montero wrote:
>> >> The discussion wasn't about presorted datasets and that's not usual.
>>
>> > Already sorted or close to sorted data tends to be more likely than
>> > other permutations.
>>
>> Give me a URL to a paper that proves this.
...

> It is just quite common with real world data that it was entered in sorted order
> or that some other part of system already sorted it.
> <https://en.wikipedia.org/wiki/Sorting_algorithm#Efficient_sorts>

It's rather obvious if you think about it. If a sequence of records
doesn't appear at random, it's probably going to be sorted according
to the record's primary key, because it makes sense for the data
source to keep the records in some order.

This has other consequences, too. A system I once worked with
accepted tens of thousands of network connections/sessions, and stored
them as a std::map. This worked fine if clients connected one by one,
but if the next gateway (there was such a thing) restarted, it
reestablished the sessions in key order, and performance suffered
because the std::map got rebalanced repeatedly. We had to add a
workaround for that (I forget what the workaround was).
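
One common workaround for that access pattern (not necessarily the one
used there) is hinted insertion, which is amortized O(1) when the keys
arrive in increasing order:

#include <map>
#include <string>

int main()
{
    std::map<int, std::string> sessions;
    // keys arrive already sorted: hinting at end() skips the
    // O(log n) search for the insertion position on every insert
    for( int key = 0; key < 100000; ++key )
        sessions.emplace_hint( sessions.end(), key, "session" );
}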

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Bonita Montero

Jan 6, 2020, 4:36:12 AM
>> Give me a URL to a paper that proves this.

> Miserable attempt of hiding being caught being liar noted.
> It is just quite common with real world data that it was entered in sorted order
> or that some other part of system already sorted it.
> <https://en.wikipedia.org/wiki/Sorting_algorithm#Efficient_sorts>
> "Second, the algorithms often perform poorly on already sorted data or almost
> sorted data – these are common in real-world data, and can be sorted in O(n)
> time by appropriate algorithms."

That's not a proof on the heuristics of chosen sorting algorithms.

Öö Tiib

Jan 6, 2020, 6:33:38 AM
So what? As a system designer you should have a good idea what kind of
data from what kinds of sources it processes. So asking for formal
statistical proofs indicates cluelessness.

Bonita Montero

Jan 6, 2020, 6:59:12 AM
>>> Miserable attempt of hiding being caught being liar noted.
>>> It is just quite common with real world data that it was entered in sorted order
>>> or that some other part of system already sorted it.
>>> <https://en.wikipedia.org/wiki/Sorting_algorithm#Efficient_sorts>
>>> "Second, the algorithms often perform poorly on already sorted data or almost
>>> sorted data – these are common in real-world data, and can be sorted in O(n)
>>> time by appropriate algorithms."

>> That's not a proof on the heuristics of chosen sorting algorithms.

> So what? As a system designer you should have a good idea what kind of
> data from what kinds of sources it processes. So asking for formal
> statistical proofs indicates cluelessness.

If you imagine what purposes the one or the other algorithm might be
suitable for, you still don't find out how often the one or the other is
actually needed in real projects. So you're the clueless one here if you
suggest it is possible to find that out the way you describe.

Ben Bacarisse

Jan 6, 2020, 5:54:30 PM
You made the discussion about (a) a sweeping statement of yours that was
untrue, and (b) your calling me a liar for making a true statement.

--
Ben.

Bonita Montero

Jan 6, 2020, 10:57:17 PM
> You made the discussion about (a) a sweeping statement of yours that
> was untrue, and (b) your calling me a liar for making a true statement.

Because you're stupid enough to trust any code.

Melzzzzz

Jan 7, 2020, 2:51:25 AM
Hey, buffoon, I see you are still egomaniacal.

Tim Rentsch

Jan 10, 2020, 4:54:05 PM
Robert Wessel <robert...@yahoo.com> writes:

> Generally the O(n**2) sorts are considered usable up to a few tens of
> items. And use insertion sort, not bubble sort (simpler algorithm,
> twice as fast on average).

The simplest bubble sort is simpler than the simplest insertion
sort.

The most complete bubble sort is simpler than the most complete
insertion sort.

It is only if we compare a somewhat less complete rendition of
insertion sort to a somewhat more complete rendition of bubble
sort that "insertion sort" may be called simpler.

I agree that insertion sort is a better choice than bubble sort
in almost all cases, but saying insertion sort is simpler than
bubble sort is an apples-and-oranges comparison.

Robert Wessel

Jan 10, 2020, 5:45:29 PM
For common definitions of what bubble and insertion sorts are,
insertion sorts are simpler.

Standard versions in C:

void InsertionSort(int a[], int length)
{
    int i, j, temp;

    for (i = 1; i < length; i++)
    {
        temp = a[i];
        for (j = i - 1; j >= 0 && a[j] > temp; j--)
            a[j + 1] = a[j];
        a[j + 1] = temp;   /* j stopped one below the insertion point */
    }
}

void BubbleSort(int a[], int length)
{
    int i, j, temp;

    for (i = 0; i < length; i++)
    {
        for (j = 0; j < length - i - 1; j++)
        {
            if (a[j + 1] < a[j])
            {
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
}

There's not a lot of difference there (basically all the same lines of
code, except that the "if" in the bubble sort has been folded into
insertion sort's inner loop termination test). But there's little to
add to the insertion sort, while the bubble sort is usually enhanced
by adding a check to stop if no swapping has been done in a pass,
adding more code (and if you don't, the usual claim that bubble sort
works well on already sorted files is false). If you do the swap test
in bubble sort you can simplify the outer loop, but at a fair cost in
performance for unsorted inputs.

Enhancements to both are possible (if somewhat pointless), but often
result in what's usually considered a different sort. Using a binary
search to determine the insertion point, or working the bubble sort
from both ends result in binary insertion and cocktail shaker sorts,
for example.
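
For illustration, the binary insertion variant might look like this
(the binary search cuts the comparisons, but the element moves still
make it O(n**2)):

void BinaryInsertionSort(int a[], int length)
{
    int i, lo, hi, mid, temp;

    for (i = 1; i < length; i++)
    {
        temp = a[i];
        /* binary search for the insertion point in a[0..i) */
        lo = 0;
        hi = i;
        while (lo < hi)
        {
            mid = lo + (hi - lo) / 2;
            if (a[mid] <= temp)     /* <= keeps the sort stable */
                lo = mid + 1;
            else
                hi = mid;
        }
        /* shift the tail up and insert */
        for (hi = i; hi > lo; hi--)
            a[hi] = a[hi - 1];
        a[lo] = temp;
    }
}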

Tim Rentsch

Jan 21, 2020, 6:12:06 AM
The first function body is shorter but that doesn't mean it is
simpler. The loop invariants in the first function are more
complicated than those in the second function. Also I think
you're being unfair in how the two function bodies are written.
In most cases bubble sort is presented using a "swap()" function
to exchange two elements. There are a few other deficiencies,
relative to how the InsertionSort function is written, for the
coding in the BubbleSort function. It's easy, and I think more
usual, to present bubble sort something like this:

void BubbleSort(int a[], int length)
{
    int n, i;

    for (n = length; n > 1; n--)
        for (i = 1; i < n; i++)
            if (a[i-1] > a[i])     /* adjacent pair out of order */
                swap( a, i-1, i );
}

Writing bubble sort this way also illustrates a key difference
between the two: swap() is a simple and easy-to-understand
abstraction. There is no analogous abstraction for the key
operation in insertion sort. That lack isn't just accidental.

> But there's little to add to the insertion sort, while the bubble
> sort is usually enhanced by adding a check to stop if no swapping
> has been done in a pass, adding more code (and if you don't, the
> usual claim that bubble sort works well on already sorted files is
> false). [...]

More apples-and-oranges. Of course bubble sort can be enhanced
in that way, but the main point of doing so is to show that the
enhancement doesn't buy much. In contrast, the common follow-on
to insertion sort -- using binary search to find the insertion
point -- offers a significant theoretical (and often practical)
advantage, in that there is now an O( n log n ) bound on the
number of comparisons done. No one is arguing that bubble sort
is a better practical choice for actual coding; insertion sort
is clearly better, on several different axes. But supporting
that argument by claiming insertion sort is simpler than bubble
sort is overreaching. It's better, in the sense of making a more
convincing argument, to admit that bubble sort is simpler, and
then go on to explain that despite the relative simplicity of
bubble sort, insertion sort (or sometimes others, depending on
circumstances) is a better practical choice, for several reasons
that are compelling and also clearly defensible.

Ben Bacarisse

Jan 21, 2020, 10:04:18 AM
Tim Rentsch <tr.1...@z991.linuxsc.com> writes:

> Robert Wessel <robert...@yahoo.com> writes:
<cut>
Some people would say you can write insertion sort using only swap:

for (int i = 1; i < length; i++)
    for (int j = i; j > 0 && a[j-1] > a[j]; j--)
        swap(a, j-1, j);

and some people would not call your example bubble sort because it has
no "almost sorted" advantage. These discussions often end up a bit
ungrounded because the terms are not exact.

And if you don't want to do the insert using swaps, why not write an
insert function? In what way would that not be an easy-to-understand
abstraction? Although insert is more complex than swap, it has the
effect of cutting the complexity of the sort function in half: one loop
inserts each element into a growing sorted array; inserting each element
needs a simple loop.
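
Concretely, that split might look like this (a sketch; insert() is the
hypothetical helper described above):

/* insert a[n] into the sorted prefix a[0..n), keeping it sorted */
static void insert(int a[], int n)
{
    int temp = a[n], j;

    for (j = n; j > 0 && a[j - 1] > temp; j--)
        a[j] = a[j - 1];
    a[j] = temp;
}

void InsertionSort(int a[], int length)
{
    /* grow the sorted prefix one element at a time */
    for (int i = 1; i < length; i++)
        insert(a, i);
}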

<cut>
--
Ben.

Tim Rentsch

Jan 27, 2020, 2:52:26 PM
Yes. I too think that's reasonable. It isn't how I would
usually write an insertion sort, but presenting it as one
way of writing insertion sort is appropriate IMO.

> and some people would not call your example bubble sort because it
> has no "almost sorted" advantage.

Yes, I acknowledge that. It kind of surprised me when I first
heard it but it is obviously true (that is, that some people
would have that reaction). As a double check on my own usage I
consulted two acquaintances, with different backgrounds, and
phrasing the question as neutrally as I could. Both of them
answered the way I would (i.e., that early termination is not
essential to be termed bubble sort), but all that proves is only
that different people mean somewhat different things when they
talk about it.

> These discussions often end up a bit ungrounded because the
> terms are not exact.

Certainly they often do end up a bit ungrounded. I might say
that's largely because different people define the terms slightly
differently and also vary in how much latitude they are willing
to accept. I myself tend to fall at the more tolerant end
in this area. But the main thing is I think we agree on the key
point, that it's hard to reach a shared conclusion because people
bring with them (and sometimes unconsciously) different working
assumptions.

> And if you don't want to do the insert using swaps, why not write
> an insert function? In what way would that not be an easy-to
> understand abstraction? Although insert is more complex that
> swap, it has the effect of cutting the complexity of the sort
> function in half: one loops inserts each element into a growing
> sorted array; inserting each element needs a simple loop.

Let me try to respond to this paragraph more all-at-once than a
sentence at a time.

First I think all your statements and questions are reasonable.
Writing a separate function to perform insertions clearly can
help simplify the overall algorithm. I suppose the same sort of
thing might be doable with bubble sort, but it's less obvious
how much benefit that would bring.

I suspect a large part of the difficulty we have in reconciling
our different points of view comes from how we characterize the
different sorting algorithms and how we measure "simpleness". My
view of things like sorting algorithms tends to be conceptual: I
remember (what I think are) the essential elements, generally the
fewer the better, and forget the details because for me it's
usually easier to reconstruct them than remember them. The
central component of bubble sort (as I conceptualize it, in case
that needs saying) is comparing and possibly swapping adjacent
elements. The central component of insertion sort is finding and
moving an element into the appropriate place of an already sorted
subarray. (Sorry for the poor phrasing there.) Insertion isn't
a very complicated abstraction but it's a "larger" abstraction
than swapping - it involves a larger part of the element array,
and in a qsort-like setting needs the comparison function, which
swap does not.

Summarizing by example: for me a cocktail sort (which I learned as
a "brick and bubble" sort) is really just a bubble sort; that the
"bubbles" are going two different directions doesn't change that.
For deciding which is simpler, I think bubble sort is simpler
because I think the "central operation" (swap versus insert) is
easier to explain to an eight year old. I will grant you that both
of these views are personal and subjective. Perhaps part of
my point is to get the personalness and subjectiveness out into the
open rather than hidden behind seemingly absolute statements.

Let me run now through several revisions of a function that shows
a way of coding a sorting algorithm that I might call bubble
sort. All of these writings have the property that early
termination (that is, in time closer to O(N) than to O(N log N))
will occur for some set of inputs that are close to being sorted.
I wrote all these myself, in the order shown, but not exactly
with a clear destination in mind. So some of my comments below
have the benefit of hindsight, although the algorithms themselves
mostly don't.

(Incidental note: I use functions 'backwards()' and 'exchange()'
for purposes of comparing and swapping, respectively. This was
done to allow instrumentation and measurement.)

First, the most simple-minded version:

void
naive_bubble_sort( int a[], int n ){
    int i;

start:
    for( i = 1; i < n; i++ ){
        if( backwards( a, i-1, i ) ){
            exchange( a, i-1, i );
            goto start;
        }
    }
}

In descriptive English: look for the first out-of-order pair.
If one is found, put them in order, and simply start again from
the beginning.

(Historical note: this version of bubble sort was actually used
for a common command in a popular OS of the late 1960's and early
1970's. Fortunately it was sorting a very limited number of
items.)

I think this is great place to start an explanation of bubble
sort, because the code is both easy to understand and obviously
works correctly. IMO the 'goto' actually helps the presentation,
at least for the first version, but let's bow to structured
tradition and eliminate it. Doing this is straightforward:

void
poor_mans_bubble_sort( int a[], int n ){
    int i;

    for( i = 1; i < n; i++ ){
        if( backwards( a, i-1, i ) ){
            exchange( a, i-1, i );
            i = 0;
        }
    }
}

Now suitably goto-free, but both of these have an awful property:
the usual running time is not even O(N*N) but O(N*N*N). Terrible!
To improve that, we might observe that when the out-of-order pair
is reached, all the ones before it haven't changed and so don't
need to be looked at again. What has changed is a[i-1], so we
should go back and look at a[i-2] and a[i-1] -- assuming, that is,
that there /is/ an a[i-2]; if i is less than 2 we can simply go
forward. This ideas gives the next revision:

void
better_bubble_sort( int a[], int n ){
    int i;

    for( i = 1; i < n; i++ ){
        if( backwards( a, i-1, i ) ){
            exchange( a, i-1, i );
            if( i > 1 ) i -= 2;
        }
    }
}

This "minor" change is surprisingly effective. Not only has the
running time improved from O(N*N*N) to O(N*N), but it does early
termination quite well. On scrambled inputs it's competitive with
the Knuth version of bubble sort (sometimes better, sometimes
worse), and on "nearly sorted" inputs it does well on a larger set
of inputs than the Knuth bubble sort. It needs more explanation
to show that it's correct, which of course is usually the cost
of a fancier algorithm.

Besides being faster on more nearly sorted inputs, this version
behaves much like other bubble sorts. In particular it uses about
twice as many compares as (linear) insertion sort. The "bubbles"
are mostly "sinking" rather than "rising", but for me that doesn't
change things enough to make a difference.

(Please also see the note at the bottom regarding this function.)

We can make a further improvement by noting a similar property in
the other direction. After finding an out-of-order pair, and
dragging the smaller element down to its appropriate depth, there
is no need check the elements between there and where we first
found it. We add a variable 'next_i' to remember the high water
mark, and use 'next_i++' whenever we switch directions from
"downwards" to "upwards". Here is the code:

void
even_better_bubble_sort( int a[], int n ){
    int i = 1, next_i = 2;

    while( i < n ){
        if( backwards( a, i-1, i ) ){
            exchange( a, i-1, i );
            i = i > 1 ? i-1 : next_i++;
        } else {
            i = next_i++;
        }
    }
}

Probably because of how I arrived at this refinement, I thought of
it as an enhanced bubble sort. Then I ran my little test harness,
and was surprised to see that it does exactly the same number of
compares as (linear) insertion sort. In retrospect, naturally, I
can see that what is being done here is isomorphic to a swapping
insertion sort. But it wasn't what I was expecting when I first
coded it. I hope at least that the journey was interesting.

Two further comments. First is that I coded up all these versions
myself without consulting any references. It was only after I had
written all of them (and run the measurements) that I learned from
a friend that what I call here "better_bubble_sort()" is known
more widely as "Gnome sort". So that was serendipitous.

My closing comment gets back to comparing the two sorting methods.
If we contrast the last two functions, what I think stands out is
the extra work done to allow the skipping of comparisons between
the sink point and the high water mark. To me this difference both
shows why insertion sort runs faster than bubble sort and also
demonstrates that insertion sort is more elaborate than bubble
sort: more mechanism is needed, which requires more explanation,
but that additional mechanism is what gives insertion sort its
speed advantage over bubble sort. I admit the conclusion is both
personal and subjective. To me it still seems right because it
isn't what I set out to do - my motivation was only to present a
series of algorithms that I would all call bubble sort, to show
what I mean when I use the term. I was actually very surprised
when I saw that the last version had the same behavior as does
insertion sort.

Sorry this has ended up being so long, and again I hope it was
at least interesting.

Ben Bacarisse

Jan 29, 2020, 7:32:20 PM
Tim Rentsch <tr.1...@z991.linuxsc.com> writes:

<cut lots>

I left this a few days in the hope that I'd have something interesting
to say about it since your post was itself interesting and clearly not
trivial to write. Unfortunately I failed, but such a thoughtful post
merits a reply, even if the reply is not thoughtful.

--
Ben.

Tim Rentsch

Feb 15, 2020, 8:30:49 AM
Thanks, I appreciate the followup.

(for Ben: I sent you an email earlier this week. Did you get that
or did it fall through the cracks?)

Ben Bacarisse

Feb 15, 2020, 11:39:53 AM
Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
<cut>
> (for Ben: I sent you an email earlier this week. Did you get that
> or did it fall through the cracks?)

I replied 3 days ago.

--
Ben.