merge-sort with alloca()-buffer

Bonita Montero

unread,

Jan 19, 2021, 11:09:11 AM1/19/21

to

I had an algorithm which needs fast sorting. So I was curious
about if I can beat std::sort with my own stable sort-function
which allocates the external memory on the stack through alloca()
up to a limit of 64kB.
Here's the code:

#pragma once
#include "disarmed-iterators.h"
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

template<typename ItType, typename IsLess>
void xstable_sort_buf( typename std::iterator_traits<ItType>::value_type
*buf, ItType begin, ItType end, IsLess const &isLess );

template<typename ItType, typename IsLess>
void xstable_sort( ItType begin, ItType end, IsLess const &isLess )
{
using namespace std;
using T = typename std::iterator_traits<ItType>::value_type;
static_assert(is_nothrow_move_constructible<T>::value, "*begin must be
nothrow move-contructrible");
static_assert(is_nothrow_move_assignable<T>::value, "*begin must be
nothrow move-assignable");
union TU
{
T t;
};
static_assert(sizeof(TU) == sizeof(T), "sizeof of a single unioned type
must be the same as the type itself");
size_t n = end - begin,
capacity = 0;
for( size_t add = n; add > 2; add -= add / 2 )
capacity += add;
if( ((char *)&*begin - (char *)&*end) <= 0x8000 )
{
T *buf = (T *)alloca( capacity * sizeof(T) );
return xstable_sort_buf( buf, begin, end, isLess );
}
vector<TU> buf;
buf.resize( capacity );
xstable_sort_buf( (T *)&buf[0], begin, end, isLess );
}

template<typename ItType, typename IsLess>
void xstable_sort_buf( typename std::iterator_traits<ItType>::value_type
*buf, ItType begin, ItType end, IsLess const &isLess )
{
using namespace std;
using T = typename std::iterator_traits<ItType>::value_type;
static_assert(is_nothrow_move_constructible<T>::value, "*begin must be
nothrow move-contructrible");
static_assert(is_nothrow_move_assignable<T>::value, "*begin must be
nothrow move-assignable");
size_t n = end - begin;
if( n <= 2 )
{
if( n == 2 && isLess( begin[1], begin[0] ) )
swap( begin[1], begin[0] );
return;
}
T *p;
ItType it;
for( p = buf, it = begin; it != end; )
new( (void *)p++ )T( move( *it++ ) );
size_t right = n / 2,
left = n - right;
xstable_sort_buf( buf + n, buf, buf + left, isLess );
xstable_sort_buf( buf + n, buf + left, buf + n, isLess );
T *pLeft = buf,
*pRight = buf + left;
T *pLeftEnd = pRight,
*pRightEnd = buf + n;
for( it = begin; ; )
{
if( isLess( *pLeft, *pRight ) )
{
*it++ = move( *pLeft );
if( ++pLeft == pLeftEnd )
{
do
*it++ = move( *pRight );
while( ++pRight != pRightEnd );
break;
}
}
else
{
*it++ = move( *pRight );
if( ++pRight == pRightEnd )
{
do
*it++ = move( *pLeft );
while( ++pLeft != pLeftEnd );
break;
}
}
}
for( p = buf; p != buf + n; ++p )
p->~T();
}

Unfortunately the code needs severely degenerated input to compete
with std::sort. Usually my code is somewhat slower than std::sort,
but nevertheless it's a good fast stable sorting algorithm which
prevents the cost of allocating the buffer for low Ns; and when
the Ns are high enough, the allocation-costs are low in contrast
to the sorting-overhead.

Bonita Montero

unread,

Jan 19, 2021, 12:28:38 PM1/19/21

to

So I changed the code a bit so that the construction of the buffer
-elements is done only once:

#pragma once
#include "disarmed-iterators.h"
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

#include <memory>

#if defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable: 6255) // alloca may oveflow stack
#endif

template<typename ItType, typename IsLess>
void xstable_sort_buf( typename std::iterator_traits<ItType>::value_type
*buf, ItType begin, ItType end, IsLess const &isLess );

template<typename ItType, typename IsLess, typename Alloc =
std::allocator<typename std::iterator_traits<ItType>::value_type>>
void xstable_sort( ItType begin, ItType end, IsLess const &isLess, Alloc
alloc = Alloc() )

{
using namespace std;
using T = typename std::iterator_traits<ItType>::value_type;
static_assert(is_nothrow_move_constructible<T>::value, "*begin must be
nothrow move-contructrible");
static_assert(is_nothrow_move_assignable<T>::value, "*begin must be
nothrow move-assignable");
union TU
{
T t;
};
static_assert(sizeof(TU) == sizeof(T), "sizeof of a single unioned type
must be the same as the type itself");
size_t n = end - begin,
capacity = 0;
for( size_t add = n; add > 2; add -= add / 2 )
capacity += add;
if( ((char *)&*begin - (char *)&*end) <= 0x8000 )
{
T *buf = (T *)alloca( capacity * sizeof(T) );

if constexpr( !is_trivial<T>::value )
for( size_t i = 0; i != capacity; ++i )
new( (void *)&buf[i] )T();

xstable_sort_buf( buf, begin, end, isLess );

for( size_t i = 0; i != capacity; ++i )
buf[i].~T();
return;
}
vector<T> buf( capacity, T(), alloc );
xstable_sort_buf( &buf[0], begin, end, isLess );

}

template<typename ItType, typename IsLess>
void xstable_sort_buf( typename std::iterator_traits<ItType>::value_type
*buf, ItType begin, ItType end, IsLess const &isLess )
{
using namespace std;
using T = typename std::iterator_traits<ItType>::value_type;
static_assert(is_nothrow_move_constructible<T>::value, "*begin must be
nothrow move-contructrible");
static_assert(is_nothrow_move_assignable<T>::value, "*begin must be
nothrow move-assignable");
size_t n = end - begin;
if( n <= 2 )
{
if( n == 2 && isLess( begin[1], begin[0] ) )
swap( begin[1], begin[0] );
return;
}
T *p;
ItType it;
for( p = buf, it = begin; it != end; )

*p++ = *it++;

size_t right = n / 2,
left = n - right;
xstable_sort_buf( buf + n, buf, buf + left, isLess );
xstable_sort_buf( buf + n, buf + left, buf + n, isLess );
T *pLeft = buf,
*pRight = buf + left;
T *pLeftEnd = pRight,
*pRightEnd = buf + n;
for( it = begin; ; )
{
if( isLess( *pLeft, *pRight ) )
{
*it++ = move( *pLeft );
if( ++pLeft == pLeftEnd )
{
do
*it++ = move( *pRight );
while( ++pRight != pRightEnd );
break;
}
}
else
{
*it++ = move( *pRight );
if( ++pRight == pRightEnd )
{
do
*it++ = move( *pLeft );
while( ++pLeft != pLeftEnd );
break;
}
}
}
}

#if defined(_MSC_VER)
#pragma warning(pop)
#endif

One problem I have currently is that when I call xstable_sort without
an allocator-argument the compiler complains about an ambigous function
-call. How can this happen ?

Severity Code Description Project File Line Suppression State
Error (active) E0308 more than one instance of overloaded function
"xstable_sort" matches the argument list: sortBench
C:\Users\Bonita\Documents\Source\sortBench\sortBench.cpp 102
Error C2668 'xstable_sort': ambiguous call to overloaded function
sortBench C:\Users\Bonita\Documents\Source\sortBench\sortBench.cpp 102

Juha Nieminen

unread,

Jan 20, 2021, 3:50:54 PM1/20/21

to

Bonita Montero <Bonita....@gmail.com> wrote:
> Unfortunately the code needs severely degenerated input to compete
> with std::sort.

It's hard to compete with introsort for the vast majority of input.
The insertion sort that it uses for small-enough partitions speeds it
up by quite a margin.

To make merge sort competitive you need to do the same, in other words,
use insertion sort for small-enough segments.

If you are using the recursive form of merge sort, if the segment to
sort is small enough, just use insertion sort instead of calling the
merge sort recursively for its two halves.

If you are using the iterative form of merge sort (ie. you start with
segments of 2 elements, then go up from there) you need to do the
insertion-sorting first for segments of the array of a smallish
size, and then run merge sort starting with that size.

Bonita Montero

unread,

Jan 21, 2021, 12:20:43 AM1/21/21

to

I thought that the ABI of my compiler might be not so effective as
managing my own recursion with my own stack iteratively. So I wrote
an iterative solution, still with the buffer on the stack and if
the buffer is too large, the buffer on the heap:

#pragma once
#include "disarmed-iterators.h"
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

#include <memory>
#include <new>
#include <memory>
#include <cstddef>
#include "ndi_t.h"
#include "xassert.h"

#if defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable: 6255) // alloca may oveflow stack

#pragma warning(disable: 26812) // prefer uncsoped enums
#endif

template<typename ItType, typename IsLess, typename Alloc =

std::allocator<char>>

void xstable_sort( ItType begin, ItType end, IsLess const &isLess, Alloc

alloc = Alloc() )

{
using namespace std;
using T = typename std::iterator_traits<ItType>::value_type;

using frame_alloc = typename allocator_traits<Alloc>::template
rebind_alloc<char>;

static_assert(is_nothrow_move_constructible<T>::value, "*begin must be
nothrow move-contructrible");
static_assert(is_nothrow_move_assignable<T>::value, "*begin must be
nothrow move-assignable");

struct frame_t
{
frame_t *prevFrame,
*nextFrame;
T *inputBegin;
size_t nInputs;
enum : std::int8_t
{
CALL = 0,
SECOND_CALL = 1,
RET = 2
} state;
T buf[1];
};
auto alignUp = []( size_t size, size_t align ) -> size_t
{
return size + align - 1 & ~(align - 1);

};
size_t n = end - begin,

frameSpace = alignof(frame_t) - 1;
for( size_t nInputs = n; nInputs > 2; nInputs -= nInputs / 2 )
frameSpace += alignUp( offsetof(frame_t, buf) + nInputs * sizeof(T),
alignof(frame_t) );
frameSpace += sizeof(frame_t);
vector<ndi_t<char>> vecBuf;
void *bufBegin;
if( frameSpace <= 0x10000 )
bufBegin = (char *)alloca( frameSpace );
else
vecBuf.resize( frameSpace ),
bufBegin = &vecBuf[0];
frame_t *firstFrame = (frame_t *)alignUp( (size_t)bufBegin,
alignof(frame_t) );
frame_t *curFrame = firstFrame,
*prevFrame = nullptr;
size_t nInputs = n;
for( ; nInputs > 2; nInputs -= nInputs / 2 )
try
{
curFrame->prevFrame = prevFrame;
T *t = curFrame->buf;
try
{

if constexpr( !is_trivial<T>::value )

for( T *t = curFrame->buf; t != curFrame->buf + nInputs; ++t )
new( (void *)t )T();
}
catch( ... )
{
while( t-- > bufBegin )
t->~T();
throw;
}
prevFrame = curFrame;
(char *&)curFrame += alignUp( offsetof(frame_t, buf) + nInputs *
sizeof(T), alignof(frame_t) );
prevFrame->nextFrame = curFrame;
}
catch( ... )
{
while( (curFrame = prevFrame) != nullptr )
for( T *t = curFrame->buf; t != curFrame->buf + curFrame->nInputs; )
t->~T();
throw;
}
curFrame->prevFrame = prevFrame;
curFrame->nextFrame = nullptr;
firstFrame->inputBegin = &*begin;
firstFrame->nInputs = n;
firstFrame->state = frame_t::CALL;
curFrame = firstFrame;
for( ; curFrame; )
{
xassert(curFrame->state >= frame_t::CALL && curFrame->state <=
frame_t::RET);
switch( curFrame->state )
{
case frame_t::CALL:
{
call:
if( curFrame->nInputs <= 2 )
{
if( curFrame->nInputs == 2 && isLess( curFrame->inputBegin[1],
curFrame->inputBegin[0] ) )
swap( curFrame->inputBegin[1], curFrame->inputBegin[0] );
curFrame = curFrame->prevFrame;
break;
}
T *inputs = curFrame->inputBegin,
*buf = curFrame->buf;
for( size_t i = 0; i != curFrame->nInputs; ++i )
buf[i] = move( inputs[i] );
size_t n = curFrame->nInputs,
left = n - n / 2;
frame_t *nextFrame = curFrame->nextFrame;
nextFrame->inputBegin = curFrame->buf;
nextFrame->nInputs = left;
curFrame->state = frame_t::SECOND_CALL;
curFrame = nextFrame;
goto call;
}
case frame_t::SECOND_CALL:
{
size_t n = curFrame->nInputs,

right = n / 2,
left = n - right;

frame_t *nextFrame = curFrame->nextFrame;
nextFrame->inputBegin = curFrame->buf + left;
nextFrame->nInputs = right;
curFrame->state = frame_t::RET;
curFrame = nextFrame;
goto call;
}
break;
case frame_t::RET:
{
size_t n = curFrame->nInputs,

right = n / 2,
left = n - right;

T *pLeft = curFrame->buf,
*pRight = pLeft + left;
T *pLeftEnd = pRight,
*pRightEnd = pLeft + n;
for( T *moveBack = curFrame->inputBegin; ; )

{
if( isLess( *pLeft, *pRight ) )
{

*moveBack++ = move( *pLeft );

if( ++pLeft == pLeftEnd )
{
do

*moveBack++ = move( *pRight );

while( ++pRight != pRightEnd );
break;
}
}
else
{

*moveBack++ = move( *pRight );

if( ++pRight == pRightEnd )
{
do

*moveBack++ = move( *pLeft );

while( ++pLeft != pLeftEnd );
break;
}
}
}

curFrame = curFrame->prevFrame;
break;
}
}
}
for( curFrame = firstFrame; curFrame->nextFrame; curFrame =
curFrame->nextFrame )
{
T *buf = curFrame->buf;
for( size_t i = 0; i != curFrame->nInputs; ++i )
buf[i].~T();
}
}

#if defined(_MSC_VER)
#pragma warning(pop)
#endif

// ndi_t:

#pragma once
#include <type_traits>

template<typename T>
union ndi_t
{
ndi_t();
ndi_t( T value );
~ndi_t();
operator T &();
private:
T m_t;
};

template<typename T>
inline
ndi_t<T>::ndi_t()
{
if constexpr( !std::is_trivially_copyable<T>::value )
new( (void *)&m_t )T();
}

template<typename T>
inline
ndi_t<T>::ndi_t( T value ) :
m_t( value )
{
}

template<typename T>
inline
ndi_t<T>::~ndi_t()
{
if constexpr( !std::is_trivially_copyable<T>::value )
m_t.~T();
}

template<typename T>
inline
ndi_t<T>::operator T &()
{
return m_t;
}

// xassert.h

#pragma once
#include <cassert>

#if !defined(assume)
#if defined(_MSC_VER)
#define assume(e) (__assume(e))
#elif defined(__GNUC__)
#define assume(e) \
{ \
if( !(e) ) \
__builtin_unreachable(); \
}
#else
#define assume(e) ((void)0)
#endif
#endif

#if !defined(xassert)
#if !defined(NDEBUG)
#define xassert(e) (assert(e))
#else
#if defined(_MSC_VER)
#define xassert(e) assume(e)
#elif defined(__GNUC__)
#define xassert(e) \
{ \
if( !(e) ) \
__builtin_unreachable(); \
}
#else
#define xassert(e) ((void)0)
#endif
#endif
#endif

Bonita Montero

unread,

Jan 21, 2021, 12:30:41 AM1/21/21

to

> It's hard to compete with introsort for the vast majority of input.

Introsort isn't stable.

Chris M. Thomasson

unread,

Jan 21, 2021, 12:31:02 AM1/21/21

to

[...]

This seems a bit similar to something I did way back:

https://pastebin.com/raw/f37a23918

Bonita Montero

unread,

Jan 21, 2021, 1:13:19 AM1/21/21

to

> This seems a bit similar to something I did way back:
> https://pastebin.com/raw/f37a23918

That's ACII-art.

Chris M. Thomasson

unread,

Jan 21, 2021, 1:29:16 AM1/21/21

to

Why? Its an older region allocator I did. Your use of offset of reminded
me of it. And your use of rang some bells:

#define RALLOC_ALIGN_UP(mp_ptr, mp_align) \
((void*)( \
(((ralloc_uintptr_type)(mp_ptr)) + ((mp_align) - 1)) \
& ~(((mp_align) - 1)) \
))

Chris M. Thomasson

unread,

Jan 21, 2021, 1:37:54 AM1/21/21

to

offsetof using in the following macro is still interesting to me:

#define RALLOC_ALIGN_OF(mp_type) \
offsetof( \
struct { \
char pad_RALLOC_ALIGN_OF; \
mp_type type_RALLOC_ALIGN_OF; \
}, \
type_RALLOC_ALIGN_OF \
)

Bonita Montero

unread,

Jan 21, 2021, 1:59:41 AM1/21/21

to

> #define RALLOC_ALIGN_UP(mp_ptr, mp_align) \
> ((void*)( \
> (((ralloc_uintptr_type)(mp_ptr)) + ((mp_align) - 1)) \
> & ~(((mp_align) - 1)) \
> ))

That's while macros aren't recommended in C++.

Chris M. Thomasson

unread,

Jan 21, 2021, 2:52:59 AM1/21/21

to

No, they are not. Iirc, I posted this code way back in 2009 in C.

Bonita Montero

unread,

Jan 21, 2021, 3:10:29 AM1/21/21

to

>> That's while macros aren't recommended in C++.

> No, they are not. ...

They aren't recommended, i.e. they should be replaced by inline-code
as most as possible. Macros tend to be cryptic and can't be debugged.

Juha Nieminen

unread,

Jan 21, 2021, 7:36:18 AM1/21/21

to

Bonita Montero <Bonita....@gmail.com> wrote:
>> It's hard to compete with introsort for the vast majority of input.
>
> Introsort isn't stable.

I was talking about speed, and gave a suggestion on how to speed up
merge sort.

Bonita Montero

unread,

Jan 21, 2021, 8:56:26 AM1/21/21

to

>>> It's hard to compete with introsort for the vast majority of input.

>> Introsort isn't stable.

> I was talking about speed, and gave a suggestion on how to speed up
> merge sort.

That's not a speedup of mergesort sicne mergesort is stable.

Bonita Montero

unread,

Jan 21, 2021, 9:26:23 AM1/21/21

to

Unfortunately MSVC doesn't shortcut setting curframe->state and
switching on it by doing a direct jump (g++ does). So I did the
jups myself with gotos.
My xstable_sort is about as fast as libstdc++'s stable_sort
without parallel execution, but by far not as fast MSVC's
stable_sort; I don't know why.

#pragma once
#include "disarmed-iterators.h"

#include <type_traits>
#include <utility>
#include <vector>
#include <memory>
#include <new>

#include <cstddef>
#include "ndi_t.h"
#include "xassert.h"

#if defined(_MSC_VER)
#pragma warning(push)

#pragma warning(disable: 6011) // dereferencing a null-pointer

#pragma warning(disable: 6255) // alloca may oveflow stack

#endif

template<typename T, typename IsLess, typename Alloc = std::allocator<char>>
void xstable_sort( T *begin, T *end, IsLess const &isLess, Alloc alloc =

Alloc() )
{
using namespace std;

using frame_alloc = typename allocator_traits<Alloc>::template
rebind_alloc<char>;
static_assert(is_nothrow_move_constructible<T>::value, "*begin must be
nothrow move-contructrible");
static_assert(is_nothrow_move_assignable<T>::value, "*begin must be
nothrow move-assignable");
struct frame_t
{

enum RET : bool
{
RET_SECOND_CALL,
RET_MERGE
};

frame_t *prevFrame,
*nextFrame;
T *inputBegin;
size_t nInputs;

RET ret;

T buf[1];
};
auto alignUp = []( size_t size, size_t align ) -> size_t
{

return size + align - 1 & -(ptrdiff_t)align;

};
size_t n = end - begin,
frameSpace = alignof(frame_t) - 1;
for( size_t nInputs = n; nInputs > 2; nInputs -= nInputs / 2 )
frameSpace += alignUp( offsetof(frame_t, buf) + nInputs * sizeof(T),
alignof(frame_t) );
frameSpace += sizeof(frame_t);
vector<ndi_t<char>> vecBuf;
void *bufBegin;
if( frameSpace <= 0x10000 )
bufBegin = (char *)alloca( frameSpace );
else
vecBuf.resize( frameSpace ),
bufBegin = &vecBuf[0];
frame_t *firstFrame = (frame_t *)alignUp( (size_t)bufBegin,
alignof(frame_t) );

firstFrame->inputBegin = &*begin;
firstFrame->nInputs = n;

frame_t *curFrame = firstFrame,
*prevFrame = nullptr;
size_t nInputs = n;
for( ; nInputs > 2; nInputs -= nInputs / 2 )
try
{
curFrame->prevFrame = prevFrame;

if constexpr( !is_trivial<T>::value )

{

T *t = curFrame->buf;
try
{

for( T *t = curFrame->buf; t != curFrame->buf + nInputs; ++t )
new( (void *)t )T();
}
catch( ... )
{
while( t-- > bufBegin )
t->~T();
throw;
}
}
prevFrame = curFrame;
(char *&)curFrame += alignUp( offsetof(frame_t, buf) + nInputs *
sizeof(T), alignof(frame_t) );
prevFrame->nextFrame = curFrame;
}
catch( ... )
{

while( (curFrame = curFrame->prevFrame) != nullptr )

for( T *t = curFrame->buf; t != curFrame->buf + curFrame->nInputs; )
t->~T();
throw;
}
curFrame->prevFrame = prevFrame;
curFrame->nextFrame = nullptr;

curFrame = firstFrame;
for( ; ; )

{
call:
{
if( curFrame->nInputs <= 2 )
{
if( curFrame->nInputs == 2 && isLess( curFrame->inputBegin[1],
curFrame->inputBegin[0] ) )
swap( curFrame->inputBegin[1], curFrame->inputBegin[0] );

if( !(curFrame = curFrame->prevFrame) )
break;
if( curFrame->ret == frame_t::RET_SECOND_CALL )
goto second_call;
else
goto merge;

}
T *inputs = curFrame->inputBegin,
*buf = curFrame->buf;
for( size_t i = 0; i != curFrame->nInputs; ++i )
buf[i] = move( inputs[i] );
size_t n = curFrame->nInputs,
left = n - n / 2;
frame_t *nextFrame = curFrame->nextFrame;
nextFrame->inputBegin = curFrame->buf;
nextFrame->nInputs = left;

curFrame->ret = frame_t::RET_SECOND_CALL;

curFrame = nextFrame;
goto call;
}

second_call:

{
size_t n = curFrame->nInputs,
right = n / 2,
left = n - right;
frame_t *nextFrame = curFrame->nextFrame;
nextFrame->inputBegin = curFrame->buf + left;
nextFrame->nInputs = right;

curFrame->ret = frame_t::RET_MERGE;

curFrame = nextFrame;
goto call;
}

merge:

{
size_t n = curFrame->nInputs,
right = n / 2,
left = n - right;
T *pLeft = curFrame->buf,
*pRight = pLeft + left;
T *pLeftEnd = pRight,
*pRightEnd = pLeft + n;
for( T *moveBack = curFrame->inputBegin; ; )
{
if( isLess( *pLeft, *pRight ) )
{
*moveBack++ = move( *pLeft );
if( ++pLeft == pLeftEnd )
{
do
*moveBack++ = move( *pRight );
while( ++pRight != pRightEnd );
break;
}
}
else
{
*moveBack++ = move( *pRight );
if( ++pRight == pRightEnd )
{
do
*moveBack++ = move( *pLeft );
while( ++pLeft != pLeftEnd );
break;
}
}
}

if( !(curFrame = curFrame->prevFrame) )
break;
if( curFrame->ret == frame_t::RET_SECOND_CALL )
goto second_call;
else
goto merge;

Juha Nieminen

unread,

Jan 21, 2021, 12:27:59 PM1/21/21

to

What do you mean? Insertion sort is stable. How does it make merge short
not stable?

Bonita Montero

unread,

Jan 21, 2021, 12:43:57 PM1/21/21

to

>>>> Introsort isn't stable.

>>> I was talking about speed, and gave a suggestion on how to speed up
>>> merge sort.

>> That's not a speedup of mergesort sicne mergesort is stable.

> What do you mean? Insertion sort is stable. How does it make merge short
> not stable?

You were talking about introsort first. And introsort is is partitially
quicksort, partititially heapsort. And neither quicksort nor heapsort
are stable.

Öö Tiib

unread,

Jan 21, 2021, 3:12:59 PM1/21/21

to

You probably meant timsort that is stable hybrid sort of merge sort and
insertion sort; introsort is different hybrid.
.

Juha Nieminen

unread,

Jan 22, 2021, 12:29:59 PM1/22/21

to

Did you read my first message at all?

You wrote that your merge sort is slower than std::sort(), and I explained
that to make merge sort faster you need to optimize it by using insertion
sort for segments that are small enough, just like what std::sort() (which
uses introsort) does.

At no point did I make any mention whatsoever about using quicksort or
heapsort. I don't understand where you are getting that from.

Bonita Montero

unread,

Jan 22, 2021, 12:59:48 PM1/22/21

to

> Did you read my first message at all?

You:

> It's hard to compete with introsort for the vast majority of input.

So introsort is a bad advice here.

Chris M. Thomasson

unread,

Jan 22, 2021, 5:22:29 PM1/22/21

to

Agreed.

Bonita Montero

unread,

Jan 23, 2021, 12:46:39 AM1/23/21

to

>>> No, they are not. ...

>> They aren't recommended, i.e. they should be replaced by inline-code
>> as most as possible. Macros tend to be cryptic and can't be debugged.

> Agreed.

There are only three reaons for macros:
* pseudo-functions that are ina header and inlined and need to be
compatible with C and C++.
* psduo-constant which can be overriden, which is impossible with
a const or constexpr-"variable":
#if !defined(LRU_FETCH_DELAY_BUFFERS)
#define LRU_DELAY_BUFFERS 200
#endif
* conditional compilation

Öö Tiib

unread,

Jan 23, 2021, 4:41:28 AM1/23/21

to

Weird that I use it mostly for fourth reason namely that big part of
metaprogramming is only available using preprocessor. Perhaps you
aren't very familiar with C++.

Bonita Montero

unread,

Jan 23, 2021, 12:11:12 PM1/23/21

to

> Weird that I use it mostly for fourth reason namely that big part
> of metaprogramming is only available using preprocessor. Perhaps
> you aren't very familiar with C++.

Metaprogramming via macros is ugly because it is non-debuggable.
So it should be avoided.

Juha Nieminen

unread,

Jan 23, 2021, 2:35:12 PM1/23/21

to

What are you talking about? You didn't read any of what I wrote in my
original message, did you?

I did not recomment using introsort. I explained *why* introsort is
faster than merge sort, and how to make merge sort faster by using
one of the same techniques, namely insertion sort for small-enough
segments.

What's so hard to undrstand in this?

I don't really understand why you are asking things in this newsgroup
if you are set on not reading the answers given to you.

Chris M. Thomasson

unread,

Jan 23, 2021, 4:08:23 PM1/23/21

to

They can be radically hard core and weird. Iirc, I had to use, iirc the
name correctly, the Chaos lib several times. I think its still in boost.

https://github.com/rofl0r/chaos-pp

https://www.boost.org/doc/libs/1_74_0/libs/preprocessor/doc/index.html

Jorgen Grahn

unread,

Jan 23, 2021, 5:39:54 PM1/23/21

to

I don't understand why you bother answering. You know, and everyone
else knows, that BM either is unable to, or doesn't want to,
communicate properly with people on Usenet. (With some rare
exceptions.)

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

daniel...@gmail.com

unread,

Jan 23, 2021, 6:04:14 PM1/23/21

to

On Saturday, January 23, 2021 at 5:39:54 PM UTC-5, Jorgen Grahn wrote:
> I don't understand why you bother answering. You know, and everyone
> else

I don't know about everyone else, but personally I've found the topics
introduced by Bonita to be worthy, and the discussions between
Bonita and Chris to be interesting. I hope these continue.

Daniel

Öö Tiib

unread,

Jan 24, 2021, 3:50:42 AM1/24/21

to

You apparently are unfamiliar with C++ programming language that
is relatively ugly engineering tool. Also you likely have no much
programming experience as it does not matter how ugly some code
is when same can't be done in some other way.

Your "non-debuggable" argument is also nonsense as debugging and
giving feedback about bugs is teethless without preprocessor. What
is code file name, function name, source code line number, time of
compiling ... all of that is available only from preprocessor in C++.

Bonita Montero

unread,

Jan 24, 2021, 4:58:06 AM1/24/21

to

> You apparently are unfamiliar with C++ programming language that

> is relatively ugly engineering tool. ...

C++ isnt gerally ugly, but metaprogramming via macros is ugly for
sure.

> Also you likely have no much
> programming experience as it does not matter how ugly some code
> is when same can't be done in some other way.

>ou cannot think logically. Because the opinion about whether
something is ugly has nothing to do with the experience with
a language.
And I've got a lot of experience with C++; I'm using C++ since
the beginning of the 90s.

> Your "non-debuggable" argument is also nonsense as debugging and
> giving feedback about bugs is teethless without preprocessor. What
> is code file name, function name, source code line number, time of
> compiling ... all of that is available only from preprocessor in C++.

A single macro is debugged in one step - you can't step inside
it. That's huge problem.

Öö Tiib

unread,

Jan 24, 2021, 6:06:29 AM1/24/21

to

On Sunday, 24 January 2021 at 11:58:06 UTC+2, Bonita Montero wrote:
> > You apparently are unfamiliar with C++ programming language that
> > is relatively ugly engineering tool. ...
>
> C++ isnt gerally ugly, but metaprogramming via macros is ugly for
> sure.

Every mature C++ library starting from implementations of standard
library of C++ is literally packed full of preprocessor usage.
It is because same can't be made without preprocessor. So if
preprocesor is ugly then C++ is ugly. Q.E.D.

> > Also you likely have no much
> > programming experience as it does not matter how ugly some code
> > is when same can't be done in some other way.
> >ou cannot think logically. Because the opinion about whether
> something is ugly has nothing to do with the experience with
> a language.
> And I've got a lot of experience with C++; I'm using C++ since
> the beginning of the 90s.

You talk about different activity. For you "programming" is
"writing code", for me it is "making software that people use".
You can choose what you write or not when the only goal is just to
write it. When the goal is to achieve useful software then
you have to pursue the needed effect regardless of how "ugly"
the code is. So your lack of experience makes you to say
nonsense.

> > Your "non-debuggable" argument is also nonsense as debugging and
> > giving feedback about bugs is teethless without preprocessor. What
> > is code file name, function name, source code line number, time of
> > compiling ... all of that is available only from preprocessor in C++.
> A single macro is debugged in one step - you can't step inside
> it. That's huge problem.

Same applies here. You are talking like from different universe.
For you "debugging" is "stepping in code with debugger".
For me it is "removing defects from software". I very rarely have
to step in code for anything and if I need then it is usually because
of being written by such debugger steppors. Good code uses unit
tests to verify its correctness and what C++ unit test framework
out there does not use preprocessor?

Bonita Montero

unread,

Jan 24, 2021, 7:01:36 AM1/24/21

to

> Every mature C++ library starting from implementations of standard
> library of C++ is literally packed full of preprocessor usage.

No, macros aren by far not used so often as in C.

> You can choose what you write or not when the only goal is just
> to write it. When the goal is to achieve useful software then
> you have to pursue the needed effect regardless of how "ugly"
> the code is. So your lack of experience makes you to say
> nonsense.

I've not a lack of experience but a better taste than you.

> For you "debugging" is "stepping in code with debugger".

No, stepping into is an important part part of debugging.
And you can't debug macros because of what I said.

daniel...@gmail.com

unread,

Jan 24, 2021, 11:49:41 AM1/24/21

to

On Sunday, January 24, 2021 at 3:50:42 AM UTC-5, Öö Tiib wrote:
On Saturday, 23 January 2021 at 19:11:12 UTC+2, Bonita Montero wrote:

> > > I use it mostly for fourth reason namely that big part
> > > of metaprogramming is only available using preprocessor.

> > Metaprogramming via macros is ugly because it is non-debuggable.

> Your "non-debuggable" argument is also nonsense

No, it isn't. Well designed macros that support metaprogramming
may be easy enough for a user to work with, without making mistakes,
but if the user actually did mistakenly omit a right parenthesis or
add an extra argument, any compiler error message would likely be
incomprehensible.

> as debugging and
> giving feedback about bugs is teethless without preprocessor. What
> is code file name, function name, source code line number, time of
> compiling ... all of that is available only from preprocessor in C++.

Irrelevant to the specific point about macros for metaprogramming.

Öö Tiib also wrote:

> Perhaps you aren't very familiar with C++.

> You apparently are unfamiliar with C++ programming language

> You likely have no much programming experience

Have you noticed that when someone has the audacity to disagree
with one of your points of view, sometimes you respond with
disparaging comments about that person, and sometimes you
don't? Depending on who it is?

Daniel

Bonita Montero

unread,

Jan 24, 2021, 12:00:10 PM1/24/21

to

> No, it isn't. Well designed macros that support metaprogramming
> may be easy enough for a user to work with, without making mistakes,
> but if the user actually did mistakenly omit a right parenthesis or
> add an extra argument, any compiler error message would likely be
> incomprehensible.

Compiler-errors aren't such a big issue, but you can't debug the
code inside a macro.

daniel...@gmail.com

unread,

Jan 24, 2021, 12:47:03 PM1/24/21

to

But if the alternative to using the macro was to write a great deal of repetitive
boilerplate code, the likelihood of a run time error could be greater. I've never
actually experienced a run-time issue as a result of using a macro that
generated boilerplate code. But I concede it could happen. But why can't
you debug that? At this point you'd just be stepping through code, the
macro is gone.

Daniel

Richard Damon

unread,

Jan 24, 2021, 1:17:55 PM1/24/21

to

I believe there are IDE's that will let you step through the macro
expansion to 'debug' them. You can also step through the assembly code
generated if it is an execution problem.

Bonita Montero

unread,

Jan 24, 2021, 1:25:41 PM1/24/21

to

> But if the alternative to using the macro was to write a great deal
> of repetitive boilerplate code, the likelihood of a run time error
> could be greater. I've never actually experienced a run-time issue
> as a result of using a macro that generated boilerplate code. But I
> concede it could happen. But why can't you debug that? At this point
> you'd just be stepping through code, the macro is gone.

The cases where you need to write code into a macro are very rare.
I had it as a macro which defines different exception-classes; the
parameters were the name of the exception-class (later determined
via RTTI to be handled to the client), the superclass of the excep-
tion-class, the exception-id (just integer, also later handled to
the client) and the string for the .what-method (also handled to
the client). That was a bit more convenient as to define an excep-
tion-class for each use.

Bonita Montero

unread,

Jan 24, 2021, 1:26:44 PM1/24/21

to

> I believe there are IDE's that will let you step through the macro
> expansion to 'debug' them. You can also step through the assembly
> code generated if it is an execution problem.

Then you've to switch to the disassemly view.
But usually you can't step inside a macro.

daniel...@gmail.com

unread,

Jan 24, 2021, 1:51:27 PM1/24/21

to

Point taken. But I don't use the debugger very much :-)

Daniel

Ian Collins

unread,

Jan 24, 2021, 2:00:38 PM1/24/21

to

CLion supports inspecting and stepping through macros.

--
Ian.

Richard Damon

unread,

Jan 24, 2021, 3:16:36 PM1/24/21

to

I could say that if you can't switch to the disassembly view to see what
is happening, can you really say that you are a fully competent programmer?

daniel...@gmail.com

unread,

Jan 24, 2021, 3:49:48 PM1/24/21

to

Is there something in Bonita's posts that suggest Bonita is not able to
switch to the disassembly view? I have a different impression from reading
other posts, where disassembly code is discussed. But there are lots of things
that we can do, but would prefer not to have to do.

Daniel

Chris M. Thomasson

unread,

Jan 24, 2021, 4:01:24 PM1/24/21

to

Well, now this reminds me of a time where a programmer, that was fluent
in x86, had to debug something. Well, he ran a mutation of his code on a
SPARC and had a deadlock. The mutation "seemingly" ran fine on a x86.
However, he only knew x86! So disassembly can mess one up if they do not
understand another architecture. Back then, I could read SPARC asm. I
knew some PPC as well. I will never forget when he asked me what the
MEMBAR instruction was all about.

Chris M. Thomasson

unread,

Jan 24, 2021, 8:07:08 PM1/24/21

to

So do I. Bonita makes me think of my past. Making me remember hard core
threading sync algorithms.

Richard Damon

unread,

Jan 24, 2021, 9:27:15 PM1/24/21

to

The fact that she said she COULDN'T debug the code in a macro. (emphasis
added). It wasn't just harder, it couldn't be done.

Then when the technique was described, it was then dismissed as
apparently something very uncommon to do.

Chris M. Thomasson

unread,

Jan 24, 2021, 9:47:42 PM1/24/21

to

I did not know that! Thanks Ian, and everyone else.

Chris M. Thomasson

unread,

Jan 24, 2021, 9:52:47 PM1/24/21

to

I have noticed a little pattern akin to that as well.

Öö Tiib

unread,

Jan 25, 2021, 1:43:25 AM1/25/21

to

On Sunday, 24 January 2021 at 18:49:41 UTC+2, daniel...@gmail.com wrote:
> On Sunday, January 24, 2021 at 3:50:42 AM UTC-5, Öö Tiib wrote:
> On Saturday, 23 January 2021 at 19:11:12 UTC+2, Bonita Montero wrote:
>
> > > > I use it mostly for fourth reason namely that big part
> > > > of metaprogramming is only available using preprocessor.
> > > Metaprogramming via macros is ugly because it is non-debuggable.
> > Your "non-debuggable" argument is also nonsense
> No, it isn't. Well designed macros that support metaprogramming
> may be easy enough for a user to work with, without making mistakes,
> but if the user actually did mistakenly omit a right parenthesis or
> add an extra argument, any compiler error message would likely be
> incomprehensible.

Frequently incomprehensible are also the compiler errors from template
metaprogramming. There was briefly nice constexpr metaprogramming
but C++17 deleted a way to handle compile-time errors (and to
have compile time diagnostics) using it. So metaprogramming of
C++ has bad diagnostics ... should it be avoided because of that?

> > as debugging and
> > giving feedback about bugs is teethless without preprocessor. What
> > is code file name, function name, source code line number, time of
> > compiling ... all of that is available only from preprocessor in C++.
> Irrelevant to the specific point about macros for metaprogramming.
> Öö Tiib also wrote:
>
> > Perhaps you aren't very familiar with C++.
> > You apparently are unfamiliar with C++ programming language
> > You likely have no much programming experience
>
> Have you noticed that when someone has the audacity to disagree
> with one of your points of view, sometimes you respond with
> disparaging comments about that person, and sometimes you
> don't? Depending on who it is?

I do not know why low experience is defamatory? We all started
there. It is better to be without experience than to have long
experience how some ugly things are without alternatves but
still pretend to be blind to it.

Jorgen Grahn

unread,

Jan 25, 2021, 2:14:52 AM1/25/21

to

I agree BM may write about interesting things, and people make serious
attempts to respond. The problem is, it always goes downhill from
there.

daniel...@gmail.com

unread,

Jan 25, 2021, 9:19:12 AM1/25/21

to

On Monday, January 25, 2021 at 2:14:52 AM UTC-5, Jorgen Grahn wrote:
> I agree BM may write about interesting things, and people make serious
> attempts to respond. The problem is, it always goes downhill from
> there.

All discussions go downhill very quickly as soon as people start feeling
the need to comment on the deficiencies of the poster as opposed to
the content of the post.

Daniel

Jorgen Grahn

unread,

Jan 26, 2021, 2:22:19 AM1/26/21

to

On Sun, 2021-01-24, Öö Tiib wrote:

> I very rarely have to step in code for anything and if I need then
> it is usually because of being written by such debugger steppors.

Same here. I don't rule out, though, that others can use stepping
efficiently. My coworkers all do it (or say they want to).

> Good code uses unit tests to verify its correctness and what C++
> unit test framework out there does not use preprocessor?

Mine -- but then I didn't focus on getting good failure reports
from it. I can tell my tests to dump core if they fail.

Juha Nieminen

unread,

Jan 26, 2021, 7:15:00 AM1/26/21

to

Or when they start making untrue claims about someone's answer, and
stubborningly refuse to back off.

Chris M. Thomasson

unread,

Jan 26, 2021, 9:09:03 PM1/26/21

to

So far, that sure seems to be a key aspect of her modus operandi. I find
it somewhat amusing when she claims what a compiler must do. Or what a
mutex must be comprised of, ect, ect...

Chris M. Thomasson

unread,

Jan 26, 2021, 9:12:37 PM1/26/21

to

On 1/26/2021 4:14 AM, Juha Nieminen wrote:

Another thing that seems odd, is that she does not seem to acknowledge
she has errors in some of her code. I found a really nasty memory
visibility error in one of her semaphores. So far, she just seems to
brush it off as if nothing happened. I have not had to time to examine
all of her code, but I found a bad one. So bad, that her code might seem
to work fine, then crash at a random time. Serious error.

Ian Collins

unread,

Jan 26, 2021, 9:23:10 PM1/26/21

to

When you attempt a discussion with someone who can't even quote
correctly, what else do you expect?

--
Ian.

Chris M. Thomasson

unread,

Jan 27, 2021, 10:46:09 PM1/27/21

to

I was trying to see if she could somehow, sort of, "change". Basically,
its been a horrible failure. Damn.