Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

std::sort slow due to allocating and deallocing memory?

105 views
Skip to first unread message

Philipp Klaus Krause

unread,
Nov 15, 2020, 11:14:41 AM11/15/20
to
Hello. I've used C++ once in a while for a long time, but I'm not an expert.

I std::sort an std::vector of structs. The struct has a member function
used as comparison for the std::sort (it iterates through a subset fo
the entries in an std::valarray<bool> and return a value depending ont hat).

valgrind --tool=callgrind tells me that this std::sort contributes
significantly to the runtime of a function I want to make faster. I also
see that most of the runtime of the std::sort is spent inside operator
delete(void *) and operator new(unsigned long). For about 4000 calls to
std::sort I see about 90000 calls to delete and new each.
Why does std::sort need to allocate and deallocate memory that often?
Can I do something about it? What can I do to speed up this std::sort?

Philipp

P.S.: In
https://sourceforge.net/p/sdcc/code/HEAD/tree/branches/lospre-vs-mcpre/sdcc/src/SDCClospre.hpp,
the compare function is in lines 55-69, the calls to sort are in lines
224, 291 and 292.

Mr Flibble

unread,
Nov 15, 2020, 11:45:22 AM11/15/20
to
What makes you think std::sort is allocating memory as opposed to the elements being sorted?

/Flibble

--
¬

Bonita Montero

unread,
Nov 15, 2020, 11:45:47 AM11/15/20
to
std::sort shouldn't allocate memory by itself because it usually
uses quicksort, which is an in-place sort. It even hasn't an allo-
caor-parameter. It might be that the items you sort allocate and
deallocate memory when being moved / copied.
The allocation-method which usually allocates memory is stable_sort,
which usually uses mergesort.

Philipp Klaus Krause

unread,
Nov 15, 2020, 12:11:32 PM11/15/20
to
Am 15.11.20 um 17:45 schrieb Bonita Montero:
> std::sort shouldn't allocate memory by itself because it usually
> uses quicksort, which is an in-place sort. It even hasn't an allo-
> caor-parameter. It might be that the items you sort allocate and
> deallocate memory when being moved / copied.

Hm, "moveed". Apparently C++ has something called move constructors. And
sometimes the compiler creates them automatically. Of course, I have no
idea if the compiler created move constructors here, or if I should
provide one or if move constructors are even relevant to my problem of
speeding up the std::sort.

The things being sorted (lines 50 to 69 of
https://sourceforge.net/p/sdcc/code/HEAD/tree/branches/lospre-vs-mcpre/sdcc/src/SDCClospre.hpp)
are structs containing three members: An boost::tuple<float, float>, an
std::valarray<bool> and a comparison function.

Philipp

Philipp Klaus Krause

unread,
Nov 15, 2020, 12:33:02 PM11/15/20
to
Am 15.11.20 um 17:45 schrieb Mr Flibble:
>>
>> P.S.: In
>> https://sourceforge.net/p/sdcc/code/HEAD/tree/branches/lospre-vs-mcpre/sdcc/src/SDCClospre.hpp,
>>
>> the compare function is in lines 55-69, the calls to sort are in lines
>> 224, 291 and 292.
>
> What makes you think std::sort is allocating memory as opposed to the
> elements being sorted?
>
> /Flibble
>

Well, I know my compare function doesn't allocate memory, and valgrind
shows me that somehow through std::sort there are calls to delete and new.
My background is mostly C. If I'd see calls to malloc and free that
somehow happens through qsort, and my compare functions doesn't allocate
memory, it must be qsort.

Maybe I should just try to replace the std::sort by qsort to get rid of
the memory allocation stuff?

Philipp

Richard Damon

unread,
Nov 15, 2020, 1:06:43 PM11/15/20
to
One thing to take a close look at is what you are actually sorting.
These will need to be moved from one place to another, and this might be
done using placement-new/move-constructors, which might LOOK like memory
allocation, or the classes move-constructor might be falling back to a
copy constructor and allocating memory.

qsort, since it moves with memcpy isn't suitable for sorting 'object'
that have constructors, but only pointers to them. This also points to
the fact that sorting an array of pointers will also be quicker to do in
many cases to sorting the objects themselves (if the objects have
significant size), but does cost and additional indirection and might be
a bit less cache friendly.

Philipp Klaus Krause

unread,
Nov 15, 2020, 1:33:12 PM11/15/20
to
Am 15.11.20 um 19:06 schrieb Richard Damon:
I just did a quick test comparing std::sort to qsort by sorting an
std::vector of 1000 of these:

#define GLOBAL_SIZE 5

struct s
{
boost::tuple<float, float> s;
std::valarray<bool> global;
int size;

bool operator <(const struct s& s) const
{
for(int i = 0; i < GLOBAL_SIZE; i++)
{
if (global[i] < s.global[i])
return(true);
if (global[i] > s.global[i])
return(false);
}
return(false);
}
};

std::sort took about 2.5 times as long as qsort. So I guess the bottom
line is that std::sort can be fast (as can be seen from various websites
where people claim that std::sort is faster than qsort by sorting arrays
of ints or floats), but for anything nontrivial to be sorted, will take
extra work to do so.

Any recommendatiosn on what I need to do to get fast sorting? Write a
move constructor? Write a custom swap function? Anything else?
https://stackoverflow.com/questions/14212701/stdsort-does-not-always-call-stdswap
looks like there are multiple things to implement just to get efficient
sorting.

David Brown

unread,
Nov 15, 2020, 2:07:02 PM11/15/20
to
Your structure here has non-trivial constructors and destructors - it
contains a "valarray" which is a dynamically allocated array. Each time
the sort algorithm needs to move these "struct s" structures around,
it's going to have to create and destroy them.

As it stands, this is the only correct way to do it - movement using
memcpy may be invalid. C-style "qsort" can only beat std::sort because
it ignores the requirements of the C class.

> Any recommendatiosn on what I need to do to get fast sorting? Write a
> move constructor? Write a custom swap function? Anything else?
> https://stackoverflow.com/questions/14212701/stdsort-does-not-always-call-stdswap
> looks like there are multiple things to implement just to get efficient
> sorting.
>

Custom swap and move constructors are definitely a help here. Another
obvious choice is that you should not use "valarray" when you know the
size of the array - use "std::array" which has the convenience of
working like a C++ container, but the efficiency of a fixed-size C array.

You should also consider whether you want an array here at all. The
code you have presented might be a simplification, but it is an
extremely inefficient way to hold an array of 5 booleans. Your
std::valarray will be 16 bytes in size (judging from a quick gcc test),
with a pointer to a heap-allocated block for the actual data. The
obvious way to store 5 booleans is a single uint8_t with one bit per
item. This would be massively simpler and faster, reducing your
comparison to a single compare (rather than a loop) without any
indirection, and making your struct trivially copyable.

And if you really need such a big and complex structure, then have an
array of pointers to the structs, and sort the pointer array - that will
be much faster in C qsort or C++ std::sort.

Melzzzzz

unread,
Nov 15, 2020, 5:35:13 PM11/15/20
to
valarray probably allocates during swap.

>
> Philipp


--
current job title: senior software engineer
skills: c++,c,rust,go,nim,haskell...

press any key to continue or any other to quit...
U ničemu ja ne uživam kao u svom statusu INVALIDA -- Zli Zec
Svi smo svedoci - oko 3 godine intenzivne propagande je dovoljno da jedan narod poludi -- Zli Zec
Na divljem zapadu i nije bilo tako puno nasilja, upravo zato jer su svi
bili naoruzani. -- Mladen Gogala

Philipp Klaus Krause

unread,
Nov 16, 2020, 3:34:00 AM11/16/20
to
Am 15.11.20 um 20:06 schrieb David Brown:
>
> Custom swap and move constructors are definitely a help here.

Implementing a custom swap (that just invokes std::swap on all members)
indeed made std::sort faster than qsort for the example. And
implementing a move constructor and amove assignment operator imporved
it a little bit more.

> Another
> obvious choice is that you should not use "valarray" when you know the
> size of the array - use "std::array" which has the convenience of
> working like a C++ container, but the efficiency of a fixed-size C array.

I made the example for performance testing. In the code I really want to
optimize, the size of the valarray is small most of the time, but can
change at runtime. Also the use of valarray instead of vector is a
previously applied optimization, as profiling showed an |= on valarray
to be faster than iterating through the std::vector that was used before.

Juha Nieminen

unread,
Nov 16, 2020, 4:01:20 AM11/16/20
to
Philipp Klaus Krause <p...@spth.de> wrote:
> Why does std::sort need to allocate and deallocate memory that often?

As others have pointed out, it doesn't. It's your own class that's
allocating and deallocating memory each time elements are being copied
or swapped around.

Philipp Klaus Krause

unread,
Nov 16, 2020, 4:11:58 AM11/16/20
to
Am 16.11.20 um 09:33 schrieb Philipp Klaus Krause:
> Am 15.11.20 um 20:06 schrieb David Brown:
>>
>> Custom swap and move constructors are definitely a help here.
>
> Implementing a custom swap (that just invokes std::swap on all members)
> indeed made std::sort faster than qsort for the example. And
> implementing a move constructor and amove assignment operator imporved
> it a little bit more.
>

The custom swap and implementing moves did speed up the simplified
example, but slowed down the case I really want to optimize.

I'll try sorting an array of pointers next, to see if the possible speed
up on sorting is worth the cost of the extra indirection on access.

David Brown

unread,
Nov 16, 2020, 4:31:48 AM11/16/20
to
If the size of the array is limited by a small compile-time known value,
then it will be more efficient to use a std::array of the maximum size
than a variable sized array. A valarray is easily going to have an
overhead of a hundred bytes or more, with the indirection, handling of
the memory allocation, minimum block size of the heap allocator, etc.
And if you really have an array of bools, you can have a rather large
fixed-size array before the valarray leads to any savings.

Philipp Klaus Krause

unread,
Nov 16, 2020, 4:58:30 AM11/16/20
to
Am 16.11.20 um 10:31 schrieb David Brown:
>
> If the size of the array is limited by a small compile-time known value,
> then it will be more efficient to use a std::array of the maximum size
> than a variable sized array. A valarray is easily going to have an
> overhead of a hundred bytes or more, with the indirection, handling of
> the memory allocation, minimum block size of the heap allocator, etc.
> And if you really have an array of bools, you can have a rather large
> fixed-size array before the valarray leads to any savings.
>

Performance numbers on that question look clear: Going back from
std:.valarry to std::vector increases runtime of the whole algorithm by
about 80%. That is much more than what could be compensated by faster
sorting.

David Brown

unread,
Nov 16, 2020, 5:05:10 AM11/16/20
to
This is always worth doing when you have big objects, even if they don't
need constructing and destructing. The extra indirection is negligible
compared to the big blocks you are copying, and the indirection you
already have for your valarray's. The beauty of C++ is you can wrap
things in classes so that the inconvenience of the indirection is hidden
from the code that uses it.

David Brown

unread,
Nov 16, 2020, 5:07:42 AM11/16/20
to
No one is suggesting using std::vector. I am suggesting std::array,
which is a /completely/ different beast. And if your array really is
just holding booleans (I don't know if that's just the simplified
example code or from the real code), then you can use a single uint of
appropriate size.

Bonita Montero

unread,
Nov 16, 2020, 7:27:33 AM11/16/20
to
> The things being sorted (lines 50 to 69 of
> https://sourceforge.net/p/sdcc/code/HEAD/tree/branches/lospre-vs-mcpre/sdcc/src/SDCClospre.hpp)
> are structs containing three members: An boost::tuple<float, float>, an
> std::valarray<bool> and a comparison function.

I'm not going to analyze your code. But there must be
allocations you made on your own. std::sort-implementations
usually don't allocate any memory.

Öö Tiib

unread,
Nov 16, 2020, 7:28:35 AM11/16/20
to
For compile-time-sized array of truth values the std::bitset can be
best. Doing std:qsort with something that is not array of
TrivialType the behavior is undefined so I do not understand
how OP compared std::sort with std::qsort.

David Brown

unread,
Nov 16, 2020, 7:59:45 AM11/16/20
to
I guess a std::bitset will be as efficient as a uint of the same size,
with the convenience of access operators and members as a container. As
far as I can tell, using "to_ulong" will let him do the ordering
function as a single comparison.


Bo Persson

unread,
Nov 16, 2020, 8:02:15 AM11/16/20
to
Here you run into another special case (which C++ seems full of :-)).
std::vector<bool> is optimized for space (by storing bits instead of
bools), so it is expected to be slower, but possibly use fewer bytes.



Bo Persson

Bonita Montero

unread,
Nov 16, 2020, 10:33:21 AM11/16/20
to
Your objects are not only compared but actually also copied or moved.
Depending on if you've defined a proper move-constructor and move
-assignment-operator or there aren't any copy-constructors od copy
assingment-operators that override the default-implementations there
won't be any allocations.

Jorgen Grahn

unread,
Nov 16, 2020, 3:03:51 PM11/16/20
to
On Sun, 2020-11-15, David Brown wrote:
> On 15/11/2020 19:32, Philipp Klaus Krause wrote:
...
>> struct s
>> {
>> boost::tuple<float, float> s;
>> std::valarray<bool> global;
>> int size;
...
>> };
...

> Custom swap and move constructors are definitely a help here.

I probably should know this, but doesn't the default move
constructor move the members one by one, if possible?

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Vir Campestris

unread,
Nov 16, 2020, 4:50:00 PM11/16/20
to
... and the way to speed it up is to make a vector of pointers to the
structs, and sort that.

_Then_ move the objects into their new locations.

Andy

Juha Nieminen

unread,
Nov 17, 2020, 2:40:01 AM11/17/20
to
Vir Campestris <vir.cam...@invalid.invalid> wrote:
> ... and the way to speed it up is to make a vector of pointers to the
> structs, and sort that.
>
> _Then_ move the objects into their new locations.

If you have a vector of pointers where each pointer essentially means
"the object I'm pointing to should go to my position in the vector",
moving the pointed-to elements in the original array to the same
positions as the pointers sounds like a rather non-trivial task.
At least if you want to do it in-place. I suppose that if it's ok
to move the elements to a new vector, then it's easy. However, now it's
not in-place sorting anymore, and every time you sort you'll be creating
a new dynamically allocated array.

Philipp Klaus Krause

unread,
Nov 17, 2020, 8:24:18 AM11/17/20
to
Am 16.11.20 um 11:07 schrieb David Brown:
>>
>> Performance numbers on that question look clear: Going back from
>> std:.valarry to std::vector increases runtime of the whole algorithm by
>> about 80%. That is much more than what could be compensated by faster
>> sorting.
>>
>
> No one is suggesting using std::vector. I am suggesting std::array,
> which is a /completely/ different beast. And if your array really is
> just holding booleans (I don't know if that's just the simplified
> example code or from the real code), then you can use a single uint of
> appropriate size.
>

The size of the std::valarray isn't known before runtime. Often it is
small, but sometimes it can also be 300 or so.

Philipp Klaus Krause

unread,
Nov 17, 2020, 8:27:18 AM11/17/20
to
Am 16.11.20 um 13:59 schrieb David Brown:
> I guess a std::bitset will be as efficient as a uint of the same size,
> with the convenience of access operators and members as a container. As
> far as I can tell, using "to_ulong" will let him do the ordering
> function as a single comparison.

In the non-simplified code, there is also a mask (a
boost::container::flat_set<unsigned int>) involved, that selects, which
entries of the std::valarray participate in the comparison. I removed it
in the simplified code, since qsort seemed easier when not using a mask.

David Brown

unread,
Nov 17, 2020, 4:14:05 PM11/17/20
to
Again, you have more and more big, complex and time-consuming structures
that have constructors, destructors and allocated memory resources - and
you are surprised that it takes time to move them around when sorting.

You have a background in small microcontroller systems. Think like an
embedded programmer. How would you be doing all this in C on an 8051?
You would /not/ be using a big, bulky general containers like this.

You need a set of values - generally a small number, but occasionally a
big number up to 300 - with the value type being bits.

You need a mask of bits to say which of these you have in your comparisons.


The way to implement this is with bit arrays. You can use uint64_t, or
smaller sizes, or you can use std::bitset if you prefer. And you use
the equivalent of a "small string optimisation" - your class holds
enough of the data for typical use, but allows dynamic allocation for
big cases. I've given a sample here to get started - it's obviously
missing a constructor and destructor, a swap specialisation, and a
method of changing the size (thus adding to the number of bits). It's a
nice cache-friendly 32-byte size (for 64-bit systems) - this cache
friendliness will outweigh the cost of swapping 32 bytes compared to
using an array of pointers.

#include <bitset>
#include <tuple>
#include <array>

const int chunk_size = 64;
const int big_chunks = 7;

struct S {
using chunk = std::bitset<chunk_size>;
struct extended {
std::array<chunk, big_chunks> big_data;
std::array<chunk, big_chunks> big_mask;
};

std::tuple<float, float> data; // or a pointer to data, or whatever suits
chunk small_data;
chunk small_mask;
extended * pe;

bool operator < (const S& s) {
if ((small_data & small_mask).to_ulong() < (s.small_data &
s.small_mask).to_ulong()) return true;
if ((small_data & small_mask).to_ulong() > (s.small_data &
s.small_mask).to_ulong()) return false;
if (!pe && !s.pe) return false; // equal
if (!pe) return true;
if (!s.pe) return false;

for (auto i = 0; i < big_chunks; i++) {
if ((pe->big_data[i] & pe->big_mask[i]).to_ulong() <
(s.pe->big_data[i] & s.pe->big_mask[i]).to_ulong()) return true;
if ((pe->big_data[i] & pe->big_mask[i]).to_ulong() >
(s.pe->big_data[i] & s.pe->big_mask[i]).to_ulong()) return
false;
}

return false; // equal
}
};

extern const int ss = sizeof(S);
extern const int sse = sizeof(S::extended);

bool foo(S& a, S& b) {
return a < b;
}

Vir Campestris

unread,
Nov 17, 2020, 4:33:18 PM11/17/20
to
But his problem seems to be that he's copying his non-trivial objects
lots of times as the sort runs.

This way with a bit of faffing about the pointers get copied lots of
times, but the objects only get copied once.

Andy

Tim Rentsch

unread,
Nov 17, 2020, 10:35:23 PM11/17/20
to
Juha Nieminen <nos...@thanks.invalid> writes:

> Vir Campestris <vir.cam...@invalid.invalid> wrote:
>
>> ... and the way to speed it up is to make a vector of pointers to the
>> structs, and sort that.
>>
>> _Then_ move the objects into their new locations.
>
> If you have a vector of pointers where each pointer essentially means
> "the object I'm pointing to should go to my position in the vector",
> moving the pointed-to elements in the original array to the same
> positions as the pointers sounds like a rather non-trivial task.
> At least if you want to do it in-place. [...]

Writing in C, it took me 14 lines of code. Of course it might
end up being longer in C++.

Juha Nieminen

unread,
Nov 19, 2020, 9:42:01 AM11/19/20
to
And I'm sure you can do it in 2 lines of code in Haskell. The length of
the implementation is not the point. The point is the implementation itself.
*How* do you sort the array in-place, when you have another array of
pointers where each pointer is located in the position where the object
it's pointing to needs to go?

Suppose the first pointer points to the 10th element in the original array.
So you need to move the 10th element to the first position. But the first
position in the original array already has an element which is not yet in
its proper position. You would need to move the first element of the
original array to its correct position first. How? (And even if you were
to find out where it should go, also in that position there will be an
element that needs to be moved to its correct position first, and so on.)

Care to share the algorithm instead of bragging about how many lines of
code of C you can implement it with?

Richard Damon

unread,
Nov 19, 2020, 10:05:33 AM11/19/20
to
The simplest method moves every element twice. You buld a new array in
the sorted order using your table of pointers, and then copy it back.

A more complicated method is start at the first element, move it to a
temporay location, then follow the cycle moving each the object that
goes into that location into it, then follow with the newly open
location (and mark the sorting array as that move having been done).
When you finally get to the location that first element wanted to go
into it, you put it in place, and look for the first eleent that needs
to move again. This is more complicated, but only needs one temp
location and one additional move per cycle in the sort list.

Öö Tiib

unread,
Nov 19, 2020, 6:43:24 PM11/19/20
to
On Thursday, 19 November 2020 at 17:05:33 UTC+2, Richard Damon wrote:
>
> A more complicated method is start at the first element, move it to a
> temporay location,

Moving that first element does not make any sense if first pointer points at
it already? It is then correctly sorted. Also don't we somehow need to figure
out what pointer actually points at it for that move to make sense even
when first pointer does not point at it?

> then follow the cycle moving each the object that
> goes into that location into it, then follow with the newly open
> location (and mark the sorting array as that move having been done).
> When you finally get to the location that first element wanted to go
> into it, you put it in place, and look for the first eleent that needs
> to move again.

As it did start confusing I'm perhaps confused, you mark something,
things want and need to go. There are just two arrays sorted pointers
and possibly-unsorted elements for me. Somehow you do
it once per cycle at time and somehow you separate that some element
does not belong to cycle that has been already processed but you
leave al these details out. Feels it all makes sense to you but if it is
really 10 lines of C then I would prefer that C.

Öö Tiib

unread,
Nov 19, 2020, 10:25:13 PM11/19/20
to
On Friday, 20 November 2020 at 01:43:24 UTC+2, Öö Tiib wrote:
> Feels it all makes sense to you but if it is
> really 10 lines of C then I would prefer that C.

Was it meant something like that?:

void sort_with_pointers(std::vector<Value>& data, std::vector<Value*>& pointers)
{
assert(data.size() == pointers.size());
Value tmp;
for (size_t i = 0; i <pointers.size(); ++i)
{
if (pointers[i] != &data[i])
{
tmp = std::move(data[i]);
size_t current = i;
size_t next = pointers[current] - &data[0];
while (next != i)
{
data[current] = std::move(data[next]);
pointers[current] = &data[current];
current = next;
next = pointers[next] - &data[0];
}
data[current] = std::move(tmp);
pointers[current] = &data[current];
}
}
}

Tim Rentsch

unread,
Nov 20, 2020, 12:59:55 PM11/20/20
to
Juha Nieminen <nos...@thanks.invalid> writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> wrote:
>
>> Juha Nieminen <nos...@thanks.invalid> writes:
>>
>>> Vir Campestris <vir.cam...@invalid.invalid> wrote:
>>>
>>>> ... and the way to speed it up is to make a vector of pointers to the
>>>> structs, and sort that.
>>>>
>>>> _Then_ move the objects into their new locations.
>>>
>>> If you have a vector of pointers where each pointer essentially means
>>> "the object I'm pointing to should go to my position in the vector",
>>> moving the pointed-to elements in the original array to the same
>>> positions as the pointers sounds like a rather non-trivial task.
>>> At least if you want to do it in-place. [...]
>>
>> Writing in C, it took me 14 lines of code. Of course it might
>> end up being longer in C++.
>
> And I'm sure you can do it in 2 lines of code in Haskell. The length of
> the implementation is not the point. The point is the implementation itself.
> *How* do you sort the array in-place, when you have another array of
> pointers where each pointer is located in the position where the object
> it's pointing to needs to go?

Knowing that there is a short implementation in a well-understood
and known language helps limit the search for a solution.

> Suppose the first pointer points to the 10th element in the original array.
> So you need to move the 10th element to the first position. But the first
> position in the original array already has an element which is not yet in
> its proper position. You would need to move the first element of the
> original array to its correct position first. How? (And even if you were
> to find out where it should go, also in that position there will be an
> element that needs to be moved to its correct position first, and so on.)

Tiib posted some code that is basically the same as my own.

> Care to share the algorithm instead of bragging about how many lines of
> code of C you can implement it with?

As a general rule I don't post code right away, to give other
people a chance to work out problems for themselves, rather than
just being told the answer. As it turns out, in this case that
produced a nice benefit, in that Tiib's algorithm is a bit nicer
than mine.

Incidentally, I wasn't bragging, because I don't think of what I
did as being anything special, either in finding a solution or in
how long it was. I don't mind helping people in areas where they
really need help, but I don't think of this problem as being one
of those, at least not for non-novice programmers.

Tim Rentsch

unread,
Nov 20, 2020, 1:11:49 PM11/20/20
to
Tiib <oot...@hot.ee> writes:

> On Friday, 20 November 2020 at 01:43:24 UTC+2, <<D6 F6 [05b6]>> Tiib wrote:
>
>> Feels it all makes sense to you but if it is
>> really 10 lines of C then I would prefer that C.

It was 14 lines.

> Was it meant something like that?:
>
> void sort_with_pointers(std::vector<Value>& data, std::vector<Value*>& pointers)
> {
> assert(data.size() == pointers.size());
> Value tmp;
> for (size_t i = 0; i <pointers.size(); ++i)
> {
> if (pointers[i] != &data[i])
> {
> tmp = std::move(data[i]);
> size_t current = i;
> size_t next = pointers[current] - &data[0];
> while (next != i)
> {
> data[current] = std::move(data[next]);
> pointers[current] = &data[current];
> current = next;
> next = pointers[next] - &data[0];
> }
> data[current] = std::move(tmp);
> pointers[current] = &data[current];
> }
> }
> }

Yes, this code is basically the same as what I wrote. For the
inner loop I used a for() rather than a while(), which compacts
the code a bit. Also there are some minor style differences,
which together with the while()/for() shortening account for
the small difference in function lengths.

I must add though, that your algorithm is an improvement over
mine, in that you set pointers[current] to &data[current], which
means the test for already-in-place is just a single if(). As
cycles were moved, I would mark each pointer in the cycle by
setting it to NULL, which means the starting check needs to check
two different conditions. So thank you for the idea, and kudos
for discovering this nice improvement.

Öö Tiib

unread,
Nov 23, 2020, 1:51:05 AM11/23/20
to
Great. It can reduce moves to minimum when moves are pricey.
I have so far kept such complex data in memory non-movable.
Even when initially there is only one order then later there other
ways to order are often needed. Then sorted external arrays of
indexes or pointers or intrusive lists and sets can be used instead
of actual physical sorting. Keeping expensive to move
values at one place forever makes usage of several indexing
ways cheaper to keep and less error-prone.
But that may be also preliminary pessimization and
violation of YAGNI when applied too generally.
0 new messages