Clusters instead of std::list and std::map!

sven...@web.de

unread,

Oct 11, 2018, 1:09:12 PM10/11/18

to ISO C++ Standard - Future Proposals

Hello Mr. and Mrs.,

i have written an algorithm which can order and sort large numbers of items.

It creates a directory-tree in form of a pyramid, moving the entries and groups accordingly.

I provide it in form of c++-templates via GitHub.

I hope my solution will make it into the standard-library!

Best regards,

Sven Bieg

Gašper Ažman

unread,

Oct 11, 2018, 3:15:50 PM10/11/18

to std-pr...@isocpp.org

Hi Sven,

Your solution might make it to boost. In order for something to make it to the standard library, you have to specify what the interface does (and not how it does it). I took a read through your website that you wrote for it. It has some nice graphs, which look hopeful; however, I didn't find any mention of other important details of your data structure, and I have questions.

1. what are the characteristics of iterator stability for your structure?

2. how would one use it with a custom allocator? As far as I'm aware, it is not allocator-aware.

3. how does reclaiming memory work when it gets smaller?

4. could it support node splicing? How about sublist splicing? These are important characteristics of std::list.

5. could you implement lower_bound and upper_bound for this data structure? How about equal_range?

6. When is it allowed to throw? As far as I saw, even iterators allocate - which was quite surprising.

All in all, I don't think this code would be accepted into boost - you'd have to undergo quite a vigorous review process, and change the naming scheme on all the names, probably remove all the C++/CLI extensions, and change all the microsoftisms to regular c++.

The standard, as mentioned, does not accept code - it accepts specifications. We'll gladly take a look if you write a thorough specification of the API and its behaviour, though.

Thank you very much for your contribution,

Gašper

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/cf119955-9bde-46f6-84bd-01aabf5b4a6d%40isocpp.org.

Bengt Gustafsson

unread,

Oct 12, 2018, 2:08:13 AM10/12/18

to ISO C++ Standard - Future Proposals

This looks like an in memory B-tree to me. This is a very useful data structure which I would like to see in std. It should be possible to make it at least as close to map as unordered_map is.

Martin Moene

unread,

Oct 12, 2018, 5:29:08 AM10/12/18

to std-pr...@isocpp.org, sven...@web.de

On 11-Oct-18 19:09, sven...@web.de wrote:

I hope my solution will make it into the standard-library!

To get an idea about standardi[sz]ing:

Best regards
Martin

Sven Bieg

unread,

Oct 12, 2018, 12:01:58 PM10/12/18

to std-pr...@isocpp.org

Hello Mr. Gustafsson,

in fact my algorithm doesn’t build a binary tree, it builds a pyramid, 50 times more steep than the real ones! Have You had a look at my homepage? There is a short overview how it works. I’m from Germany, please excuse my bad english!

Best regards,

Sven Bieg

Magnus Fromreide

unread,

Oct 12, 2018, 5:31:20 PM10/12/18

to std-pr...@isocpp.org

On Fri, Oct 12, 2018 at 04:01:49PM +0000, Sven Bieg wrote:
> Hello Mr. Gustafsson,
>
> in fact my algorithm doesn’t build a binary tree, it builds a pyramid, 50 times more steep than the real ones! Have You had a look at my homepage? There is a short overview how it works. I’m from Germany, please excuse my bad english!

He said B-tree, not binary tree. They are very different beasts.

See e.g. wikipedia: https://en.wikipedia.org/wiki/B-tree

/MF

> Best regards,
>
> Sven Bieg
>
> --
> You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
> To post to this group, send email to std-pr...@isocpp.org.

> To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/DB7PR02MB4140F37CEBA876BD7102E223B1E20%40DB7PR02MB4140.eurprd02.prod.outlook.com.

Sven Bieg

unread,

Oct 13, 2018, 3:12:16 AM10/13/18

to std-pr...@isocpp.org

I've done a comparison to std::list and std::map. You can find some diagrams at

http://svenbieg.azurewebsites.net/Clusters/List

and

http://svenbieg.azurewebsites.net/Clusters/Index

I don't know if Microsoft has it's own Implementation, but i get much better results using Clusters!

sven...@web.de

unread,

Oct 13, 2018, 4:42:00 AM10/13/18

to ISO C++ Standard - Future Proposals

Hello Gasper,

i'm glad to see You. I don't know if can answer all Your questions, because i am a hobby-programmer not a professional. But i am going to try:

1) The iterator stores the position for each level. Because the cluster starts with only one level and the number can grow, it is allocated.

I'm not sure what You mean with iterator-stability, iteration ain't critical. If the position on one level reaches the end, it is reset to zero and the

position of the parent is increased. The most critical process is the insertion. With a group-size of 100 items i have to move at most 10.000 items

on the first level. If new groups have to be allocated, the maximum number is the number of levels, while the number of items grows exponential.

The groups are half-full at least, because an empty group is created only when all groups on a level are full,

and the groups are combined with their neighbors when possible.

2) You need to use new and delete, wich can be overloaded. The entries are moved in the structure without calling their con-/destructors.

3) Every time an entry is removed, the parent checks, if the group can be combined with the neighbor-groups. The groups are half-full at minimum.

4) My algorithm works a little bit different. I have item-groups on the lowest level, and parent-groups on higher levels.

The parent-group fails to insert an item, it splits the child-group when possible. If the root is full and can not be split,

a new root is created.

5) In the index, the items are in ascending order. It references the first and the last entry. I think, this is what You mean.

6) An exception can be thrown, every time memory needs to be allocated. I'm not using exceptions, only assertion for debugging.

I'm using Visual Studio and the Windows Runtime environment. I'm not using the standard-library directly.

Maybe You can help me to write a standard-implementation?

Best regards,

Sven

Gašper Ažman

unread,

Oct 13, 2018, 1:18:51 PM10/13/18

to std-pr...@isocpp.org

Dear Sven,

With my email, I was trying to do two things:

a) I asked you a very small subset of the questions you need good clear answers for if you want to attempt getting this into the standard library. The standardization process requires a very thorough specification of what every part of the data structure or algorithm does, what its effects are, and so on. A good generic component of the standard library has very many facets, including:

- iterator stability guarantees

- what operations may throw

- allocator awareness

- adherence to C++ design principles (iterators are cheap to copy and thin!)

I was trying to make sure you understood at least some of the reasons your library is not ready for inclusion. It will take a lot of work, almost all of it by you, to get it to the point where getting it into the standard would be feasible. That said, everyone wants good algorithms and data structures to be in the standard library! No-one will tell you that you shouldn't spend the time and work to get it to the quality level required for inclusion. However, be prepared to have to improve your design (perhaps for years) until everyone is satisfied.

The standardization process works as a bunch of authors pushing their contributions that they care about through the committee that finds flaws in them until they can't find flaws anymore, and then vote it in. It's basically like the most stringent code review process ever designed, except it doesn't review code - it reviews specifications. If that's not your cup of tea, you're welcome to keep your library on github forever, where it will be useful to everyone that has the ability to pull it down and integrate it in their project.

Thank you for your answers. I'll reply to each with the design issues they bring up.

1) The iterator stores the position for each level. Because the cluster starts with only one level and the number can grow, it is allocated.
I'm not sure what You mean with iterator-stability, iteration ain't critical.

Iteration is critical. However, iterators are not used just for iteration - they are also used as coordinates into your data structure that refer to elements. Their stability (whether they remain valid when the data structure changes) is critical for integrating your data structure with anything else in the language.

The standard library assumes iterators are cheap (and noexcept!) to copy and cheap to hold. Your iterators do not fit this model (they allocate), therefore they are not a good fit for the standard library with this design. Perhaps you can think on that for a bit.

If the position on one level reaches the end, it is reset to zero and the
position of the parent is increased. The most critical process is the insertion. With a group-size of 100 items i have to move at most 10.000 items
on the first level. If new groups have to be allocated, the maximum number is the number of levels, while the number of items grows exponential.
The groups are half-full at least, because an empty group is created only when all groups on a level are full,
and the groups are combined with their neighbors when possible.

Iterators should probably have nothing to do with that.

2) You need to use new and delete, wich can be overloaded. The entries are moved in the structure without calling their con-/destructors.

Overloading new and delete are frowned upon and work per-type, not per-use, so allocator support is essential for the standard library.

3) Every time an entry is removed, the parent checks, if the group can be combined with the neighbor-groups. The groups are half-full at minimum.

Thanks!

4) My algorithm works a little bit different. I have item-groups on the lowest level, and parent-groups on higher levels.
The parent-group fails to insert an item, it splits the child-group when possible. If the root is full and can not be split,
a new root is created.

Therefore it cannot be a replacement for std::list. Lists have very few applications where they are the best data structure, but to my knowledge all of them involve complete stability of nodes (no iterator invalidation except on deletion of pointed-to node), the ability to splice a node or a few in and out (lru-cache, for instance), and keeping the order of nodes as inserted. Nobody uses a list as a sorted-by-key container.

5) In the index, the items are in ascending order. It references the first and the last entry. I think, this is what You mean.

It seems that you can then. Why is it not in the interface?

6) An exception can be thrown, every time memory needs to be allocated. I'm not using exceptions, only assertion for debugging.

My question was mostly related to the fact that you have no specification of which operations on your structure can throw (in other words, use new). I suggest thinking about that one.

I'm using Visual Studio and the Windows Runtime environment. I'm not using the standard-library directly.

Your data structure should not require the windows runtime if you are targeting the standard library.

Maybe You can help me to write a standard-implementation?

I'm afraid I don't have enough spare time to attempt that. We on this list will happily tell you what is missing though, so you can fix it and try again.

Best regards,

Sven

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/09f8e2a6-871c-4474-b887-1cfec7d3a7d0%40isocpp.org.

Message has been deleted

sven...@web.de

unread,

Oct 13, 2018, 11:58:03 PM10/13/18

to ISO C++ Standard - Future Proposals

Hello Gasper,

like You said, i created a new list-class today wich is more standard-conform. In fact i only put things together and renamed everything, i did no changes to the code itself.

http://github.com/svenbieg/clusters-standard

To the questions again:

1) I now understand what You mean with iterator-stability. The iterators have to stay valid, when another iterator changes the list.

I didn't implement this, but it is possible. The list would have to keep track of all iterators for this, maybe a limited number of

simultaneous iterators.

2) The allocator is easy to integrate, i think. I will leave this away, because i have integrated a custom allocator already.

A number of 100 items is allocated uninitialized in each item-group.

3) You first have to use this algorithm, before You really can make a benefit!

4) Splicing single items in and out is possible. Insertions and deletions are processed in constant low time.

Even chains of items are possible, but one thing after another.

5) I will add this to index-class.

6) I have to add the noexcept keyword to all read-only functions.

Like i said, i am a hobby-programmer and i don't got much time, too. I have a full-time-job in electric industry.

I will add the functionality You need and do some benchmarks. I will post the results here, so You stay up to date.

Thank You,

Sven

Bengt Gustafsson

unread,

Oct 14, 2018, 6:15:23 AM10/14/18

to ISO C++ Standard - Future Proposals, sven...@web.de

Its still a B-tree, give or take. Here is another implementation, which is already "std-ish" when it comes to the API:

https://code.google.com/archive/p/cpp-btree/

It was made by Google but apparently they didn't have time or interest to reformat it to a specification for inclusion in the standard.

Den söndag 14 oktober 2018 kl. 05:58:03 UTC+2 skrev sven...@web.de:

Hello Gasper,

like You said, i created a new list-class today wich is more standard-conform. In fact i only put things together and renamed everything, i did no changes to the code itself.

http://github.com/svenbieg/clusters-standard

To the questions again:

1) I now understand what You mean with iterator-stability. The iterators have to stay valid, when another iterator changes the list.
I didn't implement this, but it is possible. The list would have to keep track of all iterators for this, maybe a limited number of
simultaneous iterators.

No, iterators don't have to be stable, but the specification must say whether they are or not. For instance vector iterators are not stable.

2) The allocator is easy to integrate, i think. I will leave this away, because i have integrated a custom allocator already.
A number of 100 items is allocated uninitialized in each item-group.

If each block has a parent pointer I don't think the iterator needs to store a variable amount of data. You'd still need an allocator for the blocks themselves though.

With traditional allocators you'd be in trouble if you try to allocate something that is not an array of your T type, but I don't think this is a valid concern anymore.

A classical B-tree has all blocks equally structured as an array of pair<T, Block*>, and if the Block* is not nullptr there is at least one sub-block containing elements between the T of the pair and the next. This way internal and leaf nodes have the same structure.

However, if you store only keys in internal nodes you can get a steeper tree with equal size blocks which may be advantageous. This decision would however probably not have to be in the specification but would be a _quality of implementation_ (QOI) issue.

sven...@web.de

unread,

Oct 14, 2018, 6:40:42 AM10/14/18

to ISO C++ Standard - Future Proposals

Hello Mr. Gustaffson,

the Balancing-tree seems to have the same goal, but again:

My algorithm does not build a tree. It is working in the other direction.

I will do a comparison to btree, i wonder wich algorithm performs better.

Thank You for Your reply,

Sven

sven...@web.de

unread,

Oct 14, 2018, 7:48:52 AM10/14/18

to ISO C++ Standard - Future Proposals, sven...@web.de

Bryce Adelstein Lelbach aka wash

unread,

Oct 19, 2018, 12:00:09 AM10/19/18

to std-pr...@isocpp.org, sven...@web.de

Hi Sven,

I'd suggest moving this discussion to the Boost mailing list, which would be closer to a good fit for this work than the standard.

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/3c8fb242-bdde-1ee8-878f-94bfb3ae084c%40eld.physics.LeidenUniv.nl.

sven...@web.de

unread,

Oct 19, 2018, 12:27:17 AM10/19/18

to ISO C++ Standard - Future Proposals

I don't think so, sorry!

Bryce Adelstein Lelbach aka wash

unread,

Oct 19, 2018, 10:41:50 AM10/19/18

to std-pr...@isocpp.org

If you want to suggest something for standardization, you need more than just an implementation. As others have mentioned, you need a specification.

I don't think this work is a great fit for the standard. I would suggest Boost.

On Fri, Oct 19, 2018, 12:27 AM <sven...@web.de> wrote:

I don't think so, sorry!

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/5888e4a7-3ed3-4759-969e-30211027e7b0%40isocpp.org.

sven...@web.de

unread,

Oct 19, 2018, 12:01:47 PM10/19/18

to ISO C++ Standard - Future Proposals

Sorry Mr., may i call You Brice?

i think Gasper is right when he says that some functionality is still missing, like subtree-splicing,

lower- and upper-bound, equal-range and, from my opinion, insert- and remove-many.

The algorithm is not only faster, then boost would be the right place, it is also consuming less memory.

B-tree was much to complicated to make it into the standard-library, maybe it lacks.

Clusters is as simple as effective, so the right place is the standard-library.

Like Mr. Gustafsson said, iterator-stability ain't critical in a single-threaded apartment,

and the standard-library is not thread-safe at all.

Tomorrow i'm going to try to get a standard-implementation of the index.

The list-iterator is still missing set-item.

I will do the specification later on, maybe somebody is really interested in Clusters

and wants to some "testing".

Best regards,

Sven

Gašper Ažman

unread,

Oct 20, 2018, 11:08:04 AM10/20/18

to std-pr...@isocpp.org

... Please don't think I'm endorsing whatever this guy is doing. I tried to tell him.

However, I think that not turning people away is our duty, and I'm hoping we can coach him enough for him to realize that he's missing a ton of context.

I agree with you completely though, boost is a much better fit.

G

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/CAP3wax_V-42h6fBD3LvEFWfGGAcBF1nf0NPmMbAKN%2B3QzKg1eA%40mail.gmail.com.

Gašper Ažman

unread,

Oct 20, 2018, 11:14:15 AM10/20/18

to std-pr...@isocpp.org

I apologize for the tone of the above. It was addressed to Bryce, who is a friend of mine, and I didn't realize it was going straight to the reflector. Mea culpa.

Sven, I meant what I said, though - I hope you realize boost is a much better target for this at this time. Once it matures, you can very well target the standard library with the specification of your boost library - the standard does not accept code. If you want proof, look at the ranges proposal that Eric Niebler and Casey Carter have been trying to get into the standard, or the current flatmap proposal by Zach Laine. It will give you an idea of what is expected for the standard.

Again, sorry for the tone; I really do want you to keep working in the community, and try for the standard with a mature specification, because your clusters do look like a useful data-structure - they are far from ready, though.

I'd like to address your iterator stability statement above as well; iterator stability has nothing to do with threading, and everything to do with how you can combine data structures; without stable iterators, storing iterators to objects in your data structure in *another* data structure is not possible, which severely limits composability.

Gašper

sven...@web.de

unread,

Oct 20, 2018, 3:12:41 PM10/20/18

to ISO C++ Standard - Future Proposals

I've written a standard-implementation and moved it to http://github.com/svenbieg/clusters.

I think this is also the right place for the specification.

In a cluster, iterator-stability doesn't seem to be so important, because iterators can be set to any position.

The absolute position can be saved, the cluster can be changed, and the iterator can be reset to the previous item.

In a multi-threaded environment the operations can be synchronized, like i've done in Clusters-Runtime.

Mr. Gustafsson mentioned, if groups would have parent-pointers, the iterators would not have to allocate.

This is right, but they still would not be stable. I think it's better to keep the clusters as small as possible.

A custom allocator would have to be faster than the default. I'm thinking of a memory-manager using clusters,

making custom allocators unnecessary.

A very interesting feature is the sublist-splicing, wich You suggested.

I'm going to implement this, but i didn't make it by now.

I thought of insert- and remove-many, having large text-files doing replacements.

Splicing out a sub-list is possible in constant-low time in a cluster.

Combination of data-structures is another interesting feature You mentioned.

Some functions are still missing, like insert-many and remove-many,

splice-out, Thank You, and maybe some boolean functions for indices.

This can be done with high-performance, but i didn't make, too.

I think i misunderstood You with lower- and upper-bound.

You can use an absolute position and a counter in clusters.

The user can pass invalid arguments now, i've added some exceptions.

I've also added the noexcept-keyword to read-only functions,

it is necessary for std::move_if_no_except, wich is also still missing.

Maybe some one on this list can have a look at the code?

Some pieces might be very un-professional.

Today i've read of std::move_if_no_except for the first time.

I'm going to write a specification, i got some ideas already.

Pictures say more than words, i think.

Sven

Bryce Adelstein Lelbach aka wash

unread,

Oct 20, 2018, 7:09:06 PM10/20/18

to std-pr...@isocpp.org, Odin HOLMES

Adding Odin.

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/8cd19c2f-3a6d-4943-b18e-625b6cc6ee95%40isocpp.org.

holme...@gmail.com

unread,

Oct 20, 2018, 7:59:10 PM10/20/18

to ISO C++ Standard - Future Proposals, hol...@auto-intern.de

Hello Sven,

I had a look at your code and I think I understand how it works inside. It looks like an implementation of a b-tree with the 'order' template-able and defaulted to 100. There are a few things which I noticed about the implementation but that's not important here (I'll post issues on your GitHub repo). What's interesting from a standardization standpoint is the public interface as well as the guarantees/requirements http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf see section 26 and think about how your container fits in (which guarantees it fulfills and which it can't). Having a working implementation is a plus but the real work will be coming up with a proposal/wording etc. This might be interesting for SG14 (we have looked at enough other proposals, this would play nice with pool allocators, caching etc.) but you should read the standard and try to come up with wording in a similar form. Feel free to contact me in German if that's easier for you.

Best,

Odin

sven...@web.de

unread,

Oct 21, 2018, 3:29:22 AM10/21/18

to ISO C++ Standard - Future Proposals

Hey Odin,

nice to meet You! I'm looking forward to Your issues on GitHub. I've seen a chart of the stadard-comittee today, where SG14 is part of. There is another team SG21, working with databases. They should know about Clusters, too! I actually was writing a database, when the algorithm began becoming intersting for memory-management. There is a lot of work to do, but not today. Today is party-time!

Mit freundlichen Grüßen,

Sven

Jens Maurer

unread,

Oct 21, 2018, 3:51:03 AM10/21/18

to std-pr...@isocpp.org, sven...@web.de, holme...@gmail.com

On 10/21/2018 09:29 AM, sven...@web.de wrote:
> Hey Odin,
>
> nice to meet You! I'm looking forward to Your issues on GitHub. I've seen a chart of the stadard-comittee today, where SG14 is part of. There is another team SG21, working with databases. They should know about Clusters, too! I actually was writing a database, when the algorithm began becoming intersting for memory-management. There is a lot of work to do, but not today. Today is party-time!

SG11 (Databases) is defunct. In any case, SG11 was not about
writing a database implementation (where b-trees are useful),
but standardizing the interface to relational databases such
as Oracle or MySQL (roughly, ODBC for C++).

Jens

sven...@web.de

unread,

Oct 28, 2018, 4:22:27 AM10/28/18

to ISO C++ Standard - Future Proposals, hol...@auto-intern.de, holme...@gmail.com

I have created a printable version of the specification.

You can find it at http://svenbieg.azurewebsites.net/Clusters/Printable.aspx.

It is only the introduction right now, i will have to add the different interfaces.

But i think, this is what You meant.

sven...@web.de

unread,

Oct 28, 2018, 4:52:54 AM10/28/18

to ISO C++ Standard - Future Proposals, hol...@auto-intern.de, holme...@gmail.com, sven...@web.de

http://svenbieg.azurewebsites.net/Clusters/Clusters_en-US.pdf

pa...@lib.hu

unread,

Oct 28, 2018, 6:34:18 AM10/28/18

to ISO C++ Standard - Future Proposals, sven...@outlook.de

I just noticed this and the other tests. They all add numbers in a strict increasing order. This may create a serious skew in the test results. In any potential direction, but my estimate is that your algo takes a serious benefit from that, ike as if you tested a vector.insert() only adding to the end instead of anywhere in the middle.

Run the test with different patterns, like full backwards, some number mappings (i.e odd(x) ? -x : x ). And random numbers for tests it is possible.

You also need to provide the compiler version and flags used in the run.

pa...@lib.hu

unread,

Oct 28, 2018, 7:12:54 AM10/28/18

to ISO C++ Standard - Future Proposals, sven...@outlook.de, pa...@lib.hu

Also for the memory footprint tests use a series of Ts with different sizeof. As the nodes have fixed overhead (like 3 pointers in map), the waste will show different ratio as the type gets fatter, not the minimal that is used in the examples.

Message has been deleted

sven...@web.de

unread,

Oct 28, 2018, 2:02:37 PM10/28/18

to ISO C++ Standard - Future Proposals

Hello Mr. Pasa,

You are right, my tests are not scientific. I don't even know how the map works internally, but i know about clusters.
In a cluster, the worst case is push_front(), then the maximum number of items has to be moved. I'm going to add this test as worst case scenario to the specification.

The groups in a cluster are half-full at minimum, except there is only one group.
With a minimal item-size of 1 byte, there is an overhead of about 104% on the lowest level at maximum.
The overhead for parent-groups is about 100 pointers per 2.500 items, with half-full parent-groups and half-full item-groups, this is about 16%.
Parent-groups on higher levels are below 2 percent.
The maximum overhead is 118% then.
With an item-size less the size of 3 pointers, the cluster would save memory, having more than 50 items. Of course, having many clusters with only a few items could increase memory usage, but with a benefit in speed.

Best regards,
Sven

Magnus Fromreide

unread,

Oct 28, 2018, 3:08:27 PM10/28/18

to std-pr...@isocpp.org

On Sun, Oct 28, 2018 at 11:02:37AM -0700, sven...@web.de wrote:
> Hello Mr. Pasa,
>
> You are right, my tests are not scientific. I don't even know how the map works internally, but i know about clusters.
> In a cluster, the worst case is push_front(), then the maximum number of items has to be moved. I'm going to add this test as worst case scenario to the specification.
>
> The groups in a cluster are half-full at minimum, except there is only one group.
> With a minimal item-size of 1 byte, there is an overhead of about 104% on the lowest level at maximum.
> The overhead for parent-groups is about 100 pointers per 2.500 items, with half-full parent-groups and half-full item-groups, this is about 16%.
> Parent-groups on higher levels are below 2 percent.
> The maximum overhead is 118% then.

I think this is wrong - consider the case of a cluster containing a single
element - would not that get considerably worse overhead than 118%? From the
description I'd guess that cluster would have a memory overhead in the
vicinity of 10000%.

/MF

sven...@web.de

unread,

Oct 28, 2018, 3:23:52 PM10/28/18

to ISO C++ Standard - Future Proposals

This is right Mr. Fromreide,

the point is that one item is not a list. There may be an overhead of 10000%,
but this is only 100 times the item-size. With pointers as items, this would be only 400 bytes. The custom allocators are not mentioned here, wich cause an additional overhead, too. Think of an index-cluster in memory-management, sorting free space by size!

Best regards,

Sven

sven...@web.de

unread,

Nov 13, 2018, 3:37:12 PM11/13/18

to ISO C++ Standard - Future Proposals

I've written a tiny operating-system for my pc as a hobby many years ago, the winters are long in Germany. At some point i needed a file-system for the different drivers, but i didn't want to name these system-files. I thought it should be possible to access these files directly from the root-directory, using an unique identifier.
First i wanted to use strings like in a dictionary, but there were not enough characters to fill one block on the hard-disk. So i built a group of ids that could fill one block, and thought of this group as a single character. The parent-group was then the previous character in the string. To minimize the overhead from the disk-drive, the blocks had to be as full as possible. And because the drive was slow, the strings had to be as short as possible. I've now chosen c++ as language, and i'm not going to complete the original filesystem-driver.
The Clusters-algorithm is done and it seems to be the most powerful sorting algorithm in the world! Even the btree-algorithm from Google needs double the time and 10% more memory. This is why i write this specification, to bring Clusters into the standard-library.
Memory-allocations take a lot of time and the Clusters-algorithm in memory-management can bring a big benefit to the general public.
When i was writing the algorithm i found, that the algorithm could also be used for serialization only, without sorting. This can drastically reduce memory allocations, but only if the data may be fragmented. A double success!

Best regards,

Sven Bieg

masse....@gmail.com

unread,

Nov 26, 2018, 2:52:52 PM11/26/18

to ISO C++ Standard - Future Proposals, sven...@web.de

Hi,

I didn't went into the details of your algorithm, but it seems you achieve interesting results. Still, I believe that what you've done is more a matter of implementation than a matter of specification. The things is that specification is more about how you can expects your container to behave (what will work, what won't, what's the expected complexity of the different operations, ...) than how you did implement it.

For example, if you take a look at https://en.cppreference.com/w/cpp/container, you'll see 2 table.

The 1st one is about how does the iterator behave in terms of validity.

The 2nd is about the possible operations per container-type.

I think that filling those 2 tables for your structure could be a good starting point.

Just my 2 cents, though.

Nicolas.

Reply all

Reply to author

Forward

Message has been deleted