> El 13/03/2015 a las 8:51, Giacomo Drago escribió:
>
>> My library is not ready for a formal review process, and I'd like to ask
>> whether there is some potential interest first, thus saving everybody a
>> lot of time.
>>
>> I have been working on a flat_map
>> (http://www.boost.org/doc/libs/1_57_0/doc/html/boost/container/flat_map.html)
>> variant that keeps some of the distinctive features of the original one
>> (contiguous storage/cache friendliness, zero memory
>> overhead/shrink_to_fit support, vector-based representation, etc...) but
>> makes insertion and erasure of the elements much faster, at the price of
>> a slower lookup and in-order traversal.
>>
>
> Just a question. Are your insertions one by one or by range?
>
Might also be interesting to know if your perf numbers are based on a
primitive key or a UDT, and whether the relative performance changes
depending on key/value sizes. --DD
Insertions are one by one, and they are random with respect to the order
of the keys. The same applies to lookups and erasures. No cheating.
I will try with keys that are not PODs and have non-trivial
comparators, but I expect the benchmarks to reflect my previous
findings. The bottom line is that my variant makes fewer comparisons and
swaps when inserting and erasing, and more comparisons when looking up.
Elements are not kept *quite* sorted in the storage vector, of course.
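For reference, the harness boils down to something like the sketch below
(simplified: int keys and values, a fixed size of 50000, and my experimental
container left out; the actual code differs in details):

#include <boost/container/flat_map.hpp>
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <map>
#include <numeric>
#include <random>
#include <vector>

// Converts a steady_clock duration to milliseconds.
static double ms(std::chrono::steady_clock::duration d)
{
    return std::chrono::duration<double, std::milli>(d).count();
}

// Measures one-by-one insertion, lookup and erasure in random key order.
template <class Map>
void bench(const char* name, const std::vector<int>& keys)
{
    typedef std::chrono::steady_clock clock;
    Map m;

    auto t0 = clock::now();
    for (int k : keys) m.emplace(k, k);           // random-order insertion
    auto t1 = clock::now();

    long long sum = 0;                            // keeps the lookups observable
    for (int k : keys) sum += m.find(k)->second;  // random-order lookup
    auto t2 = clock::now();

    for (int k : keys) m.erase(k);                // random-order erasure
    auto t3 = clock::now();

    std::printf("%s: insert %.3f ms, find %.3f ms, erase %.3f ms (checksum %lld)\n",
                name, ms(t1 - t0), ms(t2 - t1), ms(t3 - t2), sum);
}

int main()
{
    std::vector<int> keys(50000);
    std::iota(keys.begin(), keys.end(), 0);       // keys 0..49999
    std::shuffle(keys.begin(), keys.end(), std::mt19937(42));

    bench<std::map<int, int>>("std::map", keys);
    bench<boost::container::flat_map<int, int>>("boost::container::flat_map", keys);
    return 0;
}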
The main use case I see for this data structure - and I am already using
it, though not in a production environment - is when you need a very
space-efficient, sorted associative container that is not so painfully
slow at inserting and erasing elements.
I'll send more benchmarks. Any interest in the general design and the
source code (not "boostified" yet)?
Best
Giacomo
I didn't want to suggest any cheating, just to understand the
scenario. If insertions are by range, then flat_map can be optimized. If
insertions are one by one, it's more difficult, unless there are two
phases: first insertions (with no lookups in between), and after that
only searches. That scenario could also be optimized.
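For the two-phase case, something along these lines already works with
flat_map today (a sketch: collect first, sort and deduplicate once, then
build the map from the ordered range via the ordered_unique_range
constructor):

#include <boost/container/flat_map.hpp>
#include <algorithm>
#include <utility>
#include <vector>

// Phase 1: collect all (key, value) pairs without touching the map.
// Phase 2: sort them once, drop duplicate keys, and construct the flat_map
// from the ordered, unique range, avoiding a binary search per element.
boost::container::flat_map<int, int>
build_map(std::vector<std::pair<int, int>> items)
{
    std::sort(items.begin(), items.end());
    items.erase(std::unique(items.begin(), items.end(),
                            [](const std::pair<int, int>& a,
                               const std::pair<int, int>& b)
                            { return a.first == b.first; }),
                items.end());
    return boost::container::flat_map<int, int>(
        boost::container::ordered_unique_range, items.begin(), items.end());
}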
Best,
Ion
They are one by one, and they can interleave in any possible way without
affecting performance. I didn't mean to imply you were suggesting any
cheating, but I know that benchmarks can be crafted to make things look
better than they are - and one can do that without even realizing it.
Best
Giacomo
> My library is not ready for a formal review process, and I'd like to
> ask whether there is some potential interest first, thus saving
> everybody a lot of time.
>
> I have been working on a flat_map
> (http://www.boost.org/doc/libs/1_57_0/doc/html/boost/container/flat_map.html)
> variant that keeps some of the distinctive features of the original
> one (contiguous storage/cache friendliness, zero memory
> overhead/shrink_to_fit support, vector-based representation, etc...)
> but makes insertion and erasure of the elements much faster, at the
> price of a slower lookup and in-order traversal.
Would it be easy for you to provide benchmarks of std::map with the
boost::container::node_allocator available in the latest Boost release?
According to my own benchmarks it can improve performance a lot,
especially when memory is likely to be fragmented.
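For reference, what I mean is simply plugging the pooled allocator into
std::map, along these lines (a sketch with int keys and values, not my
actual benchmark code):

#include <boost/container/node_allocator.hpp>
#include <functional>
#include <map>
#include <utility>

// std::map whose tree nodes come from boost::container::node_allocator,
// which serves node-sized allocations from pooled blocks instead of going
// to the general-purpose heap for every node.
typedef std::map<
    int, int, std::less<int>,
    boost::container::node_allocator<std::pair<const int, int>>
> pooled_map;

The interface and semantics are unchanged; only where the nodes come from
differs, which is what helps when the heap is fragmented.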
Regards,
Marcelo
It's possible to accomplish this today, but you need to do the buffering
yourself outside of the flat_map using the ordered_unique_range_t insert
method.
http://www.boost.org/doc/libs/1_57_0/doc/html/boost/container/flat_map.html#idp36621984-bb
I use this method a lot: I figure out everything I am going to insert
first, then I insert it in one step. Of course, this assumes you don't
need to look up what's already in the map while you're still deciding
what to put into it.
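Roughly like this (a sketch; how you stage the elements and how you resolve
duplicates is up to the application):

#include <boost/container/flat_map.hpp>
#include <algorithm>
#include <utility>
#include <vector>

// Stage the new elements outside the flat_map, sort and deduplicate them
// once, then insert the whole ordered range in a single call.
void bulk_insert(boost::container::flat_map<int, int>& m,
                 std::vector<std::pair<int, int>> staged)
{
    std::sort(staged.begin(), staged.end());
    staged.erase(std::unique(staged.begin(), staged.end(),
                             [](const std::pair<int, int>& a,
                                const std::pair<int, int>& b)
                             { return a.first == b.first; }),
                 staged.end());
    m.insert(boost::container::ordered_unique_range,
             staged.begin(), staged.end());
}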
-- chris
On 2015-03-13 09:11, Giacomo Drago wrote:
> On 2015-03-13 08:27, Dominique Devienne wrote:
>> On Fri, Mar 13, 2015 at 9:22 AM, Ion Gaztañaga <igazt...@gmail.com>
>> wrote:
>>>
>>> Just a question. Are your insertions one by one or by range?
>>>
>>
>> Might also be interesting to know if your perf numbers are based on a
>> primitive key or a UDT, and whether the relative performance changes
>> depending on key/value sizes. --DD
This is what happens with a key made of four integers rather than just
an integer.
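(The four-integer key is essentially a plain struct compared
lexicographically, along the lines of the reconstruction below; the actual
field names and comparator details may differ.)

#include <tuple>

// Hypothetical reconstruction of the composite key used in this run:
// four ints compared lexicographically via std::tie.
struct Key4
{
    int a, b, c, d;

    friend bool operator<(const Key4& x, const Key4& y)
    {
        return std::tie(x.a, x.b, x.c, x.d) < std::tie(y.a, y.b, y.c, y.d);
    }
};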
Size: 50000
std::map:
Random insertion: 19.881 ms
Random find: 19.87 ms
Iteration: 6.241 ms
Random erase: 42.872 ms
experimental::flat_map:
Random insertion: 509.001 ms
Random find: 14.922 ms
Iteration: 9.036 ms
Random erase: 727.366 ms
boost::container::flat_map:
Random insertion: 1049.87 ms
Random find: 7.778 ms
Iteration: 0.082 ms
Random erase: 885.753 ms
Size: 200000
std::map:
Random insertion: 124.174 ms
Random find: 133.111 ms
Iteration: 21.915 ms
Random erase: 253.892 ms
experimental::flat_map:
Random insertion: 8181.72 ms
Random find: 96.38 ms
Iteration: 45.847 ms
Random erase: 10951.3 ms
boost::container::flat_map:
Random insertion: 20271.5 ms
Random find: 53.225 ms
Iteration: 0.455 ms
Random erase: 17338.5 ms
Best
Giacomo