std::map performance

Robi-Wan-Kenobi

Apr 4, 2008, 3:24:17 PM
Hi folks,

for our application I have to cache data. For that purpose I am
currently using a vector as the container to store the pointers to
the objects holding the data.
Currently that is fine, since the keys used for access are mostly
sequential, with no big gaps in between.
But in the future we intend to use a different key, which will not be
sequential. Then I will have to use the map container to store the
pointers under that new key.
In some tests I just made, it turned out that things get slower with
map. I generated a batch of these cache objects, stored them in the
container, retrieved them from it about 1000 times, and erased them
again. For that scenario the map is about four times slower than the
vector. Using hash_map was only a little bit faster.
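
A simplified sketch of that kind of test (not the actual code; Data,
the int keys and the counts are placeholders):

#include <map>

struct Data { int payload; };   // stand-in for the real cached object

// Fill the map, look every entry up about 1000 times, then erase everything.
void test_map(int n)
{
    std::map<int, Data*> cache;

    for (int k = 0; k < n; ++k)
        cache[k] = new Data();

    const Data* volatile sink = 0;   // keeps lookups from being optimized away
    for (int round = 0; round < 1000; ++round)
        for (int k = 0; k < n; ++k)
            sink = cache.find(k)->second;

    for (std::map<int, Data*>::iterator it = cache.begin(); it != cache.end(); ++it)
        delete it->second;
    cache.clear();
}

The vector variant is the same except that the lookup is a plain index,
cache[k].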
Does anyone have experience with optimizing the map? Is there some way
to make it faster? Or would it be faster to use some special map
implementation (or CMap / QMap)?

Thanks a lot.

Robert

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Alberto Ganesh Barbati

Apr 5, 2008, 3:29:20 PM
Robi-Wan-Kenobi wrote:

> Hi folks,
>
> for our application I have to cache data. For that purpose I am
> currently using a vector as the container to store the pointers to
> the objects holding the data.
> Currently that is fine, since the keys used for access are mostly
> sequential, with no big gaps in between.
> But in the future we intend to use a different key, which will not be
> sequential. Then I will have to use the map container to store the
> pointers under that new key.
> In some tests I just made, it turned out that things get slower with
> map. I generated a batch of these cache objects, stored them in the
> container, retrieved them from it about 1000 times, and erased them
> again. For that scenario the map is about four times slower than the
> vector. Using hash_map was only a little bit faster.
> Does anyone have experience with optimizing the map? Is there some way
> to make it faster?
>

Faster? You are saying that map is four times slower than vector; well,
I might say that it is *only* four times slower, and that looks like an
excellent achievement to me, given that map access is O(log n) while
vector access is O(1)! How much faster would you want it? If you want an
associative container, you have to pay for it!

Anyway, a few pieces of advice:

1) Profiling benchmarks like yours can be useful, but nothing is better
than profiling under "real" runtime conditions. Sometimes the results
are strikingly different!

2) It seems strange to hear about a hashed container performing only a
"little bit" faster than an ordered container, given that a hashed
container has average constant complexity as opposed to logarithmic
complexity. This may be a hint that your benchmark is not measuring the
actual performance of the containers, but is dominated by some other
cost. Maybe you should try using a different hash function (see the
sketch after this list).

3) It's true that std::map is an all-purpose container and that an ad
hoc solution might perform better. But, believe me, unless you are using
some crappy standard library, std::map (or even the hash map you are
using) is probably very well optimized. Usually the bottleneck is in the
way *you* use the map rather than in the map itself.
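
Regarding point 2: for illustration only (not Ganesh's code; the MixHash
functor and the 64-bit key type are assumptions), supplying your own
hash looks roughly like this with C++11's std::unordered_map; the TR1
and hash_map variants take the hash functor in the same way:

#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// If the keys were, say, 64-bit ids whose low bits barely vary, the
// default hash could crowd them into a few buckets. A simple
// bit-mixing step spreads them across the table.
struct MixHash {
    std::size_t operator()(std::uint64_t k) const {
        k ^= k >> 33;
        k *= 0xff51afd7ed558ccdULL;   // one multiply/xor mixing round
        k ^= k >> 33;
        return static_cast<std::size_t>(k);
    }
};

int main() {
    std::unordered_map<std::uint64_t, std::string, MixHash> cache;
    cache.reserve(1000);              // pre-size the bucket array
    cache[0x100000000ULL] = "some cached object";
    return cache.count(0x100000000ULL) == 1 ? 0 : 1;
}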

> Or would it be faster to use some special map
> implementation (or CMap / QMap)?

I'm sorry, I don't know what CMap and QMap are, but really, I don't
think they would make a big difference.

HTH,

Ganesh

Lance Diduck

Apr 6, 2008, 7:59:04 PM
On Apr 4, 3:24 pm, Robi-Wan-Kenobi <robert.ka...@gmx.de> wrote:
> Hi folks,
>
> for our application I have to cache data. For that purpose I am
> currently using a vector as the container to store the pointers to
> the objects holding the data.
> Currently that is fine, since the keys used for access are mostly
> sequential, with no big gaps in between.
> But in the future we intend to use a different key, which will not be
> sequential. Then I will have to use the map container to store the
> pointers under that new key.
> In some tests I just made, it turned out that things get slower with
> map. I generated a batch of these cache objects, stored them in the
> container, retrieved them from it about 1000 times, and erased them
> again. For that scenario the map is about four times slower than the
> vector. Using hash_map was only a little bit faster.
> Does anyone have experience with optimizing the map? Is there some way
> to make it faster? Or would it be faster to use some special map
> implementation (or CMap / QMap)?
>
> Thanks a lot.
>
> Robert
Without seeing a profile of the ratio of modifications to reads, it is
hard to tell just what the best solution is.

For example, vector element access is only O(1) if you happen to know
exactly which index you want to access. If you are doing a lookup by
key, that turns into O(n). If the vector were kept sorted, lookup would
be O(log n). However, inserting into the vector is O(1) (just push_back)
in the O(n)-lookup case, but O(n) in the sorted case, because the
elements after the insertion point have to be shifted. A sketch of the
sorted case follows.
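
For concreteness, a sketch of lookup and insertion in a key-sorted
vector of (key, pointer) pairs (Entry, Data and key_less are names made
up for the example):

#include <algorithm>
#include <utility>
#include <vector>

struct Data;                              // the cached object (opaque here)
typedef std::pair<int, Data*> Entry;      // (key, pointer), kept sorted by key

inline bool key_less(const Entry& e, int key) { return e.first < key; }

// O(log n) lookup via binary search.
Data* find(const std::vector<Entry>& v, int key)
{
    std::vector<Entry>::const_iterator it =
        std::lower_bound(v.begin(), v.end(), key, key_less);
    return (it != v.end() && it->first == key) ? it->second : 0;
}

// O(n) insertion: lower_bound finds the spot in O(log n), but every
// element after it has to be shifted to make room.
void insert(std::vector<Entry>& v, int key, Data* p)
{
    std::vector<Entry>::iterator it =
        std::lower_bound(v.begin(), v.end(), key, key_less);
    v.insert(it, Entry(key, p));
}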

That said, O(1) for one container is not the same as O(1) for another.
The container, vendor, and actual number of elements play a big role.
The fact that map and hash performance were nearly identical is not
surprising. It is an urban legend that a hash has O(1) lookups. Actually
that number is O(n/buckets), i.e. it depends on the load factor, so only
a finely tuned hash has O(1) lookup. Another advantage of map is that it
is easy to write a free-list allocator that works with it, whereas with
a hash this is more difficult.
For example, in one of my apps I have several threads that populate
their own isolated maps. So what I do is make an allocator per thread
and pass it to that thread's map. Why? If you have ever profiled a
high-performance MT app, you will have noticed that there is
considerable contention in the default allocators. This trick completely
isolates the map and its allocator. Doing this with a hash is much more
difficult, because writing an allocator for it is harder (you can't just
keep a free list of equal-sized blocks, since the bucket array is a
separate, growing allocation). A sketch of the per-map free-list
allocator is below.
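
For illustration only (not Lance's actual code; the class name and its
members are made up), a minimal free-list allocator along those lines,
written against C++11's allocator_traits so that only allocate() and
deallocate() have to be provided. All copies and rebinds share one pool
through a shared_ptr, and the pool is deliberately not thread-safe,
because the whole point is one pool (and one map) per thread:

#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Sketch of a free-list allocator. Every copy/rebind shares one Pool via
// shared_ptr, so the map's internal rebound copy recycles nodes through
// the same list. Not thread-safe on purpose: one pool per thread.
template <class T>
class FreeListAllocator {
    struct Pool {
        std::vector<void*> free_list;  // recycled node-sized blocks
        std::vector<void*> chunks;     // every block ever handed out
        std::size_t block_size = 0;    // size of the blocks being recycled
        ~Pool() {
            for (std::size_t i = 0; i < chunks.size(); ++i)
                ::operator delete(chunks[i]);
        }
    };

    std::shared_ptr<Pool> pool_;
    template <class U> friend class FreeListAllocator;

public:
    typedef T value_type;

    FreeListAllocator() : pool_(std::make_shared<Pool>()) {}
    template <class U>
    FreeListAllocator(const FreeListAllocator<U>& other) : pool_(other.pool_) {}

    T* allocate(std::size_t n) {
        if (n == 1) {
            if (pool_->block_size == 0) pool_->block_size = sizeof(T);
            if (pool_->block_size == sizeof(T) && !pool_->free_list.empty()) {
                void* p = pool_->free_list.back();   // reuse a freed node
                pool_->free_list.pop_back();
                return static_cast<T*>(p);
            }
        }
        void* raw = ::operator new(n * sizeof(T));
        pool_->chunks.push_back(raw);                // owned until the pool dies
        return static_cast<T*>(raw);
    }

    void deallocate(T* p, std::size_t n) {
        // Only single node-sized blocks go back on the free list; the
        // memory itself is released when the last allocator copy goes away.
        if (n == 1 && pool_->block_size == sizeof(T))
            pool_->free_list.push_back(p);
    }

    template <class U>
    bool operator==(const FreeListAllocator<U>& o) const { return pool_ == o.pool_; }
    template <class U>
    bool operator!=(const FreeListAllocator<U>& o) const { return pool_ != o.pool_; }
};

int main() {
    // Each thread would build its own map like this, so node allocations
    // never touch a shared heap lock.
    typedef FreeListAllocator<std::pair<const int, double> > Alloc;
    std::map<int, double, std::less<int>, Alloc> m;
    for (int i = 0; i < 1000; ++i)
        m[i] = i * 0.5;
    return 0;
}

All memory stays owned by the pool and is only released when the last
copy of the allocator is destroyed, which keeps deallocate() trivial:
freed nodes simply go back on the list for reuse.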

Here is a study I recently did--
The data (in ms) for two compilers (A and B) to load
(10,100,1000,10000) pair<int,double> 100 times per thread, then sort
by various STL methods:
Columns: (A, 1 thread) (A, 2 threads) (B, 1 thread) (B, 2 threads)

native vector with pre-sorted data
  (0,15,46,359)       (16,25,250,2625)     (0,0,2,17)        (1,1,2,18)
native map
  (0,47,329,3157)     (16,220,2100,19K)    (1,12,118,1222)   (11,90,840,7800)
native map w/ lockfree allocator
  (0,16,304,2094)     (16,165,2000,16K)    (0,7,75,781)      (1,9,100,1050)
native vector
  (16,125,1469,17469) (188,2500,25K,302K)  (0,29,53,238)     (0,1,16,200)
open source vector (using native std::sort)
  (0,0,78,1093)       (0,0,78,1100)        (0,0,15,203)      (0,0,15,203)
native list
  (31,250,3093,37500) (469,5K,63K,685K)    (1,12,130,1360)   (2,80,890,7600)
native list w/ allocator
  (31,250,3094,37400) (464,5K,63K,686K)    (0,9,85,917)      (0,9,100,1147)
open source list
  (0,63,781,9328)     (110,1391,15K,168K)  (0,22,145,1561)   (13,112,968,7307)
open source list w/ allocator
  (15,62,734,8672)    (85,1350,16K,168K)   (0,9,90,1162)     (1,10,125,1410)

[Intel Core2 @ 2.13 GHz processor (just a regular PC, running XP SP2)]
There is obviously a lot of observational error, but you get the idea
(and yes, reserve() was called on the vectors). You can see that I get
radically different results. Compiler B did fine, as long as I used my
own allocator concoction.


Lance

Mathias Gaunard

Apr 6, 2008, 7:54:45 PM
On Apr 4, 9:24 pm, Robi-Wan-Kenobi <robert.ka...@gmx.de> wrote:

> But in the future we intend to use a different key, which will not be
> sequential. Then I will have to use the map container to store the
> pointers under that new key.
> In some tests I just made, it turned out that things get slower with
> map. I generated a batch of these cache objects, stored them in the
> container, retrieved them from it about 1000 times, and erased them
> again. For that scenario the map is about four times slower than the
> vector. Using hash_map was only a little bit faster.
> Does anyone have experience with optimizing the map?

Did you try std::tr1::unordered_set? Which implementations (i.e. which
standard library or vendor) of the containers you tried are you using?

What's the type of your key? int?
What's your value type? (size, copying cost)
What's your usage pattern?
