Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

Iterating a std::vector vs iterating a std::map?

1 view

Skip to first unread message

carl

unread,

Nov 23, 2009, 7:47:37 PM11/23/09

I have a std::vector and a std::map containing the same number of elements.
Currently I experience that iterating all elements in the map is slower that
iterating all elements in the vector. Is this true ?

White Wolf

unread,

Nov 23, 2009, 8:07:30 PM11/23/09

Do you want us to decide if you are hallucinating or not? You have
said: you are experiencing it. So I guess then it is true. :)

If your question is: will iterating a map be slower than iterating a
vector, the answer is: most certainly. But there is another question
that must be asked: Should you care about it? And the answer is: In
most cases you don't.

See Scott Meyers: Effective STL for answers why a vector will be faster
than a map. But also see the constraints, because you cannot always use
a vector when you need a map.

A map contains ordered elements. It is a node based container, normally
implemented as a red-black tree. Because with a map you ask for
"ordering", you pay for it with the iteration being somewhat slower.
However that "slower" may not matter in your application at all, if what
you do with the elements of the map is slower. For example making
iteration faster won't make any different if you are printing out the
elements and you are waiting for the terminal to print and scroll...

--
BR, WW

carl

unread,

Nov 23, 2009, 8:20:18 PM11/23/09

"White Wolf" <wo...@freemail.hu> wrote in message
news:4b0b31d3$0$577$c5fe...@read01.usenet4all.se...

> carl wrote:
>> I have a std::vector and a std::map containing the same number of
>> elements. Currently I experience that iterating all elements in the map
>> is slower that iterating all elements in the vector. Is this true ?
>
> Do you want us to decide if you are hallucinating or not? You have said:
> you are experiencing it. So I guess then it is true. :)
>
> If your question is: will iterating a map be slower than iterating a
> vector, the answer is: most certainly. But there is another question that
> must be asked: Should you care about it? And the answer is: In most cases
> you don't.

For each voxel in a 3D volume (that typically contains 512*512*512 voxels) I
need to iterate a number of objects (typically from 0 - 30 elements). I
have reduced the numbers of objects stored in the std::map using a kd-tree.
But still I need to iterate all the elements in some of the operations.

Before I was just storing all elements in a std::vector and iterated them
all. Now I have changed to an implementation using a std::map and a kd-tree
but even though the number of elements are reduced drastically I get a much
worse performance compared to iterating all elements in the std::vector
(typically it contained 216 elements).

Paavo Helde

unread,

Nov 24, 2009, 2:06:01 AM11/24/09

"carl" <carl@.com> wrote in
news:4b0b34ea$0$280$1472...@news.sunsite.dk:

Then it means that your attempt of optimization by replacing std::vector
with std::map did not work out. You have to try something else.

The map typically uses a separate dynamic allocation for each object.
This means that is relatively slow, both by creation/deletion time as
well as when iterated over (because of bad locality). In performance-
critical code parts either a hash map or a plain array like std::vector
is typically better.

It seems you have a kind of image processing application. A rule of thumb
for such an application is that for fast processing one needs to have
only a limited number of plain arrays (like 2 or 3) describing your data,
which you can iterate over easily. The main thing is to keep locality
(next iteration access only memory locations near of the previous
iteration) and avoid excess dynamic allocations. Of course everything
depends on the usage type and goals.

Note that using standard containers in performance-critical code parts is
not straightforward and can kill the performance quite easily if one does
not have a clear overview what's happening behind the scenes. I'm not
advocating against using std::vector, but one has to be aware of the
potential problems and how to work around them. Notably the swap() member
function comes often handy in performance-critical code.

If the code still seems too slow, then one should use profiling to see if
there are any unexpected bottlenecks. You cannot rely on profiling only,
because if you write uniformly unefficient code all over the place there
will be no bottlenecks.

hth
Paavo

James Kanze

unread,

Nov 24, 2009, 5:28:50 AM11/24/09

On Nov 24, 7:06 am, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
> "carl" <carl@.com> wrote innews:4b0b34ea$0$280$1472...@news.sunsite.dk:

[...]

> Note that using standard containers in performance-critical
> code parts is not straightforward and can kill the performance
> quite easily if one does not have a clear overview what's
> happening behind the scenes. I'm not advocating against using
> std::vector, but one has to be aware of the potential problems
> and how to work around them. Notably the swap() member
> function comes often handy in performance-critical code.

As a general rule, you shouldn't use standard classes, or
classes from any general library, as part of the application
abstractions. You should always define your own classes, with
the exact interface you need (and no more). The standard
classes only appear in the implementation of these classes, so
you can swap them in or out as needed (or even replace them with
a custom implementation, if none of the standard classes meets
your needs).

> If the code still seems too slow, then one should use
> profiling to see if there are any unexpected bottlenecks. You
> cannot rely on profiling only, because if you write uniformly
> unefficient code all over the place there will be no
> bottlenecks.

Sure there will. The code which is executed most often will be
the bottleneck. You can generally start by not worrying too
much about efficiency; if you've encapsulated correctly, there
will then be no problem improving the efficiency of the places
which really are bottlenecks, i.e. the places which are executed
the most often.

--
James Kanze

Juha Nieminen

unread,

Nov 24, 2009, 11:37:14 AM11/24/09

James Kanze wrote:
> As a general rule, you shouldn't use standard classes, or
> classes from any general library, as part of the application
> abstractions. You should always define your own classes, with
> the exact interface you need (and no more). The standard
> classes only appear in the implementation of these classes, so
> you can swap them in or out as needed (or even replace them with
> a custom implementation, if none of the standard classes meets
> your needs).

While in certain large projects it's a good idea to abstract
implementation details away as much as possible, in smaller projects
trying to abstract everything away can be more work than it's worth.

For example, suppose you have some function or member function which
takes, let's say, a std::vector as parameter:

class Something
{
public:
void foo(const std::vector<unsigned>& values);
};

This is very unabstract code. It hard-codes both the data container
and the element type. One would think that it's a good idea to abstract
that away a bit, for example:

class Something
{
public:
typedef std::vector<unsigned> ValueContainer;
void foo(const ValueContainer& values);
};

Now if all the outside code uses Something::ValueContainer instead of
std::vector<unsigned>, everything is well? Maybe, except that that
typedef doesn't enforce anything. You can still see that what's being
used is really a std::vector<unsigned>, and nothing stops you from using
that type directly and passing instances of it to Something::foo().

Moreover, even if the outside code would strictly adhere to always
using Something::ValueContainer, exactly which member functions of that
type is it allowed to use? Since it is a std::vector, all of the
std::vector functions are free to be used, making it more difficult to
change the type of ValueContainer later to something else. You quickly
notice that, in fact, you didn't abstract anything away at all with that
typedef. You are just using an alias.

So if you want to truly abstract the type away you have to do it like:

class Something
{
public:
class ValueContainer;
void foo(const ValueContainer& values);
};

and then you define Something::ValueContainer as a class which contains
the necessary member functions for passing the numerical values to
Something::foo().

The problem? Now Something::ValueContainer will be way more limited
than std::vector is. For example, the calling code might benefit from
things like random access with that data container, so if ValueContainer
doesn't provide it, it hinders what the code can do.

Of course this is intentional: By not providing random access you are
more free to change the internal data structure to something else if
needed (eg. std::list). However, by being too careful like this, you are
now hindering the outside code.

What you could do is to add a way to initialize a
Something::ValueContainer with a set of values (from a container or an
iterator range). However, what you have done now is effectively move the
abstraction problem from Something to Something::ValueContainer,
achieving only little advantage. You could as well have done that with
Something::foo() directly.

It might also introduce an inefficiency because now values need to be
copied around instead of the calling code just using the same container
for both whatever it needs to do (eg. requiring random access) and
making Something read from it, so the values don't need to be copied
anywhere just for Something::foo() to read them.

Paavo Helde

unread,

Nov 24, 2009, 12:57:36 PM11/24/09

James Kanze <james...@gmail.com> wrote in news:5cdb6f3a-e2e0-48c1-
9080-cc3...@1g2000vbm.googlegroups.com:

> On Nov 24, 7:06 am, Paavo Helde <myfirstn...@osa.pri.ee> wrote:

>> "carl" <carl@.com> wrote innews:4b0b34ea$0$280$14726298
@news.sunsite.dk:

>> If the code still seems too slow, then one should use
>> profiling to see if there are any unexpected bottlenecks. You
>> cannot rely on profiling only, because if you write uniformly
>> unefficient code all over the place there will be no
>> bottlenecks.
>
> Sure there will. The code which is executed most often will be
> the bottleneck. You can generally start by not worrying too
> much about efficiency; if you've encapsulated correctly, there
> will then be no problem improving the efficiency of the places
> which really are bottlenecks, i.e. the places which are executed
> the most often.

Yes, *if encapsulated correctly*, there is no problem. However, I'm
afraid that the people writing unefficient code are not very proficient
in things like encapsulation, either. An example would be that somebody
writes all over the place:

A* a = new A();
a->f();
delete a;

If operator new pops up in the profiler, it is not clear which use cases
are legitimate, which are not, and fixing the problem is not so simple as
it must be done in many places. If it does not pop up in the profiler, it
just makes the program slower by few percents and nobody ever fixes it.
If there are many such things, these few percents are all added up...

Paavo

James Kanze

unread,

Nov 25, 2009, 4:46:28 AM11/25/09

On Nov 24, 4:37 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
> James Kanze wrote:
> > As a general rule, you shouldn't use standard classes, or
> > classes from any general library, as part of the application
> > abstractions. You should always define your own classes,
> > with the exact interface you need (and no more). The
> > standard classes only appear in the implementation of these
> > classes, so you can swap them in or out as needed (or even
> > replace them with a custom implementation, if none of the
> > standard classes meets your needs).

> While in certain large projects it's a good idea to abstract
> implementation details away as much as possible, in smaller
> projects trying to abstract everything away can be more work
> than it's worth.

"As a general rule" doesn't mean "absolutely with no
exceptions". In general, in all large projects and in most
small projects, it's worth defining your application specific
abstractions. If you're writing an editor, you don't use
std::string for your text buffer, you use TextBuffer, a class
that you've designed. (Of course, at least at the beginning,
TextBuffer will use std::string in its implementation.)

> For example, suppose you have some function or member function which
> takes, let's say, a std::vector as parameter:
>
> class Something
> {
> public:
> void foo(const std::vector<unsigned>& values);
> };

> This is very unabstract code. It hard-codes both the data
> container and the element type. One would think that it's a
> good idea to abstract that away a bit, for example:

> class Something
> {
> public:
> typedef std::vector<unsigned> ValueContainer;
> void foo(const ValueContainer& values);
> };

> Now if all the outside code uses Something::ValueContainer
> instead of std::vector<unsigned>, everything is well? Maybe,
> except that that typedef doesn't enforce anything. You can
> still see that what's being used is really a
> std::vector<unsigned>, and nothing stops you from using that
> type directly and passing instances of it to Something::foo().

First, it depends. Is this ValueContainer fundamentally part of
your application abstraction? If so, it should be a class, and
not a typedef. A class which exposes the interface defined by
the abstraction in your application, and not the interface
defined by std::vector. If not, if the abstraction is precisely
that of std::vector, then std::vector is fine.

Having said that, it's sometimes a bit delicate. I have more
than a few cases where the abstraction does wrap a standard
container, like std::vector, *and* includes iterators. In such
cases, it's very tempting to use a (member) typedef to define
the iterators, which does lock me into random access iterators,
even if the abstraction doesn't require them. I've thought
about creating a template to "downgrade" iterators, but I've not
gotten around to it.

> Moreover, even if the outside code would strictly adhere to
> always using Something::ValueContainer, exactly which member
> functions of that type is it allowed to use? Since it is a
> std::vector, all of the std::vector functions are free to be
> used, making it more difficult to change the type of
> ValueContainer later to something else. You quickly notice
> that, in fact, you didn't abstract anything away at all with
> that typedef. You are just using an alias.

In sum: typedef doesn't define a new abstraction. What else is
new?

> So if you want to truly abstract the type away you have to do
> it like:

> class Something
> {
> public:
> class ValueContainer;
> void foo(const ValueContainer& values);
> };

> and then you define Something::ValueContainer as a class which
> contains the necessary member functions for passing the
> numerical values to Something::foo().

> The problem? Now Something::ValueContainer will be way more
> limited than std::vector is. For example, the calling code
> might benefit from things like random access with that data
> container, so if ValueContainer doesn't provide it, it hinders
> what the code can do.

That's not the problem. That's rather exactly what we're trying
to achieve. My code guarantees a certain functionality for
ValueContainer, and only that functionality. Client code can't
use more than I guarantee (in theory, anyway), so I'm free to
change the implementation anyway I want, as long as I maintain
that functionality.

It's called encapsulation, and it's essential if you want to be
able to optimize the code later. Or evolve it in any other way.

> Of course this is intentional: By not providing random access
> you are more free to change the internal data structure to
> something else if needed (eg. std::list). However, by being
> too careful like this, you are now hindering the outside code.

You want to define a contract for the outside code. Ideally,
you'll define the contract in such a way that the outside code
can do whatever it needs to do, and you maintain all the
necessary freedom to implement the code however you want. Even
using a typedef to std::vector "hinders" outside code: it can't
hold an iterator into the container over an insertion, for
example. It's up to you, as the designer, to decide what you
want (or need) to support, and what you don't.

> What you could do is to add a way to initialize a
> Something::ValueContainer with a set of values (from a
> container or an iterator range). However, what you have done
> now is effectively move the abstraction problem from Something
> to Something::ValueContainer, achieving only little advantage.
> You could as well have done that with Something::foo()
> directly.

> It might also introduce an inefficiency because now values
> need to be copied around instead of the calling code just
> using the same container for both whatever it needs to do (eg.
> requiring random access) and making Something read from it, so
> the values don't need to be copied anywhere just for
> Something::foo() to read them.

I'm not sure I fully understand what you're getting at in the
above two paragraphs, but one thing is sure: *IF* performance is
ever an issue, you'd better hide all implementation details
behind such an abstraction, and limit client code as much as
possible, or you'll be overly constrained in you optimization
possibilities.

Don't forget, if you've not provided random access, and it later
becomes evident that client code needs it, you can always add
it. Adding functionality causes no problems. Removing it, on
the other hand, breaks client code. So you don't want to
provide anything you're not sure the client needs.

--
James Kanze

James Kanze

unread,

Nov 25, 2009, 4:55:07 AM11/25/09

On Nov 24, 5:57 pm, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
> James Kanze <james.ka...@gmail.com> wrote in news:5cdb6f3a-e2e0-48c1-
> 9080-cc35f58e1...@1g2000vbm.googlegroups.com:

> > On Nov 24, 7:06 am, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
> >> "carl" <carl@.com> wrote innews:4b0b34ea$0$280$14726298

> @news.sunsite.dk:

> >> If the code still seems too slow, then one should use
> >> profiling to see if there are any unexpected bottlenecks. You
> >> cannot rely on profiling only, because if you write uniformly
> >> unefficient code all over the place there will be no
> >> bottlenecks.

> > Sure there will. The code which is executed most often will be
> > the bottleneck. You can generally start by not worrying too
> > much about efficiency; if you've encapsulated correctly, there
> > will then be no problem improving the efficiency of the places
> > which really are bottlenecks, i.e. the places which are executed
> > the most often.

> Yes, *if encapsulated correctly*, there is no problem.
> However, I'm afraid that the people writing unefficient code
> are not very proficient in things like encapsulation, either.

Yes. People who write bad code will write bad code:-). They
have to be taught. But it's easy to check for proper
encapsulation during code review. On the other hand, it's
almost impossible to see where the bottlenecks will be until
you've actually measured. (At the lowest level, at least.
Obviously, if someone's used an O(n^2) algorithm, and you know
that there will be millions of elements to deal with, you don't
need the profiler to know that it's not going to work. You
should catch things like that in code review as well.)

> An example would be that somebody writes all over the place:

> A* a = new A();
> a->f();
> delete a;

> If operator new pops up in the profiler, it is not clear which
> use cases are legitimate, which are not, and fixing the
> problem is not so simple as it must be done in many places. If
> it does not pop up in the profiler, it just makes the program
> slower by few percents and nobody ever fixes it. If there are
> many such things, these few percents are all added up...

The problem above is deeper. C++ supports value operations, and
a programmer who doesn't use them needs to be taught C++. It
has nothing to do with performance (at least not in the first
line); it's a question of semantics. The difference between:

A* a = new A();
a->f();
delete a;

and

A a;
a.f();

isn't just performance---it's also number of lines of code
written, what happens if an exception is thrown, and what
happens if (later) someone assigns a to another variable (of the
same type). And all of those issues are more important than
performance, since they affect program correctness and
maintainability.

(In most languages, for a variety of reasons, doing things the
idiomatically correct way will result in better performance,
over all. In C++, this is particularly true. But in all cases,
the reason for doing them in this way is because it is the
idiomatically correct way, and not for performance reasons per
se.)

--
James Kanze

0 new messages