typedef struct
{
    int a;
    int b;
    int c;
    int d;
} abcd;

class C
{
public:
    abcd key;
    char data[100];
};

for (std::list<C>::iterator i = list.begin(); i != list.end(); i++)
    if ( mya == i->key.a && myb == i->key.b &&
         myc == i->key.c && myd == i->key.d )
        // object has been found; access data[]
To improve performance, I converted the list to an STL map. I used the
struct as the map's key and I added the operator<() and operator==()
functions to allow the map to sort and find objects. I also replaced the
list's for loop with map's find() function.
if (map.find(mykey) != map.end())
// object has been found; access data[]
Everything is working correctly (adding, finding, deleting, etc) but the map
solution is over 100% slower than the list for searches (say 100,000
searches). I understand that a map has some additional overhead but I would
have expected this overhead to be negligible for 1500 elements. I can't
explain why a sequential list search is faster (the found objects are
randomly dispersed and are not at the beginning of the list).
Has anyone else noticed this performance issue? Using gcc on Linux.
Two possibilities come to mind.
1) Your comparison, above, is not an apples-to-apples comparison. Try
replacing your manual list-search loop with:
if (std::find(list.begin(), list.end(), key) != list.end())
Then you can benchmark std::find over the list against std::map::find() and
have a meaningful comparison. Unfortunately, the complexity of STL templates
sometimes exceeds the compiler's optimization aggressiveness, and it ends up
generating code that's slower than a manual search.
2) Although your searches may be random, there is some impact on a map from
the order the objects get inserted into the map, but I wouldn't expect even
the least optimal map insertion order to have such a major impact on
searches.
Sounds fishy. Show the operator< function that you wrote. Also, for
a map, you do not need operator==. You would need that operator if
you used the find algorithm with the list.
HTH
The operator< member function should be something like this:
bool abcd::operator<(const abcd & rhs) const
{
bool res = false;
if (a < rhs.a)
{
res = true;
}
else if (a == rhs.a)
{
if (b < rhs.b)
{
res = true;
}
else if (b == rhs.b)
{
if (c < rhs.c)
{
res = true;
}
else if (c == rhs.c)
{
if (d < rhs.d)
{
res = true;
}
}
}
}
return res;
}
HTH
Because with a map the search jumps around randomly in memory, while
with a list it proceeds sequentially. I have discovered that by using a
bidirectional iterator for quicksort instead of random access I gained
double the speed of quicksort... because it seems that the CPU reads
ahead into the memory cache while instructions are executing.
Of course, the nodes in the list have to be sorted by address.
>>
>> Has anyone else noticed this performance issue? Using gcc on Linux.
Yes. On a list whose nodes are sorted by address, a pass through the list
is twice as fast as with unordered nodes, no matter the size of the list
or where the nodes are. If the nodes are allocated one after another you
get an additional speed boost, perhaps another factor of two.
That's why a search over a list of 1500 elements is faster than the map:
the sequential search probably benefits from level-1 cache read-ahead,
which is 10-30 times faster than main memory.
Greets
Dear Sir,
I have just asked a question about speaking skills in sci.math - a very
necessary skill I would like to acquire through your experience, Sir.
I recently went to an interview for a business-consultant job and
definitely failed, but I am not entirely in a bad mood, because those
interviewers were a little limited in their viewpoints and evaluation
process. Admittedly there are things I have forgotten, things I (sadly)
missed while I was at school, but I definitely lack a particular
speaking skill that I strongly need to acquire over the next few weeks
or more. Thereby, seeking the advice and experience of Sir, The
Professor and Bear, is a must-do right now. A few posts alone don't
build tall pyramids, but they are priceless and appreciated things from
an online community, Sir.
About your question, Sir:
The built-in functions in those containers aim to be as versatile as
possible, to 'feed' a large number of coders with varied needs. So a
hefty import of different headers and libraries is performed during the
compiling and linking processes. The containers inherited from the base
classes (i.e. iostream) reimplement several to many virtual functions to
expose their polymorphism for current and future use with templates. The
list is more limited in functionality than the map container, which is
implemented as first-key-second-value pairs. I think that is why it is
much faster than the map in terms of runtime speed.
Ohhh, I read what I wrote and I don't understand what I wrote either :-D
I am sorry, Sir, but
I have been online too long
I better go now
Next time next better
Remember me, remember you
> Sounds fishy. Show the operator< function that you wrote. Also, for
> a map, you do not need operator==. You would need that operator if
> you used the find algorithm with the list.
>
>
> The operator< member function should be something like this:
>
> bool abcd::operator<(const abcd & rhs) const
> {
> bool res = false;
> if (a < rhs.a)
> {
> res = true;
> }
> else if (a == rhs.a)
> {
> if (b < rhs.b)
> {
> res = true;
> }
> else if (b == rhs.b)
> {
> if (c < rhs.c)
> {
> res = true;
> }
> else if (c == rhs.c)
> {
> if (d < rhs.d)
> {
> res = true;
> }
> }
> }
> }
> return res;
> }
Thank you. My operator<() function looks like this:
bool operator<(const key& k) const
{
if (a < k.a)
return true;
else
if (a > k.a)
return false;
if (b < k.b)
return true;
else
if (b > k.b)
return false;
if (c < k.c)
return true;
else
if (c > k.c)
return false;
if (d < k.d)
return true;
return false;
}
> Because in map, search goes with random access in memory,
> and with list it goes in sequential order. I have discovered
> that by using bidirectional iterator for quick sort instead of
> random access I gain double the speed of quick sort...
> because it seems that cpu reads ahead in memory
> cache while instructions are executing.
> Of course, nodes in list have to be sorted by address.
>
> Yes. On list where nodes are sorted by address pass through list
> is twice as fast as unordered nodes, no matter the size of list
> and where nodes are. If nodes are allocated one behind another
> you get additional speed boost, perhaps additional double speed.
> That's why , search on list on 1500 elements is faster than map,
> because probably sequential search goes through level 1 cache
> read ahead, which is 10-30 times faster than main memory.
Thank you for that explanation. It would explain what I'm seeing. When I
first familiarized myself with the STL (many years ago) I was led to believe
that map and set searches (as opposed to insertions) are generally faster
than list searches for large containers. The theory behind that reasoning
is well defined. It would indeed be unfortunate if the performance of these
container types depends on the CPU architecture.
>The operator< member function should be something like this:
>
>bool abcd::operator<(const abcd & rhs) const
>{
> bool res = false;
> if (a < rhs.a)
> {
> res = true;
> }
> else if (a == rhs.a)
> {
> if (b < rhs.b)
> {
> res = true;
> }
> else if (b == rhs.b)
> {
> if (c < rhs.c)
> {
> res = true;
> }
> else if (c == rhs.c)
> {
> if (d < rhs.d)
> {
> res = true;
> }
> }
> }
> }
> return res;
>}
or
bool abcd::operator<(const abcd & rhs) const
{
if (a != rhs.a)
return (a < rhs.a);
else if (b != rhs.b)
return (b < rhs.b);
else if (c != rhs.c)
return (c < rhs.c);
else
return (d < rhs.d);
}
Stephen Howe
I think you missed something. With the following code
#include <cstdlib>
#include <list>
#include <map>
#include <iostream>
#include <vector>
#include <time.h>
using namespace std;
struct abcd
{
int a;
int b;
int c;
int d;
abcd( int a, int b, int c, int d ) : a(a), b(b), c(c), d(d) {}
abcd() {}
bool operator<(const abcd& k) const
{
if (a < k.a)
return true;
else
if (a > k.a)
return false;
if (b < k.b)
return true;
else
if (b > k.b)
return false;
if (c < k.c)
return true;
else
if (c > k.c)
return false;
if (d < k.d)
return true;
return false;
}
};
struct C
{
abcd key;
char data[100];
};
std::list<C> l;
std::map<abcd,C> m;
vector<abcd> k;
void init()
{
srand( 42 );
for( int i = 0; i < 1500; ++i ) {
abcd aABCD( rand(), rand(), rand(), rand() );
C aC;
aC.key = aABCD;
l.push_back( aC );
m.insert( make_pair( aABCD, aC ) );
if( i%10 == 0 ) k.push_back( aABCD );
}
}
const int count = 1000;
int main()
{
init();
clock_t start = clock();
int x = 0;
for( int z = 0; z < count; ++z ) {
x = 0;
for( vector<abcd>::iterator j = k.begin(); j != k.end(); j++ ) {
for( list<C>::iterator i = l.begin(); i != l.end(); i++) {
if ( j->a == i->key.a && j->b == i->key.b &&
j->c == i->key.c && j->d == i->key.d ){
x++;
continue;
}
}
}
}
clock_t listtime = clock();
int y = 0;
for( int z = 0; z < count; ++z ) {
y = 0;
for( vector<abcd>::iterator j = k.begin(); j != k.end(); j++ ) {
if( m.find( *j ) != m.end() ) {
y++;
}
}
}
clock_t maptime = clock();
std::cerr << "List: " << listtime - start << " " << x << std::endl;
std::cerr << "Map: " << maptime - listtime << " " << y << std::endl;
}
I get:
List: 1780000 150
Map: 10000 150
so the map is 178 times faster than the list!
Tested on Linux with gcc 4.3.2, -O3 option, processor at a fixed clock.
Lars
or my favorite:
#include "boost/tuple/tuple.hpp"
#include "boost/tuple/tuple_comparison.hpp"
bool abcd::operator<(const abcd & rhs) const
{
using boost::tie; // or IIRC std::tr1::tie
return tie(a, b, c, d) < tie(rhs.a, rhs.b, rhs.c, rhs.d);
}
Jeff
An n^2 algorithm has to be slower than an n log n one, no matter what ;)
Greets
If an algorithm is O(f(n)) then the number of operations can be k*f(n) +
g(n), where g(n) contributes less than f(n) for large n. So for small
n, an O(n log n) algorithm may be slower than an O(n^2) algorithm. For
example, compare merge-sort to insertion-sort on a list of 3 elements.
--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
Agreed. btw happy new year ;)
Greets
Yes, Happy New Year to you too.
Happy New Year to everyone.
That looks wrong, you have to check for a > rhs.a etc. also.
"Leigh Johnston" <le...@i42.co.uk> wrote in message
news:p4ydnad4YemLnqPW...@giganews.com...
#include <windows.h>
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <list>
#include <set>
#include <vector>
struct timer
{
timer(const char* message)
{
std::cout << message;
QueryPerformanceCounter(&iStartTime);
}
~timer()
{
LARGE_INTEGER endTime;
QueryPerformanceCounter(&endTime);
LARGE_INTEGER freq;
QueryPerformanceFrequency(&freq);
std::cout.precision(4);
std::cout << std::fixed << (float)(endTime.QuadPart -
iStartTime.QuadPart)/(float)freq.QuadPart << " seconds" << std::endl;
}
LARGE_INTEGER iStartTime;
};
struct abcd
{
int a;
int b;
int c;
int d;
abcd() : a(rand() % 1000), b(rand() % 1000), c(rand() % 1000), d(rand() %
1000) {}
bool operator<(const abcd& rhs) const
{
if (a != rhs.a)
return (a < rhs.a);
else if (b != rhs.b)
return (b < rhs.b);
else if (c != rhs.c)
return (c < rhs.c);
else
return (d < rhs.d);
}
bool operator==(const abcd& rhs) const
{
return (a == rhs.a && b == rhs.b && c == rhs.c && d == rhs.d);
}
};
int main()
{
std::vector<abcd> v;
for (int i = 0; i != 1500; ++i)
v.push_back(abcd());
std::list<abcd> l;
std::multiset<abcd> s;
for (int i = 0; i != 1500; ++i)
{
l.push_back(v[i]);
s.insert(v[i]);
}
{
srand(0);
timer t("list: ");
for (int i = 1; i != 1000000; ++i)
std::find(l.begin(), l.end(), v[rand() % 1500]);
}
{
srand(0);
timer t("multiset: ");
for (int i = 1; i != 1000000; ++i)
s.find(v[rand() % 1500]);
}
}
---
list: 2.1494 seconds
multiset: 0.1614 seconds
multiset is equivalent to map for the purposes of this discussion.
Not so.
Both map and list store elements in nodes. Each node is allocated
separately.
> I have discovered
> that by using bidirectional iterator for quick sort instead of
> random access I gain double the speed of quick sort...
Cannot be true. Quicksort requires random-access iterators, which is why
you can't pass std::list<> iterators to std::sort(). std::list<> has a
sort() member function for sorting, which employs merge sort because
quicksort cannot be used on lists.
[]
>>> Has anyone else noticed this performance issue? Using gcc on Linux.
>
> Yes. On list where nodes are sorted by address pass through list
> is twice as fast as unordered nodes, no matter the size of list
> and where nodes are.
Sounds unlikely. Can't ignore CPU cache effects.
--
Max
Actually that's not true. Quicksort can be implemented for lists (with
the exact same computational complexity as with arrays). Introsort
(which is what std::sort uses) cannot be, because it relies on heap
sort and insertion sort.