typedef struct
{
    int a;
    int b;
    int c;
    int d;
} abcd;

class C
{
public:
    abcd key;
    char data[100];
};

for (std::list<C>::iterator i = list.begin(); i != list.end(); i++)
    if ( mya == i->key.a && myb == i->key.b &&
         myc == i->key.c && myd == i->key.d )
        // object has been found; access data[]
To improve performance, I converted the list to an STL map. I used the
struct as the map's key and I added the operator<() and operator==()
functions to allow the map to sort and find objects. I also replaced the
list's for loop with map's find() function.
if (map.find(mykey) != map.end())
// object has been found; access data[]
Everything is working correctly (adding, finding, deleting, etc) but the map
solution is over 100% slower than the list for searches (say 100,000
searches). I understand that a map has some additional overhead but I would
have expected this overhead to be negligible for 1500 elements. I can't
explain why a sequential list search is faster (the found objects are
randomly dispersed and are not at the beginning of the list).
Has anyone else noticed this performance issue? Using gcc on Linux.
Two possibilities come to mind.
1) Your comparison, above, is not an apples-to-apples comparison. Try
replacing your manual list-search loop with:
if (std::find(list.begin(), list.end(), key) != list.end())
Then you can benchmark std::find over the list against std::map::find() and
have a meaningful comparison. Unfortunately, the complexity of STL templates
sometimes exceeds the compiler's optimization aggressiveness, and it ends up
generating code that's slower than a manual search.
2) Although your searches may be random, there is some impact on a map from
the order the objects get inserted into the map, but I wouldn't expect even
the least optimal map insertion order to have such a major impact on
searches.
Sounds fishy. Show the operator< function that you wrote. Also, for
a map, you do not need operator==. You would need that operator if
you used the find algorithm with the list.
HTH
The operator< member function should be something like this:
bool abcd::operator<(const abcd & rhs) const
{
bool res = false;
if (a < rhs.a)
{
res = true;
}
else if (a == rhs.a)
{
if (b < rhs.b)
{
res = true;
}
else if (b == rhs.b)
{
if (c < rhs.c)
{
res = true;
}
else if (c == rhs.c)
{
if (d < rhs.d)
{
res = true;
}
}
}
}
return res;
}
HTH
Because with a map the search jumps around randomly in memory, while
with a list it proceeds sequentially. I have discovered that by using a
bidirectional iterator for quicksort instead of random access I gained
double the speed of quicksort... because it seems that the CPU reads
ahead into the memory cache while instructions are executing.
Of course, the nodes in the list have to be sorted by address.
>>
>> Has anyone else noticed this performance issue? Using gcc on Linux.
Yes. On a list whose nodes are sorted by address, a pass through the list
is twice as fast as with unordered nodes, no matter the size of the list
or where the nodes are. If the nodes are allocated one after another you
get an additional speed boost, perhaps another factor of two.
That's why a search over a list of 1500 elements is faster than the map:
the sequential search probably benefits from level-1 cache read-ahead,
which is 10-30 times faster than main memory.
Greets
Dear Sir,
I have just asked a question about speaking skills in sci.math - a very
necessary skill I would like to acquire through your experience, Sir.
I recently went to an interview for a business-consultant job and
definitely failed, but I am not entirely in a bad mood, because those
interviewers were a little limited in their viewpoints and evaluation
process. Admittedly there are things I have forgotten, things I (sadly)
missed while I was at school, but I definitely lack a particular
speaking skill that I strongly need to acquire over the next few weeks
or more. Thereby, seeking the advice and experience of Sir, The
Professor and Bear, is a must-do right now. A few posts alone don't
build tall pyramids, but they are priceless and appreciated things from
an online community, Sir.
About your question, Sir:
The built-in functions in those containers aim to be as versatile as
possible, to 'feed' a large number of coders with varied needs. So a
hefty import of different headers and libraries is performed during the
compiling and linking processes. The containers inherited from the base
classes (i.e. iostream) reimplement several to many virtual functions to
expose their polymorphism for current and future use with templates. The
list is more limited in functionality than the map container, which is
implemented as first-key-second-value pairs. I think that is why it is
much faster than the map in terms of runtime speed.
Ohhh, I read what I wrote and I don't understand what I wrote either :-D
I am sorry, Sir, but
I have been online too long
I better go now
Next time next better
Remember me, remember you
> Sounds fishy. Show the operator< function that you wrote. Also, for
> a map, you do not need operator==. You would need that operator if
> you used the find algorithm with the list.
>
>
> The operator< member function should be something like this:
>
> bool abcd::operator<(const abcd & rhs) const
> {
> bool res = false;
> if (a < rhs.a)
> {
> res = true;
> }
> else if (a == rhs.a)
> {
> if (b < rhs.b)
> {
> res = true;
> }
> else if (b == rhs.b)
> {
> if (c < rhs.c)
> {
> res = true;
> }
> else if (c == rhs.c)
> {
> if (d < rhs.d)
> {
> res = true;
> }
> }
> }
> }
> return res;
> }
Thank you. My operator<() function looks like this:
bool operator<(const key& k) const
{
if (a < k.a)
return true;
else
if (a > k.a)
return false;
if (b < k.b)
return true;
else
if (b > k.b)
return false;
if (c < k.c)
return true;
else
if (c > k.c)
return false;
if (d < k.d)
return true;
return false;
}
> Because in map, search goes with random access in memory,
> and with list it goes in sequential order. I have discovered
> that by using bidirectional iterator for quick sort instead of
> random access I gain double the speed of quick sort...
> because it seems that cpu reads ahead in memory
> cache while instructions are executing.
> Of course, nodes in list have to be sorted by address.
>
> Yes. On list where nodes are sorted by address pass through list
> is twice as fast as unordered nodes, no matter the size of list
> and where nodes are. If nodes are allocated one behind another
> you get additional speed boost, perhaps additional double speed.
> That's why , search on list on 1500 elements is faster than map,
> because probably sequential search goes through level 1 cache
> read ahead, which is 10-30 times faster than main memory.
Thank you for that explanation. It would explain what I'm seeing. When I
first familiarized myself with the STL (many years ago) I was led to believe
that map and set searches (as opposed to insertions) are generally faster
than list searches for large containers. The theory behind that reasoning
is well defined. It would indeed be unfortunate if the performance of these
container types depends on the CPU architecture.
>The operator< member function should be something like this:
>
>bool abcd::operator<(const abcd & rhs) const
>{
> bool res = false;
> if (a < rhs.a)
> {
> res = true;
> }
> else if (a == rhs.a)
> {
> if (b < rhs.b)
> {
> res = true;
> }
> else if (b == rhs.b)
> {
> if (c < rhs.c)
> {
> res = true;
> }
> else if (c == rhs.c)
> {
> if (d < rhs.d)
> {
> res = true;
> }
> }
> }
> }
> return res;
>}
or
bool abcd::operator<(const abcd & rhs) const
{
if (a != rhs.a)
return (a < rhs.a);
else if (b != rhs.b)
return (b < rhs.b);
else if (c != rhs.c)
return (c < rhs.c);
else
return (d < rhs.d);
}
Stephen Howe
I think you missed something. With the following code
#include <cstdlib>
#include <list>
#include <map>
#include <iostream>
#include <vector>
#include <time.h>
using namespace std;
struct abcd
{
int a;
int b;
int c;
int d;
abcd( int a, int b, int c, int d ) : a(a), b(b), c(c), d(d) {}
abcd() {}
bool operator<(const abcd& k) const
{
if (a < k.a)
return true;
else
if (a > k.a)
return false;
if (b < k.b)
return true;
else
if (b > k.b)
return false;
if (c < k.c)
return true;
else
if (c > k.c)
return false;
if (d < k.d)
return true;
return false;
}
};
struct C
{
abcd key;
char data[100];
};
std::list<C> l;
std::map<abcd,C> m;
vector<abcd> k;
void init()
{
srand( 42 );
for( int i = 0; i < 1500; ++i ) {
abcd aABCD( rand(), rand(), rand(), rand() );
C aC;
aC.key = aABCD;
l.push_back( aC );
m.insert( make_pair( aABCD, aC ) );
if( i%10 == 0 ) k.push_back( aABCD );
}
}
const int count = 1000;
int main()
{
init();
clock_t start = clock();
int x = 0;
for( int z = 0; z < count; ++z ) {
x = 0;
for( vector<abcd>::iterator j = k.begin(); j != k.end(); j++ ) {
for( list<C>::iterator i = l.begin(); i != l.end(); i++) {
if ( j->a == i->key.a && j->b == i->key.b &&
j->c == i->key.c && j->d == i->key.d ){
x++;
continue;
}
}
}
}
clock_t listtime = clock();
int y = 0;
for( int z = 0; z < count; ++z ) {
y = 0;
for( vector<abcd>::iterator j = k.begin(); j != k.end(); j++ ) {
if( m.find( *j ) != m.end() ) {
y++;
}
}
}
clock_t maptime = clock();
std::cerr << "List: " << listtime - start << " " << x << std::endl;
std::cerr << "Map: " << maptime - listtime << " " << y << std::endl;
}
I get:
List: 1780000 150
Map: 10000 150
so the map is 178 times faster than the list!
Tested on Linux with gcc 4.3.2, -O3 option, processor at a fixed clock.
Lars
or my favorite:
#include "boost/tuple/tuple.hpp"
#include "boost/tuple/tuple_comparison.hpp"
bool abcd::operator<(const abcd & rhs) const
{
using boost::tie; // or IIRC std::tr1::tie
return tie(a, b, c, d) < tie(rhs.a, rhs.b, rhs.c, rhs.d);
}
Jeff
An n^2 algorithm has to be slower than an n log n one, no matter what ;)
Greets
If an algorithm is O(f(n)) then the number of operations can be k*f(n) +
g(n), where g(n) contributes less than f(n) for large n. So for small
n, an O(n log n) algorithm may be slower than an O(n^2) algorithm. For
example, compare merge-sort to insertion-sort on a list of 3 elements.
--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
Agreed. btw happy new year ;)
Greets
Yes, Happy New Year to you too.
Happy New Year to everyone.
That looks wrong, you have to check for a > rhs.a etc. also.
"Leigh Johnston" <le...@i42.co.uk> wrote in message
news:p4ydnad4YemLnqPW...@giganews.com...
#include <windows.h>
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <list>
#include <set>
#include <vector>
struct timer
{
timer(const char* message)
{
std::cout << message;
QueryPerformanceCounter(&iStartTime);
}
~timer()
{
LARGE_INTEGER endTime;
QueryPerformanceCounter(&endTime);
LARGE_INTEGER freq;
QueryPerformanceFrequency(&freq);
std::cout.precision(4);
std::cout << std::fixed << (float)(endTime.QuadPart -
iStartTime.QuadPart)/(float)freq.QuadPart << " seconds" << std::endl;
}
LARGE_INTEGER iStartTime;
};
struct abcd
{
int a;
int b;
int c;
int d;
abcd() : a(rand() % 1000), b(rand() % 1000), c(rand() % 1000), d(rand() %
1000) {}
bool operator<(const abcd& rhs) const
{
if (a != rhs.a)
return (a < rhs.a);
else if (b != rhs.b)
return (b < rhs.b);
else if (c != rhs.c)
return (c < rhs.c);
else
return (d < rhs.d);
}
bool operator==(const abcd& rhs) const
{
return (a == rhs.a && b == rhs.b && c == rhs.c && d == rhs.d);
}
};
int main()
{
std::vector<abcd> v;
for (int i = 0; i != 1500; ++i)
v.push_back(abcd());
std::list<abcd> l;
std::multiset<abcd> s;
for (int i = 0; i != 1500; ++i)
{
l.push_back(v[i]);
s.insert(v[i]);
}
{
srand(0);
timer t("list: ");
for (int i = 1; i != 1000000; ++i)
std::find(l.begin(), l.end(), v[rand() % 1500]);
}
{
srand(0);
timer t("multiset: ");
for (int i = 1; i != 1000000; ++i)
s.find(v[rand() % 1500]);
}
}
---
list: 2.1494 seconds
multiset: 0.1614 seconds
multiset is equivalent to map for the purposes of this discussion.
Not so.
Both map and list store elements in nodes. Each node is allocated
separately.
> I have discovered
> that by using bidirectional iterator for quick sort instead of
> random access I gain double the speed of quick sort...
Cannot be true. Quicksort requires random-access iterators, which is why
you can't pass std::list<> iterators to std::sort(). std::list<> has a
sort() member function for sorting, which employs merge sort because
quicksort cannot be used on lists.
[]
>>> Has anyone else noticed this performance issue? Using gcc on Linux.
>
> Yes. On list where nodes are sorted by address pass through list
> is twice as fast as unordered nodes, no matter the size of list
> and where nodes are.
Sounds unlikely. Can't ignore CPU cache effects.
--
Max
Actually that's not true. Quicksort can be implemented for lists (with
the exact same computational complexity as with arrays). Introsort
(which is what std::sort uses) cannot be, because it relies on heap
sort and insertion sort.