Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Container Performance

494 views
Skip to first unread message

Bjarne Stroustrup

unread,
May 29, 2003, 2:29:36 PM5/29/03
to
/* Standard Container Benchmark

Version 0.9, May 23, 2003

The primary purpose of this benchmark is to show how different standard
containers are in terms of performance. We have observed that programmers
often use sets, multisets, deques in the situations where sorted vectors
are the optimal solutions. Similarly, programmers often choose lists simply
because they do a few insertions and deletions without knowing that vectors
are more efficient at that for small containers.

Frequently, of course, performance does not matter,
but it is important that programmers are aware of the consequences of their
choices. We are not saying that only vectors should be used, there are
cases when one really needs a more complicated data structure, but one needs to
understand the price for additional functionality.

The secondary purpose of the benchmark is to encourage compiler and library vendors to
keep improving performance. For example, it is not acceptable that some compilers give
you a sizable penalty for using vector iterators instead of pointer. It is also quite
clear that performance of other standard containers could be improved.

The benchmark is doing the same task 7 times using different data structures:
array, vector (using a pointer as iterator), vector (using the defailt cector iterator),
list, deque, set and multiset. The task is to remove duplicates from a sequence of doubles.
This is a simple test, but it illustrates the performance of containers.

It is clear that the benchmark needs to be eventually extended
to slists and even to hash-sets and hash-maps, but we decided that at present it
should work only with the standard containers. As the standard grows, so can
the benchmark. The additional benefit of not including hash based containers is
that now all the test have identical asymptotic complexity and, even more
importantly, do almost the same number of comparisons. The difference is only
in data structure overhead and cache behavior.

The number of times that a given test is run inversly proportional to NlogN where N is the
size of the sequence. This allows one to see how containers of different size behave.
The instructive values used for the benchmark are: 10, 100, 1000, 10000, 100000, 1000000.

The time taken for a run of the benchmark depends on the machine used, the compiler used,
the compiler and optimizer settings used, as well as the standard library. Please note that
the time taken could be several minutes - and much more if you use a slow debug mode.

The output is a table where columns are separated by tabs and rows by newlines. This is
barely ok for a human reader, but ideal for feeding into a program, such as a spreadsheet
for display or analysis.

If you want to add your own test of a container, add the name of your container to
the "header string, write a test function like the existing ones, e.g. vector_test,
and add a call of "run" for your test in "run_tests".


Alex Stepanov
step...@adobe.com

Bjarne Stroustrup
b...@cs.tamu.edu

*/

#include <stddef.h> // some older implementations lack <cstddef>
#include <time.h>
#include <math.h>
#include <stdlib.h>

#include <vector>
#include <algorithm>
#include <list>
#include <deque>
#include <set>

#include <iostream>
#include <iomanip>

typedef double element_t;

using namespace std;

vector<double> result_times; // results are puched into this vector

const char header[] =
"\tarray"
"\tvector with pointers"
"\tvector with iterators"
"\tdeque"
"\tlist"
"\tset"
"\tmultiset";

void do_head()
{
cout << "size" << header << '\n';
}

int do_tail()
{
// in case you want to stop for confirmation in a windows console application
//char c;
//cin >> c;
return 0;
}

void do_row(int size)
{
cout << size;
cout << fixed << setprecision(2);
for (size_t i = 0; i < result_times.size(); ++i)
cout << '\t' << result_times[i];
cout << '\n';
}

clock_t start_time;

inline void start_timer() { start_time = clock(); }

inline double timer()
{
clock_t end_time = clock();
return (end_time - start_time)/double(CLOCKS_PER_SEC);
}

typedef void(*Test)(element_t*, element_t*, int);
void run(Test f, element_t* first, element_t* last, int number_of_times)
{
start_timer();
while (number_of_times-- > 0) f(first,last,number_of_times);
result_times.push_back(timer());
}

void array_test(element_t* first, element_t* last, int number_of_times)
{
element_t* array = new element_t[last - first];
copy(first, last, array);
sort(array, array + (last - first));
unique(array, array + (last - first));
delete [] array;
}

void vector_pointer_test(element_t* first, element_t* last, int number_of_times)
{
vector<element_t> container(first, last);
// &*container.begin() gets us a pointer to the first element
sort(&*container.begin(), &*container.end());
unique(&*container.begin(), &*container.end());
}

void vector_iterator_test(element_t* first, element_t* last, int number_of_times)
{
vector<element_t> container(first, last);
sort(container.begin(), container.end());
unique(container.begin(), container.end());
}

void deque_test(element_t* first, element_t* last, int number_of_times)
{
// deque<element_t> container(first, last); CANNOT BE USED BECAUSE OF MVC++ 6
deque<element_t> container(size_t(last - first), 0.0);
copy(first, last, container.begin());
sort(container.begin(), container.end());
unique(container.begin(), container.end());
}

void list_test(element_t* first, element_t* last, int number_of_times)
{
list<element_t> container(first, last);
container.sort();
container.unique();
}

void set_test(element_t* first, element_t* last, int number_of_times)
{
set<element_t> container(first, last);
}

void multiset_test(element_t* first, element_t* last, int number_of_times)
{
multiset<element_t> container(first, last);
typedef multiset<element_t>::iterator iterator;
{
iterator first = container.begin();
iterator last = container.end();

while (first != last) {
iterator next = first;
if (++next == last) break;
if (*first == *next)
container.erase(next);
else
++first;
}
}
}

void initialize(element_t* first, element_t* last)
{
element_t value = 0.0;
while (first != last) {
*first++ = value;
value += 1.;
}
}

double logtwo(double x)
{
return log(x)/log((double) 2.0);
}

const int largest_size = 1000000;

int number_of_tests(int size) {
double n = size;
double largest_n = largest_size;
return int(floor((largest_n * logtwo(largest_n)) / (n * logtwo(n))));
}

void run_tests(int size)
{
const int n = number_of_tests(size);
const size_t length = 2*size;
result_times.clear();

// make a random test set of the chosen size:
vector<element_t> buf(length);
element_t* buffer = &buf[0];
element_t* buffer_end = &buf[length];
initialize(buffer, buffer + size); // elements
initialize(buffer + size, buffer_end); // duplicate elements
random_shuffle(buffer, buffer_end);

// test the containers:
run(array_test, buffer, buffer_end, n);
run(vector_pointer_test, buffer, buffer_end, n);
run(vector_iterator_test, buffer, buffer_end, n);
run(deque_test, buffer, buffer_end, n);
run(list_test, buffer, buffer_end, n);
run(set_test, buffer, buffer_end, n);
run(multiset_test, buffer, buffer_end, n);
do_row(size);
}

int main()
{
do_head();
const int sizes [] = {10, 100, 1000, 10000, 100000, 1000000};
const int n = sizeof(sizes)/sizeof(int);
for (int i = 0; i < n; ++i) run_tests(sizes[i]);
return do_tail();
}


[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

Siemel Naran

unread,
May 30, 2003, 7:36:24 AM5/30/03
to
"Bjarne Stroustrup" <b...@research.att.com> wrote in message

> The primary purpose of this benchmark is to show how different standard
> containers are in terms of performance. We have observed that programmers
> often use sets, multisets, deques in the situations where sorted vectors
> are the optimal solutions. Similarly, programmers often choose lists
simply
> because they do a few insertions and deletions without knowing that
vectors
> are more efficient at that for small containers.

On BC6 release mode the results ==>

size array vector vector deque list set multiset
10 2.69 2.26 2.24 7.00 14.99 3.09 5.60
100 1.48 1.51 1.50 4.70 4.38 2.45 3.74
1000 1.41 1.44 1.44 3.73 3.28 2.01 2.95
10000 1.42 1.45 1.45 3.72 4.40 2.38 5.07
100000 1.86 1.88 1.88 4.38 9.74 6.52 9.38
1000000 2.46 2.34 2.31 4.96 6.92 6.23 8.54

--
+++++++++++
Siemel Naran

Tim Love

unread,
May 30, 2003, 2:34:04 PM5/30/03
to
"Siemel Naran" <Sieme...@KILL.att.net> writes:

>"Bjarne Stroustrup" <b...@research.att.com> wrote in message

>> The primary purpose of this benchmark is to show how different
standard
>> containers are in terms of performance. We have observed that
programmers
>> often use sets, multisets, deques in the situations where sorted
vectors
>> are the optimal solutions. Similarly, programmers often choose lists
>simply
>> because they do a few insertions and deletions without knowing that
>vectors
>> are more efficient at that for small containers.

>On BC6 release mode the results ==>

>size array vector vector deque list set multiset
>10 2.69 2.26 2.24 7.00 14.99 3.09 5.60
>100 1.48 1.51 1.50 4.70 4.38 2.45 3.74
>1000 1.41 1.44 1.44 3.73 3.28 2.01 2.95
>10000 1.42 1.45 1.45 3.72 4.40 2.38 5.07
>100000 1.86 1.88 1.88 4.38 9.74 6.52 9.38
>1000000 2.46 2.34 2.31 4.96 6.92 6.23 8.54

Here are the results from HP-UX's aCC -O
size array v ptr v it deque list set multiset

10 3.30 3.60 3.62 16.30 15.19 9.79 14.50
100 1.59 1.67 1.67 6.42 8.89 4.83 7.99
1000 1.35 1.41 1.40 5.03 7.40 3.83 5.78
10000 1.41 1.30 1.30 4.53 6.79 3.53 4.82
100000 1.37 1.44 1.43 4.53 11.28 4.54 7.61
1000000 1.52 1.54 1.53 4.69 17.59 9.17 12.35

Siemel Naran

unread,
May 31, 2003, 5:12:04 AM5/31/03
to
"Bjarne Stroustrup" <b...@research.att.com> wrote in message

> The primary purpose of this benchmark is to show how different standard


> containers are in terms of performance. We have observed that programmers
> often use sets, multisets, deques in the situations where sorted vectors
> are the optimal solutions. Similarly, programmers often choose lists
simply
> because they do a few insertions and deletions without knowing that
vectors
> are more efficient at that for small containers.

Sometimes we want a map<T,U> with the two phase usage that we'll populate
the map in phase one, and read from the map in phase two treating the map as
const. In this case it might be better to use a
std::vector<std::pair<T,U>>, populate the vector, sort, then use lower_bound
and upper_bound to search the vector.

Can anyone think of other examples or heuristics that tell us what container
to use where?

--
+++++++++++
Siemel Naran

Dhruv

unread,
May 31, 2003, 5:14:45 AM5/31/03
to
t...@eng.cam.ac.uk (Tim Love) wrote in message news:<bb7kdb$agu$1...@pegasus.csx.cam.ac.uk>...
> "Siemel Naran" <Sieme...@KILL.att.net> writes:
>

> >On BC6 release mode the results ==>
>
> >size array vector vector deque list set multiset
> >10 2.69 2.26 2.24 7.00 14.99 3.09 5.60
> >100 1.48 1.51 1.50 4.70 4.38 2.45 3.74
> >1000 1.41 1.44 1.44 3.73 3.28 2.01 2.95
> >10000 1.42 1.45 1.45 3.72 4.40 2.38 5.07
> >100000 1.86 1.88 1.88 4.38 9.74 6.52 9.38
> >1000000 2.46 2.34 2.31 4.96 6.92 6.23 8.54
>
> Here are the results from HP-UX's aCC -O
> size array v ptr v it deque list set multiset
>
> 10 3.30 3.60 3.62 16.30 15.19 9.79 14.50
> 100 1.59 1.67 1.67 6.42 8.89 4.83 7.99
> 1000 1.35 1.41 1.40 5.03 7.40 3.83 5.78
> 10000 1.41 1.30 1.30 4.53 6.79 3.53 4.82
> 100000 1.37 1.44 1.43 4.53 11.28 4.54 7.61
> 1000000 1.52 1.54 1.53 4.69 17.59 9.17 12.35
>

/* Results: On an AMD Duron 1GHz processor, 128M RAM.
g++ 3.2: gcc 3.2: libstdc++ (SGI derived).
Compile command: g++ -Wall -O3 -o container_test container_test.cpp


size array vector with pointers vector with iterators deque
list set multiset
10 1.04 1.17 1.42 2.49 10.20 2.57
4.44
100 0.76 0.76 1.04 1.88 3.37 2.00
3.11
1000 0.84 0.85 1.04 1.65 2.59 1.61
2.27
10000 1.06 1.05 1.24 1.75 5.64 3.26
5.95
100000 1.00 1.00 1.16 1.60 7.81 5.86
7.95
1000000 1.12 1.13 1.30 1.74 7.28 8.24
11.50

*/

Francis Glassborow

unread,
May 31, 2003, 11:34:15 AM5/31/03
to
In message <cf18e89.03053...@posting.google.com>, Dhruv
<dhru...@gmx.net> writes

>/* Results: On an AMD Duron 1GHz processor, 128M RAM.
> g++ 3.2: gcc 3.2: libstdc++ (SGI derived).
> Compile command: g++ -Wall -O3 -o container_test container_test.cpp
>
>
>size array vector with pointers vector with iterators deque
> list set multiset
>10 1.04 1.17 1.42 2.49 10.20 2.57
> 4.44
>100 0.76 0.76 1.04 1.88 3.37 2.00
> 3.11
>1000 0.84 0.85 1.04 1.65 2.59 1.61
> 2.27
>10000 1.06 1.05 1.24 1.75 5.64 3.26
> 5.95
>100000 1.00 1.00 1.16 1.60 7.81 5.86
> 7.95
>1000000 1.12 1.13 1.30 1.74 7.28 8.24
> 11.50
>
>*/
While the results I got using G++ in its MinGW form were slightly
different the overall pattern was the same. IIUC this suggests that if a
vector will do (you can accept the invalidation of iterators by
push_back()) using a vector with pointers is as good as using an array.
A vector with iterators is better in all cases than any other container
type.

The odd feature to my mind is the shape of the performance curve for
sets and multisets which seems U shaped for the data we have so far.

I suggest that one small addition and one small change makes posting
these results easier:

add

ofstream out("data.txt")

as a global variable :(

and replace uses of cout with out.


--
ACCU Spring Conference 2003 April 2-5
The Conference you should not have missed
ACCU Spring Conference 2004 Late April
Francis Glassborow ACCU

LLeweLLyn

unread,
May 31, 2003, 6:45:39 PM5/31/03
to

Here are results using g++3.3 -O3, on i686-linux-gnu, with some
formatting changes for easier reading:

size array vector with pointers vector with iterators

10 2.02 2.13 2.02
100 1.36 1.32 1.35
1000 1.33 1.31 1.40
10000 1.63 1.65 1.77
100000 1.50 1.53 1.60
1000000 1.78 1.73 1.69


size deque list set multiset
10 3.35 12.56 3.18 5.71
100 2.47 4.61 2.54 3.78
1000 2.26 3.60 2.06 3.06
10000 2.39 6.95 3.87 7.01
100000 2.30 9.54 6.90 9.57
1000000 2.44 8.91 9.76 13.46

It seems gcc 3.3 is finally bring gcc back down to the abstraction
penalty gcc 2.95.x had years ago ... :-/

(note: after seeing several runs, it seemed I noticed a variance of
about +/-10%, so don't read too much into differences in the lower
digits.)

John Potter

unread,
Jun 1, 2003, 4:35:35 AM6/1/03
to
On 29 May 2003 14:29:36 -0400, b...@research.att.com (Bjarne Stroustrup)
wrote:

[benchmark]

A nit on fairness. The array based containers do not destruct the
remove without erase items while the node based containers do for the
erased items.

vector_pointer_test
// unique(&*container.begin(), &*container.end());
container.erase(unique(&*container.begin(), &*container.end())
- &*container.begin() + container.begin(), container.end());

vector_iterator_test
// unique(container.begin(), container.end());
container.erase(unique(container.begin(), container.end()),
container.end());

deque_test
// unique(container.begin(), container.end());
container.erase(unique(container.begin(), container.end()),
container.end());

The difference is only noticable for size 10; however, it does add
consistency to the tests.

It might also be interesting to show the cost of using the general
algorithm when a specialization is available.

void list_test2(element_t* first, element_t* last, int number_of_times)


{
list<element_t> container(first, last);
container.sort();

container.erase(unique(container.begin(), container.end()),
container.end());
}

With a little help, it can also be used with multiset in place of the
custom code.

template <class Fiter>
struct UnconstIter {
// Minimal required for my tests. Add more if needed.
typedef forward_iterator_tag iterator_category;
typedef typename Fiter::value_type value_type;
typedef typename Fiter::difference_type difference_type;
typedef value_type* pointer;
typedef value_type& reference;
Fiter p;
UnconstIter (Fiter p) : p(p) { }
UnconstIter& operator++ () { ++ p; return *this; }
reference operator* () const { return const_cast<reference>(*p); }
bool operator== (UnconstIter rhs) const { return p == rhs.p; }
bool operator!= (UnconstIter rhs) const { return p != rhs.p; }
};

void multiset_test2(element_t* first, element_t* last,


int number_of_times)
{
multiset<element_t> container(first, last);

typedef UnconstIter<multiset<element_t>::iterator> Uit;
container.erase(unique(Uit(container.begin()),
Uit(container.end())).p, container.end());
}

Note: If you are thinking of posting that this will not work, please
try it and then study the data structure. For the unique part, the
multiset is a sorted sequence. For the erase part, it is a balanced
structure where the keys are not considered.

In both cases, I observed a 10% to 20% increase in time which decreased
as size increased. The only strange value was list at 1M where there
was a large increase. Using gcc only. Considering that creating the
ordered collection is NlgN and unique is N, the increase in total time
represents more than doubling the unique time. This seems a bit
strange.

Since there seems to be some interest in seeing results, here is
gcc-2.95.2 on an old RS6000 with AIX-4.3.

size array vectorP vectorI deque list listA set multi multiA
10 12.80 11.27 11.06 33.20 87.75 88.85 24.76 36.53 41.65
100 5.58 5.50 5.48 17.00 28.39 28.61 14.72 22.04 24.90
1000 5.26 5.25 5.26 14.97 26.16 26.71 13.77 21.23 24.38
10000 5.83 5.83 5.82 14.72 30.20 31.06 16.06 23.58 26.61
100000 6.80 6.78 6.82 15.41 41.48 44.14 25.78 35.57 39.66
1000000 7.28 7.29 7.32 15.61 35.29 49.37 31.95 40.90 45.85

Result observations.

The only posted case where I know that vector::iterator is not a
pointer is gcc 3.2. It seems that the compiler is not able to make
this simple pointer wrapper as good as a pointer.

The non-vector containers are much better with gcc-3.2 than above.

The containers with space overhead show the time increase when they
force swapping at smaller sizes.

The results should show users the high cost of non-vectors. Do they
show implementers that they are doing poorly or is it just a fact of
complexity?

John

John Potter

unread,
Jun 1, 2003, 4:37:17 AM6/1/03
to
On 31 May 2003 11:34:15 -0400, Francis Glassborow
<francis.g...@ntlworld.com> wrote:

> The odd feature to my mind is the shape of the performance curve for
> sets and multisets which seems U shaped for the data we have so far.

The tests are based upon an assumption that all tests take NlgN time.
The number of times the test is performed is reduced as the size of
the data set increases. If the assumption were true, all tests for a
container would produce the same result. Any deviation we see is likely
a result of lg rounding, allocator block size, core size, cache size and
other things unrelated to the containers.

My results show the same "U" shape for all containers. It may be more
noticable for set because of the high space overhead exhausting real
memory sooner.

> I suggest that one small addition and one small change makes posting
> these results easier:

Surely you jest. >data.txt Why would you hard wire a file name?

John

LLeweLLyn

unread,
Jun 1, 2003, 4:39:53 AM6/1/03
to
"Siemel Naran" <Sieme...@KILL.att.net> writes:

> "Bjarne Stroustrup" <b...@research.att.com> wrote in message
>
> > The primary purpose of this benchmark is to show how different standard
> > containers are in terms of performance. We have observed that programmers
> > often use sets, multisets, deques in the situations where sorted vectors
> > are the optimal solutions. Similarly, programmers often choose lists
> simply
> > because they do a few insertions and deletions without knowing that
> vectors
> > are more efficient at that for small containers.
>
> Sometimes we want a map<T,U> with the two phase usage that we'll populate
> the map in phase one, and read from the map in phase two treating the map as
> const. In this case it might be better to use a
> std::vector<std::pair<T,U>>, populate the vector, sort, then use lower_bound
> and upper_bound to search the vector.
>
> Can anyone think of other examples or heuristics that tell us what container
> to use where?

[snip]

*shrug* ... use vector if you think the total data size is smaller
than 1/2 your expected (L2) cache size?

Stephen Howe

unread,
Jun 1, 2003, 4:42:24 AM6/1/03
to
> void multiset_test(element_t* first, element_t* last, int number_of_times)
> {
> multiset<element_t> container(first, last);
> typedef multiset<element_t>::iterator iterator;
> {
> iterator first = container.begin();
> iterator last = container.end();
>
> while (first != last) {
> iterator next = first;
> if (++next == last) break;
> if (*first == *next)
> container.erase(next);
> else
> ++first;
> }
> }
> }

I prefer (tighter while loop)

void multiset_test(element_t* first, element_t* last, int number_of_times)
{
multiset<element_t> container(first, last);
typedef multiset<element_t>::iterator iterator;
{
iterator first = container.begin();
iterator last = container.end();

if (first != last) {
iterator next = first;
++next;

while (next != last) {
if (*first == *next)
container.erase(first);
first = next;
++next;
}
}
}
}

I also think that the deque test should be the same as the vector test and
if VC++ 6.0's predefined macro indicates VC6 then use copy() otherwise for
other compilers, use the deque range constructor.

Stephen Howe

THORSTEN OTTOSEN

unread,
Jun 1, 2003, 4:46:40 AM6/1/03
to
machine: 900MHz P3, Windows XP

On visual C++ v 7.1 ( cl /O2 )
-------------------------------
size array vector with pointers vector with iterators deque
listsetm
ultiset
10 3.04 3.19 3.72 10.81 21.41 5.91 11.02
100 1.61 1.63 1.91 6.30 6.52 3.42 6.12
1000 1.41 1.42 1.65 5.27 4.54 2.59 5.07
10000 1.27 1.26 1.45 5.09 4.52 2.85 6.42
100000 1.56 1.55 1.66 5.35 6.40 5.10 8.32
1000000 1.63 1.63 1.76 5.47 7.12 6.39 9.69

On Comeau 4.3.0.1 with visual 7.1 backend, libcomo (como /O2 )
--------------------------------------------------------
size array vector with pointers vector with iterators deque
listsetm
ultiset
10 1.52 1.60 1.59 7.27 40.10 5.48 6.94
100 0.76 0.78 0.78 3.69 15.68 3.21 4.06
1000 0.69 0.70 0.70 3.17 12.32 2.43 3.00
10000 0.83 0.78 0.67 3.04 11.15 2.15 3.05
100000 1.00 1.00 1.00 3.34 13.91 3.78 5.34
1000000 1.19 1.19 1.19 3.40 11.07 3.56 4.44

regards

Thorsten Ottosen, Aalborg University

Francis Glassborow

unread,
Jun 1, 2003, 9:04:12 AM6/1/03
to
In message <mugidv8o3uk2v58a4...@4ax.com>, John Potter
<jpo...@falcon.lhup.edu> writes

> > I suggest that one small addition and one small change makes posting
> > these results easier:
>
>Surely you jest. >data.txt Why would you hard wire a file name?

Because it was the minimal change to hard-wiring console output.

--
ACCU Spring Conference 2003 April 2-5
The Conference you should not have missed
ACCU Spring Conference 2004 Late April
Francis Glassborow ACCU

Troll_King

unread,
Jun 1, 2003, 9:05:39 AM6/1/03
to
b...@research.att.com (Bjarne Stroustrup) wrote in message
news:<HFnqC...@research.att.com>...

> /* Standard Container Benchmark
>
> Version 0.9, May 23, 2003
>
> The primary purpose of this benchmark is to show how different
standard
> containers are in terms of performance.

> Frequently, of course, performance does not matter,

> but it is important that programmers are aware of the consequences of
their
> choices. We are not saying that only vectors should be used, there are
> cases when one really needs a more complicated data structure, but one
needs to
> understand the price for additional functionality.

>

> The benchmark is doing the same task 7 times using different data
structures:
> array, vector (using a pointer as iterator), vector (using the defailt
cector iterator),
> list, deque, set and multiset. The task is to remove duplicates from a
sequence of doubles.
> This is a simple test, but it illustrates the performance of
containers.


If I were a Master Class programmer than I would exchange a class or
struct type instead for the 'double' built in type and observe vector
container performance degrate significantly compared to std::list. So
one can probably say that std::vector is the best choice in general
when storing built in types however look into other containers when
storing class and struct types.

Alex Vinokur

unread,
Jun 1, 2003, 1:05:08 PM6/1/03
to
###########################
### Windows, MINGW, g++ ###
###########################


==========================================
Windows 2000
Intel(R) Celeron(R) CPU 1.70 GHz
Total Physical Memory (RAM) 523,760 KB
Available Physical Memory 389,360 KB
Total Virtual Memory 1,407,956 KB
Available Virtual Memory 1,119,024 KB
Page File Space 884,196 KB

MinGW 2.0.0.-2
GNU g++ version 3.2 (mingw special 20020817-1)
The version of the C++ library in compressed ISO date format :
__GLIBCPP__ == 20020816
==========================================


--------------- No optimization ---------------
size array v-ptr v-iter deque list set multiset
10 2.39 2.56 4.17 6.71 40.98 11.80 16.60
100 2.84 2.86 5.54 8.17 20.01 9.35 11.76
1000 2.85 2.87 5.22 7.13 17.07 8.56 10.14
10000 2.93 2.91 5.07 6.63 23.09 11.96 17.82
100000 2.50 2.51 4.50 5.86 25.26 16.69 19.97
1000000 2.71 2.70 4.91 6.31 20.07 17.04 19.74


--------------- Optimization O1 ---------------
size array v-ptr v-iter deque list set multiset
10 3.39 3.42 4.67 6.30 13.68 3.38 6.38
100 1.81 1.78 2.05 3.63 5.92 3.07 4.57
1000 1.95 1.94 2.19 3.29 5.19 3.11 4.11
10000 2.13 2.12 2.15 3.27 13.29 7.76 14.25
100000 2.15 1.93 2.07 3.01 17.27 12.79 16.08
1000000 2.10 2.10 2.16 3.09 11.98 12.44 15.23


--------------- Optimization O2 ---------------
size array v-ptr v-iter deque list set multiset
10 3.44 3.53 3.48 6.73 14.67 3.97 7.28
100 2.12 1.88 1.86 3.77 6.40 3.23 5.24
1000 1.97 1.98 2.11 3.31 5.42 3.09 4.37
10000 2.15 2.14 2.21 3.23 13.04 7.56 13.84
100000 1.92 1.91 2.04 2.85 16.81 12.67 16.22
1000000 2.12 2.11 2.22 3.00 12.00 12.47 15.38


--------------- Optimization O3 ---------------
size array v-ptr v-iter deque list set multiset
10 2.36 2.43 3.50 6.28 13.18 3.69 6.29
100 1.28 1.28 1.77 3.65 6.25 3.13 4.54
1000 1.69 1.68 2.14 3.35 5.39 3.13 4.11
10000 1.96 1.95 2.25 3.29 13.18 7.59 13.72
100000 1.92 1.93 2.22 3.13 18.48 13.91 17.05
1000000 1.94 1.93 2.19 2.99 11.67 12.11 14.75

--
==========================================
Alex Vinokur
mailto:ale...@connect.to
http://www.simtel.net/pub/oth/19088.html
http://sourceforge.net/users/alexvn
==========================================

Alex Vinokur

unread,
Jun 1, 2003, 2:51:03 PM6/1/03
to
{This is an exact (or near-exact) duplicate of an article
that has already been processed. -mod}

############################
### Windows, CYGWIN, g++ ###
############################


==========================================
Windows 2000
Intel(R) Celeron(R) CPU 1.70 GHz
Total Physical Memory (RAM) 523,760 KB
Available Physical Memory 389,360 KB
Total Virtual Memory 1,407,956 KB
Available Virtual Memory 1,119,024 KB
Page File Space 884,196 KB

CYGWIN_NT-5.0 1.3.22(0.78/3/2)
GNU g++ version 3.2 20020927 (prerelease)


The version of the C++ library in compressed ISO date format :

__GLIBCPP__ == 20020927
==========================================


--------------- No optimization ---------------
size array v-ptr v-iter deque list set multiset

10 2.83 3.00 4.58 6.79 19.57 5.27 7.58
100 1.25 1.25 2.34 3.67 18.80 9.37 11.69
1000 2.78 2.78 4.93 6.84 15.65 8.43 10.01
10000 2.74 2.75 4.79 6.41 21.99 11.61 17.77
100000 2.45 2.44 4.29 5.68 24.75 16.83 20.59
1000000 2.57 2.58 4.54 5.91 22.46 21.29 25.45


--------------- Optimization O1 ---------------
size array v-ptr v-iter deque list set multiset

10 3.50 3.56 4.15 5.94 10.52 3.21 5.40
100 1.59 1.60 1.96 3.71 5.54 2.92 4.35
1000 1.92 1.92 2.10 3.25 4.98 2.92 3.86
10000 1.96 1.96 2.07 3.19 12.64 7.20 12.92
100000 1.83 1.84 1.91 2.80 17.07 12.86 16.46
1000000 1.97 1.98 2.04 2.92 14.06 16.91 20.91


--------------- Optimization O2 ---------------
size array v-ptr v-iter deque list set multiset

10 3.79 3.56 3.56 5.68 10.52 3.14 5.38
100 1.61 1.61 1.71 3.54 5.52 2.89 4.34
1000 1.92 1.92 2.08 3.20 4.88 2.94 3.85
10000 1.97 1.97 2.13 3.14 12.22 7.23 12.79
100000 1.86 1.84 1.99 2.76 16.77 12.96 16.46
1000000 1.98 2.00 2.11 2.86 13.84 16.91 20.85


--------------- Optimization O3 ---------------
size array v-ptr v-iter deque list set multiset

10 3.33 3.35 3.56 6.02 10.57 3.27 5.18
100 1.26 1.25 1.71 3.50 5.60 2.98 4.27
1000 1.64 1.65 2.09 3.26 4.91 2.98 3.71
10000 1.79 1.79 2.15 3.12 11.85 7.37 13.15
100000 1.86 1.87 2.17 3.01 18.30 14.32 18.03
1000000 1.85 1.87 2.13 2.86 13.83 17.11 20.95


--
==========================================
Alex Vinokur
mailto:ale...@connect.to
http://www.simtel.net/pub/oth/19088.html
http://sourceforge.net/users/alexvn
==========================================

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Alex Vinokur

unread,
Jun 1, 2003, 2:51:27 PM6/1/03
to
{This is an exact (or near-exact) duplicate of an article
that has already been processed. -mod}

###########################
### Windows, DJGPP, gpp ###
###########################


==========================================
Windows 2000
Intel(R) Celeron(R) CPU 1.70 GHz
Total Physical Memory (RAM) 523,760 KB
Available Physical Memory 389,360 KB
Total Virtual Memory 1,407,956 KB
Available Virtual Memory 1,119,024 KB
Page File Space 884,196 KB

DJGPP 2.03
GNU gpp version 3.2.1


The version of the C++ library in compressed ISO date format :

__GLIBCPP__ == 20021119
==========================================


--------------- No optimization ---------------
size array v-ptr v-iter deque list set multiset

10 2.09 2.25 3.57 5.16 20.55 5.44 8.41
100 2.86 2.86 5.22 7.86 19.89 9.12 11.43
1000 2.86 2.86 5.27 7.09 17.20 8.41 10.00
10000 2.80 3.24 4.95 6.37 22.58 11.21 17.09
100000 2.42 2.42 4.34 5.60 25.33 16.70 20.66
1000000 2.58 2.58 4.62 5.82 22.91 21.15 25.66


--------------- Optimization O1 ---------------
size array v-ptr v-iter deque list set multiset

10 2.97 3.02 3.63 4.78 11.54 3.35 6.32
100 1.59 1.65 1.98 3.24 5.99 2.86 4.40
1000 1.92 1.92 2.14 3.13 5.33 2.97 3.85
10000 1.98 1.98 2.09 3.13 13.24 7.36 13.30
100000 1.76 1.81 1.87 2.69 17.09 12.97 16.54
1000000 1.98 1.98 2.03 2.80 14.18 16.70 20.82


--------------- Optimization O2 ---------------
size array v-ptr v-iter deque list set multiset

10 2.91 3.02 3.08 5.05 15.49 3.41 6.26
100 1.70 1.65 1.70 3.30 5.93 2.86 4.56
1000 1.98 1.98 2.09 3.13 5.11 2.97 4.01
10000 1.98 1.98 2.09 3.08 12.58 7.20 13.19
100000 1.87 1.81 1.92 2.64 16.65 12.86 16.48
1000000 1.98 2.03 2.03 2.80 13.90 16.65 20.71


--------------- Optimization O3 ---------------
size array v-ptr v-iter deque list set multiset

10 1.59 1.70 3.08 5.27 11.65 4.84 5.82
100 1.21 1.26 1.76 3.41 5.82 2.91 4.34
1000 1.65 1.70 2.14 3.13 5.16 2.97 3.85
10000 1.81 1.81 2.14 3.08 12.97 7.20 12.91
100000 1.81 1.87 2.09 2.91 18.52 13.96 17.91
1000000 1.87 1.81 2.09 2.80 13.90 16.81 20.71


==========================================
Alex Vinokur
mailto:ale...@connect.to
http://www.simtel.net/pub/oth/19088.html
http://sourceforge.net/users/alexvn
==========================================

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

John Potter

unread,
Jun 1, 2003, 5:37:13 PM6/1/03
to
On 1 Jun 2003 04:42:24 -0400, "Stephen Howe"
<NOSPAM...@dial.pipex.com> wrote:

> I prefer (tighter while loop)

Correct comes first. The algorithm is required to keep the
first and remove the others. Yours keeps the last.

> iterator first = container.begin();
> iterator last = container.end();

> if (first != last) {
> iterator next = first;
> ++next;

> while (next != last) {
> if (*first == *next)
> container.erase(first);
> first = next;
> ++next;
> }

for (; next != last; ++ next)


if (*first == *next) {
container.erase(next);

next = first;
}
else
first = next;

I agree with your choice of a pretest and nice loop over the abnormal
exit loop. But this is just style and has nothing to do with the
objective of the benchmark.

John

Carl Barron

unread,
Jun 1, 2003, 5:39:45 PM6/1/03
to
Bjarne Stroustrup <b...@research.att.com> wrote:

> void vector_pointer_test(element_t* first, element_t* last,
int number_of_times)
> {
> vector<element_t> container(first, last);
> // &*container.begin() gets us a pointer to the first element
> sort(&*container.begin(), &*container.end());
> unique(&*container.begin(), &*container.end());
> }

nit pick: vector<T,A>::end() does not need to be dereferenceable. but
a vector is in a single block so sort(&container.front(),
&container.back()+1); unique(&constainer.front(), &container.back()+1);
is more portable:)

John Potter

unread,
Jun 1, 2003, 5:42:14 PM6/1/03
to
On 1 Jun 2003 09:05:39 -0400, trol...@shaw.ca (Troll_King) wrote:

> b...@research.att.com (Bjarne Stroustrup) wrote in message
> news:<HFnqC...@research.att.com>...

> > The primary purpose of this benchmark is to show how different


> > standard containers are in terms of performance.

> If I were a Master Class programmer than I would exchange a class or


> struct type instead for the 'double' built in type and observe vector
> container performance degrate significantly compared to std::list.

Being a curious third class hacker, I made the exchange.

double
size array vectP vectI deque list listA set mult multA
10 1.67 1.73 1.75 2.81 4.95 5.88 1.55 2.56 3.12
100 0.75 0.77 0.83 1.66 2.56 3.06 1.38 2.09 2.45
1000 0.91 0.91 0.97 1.48 2.28 2.62 1.39 1.83 2.16
10000 0.84 0.89 0.92 1.34 3.17 3.48 1.44 3.78 4.25
100000 0.83 0.83 0.91 1.20 6.25 6.89 4.70 6.11 6.64
1000000 0.83 0.86 0.92 1.25 5.33 8.61 6.28 7.67 8.53

struct { char[6]; };
size array vectP vectI deque list listA set mult multA
10 7.41 7.00 7.08 8.66 8.88 9.30 5.53 7.56 7.99
100 5.45 5.55 5.56 6.19 6.06 6.14 5.14 6.30 6.80
1000 5.41 5.41 5.44 5.83 5.48 5.59 4.86 5.69 6.03
10000 5.20 5.09 5.24 5.56 6.53 6.72 5.09 7.30 7.88
100000 4.75 4.75 4.84 5.06 9.45 10.14 7.67 9.39 9.89
1000000 5.03 5.09 5.17 5.33 8.67 12.08 9.38 10.95 11.84

string
size array vectP vectI deque list listA set mult multA
10 24.78 22.34 22.61 24.06 11.36 12.95 6.94 9.88 11.44
100 15.50 14.67 14.88 15.25 7.27 8.05 5.75 7.28 8.17
1000 13.69 13.11 13.34 13.83 6.47 6.97 5.23 6.30 7.02
10000 13.67 13.55 13.56 14.09 11.05 11.86 8.67 11.55 12.73
100000 14.31 13.94 14.00 14.41 14.22 15.36 12.02 13.95 15.00
1000000 16.28 15.89 16.16 16.39 14.59 19.09 15.02 17.98 19.77

A reasonable compare point seems to be 1000. Array based shines
for double. All containers look alike for struct. Node based
shines for dynamic members. For large enough size, everything
becomes i/o bound cobol. Node based has smaller large.

The purpose of this benchmark is?

John

Siemel Naran

unread,
Jun 1, 2003, 5:43:21 PM6/1/03
to
"Stephen Howe" <NOSPAM...@dial.pipex.com> wrote in message
news:3ed9614a$0

> I also think that the deque test should be the same as the vector test and

A deque<T> is conceptually like a list of vector<T>. Thus a deque::iterator
has contains two iterators -- a vector<T>::iterator for the element within
the list/segment, and a list<vector<T>>::iterator for the current
list/segment. For deque<T>::iterator::operator== the system has to compare
vector::iterator only assuming different deques don't share segments. But
deque<T>::operator++ is inefficient: program has to do operator++ on the
vector::iterator (as always) then check to see if we are at the end of the
segment and if so go to the start of the next segment (the extra step).

But an implementation may specialize for_each, find_if, unique,sort and the
others for deque<iterator> by rewriting the algorithm as a double nested
loop. The outer loops over segments, and the inner loops over elements in
the segment. This way the inner loop does not have to concern itself with
checking if we are at the end of the segment and if so go to the start of
the next segment (the extra step).

My implementation does not have any of these specializations. Though it
does have a specialization for find_if for random access iterators for
manual loop unrolling, where instead of looping N times for an array of N
elements they loop N/4 times where the body of the loop does 4 comparisons.

With the right specializations for all the algorithms I imagine deque will
be almost the same as vector, maybe just 5% more. To test this will be a
lengthy project, though I hope someone does it. We'd have to
[ ] create a class __segmented_access_category_tag derived from
random_access_category_tag
[ ] use it in class deque::iterator and const_iterator
[ ] create member functions __segment_begin and __segment_end
[ ] create typedef to be returned by above functions
[ ] create member functions __element_begin and __element_end
[ ] create typedef to be returned by above functions
[ ] specialize all the algorithms
Will attempt this soon for some of the algorithms.

Also, for best results, the compiler should optimize small user types as it
does the builtin types. I believe KCC does this optimization.

--
+++++++++++
Siemel Naran

Andy Sawyer

unread,
Jun 2, 2003, 12:11:44 AM6/2/03
to
In article <QYxxC2B$Fc2+...@robinton.demon.co.uk>,
on 1 Jun 2003 09:04:12 -0400,
Francis Glassborow <francis.g...@ntlworld.com> wrote:

> In message <mugidv8o3uk2v58a4...@4ax.com>, John Potter
> <jpo...@falcon.lhup.edu> writes
> > > I suggest that one small addition and one small change makes
posting
> > > these results easier:
> >
> >Surely you jest. >data.txt Why would you hard wire a file name?
>
> Because it was the minimal change to hard-wiring console output.

Except the original does not have "hard-wired" console output, it has
"hard-wired" "standard output" output. Not the same thing at all. On
most platforms where this benchmark is likely to be run, this is
trivial to redirect to a file with whatever name you like (including,
but certainly not limited to, 'data.txt'). (Of course, 'data.txt' may
not be a valid filename on some platforms)

--
"Light thinks it travels faster than anything but it is wrong. No matter
how fast light travels it finds the darkness has always got there
first,
and is waiting for it." -- Terry Pratchett, Reaper Man

Alex Vinokur

unread,
Jun 2, 2003, 5:31:56 AM6/2/03
to
| ######################################
| ######### Windows, DJGPP, gpp ########
| ### Using C/C++ Program Perfometer ###
| ######################################


Performance was measured with using C/C++ Program Perfometer.
http://alexvn.freeservers.com/s1/perfometer.html
http://sourceforge.net/projects/cpp-perfometer


| ==========================================
| Windows 2000
| Intel(R) Celeron(R) CPU 1.70 GHz

| DJGPP 2.03
| GNU gpp version 3.2.1

| ==========================================

| --------------------------------
| Fragments of the Raw Run Log
| Note. Full Raw Run Log can be seen at :
| http://groups.google.com/groups?th=2407b19b6bdb96c9
| --------------------------------

Comparative Analysis of Container Performance
---------------------------------------------

No optimization
: ================================
: array-10 -> 12
: array-100 -> 135
: array-1000 -> 1980
: --------------------------------
: vector-ptr-10 -> 10
: vector-ptr-100 -> 136
: vector-ptr-1000 -> 1976
: --------------------------------
: vector-iter-10 -> 22
: vector-iter-100 -> 290
: vector-iter-1000 -> 3964
: --------------------------------
: deque-10 -> 30
: deque-100 -> 390
: deque-1000 -> 5145
: --------------------------------
: list-10 -> 105
: list-100 -> 955
: list-1000 -> 11840
: --------------------------------
: set-10 -> 25
: set-100 -> 433
: set-1000 -> 5960
: --------------------------------
: multiset-10 -> 41
: multiset-100 -> 565
: multiset-1000 -> 7119
: ================================


--

Troll_King

unread,
Jun 2, 2003, 7:54:57 AM6/2/03
to
John Potter <jpo...@falcon.lhup.edu> wrote in message news:<esdkdvk1gmfe030u9...@4ax.com>...

> On 1 Jun 2003 09:05:39 -0400, trol...@shaw.ca (Troll_King) wrote:
>
> > b...@research.att.com (Bjarne Stroustrup) wrote in message
> > news:<HFnqC...@research.att.com>...
>
> > > The primary purpose of this benchmark is to show how different
> > > standard containers are in terms of performance.
>
{excessive quote snipped -mod}

If you use a class type with a copy constructor and assignment
operator, than std::vector performs a lot worse than the test types
you used, but anyway this experiment shows me that GNU Linux g++ 3.2
on RH 9.0 vector using iterators performs much worse than the vector
using pointers (using the original program) :
.... array vecP vecI
10 2.17 2.49 5.95
100 1.68 1.70 4.20
1000 1.57 1.58 3.88
10000 1.76 1.74 3.86
100000 1.77 1.77 3.58
1000000 2.01 2.02 4.00

GNU Linux just doesn't seem to have much respect for Standard C++, but
I think all that's going to change in the future.

Stephen Howe

unread,
Jun 2, 2003, 2:04:29 PM6/2/03
to
> Correct comes first. The algorithm is required to keep the
> first and remove the others. Yours keeps the last.

It won't make any difference for double but for other element types it will.
Yes you are right.

Stephen Howe

Dhruv

unread,
Jun 2, 2003, 2:37:11 PM6/2/03
to
trol...@shaw.ca (Troll_King) wrote in message news:<39ca89a9.03060...@posting.google.com>...

> John Potter <jpo...@falcon.lhup.edu> wrote in message news:<esdkdvk1gmfe030u9...@4ax.com>...
> > On 1 Jun 2003 09:05:39 -0400, trol...@shaw.ca (Troll_King) wrote:
> >
> > > b...@research.att.com (Bjarne Stroustrup) wrote in message
> > > news:<HFnqC...@research.att.com>...
>
> > > > The primary purpose of this benchmark is to show how different
> > > > standard containers are in terms of performance.
> >
> {excessive quote snipped -mod}
> { another major snip here } ......


Well, after going through all the posts, I think that the performance
depends not only on the choice of the container, but also on the
compiler, and the particular implementation of the library. Ok, well
that's pretty obvious. Also, I read a post which showed that the
performance of the containers (such as vector, whic rely on copy for
moving about their memebers) was pretty bad, if not pathetic. I did
not think of this originally.

If you look more closely at the performance of the map and multimap,
you will see that the performance of those for small data sets is
better , and worse for large data sets for the SGI derived
implementations, while its opposite for the Dinkumware
implementation. I'm no expert on maps, and multipaps (well, never used
them before EVER!!!) Also, I think that a lot also depends upon the
allocators provided by these implementations. One fact to note is that
the SGI allocators do not return memory to the OS after their
destructor has been invoked, while others might or might not, you
never know. Here, during testing the time for the construction ad
destruction of the object, and hence the underlying allocator had also
been taken into account, since the containers atre local to their
respective functions. I donot know if this will yield expected results
for allocators that do actually give back to the OS the memory that
they acquire.

Also, I would like to know if I could implement a vector (that is
standards compilant) in a way that the elements are not contiguous in
memory. This wil improve the performance of the vector for non-POD
types (I think). So, we could specialize this implementation for
non-POD's. I asked this because I'm currently trying to re-write some
standard containers that are frequently used (such as list, vector,
and later maybe map, and string) for improving thie performance. This
project will be called (nstl => non-standard template library), which
will work with the stl, it's not meant as a replacement.

Well, getting back to the vector, I intend making it as an array of
pointers to the type to be stored. That means that all the
requirements like conatant elemental access, and linear insertion, and
deletiond will be satisfied, but scaled down by a huge margin for huge
data types. If anyone has implemented such a thing, please mail
me. I'll try to use it instead of starting from scratch.

Also, 1 more thing... How would it differ if the array_test function
instead of having called opreator new, allocated the memory form the
runtime stack of the program, ok, I know the size should be known, but
for now assume that it is. I do not know much about cache, but I think
that the code and data cache is different. Please could someone thrwo
more light onto that??????


-Dhruv.

John Potter

unread,
Jun 2, 2003, 2:39:14 PM6/2/03
to
On 2 Jun 2003 07:54:57 -0400, trol...@shaw.ca (Troll_King) wrote:

> If you use a class type with a copy constructor and assignment
> operator, than std::vector performs a lot worse than the test types
> you used,

I'm pretty sure that std::string has those things. For small
sizes up to 1000, vector took twice the time of node based containers.
For larger sizes, the space overhead of the node based containers
destroyed their advantage and all containers look about the same.

> but anyway this experiment shows me that GNU Linux g++ 3.2
> on RH 9.0 vector using iterators performs much worse than the vector
> using pointers (using the original program) :

> .... array vecP vecI
> 10 2.17 2.49 5.95
> 100 1.68 1.70 4.20
> 1000 1.57 1.58 3.88
> 10000 1.76 1.74 3.86
> 100000 1.77 1.77 3.58
> 1000000 2.01 2.02 4.00

You rally do need to find the -O switch. When you ask for slow code,
you will get it. Same code, same compiler, different box/os.

No optimization
10 2.77 3.00 4.53
100 1.27 1.26 2.33
1000 1.30 1.31 2.30
10000 1.24 1.23 2.16
100000 1.12 1.12 1.99
1000000 1.19 1.17 2.06

-O2
10 1.67 1.64 1.70
100 0.77 0.78 0.81
1000 0.89 0.88 0.97
10000 0.83 0.83 0.92
100000 0.81 0.81 0.86
1000000 0.84 0.83 0.94

John

Andy Sawyer

unread,
Jun 3, 2003, 5:26:56 AM6/3/03
to
In article <cf18e89.03060...@posting.google.com>,
on 2 Jun 2003 14:37:11 -0400,
dhru...@gmx.net (Dhruv) wrote:

> Also, I would like to know if I could implement a vector (that is
> standards compilant) in a way that the elements are not contiguous
> in memory.

A vector implemented in such a way would conform with the letter (if
not the intent) of ISO/IEC 14882:1998(E). However, TC1 makes it clear
that the storage for vector should be contiguous.

> Well, getting back to the vector, I intend making it as an array of
> pointers to the type to be stored. That means that all the
> requirements like conatant elemental access, and linear insertion, and
> deletiond will be satisfied, but scaled down by a huge margin for huge
> data types. If anyone has implemented such a thing, please mail
> me. I'll try to use it instead of starting from scratch.

There are many other possibilites with a container like this, and it's
certainly useful. It's also not limited in functionality to "huge"
types - it's handy for small types with expensive copy/assignment
operations. It's also (effectively) a deque with a page-size of 1.

Regards,
Andy S.


--
"Light thinks it travels faster than anything but it is wrong. No matter
how fast light travels it finds the darkness has always got there first,
and is waiting for it." -- Terry Pratchett, Reaper Man

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Rayiner Hashem

unread,
Jun 3, 2003, 5:31:29 AM6/3/03
to
> GNU Linux just doesn't seem to have much respect for Standard C++, but
> I think all that's going to change in the future.
Odd, on my machine (Gentoo, G++ 3.2.2) the vector with iterators is
only a few tenths of a second slower than the vector with pointers.

Troll_King

unread,
Jun 3, 2003, 5:35:50 AM6/3/03
to
John Potter <jpo...@falcon.lhup.edu> wrote in message news:<lu1ndvo9q6md1n338...@4ax.com>...

Could you show me the exact instruction you entered to compile that
code using g++? I had trouble with this because simply adding -02
isn't recognized by g++.

With regard to the data types, see C++ Primer 3rd edition page 254 -
260. The std::string has a different capacity than a large complex
class, so the full penalty of std::vector isn't realized.

Francis Glassborow

unread,
Jun 3, 2003, 9:26:15 AM6/3/03
to
In message <cf18e89.03060...@posting.google.com>, Dhruv
<dhru...@gmx.net> writes

>Also, I would like to know if I could implement a vector (that is
>standards compilant) in a way that the elements are not contiguous in
>memory.

No you cannot. It is effectively a requirement that vector contain its
data in an array.


--
ACCU Spring Conference 2003 April 2-5
The Conference you should not have missed
ACCU Spring Conference 2004 Late April
Francis Glassborow ACCU

John Potter

unread,
Jun 3, 2003, 3:04:22 PM6/3/03
to
On 1 Jun 2003 04:35:35 -0400, John Potter <jpo...@falcon.lhup.edu>
wrote:

> In both cases, I observed a 10% to 20% increase in time which decreased
> as size increased. The only strange value was list at 1M where there
> was a large increase. Using gcc only. Considering that creating the
> ordered collection is NlgN and unique is N, the increase in total time
> represents more than doubling the unique time. This seems a bit
> strange.

Some investigation showed that the difference is a result of locality of
reference. For size == 1000000 whichever test, list.unique or unique(),
runs second has the large added time. Having sorted the nodes and then
deallocated them produces poor locality for the second build and sort.

As always, one wonders what the benchmark is measuring.

John Potter

unread,
Jun 3, 2003, 5:26:46 PM6/3/03
to
On 3 Jun 2003 05:35:50 -0400, trol...@shaw.ca (Troll_King) wrote:

> Could you show me the exact instruction you entered to compile that
> code using g++? I had trouble with this because simply adding -02
> isn't recognized by g++.

g++ [options] [files]

$ g++ -O2 ct.cc

John

LLeweLLyn

unread,
Jun 3, 2003, 5:56:45 PM6/3/03
to
hel...@mindspring.com (Rayiner Hashem) writes:

> > GNU Linux just doesn't seem to have much respect for Standard C++, but
> > I think all that's going to change in the future.
> Odd, on my machine (Gentoo, G++ 3.2.2) the vector with iterators is
> only a few tenths of a second slower than the vector with pointers.

[snip]

And on my machine, mandrake linux with g++ 3.3 -O3, vector with
iterators is sometimes a few percent _faster_ and sometimes a few
percent slower; e.g. the difference between a pointer and an
iterator is less then the error margin of the test.

LLeweLLyn

unread,
Jun 3, 2003, 5:57:26 PM6/3/03
to
trol...@shaw.ca (Troll_King) writes:

> John Potter <jpo...@falcon.lhup.edu> wrote in message news:<lu1ndvo9q6md1n338...@4ax.com>...
> > On 2 Jun 2003 07:54:57 -0400, trol...@shaw.ca (Troll_King)
> wrote:

[snip]


> > You rally do need to find the -O switch. When you ask for slow code,
> > you will get it. Same code, same compiler, different box/os.
> >
> > No optimization
> > 10 2.77 3.00 4.53
> > 100 1.27 1.26 2.33
> > 1000 1.30 1.31 2.30
> > 10000 1.24 1.23 2.16
> > 100000 1.12 1.12 1.99
> > 1000000 1.19 1.17 2.06
> >
> > -O2
> > 10 1.67 1.64 1.70
> > 100 0.77 0.78 0.81
> > 1000 0.89 0.88 0.97
> > 10000 0.83 0.83 0.92
> > 100000 0.81 0.81 0.86
> > 1000000 0.84 0.83 0.94

[snip]


> Could you show me the exact instruction you entered to compile that
> code using g++? I had trouble with this because simply adding -02

_Letter_ O, not numeral 0.

> isn't recognized by g++.

[snip]

Juding from open source projects, -O2 is probably the most widely used
flag for compiling with g++.

Louis Lavery

unread,
Jun 3, 2003, 6:00:26 PM6/3/03
to

Bjarne Stroustrup <b...@research.att.com> wrote in message
news:HFnqC...@research.att.com...

> /* Standard Container Benchmark
>
> Version 0.9, May 23, 2003
>
> The primary purpose of this benchmark is to show how different standard
> containers are in terms of performance. We have observed that programmers

[snip]


>
> const char header[] =
> "\tarray"
> "\tvector with pointers"
> "\tvector with iterators"
> "\tdeque"
> "\tlist"
> "\tset"
> "\tmultiset";

Um, why not std::string?

Louis.

LLeweLLyn

unread,
Jun 4, 2003, 7:49:17 AM6/4/03
to
"Louis Lavery" <Lo...@devilsChimney.co.uk> writes:

> Bjarne Stroustrup <b...@research.att.com> wrote in message
> news:HFnqC...@research.att.com...
>> /* Standard Container Benchmark
>>
>> Version 0.9, May 23, 2003
>>
>> The primary purpose of this benchmark is to show how different standard
>> containers are in terms of performance. We have observed that programmers
>
> [snip]
>
>
>>
>> const char header[] =
>> "\tarray"
>> "\tvector with pointers"
>> "\tvector with iterators"
>> "\tdeque"
>> "\tlist"
>> "\tset"
>> "\tmultiset";
>
> Um, why not std::string?

[snip]

Perhaps because it is not a container? However I see your point; every
now and then somebody posts really slow code using string to this
ng, and they are told use vector<char> instead.

Alex Vinokur

unread,
Jun 4, 2003, 7:51:45 AM6/4/03
to
Another performance testsuites measured with using C/C++ Program Perfometer.
http://alexvn.freeservers.com/s1/perfometer.html
http://sourceforge.net/projects/cpp-perfometer


1. for-loop vs. for_each algorithm in vector, string, list, set.
http://groups.google.com/groups?selm=3C8DD296.A7304148%40bigfoot.com

2. find() algorithm vs. find() method in vector, list, set, map
http://groups.google.com/groups?selm=3C8C7B97.E4B5CA9C%40bigfoot.com

3. Access to array, vector, basic_string
http://groups.google.com/groups?selm=3C8C619E.D3F557D5%40bigfoot.com

4. switch vs. dynamic_cast
http://groups.google.com/groups?selm=15eacd17.0204300451.209a527f%40posting.google.com

5. operator[] vs. at()
http://groups.google.com/groups?selm=aa8kpg%248oo15%241%40ID-79865.news.dfncis.de

=====================================
Alex Vinokur
mailto:ale...@connect.to

http://mathforum.org/library/view/10978.html

Stephen Howe

unread,
Jun 4, 2003, 9:30:21 AM6/4/03
to
STLPerf1.cpp is the benchmark unaltered
STLPerf2.cpp is the benchmark altered so that deque uses range constuctor
and no copy() and multiset uses the code that John Potter proposed with my
wrapper.

Compiler VC7.

The second set of figures in each case in STLPort 4.5.3, an instresting
comparison.

cl /O2 /GX /ML STLPerf1.cpp:
size array vector with pointers vector with iterators deque list
set multiset
10 1.98 2.03 2.25 6.66 15.38 3.61 6.94
100 1.16 1.14 1.33 4.13 5.33 2.11 3.69
1000 1.14 1.16 1.28 3.47 3.89 1.80 3.30
10000 1.05 1.05 1.16 3.34 3.84 1.92 4.14
100000 1.09 1.08 1.16 3.17 6.30 4.89 7.66
1000000 1.05 1.08 1.13 3.03 7.00 6.55 9.09

cl /O2 /GX /ML /I\STLport-4.5.3\stlport STLPerf1.cpp:
size array vector with pointers vector with iterators deque list
set multiset
10 1.36 1.34 1.36 1.84 3.72 1.00 1.48
100 0.67 0.67 0.67 1.13 1.45 1.00 1.41
1000 0.70 0.69 0.70 0.98 1.28 1.02 1.30
10000 0.72 0.70 0.66 0.91 1.36 1.02 1.59
100000 0.72 0.72 0.72 1.00 5.38 3.70 5.20
1000000 0.77 0.77 0.75 1.02 3.17 3.39 4.47

cl /O2 /GX /ML STLPerf2.cpp:
size array vector with pointers vector with iterators deque list
set multiset
10 1.95 2.03 2.23 6.56 15.48 3.67 7.00
100 1.14 1.14 1.31 4.11 5.33 2.13 3.70
1000 1.16 1.14 1.30 3.42 3.91 1.81 3.30
10000 1.05 1.05 1.16 3.27 3.83 1.94 4.11
100000 1.08 1.08 1.16 3.11 6.31 4.91 7.63
1000000 1.06 1.06 1.13 2.97 6.98 6.58 9.13

cl /O2 /GX /ML /I\STLport-4.5.3\stlport STLPerf2.cpp:
size array vector with pointers vector with iterators deque list
set multiset
10 1.34 1.33 1.34 1.80 3.73 1.03 1.45
100 0.69 0.67 0.67 1.09 1.48 1.03 1.33
1000 0.70 0.69 0.70 0.98 1.27 1.05 1.28
10000 0.72 0.70 0.64 0.91 1.36 1.00 1.56
100000 0.70 0.73 0.70 0.95 5.38 3.70 5.20
1000000 0.77 0.75 0.77 0.97 3.17 3.39 4.45

Stephen Howe

Alex Vinokur

unread,
Jun 4, 2003, 12:31:18 PM6/4/03
to
Here are some performance testsuites for comparative analysis : STL vs. Standard C Library.

The following functions (methods, operators) were measured.

----------------------------------------------------
Standard C Library | STL
----------------------------------------------------
strcpy | operator= (string)
strcat | operator+= (string)
strlen | size () (string)
char* | constructor&destructor (string)
char-array | constructor&destructor (string)
malloc&free | new&delete (string)
(malloc+init)&free | new&delete (string)
----------------------------------------------------


Measurement results:
http://groups.google.com/groups?selm=7dqehp%249jl%241%40nnrp1.dejanews.com
http://groups.google.com/groups?selm=36822F3A.DBFF0811%40tibam.elex.co.il
http://groups.google.com/groups?selm=368734F0.70B29CE2%40tibam.elex.co.il


=====================================
Alex Vinokur
mailto:ale...@connect.to

http://mathforum.org/library/view/10978.html

tom_usenet

unread,
Jun 4, 2003, 1:57:26 PM6/4/03
to
On 1 Jun 2003 17:39:45 -0400, cbar...@ix.netcom.com (Carl Barron)
wrote:

>Bjarne Stroustrup <b...@research.att.com> wrote:
>
>> void vector_pointer_test(element_t* first, element_t* last,
> int number_of_times)
>> {
>> vector<element_t> container(first, last);
>> // &*container.begin() gets us a pointer to the first element
>> sort(&*container.begin(), &*container.end());
>> unique(&*container.begin(), &*container.end());
>> }
> nit pick: vector<T,A>::end() does not need to be dereferenceable. but
>a vector is in a single block so sort(&container.front(),
>&container.back()+1); unique(&constainer.front(), &container.back()+1);
>is more portable:)

The original code should indeed assert on a "safe" debugging stl mode,
such as Dinkumware's or STLport's.

Tom

tom_usenet

unread,
Jun 4, 2003, 1:58:34 PM6/4/03
to
On 3 Jun 2003 05:35:50 -0400, trol...@shaw.ca (Troll_King) wrote:

>Could you show me the exact instruction you entered to compile that
>code using g++? I had trouble with this because simply adding -02
>isn't recognized by g++.

O for Optimization, not 0 for zero. -O2.

>
>With regard to the data types, see C++ Primer 3rd edition page 254 -
>260. The std::string has a different capacity than a large complex
>class, so the full penalty of std::vector isn't realized.

But std::string allocates memory. If it isn't reference counted, then
copying it will have a large overhead - larger than most complex types
that don't allocate dynamic memory.

However, I think libstdc++ 3 does use a reference counted string type,
so a more useful test (that isn't effected by different string
implementations) might be on a large object, such as, say:

struct s
{
int array[100];
};

Tom

Ben Hutchings

unread,
Jun 5, 2003, 3:21:41 AM6/5/03