I need to store a number of integer values which I will then search on
later to see if they exist in my container. Can someone tell me which
container would be quickest for finding these values? I can't use a
plain C array (unless I make it 2^32 in size!) since I don't know the
max integer value.
Thanks for any help
B2003
If you store first, then sort, then search (using 'std::binary_search'),
you could just use 'std::vector'. If you expect searching and updating
the container to be interleaved, 'std::set' is probably better; its
insertions are quite fast. What book on the Standard Library are you
reading that does not compare the standard containers in terms of
performance?
V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
std::set
Lars
Sorted vector. See Effective STL, Item 23.
For the record, you wouldn't need 2^32 integers, just 2^32 bits = 512
MiB. That's actually not that much RAM, depending on your target system,
and it would let you check for integers with O(1) complexity (rather
than O(log N)).
For that case, I think std::list is a better option, since the sorting
will be faster.
Do you have any proof of that?
Lists are implemented using pointers to the previous and the next
elements, so list::sort() works more efficiently by changing pointer
values, while sorting a vector involves copying objects.
The original poster talked about storing integer values. I highly
doubt sorting a list of integers will be faster than sorting an array of
integers. In fact, I'm pretty sure of the contrary.
If memory usage is not an issue, std::set is by far the easiest solution.
(OTOH if the amount of integers can be counted in the millions, then
it may be better to use a sorted vector or whatever.)
And you really think that doing two pointer exchanges is faster than one
integer exchange?
However, it can still be slower, since it's more or less the worst thing you
can do to the cache.
Still, you should only get one cache miss when looking for a value; if
you use a set or vector you will probably get more.
--
Erik Wikström
In other words, no. <g> One could also point out that quicksort can be
used to sort a vector but can't be used on a list, so obviously sorting
a vector is faster. The problem with both arguments is exactly what
Victor implied: they're handwaving. Proof requires much more detailed
analysis, or if you're making a decision for a particular platform,
measurement.
--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)
How do you get O(1) from a sorted vector<int>? A binary search costs
O(log N).
Lars
Jeff meant one would get O(1) from looking up in the full array of bits.
Every lookup means a shift to form the word index and a mask to get at
the bit value: one shift, one pointer addition, one dereference, one
bitwise AND per lookup, i.e. O(1).
I see, something like vector<bool>. Ok, that would be fast.
Lars
If you only do a single look-up, yes. But in that case, building up a
bit array of 512 megabytes will take a quite significant amount of time. ;-)
Some systems won't allow you to allocate that much as a local
variable, so you might as well use std::vector. If the
reallocations are an issue, one call up front to reserve() should
solve that.
--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
One thing I don't understand here: both a C style array and
std::vector use a single block of contiguous memory. How could
cache performance be any different for them?
(An earlier suggestion to use std::list does suffer from this,
of course.)
> >>>> Store first, then sort, then search (using
> >>>> 'std::binary_search'), you could just use 'std::vector'.
> >>> For that case, I think std::list is a better option, since
> >>> the sorting will be faster,
> >> Do you have any proof of that?
> > Lists are implemented using pointers to point to the
> > previous and to the next elements, so list::sort(), is more
> > efficient by changing pointer values, while sorting a vector
> > involves copying objects.
> In other words, no. <g> One could also point out that
> quicksort can be used to sort a vector but can't be used on a
> list, so obviously sorting a vector is faster. The problem
> with both arguments is exactly what Victor implied: they're
> handwaving. Proof requires much more detailed analysis, or if
> you're making a decision for a particular platform,
> measurement.
Two other considerations: you can't use binary_search on
std::list---with any sort of size, that's going to make a
significant difference in favor of vector. And of course, an
std::list has horrible locality, which will translate into a lot
of cache misses (or even virtual page faults). So even if
sorting it were faster (which I somehow doubt)...
> On Oct 1, 7:31 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>> On 2008-10-01 18:57, Rolf Magnus wrote:
>> > Jeff Schwab wrote:
>> >> Boltar wrote:
>> >>> I need to store a number of integer values which I will
>> >>> then search on later to see if they exist in my container.
>> >>> Can someone tell me which container would be quickest for
>> >>> finding these values? I can't use a plain C array (unless
>> >>> I make it 2^32 in size!) since I don't know the max
>> >>> integer value.
>
>> >> Sorted vector. See Effective STL, Item 23.
>
>> >> For the record, you wouldn't need 2^32 integers, just 2^32 bits
>> >> = 512 MiB. It's actually not that much RAM, depending on
>> >> your target system, and would let you check for integers
>> >> with O(1) complexity (rather than O(log N)).
>
>> > However, it can still be slower, since it's more or less the
>> > worst thing you can do to the cache.
>
>> Still, you should only get one cache-miss when looking for a
>> value, if you use a set or vector you will probably get more.
>
> One thing I don't understand here: both a C style array and
> std::vector use a single block of contiguous memory. How could
> cache performance be any different for them?
We're not talking about C-style arrays vs. vector, but about different
algorithms. One is storing the values in a sorted array/vector, then
doing a binary search for the value. The other is like a trivial kind of
hash table, where each possible integer value has a boolean entry, so you
can simply use the value as the index. That saves the search, but needs
an amount of memory much larger than the cache.
Right, but the whole thing doesn't have to be in cache, just the parts
that are used. The rest may as well be swapped out of memory
altogether. Of course, that's the point of cache. :)
We must think generally. In general, sorting a list is faster than
sorting a vector, because sorting the list does not involve the
construction or destruction of any object.
Regarding ints, I think sorting a vector of ints and a list of ints
have about the same efficiency.
If the programmer decides to replace ints with other objects, he will
not have to change much in the code, if he uses a list.
list::sort is faster than any sort on a vector, because it does not
involve the construction or destruction of any object.
Of course you can:
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <list>

int main()
{
    using namespace std;

    typedef list<int> intlist;

    intlist ilist;

    for (size_t i = 0; i < 100; ++i)
        ilist.push_back(100 - i);

    ilist.sort();

    for (intlist::const_iterator p = ilist.begin(); p != ilist.end(); ++p)
        cout << *p << endl;

    binary_search(ilist.begin(), ilist.end(), 50);
}
Why don't you just measure before you doubt the statements
of those who already went and did this?
On my platform, this
#include <iostream>
#include <vector>
#include <list>
#include <algorithm>

const unsigned int kLimit = 1000000;

template< class Cont >
void fill(Cont& cont)
{
    for( unsigned int u = 0; u < kLimit; ++u ) {
        cont.push_back(u);
    }
}

template< class Cont >
void test(Cont& cont);

int main()
{
    std::vector<unsigned int> v; v.reserve(kLimit);
    std::list<unsigned int> l;

    std::cout << "filling a vector..." << std::endl;
    fill(v);
    std::cout << "filling a list..." << std::endl;
    fill(l);
    std::cout << "...done.\n";

    std::cout << "sorting a vector..." << std::endl;
    test(v);
    std::cout << "sorting a list..." << std::endl;
    test(l);

    return 0;
}

template< typename T, class Al >
inline void sort(std::vector<T,Al>& v) {std::sort(v.begin(),v.end());}
template< typename T, class Al >
inline void sort(std::list<T,Al>& l) {l.sort();}

#include <windows.h> // for GetTickCount()

template< class Cont >
void test(Cont& cont) {
    const DWORD start = GetTickCount();
    sort(cont);
    std::cout << "...took " << GetTickCount()-start << "msecs." << std::endl;
}
outputs
filling a vector...
filling a list...
...done.
sorting a vector...
...took 47msecs.
sorting a list...
...took 1562msecs.
and thus agrees with everyone who disagreed with you.
> If the programmer decides to replace ints with other objects, he will
> not have to change much in the code, if he uses a list.
Right. It wouldn't do any good anyway as this
#include <iostream>
#include <vector>
#include <list>
#include <algorithm>
#include <string>

const unsigned int kLimit = 1000000;

class Test {
public:
    Test() : instance_(++id_), str_("test it!") {}
    bool operator<(const Test& rhs) const {return instance_>rhs.instance_;}
private:
    unsigned int instance_;
    static unsigned int id_;
    std::string str_;
};
unsigned int Test::id_ = 0;

template< class Cont >
void fill(Cont& cont)
{
    for( unsigned int u = 0; u < kLimit; ++u ) {
        cont.push_back(Test());
    }
}

template< class Cont >
void test(Cont& cont);

int main()
{
    std::vector<Test> v; v.reserve(kLimit);
    std::list<Test> l;

    std::cout << "filling a vector..." << std::endl;
    fill(v);
    std::cout << "filling a list..." << std::endl;
    fill(l);
    std::cout << "...done.\n";

    std::cout << "sorting a vector..." << std::endl;
    test(v);
    std::cout << "sorting a list..." << std::endl;
    test(l);

    return 0;
}

template< typename T, class Al >
inline void sort(std::vector<T,Al>& v) {std::sort(v.begin(),v.end());}
template< typename T, class Al >
inline void sort(std::list<T,Al>& l) {l.sort();}

#include <windows.h> // for GetTickCount()

template< class Cont >
void test(Cont& cont) {
    const DWORD start = GetTickCount();
    sort(cont);
    std::cout << "...took " << GetTickCount()-start << "msecs." << std::endl;
}
outputs
filling a vector...
filling a list...
...done.
sorting a vector...
...took 829msecs.
sorting a list...
...took 2437msecs.
and thus again disagrees with you.
Eagerly awaiting your counter example,
Scho-you-can-take-your-foot-out-of-your-mouth-now-bi
Not always so.
You are correct that, since lists do not provide random-access iterators,
quicksort cannot be used. Instead, lists are normally sorted using
merge sort. Merge sort has better worst-case performance than
quicksort. The drawback of merge sort is that it normally requires
extra memory, whereas quicksort does not.
http://en.wikipedia.org/wiki/Merge_sort
<quote>
In the worst case, merge sort does about 39% fewer comparisons than
quicksort does in the average case; merge sort always makes fewer
comparisons than quicksort, except in extremely rare cases, when they
tie, where merge sort's worst case is found simultaneously with
quicksort's best case. In terms of moves, merge sort's worst case
complexity is O(n log n)—the same complexity as quicksort's best case,
and merge sort's best case takes about half as many iterations as the
worst case.
</quote>
--
Max
[ Non-portable code...]
>
> and thus again disagrees with you.
>
> Eagerly awaiting your counter example,
#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>

class SomeClass
{
    typedef std::vector<int> TypeVector;
    TypeVector vec;
    enum { VectorSize= 1000 };
public:
    SomeClass();
    bool operator<(const SomeClass &argSomeClass) const
    {
        return vec[0]< argSomeClass.vec[0];
    }
};

int main()
{
    using namespace std;

    srand(time(0));

    const size_t SIZE=10000;

    typedef vector<SomeClass> Vector;
    typedef list<SomeClass> List;

    cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
    Vector vec(SIZE);
    cout<<" Done!\n\n"<< flush;

    List lis(vec.size());

    cout<< "Filling list with vector elements..."<< flush;
    for(Vector::size_type i= 0; i< vec.size(); ++i)
        lis.push_back(vec[i]);
    cout<< " Done!\n\n"<< flush;

    clock_t timeBeginVector, timeEndVector, timeBeginList, timeEndList;

    cout<< "Timing the sorting of the vector..."<< flush;
    timeBeginVector= clock();
    sort(vec.begin(), vec.end());
    timeEndVector= clock();
    cout<< " Done!\n\n"<< flush;

    cout<< "Timing the sorting of the list..."<< flush;
    timeBeginList= clock();
    lis.sort();
    timeEndList= clock();
    cout<< " Done!\n\n"<< flush;

    cout<< "The sorting of the vector took "
        << static_cast<double>((timeEndVector- timeBeginVector))/ CLOCKS_PER_SEC
        << " seconds\n\n";

    cout<< "The sorting of the list took "
        << static_cast<double>((timeEndList- timeBeginList))/ CLOCKS_PER_SEC
        << " seconds\n\n";
}

SomeClass::SomeClass():vec(VectorSize)
{
    using namespace std;

    for(TypeVector::size_type i= 0; i< vec.size(); ++i)
        vec[i]= rand();

    sort(vec.begin(), vec.end());
}
I had to add another zero in order to get sensible results:
Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.016 seconds
The sorting of the list took 0.468 seconds
HTH,
Schobi
Your computer is faster than mine and probably with more RAM.
In my computer with SIZE 10000:
john@john-desktop:~/Projects/src$ ./foobar_cpp
Creating vector with 10000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 2.55 seconds
The sorting of the list took 0.02 seconds
john@john-desktop:~/Projects/src$
In my computer with SIZE 100000 (uses swap):
john@john-desktop:~/Projects/src$ ./foobar_cpp
Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 27.46 seconds
The sorting of the list took 4.58 seconds
john@john-desktop:~/Projects/src$
==> Timings of less than 1 second are inconclusive. Add zeros until you
exceed 1-2 seconds.
corrected:
TypeVector vec;
enum { VectorSize= 1000 };
public:
SomeClass();
srand(time(0));
const size_t SIZE=10000;
===> List lis;
If I add one more zero, the program runs out of memory.
Schobi
Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.015 seconds
The sorting of the list took 0.437 seconds
Schobi
And my results:
1. const size_t SIZE= 90000;
john@john-desktop:~/Projects/src$ ./foobar_cpp
Creating vector with 90000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 24.79 seconds
The sorting of the list took 0.1 seconds
john@john-desktop:~/Projects/src$
2. const size_t SIZE= 100000;
john@john-desktop:~/Projects/src$ ./foobar_cpp
Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 26.8 seconds
The sorting of the list took 0.1 seconds
john@john-desktop:~/Projects/src$
Again, increase const size_t SIZE so that the sorting of one container
exceeds 1-2 seconds, and then post your results.
For example try
const size_t SIZE= 400000;
This exercise is pointless. I made 200K elements (no swap, but the
computer uses up to 2.5G of RAM, probably mostly due to the need to keep
all the pointers in the list elements) and I still consistently get
Creating vector with 200000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.015 seconds
The sorting of the list took 0.156 seconds
What is it you're trying to prove? That your computer and your
implementation are dilapidated? Get a better compiler.
The specific code had a serious bug (the list had twice as many elements
as the vector). Also, differences of milliseconds prove nothing (they
are inconclusive).
Just for the record, my system is:
OS: Ubuntu 8.04 x86.
Compiler: gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
Hardware: Pentium III 1 GHz, 1 GB RAM.
-------------- a debug build:
Creating vector with 200000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.203 seconds
The sorting of the list took 4.602 seconds
-------------- a release build:
Creating vector with 200000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.016 seconds
The sorting of the list took 0.062 seconds
-------------- another release build:
Creating vector with 300000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.016 seconds
The sorting of the list took 0.125 seconds
---------------------------------------------------------
Tell me again, what does your example prove? And, again, get a better
compiler and a better library. Whatever version you're using is *crap*,
obviously.
Apart from the bug, sorting a list does not involve the construction or
destruction of any object, while in the case of a vector, the element
objects are copied.
Do you agree that sorting a list does not involve object creation, while
sorting a vector involves copying the element objects?
or assignment,
The system I tested on is DELL T7400 (dual Intel Xeon quad core), Vista
Ultimate 64, Visual C++ 2008 sp1, 4 gigs of RAM. Since no parallelizing
is done in your program, the number of cores doesn't matter, but the CPU
is faster, and while I would love to try it on a 16-gigabyte machine or
better, I don't have access to one. My machine is what our clients are
getting, so the results stand. You're free to speculate what makes it
so *bad* on your machine, but I'd seriously think about switching to a
real OS and a real compiler, if I got such bad results as you do.
Whatever it involves (and I'd think it would actually be *swapping* and
not assignment or copying), apparently *your library* does it in a very
unacceptable manner.
You can theorize as much as you want, but in the end the results only
prove that one needs to measure before coming to any conclusions. Logic
does not always work well...
Doesn't a swap sort of imply a copy of "something"?
struct object {
int x;
};
object objs[2] = { { 0 }, { 1 } };
How do you swap `obj[0]' with `obj[1]' without copying "something"?
> apparently *your library* does it in a very unacceptable manner.
> You can theorize as much as you want, but in the end the results only
> prove that one needs to measure before coming to any conclusions. Logic
> does not always work well...
;^)
So how many constructions and destructions does std::sort involve?
>
>Regarding ints, I think sorting a vector of ints and as list of ints,
>both have about the same efficiency.
How many integer assignments to swap two ints in a vector?
How many pointer assignments to swap two elements in a list?
>
>If the programmer decides to replace ints with other objects, he will
>not have to change much in the code, if he uses a list.
What would he have to change with a vector?
--
Richard Herring
This program doesn't sort at all, because the vec is initialized with
one constant! Change the list filling code to
for(Vector::size_type i= 0; i< vec.size(); ++i) {
vec[i]= SomeClass();
lis.push_back(vec[i]);
}
and you have something to sort.
My result on OpenSuse 11 64Bit with Phenom 9850 and 4GB RAM, gcc 4.3.1,
const size_t SIZE=100000, compiled with g++ -O3 :
Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 1.12 seconds
The sorting of the list took 0.12 seconds
Lars
[ ... ]
> You are correct that as lists do not provide random-access iterators
> quicksort can not be used.
Actually, that's not true. It's entirely possible to implement a
quicksort on a linked list. You can't (efficiently) use a
median-of-three (or whatever) pivot selection, but the quicksort itself
has no need for random-access iteration.
> Instead, lists are normally sorted using merge sort.
This, however, is true. Even though you _can_ use quicksort on a linked
list, the potential gains from doing so are much smaller than the
potential losses.
> Merge sort has better worst-case performance than
> quicksort. The drawback of merge sort is that it normally requires
> extra memory, whereas quicksort does not.
When applied to arrays/vectors, that's true. For a linked list, they
both require essentially the same extra memory.
--
Later,
Jerry.
The universe is a figment of its own imagination.
This is true even if the elements being sorted are ints?
What do you think is faster, copying and assigning individual ints, or
updating pairs of pointers at each stage of the sorting?
Not true.
Granted, you can't do it for the *first* partition, but there's no
problem in getting the median-of-three pivot in subsequent partitions.
A vector::swap() can be a shallow swap. For example, if a vector is
implemented as a pointer to an array on the free store, swap can just
exchange the pointer values, without touching the contents.
? Yes, it is initialised with its number of elements, which are SomeClass
objects created with their default constructor.
> Change the list filling code to
>
> for(Vector::size_type i= 0; i< vec.size(); ++i) {
> vec[i]= SomeClass();
> lis.push_back(vec[i]);
> }
Why?
On my system all values are equal. The constructor is only called once!
The other elements are initialized with a copy of this SomeClass object!
Don't know what the standard says about this ...
>> Change the list filling code to
>>
>> for(Vector::size_type i= 0; i< vec.size(); ++i) {
>> vec[i]= SomeClass();
>> lis.push_back(vec[i]);
>> }
>
>
> Why?
To get distinct elements. See above.
Lars
There are two vecs in the code. Please be more specific, which are you
referring to?
Vector vec(SIZE); in main.
Lars
Already spotted the strange behaviour and posted a new message with the
subject "Strange behaviour".
What does your example have to do with the OP's question? The OP has one int
as member, while in your code, each element contains a vector of 1000 ints
instead. If this proves anything, then that you can construct a situation
where a list is faster than a vector. Unfortunately, that situation has
nothing to do with the OP's question.
My point was that "something" is copied; in this case a pointer value.
However, in this case, that statement would be a ridiculous nit-pick!
;^)
Anyway, I thought this was about sorting vectors... AFAICT, this implies
some swapping of the contents of slots in the array, and swapping implies
"something" is copied. If it's a pointer value, so be it. If not, well,
then things can start to get expensive. The sort and swap implementations
need to be very highly optimized indeed.
You can use quick-sort on a skip-list where one of the skip-links points to
the middle of the list, or some other pivot.
I managed to duplicate your results once, but only once. In numerous
other tries, your code consistently showed the vector being at least an
order of magnitude faster. Make sure you are not running other CPU-
intensive applications while running the test.
> The system I tested on is DELL T7400 (dual Intel Xeon quad core), Vista
> Ultimate 64, Visual C++ 2008 sp1, 4 gigs of RAM. Since no parallelizing
> is done in your program, the number of cores doesn't matter, but the CPU
> is faster, and while I would love to try it on a 16-gigabyte machine or
> better, I don't have access to one. My machine is what our clients are
> getting, so the results stand. You're free to speculate what makes it
> so *bad* on your machine, but I'd seriously think about switching to a
> real OS and a real compiler, if I got such bad results as you do.
Long since I last heard anyone calling Linux and gcc toy products. :-)
--
Erik Wikström
OK, as it was determined, your compiler has a bug (it invokes the
default c-tor for all elements of a vector constructed with only the
size given). I've added a copy constructor that does exactly the same
thing as the default constructor and finally got your results. A vector
of large objects with random values takes longer to sort than a list of
large objects with similarly random values. Significantly longer. Your
theory is confirmed, I suppose.
Now, it has really nothing to do with the OP's problem, but still...
I didn't call them toy products, either.
Actually from the responses I got in the "Strange behaviour (corrected)"
thread, it isn't a bug.
> I've added a copy constructor that does exactly the same
> thing as the default constructor and finally got your results. A vector
> of large objects with random values takes longer to sort than a list of
> large objects with similarly random values. Significantly longer. Your
> theory is confirmed, I suppose.
>
> Now, it has really nothing to do with the OP's problem, but still...
Strangely enough there should be no difference. With the default copy
constructor I get the differences I mentioned, while you get them only
when you define the copy constructor explicitly.
On Oct 2, 4:08 am, Ioannis Vranos replied:
> If the programmer decides to replace ints with other objects, he will
> not have to change much in the code, if he uses a list.
Even if he uses a std::vector, he still won't have to change much
in the code.
In my experience, it is usually not a good thing to write code
optimized for something that "might" happen in the future. If it does
happen, let the coder change it then, and not earlier.
I've seen code where data was stored in a list (or vector), but
nothing whatsoever was being done with the list after that. When I
asked the original coder why it was there, the reply was, "So that if
a future coder needs a list of the data, it's there to use."
Seriously, though, if you saw a list or vector that did nothing
other than being populated, wouldn't you think it was a bug (or at
least vestigial code that someone forgot to remove)?
I mean, if I had to add code that required me to use a list/vector
of that data, I would probably declare and populate my own list (who
wouldn't?). Even if I did see the first list (the one written for
future use), I would probably pass it over thinking it was needed for
some other piece of code -- fearing that if I manipulated it in any
way it might break current functional code.
To be honest, I think the STL sort algorithms are efficient enough
that the type of container they sort doesn't really matter in the long
run. In other words, they scale well enough that their difference in
speed can be pretty much considered negligible.
That being said, I think the original poster should use the
container that best suits the problem. If the data is meant to
be used as a linked list, then he/she should use a std::list. If the
data is meant to be used as a vector, then he/she should use a
std::vector.
Personally, I would suggest using a std::set, since it seems like
all he/she needs the container for is for checking to see if a number
is among the set of numbers he/she has already encountered -- unless
he/she also needs to use the set of numbers somewhere else as a list
or a vector.
Take care,
-- Jean-Luc
Disregard this message of mine.
#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>

class SomeClass
{
    typedef std::vector<int> TypeVector;
    TypeVector vec;
    enum { VectorSize= 1000 };
public:
    SomeClass();
    SomeClass(const SomeClass &);
    bool operator<(const SomeClass &argSomeClass) const
    {
        return vec[0]< argSomeClass.vec[0];
    }
};

int main()
{
    using namespace std;

    srand(time(0));

    const size_t SIZE=1000;

    typedef vector<SomeClass> Vector;
    typedef list<SomeClass> List;

    cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
    Vector vec(SIZE);
    cout<<" Done!\n\n"<< flush;

    List lis;

    timeBeginVector= clock();
    sort(vec.begin(), vec.end());
    timeEndVector= clock();

    timeBeginList= clock();
    lis.sort();
    timeEndList= clock();

    sort(vec.begin(), vec.end());
}
==> SomeClass::SomeClass(const SomeClass &):vec(VectorSize)
I also think std::set is the answer for the OP.
Just like std::list, std::set wastes too much memory per element to be
efficient for storing large numbers of elements, especially if each
datum is of a fundamental type. There is no need to waste all that
memory if the collection stabilises before the lookups. If lookups and
changes to the collection were intermingled, so that re-sorting a vector
(or reallocating it when inserting in the middle) would cost too much,
then a set would probably be more efficient. Again, nothing can be said
with certainty without *measuring*.
Quite true -- my statement was sufficiently incomplete that it wasn't
really accurate as stated. Thanks for the correction.
You will never get a sorted vector. Every time you copy an element you
will change its value!
Lars
I think during sort(), the default assignment operator is used, and not
the copy constructor.
You have to make a copy during the exchange of two elements:
SomeClass tmp = vec[i]; // new value here
vec[i] = vec[j];
vec[j] = tmp;
Lars
You are right, I will post a corrected version.
Thanks.
#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>

class SomeClass
{
public:
    typedef std::vector<int> TypeVector;
    TypeVector vec;
    enum { VectorSize= 1000 };
public:
    void fillWithSortedRandomNumbers();
    bool operator<(const SomeClass &) const;
    SomeClass();
};

SomeClass::SomeClass():vec(VectorSize){}

inline void SomeClass::fillWithSortedRandomNumbers()
{
    using namespace std;

    for(TypeVector::size_type i= 0; i< vec.size(); ++i)
        vec[i]= rand();

    sort(vec.begin(), vec.end());
}

inline bool SomeClass::operator<(const SomeClass &argSomeClass) const
{
    return vec[0]< argSomeClass.vec[0];
}

int main()
{
    using namespace std;

    srand(time(0));

    const size_t SIZE=10000;

    typedef vector<SomeClass> Vector;
    typedef list<SomeClass> List;

    cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
    Vector vec(SIZE);
    for(Vector::size_type i= 0; i< vec.size(); ++i)
        vec[i].fillWithSortedRandomNumbers();
> Erik Wikström wrote:
>> On 2008-10-02 16:11, Victor Bazarov wrote:
>>> [..] You're free to speculate what makes it
>>> so *bad* on your machine, but I'd seriously think about switching to a
>>> real OS and a real compiler, if I got such bad results as you do.
>>
>> Long since I last heard anyone calling Linux and gcc toy products. :-)
>
> I didn't call them toy products, either.
No, you just claimed they were not "a real OS and a real compiler".
No, I didn't *claim* that either. I *implied* it, maybe. And still,
it's up to the reader to [try to] deduce what I *implied*. Don't put
words into my mouth, please.
You don't need to know the maximum value, only approximately the
maximum number of integers that will be stored.
If you are a good guesser, this hash class will suit your
needs. It has O(1) lookup; if there are more
numbers than table entries it falls back to a linear search
within a bucket, but that shouldn't happen much.
#include <algorithm>
#include <vector>

class HashT{
public:
    HashT(unsigned max = 100):table_(max){}
    bool find(unsigned num)const
    {
        unsigned index = num%table_.size();
        if(table_[index].size()>0)
        {
            std::vector<unsigned>::const_iterator
                i = std::find(table_[index].begin(),table_[index].end(),num);
            return i!=table_[index].end();
        }
        return false;
    }
    void insert(unsigned num)
    {
        unsigned index = num%table_.size();
        if(table_[index].empty())
            table_[index].push_back(num);
        else
        {
            std::vector<unsigned>::const_iterator
                i = std::find(table_[index].begin(),table_[index].end(),num);
            if(i == table_[index].end())
                table_[index].push_back(num);
        }
    }
private:
    std::vector<std::vector<unsigned> > table_;
};
Greetings, Branimir.
[ ... ]
> A vector::swap() can be a shallow swap.
Not only can be, but basically _has_ to be. In particular, the _only_
situation in which vector::swap is allowed to throw is if the
container's allocator throws. This means that swap can't use _any_ of
the operators built into the type being contained. That leaves a shallow
swap as essentially the only option.
The memory isn't contiguous on most common contemporary systems; it is
heavily fragmented and just appears contiguous (paged).
AFAIK that's completely inconsequential from the point of view of the
cache. The size of a cache block is much smaller than the size of a
memory page (which is probably a multiple of it).
So it took you three tries, and code that's about as far from the
original requirement as possible while still dealing with lists and
vectors, to come up with an example where a list is actually faster
than a vector.
Do you realize that you just proved everyone else's point?
(Start with a vector, and measure whether a list is faster;
it might happen, although not in general.)
Schobi
It takes some effort to maximize use of the cache. I don't think one
could even use a `std::vector' in a fully cache-friendly manner. First of
all, the memory for the vector would need to start on a cache-line
boundary; how can this be ensured? I think it would be impossible (well,
perhaps with a custom allocator...). Therefore, if you want to create highly
cache-friendly arrays and algorithms, you need to "roll your own". I
don't quite see how one could use a `std::vector' and know that all the
caveats, such as false sharing and proper data alignment, have been
addressed...
Actually, the knowledge that 'std::vector' is more cache-friendly
than 'std::list' was obtained empirically: first people discovered
the fact, and then found out (or rather speculated; I suppose
nobody ever went and really researched it) that processor caching
is the reason.
Schobi