Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Help needed on STL performance

4 views
Skip to first unread message

raokramer

unread,
Sep 18, 2003, 11:40:30 PM9/18/03
to
Hi,
The following simple program finishes in 4 sec if I use
array, but it takes 225 sec if I use vector. How to make
vector equal in performance? I asked this question at
other forums, and one guy said they were equal in speed on
MSVC++ 6.0, and I have .Net. The exes he sent me ran 4 sec
for both vector and array:

#include <iostream>
#include <vector>

const int size=200000;

void cont(void){
std::vector<unsigned> temp;
temp.resize(size);
for(int i=0; i<size; i++) temp[i] = i;
}

void arr(void){
unsigned a[size];
for(int i=0; i<size; i++) a[i]=i;
}

int main(void)
{
for(int k=0; k<1000; k++){
cont();
//arr();
if(k%10==0)std::cout<<k<<"\n";
}
return 0;
};

Thanks,
--Raokramer

Thore B. Karlsen

unread,
Sep 18, 2003, 11:44:28 PM9/18/03
to
On Thu, 18 Sep 2003 20:40:30 -0700, "raokramer" <raok...@yahoo.com>
wrote:

Are you compiling in release mode? Vectors are slower in debug mode than
arrays, but shouldn't be much slower (if at all) in release mode.

--
Be seeing you.

raokramer

unread,
Sep 19, 2003, 12:00:06 AM9/19/03
to
I press <ctrl> - F5 to run. I suppose it's a release mode,
because it creates an .exe so that I can run it later, and
even when I run exes, vectors are by several orders
slower. --Raokramer

raokramer

unread,
Sep 19, 2003, 12:02:08 AM9/19/03
to
I posted a lot on this issue at

http://forums.devshed.com/t82588/s.html

http://openglforums.com/forums/viewtopic.php?
t=3417&sid=3cacb034605fe9c342988163ecbe8d7b

Thore B. Karlsen

unread,
Sep 19, 2003, 12:06:47 AM9/19/03
to
On Thu, 18 Sep 2003 21:00:06 -0700, "raokramer" <raok...@yahoo.com>
wrote:

Debug mode also produces executables. Make sure it's in release mode.

--
Be seeing you.

raokramer

unread,
Sep 19, 2003, 12:24:19 AM9/19/03
to
Yes, that helped to increase speed! You were right about
that release version was faster. In release version
however vectors are still much slower - that program ran 3
sec with arrays and 22 sec with vectors. Is it possible to
make vectors as fast as arrays? Thanks a lot -- Raokramer

Thore B. Karlsen

unread,
Sep 19, 2003, 12:33:00 AM9/19/03
to
On Thu, 18 Sep 2003 21:24:19 -0700, "raokramer" <raok...@yahoo.com>
wrote:

You're comparing apples to oranges. The vector is allocated on the heap,
the array is allocated on the stack. The overhead you're seeing is
because of the overhead of the construction/memory allocation of the
vector object. If you make the vector and the array both global you
should see no speed difference.

--
Be seeing you.

Stephen Howe

unread,
Sep 19, 2003, 8:57:45 AM9/19/03
to

Not in this case, no. You are not comparing like with like.
Underneath, vector will be allocating/deallocating memory from the heap.
The array is on the stack.

You should be aware of that.

Stephen Howe


tom_usenet

unread,
Sep 19, 2003, 9:33:25 AM9/19/03
to
On Thu, 18 Sep 2003 20:40:30 -0700, "raokramer" <raok...@yahoo.com>
wrote:

>Hi,


> The following simple program finishes in 4 sec if I use
>array, but it takes 225 sec if I use vector. How to make
>vector equal in performance? I asked this question at
>other forums, and one guy said they were equal in speed on
>MSVC++ 6.0, and I have .Net. The exes he sent me ran 4 sec
>for both vector and array:
>
>#include <iostream>
>#include <vector>
>
>const int size=200000;

Try it with:

int size = 200000;

>
>void cont(void){
> std::vector<unsigned> temp;
> temp.resize(size);
> for(int i=0; i<size; i++) temp[i] = i;
>}
>
>void arr(void){
> unsigned a[size];
> for(int i=0; i<size; i++) a[i]=i;

The above isn't a fair test. Instead:

unsigned* a = new unsigned[size];


for(int i=0; i<size; i++) a[i]=i;

delete[] a;

Vector is runtime sizeable, whereas arrays have their size fixed at
compile time. If you want an STL style container with compile time
size, then this might be suitable:

http://www.boost.org/libs/array/index.htm

You should find that that has performance identical to the plain
array.

Tom

raokramer

unread,
Sep 19, 2003, 11:46:08 AM9/19/03
to
Tom, Dynamic arrays proved to be just as fast as regular
ones (at least with my MSVC.Net version), and way faster
than vectors. --Raokramer
>.
>

Thore B. Karlsen

unread,
Sep 19, 2003, 11:53:02 AM9/19/03
to
On Fri, 19 Sep 2003 08:46:08 -0700, "raokramer" <raok...@yahoo.com>
wrote:

>Tom, Dynamic arrays proved to be just as fast as regular
>ones (at least with my MSVC.Net version), and way faster
>than vectors.

I tested it yesterday with VC++ 7.1, and that was not the case for me.
What does your code look like?

--
Be seeing you.

raokramer

unread,
Sep 19, 2003, 12:16:08 PM9/19/03
to
Thore,

How come when compiled on MSVC 6.0, the speeds for the
vector and for the array were the same? Second, I ran this:

#include <iostream>
#include <vector>

const int size=200000;
int a[size];
std::vector<int> v;

void cont(std::vector<int> & temp){


for(int i=0; i<size; i++) temp[i] = i;
}

void arr(int a[]){


for(int i=0; i<size; i++) a[i]=i;
}

int main(void)
{
v.reserve(size);
int a[size];
std::vector<int> v; v.reserve(size);


for(int k=0; k<1000; k++){

//cont(v);
arr(a);


if(k%10==0)std::cout<<k<<"\n";
}
return 0;
};

The speed ratio is still 2:10 ! Did you run a test? Could
you paste the code you ran please? Thanks, --Raokramer

Carl Daniel [VC++ MVP]

unread,
Sep 19, 2003, 12:59:20 PM9/19/03
to
raokramer wrote:
> Hi,
> The following simple program finishes in 4 sec if I use
> array, but it takes 225 sec if I use vector. How to make
> vector equal in performance? I asked this question at
> other forums, and one guy said they were equal in speed on
> MSVC++ 6.0, and I have .Net. The exes he sent me ran 4 sec
> for both vector and array:

I improved your program a bit:

#include <windows.h>
#include <iostream>
#include <vector>

const int size=200000;

void cont(void)
{
std::vector<unsigned> temp;
temp.resize(size);
for(int i=0; i<size; i++)
temp[i] = i;
}

void arr(void)
{
unsigned* a = new unsigned[size];


for(int i=0; i<size; i++)
a[i]=i;

delete [] a;
}

int main(void)
{
std::cout << "Container test...\n";
LARGE_INTEGER start;
LARGE_INTEGER freq;
::QueryPerformanceFrequency(&freq);
::QueryPerformanceCounter(&start);

for(int k=0; k<1000; k++)
{
cont();

if(k%10==0)
std::cout<<k<<"\r";
}

LARGE_INTEGER mid;
::QueryPerformanceCounter(&mid);
std::cout << "Elapsed: " << (double)(1000000.0 *
(mid.QuadPart-start.QuadPart)/freq.QuadPart) << "us\n";

for(int k=0; k<1000; k++)
{

arr();
if(k%10==0)
std::cout<<k<<"\r";
}

LARGE_INTEGER stop;
::QueryPerformanceCounter(&stop);
std::cout << "Elapsed: " << (double)(1000000.0 *
(stop.QuadPart-mid.QuadPart)/freq.QuadPart) << "us\n";

return 0;
};

compile from the command line with cl -GX -O1 vectperf0919.cpp

I get the following results on my 3Ghz P4:

Container test...
Elapsed: 2.71622e+006us
Array test...
Elapsed: 1.74494e+006us

So, the vector version took 2.7 seconds while the array version took 1.7
seconds. So, where's the difference come from? A valid question, indeed.

First, let's look at the prolog code of the two functions:

cont():

; 8 : {

00000 b8 00 00 00 00 mov eax, __ehhandler$?cont@@YAXXZ
00005 e8 00 00 00 00 call __EH_prolog
0000a 83 ec 10 sub esp, 16 ; 00000010H
0000d 56 push esi
0000e 57 push edi

; 9 : std::vector<unsigned> temp;

0000f 33 ff xor edi, edi
00011 89 7d e8 mov DWORD PTR _temp$[ebp+4], edi
00014 89 7d ec mov DWORD PTR _temp$[ebp+8], edi
00017 89 7d f0 mov DWORD PTR _temp$[ebp+12], edi

; 10 : temp.resize(size);

0001a 57 push edi
0001b be 40 0d 03 00 mov esi, 200000 ; 00030d40H
00020 56 push esi
00021 8d 4d e4 lea ecx, DWORD PTR _temp$[ebp]
00024 89 7d fc mov DWORD PTR __$EHRec$[ebp+8], edi
00027 e8 00 00 00 00 call
?resize@?$vector@IV?$allocator@I@std@@@std@@QAEXII@Z ; std::vector<unsigned
int,std::allocator<unsigned int> >::resize

--------------------------------

arr():

; 17 : unsigned* a = new unsigned[size];

00000 68 00 35 0c 00 push 800000 ; 000c3500H
00005 e8 00 00 00 00 call ??_U@YAPAXI@Z ; operator new[]
0000a 59 pop ecx

--------------------------------

Obviously the implementation of cont() has more startup overhead: an
exception frame must be established, and vector::resize was implemented out
of line, so there's a call. One last detail (and a major complaint about
std::vector): When you resize() a std::vector, the new elements are
initialized - in other words, they're set to 0. This means that the call to
resize() has as much overhead as the balance of the loop!

Next, let's take a look at the inner loops of cont() and arr():

cont():

; 11 : for(int i=0; i<size; i++)

0002c 33 c0 xor eax, eax
$L63681:

; 12 : temp[i] = i;

0002e 8b 4d e8 mov ecx, DWORD PTR _temp$[ebp+4]
00031 89 04 81 mov DWORD PTR [ecx+eax*4], eax
00034 40 inc eax
00035 3b c6 cmp eax, esi
00037 7c f5 jl SHORT $L63681

; 13 : }
--------------------------------

arr():

; 18 : for(int i=0; i<size; i++)

0000b 33 c9 xor ecx, ecx
$L63691:

; 19 : a[i]=i;

0000d 89 0c 88 mov DWORD PTR [eax+ecx*4], ecx
00010 41 inc ecx
00011 81 f9 40 0d 03 00 cmp ecx, 200000 ; 00030d40H
00017 7c f4 jl SHORT $L63691

--------------------------------

The core loops are similar, but the optimizer has made different (and less
optimal) register allocation choices: In arr(), the value of 'a' is kept in
the ecx register throughout the loop, while in cont(), the pointer to the
underlying data is re-fetched from the stack for each iteration. The
optimizer was unable to promote a member of the vector class to register
storage throughout the loop.

For completeness, let's look at the epilog code.

cont():

00039 8d 4d e4 lea ecx, DWORD PTR _temp$[ebp]
0003c e8 00 00 00 00 call
?_Tidy@?$vector@IV?$allocator@I@std@@@std@@IAEXXZ ; std::vector<unsigned
int,std::allocator<unsigned int> >::_Tidy
00041 8b 4d f4 mov ecx, DWORD PTR __$EHRec$[ebp]
00044 5f pop edi
00045 5e pop esi
00046 64 89 0d 00 00
00 00 mov DWORD PTR fs:__except_list, ecx
0004d c9 leave
0004e c3 ret 0
--------------------------------

arr():

; 20 : delete [] a;

00019 50 push eax
0001a e8 00 00 00 00 call ??_V@YAXPAX@Z ; operator delete[]
0001f 59 pop ecx

; 21 : }

00020 c3 ret 0
--------------------------------

Again, the epilog for the array case is slightly simpler due to the lack of
an exception frame.

So, why is the vector case slower? Because it's doing twice the work. Try
making the vector in cont() static and see what the result is. In my case,
the times became:

Container test...
Elapsed: 959344us
Array test...
Elapsed: 1.7355e+006us

--> The std::vector case is now nearly 2X faster than the array case.

Or, change the implemention of arr() to initialize the array to 0. In my
case, the times became:

Container test...
Elapsed: 2.69969e+006us
Array test...
Elapsed: 2.82572e+006us

So, the final question: why were the times the same in VC6? Because VC6 did
not correctly initialize the new elements of the vector as a result of
resize().

-cd


Thore B. Karlsen

unread,
Sep 19, 2003, 1:34:28 PM9/19/03
to
On Fri, 19 Sep 2003 09:16:08 -0700, "raokramer" <raok...@yahoo.com>
wrote:

>Thore,

I get exactly the same speed with arrays and vectors with this code. (I
changed the reserve() to resize().) This is in VC++ 7.0 with optimize
for speed.

--
Be seeing you.

Paul Thompson

unread,
Sep 22, 2003, 7:33:34 AM9/22/03
to
"raokramer" <raok...@yahoo.com> wrote in message
news:00de01c37e5f$c6282b20$a001...@phx.gbl...

> Hi,
> The following simple program finishes in 4 sec if I use
> array, but it takes 225 sec if I use vector. How to make
> vector equal in performance? I asked this question at
> other forums, and one guy said they were equal in speed on
> MSVC++ 6.0, and I have .Net. The exes he sent me ran 4 sec
> for both vector and array:
>
> #include <iostream>
> #include <vector>
>
> const int size=200000;
>
> void cont(void){
> std::vector<unsigned> temp;
> temp.resize(size);

<snip>

Is there any advantage in your code over:

std::vector<unsigned> temp(size);

?

Paul


0 new messages