#include <iostream>
#include <vector>
const int size=200000;
void cont(void){
std::vector<unsigned> temp;
temp.resize(size);
for(int i=0; i<size; i++) temp[i] = i;
}
void arr(void){
unsigned a[size];
for(int i=0; i<size; i++) a[i]=i;
}
int main(void)
{
for(int k=0; k<1000; k++){
cont();
//arr();
if(k%10==0)std::cout<<k<<"\n";
}
return 0;
};
Thanks,
--Raokramer
Are you compiling in release mode? Vectors are slower in debug mode than
arrays, but shouldn't be much slower (if at all) in release mode.
--
Be seeing you.
http://forums.devshed.com/t82588/s.html
http://openglforums.com/forums/viewtopic.php?
t=3417&sid=3cacb034605fe9c342988163ecbe8d7b
Debug mode also produces executables. Make sure it's in release mode.
--
Be seeing you.
You're comparing apples to oranges. The vector is allocated on the heap,
the array is allocated on the stack. The overhead you're seeing is
because of the overhead of the construction/memory allocation of the
vector object. If you make the vector and the array both global you
should see no speed difference.
--
Be seeing you.
Not in this case, no. You are not comparing like with like.
Underneath, vector will be allocating/deallocating memory from the heap.
The array is on the stack.
You should be aware of that.
Stephen Howe
>Hi,
> The following simple program finishes in 4 sec if I use
>array, but it takes 225 sec if I use vector. How to make
>vector equal in performance? I asked this question at
>other forums, and one guy said they were equal in speed on
>MSVC++ 6.0, and I have .Net. The exes he sent me ran 4 sec
>for both vector and array:
>
>#include <iostream>
>#include <vector>
>
>const int size=200000;
Try it with:
int size = 200000;
>
>void cont(void){
> std::vector<unsigned> temp;
> temp.resize(size);
> for(int i=0; i<size; i++) temp[i] = i;
>}
>
>void arr(void){
> unsigned a[size];
> for(int i=0; i<size; i++) a[i]=i;
The above isn't a fair test. Instead:
unsigned* a = new unsigned[size];
for(int i=0; i<size; i++) a[i]=i;
delete[] a;
Vector is runtime sizeable, whereas arrays have their size fixed at
compile time. If you want an STL style container with compile time
size, then this might be suitable:
http://www.boost.org/libs/array/index.htm
You should find that that has performance identical to the plain
array.
Tom
>Tom, Dynamic arrays proved to be just as fast as regular
>ones (at least with my MSVC.Net version), and way faster
>than vectors.
I tested it yesterday with VC++ 7.1, and that was not the case for me.
What does your code look like?
--
Be seeing you.
How come when compiled on MSVC 6.0, the speeds for the
vector and for the array were the same? Second, I ran this:
#include <iostream>
#include <vector>
const int size=200000;
int a[size];
std::vector<int> v;
void cont(std::vector<int> & temp){
for(int i=0; i<size; i++) temp[i] = i;
}
void arr(int a[]){
for(int i=0; i<size; i++) a[i]=i;
}
int main(void)
{
v.reserve(size);
int a[size];
std::vector<int> v; v.reserve(size);
for(int k=0; k<1000; k++){
//cont(v);
arr(a);
if(k%10==0)std::cout<<k<<"\n";
}
return 0;
};
The speed ratio is still 2:10 ! Did you run a test? Could
you paste the code you ran please? Thanks, --Raokramer
I improved your program a bit:
#include <windows.h>
#include <iostream>
#include <vector>
const int size=200000;
void cont(void)
{
std::vector<unsigned> temp;
temp.resize(size);
for(int i=0; i<size; i++)
temp[i] = i;
}
void arr(void)
{
unsigned* a = new unsigned[size];
for(int i=0; i<size; i++)
a[i]=i;
delete [] a;
}
int main(void)
{
std::cout << "Container test...\n";
LARGE_INTEGER start;
LARGE_INTEGER freq;
::QueryPerformanceFrequency(&freq);
::QueryPerformanceCounter(&start);
for(int k=0; k<1000; k++)
{
cont();
if(k%10==0)
std::cout<<k<<"\r";
}
LARGE_INTEGER mid;
::QueryPerformanceCounter(&mid);
std::cout << "Elapsed: " << (double)(1000000.0 *
(mid.QuadPart-start.QuadPart)/freq.QuadPart) << "us\n";
for(int k=0; k<1000; k++)
{
arr();
if(k%10==0)
std::cout<<k<<"\r";
}
LARGE_INTEGER stop;
::QueryPerformanceCounter(&stop);
std::cout << "Elapsed: " << (double)(1000000.0 *
(stop.QuadPart-mid.QuadPart)/freq.QuadPart) << "us\n";
return 0;
};
compile from the command line with cl -GX -O1 vectperf0919.cpp
I get the following results on my 3Ghz P4:
Container test...
Elapsed: 2.71622e+006us
Array test...
Elapsed: 1.74494e+006us
So, the vector version took 2.7 seconds while the array version took 1.7
seconds. So, where's the difference come from? A valid question, indeed.
First, let's look at the prolog code of the two functions:
cont():
; 8 : {
00000 b8 00 00 00 00 mov eax, __ehhandler$?cont@@YAXXZ
00005 e8 00 00 00 00 call __EH_prolog
0000a 83 ec 10 sub esp, 16 ; 00000010H
0000d 56 push esi
0000e 57 push edi
; 9 : std::vector<unsigned> temp;
0000f 33 ff xor edi, edi
00011 89 7d e8 mov DWORD PTR _temp$[ebp+4], edi
00014 89 7d ec mov DWORD PTR _temp$[ebp+8], edi
00017 89 7d f0 mov DWORD PTR _temp$[ebp+12], edi
; 10 : temp.resize(size);
0001a 57 push edi
0001b be 40 0d 03 00 mov esi, 200000 ; 00030d40H
00020 56 push esi
00021 8d 4d e4 lea ecx, DWORD PTR _temp$[ebp]
00024 89 7d fc mov DWORD PTR __$EHRec$[ebp+8], edi
00027 e8 00 00 00 00 call
?resize@?$vector@IV?$allocator@I@std@@@std@@QAEXII@Z ; std::vector<unsigned
int,std::allocator<unsigned int> >::resize
--------------------------------
arr():
; 17 : unsigned* a = new unsigned[size];
00000 68 00 35 0c 00 push 800000 ; 000c3500H
00005 e8 00 00 00 00 call ??_U@YAPAXI@Z ; operator new[]
0000a 59 pop ecx
--------------------------------
Obviously the implementation of cont() has more startup overhead: an
exception frame must be established, and vector::resize was implemented out
of line, so there's a call. One last detail (and a major complaint about
std::vector): When you resize() a std::vector, the new elements are
initialized - in other words, they're set to 0. This means that the call to
resize() has as much overhead as the balance of the loop!
Next, let's take a look at the inner loops of cont() and arr():
cont():
; 11 : for(int i=0; i<size; i++)
0002c 33 c0 xor eax, eax
$L63681:
; 12 : temp[i] = i;
0002e 8b 4d e8 mov ecx, DWORD PTR _temp$[ebp+4]
00031 89 04 81 mov DWORD PTR [ecx+eax*4], eax
00034 40 inc eax
00035 3b c6 cmp eax, esi
00037 7c f5 jl SHORT $L63681
; 13 : }
--------------------------------
arr():
; 18 : for(int i=0; i<size; i++)
0000b 33 c9 xor ecx, ecx
$L63691:
; 19 : a[i]=i;
0000d 89 0c 88 mov DWORD PTR [eax+ecx*4], ecx
00010 41 inc ecx
00011 81 f9 40 0d 03 00 cmp ecx, 200000 ; 00030d40H
00017 7c f4 jl SHORT $L63691
--------------------------------
The core loops are similar, but the optimizer has made different (and less
optimal) register allocation choices: In arr(), the value of 'a' is kept in
the ecx register throughout the loop, while in cont(), the pointer to the
underlying data is re-fetched from the stack for each iteration. The
optimizer was unable to promote a member of the vector class to register
storage throughout the loop.
For completeness, let's look at the epilog code.
cont():
00039 8d 4d e4 lea ecx, DWORD PTR _temp$[ebp]
0003c e8 00 00 00 00 call
?_Tidy@?$vector@IV?$allocator@I@std@@@std@@IAEXXZ ; std::vector<unsigned
int,std::allocator<unsigned int> >::_Tidy
00041 8b 4d f4 mov ecx, DWORD PTR __$EHRec$[ebp]
00044 5f pop edi
00045 5e pop esi
00046 64 89 0d 00 00
00 00 mov DWORD PTR fs:__except_list, ecx
0004d c9 leave
0004e c3 ret 0
--------------------------------
arr():
; 20 : delete [] a;
00019 50 push eax
0001a e8 00 00 00 00 call ??_V@YAXPAX@Z ; operator delete[]
0001f 59 pop ecx
; 21 : }
00020 c3 ret 0
--------------------------------
Again, the epilog for the array case is slightly simpler due to the lack of
an exception frame.
So, why is the vector case slower? Because it's doing twice the work. Try
making the vector in cont() static and see what the result is. In my case,
the times became:
Container test...
Elapsed: 959344us
Array test...
Elapsed: 1.7355e+006us
--> The std::vector case is now nearly 2X faster than the array case.
Or, change the implemention of arr() to initialize the array to 0. In my
case, the times became:
Container test...
Elapsed: 2.69969e+006us
Array test...
Elapsed: 2.82572e+006us
So, the final question: why were the times the same in VC6? Because VC6 did
not correctly initialize the new elements of the vector as a result of
resize().
-cd
>Thore,
I get exactly the same speed with arrays and vectors with this code. (I
changed the reserve() to resize().) This is in VC++ 7.0 with optimize
for speed.
--
Be seeing you.
<snip>
Is there any advantage in your code over:
std::vector<unsigned> temp(size);
?
Paul