a=rand(1,300);
t1(1:10000)=0;
t2(1:10000)=0;
for i=1:10000
tic; fft(a); t1(i)=toc;
tic; fft(a,2^nextpow2(length(a))); t2(i)=toc;
end
mean(t1)
mean(t2)
I always get mean(t2) > mean(t1)
is there a reason why?
First of all don't use TIC/TOC for performance purposes.
Use CPUTIMe or better, the profiler.
Second, use one loop for each case:
N = 100000;
ts1=cputime;
for n=1:N
fft(a);
end
te1=cputime - ts1;
ts2=cputime;
for n=1:N
fft(a2^nextpow2(length(a)));
end
te2=cputime - ts2;
tm1 = te1/N;
tm2 = te2/N;
Rune
In addition to what Rune said, it may be that the fft() operation itself is faster, but it still costs you CPU time to zero pad the array. This requires memory allocation operations.
Accelerating fft's by zero padding is ideally something you would do when you have to do multiple fft jobs in sequence. You would then pre-allocate the zero-padded array and reload the array with new data for each job.
Of course the matlab documentation says to use tic/toc instead of cputime
doc tic
The results are substantially the same in a relative sense:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
a=rand(1,300);
N = 100000;
% Use cputime:
ts1=cputime;
for n=1:N
fft(a);
end
te1=cputime - ts1;
ts2=cputime;
for n=1:N
fft(a,2^nextpow2(length(a)));
end
te2=cputime - ts2;
times = [te1 te2];
reltime = times/min(times)
% Use tic toc:
tic
for n=1:N
fft(a);
end
te1 = toc;
tic
for n=1:N
fft(a,2^nextpow2(length(a)));
end
te2 = toc;
times = [te1 te2];
reltime = times/min(times)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
reltime =
1 3.0488
reltime =
1 3.096
7KxK-p195K951R<3;4;-K.@K-%;:-/AA@kY;-R1;;/EK4B8.-<K8@-:4e1K
doc cputime
not tic as I previously posted. Sorry.
{eZdiWW5ZVtttdtVV:^i^\e/]t#VjVbbt`iddj{ncXkVd]tZc]VdaNBZaXt
One other point. Zero-padding is favorable mainly when your initial number of points (before zero padding) is only slightly less than the next higher power of 2.
Therefore, you should really use length(a)=511 instead of length(a)=300 to see a favorable effect.
Rune,
If I recall correctly, you have stated in previous posts that you are
using a version of MATLAB that is some years out of date. Our current
recommendation to users is to use tic and toc. These functions have
changed quite a bit over the last few years.
To the original poster, give my timeit function a try:
http://www.mathworks.com/matlabcentral/fileexchange/18798
---
Steve Eddins
http://blogs.mathworks.com/steve/
Also, blindly zero-padding a signal may introduce errors into the
Fourier transform. For example, if the signal has a mean of say 100,
then zero padding will introduce a step of 100 and this will result in
red noise in the frequency domain.
It's a minor thing, but you probably want to compute 2^nextpow2(length(a))
once outside the loop being timed and use that value inside the loop. That
saves (N-1) [assuming you time the one call to NEXTPOW2 that you do outside
the loop] instances of function call overhead, error checking inside
NEXTPOW2, exponentiation, etc.
*snip*
--
Steve Lord
sl...@mathworks.com
True if the O.P. is calculating the fft of many variables that are the same length in a loop. But for comparing the raw speed of getting the fft one way versus another, I think it is fair to leave any steps that must be taken inside the timer!
JggIKggugnWA\WXMaQI\IP"TgVTXWP5W\WKIWIIJMg]M(PUVnUQS-IOgM^]
FFTW (the FFT engine used by MATLAB) knows how to do efficient
non-power-of-two FFTs. If you use 293 (a prime number) for the length,
you see the desired effect. It is still a small effect (1.5x), mostly because
these lengths are so small. The overhead is substantial. Trying this
again with larger numbers will show a larger difference. Using 293*293,
for example, gives me an almost 5x speedup to use the next largest power
of 2.
-Peter
The correct statement is that FFTs are faster for sizes that factorize
into small prime factors, ideally 2,3,5. Powers of 2 have a
historical advantage in that they are the most heavily optimized in
many FFT implementations, but the FFT implementation used by Matlab
(FFTW) is almost as optimized for other small prime factors.
If you have a size that is factorizable into powers of 2,3,5 (like 300
in your case), then going to the next power of 2 (512 in your case) is
likely to be slower.
If you have a size that has large prime factors, then it is usually
beneficial from a performance perspective to go to the next size that
is factorizable into {2,3,5}, not the next power of 2. (Note,
however, that even large prime sizes are handled by an N log N
algorithm in the FFT code used by Matlab.) Of course, depending upon
your application, zero-padding may cause other problems by changing
what you are computing.
Regards,
Steven G. Johnson