Pete.
I just want to share (well, "borrow") an observation, by permalink, from this
Microchip forum thread; it may be useful for kicking the subject around a bit further.
QUOTE:
"
I did some tests on a 256-point FFT... the dsPIC33F comes in at around
22000 cycles (just over 19000 for the FFT itself, plus the
BitReverseComplex and a copy operation, assuming you are doing it out
of place). This is Microchip's standard optimised FFT algorithm in their
DSP library, and of course it makes extensive use of the dsPIC's
addressing modes and DSP core.
For comparison, an
implementation of an FFT on the PIC32 using standard C code with no
optimised assembler, similar to what you would find in 'Numerical
Recipes in C', takes ~410000 cycles (with no optimisation turned on).
However, there 'exist' some MIPS assembly-language optimised versions of
the FFT routines, and these run quite fast. By fast I mean ~34000
cycles for the same 256-point FFT doing the same copy and reverse
operations.
So whilst in cycle counts the PIC32 might take
longer, when you then take into account the raw CPU speed (72 MIPS v 40
MIPS) it comes out about the same as or faster than the dsPIC. NOW, YES, of
course pipeline stalls will have an effect on this, and YES, of course the
addressing modes on the dsPIC are neater and faster, BUT then the cache
will help on the tight repetitive loops found in the FFT algorithm.
Overall there is not much to choose between the two of them on this kind of test on 16-bit arithmetic.
"
Cheers.
Daniel