I was tasked to translate a Matlab script over to C++.
The Matlab script is a simple wrapper (basically for loops and some
easy vector manipulations) that calls 2 external C++ functions
compiled into DLLs.
All I had to do was translate the for loops and turn the vectorized
code into loops in C++ and call the native C++ functions that were
originally called through the DLLs in MATLAB.
Now here's the weird thing: on the same computer, running Windows
XP, the Matlab script took about 35 seconds to perform the job (timed
using internal time functions), whereas the compiled Windows
executable took about 90 seconds. I've tried and tried to optimize
the C++ code, but I just can't beat the Matlab script's run time.
What gives? (I may not be an expert C++ programmer, but my coding
skills are not that bad.)
BTW, I'm using Matlab version 7.0.1.24704 (R14) Service Pack 1. My
C++ compiler is MS Visual C++ .NET 2003 (generating a Win32 console
application).
I would greatly appreciate it if someone could shed some light on this
issue. I'm very confused...
Kevin
MATLAB is in many instances a wrapper around numerical libraries that
have been optimized intensively (LAPACK in Fortran, FFTW in C, etc.).
So my question is this: why is it faster for Matlab to call the C
routines? It seems that compiling the C routines along with a C
wrapper should definitely be faster. I'm very confused...
Kevin
So it's not that the scripted Matlab language is faster than the C code,
it's that Matlab is calling highly optimized libraries and your C/C++ code
is not.
However, you can use highly optimized libraries too. I think there's even a
way to link to the BLAS and LAPACK libraries that come with Matlab, but I've
never done that. Another option is to use the Math Kernel Library
available from Intel. I think that's what MathWorks uses.
Your experience is not something I'm used to seeing. I normally see
a speedup going from ML to MEX (even for vectorized ML code). Can
you describe what your code is actually doing (are you rewriting BLAS
functions)? Better still, a snippet that shows the problem would be
interesting.
- Steve
So you say that
a) In the fast version Matlab is just a wrapper, virtually all the
time is spent inside the external C code.
b) In the slow version, the C wrapper calls the same C functions, but
these identical functions take nearly 3 times as long.
Are you absolutely sure that both code versions do the same thing,
the same number of times?
Are you sure that the Matlab code doesn't have anything substantial
to do (Like matrix operations, FFTs etc.)?
Murphy
Thank you all for the replies. Here is a summary of what my code
does. For those that are familiar with communication systems: it's
basically running performance tests (BER curves) for a convolutional
decoder.
(Original Setup)
1. Decoder functions written in C and compiled into DLLs so that they
can be called from Matlab (I didn't do this part and I don't know
much about MEX, so I can't elaborate on that).
2. Matlab wrapper generates random bits (x = rand(1,N)), encodes them
using convenc(x,trellis), and then adds AWGN (y = x + randn(1,N)).
This new vector is passed to the C decoding function (out =
decode(y)), and the Matlab script then counts the number of bit errors.
(New Setup)
1. Same decoder functions as above (I have the C source).
2. Write a C version of the Matlab test wrapper (call the C "rand"
function, write my own "randn" function using Box-Muller technique,
write my own "convenc" using table lookup)
3. Compile into EXE and run.
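For reference, here is a minimal sketch of a Box-Muller randn built on the C library rand(), as described in step 2 above. It is not the code from the actual test wrapper; the names uniform01 and randn_box_muller are made up for illustration. It caches the second value that each transform produces, so the log/sqrt/sin/cos cost is paid only on every other call.

```cpp
#include <cmath>
#include <cstdlib>

// Uniform sample in the open interval (0, 1), built on the C library rand().
// The +1 / +2 offsets keep the value away from 0 so that log() stays finite.
// (Hypothetical helper name, not from the original wrapper.)
double uniform01()
{
    return (std::rand() + 1.0) / (RAND_MAX + 2.0);
}

// Standard normal sample via the basic (trigonometric) Box-Muller transform.
// Each transform yields two independent values; the second one is cached and
// returned by the next call.
double randn_box_muller()
{
    static bool has_spare = false;
    static double spare = 0.0;

    if (has_spare) {
        has_spare = false;
        return spare;
    }

    const double two_pi = 6.283185307179586;
    const double u1 = uniform01();
    const double u2 = uniform01();
    const double radius = std::sqrt(-2.0 * std::log(u1));

    spare = radius * std::sin(two_pi * u2);
    has_spare = true;
    return radius * std::cos(two_pi * u2);
}
```

A version that throws the second value away does roughly twice the transcendental-function work per sample, which is worth ruling out before blaming the decoder.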
So as you can see, there are no fancy vector/matrix manipulation
routines in the original Matlab script (i.e. no FFT, SVD, matrix
inverse...).
I profiled both the Matlab script and the C code. For both cases,
the majority of the running time (over 90%) is spent in the "decode"
function (which is written in C).
On my slow desktop running Windows XP, Matlab run time is about 33
seconds. On the same machine, compiled C EXE (using Visual C++ with
no optimization options) takes about 90 seconds. I also tried to
compile the code on a fast Linux Mandrake 10 box (turning on the
optimization using g++ -O) and it takes about 37 seconds to run.
I'm still confused...
Kevin
> On my slow desktop running Windows XP, Matlab run time is about 33
> seconds. On the same machine, compiled C EXE (using Visual C++ with
> no optimization options) takes about 90 seconds.
The speed difference is a factor 3 or so. Could this be due to
compiler optimizations? I've seen similar numbers when I ran an
optimizer on a C++ program compiled under HP-UX a few years ago.
Run-time improved from 50 s to 14 s when I switched from the
default compiler optimization option -O to -O3. This was with
the native HP compiler, though, not g++.
> I also tried to
> compile the code on a fast Linux Mandrake 10 box (turning on the
> optimization using g++ -O) and it takes about 37 seconds to run.
>
> I'm still confused...
Other than that, I'd look into whether complex numbers occur
anywhere in these computations. Matlab does things a bit
differently than usual, in that the real and imaginary parts
are stored in different arrays, not side by side in memory.
Such issues could have an impact on performance; you may
have to shuffle data around to fit with whatever format the
MEX file needs. I can't, on the other hand, see why complex
numbers should appear in this particular application.
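To make the layout point concrete, here is a small sketch that copies split real/imaginary arrays into the interleaved layout of std::complex<double>. The function name interleave and its arguments are hypothetical; in a real MEX file the two input pointers would come from the MEX API rather than from plain arrays.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Copy split storage (one array of real parts, one of imaginary parts)
// into interleaved storage (re, im, re, im, ...).
// Hypothetical helper for illustration only.
std::vector<std::complex<double> > interleave(const double* re,
                                              const double* im,
                                              std::size_t n)
{
    std::vector<std::complex<double> > out(n);
    for (std::size_t i = 0; i < n; ++i) {
        // A null imaginary pointer is treated as purely real data.
        out[i] = std::complex<double>(re[i], im ? im[i] : 0.0);
    }
    return out;
}
```

The copy is a full extra pass over the data on every call, so it only matters when the arrays are large or the call is made very often.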
Rune
I think the answer could lie in either randn or convenc. The Matlab
versions are very fast. For example, randn only takes about twice as
long as rand. If you substitute rand for randn in the C code (just as
a test), does it speed things up much?
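One way to run that test in isolation is a tiny timing harness like the sketch below. The names uniform_sample and time_generator are made up for illustration; any generator with the signature double f() can be passed in, including a hand-written randn.

```cpp
#include <cstdlib>
#include <ctime>

// Uniform sample in [0, 1] from the C library rand().
double uniform_sample()
{
    return std::rand() / static_cast<double>(RAND_MAX);
}

// CPU seconds spent drawing n samples from gen. The running sum is written
// to *checksum so the compiler cannot discard the calls as dead code.
double time_generator(double (*gen)(), long n, double* checksum)
{
    double sum = 0.0;
    const std::clock_t start = std::clock();
    for (long i = 0; i < n; ++i) {
        sum += gen();
    }
    const std::clock_t stop = std::clock();
    *checksum = sum;
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_generator once with uniform_sample and once with the normal generator, for the same n, shows how much of the total run time the noise generation can account for.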
Murphy
Have you pondered the memory footprint of the code?
How big is the table used by your version of "convenc"? How much
memory do you have in your machine? The Matlab convenc may have a
very small memory requirement, while your convenc might possibly
be blowing away the cache. Does your timing run execute once through,
or is there an overall loop?
If you want to pursue this idea further, take a look at the source code
for a memory bandwidth test such as STREAM, where you will find explicit
code whose function is to blow away the cache between tests.
http://www.cs.virginia.edu/stream/ref.html
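To give a feel for how small a table-lookup encoder can be, here is a sketch for an assumed rate 1/2 code with constraint length 3 and octal generators 7 and 5. The trellis actually used in this thread is not known, and the names Transition, Trellis, build_trellis and encode are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Assumed example code: rate 1/2, constraint length 3, generators 7 and 5
// (octal). These parameters are illustrative, not taken from the thread.
const unsigned K = 3;
const unsigned NUM_STATES = 1u << (K - 1);
const unsigned G0 = 07;
const unsigned G1 = 05;

struct Transition {
    unsigned char next_state;
    unsigned char out0;
    unsigned char out1;
};

struct Trellis {
    Transition t[NUM_STATES][2];  // indexed by [state][input bit]
};

// XOR of all bits in v.
unsigned parity(unsigned v)
{
    unsigned p = 0;
    while (v != 0) {
        p ^= v & 1u;
        v >>= 1;
    }
    return p;
}

// Precompute every (state, input) transition once.
Trellis build_trellis()
{
    Trellis tr;
    for (unsigned s = 0; s < NUM_STATES; ++s) {
        for (unsigned b = 0; b < 2; ++b) {
            // Shift register: newest bit on the left, oldest on the right.
            const unsigned reg = (b << (K - 1)) | s;
            tr.t[s][b].next_state = static_cast<unsigned char>(reg >> 1);
            tr.t[s][b].out0 = static_cast<unsigned char>(parity(reg & G0));
            tr.t[s][b].out1 = static_cast<unsigned char>(parity(reg & G1));
        }
    }
    return tr;
}

// Encode from the all-zero state; two output bits per input bit.
std::vector<int> encode(const Trellis& tr, const std::vector<int>& bits)
{
    std::vector<int> out;
    out.reserve(2 * bits.size());
    unsigned state = 0;
    for (std::size_t i = 0; i < bits.size(); ++i) {
        const Transition& step = tr.t[state][bits[i] & 1];
        out.push_back(step.out0);
        out.push_back(step.out1);
        state = step.next_state;
    }
    return out;
}
```

With these assumed parameters the table has 4 states x 2 inputs = 8 entries. Even a constraint-length-7 code would need only 64 x 2 entries, small enough to sit comfortably in L1 cache, so a cache problem would have to come from a much larger table layout than this one.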
> 3. Compile into EXE and run.
>
> So as you can see, there are no fancy vector/matrix manipulation
> routines in the original Matlab script (i.e. no FFT, SVD, matrix
> inverse...).
>
> I profiled both the Matlab script and the C code. For both cases,
> the majority of the running time (over 90%) is spent in the "decode"
> function (which is written in C).
>
> On my slow desktop running Windows XP, Matlab run time is about 33
> seconds. On the same machine, compiled C EXE (using Visual C++ with
> no optimization options) takes about 90 seconds. I also tried to
> compile the code on a fast Linux Mandrake 10 box (turning on the
> optimization using g++ -O) and it takes about 37 seconds to run.
You wouldn't by chance have the Matlab time on the fast Linux box?
It would also be of interest to know the answers to what processor,
what speed, how much memory for each system.
> I'm still confused...
>
> Kevin
As an aside, I have been preferring to use C for time-sensitive code,
since I figure there is less risk of unintended overhead, i.e. to my
way of thinking C is closer to what-you-write-is-what-you-get.
Good luck,
-rajeev-