On Monday, July 30, 2012 09:09:42 PM Rajeev Singh wrote:
> I made sure to check that all the programs are running with single thread.

I wrote out your algorithm using direct BLAS calls. That got it down to ~8s on
my machine, from an improved version using plain Julia that was running at
~12s. There is a bigger advantage to using BLAS (I've seen up to 6-fold) when
the matrices get bigger. So I'm very glad I took the time to learn this, and
I'll definitely use it in some work I'll be doing soon.
Files are attached. Change Niter to 2^16 before running @time laplace_iter(u,
dx, dy, Niter).
If we assume that BLAS is basically as good as it gets, and if my laptop is
not dreadfully slower than your machine, then there's something slightly
suspicious about your results that show running times <3s.
For the Cilk and various Fortran implementations, if you really are certain
they are restricted to using only one core (you checked via top, right?),
then...are you really sure the computation is running as you expect? The
reason I ask is that there were things in your Julia code that one could
imagine would cause it to just skip the entire computation: you were returning
u[1,1], and that value is always zero, so in principle a sufficiently smart
compiler could have just optimized the entire computation away. In my code I
changed it to make sure that didn't happen. If it is simply being optimized
away for some of your languages (e.g., Cilk & Fortran), that would explain the
suspiciously fast results you're seeing. Of course, if it were being optimized
away, perhaps it should have run even faster, so I am not sure this explains
it.
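
As a minimal sketch of that kind of guard (this is not the attached code;
laplace_sweep! and laplace_checksum are hypothetical names, and the grid size
and boundary condition are assumptions), one can return a value that depends on
every grid point instead of the corner u[1,1]:

```julia
# Hypothetical stand-in for the benchmark kernel: one in-place sweep over
# the interior points of u, with grid spacings dx and dy.
function laplace_sweep!(u::Matrix{Float64}, dx::Float64, dy::Float64)
    dx2, dy2 = dx^2, dy^2
    c = 1 / (2 * (dx2 + dy2))
    for j in 2:size(u, 2)-1, i in 2:size(u, 1)-1
        u[i, j] = ((u[i-1, j] + u[i+1, j]) * dy2 +
                   (u[i, j-1] + u[i, j+1]) * dx2) * c
    end
    return u
end

# Return a value that depends on every grid point, rather than the
# corner u[1, 1], which the sweep never writes.
function laplace_checksum(u, dx, dy, niter)
    for _ in 1:niter
        laplace_sweep!(u, dx, dy)
    end
    return sum(u)
end

N = 50                           # assumed grid size, for illustration only
u = zeros(N, N)
u[end, 2:end-1] .= 1.0           # assumed boundary condition; corners stay zero
s = laplace_checksum(u, 1 / (N - 1), 1 / (N - 1), 100)
println("u[1,1] = ", u[1, 1], "  checksum = ", s)
```

Here u[1,1] is still zero after the run, while the checksum changes with the
number of iterations, so the work cannot be dropped without changing the
result.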
If all really is as you expect, then either:
1. My machine is a lot slower than yours. Try this code, and see what happens.
2. There is something significantly better than OpenBLAS out there. It would be
interesting to know what it is.
Those seem to be the only two possibilities. I profiled the laplace_blas code,
and it's spending all of its time in BLAS calls. So by this point there is no
"julia" in there, really.
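
For reference, a direct BLAS call pinned to one thread looks roughly like this
in current Julia. This is only an illustration using a matrix multiply, not the
laplace_blas code, and the matrix size is arbitrary:

```julia
using LinearAlgebra

BLAS.set_num_threads(1)          # keep BLAS on a single thread

n = 200                          # arbitrary size, for illustration only
A = rand(n, n)
B = rand(n, n)
C = zeros(n, n)

# C = 1.0*A*B + 0.0*C, computed in place by BLAS
BLAS.gemm!('N', 'N', 1.0, A, B, 0.0, C)

# Compare against the ordinary Julia expression
println(C ≈ A * B)
```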
--Tim