I think I have an example that pretty clearly illustrates a problem I've been having with parallel computing. I need to get this working, so I'm going to keep digging into what's going on, but I'd appreciate help from anyone who's similarly stuck :)
Start with an array of 100,000 random numbers.
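For completeness: the snippets below all operate on a variable named a that's never shown being created, so I'm assuming it was set up with something like this.

```julia
# Assumed setup (not shown in the post): 100,000 uniform random
# Float64s in [0, 1), bound to the name the later snippets use.
a = rand(100000)
```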
Cool. Now, on one core, I can blast through those and map them to between 0 and 100.
julia> @elapsed for x in a
x = x * 100
end
0.017558935
Just for sanity's sake, you can add a print statement inside the loop if you want. That way you know it's not getting optimized out.
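One thing worth noting about the loop above: for x in a binds x to a copy of each element, so x = x * 100 only rewrites the local variable and the array itself is left untouched. A version that actually stores the scaled values (same per-element work, so the timing comparison still makes sense) would index into the array:

```julia
a = rand(100000)           # same assumed setup as above

# Index into the array so the write actually lands in a,
# instead of rebinding a loop-local copy of each element.
for i in eachindex(a)
    a[i] = a[i] * 100
end
```

The broadcast form a .*= 100 does the same thing in one line.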
Anyway, what happens when I add a second core? Let's do a pmap and compare.
julia> addprocs(1)
:ok
julia> @elapsed pmap((x)->x * 100, a)
19.985164725
What the heck? "Well," you say, "maybe 100,000 numbers isn't enough to see gains by using multiple cores."
Maybe so, but that's not the primary source of my consternation. Consider that I can generate a random array on core 2 and pull it over to core 1 lickety-split.
julia> @elapsed fetch(@spawnat 2 rand(100000))
0.425517213
Also, in an effort to compare apples to apples, we can run pmap with only a single process.
julia> @elapsed pmap((x)->x * 100, a)
3.844994459
Still a couple hundred times slower than the plain loop, but nothing like the 20 seconds with a second process.
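For what it's worth, my understanding of why pmap behaves this way: it dispatches each element as its own remote call, so mapping over 100,000 tiny multiplies means 100,000 inter-process round trips, and the messaging cost dwarfs the arithmetic. A sketch of the usual workaround is to chunk the array so each remote call does substantial work (the chunk size of 10,000 here is arbitrary, and note that in current Julia pmap lives in the Distributed standard library):

```julia
using Distributed
addprocs(1)                     # one worker, as in the post

a = rand(100000)

# One chunk per remote call instead of one element per call,
# so the messaging cost is paid 10 times rather than 100,000 times.
chunks = [a[i:min(i + 9999, end)] for i in 1:10000:length(a)]
scaled = pmap(chunk -> chunk .* 100, chunks)
result = vcat(scaled...)
```

Current versions of pmap also take a batch_size keyword that does this kind of batching for you.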
I haven't done enough testing to say whether this is specific to pmap; I figured I'd post first and see if anyone had any insight.
Thanks!