On 13.03.2015 at 16:20, Pieter Barendrecht wrote <
pjbare...@gmail.com>:
> Thanks! I tried both approaches you suggested. Some results using SharedArrays (100,000 simulations)
>
> #workers #time
> 1 ~120s
> 3 ~42s
> 6 ~40s
>
> Short question. The first print statement after the for-loop is already executed before the for-loop ends. How do I prevent this from happening?
>
> Some results using the other approach (again 100,000 simulations)
>
> #workers #time
> 1 ~118s
> 2 ~60s
> 3 ~42s
> 4 ~38s
> 6 ~40s
>
Could you post a simplified code snippet? Either here or in a gist. It is difficult to know what exactly you are doing ;-)
> Couple of questions. My equivalent of "myfunc_pure()" also requires a second argument.
Is that argument changing, or is it there to switch between different algorithms, etc.?
> In addition, I don't make use of the "startindex" argument in the function. What's the common approach here? Next, there are actually multiple variables that should be returned, not just "result".
You can always return (a,b,c) instead of a, i.e. a tuple. The function you provide to the reduction then has the signature myreducer(a::Tuple, b::Tuple). Combine the two tuples and again return a tuple.
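A minimal serial sketch of that tuple pattern (the names myfunc_pure and myreducer are placeholders, not Pieter's actual code; in the parallel version the same reducer would be handed to the parallel reduction instead of `reduce`):

```julia
# Each "simulation" returns several quantities at once as a tuple.
function myfunc_pure(x)
    s  = x     # e.g. a running sum
    sq = x^2   # e.g. a sum of squares
    n  = 1     # e.g. a count of iterations
    return (s, sq, n)
end

# The reducer combines two tuples element-wise and returns a tuple again,
# so it can be chained across any number of partial results.
myreducer(a::Tuple, b::Tuple) = (a[1] + b[1], a[2] + b[2], a[3] + b[3])

# Serial illustration of the reduction over four iterations.
totals = reduce(myreducer, map(myfunc_pure, 1:4))
# totals == (10, 30, 4): sum, sum of squares, count
```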
>
> Overall, I'm a bit surprised that using more than 3 or 4 workers does not decrease the running time. Any ideas? I'm using Julia 0.3.6 on a 64bit Arch Linux system, Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz.
It can be any number of things: memory bandwidth could be the limiting factor, or the computation is actually nicely sped up and much of what you see is communication overhead. In that case, work on chunks of data / batches of iterations, i.e. don't pmap over millions of things but only a couple dozen. Looking at the code might shed some light.
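A hypothetical sketch of that batching idea (run_simulations and chunked_total are made-up names): split the N iterations into a few dozen ranges so each task is coarse-grained enough to amortize the communication overhead.

```julia
# Stand-in for the real per-iteration work: here, summing squares over a range.
run_simulations(r) = sum(i -> i^2, r)

function chunked_total(N, nchunks)
    # Build roughly equal-sized chunks covering 1:N.
    step = ceil(Int, N / nchunks)
    chunks = [i:min(i + step - 1, N) for i in 1:step:N]
    # Map over a couple dozen chunks rather than N individual iterations;
    # with the parallel machinery loaded, this `map` would become `pmap`.
    partials = map(run_simulations, chunks)
    return sum(partials)
end
```

The point is that the number of tasks shipped to workers is nchunks, not N, so the per-task scheduling and serialization cost is paid only a few dozen times.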