I have a parallel application where I want to modify some objects from worker processes, and get the modified values back into the main process. Is there a clean way to do this in Julia?
Here is a contrived example illustrating the problem (run with "julia -p 2 myfile.jl"):
@everywhere function mutate_arr!(arr, x)
    println(arr)
    push!(arr, x)
    println(arr)
    nothing
end
arr = [1, 2, 3]
println(arr)
remotecall_fetch(2, mutate_arr!, arr, 4)
println(arr)
This fails with the following error:
exception on 2: ERROR: cannot resize array with shared data
in push! at ./array.jl:459
in mutate_arr! at .../myfile.jl:3
in anonymous at multi.jl:855
in run_work_thunk at multi.jl:621
in anonymous at task.jl:855
Is it the case that remotecall and @spawn do not allow modifying the function arguments? If so, this should be documented.
My second attempt uses a RemoteRef to pass the array explicitly:
@everywhere function mutate_arr!(arr, x)
    println(arr)
    push!(arr, x)
    println(arr)
    nothing
end
@everywhere function wrapper_mutate_arr!(ref, x)
    local_arr = take!(ref)
    mutate_arr!(local_arr, x)
    put!(ref, local_arr)
    nothing
end
arr = [1, 2, 3]
println(arr)
arr_ref = RemoteRef(2)
put!(arr_ref, arr)
remotecall_fetch(2, wrapper_mutate_arr!, arr_ref, 4)
arr = take!(arr_ref)
println(arr)
This fails with the same error ("cannot resize array with shared data"). What is shared here, and why?
If I replace
local_arr = take!(ref)
with
local_arr = copy(take!(ref))
it works. But I think this creates an additional copy (one during serialization and one from the copy() call), which I would like to avoid.
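For reference, here is the full script with that change applied. It is only a sketch of the workaround for Julia 0.3.x (it relies on RemoteRef and the 0.3 argument order of remotecall_fetch). The comment about why copy() helps is my assumption, not something I have confirmed:

```julia
# Run with: julia -p 2 myfile.jl   (written for Julia 0.3.x)
@everywhere function mutate_arr!(arr, x)
    push!(arr, x)
    nothing
end

@everywhere function wrapper_mutate_arr!(ref, x)
    # Assumption: the deserialized array does not own its buffer,
    # so copy() is used to obtain an array that push! can resize.
    local_arr = copy(take!(ref))
    mutate_arr!(local_arr, x)
    put!(ref, local_arr)   # store the modified array back in the reference
    nothing
end

arr = [1, 2, 3]
arr_ref = RemoteRef(2)     # reference stored on worker 2
put!(arr_ref, arr)
remotecall_fetch(2, wrapper_mutate_arr!, arr_ref, 4)
arr = take!(arr_ref)       # bring the modified array back to the main process
println(arr)
```

The only difference from my second attempt is the copy() call in wrapper_mutate_arr!; the println calls inside mutate_arr! are omitted for brevity.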
In summary: If I have a function like mutate_arr! that I want to offload to a worker process, what is the right way to do it? Obviously the functional way of thinking is to avoid mutation entirely, but if I already have a function like that, what should I do? Just use an additional copy() as above?
(In case it matters, I'm using Julia 0.3.4.)
Thanks,
Constantin