julia> addprocs(3)
3-element Array{Int64,1}:
2
3
4
julia> nheads = @parallel (+) for i=1:200000000
         Int(rand(Bool))
       end
100008845

In my experience, Hadoop is pretty terrible about minimizing data movement; Spark seems to be significantly better.
The only codes that really nail it are carefully handcrafted HPC codes.