Hi all,
I'm trying to parallelize some time-consuming functions in my LightGraphs package - specifically, centrality calculations that require all-pairs Dijkstra shortest-paths calculations. This should be a trivial thing to parallelize, as the shortest paths from one source are completely independent of the shortest paths from any other source.
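For reference, the per-source decomposition I have in mind looks roughly like this (a minimal sketch, assuming LightGraphs' `dijkstra_shortest_paths(g, v)` and the `dists` field on its result; the graph `g` and worker setup are placeholders):

```julia
using LightGraphs

# Each source's single-source shortest paths are independent, so the
# all-pairs computation is just a map over the vertex set. Note that
# pmap serializes `g` for every worker call -- that is exactly the
# copying cost described below.
function parallel_dists(g)
    pmap(v -> dijkstra_shortest_paths(g, v).dists, vertices(g))
end
```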
However, I can't make anything that's both fast and memory-efficient. I've tried two things:
1) Pass the graph object into the parallel function. This requires copying the entire graph to each worker, which makes the parallel function take 4x as long as the single-process one. I understand why this happens.
2) Create the distance matrix as a SharedArray and rewrite Dijkstra to operate on matrices instead of my graph type. The problem here is that this requires a dense matrix of Float64s of size (nv, nv), which means it won't scale to the kinds of calculations that would require parallelization in the first place. My system can't even handle a 100k-node graph as a Float64 matrix - that's 80 GB for the distances alone.
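For concreteness, the dense-matrix memory cost works out as follows (just back-of-the-envelope arithmetic, not tied to any particular API):

```julia
nv = 100_000                      # number of vertices
bytes = nv^2 * sizeof(Float64)    # one Float64 per (src, dst) pair = 8e10 bytes
gib = bytes / 1024^3              # roughly 74.5 GiB of shared memory needed
```

A sparse representation would only help if most distances are unneeded or infinite, which isn't the case for all-pairs results on a connected graph.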
Shared sparse matrices might work here, and I know there was some previous work on them, but what I'd really like is arbitrary shared memory. Is this something 1) in the works, and/or 2) usable from a testing perspective?
Any other suggestions would be greatly appreciated.