On Sat, Feb 2, 2013 at 5:58 PM, Vivek Kulkarni <vive...@gmail.com> wrote:
> Hi,
> Can someone advise whether networkx is suitable for a graph with 2
> million nodes and ~400 million edges? What order of nodes and edges does
> networkx typically support? I only intend to load the graph and run
> PageRank, which I believe should be efficient (perhaps linear in the number
> of edges). I have access to a server with 128GB RAM and an Intel Xeon CPU
> with 6 cores. Any suggestions on how to reduce the memory footprint would
> also be helpful. I am already using integers as node labels and building the
> graph in 2 pieces (to keep memory in check). I intended to have float edge
> weights but soon saw that I was running out of memory, so I had to discard
> them.
If you have enough memory it would work, but 128GB may not be enough
for a graph of this size.
If all you want to do is calculate PageRank, it is a linear algebra
problem and you don't need all of the NetworkX machinery. If you want
to use Python, you could adapt the NetworkX pagerank_scipy() algorithm
and load your data directly into a SciPy sparse matrix.
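
A rough sketch of that idea follows. It is not the NetworkX code itself,
just a plain power iteration in the same spirit. The function name
pagerank_sparse and its parameters are made up for illustration, and it
assumes the edges are already available as two integer NumPy arrays (src
and dst) with node labels 0..n-1 and no edge weights.

```python
import numpy as np
import scipy.sparse as sp


def pagerank_sparse(src, dst, n, alpha=0.85, tol=1e-6, max_iter=100):
    """Power-iteration PageRank on an edge list (hypothetical helper)."""
    # Unweighted adjacency matrix built straight from the edge arrays.
    # Duplicate (src, dst) pairs are summed during the COO -> CSR conversion.
    data = np.ones(len(src), dtype=np.float32)
    A = sp.coo_matrix((data, (src, dst)), shape=(n, n)).tocsr()

    # Scale each row by 1/out-degree; rows of dangling nodes stay all zero.
    out_deg = np.asarray(A.sum(axis=1)).ravel()
    dangling = out_deg == 0
    inv = np.zeros(n)
    inv[~dangling] = 1.0 / out_deg[~dangling]
    M = sp.diags(inv) @ A

    # Transpose once so every iteration is a single sparse mat-vec product.
    MT = M.T.tocsr()

    x = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Rank held by dangling nodes is spread uniformly over all nodes,
        # then the usual damping/teleport term is added.
        x_new = alpha * (MT @ x + x[dangling].sum() / n) + (1 - alpha) / n
        if np.abs(x_new - x).sum() < n * tol:
            return x_new
        x = x_new
    raise RuntimeError("power iteration did not converge")
```

Calling pagerank_sparse(src, dst, n) returns an array of length n that
sums to 1. Each sparse matrix stores one index and one value per edge, so
with 4-byte indices and 8-byte values 400 million edges come to roughly
5GB per matrix. The sketch keeps a few such matrices alive at once, which
should still fit comfortably in 128GB.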
Aric