Hi,
Firstly, thanks for your work on this great toolkit! I'm trying to get a rough idea of how much memory Junto will need, so I can judge whether it's feasible to run on large graphs.
Do you have an idea of the memory overhead per node/edge? Or whether it tends to grow as a function of the number of edges or nodes?
For example, is it feasible to run a graph with 50 million edges (and ~5 million nodes) in the non-Hadoop implementation? As a CSV edge list this occupies roughly 3GB on disk, but I'm not sure how Junto represents it in memory. I'm just trying to get a sense of the order of magnitude of memory to expect.
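For context, this is the kind of back-of-envelope estimate I've been doing. The per-node and per-edge byte counts are pure guesses about typical JVM hash-map overhead, not anything taken from Junto's actual data structures, so please correct them if you know better:

```scala
// Rough heap estimate for the graph described above. The byte counts are
// hypothetical JVM overheads (boxed values, map entries, object headers),
// NOT Junto's actual in-memory representation.
object MemoryEstimate {
  def main(args: Array[String]): Unit = {
    val nodes = 5000000L   // ~5 million nodes
    val edges = 50000000L  // 50 million edges

    // Guessed costs: map entry + boxed weight per edge; id string,
    // map headers, and label-score storage per node.
    val bytesPerEdge = 64L
    val bytesPerNode = 200L

    val totalBytes = nodes * bytesPerNode + edges * bytesPerEdge
    println(f"~${totalBytes / math.pow(1024, 3)}%.1f GiB estimated heap")
  }
}
```

Under those (possibly wrong) assumptions it comes out to around 4 GiB, i.e. the same order of magnitude as the CSV on disk, but I have no idea how realistic the per-edge constant is for Junto.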
Thanks in advance.