Do you have an idea of the memory overhead per node/edge? Or whether it tends to grow as a function of the number of edges or nodes?
For example, is it feasible to run a graph with 50 million edges (and ~5 million nodes) in the non-Hadoop implementation? In a .csv edge format this occupies roughly 3GB, but I'm not sure how Junto is representing it in memory. I'm just trying to get an idea of the order of magnitude of memory to expect to need.
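For what it's worth, here is the rough back-of-envelope I've been using — all the per-edge and per-node byte costs are guesses on my part (typical JVM overhead for boxed map entries plus a double weight per edge, and a small label-score map per node), not anything I know about Junto's actual representation:

```python
def estimate_heap_gb(num_nodes, num_edges, bytes_per_edge=80, bytes_per_node=300):
    """Order-of-magnitude JVM heap estimate in GB.

    bytes_per_edge and bytes_per_node are assumptions: ~80 bytes/edge for a
    boxed adjacency-map entry plus a double weight, ~300 bytes/node for the
    node object and its label-score map. Neither number comes from Junto.
    """
    total_bytes = num_edges * bytes_per_edge + num_nodes * bytes_per_node
    return total_bytes / 1e9

# The graph in question: ~5M nodes, 50M edges.
print(estimate_heap_gb(5_000_000, 50_000_000))  # 5.5 (GB) under these guesses
```

If those per-item costs are anywhere near right, the graph would need a heap in the mid-single-digit GB range, i.e. roughly the same order as the 3GB CSV — but I'd appreciate a correction if Junto's in-memory overhead is substantially higher.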