Hi,
I have an existing graph analysis program written in Spark, and I would like to load the resulting graphs into JanusGraph via a BulkLoaderVertexProgram. I'm struggling to find a nice way to do this!
As a test, I loaded the example Grateful Dead graph into JanusGraph from Spark. This went fine: I used loadGraphML to create a TinkerGraph from the test data, and then ran:
val computer:GraphComputer = SparkGraphComputer
val blvp = BulkLoaderVertexProgram.build().writeGraph("conf/testConfig.properties").create(graph)
graph.compute(computer.getClass).program(blvp).submit().get()
to push it to JanusGraph.
The problem is that my Spark program's graph data is currently stored in two RDDs, one of edges and one of vertices. I think I need to build a TinkerGraph from these RDDs first, but the only way I can see to do that is to iterate over the vertices and add them one at a time, which is very slow.
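For illustration, here is a minimal sketch of that one-at-a-time approach. It assumes the vertices RDD holds (id, label) pairs and the edges RDD holds (source id, target id, label) triples; these element shapes and the name toTinkerGraph are hypothetical stand-ins, not the real program's types.

```scala
import org.apache.spark.rdd.RDD
import org.apache.tinkerpop.gremlin.structure.{T, Vertex}
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
import scala.collection.mutable

// Hypothetical element shapes: vertices are (id, label),
// edges are (sourceId, targetId, label).
def toTinkerGraph(vertices: RDD[(Long, String)],
                  edges: RDD[(Long, Long, String)]): TinkerGraph = {
  val graph = TinkerGraph.open()
  val byId = mutable.Map.empty[Long, Vertex]

  // collect() pulls the whole RDD onto the driver, so every vertex is
  // added serially in a single JVM. This is the slow part.
  vertices.collect().foreach { case (id, label) =>
    byId(id) = graph.addVertex(T.id, Long.box(id), T.label, label)
  }

  // Edges need both endpoints to exist already; a missing endpoint
  // makes byId(...) throw NoSuchElementException.
  edges.collect().foreach { case (src, dst, label) =>
    byId(src).addEdge(label, byId(dst))
  }
  graph
}
```

Because everything is collected to the driver, the graph must fit in one machine's memory and none of the work is distributed.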
The only other option I can see is to dump the data to HDFS as CSV files and load it from there. I'd like to avoid that if possible, as I want to move to streaming input data at some point in the future.
Any help much appreciated,
Rob