val g = graph.traversal().withComputer(classOf[SparkGraphComputer])
println(g.V().count().next())
However, JanusGraphBlueprintsGraph does an explicit check (line 129) that only allows the FulgoraGraphComputer to be used this way.
withComputer also has an overload that takes a Computer instance (i.e. an object, not a Class), but instantiating a SparkGraphComputer requires a HadoopGraph, not a JanusGraph.
Once I get this up and running I'll submit a full code example for others to use in future.
Thanks,
Rob
// Register the running SparkContext with TinkerPop's Spark helper, then open a
// HadoopGraph (gremlin.graph is set to HadoopGraph in getSparkConfig below),
// which accepts the SparkGraphComputer.
val gremlinSpark = Spark.create(spark.sparkContext)
val sparkComputerConnection = GraphFactory.open(getSparkConfig)
val traversalSource = sparkComputerConnection.traversal().withComputer(classOf[SparkGraphComputer])
println(traversalSource.V().count().next())
sparkComputerConnection.close()
def getSparkConfig: BaseConfiguration = {
  val conf = new BaseConfiguration()
  conf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph")
  conf.setProperty("gremlin.hadoop.graphInputFormat", "org.janusgraph.hadoop.formats.cassandra.CassandraInputFormat")
  // ####################################
  // # Cassandra Cluster Config         #
  // ####################################
  conf.setProperty("janusgraphmr.ioformat.conf.storage.backend", "cassandrathrift")
  conf.setProperty("storage.backend", "cassandra")
  conf.setProperty("storage.hostname", "127.0.0.1")
  conf.setProperty("cassandra.input.partitioner.class", "org.apache.cassandra.dht.Murmur3Partitioner")
  // ####################################
  // # Spark Configuration              #
  // ####################################
  conf.setProperty("spark.master", "local[4]")
  conf.setProperty("gremlin.spark.persistContext", "true")
  conf.setProperty("spark.serializer", "org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer")
  conf
}
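For completeness, these are the imports the snippets above rely on (a sketch; package paths as of TinkerPop 3.2.x):

import org.apache.commons.configuration.BaseConfiguration
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
import org.apache.tinkerpop.gremlin.spark.structure.Spark
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory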
Another option is the org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat class together with an output location: this just writes the query output to HDFS and at least keeps you going.

Did you try the graph.configuration().setProperty('gremlin.spark.persistContext', true) part of the reference section you linked to? Did you also try the option with the PersistedOutputRdd from the Gremlin Console, or only from your Scala program?

val computerResult = SparkGraphComputerFactory.computeResult("g.V().hasLabel('startingVertex').as('start').out('compares').as('end').select('end')", "gremlin-groovy", "cassandraSpark", "TestGraph")
def computeResult(gremlinQuery: String, gremlinLanguage: String, computerConnectionName: String, outputName: String): RDD[(Long, VertexWritable)] = {
  val computerConnection = JanusGraphConnectionFactory.getConnection(computerConnectionName)
  val graphComputer = getComputer(computerConnection, outputName)
  // Run the query as a TraversalVertexProgram and block until the Spark job completes.
  graphComputer.program(TraversalVertexProgram.build()
      .traversal(
        computerConnection.traversal().withComputer(graphComputer.getClass),
        gremlinLanguage,
        gremlinQuery)
      .create(computerConnection))
    .submit()
    .get()
  // Pick up the persisted result graph from the Spark context.
  Spark.getRDD(outputName + "/~g").asInstanceOf[RDD[(Long, VertexWritable)]]
}

def getComputer(connection: Graph, outputName: String): GraphComputer = {
  connection.configuration().addProperty("gremlin.hadoop.outputLocation", outputName)
  connection.compute(classOf[SparkGraphComputer])
    .result(GraphComputer.ResultGraph.NEW)
    .persist(GraphComputer.Persist.VERTEX_PROPERTIES)
}
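Note that Spark.getRDD only finds the result if the job actually kept the output graph in the Spark context. A minimal sketch of the extra properties this assumes in the graph configuration (key names follow the TinkerPop 3.2.x style used above; newer versions use gremlin.hadoop.graphWriter instead):

conf.setProperty("gremlin.hadoop.graphOutputFormat", "org.apache.tinkerpop.gremlin.spark.structure.io.PersistedOutputRDD")
conf.setProperty("gremlin.spark.persistContext", "true")

Without these, Spark.getRDD fails with the "named RDD does not exist" error quoted below.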
The named RDD does not exist in the Spark context
Can you please share the read-cassandra-3.properties file?
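For reference, a read-cassandra-3.properties in the spirit of the getSparkConfig above would look roughly like this (a sketch only; the hostname, partitioner, and Spark settings are assumptions that need adjusting for the actual cluster):

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphInputFormat=org.janusgraph.hadoop.formats.cassandra.CassandraInputFormat
gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
gremlin.hadoop.inputLocation=none
gremlin.hadoop.outputLocation=output
# Cassandra cluster
janusgraphmr.ioformat.conf.storage.backend=cassandrathrift
janusgraphmr.ioformat.conf.storage.hostname=127.0.0.1
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
# Spark
spark.master=local[4]
spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer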