Hello. I am new to every part of a Cassandra, JanusGraph, and gremlin_python setup, and my Java skills are weak. I have a basic setup working via gremlin_python (all downloaded via JanusGraph), but I don't understand how to use gremlin_python to save a new graph to Cassandra, or how to access a JanusGraph graph that is already saved in Cassandra. All the tutorials seem to use in-memory TinkerGraphs for illustration. I am trying to learn the Cassandra, JanusGraph, and gremlin_python stack, and I am running the scripts only on my laptop. My setup is JanusGraph 0.1.1 with tinkergraph-gremlin-3.2.3 and gremlin-server-3.2.3.
My gremlin-server.yaml file contains:
host: localhost
port: 8182
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/gremlin-server/janusgraph-cassandra-es-server.properties}
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  gremlin-python: {},
  gremlin-jython: {}}
My janusgraph-cassandra-es-server.properties file contains:
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
The following Python code works to access the in-memory example graph:
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
a = g.V().both()[1:3].toList()
So far so good. But using gremlin_python only, how do I create a new graph and save it to Cassandra, giving it a name for later access? How do I query a JanusGraph graph that has been saved in Cassandra, make changes to it, and then save it back to Cassandra? Finally, how can I see what has been saved in Cassandra? The JanusGraph package does not include a bin/cqlsh to open a CQL shell. Is there some other way to manage Cassandra while Gremlin Server and Cassandra are both running?
If I want to move a large set of data from a relational database into JanusGraph form, should I use the Python client for JanusGraph rather than gremlin_python? For example, I could import gremlin_python and a relational database connector, write SQL queries in Python, and then send the results to the Cassandra store to build the graph version of the data. I'm not sure what the best practices are for using the different Python tools for Cassandra, JanusGraph, and TinkerPop. Any thoughts on this would be appreciated.
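To make the idea concrete, this is roughly what I have in mind. It is only a sketch and I don't know whether it is a sensible approach: sqlite3 stands in for whatever relational connector I would really use, the `people` table and the function names `fetch_people`, `load_people`, and `connect_remote` are made up for illustration, and I am assuming that each traversal sent over the remote connection gets written to the Cassandra-backed graph when it is iterated.

```python
import sqlite3


def fetch_people(conn):
    """Read (id, name) rows from a hypothetical `people` table."""
    return conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()


def load_people(g, rows):
    """Add one 'person' vertex per row through the traversal source `g`."""
    count = 0
    for row_id, name in rows:
        # Assumption: calling next() submits the traversal to the server,
        # which then writes the new vertex to the configured backend.
        g.addV('person').property('rdb_id', row_id).property('name', name).next()
        count += 1
    return count


def connect_remote(url='ws://localhost:8182/gremlin', source='g'):
    """Build a remote traversal source the same way as my working snippet."""
    # Imported here so the SQL half can be tried without gremlin_python.
    from gremlin_python.structure.graph import Graph
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    return Graph().traversal().withRemote(DriverRemoteConnection(url, source))
```

I would then call something like load_people(connect_remote(), fetch_people(conn)) with conn opened by the relational connector. Sending one vertex per request is probably too slow for a large data set, which is part of why I am asking about best practices.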
I've spent quite a while searching the web for answers to these questions, with no luck so far. Tutorials and explanations seem to be in short supply, apart from ones showing the simple Python code that I've included above.
John