Index missing after backup and restore of Cassandra tables under keyspace janusgraph

vincent....@gmail.com

Mar 13, 2018, 10:02:03 AM
to JanusGraph users
We are using JanusGraph 0.2 with Cassandra and Elasticsearch (Elasticsearch is enabled in the YAML config file but not used).

We are trying to move data from one environment to another. 
We used Cassandra's nodetool snapshot to back up the Cassandra node (which holds 100% of the data; it is a simple cluster for the time being). We only backup all the tables under the keyspace cassandra.

On a new cluster, with no data, we run sstableloader for each snapshot.
No issue so far.
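
For readers following along, the snapshot-and-load steps described above can be sketched roughly as below. This is only an illustration that assembles the command lines, not the exact procedure used in this thread. The snapshot tag, staging directory, target host and table list are hypothetical placeholders. It also assumes the snapshot files have already been copied into a `<staging>/<keyspace>/<table>` layout, because sstableloader takes the keyspace and table names from the last two components of the directory it is given.

```python
import shlex
from pathlib import PurePosixPath


def snapshot_command(keyspace, tag):
    """Build the nodetool command that snapshots every table in one keyspace."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def loader_commands(staging_root, keyspace, tables, target_host):
    """Build one sstableloader command per table.

    Assumes each table's snapshot files were copied beforehand to
    <staging_root>/<keyspace>/<table> (a hypothetical staging layout).
    """
    commands = []
    for table in tables:
        table_dir = PurePosixPath(staging_root) / keyspace / table
        commands.append(["sstableloader", "-d", target_host, str(table_dir)])
    return commands


if __name__ == "__main__":
    # Illustrative values only; the table list here is not exhaustive.
    keyspace = "janusgraph"
    tables = ["edgestore", "graphindex"]

    print(shlex.join(snapshot_command(keyspace, "env-move")))
    for cmd in loader_commands("/tmp/restore", keyspace, tables, "10.0.0.1"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them makes it easy to check that every table under the keyspace, including graphindex, is covered before anything is loaded into the new cluster.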

The JanusGraph cache is disabled to keep things simple.

We can do a count() and get the same result on the source environment and on the one we restored the data into.
We can query a specific element.
However we cannot do a query with has(). It throws errors.

We thought the Janusgraph index was stored inside a Cassandra table and that if we restored all the tables under the janusgraph keyspace, we would be fine.
What are we missing here?

Jason Plurad

Mar 13, 2018, 10:20:43 AM
to JanusGraph users
Did you verify through the schema management whether the composite index exists? I'd check that it is there first, and if it is, perhaps you need to run a reindex.

What errors are thrown when you run a query with the has() step?

vincent....@gmail.com

Mar 13, 2018, 6:27:13 PM
to JanusGraph users
The standard 'g.V().count()' query works, but a more complex query, e.g. "g.V().has('category', 'TEST').count()", returns a NullPointerException:

"stackTrace": "java.lang.NullPointerException\n\tat org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.constructIndexCover(GraphCentricQueryBuilder.java:368)\n\tat org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.constructIndexCover(GraphCentricQueryBuilder.java:375)\n\tat org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.indexCover(GraphCentricQueryBuilder.java:354)\n\tat org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.constructQueryWithoutProfile(GraphCentricQueryBuilder.java:274)\n\tat org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.constructQuery(GraphCentricQueryBuilder.java:201)\n\tat org.janusgraph.graphdb.query.graph.GraphCentricQueryBuilder.vertices(GraphCentricQueryBuilder.java:164)\n\tat org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphStep.lambda$new$0(JanusGraphStep.java:69)\n\tat org.apache.tinkerpop.gremlin.process.traversal.step.map.GraphStep.processNextStart(GraphStep.java:139)\n\tat org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)\n\tat org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)\n\tat org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)\n\tat org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)\n\tat org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)\n\tat org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)\n\tat org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.fill(IteratorUtils.java:62)\n\tat org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.list(IteratorUtils.java:85)\n\tat org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils.asList(IteratorUtils.java:382)\n\tat 
org.apache.tinkerpop.gremlin.server.handler.HttpGremlinEndpointHandler.lambda$channelRead$1(HttpGremlinEndpointHandler.java:241)\n\tat org.apache.tinkerpop.gremlin.util.function.FunctionUtils.lambda$wrapFunction$0(FunctionUtils.java:36)\n\tat org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:296)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"

vincent....@gmail.com

Mar 13, 2018, 6:38:44 PM
to JanusGraph users
Typo in my first message: "We only backup all the tables under the keyspace cassandra"
=> I meant: we only back up all the tables under the keyspace janusgraph.

Can you confirm this is the only keyspace we need to back up and restore? I made the assumption that everything is contained under the janusgraph keyspace, but I now have my doubts.

We will look at the schema tool in the meantime.

vincent....@gmail.com

Apr 12, 2018, 6:55:37 AM
to JanusGraph users
You were right, the index was missing. Isn't the index stored in Cassandra tables, though?

If we restore all the Cassandra tables, the index should be restored with them, shouldn't it?

If not, where does JanusGraph store the index data?

Reindexing the indexes one by one is a very slow process. It is becoming slower to reindex than to just rerun a full data load.

vincent....@gmail.com

Apr 12, 2018, 7:10:22 AM
to JanusGraph users
Note that we are backing up the janusgraph/graphindex table, so it is included in our sstableloader run. Something is not right, it seems.

vincent....@gmail.com

Apr 13, 2018, 12:06:27 AM
to JanusGraph users
After further testing, it seems that re-indexing is no longer necessary. It might have been a glitch. The graphindex table seems to be storing the simple indexes.