Indexes staying in INSTALLED

Cody Martin

Mar 26, 2020, 1:22:21 PM
to JanusGraph users
Hi everyone,

We are using JanusGraph for a big-data graph solution on one of our projects, with HBase as our storage backend and SOLR as our indexing backend. We are having some trouble with our mixed/composite indexes getting stuck in INSTALLED.

Currently we stand up two servers, both with an embedded graph instance. When one of the servers (let's call it server A) stands up, it creates some property keys and indexes. We also use a remote connection from one of the servers to a JanusGraph server. So in all we have 3 graph instances.

When we do this in a dev environment, the JG server, HBase, and SOLR all run on a single VM, and we run our two servers on the VM's host machine. When server A starts up in our dev environment, it initializes just fine, creating the property keys and then the indexes.

When we do this in a deployed environment, we keep all of the services on separate machines: one machine for each of our two servers, a separate machine for the JG server, a separate cluster of machines for HBase, and a separate cluster of machines for SOLR. When we try to initialize server A in this environment, we cannot get our indexes to leave the INSTALLED state.

I have read that for an index to become REGISTERED, all graph instances have to acknowledge the addition of the index, and no transactions can be open on the graph. Currently, when we create these instances, we do a graph.tx().rollback(), but that is it. I have seen suggestions to close all other graph instances when creating an index, but I don't want to do that because we need those other graph instances later on. Also, in the deployed environment, I have checked that graph.getOpenTransactions() returns nothing and that mgmt.getOpenInstances() returns only the 3 instances I am expecting.
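For reference, here is roughly how I run those checks from the embedded graph (a minimal sketch; getOpenTransactions and getOpenInstances are the calls mentioned above):

    // Transactions still open against this instance (should be empty).
    System.out.println(graph.getOpenTransactions());

    // Instances registered in the cluster; the local one is suffixed "(current)".
    final JanusGraphManagement mgmt = graph.openManagement();
    System.out.println(mgmt.getOpenInstances());
    mgmt.rollback();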

I guess my question here is: what exactly has to happen for the index to get REGISTERED with all graph instances? How can I further debug this? Am I missing something?

Please let me know if you need more info; I am happy to have a discussion about this.

-Cody

Oleksandr Porunov

Mar 26, 2020, 9:20:01 PM
to JanusGraph users
Hi Cody,

How do you check that your indexes are not changing their status to REGISTERED? Do you see a timeout error, or what happens while you are waiting for the index statuses to change?
The most common mistake I see is that users don't wait for indexes to be registered, and even when they do wait, they don't ENABLE or REINDEX (which automatically enables) them after that.
Did you check the index lifecycle documentation? https://docs.janusgraph.org/index-management/index-lifecycle/

Basically what you need to do is (sketched in code below):
1) create index and commit it - INSTALLED
2) await for it to be propagated to other instances - REGISTERED
3) enable the index by calling ENABLE_INDEX or REINDEX - ENABLED
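A minimal sketch of those three steps in Java (the index and property key names here are placeholders, not from your setup):

    import org.apache.tinkerpop.gremlin.structure.Vertex;
    import org.janusgraph.core.PropertyKey;
    import org.janusgraph.core.schema.JanusGraphManagement;
    import org.janusgraph.core.schema.SchemaAction;
    import org.janusgraph.core.schema.SchemaStatus;
    import org.janusgraph.graphdb.database.management.ManagementSystem;

    // 1) Create the index and commit - the index starts out INSTALLED.
    JanusGraphManagement mgmt = graph.openManagement();
    final PropertyKey name = mgmt.makePropertyKey("name").dataType(String.class).make();
    mgmt.buildIndex("byName", Vertex.class).addKey(name).buildCompositeIndex();
    mgmt.commit();

    // 2) Await acknowledgement from every open instance - REGISTERED.
    ManagementSystem.awaitGraphIndexStatus(graph, "byName")
            .status(SchemaStatus.REGISTERED).call();

    // 3) Enable the index (REINDEX also enables it, and additionally
    //    backfills pre-existing data) - ENABLED.
    mgmt = graph.openManagement();
    mgmt.updateIndex(mgmt.getGraphIndex("byName"), SchemaAction.ENABLE_INDEX).get();
    mgmt.commit();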

Sincerely,
Oleksandr

Cody Martin

Mar 31, 2020, 3:54:15 PM
to JanusGraph users
Hi Oleksandr,

Thanks for the reply!

We do actually do the 3 steps you mentioned. We create the index, wait for all keys to be REGISTERED or ENABLED, then ENABLE them ourselves, which takes care of the ones that are only REGISTERED.

On that wait for all keys to be REGISTERED or ENABLED, we see it keep waiting for a certain amount of time and then just time out. Here is our waiting/enabling code:

public void awaitIndexToEnabled(final String indexName) {
    try {
        // Wait until the index is at least REGISTERED everywhere (or already ENABLED).
        ManagementSystem.awaitGraphIndexStatus(graph_, indexName)
                .status(SchemaStatus.ENABLED, SchemaStatus.REGISTERED).call();

        // Then explicitly enable it.
        final JanusGraphManagement mgmt = graph_.openManagement();
        final JanusGraphManagement.IndexJobFuture indexFuture =
                mgmt.updateIndex(mgmt.getGraphIndex(indexName), SchemaAction.ENABLE_INDEX);

        if (indexFuture != null) {
            indexFuture.get();
        }

        mgmt.commit();

        // Finally, wait for ENABLED to propagate.
        ManagementSystem.awaitGraphIndexStatus(graph_, indexName)
                .status(SchemaStatus.ENABLED).call();
    }
    catch (final InterruptedException | ExecutionException e) {
        logger_.error("Thread interrupted awaiting index " + indexName, e);
    }
}

Cody Martin

Apr 4, 2020, 10:17:49 PM
to JanusGraph users
I have simplified the issue.

We have a server that has an embedded graph instance from which we create indexes. When the JG server is running and connected to our HBase/SOLR backend, the indexes get stuck in INSTALLED. If I stop the JG server, the indexes move to REGISTERED. Why can't I create indexes from my server while there is another graph instance? Shouldn't the index get registered with the JG server's graph instance at some point? I don't want to force-close the graph instance; I should be able to keep it open while I create indexes, shouldn't I?

-Cody

Oleksandr Porunov

Apr 5, 2020, 3:31:36 AM
to JanusGraph users
Hi Cody,

The situation you described looks like a bug. I didn't face such problems with Cassandra + Elasticsearch. If several instances are open, they should all be notified of the index. Do you use JanusGraph 0.5.0? How long do you wait?

Best regards,
Oleksandr

Cody Martin

Apr 5, 2020, 6:51:39 PM
to janusgra...@googlegroups.com
Hi Oleksandr,

We are on JanusGraph 0.4.0 at the moment. We essentially wait on that ENABLED/REGISTERED call and it times out after a couple of minutes.

I have noticed that I can hit that wait and watch our server's logs keep saying that not all keys are ENABLED or REGISTERED, then shut down the JG server, and the index will immediately go to REGISTERED. But there are no transactions open or anything, so I don't know why the JG server is holding things up.

Could you explain a little more what happens when I create an index? I've looked at the code a bit, but I haven't found exactly where the index gets set to REGISTERED. Conceptually, what happens when I create the index? How does JanusGraph know when all graph instances have acknowledged a new index, and where in the code does it decide to actually set it to REGISTERED? If I can understand a little more about this, I may be able to debug my situation better. This could be a bug, but I could also be misusing the indexes.

Thanks!
-Cody

Ben Wuest

Apr 6, 2020, 8:53:29 AM
to janusgra...@googlegroups.com

Hi Cody,

I run into this all the time. You essentially have to close all transactions AND all open instances. Technically, you just need to close what I like to call stranded instances (instances from other dead processes that will never ACK the index commands), but from what I can tell there is no way of separating them out. Once that is done, registering and enabling work fine. It is a real PITA when dealing with index upgrades.
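Roughly, the cleanup I run looks like this (a sketch; getOpenInstances and forceCloseInstance are standard JanusGraphManagement methods, and the listing flags the local instance with a "(current)" suffix):

    import org.janusgraph.core.schema.JanusGraphManagement;

    final JanusGraphManagement mgmt = graph.openManagement();
    for (final String instanceId : mgmt.getOpenInstances()) {
        // Never close the instance we are running in.
        if (!instanceId.contains("(current)")) {
            mgmt.forceCloseInstance(instanceId);
        }
    }
    mgmt.commit();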

I would love to know if there is a better way.

Ben.

Cody Martin

Apr 6, 2020, 10:58:09 AM
to JanusGraph users
Hi Ben,

Unfortunately I don't think stale instances are the issue. When I look at mgmt.getOpenInstances() from the Gremlin Console, I see only the JG server's graph instance and my server's graph instance, which looks correct to me. Yet the minute I kill the JG server (and therefore its graph instance), the indexes start registering just fine. So it seems to be a failure of the JG server to acknowledge my index.

-Cody

Cody Martin

Apr 7, 2020, 12:16:36 PM
to JanusGraph users
I did a little digging and can see where the messages are being sent from graph instance to graph instance.

The instances do not talk to each other directly; instead, they store messages in the backend, and each graph instance pulls those messages via a polling mechanism.

In the case where my indexes are not working, my server creates the index and then sends a message to the backend. I can see the JanusGraph server receiving this message, parsing it, and then sending a message of its own that looks like an acknowledgement. Unfortunately, my server never receives this acknowledgement. Any idea why this might be?
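To make the pattern concrete, here is a self-contained sketch of that store-and-poll acknowledgement loop. These types are hypothetical stand-ins for illustration only, not JanusGraph's actual classes:

    import java.time.Instant;
    import java.util.List;

    // Hypothetical stand-in for the backend message log (NOT a JanusGraph type).
    interface MessageLog {
        List<String> messagesSince(Instant timestamp); // pull messages written after 'timestamp'
        void append(String message);                   // write a message, e.g. an ACK
    }

    class EvictionPoller {
        private final MessageLog log;
        private Instant lastRead = Instant.EPOCH;

        EvictionPoller(final MessageLog log) {
            this.log = log;
        }

        void pollOnce() {
            for (final String msg : log.messagesSince(lastRead)) {
                if (msg.startsWith("CACHED_TYPE_EVICTION")) {
                    // Evict the schema element from the local cache, then ACK
                    // so the index can move from INSTALLED to REGISTERED.
                    log.append("CACHED_TYPE_EVICTION_ACK:" + msg);
                }
            }
            // The read horizon advances by local wall-clock time. If the
            // reader's clock disagrees with the writer's, messages can fall
            // outside the window and never be seen.
            lastRead = Instant.now();
        }
    }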

-Cody

Haroon Qureshi

Apr 13, 2020, 5:08:49 PM
to JanusGraph users
Is there a way to get the indexes registered without having to shut down the instance? That isn't an option at the moment ...

thanks,
Haroon

Haroon Qureshi

Apr 13, 2020, 5:41:12 PM
to JanusGraph users
I shut down our JG Kubernetes cluster and restarted it. I'm still seeing my indexes in an INSTALLED state. Any ideas on how I can move forward from here?

Thanks,
Haroon

Cody Martin

Apr 23, 2020, 11:12:28 AM
to JanusGraph users
Hi Haroon,

I am having a similar issue. I don't want to shut down all my graph instances to update the indexes, and as far as I know, conceptually you should be able to keep them open.

I have looked through the logs and can see my server sending a message to the other graph instances. It seems like the instances just fail to pick this message up when their MessagePuller polls.

When things are working I see the message: Sent CACHED_TYPE_EVICTION_ACK: evictionID=<someID> originID=<originalInstanceID>

Am I correct in thinking that this message says the graph instance is responding to, and acknowledging, the index created by 'originInstanceID'?

If so, like I said, when my indexes hang in INSTALLED, I see that originInstanceID sends the message out, but I never see the above ACK on the other servers, which indicates that they are not receiving the index message.

-Cody

Cody Martin

Apr 23, 2020, 4:50:57 PM
to JanusGraph users
I did more digging and have finally solved the issue.

It seems my NTP server was not correctly syncing all of my servers. Note that it is really important that all of your graph instances run on time-synced machines (via an NTP server). One of my servers was off by about 6 seconds, and that was causing the indexes to hang. I wish the docs outlined this a little more, since the mechanism seems to need fairly accurate time synchronization in order to work.
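If it helps anyone, here is a quick way to check a machine's clock offset from Java using Apache Commons Net (an assumption on my part; any NTP tool such as ntpq -p or chronyc tracking works just as well). Every instance should report an offset near zero:

    import java.net.InetAddress;
    import org.apache.commons.net.ntp.NTPUDPClient;
    import org.apache.commons.net.ntp.TimeInfo;

    public class ClockSkewCheck {
        public static void main(final String[] args) throws Exception {
            final NTPUDPClient client = new NTPUDPClient();
            client.setDefaultTimeout(5000);
            // "pool.ntp.org" is just an example reference server.
            final TimeInfo info = client.getTime(InetAddress.getByName("pool.ntp.org"));
            info.computeDetails(); // fills in the computed offset and delay
            System.out.println("Clock offset (ms): " + info.getOffset());
            client.close();
        }
    }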

-Cody

sparshneel chanchlani

Jun 23, 2020, 8:35:09 AM
to JanusGraph users
Cody,
I have a single instance running and I am still getting the same issue.

Any suggestions?

-Sparshneel