How to detect if CP Subsystem is ready to work?

zeb...@gmail.com

Aug 12, 2019, 7:33:23 AM

to Hazelcast

Hi!

I have an application that embeds Hazelcast and uses the CP Subsystem. I am trying to write my health check so that the application reports UP only when it can actually do its work, and for that it must be able to acquire locks from Hazelcast. If I request a lock from the CP Subsystem while the cluster is not yet able to provide one, incoming requests time out, so the application is not truly healthy in that state.

Currently I try to acquire a lock on a separate thread, but sometimes I end up in a state where a lot of WrongTargetException messages are logged (as mentioned here: https://github.com/hazelcast/hazelcast/issues/3395), and it seems that the system never recovers from it.
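
For illustration, a separate-thread probe with a hard timeout could be sketched roughly as follows. This is not code from the application in question: the class name TimedProbe and the method succeedsWithin are hypothetical, and the Callable passed in is a stand-in for "acquire and release a CP lock". Only java.util.concurrent is used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a probe on its own thread and gives up after a timeout. */
final class TimedProbe implements AutoCloseable {

    // Daemon thread, so a stuck probe cannot keep the JVM alive on shutdown.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "health-probe");
        t.setDaemon(true);
        return t;
    });

    /** Returns true only if the probe returned TRUE within the timeout. */
    boolean succeedsWithin(Callable<Boolean> probe, long timeout, TimeUnit unit) {
        Future<Boolean> result = worker.submit(probe);
        try {
            return Boolean.TRUE.equals(result.get(timeout, unit));
        } catch (TimeoutException e) {
            // Interrupt the probe. A probe that ignores interruption would keep
            // occupying the single worker thread and delay later probes.
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The probe itself threw; treat that as "not ready".
            return false;
        } catch (InterruptedException e) {
            result.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        }
    }

    @Override
    public void close() {
        worker.shutdownNow();
    }

    public static void main(String[] args) {
        try (TimedProbe probe = new TimedProbe()) {
            // Stand-ins for a real lock attempt: one answers at once, one hangs.
            boolean fast = probe.succeedsWithin(() -> true, 200, TimeUnit.MILLISECONDS);
            boolean slow = probe.succeedsWithin(() -> {
                Thread.sleep(5_000);
                return true;
            }, 200, TimeUnit.MILLISECONDS);
            System.out.println(fast + " " + slow); // prints "true false"
        }
    }
}
```

A probe like this only bounds how long the health check waits. It does not by itself explain or fix a cluster that never becomes ready.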

So my question is: what is the best way to detect that the system is ready when using the CP Subsystem?

Regards:
Balázs

Ensar Basri Kahveci

Aug 15, 2019, 10:50:48 AM

to Hazelcast

Hi Balázs,

Normally, after you start your Hazelcast members, CP Subsystem initialization should complete within a couple of seconds. It should not take long once your cluster has enough members (that is, as many as the configured CP member count). When you fetch a data structure proxy from the CPSubsystem interface, the call internally waits until this initialization step is done.

There is also a method that lets you wait with a custom timeout: CPSubsystemManagementService#awaitUntilDiscoveryCompleted(long timeout, TimeUnit timeUnit)

It returns false if CP Subsystem initialization has not completed within the timeout you pass.
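
As a rough sketch, a health check could wrap that call as shown below. To keep the example runnable without Hazelcast on the classpath, the wait is hidden behind a hypothetical TimedWait interface whose shape matches awaitUntilDiscoveryCompleted(long, TimeUnit). The class name CpReadinessCheck and the Status enum are also made up for illustration. In a real application the lambda would presumably be replaced by a method reference such as hz.getCPSubsystem().getCPSubsystemManagementService()::awaitUntilDiscoveryCompleted, assuming a Hazelcast version that exposes this method.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical health check: UP only once a bounded readiness wait succeeds. */
final class CpReadinessCheck {

    /** Hypothetical adapter, shaped like awaitUntilDiscoveryCompleted(long, TimeUnit). */
    @FunctionalInterface
    interface TimedWait {
        boolean await(long timeout, TimeUnit unit) throws InterruptedException;
    }

    enum Status { UP, DOWN }

    private final TimedWait discovery;
    private final long timeoutMillis;

    CpReadinessCheck(TimedWait discovery, long timeoutMillis) {
        this.discovery = discovery;
        this.timeoutMillis = timeoutMillis;
    }

    /** Waits at most timeoutMillis; any failure or timeout is reported as DOWN. */
    Status check() {
        try {
            return discovery.await(timeoutMillis, TimeUnit.MILLISECONDS)
                    ? Status.UP
                    : Status.DOWN;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return Status.DOWN;
        } catch (RuntimeException e) {
            return Status.DOWN;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the real Hazelcast call (assumption, see the text above).
        CpReadinessCheck ready = new CpReadinessCheck((t, u) -> true, 500);
        CpReadinessCheck notReady = new CpReadinessCheck((t, u) -> false, 500);
        System.out.println(ready.check());    // prints "UP"
        System.out.println(notReady.check()); // prints "DOWN"
    }
}
```

A short timeout keeps the health endpoint responsive: the check simply reports DOWN until discovery completes instead of blocking the caller.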

Regards,

Balázs Zaicsek

Aug 16, 2019, 7:15:32 AM

to haze...@googlegroups.com, ebka...@gmail.com

Hi!

Thank you for your answer. Yes, this can happen very fast, but sometimes it takes much longer (for example, while waiting for instances to start), or it does not happen at all.

I have modified the Hazelcast-Eureka example to demonstrate my problem: https://github.com/zebalu/hazelcast-eureka-swarm-issue/ The problem does not occur every time, but sometimes the Hazelcast instances form separate clusters that never merge. Do you know what can keep clusters from merging?

Regards:
Balu
