Hazelcast 3.7 client failures during cluster rolling restart

ram kishore m

Dec 2, 2016, 10:05:47 AM
Hi All,

I'm using Hazelcast 3.7.1 in a client-server architecture, with a 3-node Hazelcast cluster and multiple client applications connecting to it.

I'm running into issues when I do a cluster rolling restart or a rolling deployment. There are no Hazelcast version changes involved, only application code changes, or sometimes just a cluster restart.

I do the following steps during a cluster restart:

Node1:

          * Shut down using HazelcastInstance.shutdown()
          * Do any upgrade steps/deploy new code
          * Start the Hazelcast instance
          * Check if the node is safe using hazelcastInstance.getPartitionService().isLocalMemberSafe()
          * Check if the cluster is safe using hazelcastInstance.getPartitionService().isClusterSafe()
          * Move to the next node only when the two checks above return true.

Node2:

          * Repeat the steps from Node1

Node3:

          * Repeat the same steps again.
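
For readers following along, the "wait until both checks return true" step above can be sketched as a small polling helper. This is only a sketch under assumptions: the class name SafeStateWaiter, the method waitUntil, and the timeout and poll values are hypothetical and not part of Hazelcast. The helper takes the check as a plain BooleanSupplier so that it does not depend on any Hazelcast class.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of Hazelcast): polls a condition until it
// holds or a timeout elapses.
final class SafeStateWaiter {

    // Returns true as soon as `check` returns true, or false if
    // `timeoutMillis` elapses first or the waiting thread is interrupted.
    static boolean waitUntil(BooleanSupplier check, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

On a member it might be called as follows, where ps is the result of hazelcastInstance.getPartitionService() and the 5-minute timeout and 1-second poll interval are arbitrary values: SafeStateWaiter.waitUntil(() -> ps.isLocalMemberSafe() && ps.isClusterSafe(), 300_000, 1_000). The restart would move on to the next node only if this returns true.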

During this deployment or rolling restart, clients receive the exception below, and write operations in particular fail. Can somebody suggest a better approach, or is there anything wrong with what I am doing? I would sincerely appreciate any help, as this is a high-priority issue. Thank you for helping.

Caused by: com.hazelcast.core.HazelcastInstanceNotActiveException: State: PASSIVE Operation: class com.hazelcast.map.impl.operation.PutOperation
        at com.hazelcast.spi.impl.operationservice.impl.Invocation.engineActive(Invocation.java:302)
        at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvoke(Invocation.java:249)
        at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke0(Invocation.java:232)
        at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke(Invocation.java:207)
        at com.hazelcast.spi.impl.operationservice.impl.InvocationBuilderImpl.invoke(InvocationBuilderImpl.java:59)
        at com.hazelcast.client.impl.protocol.task.AbstractPartitionMessageTask.processMessage(AbstractPartitionMessageTask.java:64)
        at com.hazelcast.client.impl.protocol.task.AbstractMessageTask.initializeAndProcessMessage(AbstractMessageTask.java:119)
        at com.hazelcast.client.impl.protocol.task.AbstractMessageTask.run(AbstractMessageTask.java:99)
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:137)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:127)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.run(OperationThread.java:102)
        at ------ submitted from ------.(Unknown Source)
        at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolve(InvocationFuture.java:111)
        at com.hazelcast.spi.impl.AbstractInvocationFuture$1.run(AbstractInvocationFuture.java:246)
        at com.hazelcast.client.impl.protocol.task.AbstractPartitionMessageTask.execute(AbstractPartitionMessageTask.java:78)
        at com.hazelcast.spi.impl.AbstractInvocationFuture.unblock(AbstractInvocationFuture.java:242)
        at com.hazelcast.spi.impl.AbstractInvocationFuture.andThen(AbstractInvocationFuture.java:218)
        at com.hazelcast.client.impl.protocol.task.AbstractPartitionMessageTask.processMessage(AbstractPartitionMessageTask.java:69)
        at com.hazelcast.client.impl.protocol.task.AbstractMessageTask.initializeAndProcessMessage(AbstractMessageTask.java:119)
        at com.hazelcast.client.impl.protocol.task.AbstractMessageTask.run(AbstractMessageTask.java:99)
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:137)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:127)
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.run(OperationThread.java:102)
        at ------ submitted from ------.(Unknown Source)
        at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveAndThrow(ClientInvocationFuture.java:74)
        at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveAndThrow(ClientInvocationFuture.java:30)
        at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:158)
        at com.hazelcast.client.spi.ClientProxy.invokeOnPartition(ClientProxy.java:153)
        at com.hazelcast.client.spi.ClientProxy.invoke(ClientProxy.java:147)
        at com.hazelcast.client.proxy.ClientMapProxy.putInternal(ClientMapProxy.java:457)
        at com.hazelcast.client.proxy.ClientMapProxy.put(ClientMapProxy.java:451)
        at com.hazelcast.client.proxy.ClientMapProxy.put(ClientMapProxy.java:253)
        at



ram kishore m

Dec 12, 2016, 1:25:11 PM
Does anybody have any ideas about this? I would sincerely appreciate any help.

M. Sancar Koyunlu

Dec 13, 2016, 4:14:02 AM
Hi,
Seeing this exception is normal when a node is shutting down or just starting: the node is either not yet ready to accept operations or is closing down.
However, the client should retry the operation silently, without surfacing the exception to you. If you are seeing the exception, it means the retries also timed out. This timeout period is configurable on the client side via the following property:

    HazelcastProperty INVOCATION_TIMEOUT_SECONDS
            = new HazelcastProperty("hazelcast.client.invocation.timeout.seconds", 120, SECONDS);

You can try increasing this timeout. If the client does not honour the timeout and throws the exception before 2 minutes have passed, that could indicate something else is wrong.

Also, there is no way to ask questions like .isLocalMemberSafe() or .isClusterSafe() from the client.
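
As a rough illustration of raising the timeout mentioned above, the sketch below sets the property as a JVM system property before the client is created. The property name comes from this reply; the class name ClientInvocationTimeout, the method configure, and the value 300 are hypothetical. It also assumes that the client falls back to JVM system properties for settings that are not set in its config, which should be verified for your version.

```java
// Hypothetical helper: sets the client invocation timeout as a JVM system
// property. It must run before the Hazelcast client is created.
final class ClientInvocationTimeout {

    // Property name as quoted in the reply above.
    static final String PROPERTY = "hazelcast.client.invocation.timeout.seconds";

    // Stores `seconds` under PROPERTY and returns the stored string value.
    static String configure(int seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("timeout must be positive: " + seconds);
        }
        String value = Integer.toString(seconds);
        System.setProperty(PROPERTY, value);
        // Alternative, with the Hazelcast client on the classpath:
        //   ClientConfig config = new ClientConfig();
        //   config.setProperty(PROPERTY, value);
        //   HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
        return value;
    }
}
```

For example, calling ClientInvocationTimeout.configure(300) at application start-up would ask the client to keep retrying for up to 5 minutes. A value that comfortably covers the time one node takes to restart and become safe is probably a sensible starting point.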


--
Sancar Koyunlu
Software Engineer, Hazelcast