InitializeDistributedObjectOperation rejected before execution, operation-call-timeout (Hz 3.7)

364 views
Skip to first unread message

Jean Luc

unread,
Oct 6, 2016, 8:37:33 PM10/6/16
to haze...@googlegroups.com
Hello,

We had a failure in production (stacktrace below) during a getMap() call. Googling for it returns only a handful of posts. After checking the Hazelcast source code, the exception seems associated with exceeding the timeout specified in hazelcast.operation.call.timeout.millis. A few questions:

  • is it correct to interpret it as being equivalent with a read timeout exception, i.e. a connection has been established but for some reason the application data (ie. Hazelcast's) has not flown after a while? Or is there a completely different but correct explanation that does not involve the network, that somehow the remote call was not even placed?

  • would a network outage lead to the above or would it manifest differently? Is it known what else could cause it (the other JVM being starved of threads, for instance) ?

  • for idempotent Hazelcast calls like this (getMap()), is there a way to configure automatic retries? We already have retries for connection establishment: <hz:network connection-attempt-limit="3" connection-attempt-period="3000" connection-timeout="1000" redo-operation="true" smart-routing="true"> but if I read the stacktrace correctly, our problem occurs after the connection has been established.


Many thanks for any help,

JL

com.hazelcast.core.OperationTimeoutException: InitializeDistributedObjectOperation got rejected before execution due to not starting within the operation-call-timeout of: 60000 ms. Current time: 2016-10-05 02:45:26.989. Total elapsed time: 39 ms. Invocation{op=com.hazelcast.spi.impl.proxyservice.impl.operations.InitializeDistributedObjectOperation{serviceName='hz:impl:mapService', identityHash=1515643499, partitionId=-1, replicaIndex=0, callId=0, invocationTime=1475649558657 (2016-10-05 02:39:18.657), waitTimeout=-1, callTimeout=60000}, tryCount=1, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1475649926950, firstInvocationTime='2016-10-05 02:45:26.950', lastHeartbeatMillis=0, lastHeartbeatTime='1969-12-31 19:00:00.000', target=[10.1.1.1]:5703, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=Connection[id=40, /10.1.1.2:5701->/10.1.1.1:33315, endpoint=[10.1.1.1]:5703, alive=true, type=MEMBER]}
    at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.newOperationTimeoutException(InvocationFuture.java:150) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolve(InvocationFuture.java:96) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.spi.impl.AbstractInvocationFuture$1.run(AbstractInvocationFuture.java:246) ~[hazelcast-all-3.7.jar!/:3.7]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
    at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92) ~[hazelcast-all-3.7.jar!/:3.7]
    at ------ submitted from ------.(Unknown Source) ~[na:na]
    at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveAndThrow(ClientInvocationFuture.java:74) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolveAndThrow(ClientInvocationFuture.java:30) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:158) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.spi.ProxyManager.initialize(ProxyManager.java:337) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.spi.ProxyManager.initializeWithRetry(ProxyManager.java:293) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.spi.ProxyManager.getOrCreateProxy(ProxyManager.java:274) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.impl.HazelcastClientInstanceImpl.getDistributedObject(HazelcastClientInstanceImpl.java:607) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.impl.HazelcastClientInstanceImpl.getMap(HazelcastClientInstanceImpl.java:444) ~[hazelcast-all-3.7.jar!/:3.7]
    at com.hazelcast.client.impl.HazelcastClientProxy.getMap(HazelcastClientProxy.java:115) ~[hazelcast-all-3.7.jar!/:3.7]

Peter Veentjer

unread,
Oct 18, 2016, 1:29:22 PM10/18/16
to Hazelcast
Do you see in your log file something slow operation detected?

The most common cause of this exception is that an operation (or maybe multiple) are taking a lot of time to process and onxw the pending operations get picked up; they are rejected because they should have executed within a given time window.

Please add the following JVM properties:

Dhazelcast.diagnostics.enabled=true
-Dhazelcast.diagnostics.metric.level=info
-Dhazelcast.diagnostics.invocation.sample.period.seconds=30
-Dhazelcast.diagnostics.pending.invocations.period.seconds=30
-Dhazelcast.diagnostics.slowoperations.period.seconds=30

Then the system will generate a file 'diagnostics-blabla.log' which should provide some insight in what is happening.
Reply all
Reply to author
Forward
0 new messages