WrongTargetException: WrongTarget!


rco...@gmail.com

Oct 4, 2013, 12:35:47
to haze...@googlegroups.com
I have some code that updates a timestamp in a distributed Map every 2 seconds.

            _memberStatusMap.put(
                    _instance.getName(),
                    System.currentTimeMillis());
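
A minimal, self-contained sketch of this kind of keep-alive task. It makes several assumptions: a local `ConcurrentHashMap` stands in for the distributed map so it runs without a cluster, and the class name `KeepAliveSketch`, the method `heartbeat`, and the member name `"node-1"` are hypothetical, not taken from the real application.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class KeepAliveSketch {

    /** Writes the current time under this member's name, like the put(...) call above. */
    static void heartbeat(ConcurrentMap<String, Long> memberStatusMap, String memberName) {
        memberStatusMap.put(memberName, System.currentTimeMillis());
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: a local map replaces the distributed map for illustration only.
        ConcurrentMap<String, Long> memberStatusMap = new ConcurrentHashMap<>();
        String memberName = "node-1"; // hypothetical; the original uses _instance.getName()

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                heartbeat(memberStatusMap, memberName);
            } catch (RuntimeException e) {
                // An exception escaping a fixed-rate task suppresses all later runs,
                // so it is caught and logged here instead.
                System.err.println("Keep-alive update failed: " + e);
            }
        }, 0, 2, TimeUnit.SECONDS);

        // Let a few heartbeats happen, then stop the scheduler.
        Thread.sleep(5000);
        scheduler.shutdownNow();
        System.out.println("Last heartbeat for " + memberName + ": "
                + memberStatusMap.get(memberName));
    }
}
```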

My current test environment only has a single Node so I wouldn't expect any networking issues. Strangely, the application was running for 9 days straight without issue, but then all of a sudden started reporting these errors:

com.hazelcast.spi.exception.WrongTargetException: WrongTarget! this:Address[127.0.0.1]:5776, target:null, partitionId: 231, replicaIndex: 0, operation: com.hazelcast.map.operation.PutOperation, service: hz:impl:mapService
    at com.hazelcast.spi.impl.InvocationImpl.doInvoke(InvocationImpl.java:134)
    at com.hazelcast.spi.impl.InvocationImpl.access$800(InvocationImpl.java:39)
    at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.waitForResponse(InvocationImpl.java:352)
    at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.get(InvocationImpl.java:291)
    at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.get(InvocationImpl.java:283)
    at com.hazelcast.map.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:197)
    at com.hazelcast.map.proxy.MapProxySupport.putInternal(MapProxySupport.java:167)
    at com.hazelcast.map.proxy.MapProxyImpl.put(MapProxyImpl.java:71)
    at com.hazelcast.map.proxy.MapProxyImpl.put(MapProxyImpl.java:59)
    at com.antennasoftware.ecserver.service.hazelcast.HazelcastKeepAliveTask.run(HazelcastKeepAliveTask.java:14)
    at com.antennasoftware.ecserver.service.task.TaskServiceImpl$1.run(TaskServiceImpl.java:56)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

Note that my Node is running on port 5776.

I'm currently using Hazelcast 3.0.2. Previously I was on 2.x and didn't have this issue.

Any idea what this may be? I don't see anything posted regarding this exception type.

Thanks!

Peter Veentjer

Oct 4, 2013, 13:16:44
to haze...@googlegroups.com
Normally this exception is thrown (and caught) to indicate that an operation was sent to the wrong node.

This typically happens when a partition has moved to a new owner after an operation was sent to the old owner, but before the old owner processed it. When that happens, the exception is thrown and caught internally, and the caller knows it needs to retry the call. This is internal, though, and should not be visible to the outside world.
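
The retry pattern described here can be sketched roughly as follows. This is an illustration of the general idea under stated assumptions, not Hazelcast's actual implementation: `RetrySketch`, `WrongTargetSketchException`, and `invokeWithRetry` are hypothetical names, and the parameters `tryCount` and `tryPauseMillis` simply mirror the fields that show up in the "Retrying invocation" log lines quoted in this thread.

```java
import java.util.concurrent.Callable;

public class RetrySketch {

    /** Hypothetical stand-in for com.hazelcast.spi.exception.WrongTargetException. */
    static class WrongTargetSketchException extends RuntimeException {
        WrongTargetSketchException(String message) {
            super(message);
        }
    }

    /**
     * Runs the operation and retries whenever it signals "wrong target".
     * After tryCount failed attempts the last exception is rethrown to the caller.
     */
    static <T> T invokeWithRetry(Callable<T> operation, int tryCount, long tryPauseMillis)
            throws Exception {
        if (tryCount < 1) {
            throw new IllegalArgumentException("tryCount must be at least 1");
        }
        WrongTargetSketchException last = null;
        for (int attempt = 1; attempt <= tryCount; attempt++) {
            try {
                return operation.call();
            } catch (WrongTargetSketchException e) {
                last = e;
                if (attempt < tryCount) {
                    // Pause so the partition table has time to settle before retrying.
                    Thread.sleep(tryPauseMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated operation: fails twice with "wrong target", then succeeds.
        String result = invokeWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new WrongTargetSketchException("partition owner changed");
            }
            return "stored";
        }, 250, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a loop like this the exception only reaches the caller once every attempt has failed. With `tryCount=250` and `tryPauseMillis=500`, that would be after roughly two minutes of retrying.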

Robert Cohen

Oct 4, 2013, 13:35:42
to haze...@googlegroups.com

I see that Hazelcast is repeatedly writing the following log lines:

10-04 11:06:31,565 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [my-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:31,565 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [my-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:32,066 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [my-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:32,066 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [my-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:32,566 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [my-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:32,567 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [my-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:32,567 WARN [com.hazelcast.spi.Invocation] - [127.0.0.1]:5776 [my-cluster] Retrying invocation: InvocationImpl{ serviceName='hz:impl:mapService', op=PutOperation{memberStatusMap}, partitionId=231, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=130, callTimeout=60000, target=null}, Reason: com.hazelcast.spi.exception.WrongTargetException: WrongTarget! this:Address[127.0.0.1]:5776, target:null, partitionId: 231, replicaIndex: 0, operation: com.hazelcast.map.operation.PutOperation, service: hz:impl:mapService
10-04 11:06:33,067 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [ec-cluster] Initializing cluster partition table first arrangement...
10-04 11:06:33,068 INFO [com.hazelcast.partition.PartitionService] - [127.0.0.1]:5776 [ec-cluster] Initializing cluster partition table first arrangement... 

Mehmet Dogan

Oct 4, 2013, 14:26:54
to haze...@googlegroups.com

This seems related to the issue https://github.com/hazelcast/hazelcast/issues/927.

Can you try 3.0.3-SNAPSHOT?

@mmdogan

~Sent from mobile

grat...@gmail.com

Oct 21, 2013, 06:41:12
to haze...@googlegroups.com
I am seeing the same repeating message using 3.1.

grat...@gmail.com

Oct 21, 2013, 07:45:48
to haze...@googlegroups.com, grat...@gmail.com
More information: I am seeing this exact same message printed every second:

 2013/10/21 05:01:33 | WARN  [WaitNotifyServiceImpl$WaitingOp] [161.228.100.233]:2434 [SpectrumCluster-orientdb] class com.hazelcast.spi.exception.WrongTargetException: WrongTarget! this:Address[161.228.100.233]:2434, target:Address[161.228.100.221]:2434, partitionId: 166, replicaIndex: 0, operation: com.hazelcast.spi.impl.WaitNotifyServiceImpl$WaitingOp, service: hz:impl:queueService

Mehmet Dogan

Oct 22, 2013, 07:34:45
to haze...@googlegroups.com
Do you have code to reproduce this issue?

@mmdogan

ian.sp...@gmail.com

Oct 24, 2013, 10:34:36
to haze...@googlegroups.com
I'm seeing this as well, using HEAD of maintenance-3.x branch. After upgrading HZ to HEAD from an earlier (recent) maintenance-3.x snapshot and restarting node 1 of my cluster, I see the following warning logged a ton of times:  

WARN  2013-10-24 14:24:50,562 com.hazelcast.spi.Invocation  [node1]:5702 [v11] Retrying invocation: InvocationImpl{ serviceName='hz:impl:lockService', op=com.hazelcast.concurrent.lock.LockOperation@54f68112, partitionId=235, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=210, callTimeout=60000, target=null}, Reason: com.hazelcast.spi.exception.WrongTargetException: WrongTarget! this:Address[node1]:5702, target:null, partitionId: 235, replicaIndex: 0, operation: com.hazelcast.concurrent.lock.LockOperation, service: hz:impl:lockService

Then a few minutes later, I see:

Caused by: com.hazelcast.spi.exception.WrongTargetException: WrongTarget! this:Address[node1]:5702, target:null, partitionId: 235, replicaIndex: 0, operation: com.hazelcast.concurrent.lock.LockOperation, service: hz:impl:lockService
        at com.hazelcast.spi.impl.InvocationImpl.doInvoke(InvocationImpl.java:131) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.spi.impl.InvocationImpl.access$800(InvocationImpl.java:36) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.waitForResponse(InvocationImpl.java:355) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.get(InvocationImpl.java:294) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.get(InvocationImpl.java:286) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.concurrent.lock.proxy.LockProxySupport.lock(LockProxySupport.java:108) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.concurrent.lock.proxy.LockProxySupport.lock(LockProxySupport.java:98) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]
        at com.hazelcast.concurrent.lock.proxy.LockProxy.lock(LockProxy.java:67) ~[hazelcast-3.1.evergage-1.jar:3.1.evergage-1]



And my app fails to start.


ian.sp...@gmail.com

Oct 24, 2013, 10:57:33
to haze...@googlegroups.com, ian.sp...@gmail.com
Note, this problem does not occur consistently. I restarted my node1, and it started with no WrongTarget warnings or exceptions. For node2, I also saw the WrongTarget issue and had to restart the node three times before it started up successfully.

Ian Springer

Oct 24, 2013, 11:39:37
to haze...@googlegroups.com, ian.sp...@gmail.com
Here is a bit more information about my environment to hopefully help you reproduce this issue. I have a TCP-based cluster with 6 nodes. I stop one node, and when I restart it the issue occurs, but only about 30% of the time. I started seeing this after upgrading to the very latest from the maintenance-3.x branch as of this morning; previously I was running the latest from the maintenance-3.x branch as of yesterday afternoon. So perhaps this is something that was recently introduced.
 

Ian Springer

Oct 25, 2013, 15:51:00
to haze...@googlegroups.com
I just noticed that I see a whole bunch of the following errors on the node to which the node getting WrongTargetExceptions is trying to talk:

ERROR 2013-10-25 19:37:03,813 com.hazelcast.spi.OperationService  [node1]:5702 [release-11] Duplicate Call record! -> RemoteCallKey{caller=Address[node4]:5702, callId=138326, time=1382729823813} / RemoteCallKey{caller=Address[node4]:5702, callId=138326, time=1382729674538} == com.hazelcast.partition.PartitionServiceImpl$AssignPartitions 

So in this case, node4 is the node getting WrongTargetExceptions, and it eventually fails to start because the distributed operation call times out; node1 is the node that node4 is trying to talk to.

Robert Cohen

Nov 18, 2013, 10:33:26
to haze...@googlegroups.com
I should note that I have not seen the original error since upgrading to 3.0.3. Thanks!