EventBus Messages Getting Lost on AWS Cluster


Jeff Long

Sep 22, 2016, 1:52:52 AM
to vert.x
Hi folks,

We have a Vert.x cluster deployed on an AWS stack across several layers/instances. The paradigm we follow is a webservice verticle that accepts HTTP requests and then forwards them to an appropriate processor verticle via the Vert.x EventBus. The sending of the message is done as follows:

vertx.eventBus().send(eventBusSubject, payload, deliveryOptions, (res) -> {
    Logging.info("Received response from processor:\n" + res);
    if (res.succeeded()) {
        // Do stuff that should happen
    } else {
        if (res.cause() != null && res.cause().toString().startsWith("(NO_HANDLERS,-1) No handlers for address")) {
            Logging.error("Processor EventBus Listener for subject '" + eventBusSubject + "' is missing");
            context.response().setStatusCode(500).end(INTERNAL_SERVER_ERROR);
        } else {
            Logging.warn("Possible timeout met calling processor subject '" + eventBusSubject + "' : " + (res.cause() != null ? res.cause().toString() : "Unknown?!?"));
            context.response().setStatusCode(408).end("Timeout");
        }
    }
});

The eventBusSubject is correct because the majority of the time, the processors receive the message and everything goes as planned. However, sporadically, the processor will not receive the message. The message appears to get lost somewhere on the event bus between the webservice and the processor, and the webservice just times out returning a 408. The processor consumer is being assigned as follows:

vertx.eventBus().consumer(eventBusSubject, this::messageHandler);

and messageHandler is defined as:

private void messageHandler(Message<JsonObject> message) {
    this.payload = (JsonObject) message.body();
    this.vertx.executeBlocking(this::processHandler, false, (res) -> {
        if (this.getWithCallback()) {
            JsonObject messageReply = new JsonObject();
            if (res.succeeded()) {
                messageReply.put("response", (JsonObject) res.result());
            } else {
                messageReply.put("error", res.cause().toString());
            }
            message.reply(messageReply);
        }
    });
}

These are defined in a processor template class that each processor extends and overrides processHandler as needed.

As I said, this works roughly 90% of the time, but after a point, the webservice is not able to successfully deliver messages to the processors, resulting in a timeout. We are using Vert.x 3.2.1 with Hazelcast 3.5.2; would anyone have an idea why this might be happening? Any help is much appreciated.

Thanks!

Thomas SEGISMONT

Sep 22, 2016, 4:02:54 AM
to ve...@googlegroups.com
When you return the 408 response, what is the cause found in your logs? I see you log it at warn level.


Julien Viet

Sep 22, 2016, 9:44:01 AM
to ve...@googlegroups.com
Did you check that the worker pool used by executeBlocking is large enough to process all your messages?


Jeff Long

Sep 22, 2016, 10:34:01 AM
to vert.x
No real info, just things like "Possible timeout met calling processor subject 'refresh-token-prc-4' : (TIMEOUT,-1) Timed out waiting for a reply". In the processHandler method for all of the processors, the first thing I'm doing is logging that I've received a message for that processor. This logging event is never triggered at all. It seems as though the message just disappears on the event bus with the processor never receiving it.

Thomas SEGISMONT

Sep 22, 2016, 12:06:34 PM
to ve...@googlegroups.com
Have you tried to change log level and check what's going on at the cluster manager level? Any exception/suspicious message around the timeout log?


Jeff Long

Sep 23, 2016, 9:59:33 AM
to vert.x
Yeah, when this issue occurs, which can be described as sporadic at best, the message that is being lost is the only one on the event bus. So I don't think it's a load issue.

Jeff Long

Sep 23, 2016, 10:04:46 AM
to vert.x
Yep, we did this yesterday and saw no more Hazelcast logging statements other than the normal INFO ones that occur. We are using Log4j2, while Hazelcast only supports Log4j 1.x, so we let the Hazelcast logging system default to using the JDK, and changed the level to FINER using the JUL configuration file. It literally seems like the message is just disappearing on the event bus.

Jeff Long

Sep 23, 2016, 10:06:28 AM
to vert.x
We have also switched to Vert.x 3.3.3 and Hazelcast 3.6.3, which appeared to solve the problem at first. However it has resurfaced.

Jeff Long

Sep 29, 2016, 11:52:33 AM
to vert.x
Alright, I straightened out the logging issue, and I now have Vert.x and Hazelcast logging at DEBUG level. Here is the series of logs I'm getting:

15:35:20.065 [vert.x-worker-thread-10] INFO  com.privacystar.webservice.Server - {"logged":"2016-09-29T15:35:20Z","type":"INFO","message":"Received normal payload:\n{...payload...}"}

15:35:20.065 [vert.x-worker-thread-10] DEBUG com.privacystar.webservice.Server - {"logged":"2016-09-29T15:35:20Z","type":"DEBUG","message":"Initial payload processing step - sending to subject [user-validation-4]"}

15:35:20.068 [vert.x-eventloop-thread-0] DEBUG com.privacystar.webservice.Server - {"logged":"2016-09-29T15:35:20Z","type":"DEBUG","message":"User info passed user validation [vert.x-eventloop-thread-0]"}

15:35:20.068 [vert.x-eventloop-thread-0] DEBUG com.privacystar.webservice.Server - {"logged":"2016-09-29T15:35:20Z","type":"DEBUG","message":"Final payload processing step - sending to subject [license-check-prc-22]"}

[2016-09-29 15:35:20,069] DEBUG Cluster connection closed: 172.22.23.176:41951 holder io.vertx.core.eventbus.impl.clustered.ConnectionHolder@5bb64d6f (io.vertx.core.eventbus.impl.clustered.ConnectionHolder)

and 1 minute later I get:

15:36:20.069 [vert.x-eventloop-thread-0] WARN  com.privacystar.webservice.Server - {"logged":"2016-09-29T15:36:20Z","type":"WARN","message":"Possible timeout met calling processor subject 'license-check-prc-22' : (TIMEOUT,-1) Timed out after waiting 60000(ms) for a reply. address: d070e4cb-6223-4de7-a9a1-b44c97d5c6f7"}


I searched the "Cluster connection closed" message and found this thread: https://groups.google.com/forum/#!topic/vertx/7rwfcvhx8J4

It appears this may be my issue as well. Is there a solution to this?


Tim Fox

Sep 29, 2016, 12:14:41 PM
to vert.x
Looking at the code in GitHub:

"cluster connection closed" occurs when the connection is closed either by a pong not being received or by an exception/connection closed from the other end.

There is no preceding message in your logs saying "no pong from server".

So most likely the connection is being closed from the other end. Are you sure the other servers in your cluster aren't being stopped or restarted?

Jeff Long

Sep 29, 2016, 12:34:50 PM
to vert.x
Well, various servers in the cluster are being stopped and restarted at different times. But shouldn't the Hazelcast cluster manager be able to dynamically adjust and reconnect to any restarted servers?

At the time of the logs I posted, all servers in the cluster were up and running. On a freshly made cluster, everything works perfectly. But once I start restarting cluster servers, things get weird, just like in the other thread I referenced.

My only guess is that, when restarting a server, the old server address is being stored somewhere and is still associated with the eventbus subject, and when I try to send a message on said subject, it tries to send to the old address. However, I haven't been able to dig into the source code enough yet to confirm this.

Jeff Long

Sep 29, 2016, 12:39:29 PM
to vert.x
In addition, this behavior seems sporadic; sometimes the messages are passed along the eventbus just fine, and sometimes this issue happens. This is what has led me to believe that old server addresses are being cached and are sometimes hit, since I know that if there are multiple listeners for an eventbus subject, they are cycled through in round-robin fashion.

Asher Tarnopolski

Sep 29, 2016, 12:44:37 PM
to vert.x
This indeed sounds like the issue we had: once we restart a node, every second message still seems to be sent to the old node, which probably wasn't unregistered.
This happens on a machine with a TCP-based cluster config, not in AWS.

Jeff Long

Sep 29, 2016, 12:50:04 PM
to vert.x
Yeah, I don't believe it's a networking issue since everything works perfectly on a completely fresh cluster. Did you come up with a resolution/workaround?

Asher Tarnopolski

Sep 29, 2016, 12:55:07 PM
to vert.x
Nope. It still happens from time to time; meanwhile we implemented a retry-send mechanism for when we receive a timeout, in an attempt to solve this at least partially.

Tim Fox

Sep 29, 2016, 2:48:43 PM
to vert.x
Hazelcast is used to maintain the cluster topology - it's not used for the transport of events - that's done by straight TCP connections between nodes.

If you're sending a message along one of the connections to a node and expecting it to get processed and a result returned, and that server is shut down, then the message will appear to be "lost".

Asher Tarnopolski

Sep 29, 2016, 2:57:04 PM
to vert.x
Hey Tim,
in our case this is not what happens. We stop the vertx instance on one node and then start it again. The "new" instance functions as expected, but every second message sent to it via the eventbus fails with a timeout, as if the old instance's consumers were still registered on the bus.

Jeff Long

Sep 29, 2016, 3:55:04 PM
to vert.x
That's exactly right, Asher. I'm digging through the source code right now, and I can see in the ClusteredEventBus where the server IDs are being fetched: https://github.com/eclipse/vert.x/blob/master/src/main/java/io/vertx/core/eventbus/impl/clustered/ClusteredEventBus.java#L233

This calls HazelcastAsyncMultiMap.get() (https://github.com/vert-x3/vertx-hazelcast/blob/master/src/main/java/io/vertx/spi/cluster/hazelcast/impl/HazelcastAsyncMultiMap.java#L90), which retrieves the server IDs from vertx's cache. So I have a feeling that the cache is not being properly updated when a node is shut down.

Tim Fox

Sep 30, 2016, 3:52:59 AM
to vert.x
Well I remember there were old bugs related to this, but afaik they were all fixed. Are you using latest versions?

Is there a new bug report with a reproducer set up to track this?

Tim Fox

Sep 30, 2016, 4:01:01 AM
to vert.x
Well, the evidence of your logs shows cluster connections closed but not via pong timeout, so something is closing the connection from the other side. Most probably servers are being shut down. In that case you should expect any in-transit messages to go missing, and code your app to deal with timeouts and retry accordingly.

Jochen Mader

Sep 30, 2016, 8:04:36 AM
to ve...@googlegroups.com
We recently had a similar thread in here.
From what I read, I think your nodes are simply shut down, which means they are not correctly removed from the cluster.
In that case they won't be removed from the list of nodes for another 300 seconds.





Jeff Long

Sep 30, 2016, 9:17:19 AM
to vert.x
I had looked into this heartbeat setting, but I don't quite think this is the issue. If it were, the cluster would still remove the dead nodes after 300 seconds and cease trying to send to them, right? This isn't the case here. Vertx tries to send to the old address(es) round robin with the active addresses forever, and the only way to resolve it is to completely restart the cluster.

Jeff Long

Sep 30, 2016, 4:25:25 PM
to vert.x
I just captured the logs that are occurring when I shut down a node:

[2016-09-30 19:52:34,858] INFO [172.22.23.170]:5701 [noone] [3.6.3] Connection [Address[172.22.23.139]:5702] lost. Reason: Socket explicitly closed (com.hazelcast.nio.tcp.TcpIpConnection)
[2016-09-30 19:52:34,858] INFO [172.22.23.170]:5701 [noone] [3.6.3] Removing Member [172.22.23.139]:5702 (com.hazelcast.cluster.ClusterService)
[2016-09-30 19:52:34,859] INFO [172.22.23.170]:5701 [noone] [3.6.3] Removing Member [172.22.23.139]:5702 (com.hazelcast.partition.InternalPartitionService)
[2016-09-30 19:52:34,872] INFO [172.22.23.170]:5701 [noone] [3.6.3] Committing/rolling-back alive transactions of Member [172.22.23.139]:5702, UUID: 368bebd3-7f75-4c6c-83a8-ca7b51112ad1 (com.hazelcast.transaction.TransactionManagerService)
[2016-09-30 19:52:34,875] INFO [172.22.23.170]:5701 [noone] [3.6.3]

Members [7] {
        Member [172.22.23.41]:5701
        Member [172.22.23.139]:5701
        Member [172.22.23.130]:5701
        Member [172.22.23.70]:5701
        Member [172.22.23.250]:5701
        Member [172.22.23.33]:5701
        Member [172.22.23.170]:5701 this
}
 (com.hazelcast.cluster.ClusterService)
[2016-09-30 19:52:35,674] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Invocations:1 timeouts:0 backup-timeouts:0 (com.hazelcast.spi.OperationService)
[2016-09-30 19:52:36,169] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Received heartbeat from Member [172.22.23.70]:5701 (now: Fri Sep 30 19:52:36 UTC 2016, timestamp: Fri Sep 30 19:52:36 UTC 2016) (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:36,674] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Invocations:1 timeouts:0 backup-timeouts:0 (com.hazelcast.spi.OperationService)
[2016-09-30 19:52:37,198] WARN [172.22.23.170]:5701 [noone] [3.6.3] Address[172.22.23.170]:5701 could not ping Address[172.22.23.139]:5702 (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:37,198] WARN [172.22.23.170]:5701 [noone] [3.6.3] Address[172.22.23.170]:5701 will ping Address[172.22.23.139]:5701 (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:37,674] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Invocations:1 timeouts:0 backup-timeouts:0 (com.hazelcast.spi.OperationService)
[2016-09-30 19:52:37,762] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Received heartbeat from Member [172.22.23.130]:5701 (now: Fri Sep 30 19:52:37 UTC 2016, timestamp: Fri Sep 30 19:52:37 UTC 2016) (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:38,263] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Received heartbeat from Member [172.22.23.250]:5701 (now: Fri Sep 30 19:52:38 UTC 2016, timestamp: Fri Sep 30 19:52:38 UTC 2016) (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:38,674] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Invocations:1 timeouts:0 backup-timeouts:0 (com.hazelcast.spi.OperationService)
[2016-09-30 19:52:39,482] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Received heartbeat from Member [172.22.23.33]:5701 (now: Fri Sep 30 19:52:39 UTC 2016, timestamp: Fri Sep 30 19:52:39 UTC 2016) (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:39,674] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Invocations:1 timeouts:0 backup-timeouts:0 (com.hazelcast.spi.OperationService)
[2016-09-30 19:52:39,844] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Received heartbeat from Member [172.22.23.41]:5701 (now: Fri Sep 30 19:52:39 UTC 2016, timestamp: Fri Sep 30 19:52:39 UTC 2016) (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:39,868] INFO [172.22.23.170]:5701 [noone] [3.6.3] Connection [Address[172.22.23.139]:5701] lost. Reason: Socket explicitly closed (com.hazelcast.nio.tcp.TcpIpConnection)
[2016-09-30 19:52:39,868] INFO [172.22.23.170]:5701 [noone] [3.6.3] Removing Member [172.22.23.139]:5701 (com.hazelcast.cluster.ClusterService)
[2016-09-30 19:52:39,868] INFO [172.22.23.170]:5701 [noone] [3.6.3] Removing Member [172.22.23.139]:5701 (com.hazelcast.partition.InternalPartitionService)
[2016-09-30 19:52:39,871] INFO [172.22.23.170]:5701 [noone] [3.6.3] Committing/rolling-back alive transactions of Member [172.22.23.139]:5701, UUID: bdfbf746-c5f2-4567-b761-cd63bbdbf5d4 (com.hazelcast.transaction.TransactionManagerService)
[2016-09-30 19:52:39,872] INFO [172.22.23.170]:5701 [noone] [3.6.3]

Members [6] {
        Member [172.22.23.41]:5701
        Member [172.22.23.130]:5701
        Member [172.22.23.70]:5701
        Member [172.22.23.250]:5701
        Member [172.22.23.33]:5701
        Member [172.22.23.170]:5701 this
}
 (com.hazelcast.cluster.ClusterService)
[2016-09-30 19:52:40,675] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Invocations:1 timeouts:0 backup-timeouts:0 (com.hazelcast.spi.OperationService)
[2016-09-30 19:52:40,984] WARN [172.22.23.170]:5701 [noone] [3.6.3] Could not get results (com.hazelcast.map.impl.query.MapQueryEngineImpl)
com.hazelcast.core.MemberLeftException: Member [172.22.23.139]:5701 has left cluster!
        at com.hazelcast.spi.impl.operationservice.impl.InvocationMonitor$OnMemberLeftTask.run(InvocationMonitor.java:284)
        at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:212)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
        at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
        at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92)
        at ------ End remote and begin local stack-trace ------.(Unknown Source)
        at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponse(InvocationFuture.java:387)
        at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponseOrThrowException(InvocationFuture.java:337)
        at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:225)
        at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:204)
        at com.hazelcast.map.impl.query.MapQueryEngineImpl.addResultsOfPredicate(MapQueryEngineImpl.java:528)
        at com.hazelcast.map.impl.query.MapQueryEngineImpl.invokeQueryAllPartitions(MapQueryEngineImpl.java:421)
        at com.hazelcast.map.impl.proxy.MapProxyImpl.entrySet(MapProxyImpl.java:617)
        at com.hazelcast.map.impl.proxy.MapProxyImpl.entrySet(MapProxyImpl.java:606)
        at io.vertx.core.impl.HAManager.nodeLeft(HAManager.java:282)
        at io.vertx.core.impl.HAManager.access$100(HAManager.java:95)
        at io.vertx.core.impl.HAManager$1.nodeLeft(HAManager.java:142)
        at io.vertx.spi.cluster.hazelcast.HazelcastClusterManager.memberRemoved(HazelcastClusterManager.java:286)
        at com.hazelcast.cluster.impl.ClusterServiceImpl.dispatchEvent(ClusterServiceImpl.java:775)
        at com.hazelcast.cluster.impl.ClusterServiceImpl.dispatchEvent(ClusterServiceImpl.java:87)
        at com.hazelcast.spi.impl.eventservice.impl.LocalEventDispatcher.run(LocalEventDispatcher.java:56)
        at com.hazelcast.util.executor.StripedExecutor$Worker.process(StripedExecutor.java:187)
        at com.hazelcast.util.executor.StripedExecutor$Worker.run(StripedExecutor.java:171)
[2016-09-30 19:52:41,169] DEBUG [172.22.23.170]:5701 [noone] [3.6.3] Received heartbeat from Member [172.22.23.70]:5701 (now: Fri Sep 30 19:52:41 UTC 2016, timestamp: Fri Sep 30 19:52:41 UTC 2016) (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:42,204] WARN [172.22.23.170]:5701 [noone] [3.6.3] Address[172.22.23.170]:5701 could not ping Address[172.22.23.139]:5701 (com.hazelcast.cluster.impl.ClusterHeartbeatManager)
[2016-09-30 19:52:42,205] WARN [172.22.23.170]:5701 [noone] [3.6.3] Address[172.22.23.170]:5701 will ping Address[172.22.23.139]:5701 (com.hazelcast.cluster.impl.ClusterHeartbeatManager)

I killed the node by simply shutting down the EC2 instance it was on (172.22.23.139, the member being removed in the logs above), which probably just powered off the VM. Could that exception being thrown screw things up? It looks like it's supposed to happen, though.

Tim, shouldn't the cluster manager still remove the address associated with the node that left, even though it was force-closed? The problem isn't that messages in transit while the server is shutting down are being lost; it's that after this node is restarted and is a member of the healthy-appearing cluster, the system still thinks the old server address is valid (which it obviously isn't) and tries to send messages to it, which just results in a timeout.

Jeff Long

Sep 30, 2016, 4:49:31 PM
to vert.x
Yes, we are using Vert.x 3.3.3 and Hazelcast 3.6.3. I have not set up a bug report yet; I wanted to make sure it was actually a bug first.

Tim Fox

Oct 1, 2016, 5:50:52 AM
to vert.x
Looking at your stack trace - Hazelcast is throwing a MemberLeftException as the ClusterMap is being iterated from HAManager line 282:

at io.vertx.core.impl.HAManager.nodeLeft(HAManager.java:282)

It might just be a matter of catching this exception and retrying the iteration, as it seems (from googling) that this exception can be thrown when one of the nodes dies during the iteration.

Manish Kumar

Oct 2, 2016, 3:35:11 PM
to vert.x
We have the same issue when we shut down a node in a non-graceful manner. Afterwards, when we send a message on the event bus, Vert.x still sporadically sends the message to the dead node.

Jeff Long

Oct 3, 2016, 4:15:49 PM
to vert.x
I think I may have found the actual source of the issue. When shutting down a verticle using SIGTERM, so a graceful shutdown, I am receiving this log trace:

[2016-10-03 15:06:19,507] ERROR Unhandled exception (io.vertx.core.impl.ContextImpl)
java.lang.IllegalStateException: Result is already complete: succeeded
at io.vertx.core.impl.FutureImpl.checkComplete(FutureImpl.java:164)
at io.vertx.core.impl.FutureImpl.complete(FutureImpl.java:108)
at io.vertx.core.impl.FutureImpl.handle(FutureImpl.java:135)
at io.vertx.core.impl.FutureImpl.handle(FutureImpl.java:23)
at io.vertx.core.eventbus.impl.clustered.ClusteredEventBus.lambda$removeSub$9(ClusteredEventBus.java:382)
at io.vertx.core.impl.FutureImpl.checkCallHandler(FutureImpl.java:158)
at io.vertx.core.impl.FutureImpl.setHandler(FutureImpl.java:100)
at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:271)
at io.vertx.core.impl.ContextImpl.lambda$wrapTask$2(ContextImpl.java:316)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:440)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
at java.lang.Thread.run(Thread.java:745)
[... the identical "Result is already complete: succeeded" stack trace repeats another 29 times within the same few milliseconds ...]

So Vert.x is actually failing to remove the sub on shutdown. I am not overriding the stop method in the AbstractVerticle, so this is default behavior.

Jeff Long

Oct 3, 2016, 4:18:55 PM
to vert.x
Yep. Even though it looks like Vert.x is not removing the sub properly at the time of shutdown, shouldn't it try to remove the sub again when a message is sent to the dead address and no response is received? It clearly knows the connection is dead, since I get the "Cluster connection closed" debug log when it tries to send a message to one of these dead nodes.

Jeff Long

Oct 3, 2016, 4:48:16 PM
to vert.x

The future is somehow being prematurely completed; does anyone have an idea why that would happen? I have been trying to trace the code to find out, but I haven't found anything yet.

 

Tim Fox

Oct 4, 2016, 2:52:13 AM
to vert.x
Well, of course, it _should_ work, and as far as I know it used to in this case (maybe HZ behaviour has changed?). I'm just a guy looking at the code in GitHub, and it seems pretty clear right now that exception isn't being handled. I'm not saying that's the main issue, but it's something that needs investigating.

I strongly recommend someone opens an issue and attaches all these findings to it, along with instructions to replicate (if this happens all the time it should be easy to reproduce, right?). The core team are very busy and won't investigate things until that has happened.

Julien Viet

Oct 4, 2016, 3:39:26 AM
to ve...@googlegroups.com
+1


Jeff Long

Oct 4, 2016, 10:51:31 AM
to vert.x
Gotcha, thanks for all the help Tim! I'll set up an issue and try to come up with a reproducer as soon as I get a chance.

Jeff Long

Oct 6, 2016, 6:57:04 PM
to vert.x
I noticed this pull request, which describes my problem as well, and I'm using its solution, which appears to be working great.

Tim Fox

Oct 7, 2016, 3:40:44 AM
to vert.x
That looks great, but could you add a comment in the issue just to tie things together? Also, that exception still needs to be dealt with.