Graceful shutdown without abruptly closing client connections


Diego Feitosa

Apr 25, 2017, 6:50:13 PM
to vert.x

Hi,

I've been trying to find a way to shut down a Vert.x application gracefully, but so far I've had no success.

My application uses Vert.x 3.4.1 (recently migrated from 3.3.3) and exposes an HTTP server that handles some requests and triggers HTTP calls to downstream services. Whenever I stop it with a SIGINT (Ctrl-C or kill -2 <pid>) or SIGTERM, all client connections to my server are abruptly closed and no response is returned.

I'm trying to find a way to save these requests.

I've tried adding a JVM shutdown hook to delay the shutdown process, but it gets executed after the hook installed by Vert.x. The application is initialized with io.vertx.core.Launcher, which installs a shutdown hook that calls vertx.close(), and I couldn't find a way to influence it.

Are there any recommendations on how to achieve this?

Thanks!
-Diego
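
One detail behind the hook ordering described above: the JVM starts all registered shutdown hooks in an unspecified order and runs them concurrently, so a second hook cannot hold back the one that calls vertx.close(). Delaying the close therefore means using a single hook that waits first and closes afterwards. The following is a minimal sketch in plain Java, not Vert.x code; DrainingShutdownHook and its parameters are hypothetical names, and the CountDownLatch stands in for whatever signals that in-flight requests are done.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: one shutdown hook that waits for a drain signal, then closes. */
final class DrainingShutdownHook {

    /**
     * Builds (but does not register) a hook thread that waits until `drained`
     * is released or `timeoutMillis` elapses, and then runs `closeAction`.
     */
    static Thread create(CountDownLatch drained, long timeoutMillis, Runnable closeAction) {
        return new Thread(() -> {
            try {
                // Bounded wait: the JVM cannot finish exiting until this hook returns.
                drained.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                // Runs whether the drain completed, timed out, or was interrupted.
                closeAction.run();
            }
        }, "draining-shutdown-hook");
    }
}
```

With a custom launcher, such a hook could be registered through Runtime.getRuntime().addShutdownHook(...), with a close action that calls vertx.close() and blocks until the close has completed. This only helps if no other hook closes Vert.x in the meantime.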

Tim Fox

Apr 26, 2017, 2:47:18 AM
to vert.x


I'm not really sure what you're asking here. Vert.x has no way of knowing how many requests are already on the wire but haven't been read by the server yet, so there's never a good time to shut down while preserving all requests. Perhaps you could clarify your requirements.

Diego Feitosa

Apr 26, 2017, 2:45:19 PM
to vert.x
Hi Tim,

Sorry if it was confusing/not clear.

I'm looking for a way to finish processing all requests that started prior to my application shutdown. There are mechanisms in place already to ensure no new requests will be accepted by my app, but I want to save what has already been accepted.

For example, if a client makes a request to my application and I stop the server before a response is sent to the client, I want to delay the shutdown until that request is finally served and my client gets the response back.
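
The requirement described here, holding the shutdown until every already-accepted request has been answered, comes down to counting requests in flight. The following is a minimal sketch in plain Java, independent of Vert.x; the class InFlightTracker and its method names are hypothetical and not part of any Vert.x API. It covers only requests the server has already started handling, not requests still on the wire.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts requests in flight and signals when the last one ends. */
final class InFlightTracker {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicBoolean draining = new AtomicBoolean();
    private final CompletableFuture<Void> drained = new CompletableFuture<>();

    /** Call when a request arrives; returns false if draining has begun and it should be refused. */
    boolean tryBegin() {
        // Increment before checking the flag so beginDrain() cannot miss this request.
        inFlight.incrementAndGet();
        if (draining.get()) {
            end();
            return false;
        }
        return true;
    }

    /** Call exactly once for every request that tryBegin() accepted, after its response is written. */
    void end() {
        if (inFlight.decrementAndGet() == 0 && draining.get()) {
            drained.complete(null);
        }
    }

    /** Starts draining; the returned future completes once no accepted request is left. */
    CompletableFuture<Void> beginDrain() {
        draining.set(true);
        if (inFlight.get() == 0) {
            drained.complete(null);
        }
        return drained;
    }
}
```

The shutdown path would call beginDrain() and close the server only when the returned future completes, ideally with an upper time limit in case a request never finishes.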

Julien Viet

Apr 26, 2017, 3:09:54 PM
to ve...@googlegroups.com
How do you intend to handle the new requests that arrive during this grace period?


Tim Fox

Apr 26, 2017, 4:19:08 PM
to vert.x


On Wednesday, 26 April 2017 19:45:19 UTC+1, Diego Feitosa wrote:
> Hi Tim,
>
> Sorry if it was confusing/not clear.
>
> I'm looking for a way to finish processing all requests that started prior to my application shutdown. There are mechanisms in place already to ensure no new requests will be accepted by my app, but I want to save what has already been accepted.
>
> For example, if a client makes a request to my application and I stop the server before a response is sent to the client, I want to delay the shutdown

I don't really see how this is possible. A client could make a request, but that request is still in transit on the wire when the server starts shutting down. The server has no requests currently being processed, so it shuts down. This will cause the client connection to be terminated, and the client will never receive responses to those in-transit requests.

Unless you somehow communicate with the client at the beginning of the server shutdown and say something like "don't send any more requests and tell me the id of the last request you sent, so I can wait for it on the server and process it before shutting down", there will always be the possibility of shutting down with unprocessed requests.

Diego Feitosa

Apr 26, 2017, 6:24:58 PM
to vert.x
Hi Julien,

In preparation for the shutdown, all incoming traffic is redirected to another server pool, so the server that is going down doesn't receive any new requests. This is already in place.

Diego Feitosa

Apr 26, 2017, 8:16:08 PM
to vert.x
So far, I've tried two different things, one that worked as expected and another that did not work at all:

1. Custom shutdown hook on a custom launcher (not working)
I stopped using io.vertx.core.Launcher and created a custom Launcher class. This class adds a JVM shutdown hook that calls vertx.close(), just as io.vertx.core.impl.launcher.commands.BareCommand does, except that the close() call is meant to be delayed. To delay it, I first tried vertx.setTimer(60000, x -> { vertx.close(); }), which had no effect, and later Thread.sleep(60000) inside the shutdown hook. Even though sleep() runs on another thread, my requests were all blocked, and once the sleep finished they did not complete successfully.

2. Exposed another endpoint that shuts down the app (worked as expected)
I've created another verticle with an HTTP server that listens on a different port and expects the shutdown command (the SIGINT/SIGTERM handling is lost, but that should be fine; I can replace a kill with a curl call). However, I'm not sure about the implications of having such code and whether it may interfere with Vert.x internal state. Apart from the security concern of not exposing it to external callers, and the fact that it doesn't handle multiple "shutdown calls", does anyone see any downsides to taking this approach? Any feedback is highly appreciated.

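import io.vertx.core.AbstractVerticle;
import io.vertx.core.Future;
import org.slf4j.Logger;          // assumption: SLF4J, as the {} placeholder suggests
import org.slf4j.LoggerFactory;

public class ShutdownVerticle extends AbstractVerticle {

    private static final Logger LOGGER = LoggerFactory.getLogger(ShutdownVerticle.class);

    @Override
    public void start(Future<Void> startFuture) throws Exception {
        vertx.createHttpServer().requestHandler(request -> {
            int shutdownTimer = 60000;
            LOGGER.info("Preparing to shutdown in {} milliseconds", shutdownTimer);
            // Close Vert.x only after the grace period, so in-flight requests can finish.
            vertx.setTimer(shutdownTimer, id -> {
                vertx.close(result -> {
                    if (result.succeeded()) {
                        LOGGER.info("Goodbye, world.");
                    } else {
                        LOGGER.error("Vert.x did not close cleanly", result.cause());
                    }
                });
            });
            request.response().end();
        }).listen(8092, listening -> {
            // Report the listen result so the verticle deployment completes or fails.
            if (listening.succeeded()) {
                startFuture.complete();
            } else {
                startFuture.fail(listening.cause());
            }
        });
    }
}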

Thanks!
-Diego
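
On the concern about multiple "shutdown calls": as written, every request to the shutdown endpoint schedules another timer and another vertx.close(). A small guard makes the trigger idempotent. This is a minimal sketch in plain Java; ShutdownOnce is a hypothetical name, and the Runnable stands in for the setTimer/close logic of the verticle above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard: runs the shutdown action for the first request only. */
final class ShutdownOnce {
    private final AtomicBoolean requested = new AtomicBoolean();
    private final Runnable scheduleShutdown;

    ShutdownOnce(Runnable scheduleShutdown) {
        this.scheduleShutdown = scheduleShutdown;
    }

    /** Returns true only for the call that actually triggered the shutdown. */
    boolean request() {
        // compareAndSet lets exactly one caller through, even across threads.
        if (requested.compareAndSet(false, true)) {
            scheduleShutdown.run();
            return true;
        }
        return false;
    }
}
```

The request handler could then answer later calls with a different status to show that a shutdown is already pending. In a single verticle instance a plain boolean field would likely do as well, since its handlers run on one event loop thread; the atomic version also covers several instances sharing the guard.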

Dmitrii Golub

May 22, 2017, 7:04:38 PM
to vert.x
Did you find any other way to handle this?
For instance, Node.js has the close method.
Is there something like this in Vert.x?

Tim Fox

May 23, 2017, 2:33:38 AM
to vert.x
Possible, but that's not going to help with any requests already on the wire, as mentioned earlier.

Dmitrii Golub

May 24, 2017, 4:34:27 AM
to vert.x
Why not?
Usually, you put some kind of router in front of the application. Zero-downtime updates are not hard to achieve with nginx.

For instance:
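rabbitMQClient!!.start { res ->
    if (res.succeeded()) {
        val autoAck = true
        // Forward messages from the "fromThemToUs" queue to the event bus address of the same name.
        rabbitMQClient!!.basicConsume("fromThemToUs", "fromThemToUs", autoAck, {
            if (it.succeeded()) {
                logger.info("successfully consumed")
            } else {
                // Log the failure with its cause instead of the raw stack trace array.
                logger.error("basicConsume failed", it.cause())
            }
        })
    } else {
        logger.error("Failed to connect to rabbit with ${rabbitConfig.getConfig()}")
    }
}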

I have no idea how to handle this case:
1. The verticle receives SIGTERM.
2. rabbitMQClient stops receiving new messages.
3. Wait while the verticle consumes the last message.
4. Shut down the instance.

You said that this is possible; could you please give me some directions?
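
Steps 2 to 4 of the list above follow a general pattern: stop accepting new work, wait for accepted work to finish, then exit. The JDK's ExecutorService implements that pattern through shutdown() and awaitTermination(). The following is a minimal sketch in plain Java, not Vert.x or RabbitMQ client code; DrainingWorker is a hypothetical class standing in for the message-handling part of the consumer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/** Hypothetical worker that can stop accepting messages and wait for the ones in progress. */
final class DrainingWorker {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Hands a message handler to the worker; returns false once stopping has begun. */
    boolean submit(Runnable messageHandler) {
        try {
            executor.execute(messageHandler);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting work, then waits up to timeoutMillis for accepted work to finish. */
    boolean stopAndAwait(long timeoutMillis) {
        executor.shutdown(); // already-submitted handlers still run to completion
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Separately from shutdown handling: with autoAck set to true, as in the snippet above, RabbitMQ treats a message as acknowledged as soon as it is delivered, so a message that is still being processed when the process dies is not redelivered. Acknowledging manually after processing lets the broker redeliver unfinished messages.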

Tim Fox

May 24, 2017, 5:00:13 AM
to vert.x


On Wednesday, 24 May 2017 09:34:27 UTC+1, Dmitrii Golub wrote:
> Why not?
> Usually, you put some kind of router in front of the application. Zero-downtime updates are not hard to achieve with nginx.

It's not possible with Vert.x alone (or Node.js alone, or anything else). Of course, if you put an intelligent proxy or load balancer in front that can count the requests and responses sent to each server, and switch servers over when there are none outstanding, then you can achieve what you want, but that's what was asked in the original question.

Tim Fox

May 24, 2017, 5:01:15 AM
to vert.x


^^ fat fingers - "that's not what was asked" :)

Tim Fox

May 24, 2017, 5:04:41 AM
to vert.x


So, in other words, what you are looking for must be a function of your load balancing layer (nginx or whatever), not a function of Vert.x. If you implement this correctly in your LB layer, then there's no need for Vert.x to implement anything (not that it could implement anything useful anyway). If your LB layer switches over servers in the pool correctly, then by the time it has switched you will know there are zero outstanding requests in Vert.x, so you can just shut Vert.x down straight away at that point without waiting for anything to finish. (The same would apply to Node.js and anything else you are using behind your LB, btw.) :)

Tim Fox

May 24, 2017, 5:08:35 AM
to vert.x


On Wednesday, 24 May 2017 09:34:27 UTC+1, Dmitrii Golub wrote:
> I have no idea how to handle this case:
> 1. The verticle receives SIGTERM.
> 2. rabbitMQClient stops receiving new messages.
> 3. Wait while the verticle consumes the last message.
> 4. Shut down the instance.
>
> You said that this is possible; could you please give me some directions?

Seems like a different question. Now you're asking about consuming messages from RabbitMQ; the original question was about how to cleanly process all HTTP requests before shutdown...

Dmitrii Golub

May 24, 2017, 6:57:27 PM
to vert.x
Yes, you are right that the original question, my question, and the RabbitMQ question are all different.
But they are about the same thing.
What's the way to stop an application gracefully, only after the very last in-progress request has finished?

A smart LB is not the solution, as you can respond to a request with 200 first, and that request can then trigger some heavy task in your verticle.
Graceful in that case means that you wait for that task to complete.

As for me, that code is part of a finance application, and I have no idea how to restart the consumer verticle.
It's crucial here not to interrupt an in-progress job.

Tim Fox

May 25, 2017, 3:44:01 AM
to vert.x


On Wednesday, 24 May 2017 23:57:27 UTC+1, Dmitrii Golub wrote:
> Yes, you are right that the original question, my question, and the RabbitMQ question are all different.
> But they are about the same thing.
> What's the way to stop an application gracefully, only after the very last in-progress request has finished?

This is impossible to do (and this is not Vert.x specific) without some kind of proxy/LB (or cooperation with the client), as mentioned before. Even if you shut down after processing the current requests on the server, there may well be a bunch more requests queued up on the wire or in transit which won't get processed. So the application exit won't be graceful: the client will lose some messages and/or get some errors.

This problem can only be solved by adding a proxy/LB between the client and the server. The LB needs to count the requests sent to each server and the responses received. When a server is taken out of the pool, the LB stops sending further requests to it and routes new requests to a different server; once all responses from the original server have been received, that server can be shut down. In this way you can seamlessly bring servers in and out of the cluster with no errors on either the client or the server side. I'm assuming that's what you want.
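
The bookkeeping described here (count requests sent and responses received per server, stop routing to a draining server, shut it down at zero outstanding) can be modelled in a few lines. The following is a minimal sketch in plain Java; DrainingPool and its methods are hypothetical and do not reflect how nginx or any real load balancer is implemented or configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of the per-server accounting a draining load balancer needs. */
final class DrainingPool {
    private static final class Backend {
        int outstanding;   // requests sent minus responses received
        boolean draining;  // true once the server has been taken out of the pool
    }

    private final Map<String, Backend> backends = new LinkedHashMap<>();

    synchronized void add(String name) {
        backends.put(name, new Backend());
    }

    /** Picks the least-loaded non-draining backend and counts the request against it. */
    synchronized Optional<String> route() {
        String best = null;
        for (Map.Entry<String, Backend> entry : backends.entrySet()) {
            Backend candidate = entry.getValue();
            if (candidate.draining) {
                continue;
            }
            if (best == null || candidate.outstanding < backends.get(best).outstanding) {
                best = entry.getKey();
            }
        }
        if (best != null) {
            backends.get(best).outstanding++;
        }
        return Optional.ofNullable(best);
    }

    /** Records that a response came back from the named backend. */
    synchronized void responseReceived(String name) {
        backends.get(name).outstanding--;
    }

    /** Takes the backend out of rotation; requests already sent to it keep counting. */
    synchronized void drain(String name) {
        backends.get(name).draining = true;
    }

    /** True once the backend is draining and every request sent to it has been answered. */
    synchronized boolean safeToShutDown(String name) {
        Backend backend = backends.get(name);
        return backend.draining && backend.outstanding == 0;
    }
}
```

Because the balancer itself decides when a request counts as sent, there is no window in which a request is on the wire to the draining server without being counted, which is the gap the server cannot close alone.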


 

> A smart LB is not the solution, as you can respond to a request with 200 first, and that request can then trigger some heavy task in your verticle.
> Graceful in that case means that you wait for that task to complete.
>
> As for me, that code is part of a finance application, and I have no idea how to restart the consumer verticle.
> It's crucial here not to interrupt an in-progress job.

That worries me: you should always design your application with failure in mind. Failure is normal! Even if you do manage to implement true graceful shutdown (with an LB), failures can still occur, e.g. machines can fail, power can go, the network can be lost, disks can fail. So it's always going to be possible for your processing to fail before it's complete, and your application must be able to recover and restart processing when it comes back up. So you'll need to write the recovery code *anyway* to cope with failures and resume processing. And if you write that recovery code anyway, you don't need graceful shutdown :)
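
One common shape for such recovery code is to record each job as completed only after its work is done, and on restart to skip whatever is already recorded. The following is a minimal sketch in plain Java; ResumableProcessor is a hypothetical class, and the Set passed in stands in for durable storage such as a database table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical processor that can be re-run after a crash without repeating completed jobs. */
final class ResumableProcessor {
    private final Set<String> completed; // stand-in for a durable store of finished job ids

    ResumableProcessor(Set<String> completedStore) {
        this.completed = completedStore;
    }

    /** Processes every job not yet recorded as completed; returns the ids handled in this run. */
    List<String> run(List<String> jobIds, Consumer<String> work) {
        List<String> processedNow = new ArrayList<>();
        for (String id : jobIds) {
            if (completed.contains(id)) {
                continue; // finished before the restart
            }
            work.accept(id);   // if this throws, the id is not recorded and will be retried
            completed.add(id); // record completion only after the work succeeded
            processedNow.add(id);
        }
        return processedNow;
    }
}
```

A crash between the work and the completion record means the job runs again on restart, so the work itself has to be idempotent, or the work and the record have to be committed in one transaction.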