handler vs blockingHandler behavior


Igor Spasić

Jun 10, 2015, 7:07:20 PM
to ve...@googlegroups.com
Sorry, but this is bothering me. (3.0-snapshot)

Here is a reproducible example: https://github.com/igorspasic/vertx

We have /foo mapped to a handler that uses a _blocking_ HTTP client to fetch /bar, and /bar mapped to a simple handler that returns a string. In other words, /foo calls /bar and returns both results.

It turns out that it matters a great deal how the handlers get registered: with blockingHandler() or with handler(). That gives 3 scenarios: (1) both registered with handler(), (2) blockingHandler() for /foo and handler() for /bar, (3) both registered with blockingHandler().

What bothers me is that I believe all 3 scenarios should work the same, with no exceptions. Sure, the performance will not be the same under load, but the execution should be the same, right? However, this is not what is going on.

Please, would you enlighten me here? Why do these 3 scenarios behave differently?

Igor Spasić

Jun 10, 2015, 7:10:16 PM
to ve...@googlegroups.com
Please just ignore the fact that we are using a 3rd-party blocking client.

Tim Fox

Jun 11, 2015, 3:46:48 AM
to ve...@googlegroups.com
1. If you are getting "thread is blocked" warnings, you can do the same thing you did before to determine why it is blocked (look at the stack trace, get a thread dump).

3. Something is throwing an exception from a handler. If you add a failure handler, as I suggested before, you will be able to catch and deal with it.

I am travelling today and tomorrow, so I won't have time to investigate.



On 11/06/15 00:07, Igor Spasić wrote:
> Sorry, this is bothering me. 3.0-snapshot
>
> Here is reproducable: https://github.com/igorspasic/vertx
>
> we have /foo mapped to a handler that uses _blocking_ http client to fetch
> /bar
> we have /bar mapped to a simple handler that returns string.
>
> in other words, /foo is calling /bar and returns both results.
>
> It happened that it is very important how they get registered: either with
> blockingHandler() or handler().
>
>
> What bothers me is that I believe that all 3 scenarios should *work the
> same and to have no exceptions*. Sure, the performances will not be the

Tim Fox

Jun 11, 2015, 3:53:57 AM
to ve...@googlegroups.com
Also, please try with the latest vertx-web master - I have made a small commit to make unhandled exception handling cleaner.

Igor Spasić

Jun 11, 2015, 8:42:36 AM
to ve...@googlegroups.com
In scenario 1, the thread is blocked in "java.net.SocketInputStream.socketRead0(Native Method)".
In scenario 3, something is blocking too; I guess the same thing as in #1.

The key here is that the code is the same for all 3 scenarios, except for the use of handler() or blockingHandler() - and it only works in scenario 2.

Jez P

Jun 11, 2015, 9:03:16 AM
to ve...@googlegroups.com
In scenario 1, you have blocking code on the event loop. What do you expect to happen in that case? You're being warned that you've blocked the event loop thread, and that's quite correct. I don't understand why you think the warnings you're getting in scenario 1 shouldn't occur.

Igor Spasić

Jun 11, 2015, 12:13:50 PM
to ve...@googlegroups.com
Because we are sending the request to the same Vert.x server, over the localhost loopback, and it always gets locked. Sending a request to yourself should definitely take < 2 seconds - and even when I extend this time limit, the lock still happens.

Moreover, why does the same code fail when executed from blocking handlers?

Jez P

Jun 11, 2015, 2:05:07 PM
to ve...@googlegroups.com
Don't know about the blocking handlers part, but if you run blocking code on the event loop, I suspect the outcome isn't always predictable. You said yourself the thread is blocked in a net call, so you know full well it's blocking because of your choice of blocking HttpClient. The golden rule of Vert.x is never to put blocking code on the event loop, and you're doing exactly that. I think your "scenario 1" is essentially a red herring. The only potential concern you should have is when both handlers are blockingHandlers, and I'll leave that to someone from the Vert.x team to answer.

Have you actually tried debugging the two blockingHandler scenario to investigate what's going on? 

Jez P

Jun 11, 2015, 2:12:08 PM
to ve...@googlegroups.com
In your scenario 1, your http client is actually blocking the only thread which your handler can use to respond to the request your thread has made. Think about what that means.

Igor Spasić

Jun 11, 2015, 4:24:27 PM
to ve...@googlegroups.com
Thanks for helping!

Regarding the blocking in scenario #1 - yeah, the usage is blocking, but due to its nature (sending a request to localhost) it should finish fast. I think of it like some complex calculation: it blocks execution for a while, but that should not be critical for this specific test. I am NOT saying this is the way to go! This is just a reproducible example for Tim :)

> In your scenario 1, your http client is actually blocking the only thread which your handler can use to respond to the request your thread has made. Think about what that means.

Help me understand this, please. When I start the example, I see there are 4 event-loop threads. So when the /foo request comes in, one thread gets blocked. This thread fires another request, to /bar, which can still be processed, since 3 event-loop threads remain, right?

Nevertheless, I created a Foo2Handler that uses Vert.x's own HTTP client asynchronously. I still don't see it working (I pushed the changes), but maybe I am missing something.

I also had a problem using an async HTTP client that uses its own executor, i.e. thread pool.

Sorry if I am missing something, but this all leaves me a little puzzled...

Jez P

Jun 11, 2015, 4:57:19 PM
to ve...@googlegroups.com
Each verticle instance has only a single event loop thread - Vert.x guarantees that any individual verticle instance is tied to a single event loop. This is key to the way it works. You deploy a single instance of your verticle, so it is single-threaded. While processing the /foo request, you block the only event loop thread which can run that verticle; therefore the /bar request cannot be processed, because it is waiting on the same thread. Both handlers are in the same verticle.

I suspect (but have not tried) that if you deployed two instances of your verticle, scenario 1 would work.
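The deadlock Jez describes can be sketched without Vert.x at all. The snippet below is a simplified model (not Vert.x code): a single-threaded executor stands in for the one event loop shared by both handlers, and the "foo" task blocks waiting on a "bar" task that can only ever run on the thread "foo" is occupying.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class EventLoopBlockDemo {
    public static void main(String[] args) throws Exception {
        // One thread stands in for the single event loop that serves
        // both the /foo and /bar handlers of one verticle instance.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();

        Future<String> foo = eventLoop.submit(() -> {
            // "/foo": queue the "/bar" work on the same loop...
            Future<String> bar = eventLoop.submit(() -> "bar-response");
            // ...then block waiting for it. "/bar" can only run on the
            // thread we are occupying right now, so this times out.
            return bar.get(1, TimeUnit.SECONDS);
        });

        try {
            System.out.println(foo.get());
        } catch (ExecutionException e) {
            // bar.get timed out inside the "foo" task: a deadlock.
            System.out.println("deadlocked: " + e.getCause().getClass().getSimpleName());
        } finally {
            eventLoop.shutdownNow();
        }
    }
}
```

Replacing the single-threaded executor with a pool of two threads makes the same code complete, which is exactly why deploying the handlers on different event loops would be expected to help.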

Igor Spasić

Jun 11, 2015, 5:14:20 PM
to ve...@googlegroups.com
I am already deploying 2 verticles (in AppServer).

Jez P

Jun 11, 2015, 5:21:39 PM
to ve...@googlegroups.com
Well, I guess I was wrong that it would work - but anyway, this has taken up far too much of my time for today. I guess you'll just have to try some debugging, or hope someone else offers help.

Jez P

Jun 11, 2015, 5:28:20 PM
to ve...@googlegroups.com
One helpful hint for you: it looks like you don't actually end your HttpClientRequest (from the Vert.x client), so as far as I can see it never gets sent - and you will never get the reply, because you don't actually complete the request.

Igor Spasić

Jun 11, 2015, 5:51:50 PM
to ve...@googlegroups.com
You are right about this - ending the request actually makes a difference!

I would still like to figure out what's going on :))

Igor Spasić

Jun 12, 2015, 3:02:50 AM
to ve...@googlegroups.com
To recap:

Both the 1st and 3rd scenarios happen because /foo blocks when calling /bar.

What I do not understand is: why does calling /bar from the /foo handler hang? We have 2 verticles and 4 event-loop threads; shouldn't this go through?

Clement Escoffier

Jun 12, 2015, 3:27:37 AM
to ve...@googlegroups.com, Igor Spasić
Hi,

I didn’t follow everything, so my answer is probably off. 

You have two verticles, but (if I'm not missing something) you have only one instance of each, right? Each instance is always invoked by the same event loop, regardless of the number of event loops you have. So, if a verticle blocks the event loop, it won't be called again until the event loop is freed.

Clement

Igor Spasić

Jun 12, 2015, 3:50:57 AM
to ve...@googlegroups.com, igor....@gmail.com
You are actually very close :) I have 2 identical verticle instances deployed. I understand that each instance is invoked by the same event loop - but does it have to be the same event loop for both instances?

Here is what I expected to happen:

+ /foo comes in; Vert.x picks verticle instance 1 and starts serving it on, e.g., EventLoop#1
+ /foo calls /bar and waits
+ /bar comes in; Vert.x picks verticle instance 2(?) and starts serving it on either A) EventLoop#1 or B) a new event loop, e.g. EventLoop#2(?)

So either:

1) Vert.x picks the same instance of the verticle, and therefore uses the same EventLoop (why?); or
2) Vert.x picks the other instance of the verticle, but starts serving it on the same EventLoop (why?).

Does anyone know how to figure this out or confirm it?

Also, from a user's 1000-foot perspective - shouldn't this scenario work???

Jez P

Jun 12, 2015, 4:09:03 AM
to ve...@googlegroups.com
I don't know if you're correct in your statement that you get a "Response already written" in scenario 3 because /foo blocks when calling /bar (I assume that using a blockingHandler means the code doesn't run on the event loop thread but on a worker thread), but I'm sure that /foo blocking when calling /bar is what happens in scenario 1.

I would expect both instances to be on separate event loop threads (though I don't know how vert.x allocates event loop threads), and so I would expect the /bar request to be handled by the other instance of the verticle. Can you prove that two instances are definitely starting? (Your code looks as though it should work, but that's not the same thing as some debug output showing that multiple instances are actually running.) If it turned out that only one instance was running, that would completely explain scenario 1, but not (AIUI) the "response already written" you see in scenario 3.

Igor Spasić

Jun 12, 2015, 5:03:38 AM
to ve...@googlegroups.com
This is exactly what I expect - both instances to be on separate event loop threads - even though we don't know how Vert.x allocates them; I am expecting it will not use a verticle instance that is currently being processed by an event loop.

I will try to check what you are saying.

For scenario 3 - when I add a failureHandler, I see that the statusCode is 408; this failure is set by the TimeoutHandler. But the core issue is the same: (b)locking the thread, as in scenario 1.

Tim Fox

Jun 13, 2015, 3:52:52 AM
to ve...@googlegroups.com
Scenario 3 times out because you are creating two blocking handlers sequentially on the same verticle.

Blocking handlers use executeBlocking, which is ordered (this has been discussed a couple of times recently on this group), so the second handler won't execute until the first one is complete; but the first one won't complete until it gets a reply from the second one, so you have a deadlock.

In your case you don't care that it's ordered, so I can add a flag when creating the blocking handler to disable ordering, which should allow this case to work.
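Tim's ordered-executeBlocking deadlock, and the effect of the proposed ordering flag, can be mimicked with plain Java executors (a sketch of the idea, not the Vert.x implementation): a single-threaded executor serializes tasks the way ordered executeBlocking does, while a two-thread pool behaves like unordered execution.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class OrderedBlockingDemo {
    // Runs "foo", which waits for "bar"; both are submitted to the same
    // executor. A serialized (ordered) executor deadlocks; a 2-thread
    // (unordered) pool completes.
    static String run(ExecutorService exec) throws Exception {
        Future<String> foo = exec.submit(() -> {
            Future<String> bar = exec.submit(() -> "bar");
            try {
                // "foo" blocks here until "bar" replies.
                return "foo+" + bar.get(500, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                return "deadlock"; // "bar" never got to run
            }
        });
        String result = foo.get();
        exec.shutdownNow();
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Ordered: like two blockingHandlers sharing one ordered queue.
        System.out.println(run(Executors.newSingleThreadExecutor()));
        // Unordered: like executeBlocking with ordering disabled.
        System.out.println(run(Executors.newFixedThreadPool(2)));
    }
}
```

The contrast between the two runs is exactly why disabling ordering would let Igor's scenario 3 complete.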

Igor Spasić

Jun 13, 2015, 3:22:27 PM
to ve...@googlegroups.com
Cool, thanks for the explanation; the flag would do the trick.

But what happens in scenario #1, when using just handler()?

Tim Fox

Jun 13, 2015, 3:32:18 PM
to ve...@googlegroups.com
On 13/06/15 20:22, Igor Spasić wrote:
> Cool, thanx for explanation, and flag would do the trick.
>
> But what happens in Scenario #1 - when using just handlers()?

You're blocking an event loop - which breaks the golden rule. Don't do this!

Igor Spasić

Jun 13, 2015, 4:14:07 PM
to ve...@googlegroups.com


> You're blocking an event loop - which is breaking the golden rule.. don't do this!

I am aware of that - but sending a request on localhost should be very fast, and things don't get better when I increase the timeouts. Moreover, if I replace the blocking request with e.g. Thread.sleep(500), which is also blocking, everything works.

Jez P

Jun 13, 2015, 5:23:11 PM
to ve...@googlegroups.com
Your theory about Thread.sleep being the same as your request is only valid if you know for sure that your /bar handler will be processed by a different event loop thread from your /foo handler (i.e. each request will be processed by a different instance of the same verticle). You're assuming that, but I don't think you can guarantee it.

I think your real question is this:

You create two instances of the same HttpServer-deploying verticle, each of which has two handlers (one handling /foo, the other handling /bar). When you hit /foo, its handler makes a blocking request to /bar. You expect the latter request to be handled by the other instance, which we think should be on a different event loop thread and would therefore complete in finite time (this is the scenario where Thread.sleep would be comparable). However, you don't see the response.

If for some reason both instances of the verticle end up on the same event loop thread, then the behaviour you observe (blocked indefinitely) is exactly what would be expected: that thread is blocked waiting on a response, which means it can't run the handler for the very request whose response is being awaited.

So I think your real question is: why don't the two separate requests (one from the browser, the other from the internal HTTP client) hit separate verticle instances - or, if they do hit separate verticle instances, why don't they hit different event loop threads?

Jez P

Jun 13, 2015, 5:25:19 PM
to ve...@googlegroups.com
Have you tried having your verticle instances log the thread id during start(), to make sure they are genuinely on different event loop threads? (I assume they start on the event loop thread they are assigned to, but I might be wrong about that.)
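Something like the following helper could be called from each verticle's start() to record which thread (and hence which event loop) the instance is tied to. This is a hypothetical sketch: the `logStart` name and the id parameter are made up for illustration, and only the plain-Java thread-logging part is shown.

```java
public class ThreadLogDemo {
    // Hypothetical helper: call from a verticle's start() to log which
    // thread the instance starts on, in the same format Igor's output uses.
    static void logStart(int verticleId) {
        Thread t = Thread.currentThread();
        System.out.println("Verticle: " + verticleId + " started on "
                + t.getId() + " (" + t.getName() + ")");
    }

    public static void main(String[] args) {
        logStart(1); // run standalone this prints e.g. "Verticle: 1 started on 1 (main)"
    }
}
```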

Igor Spasić

Jun 13, 2015, 5:44:36 PM
to ve...@googlegroups.com
Wow Jez P - that is exactly the question; I'm not fluent enough in English to express it as nicely as you did. Thank you very, very much!

Igor Spasić

Jun 13, 2015, 5:55:07 PM
to ve...@googlegroups.com
Sure, here is the output:

Verticle #1
Verticle #2
Verticle: 1 started on 17 (vert.x-eventloop-thread-2)
Verticle: 2 started on 19 (vert.x-eventloop-thread-4)
Server is up


FooHandler.handle on 17 (vert.x-eventloop-thread-2)
Jun 13, 2015 11:53:24 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 2581 ms, time limit is 2000
Jun 13, 2015 11:53:25 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 3587 ms, time limit is 2000
Jun 13, 2015 11:53:26 PM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-2,5,main] has been blo


It looks like the same event loop thread used for starting the verticle is also the one used for executing its handlers.

Tim Fox

Jun 14, 2015, 4:24:58 AM
to ve...@googlegroups.com
The event loop used to handle the connection is determined by Netty, not Vert.x.

Tim Fox

Jun 14, 2015, 4:47:29 AM
to ve...@googlegroups.com
But in any case, I don't think it's relevant which event loop Netty chooses... If you use a blocking handler for /foo and a standard handler for /bar (the scenario which works) and add a little handler that matches all routes:

router.route().handler(rc -> {
  System.out.println("In instance " + this + " on event loop " + Thread.currentThread());
  rc.next();
});

you can see that when /foo and /bar are handled, there are usually different event loops.

In this case it seems that Netty is not handling the request for /bar, because you have blocked the event loop for /foo. To figure out exactly *why* this is happening, you would probably need to go deep into the workings of Netty (or ask on the Netty group).

The bottom line is: it's happening because you blocked the event loop. If you block an event loop, all bets are off, and you may see behaviour that's very hard to explain.

Igor Spasić

Jun 15, 2015, 7:10:15 AM
to ve...@googlegroups.com
> The bottom line is, it's happening because you blocked the event loop. If you block an event loop all bets are off and you may see behaviour that's very hard to explain.

This is a bit too fragile, imho - the server has the capacity to handle the situation, and yet it refuses to do so. I would expect a more robust approach, where the software tries to resolve the conflict :)

Thank you for your time; I will still try to figure out what is going on. In our case, Vert.x is used indirectly by our users, hence I must be aware of all the crazy things that may happen.

Tim Fox

Jun 15, 2015, 7:21:04 AM
to ve...@googlegroups.com
On 15/06/15 12:10, Igor Spasić wrote:
> This is bit too fragile imho - server has capacity of handling the situation, and yet it refuses to do so. I would expect more robust approach, where software tries to resolve conflict :)

I'm not sure I understand what you mean... Blocking an event loop is strictly prohibited in Vert.x, and we go to great lengths to explain that you must not do this, and that lots of bad things might happen if you do ;)

We even check whether you're doing this, and if you are, we automatically tell you the exact line of code where it is occurring, so you can go and fix the code.

> Thank you for your time, I will still try to figure what is going on. In our case, vertx is indirectly used by users, hence I must be aware of all the crazy thing that may happens.

Well, good luck with that. I don't even know all the things that might happen.

But you don't need to know - that's why we check; if your users do something stupid, they will be told and given a stack trace that shows the exact line of code where they are blocking.

Igor Spasić

Jun 15, 2015, 8:04:44 AM
to ve...@googlegroups.com

Sure, sure - I am just personally convinced that this example should work at least somehow :))))

One crazy question - since Vert.x is detecting the blocking point (btw, I don't see any explicit line number, just the usual stack trace?) - would it make sense to 'upgrade' the handler that blocks into a blocking handler from that point on, and repeat the request? And vice versa: to downgrade some blocking handler to a plain event-loop one, if it executes in less than 2000 ms... Optionally, of course.

wdyt?

Tim Fox

Jun 15, 2015, 8:40:03 AM
to ve...@googlegroups.com
On 15/06/15 13:04, Igor Spasić wrote:
> Sure, sure - I am just personally convinced that this example should work at least somehow :))))

When you block an event loop, Vert.x provides no guarantees that things will work.

> One crazy question - since vertx is detecting the blocking point (btw, I dont see any explicit line number, except usual stack trace?)

The line number is in the stack trace. If you post it here, I will point you at it.

> - would it make sense to 'upgrade' the handler of that vertx to being blocking vertx from that point on, and repeat the request again? And vice-versa; to downgrade some blocking vertx to just 'event' one, if it is being executed in less then 2000ms... Of course, optionally.

That's been discussed before, but it's tricky.

Also... by the time the thread is blocked, there's not much we can do. I guess we could interrupt the thread, but there's no guarantee that would cause the handler to exit.

Igor Spasić

Jun 15, 2015, 7:13:45 PM
to ve...@googlegroups.com
Tim, thanks for your time - really appreciated!

Btw, the #executeBlocking() method does not have the boolean flag parameter any more? (3.0.0-snapshot)

> When you block an event loop Vert.x provides no guarantees that things will work.

As even the documentation says, there is no universal threshold at which we can say a thread is blocking - at least not so much that Vert.x starts to "provide no guarantees". It may depend on the traffic we want to handle, etc. So I wonder: if my blocking operation takes < 0.5 seconds in the worst case, and the threshold timeout is 2 seconds, would this be blocking beyond the guarantees? I see that the documentation says:

    If you block all of the event loops in Vertx instance then your application will grind to a complete halt!

I totally understand this: all threads are in use, hence the halt. But my example blocks only 1 thread, for a short amount of time (it must be < 500 ms!). So the part I don't understand is how blocking for e.g. 500 ms max can produce such unpredictable results. I might be an idiot, but I simply don't understand this :)

I know you have a lot of work, so sorry for bugging you with this; in the end I will give users a flag for the settings, so they can break their heads over it :)

Tim Fox

Jun 16, 2015, 5:16:40 AM
to ve...@googlegroups.com
So I spent an hour with the debugger looking into the inner workings of Netty, to give you a more detailed explanation of what is going on... (You could do the same with a debugger, if you are interested.)

You are deploying N instances of a verticle which all start an HTTP server on port 8080. Under the covers, Vert.x only creates a single HTTP server which actually listens on that address, and round-robins between the different instances.

This single server has a single ServerSocket which listens on port 8080. When this is first created it is assigned an event loop - call it EventLoop0. At this point there is only one event loop, as it's the first server deployed that actually gets to listen.

When Netty chooses an event loop it uses round robin, but there is only one event loop to choose from, so EventLoop0 is chosen; the position is incremented, and it wraps back to zero as there is only one event loop.

The other verticles now deploy and register their event loops. Now there are N event loops to choose from, but the pointer is still at zero.

Now the verticles are deployed and the server is listening.

We now make a request from a browser to http://localhost:8080/foo

The ServerSocket accepts the connection on EventLoop0 (that's the acceptor loop). It now needs to find an event loop to associate with the connection, so it calls next() on the NioEventLoopGroup. The pointer is at zero, so EventLoop0 gets chosen again - this is the event loop which will forever more handle any data for that connection.

Now the foo handler gets called, on EventLoop0.

The foo handler uses jodd to make a blocking HTTP request to http://localhost:8080/bar

This is the same address as before, so we hit the same single ServerSocket listening on that address - but that ServerSocket is assigned to EventLoop0, which is exactly the loop we are blocking, so the connection will never get accepted! You have a deadlock.
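The pointer behaviour described above can be modelled in a few lines. This is a toy model of the round-robin chooser (an assumption-laden simplification, not Netty's actual code): pick the loop at the pointer, then advance the pointer modulo the number of currently registered loops.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinDemo {
    // Toy model of a round-robin event loop chooser.
    static class LoopGroup {
        final List<String> loops = new ArrayList<>();
        int idx = 0;

        String next() {
            String chosen = loops.get(idx);
            idx = (idx + 1) % loops.size(); // advance, wrapping on current size
            return chosen;
        }
    }

    public static void main(String[] args) {
        LoopGroup group = new LoopGroup();

        // First verticle deploys: only EventLoop0 exists, and it becomes
        // the acceptor loop; the pointer wraps straight back to 0.
        group.loops.add("EventLoop0");
        String acceptor = group.next();

        // The other verticles register their loops; the pointer is still 0.
        group.loops.add("EventLoop1");
        group.loops.add("EventLoop2");
        group.loops.add("EventLoop3");

        // The /foo connection is accepted and assigned a loop: the pointer
        // is still at 0, so EventLoop0 is chosen again.
        String connectionLoop = group.next();

        System.out.println(acceptor + " " + connectionLoop);
    }
}
```

Under this model both the acceptor and the /foo connection land on EventLoop0, which is why blocking it deadlocks the internal request to /bar.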

I hope that is clearer now :)

I'll say it again - the bottom line is: don't block the event loop. If you follow that simple rule, you will be ok.

There is no reason for people to "break their heads" over this. The rule is simple, and Vert.x will tell you *exactly* where you are blocking if you do it.

fwiw, you won't get any such niceness in other async systems (e.g. pure Netty or Node.js), which will just hang in this situation.

Here's your stack showing the exact line of code, as I believe you missed this before:

WARNING: Thread Thread[vert.x-eventloop-thread-6,5,main] has been blocked for 6779 ms, time limit is 2000
io.vertx.core.VertxException: Thread blocked
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:170)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at jodd.http.HttpResponse.readFrom(HttpResponse.java:156)
    at jodd.http.HttpRequest.send(HttpRequest.java:657)
    at vertx.FooHandler.handle(FooHandler.java:18)  <!-------- THIS IS WHERE YOUR FOOHANDLER SENDS THE JODD REQUEST
    at vertx.FooHandler.handle(FooHandler.java:9)
    at io.vertx.ext.web.impl.RouteImpl.handleContext(RouteImpl.java:213)
    at io.vertx.ext.web.impl.RoutingContextImplBase.iterateNext(RoutingContextImplBase.java:67)
    at io.vertx.ext.web.impl.RoutingContextImpl.next(RoutingContextImpl.java:96)
    at vertx.ServerVerticle.lambda$setupRouter$2(ServerVerticle.java:56)

<snip>

Imho, it doesn't get much more user-friendly than that :)

Igor Spasić

Jun 17, 2015, 6:22:30 PM
to ve...@googlegroups.com
Thank you very much, Tim!!!! I really appreciate your answer; things are getting clearer now...

Starting from the recent milestone, things started working for us.

Thank you again!