Worker Threads vs Worker Verticles. App Paradigm/Design.


Danny

Apr 12, 2021, 11:39:12 AM
to vert.x
Hello! I'm really new to Vert.x and have some questions about the proper use of worker threads from the worker pool versus worker verticles for my Java app.

Basically, I've been testing out deploying worker verticles, and I'm confused about their differences and why they affect my app so differently.

Currently my app is setup to execute roughly like this:
  1. App starts up and creates the Vertx instance with its options; the only non-time-related one is the worker pool size, set to 20
  2. From here we deploy a new verticle that runs createHttpServer with separate routes
  3. When we get a client request, it goes to our routes and their handler functions, which then run the requests through our entire code logic. I believe this is simply using one (or some, not sure) of the worker threads from the vertx worker pool.
  4. After the first big synchronous block of code runs, we enter the point where we're waiting on DB queries and futures to complete. Our DB clients are created with JDBCClient.createNonShared(vertx, JsonObjectOfConfigurations)
  5. Once we receive them all, we finish our code and respond to the user

Things to note:
  • We only deploy a single verticle throughout the whole app's lifetime, the one that creates the server
  • All logs show [vert.x-eventloop-thread-0] despite it still running requests concurrently
  • If I set up event handlers, such as routingContext.request().connection().closeHandler(...), they won't execute until step 3 moves on to step 4. (Note: I create this handler during the execution of each route's logic, from within that request's routingContext.) Presumably this is because the event loop has to finish its queue before firing events under this implementation. This is actually what led me to try the second approach
  • If a request comes in during a huge blocking operation, it isn't started until said operation has finished
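
The setup described above can be sketched roughly like this (a hedged sketch, not the poster's actual code; class and method names like ServerVerticle and runHugeBlockingLogic are illustrative, assuming Vert.x 3.x with vertx-web on the classpath):

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;
import io.vertx.ext.web.Router;

public class ServerVerticle extends AbstractVerticle {
  @Override
  public void start() {
    Router router = Router.router(vertx);
    router.get("/report").handler(ctx -> {
      // The big synchronous block runs here, on vert.x-eventloop-thread-0.
      // While it runs, no other events on this event loop can fire:
      // not closeHandlers, and not newly arrived requests.
      String result = runHugeBlockingLogic();
      ctx.response().end(result);
    });
    vertx.createHttpServer().requestHandler(router).listen(8080);
  }

  private String runHugeBlockingLogic() { /* placeholder */ return "done"; }

  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx(new VertxOptions().setWorkerPoolSize(20));
    vertx.deployVerticle(new ServerVerticle());
  }
}
```

This matches the symptoms in the bullets: the handler body occupies the single event-loop thread, so everything queued behind it waits.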

Under the new proposed approach, it executes roughly like this:
  1. App starts up and creates the Vertx instance with its options; the only non-time-related one is the worker pool size, set to 20
  2. From here we deploy a new verticle that runs createHttpServer with separate routes
  3. Same logic, but instead of letting vertx use a worker thread, I'm specifically deploying a worker verticle to run the entire logic instead. The only deployment option I explicitly set here is setWorker(true)
  4. After the first big synchronous block of code runs, we enter the point where we're waiting on DB queries and futures to complete. Our DB clients are created with JDBCClient.createNonShared(vertx, JsonObjectOfConfigurations)
  5. Once we receive them all, we finish our code and respond to the user

Things to note about this implementation:
  • That connection().closeHandler(...) mentioned above actually executes in real time now
  • Now only the first few app logs mention [vert.x-eventloop-thread-0], while most others mention [vert.x-worker-thread-X]
  • Now if a request comes in during a huge blocking operation, a different worker simply takes it up and begins its processing in parallel
  • It increased our app's performance under concurrent load by about 500%, which honestly made me skeptical, but our data quality and validation tests seem to be checking out too
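
The worker-verticle-per-request pattern described above might look roughly like this (again a hedged sketch with illustrative names, assuming Vert.x 3.x):

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.DeploymentOptions;
import io.vertx.ext.web.RoutingContext;

public class RequestVerticle extends AbstractVerticle {
  private final RoutingContext ctx;

  public RequestVerticle(RoutingContext ctx) { this.ctx = ctx; }

  @Override
  public void start() {
    // start() runs on vert.x-worker-thread-X, so blocking here does not
    // stall the event loop; other requests proceed in parallel.
    String result = runHugeBlockingLogic();
    ctx.response().end(result);
    // Undeploy after responding so instances don't accumulate,
    // one per request, for the app's lifetime.
    vertx.undeploy(deploymentID());
  }

  private String runHugeBlockingLogic() { /* placeholder */ return "done"; }
}

// In the route handler:
// router.get("/report").handler(ctx ->
//     vertx.deployVerticle(new RequestVerticle(ctx),
//         new DeploymentOptions().setWorker(true)));
```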

I believe the source of the initial implementation's issues stems from putting huge blocking processes on the event loop. I'm not sure why the other 7 event-loop threads aren't being used, so that may be another solution to this issue, but I don't know how to utilize them.

Is this new implementation ok? I've read that worker threads are ideal for blocking processes, but I've had answers on Stack Overflow tell me that this is incorrect because we shouldn't deploy verticles per request. I believe this is a sort of niche case where we have huge blocking processes per request.

Also I noticed that after some time I'll see a lot of single ERROR lines in the logs, and I'm assuming these are the old verticles decaying out of memory. So I'm guessing I need to undeploy each one at the end of the request life to avoid undefined behavior/errors, right?

Thanks for any help or clarifications!

 




Parit Bansal

Apr 17, 2021, 1:12:27 AM
to ve...@googlegroups.com
Hi,

Here are my 2 cents. Callbacks are called on the same context as the one in which they were created, i.e. your routing handlers are by default called on the event loop. For long blocking processes, either use an async API that returns futures, or use vertx.executeBlocking, which runs the blocking code (the first argument of the method) on a worker thread and the result handler (the second argument) on the context from which it was called (usually the event loop). No need to create worker verticles unless you want to decouple code.
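
A minimal sketch of the executeBlocking pattern described here, assuming Vert.x 3.x and an illustrative runHugeBlockingLogic() method; this fragment belongs inside a verticle's start(), where `vertx` is available:

```java
import io.vertx.ext.web.Router;

Router router = Router.router(vertx);
router.get("/report").handler(ctx ->
    vertx.<String>executeBlocking(promise -> {
      // Runs on a thread from the worker pool.
      promise.complete(runHugeBlockingLogic());
    }, res -> {
      // Runs back on the calling context (here, the event loop).
      if (res.succeeded()) ctx.response().end(res.result());
      else ctx.fail(res.cause());
    }));
```

The route handler itself stays on the event loop and returns immediately; only the body of the first lambda occupies a worker thread.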

Hope this explains.


- Parit


Danny

Apr 19, 2021, 1:42:31 PM
to vert.x
Ahhh ok gotcha. Thanks so much!
Also, I forget where I'd read this, but I'd seen people generally recommend against using executeBlocking because it has problems of its own (again, I didn't test this myself; it was just the consensus I saw on random forums and such). Also, some of my blocking calls take minutes at a time, and I read that this could cause errors with executeBlocking.
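
For reference, the "errors" with very long blocking calls are typically warnings from Vert.x's blocked-thread checker, which logs once a worker thread runs past its limit (60 seconds by default). That limit is configurable; a hedged sketch, with an assumed 10-minute budget:

```java
import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;
import java.util.concurrent.TimeUnit;

// Raise the worker execute-time limit so multi-minute blocking calls
// don't trip the blocked-thread checker (the TimeUnit setter is
// available since Vert.x 3.6).
Vertx vertx = Vertx.vertx(new VertxOptions()
    .setMaxWorkerExecuteTime(10)
    .setMaxWorkerExecuteTimeUnit(TimeUnit.MINUTES));
```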
And could you expand on the decoupling concept, specifically how worker verticles help with decoupling? As I understand it, worker verticles are sort of just supposed to act as a pool for when you need blocking code to run, or anything else that would interfere with the event loop. executeBlocking does the same thing, but verticles are long-lived and can be reused. Both use worker threads.

If I'm misunderstanding please let me know!

Thanks again for your response!

Parit Bansal

Apr 21, 2021, 4:31:51 AM
to ve...@googlegroups.com
Hello,

Sorry for the late reply.

I am not sure why executeBlocking would be a problem. I remember once going into the Vert.x code, and it was also using executeBlocking internally. The only thing I can think of is that over-using executeBlocking goes against the philosophy of developing async message-passing systems. Think of it this way: if we turn every HTTP request into something handled by executeBlocking, then the whole system wouldn't be very different from servlets, which work on a one-thread-per-request paradigm. Therefore, the recommendation is to break the codebase into independently scaling units (read: verticles) that collaborate over the event bus to serve a request. Using verticles (and Vert.x), this can be done seamlessly within a JVM or across JVMs. So scaling gets sorted.
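
The verticle-collaboration pattern described here can be sketched like this (a hedged sketch; the address "app.heavy-work" and all names are illustrative, assuming Vert.x 3.x):

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.DeploymentOptions;

// A worker verticle that consumes heavy jobs from the event bus.
public class HeavyWorkVerticle extends AbstractVerticle {
  @Override
  public void start() {
    vertx.eventBus().<String>consumer("app.heavy-work", msg ->
        msg.reply(runHugeBlockingLogic(msg.body()))); // on a worker thread
  }

  private String runHugeBlockingLogic(String input) { return input + ":done"; }
}

// Deployed as an independently scalable unit:
// vertx.deployVerticle("com.example.HeavyWorkVerticle",
//     new DeploymentOptions().setWorker(true).setInstances(4));
//
// The HTTP verticle stays non-blocking and collaborates over the bus:
// vertx.eventBus().<String>request("app.heavy-work", payload, ar -> {
//   if (ar.succeeded()) ctx.response().end(ar.result().body());
//   else ctx.fail(ar.cause());
// });
```

Scaling is then a matter of adjusting setInstances (or deploying across JVMs in a clustered Vert.x), without touching the HTTP-facing code.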

Hope this helps.

- Parit   

Danny

Apr 22, 2021, 5:58:30 PM
to vert.x
Hey!

No problem! And yeah, I see what you mean now. I think my team's application was designed on the assumption that we'd be acting more as a delegating service that could pass a mass of events to many different APIs asynchronously, but in reality we have relatively low volume and large requests that require a lot of internal processing.

This has made me think of another question that's probably worth another thread, so I'll post it there.

Thanks!
