Number of worker verticles vs workerPoolSize


WD

Sep 16, 2015, 12:43:36 PM
to vert.x
Hi there,
I am using Vertx 3.0 to develop an application that accesses a database. I am planning to deploy 500 worker verticles to handle 5000 rps reading from the database. Do I need to set the workerPoolSize to 500 so that each worker verticle has its own worker thread when the system is under full load? Should the workerPoolSize always be greater than or equal to the total number of worker verticles?

Here is how I deploy the worker verticles:

vertx.deployVerticle("ReadDBWorkerVerticle", new DeploymentOptions()
    .setConfig(config())
    .setWorker(true)
    .setInstances(500));


And I define the workerPoolSize with -Dvertx.options.workerPoolSize=500
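The equivalent programmatic configuration (a rough sketch, assuming the Vertx instance is created in your own bootstrap class; the class name here is only illustrative) would be:

import io.vertx.core.Vertx;
import io.vertx.core.VertxOptions;

public class VertxBootstrap {
    public static void main(String[] args) {
        // Sizes the shared worker pool used by worker verticles and executeBlocking,
        // i.e. the programmatic counterpart of -Dvertx.options.workerPoolSize=500.
        Vertx vertx = Vertx.vertx(new VertxOptions().setWorkerPoolSize(500));
        // ... deploy verticles here ...
    }
}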



thanks,


WD

Tim Fox

Sep 16, 2015, 12:46:37 PM
to ve...@googlegroups.com
How are you accessing the database? JDBC? Something else?

WD

Sep 16, 2015, 12:55:17 PM
to vert.x
It's HBase; the worker verticle uses HBase's Java client.

Tim Fox

Sep 16, 2015, 1:09:56 PM
to ve...@googlegroups.com
If you're using a sync blocking HBase client then you can either:

1. Use executeBlocking to run your blocking operations - you don't need worker verticles for this (a rough sketch follows below).
2. Use worker verticles. 500 threads seems an awful lot, and you're unlikely to get much benefit from increasing threads beyond a couple of hundred due to context-switching overhead. If you're going to go down this route, I'd suggest starting with a lower number of threads/worker verticles and seeing how performance improves (or not) as you increase the number. Don't assume that more threads means more performance. In terms of the number of worker verticles, you need as many as the concurrency you expect for standard workers, or just one for a multi-threaded worker.
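A rough sketch of option 1 (imports omitted; `table` is assumed to be an already-opened HBase Table/HTable and `rowKey` a byte[] key, both initialised elsewhere in the verticle):

vertx.<Result>executeBlocking(future -> {
    // This handler runs on a worker-pool thread, so blocking here is fine.
    try {
        Result row = table.get(new Get(rowKey));
        future.complete(row);
    } catch (IOException e) {
        future.fail(e);
    }
}, res -> {
    // Back on the event loop with the Result (or the failure).
    if (res.succeeded()) {
        // use res.result() ...
    } else {
        // log/handle res.cause() ...
    }
});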

I haven't used HBase, but a quick Google search turned up a fully async client:

https://github.com/OpenTSDB/asynchbase

I don't know if it's any good, but if it is I'd certainly recommend using an async client over a sync one.

WD

Sep 16, 2015, 1:59:50 PM
to vert.x
1. I started with Vertx 2.0, when executeBlocking was not available, and have now migrated to Vertx 3.0. With worker verticles, it is more flexible to deploy a different number of worker verticles to handle different loads.
2. I will definitely tune the number of worker verticles/threads to maximize performance. I just want to confirm whether the statement "the workerPoolSize should not be smaller than the total number of worker verticles" is true or not.
3. I have looked at asynchbase; it doesn't support some features (it lacks certain filters) that the sync client supports, and it may not be updated as we move to a new version of HBase.

Again, thank you for your time and comments. Vertx is great for developing applications. I hope it can stand up to real-world testing as we move the application to production.

Oje Preradovic

Sep 20, 2015, 3:08:06 PM
to vert.x
Hi,

We tried the async client linked above, but in our app it just ate memory like mad and our process died. We also have an HTTP server and outgoing HTTP clients in the same Vertx process. This is a very high-load, high-connection app, and with the async HBase client we eventually ran out of open files (the limit is set to about 500K on Linux).

We switched back to the standard sync HBase client, called these methods within executeBlocking, and now the app is rocking again!

We have a fixed HConnection pool size of 50 and 150 Vertx worker threads. At max load (thousands of concurrent connections) we are at only about 45% CPU, with a very nice overall spread of work across both the Vertx worker pool and the HConnection pool (verified with JProfiler). The HBase sync client is quite fast. Again, the async one just grew out of control memory-wise, which is unfortunate. Memory is now stable and self-managed (i.e. it keeps coming back down even under heavy load).

One gotcha for us was that the default behaviour of executeBlocking (when you don't pass the boolean) is to order (queue) the executions on the context. This drove us crazy, and you may be hitting the same thing. As soon as we switched to:

vertx.executeBlocking(future -> {
    // blocking work (e.g. the sync HBase calls) runs here on a worker thread
    future.complete();
}, false, result -> {
    // result handler, back on the calling context
});

Our world was rocking again!

There is also a very nice async HBase client here:


But we are on Amazon EMR and the versions are not compatible :(

Cheers