Hi.
Here is my understanding of how event loops and verticles work.
Let's say you have one Vert.x instance with 2 event loops and you deploy 5 verticles (one instance each). Then we end up with:
event loop 1 - verticles A, C, E
event loop 2 - verticles B, D
Each verticle consumes one kind of message (verticle A consumes "a", B consumes "b", and so on).
So if only messages "a", "c" and "e" arrive, only ONE event loop will be busy and the second thread will sit unused. That is why it is important to think about how and what you deploy.
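To make the example above concrete, here is a small plain-Java sketch (no Vert.x dependency) that mimics the assignment. It assumes event loops are handed out round-robin in deployment order and that verticle X consumes message "x". The class and method names are made up for illustration; they are not Vert.x API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RoundRobinDemo {

    // Assumption: verticles get event loops round-robin, in deployment order.
    // Returns verticle name -> event loop index (0-based).
    static Map<String, Integer> assign(List<String> verticles, int eventLoops) {
        Map<String, Integer> assignment = new LinkedHashMap<>();
        for (int i = 0; i < verticles.size(); i++) {
            assignment.put(verticles.get(i), i % eventLoops);
        }
        return assignment;
    }

    // Assumption: message "a" is consumed by verticle "A", "b" by "B", and so on.
    // Returns the event loops that get any work for the given messages.
    static Set<Integer> busyLoops(Map<String, Integer> assignment, List<String> messages) {
        Set<Integer> busy = new TreeSet<>();
        for (String message : messages) {
            Integer loop = assignment.get(message.toUpperCase());
            if (loop != null) {
                busy.add(loop);
            }
        }
        return busy;
    }

    public static void main(String[] args) {
        Map<String, Integer> assignment = assign(List.of("A", "B", "C", "D", "E"), 2);
        System.out.println("assignment: " + assignment);
        System.out.println("busy loops: " + busyLoops(assignment, List.of("a", "c", "e")));
    }
}
```

With only "a", "c" and "e" in the traffic, the sketch reports a single busy loop (index 0, which is "event loop 1" above).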
One possible solution is to create ONE verticle that registers the handlers for every message, and then deploy as many instances of it as you have cores.
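A rough way to see why this spreads the load, again as a plain-Java simulation rather than real Vert.x code. It assumes each instance lands on its own event loop and that point-to-point messages for an address are rotated round-robin across all handlers registered for that address (that is my understanding of how the event bus send behaves). The names here are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedHandlersDemo {

    // Every instance registers a handler for every address, so each address
    // has `instances` handlers. Assumption: delivery rotates over them.
    // Returns how many messages each instance (= event loop) handled.
    static int[] deliver(int instances, List<String> messages) {
        int[] handledPerLoop = new int[instances];
        Map<String, Integer> sentPerAddress = new HashMap<>();
        for (String address : messages) {
            // Count of messages sent to this address so far, including this one.
            int count = sentPerAddress.merge(address, 1, Integer::sum);
            handledPerLoop[(count - 1) % instances]++;
        }
        return handledPerLoop;
    }

    public static void main(String[] args) {
        int[] load = deliver(2, List.of("a", "c", "e", "a", "c", "e"));
        for (int i = 0; i < load.length; i++) {
            System.out.println("event loop " + (i + 1) + " handled " + load[i] + " messages");
        }
    }
}
```

In this model the same "a", "c", "e" traffic ends up split across both loops, because no address is tied to a single verticle any more.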
Remember - do not block the event loop!
Running more than one instance of Vert.x per machine is IMHO not a good idea. There are no benefits.
Also, running more event loops than cores is not good either: you get heavy context switching, and that costs.
My solution: build only a few verticles with many handlers each (verticle A - "a", "b", "c"; verticle B - "d", "e"). On a machine with 8 cores I start 7 event loops. I deploy verticles A and B with many instances, for example 6 x A and 3 x B. The exact split is an important issue and depends on the project, but the total number of instances should be equal to or greater than the number of event loops. One core stays free for workers (if present) and for system work.
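Here is a small plain-Java sketch of that sizing rule, under the same round-robin assumption as before. It is only an illustration of the arithmetic; the class and method names are invented and nothing here calls Vert.x.

```java
public class DeploymentPlan {

    // Leave one core free for workers and system work, but keep at least one loop.
    static int eventLoops(int cores) {
        return Math.max(1, cores - 1);
    }

    // Assumption: instances are placed on event loops round-robin in deployment
    // order. Returns the number of verticle instances per event loop.
    static int[] instancesPerLoop(int eventLoops, int... instanceCounts) {
        int[] perLoop = new int[eventLoops];
        int next = 0;
        for (int count : instanceCounts) {
            for (int i = 0; i < count; i++) {
                perLoop[next % eventLoops]++;
                next++;
            }
        }
        return perLoop;
    }

    // True when no event loop is left without a verticle instance.
    static boolean everyLoopUsed(int[] perLoop) {
        for (int n : perLoop) {
            if (n == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int loops = eventLoops(8);                      // 8 cores
        int[] perLoop = instancesPerLoop(loops, 6, 3);  // 6 x A, 3 x B
        System.out.println("event loops: " + loops);
        System.out.println("every loop used: " + everyLoopUsed(perLoop));
    }
}
```

With fewer instances than event loops (say 3 x A and 2 x B on 7 loops), `everyLoopUsed` comes back false in this model, which is the situation the rule is meant to avoid.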
Example: I've got message "a", which is the most frequent and important one, and 7 others ("b","c","d","e",...) which are also VERY important but not so frequent. Then I prepare verticles A and B. A consumes only message "a", and B consumes the others. On an 8-core machine I start Vert.x with 7 event loops.
I deploy verticle A 5 times and B 2 times.
However, I have never tested whether this is better than deploying A 7 times and B 7 times...
The most important thing: do not block the event loop! And deploy at least as many verticle instances as you have event loops.
Does it help?
Regards,
Lukasz