Vertx Service Proxy bottleneck


Jody Schering

Dec 12, 2016, 2:31:51 PM
to vert.x
Hello, 

I'm currently using Vert.x with the service proxy to communicate between microservices. Requests come in through a gateway (a Vert.x REST frontend) that calls various microservices via the Vert.x service proxy. I was noticing slow performance, so I added some logging to print the time it took to execute each call to the other microservice, and found that the calls were backing up. Here's a snippet of the log near the end of the 10k requests I sent in:

2016-12-09 16:05:13,846 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29650ms
2016-12-09 16:05:13,846 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29648ms
2016-12-09 16:05:13,852 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29654ms
2016-12-09 16:05:13,857 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29653ms
2016-12-09 16:05:13,867 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29660ms
2016-12-09 16:05:13,873 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29663ms
2016-12-09 16:05:13,873 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29656ms
2016-12-09 16:05:13,874 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29660ms
2016-12-09 16:05:13,884 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29664ms
2016-12-09 16:05:13,900 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29678ms
2016-12-09 16:05:13,905 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29680ms
2016-12-09 16:05:13,917 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29690ms
2016-12-09 16:05:13,924 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29695ms
2016-12-09 16:05:13,939 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29710ms
2016-12-09 16:05:13,939 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29708ms
2016-12-09 16:05:13,950 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29714ms
2016-12-09 16:05:13,951 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29717ms
2016-12-09 16:05:13,958 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29718ms
2016-12-09 16:05:13,964 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29721ms
2016-12-09 16:05:13,972 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29727ms
2016-12-09 16:05:13,978 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29731ms
2016-12-09 16:05:13,993 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29745ms
2016-12-09 16:05:14,001 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29749ms
2016-12-09 16:05:14,011 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29757ms
2016-12-09 16:05:14,012 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29754ms
2016-12-09 16:05:14,012 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29756ms
2016-12-09 16:05:14,021 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29759ms
2016-12-09 16:05:14,021 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29761ms
2016-12-09 16:05:14,038 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29773ms
2016-12-09 16:05:14,050 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29783ms
2016-12-09 16:05:14,050 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29783ms
2016-12-09 16:05:14,056 INFO (vert.x-eventloop-thread-1) Log4j2LogDelegate.log: STORED : 29787ms


As you can see, it went all the way up to 30 seconds. I must be doing something wrong or have something misconfigured. Can anyone shed some light on why it's backing up so severely, or what I might try to improve the performance? This is all running locally on my desktop.

I register the service using ProxyHelper and obtain the service using EventBusService.getProxy. I release the service after every call.

Any help would be appreciated.

Julien Viet

Dec 12, 2016, 5:32:02 PM
to ve...@googlegroups.com
30 seconds is the default timeout for EventBus messages,

so that suggests the service proxies are sending messages that never reach the service implementation.

You should check that the service is available using a plain EventBus.send to the correct address.
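[A minimal sketch of that sanity check, assuming the Vert.x 3.x API; "service.address" is a placeholder for whatever address the proxy was registered on, and the consumer here stands in for the real service registration:]

```java
import io.vertx.core.Vertx;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class EventBusCheck {
    public static void main(String[] args) throws Exception {
        Vertx vertx = Vertx.vertx();
        CountDownLatch done = new CountDownLatch(1);

        // Stand-in consumer; in the real setup this is the handler that
        // ProxyHelper registered at the same address.
        vertx.eventBus().consumer("service.address", msg -> msg.reply("pong"));

        // Plain send with a reply handler: if this fails or times out,
        // the proxy messages are not reaching the service either.
        vertx.eventBus().send("service.address", "ping", ar -> {
            if (ar.succeeded()) {
                System.out.println("service reachable: " + ar.result().body());
            } else {
                System.out.println("no handler at address: " + ar.cause().getMessage());
            }
            done.countDown();
        });

        done.await(5, TimeUnit.SECONDS);
        vertx.close();
    }
}
```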

Julien

--
You received this message because you are subscribed to the Google Groups "vert.x" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vertx+un...@googlegroups.com.
Visit this group at https://groups.google.com/group/vertx.
To view this discussion on the web, visit https://groups.google.com/d/msgid/vertx/4dd5375d-2a1d-4621-93d5-8acf6122e314%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jody Schering

Dec 12, 2016, 5:56:00 PM
to vert.x
Hi Julien, thanks for the reply. My statement may have been a little misleading in saying it takes 30 seconds. What I meant is that the time slowly increments from very fast all the way up to the timeout. Even with a small number of requests, the service starts at only a few milliseconds and climbs to around 4 seconds in an almost linear fashion. It appears the proxy is calling the service method but queuing each call until the previous one has been released. I have one instance of the microservice running that registered the service proxy. I then run a test sending 100 requests to the gateway, which calls the proxy. The start time is triggered immediately, but the calls then appear to be queued and handled one at a time.

Thomas SEGISMONT

Dec 13, 2016, 3:55:37 AM
to ve...@googlegroups.com
Can you put together a small reproducer? Thanks


Jody Schering

Dec 13, 2016, 12:35:45 PM
to vert.x
Here's a reproducer project: https://github.com/JSchering/vertx.proxy.bottleneck

The project has a JUnit test that calls the service proxy 10,000 times; the proxy just generates a random UUID. What happens for me is that the time to call the proxy and get the response is 180ms at its fastest and over 600ms at its slowest. I would expect better performance.

Thanks for your help with this.

Jody Schering

Dec 13, 2016, 6:24:39 PM
to vert.x
I found this thread: https://groups.google.com/forum/#!topic/vertx/H3ntz_2JvB4 and it indicates that event bus messages are sent in the order they are received. Does this mean they are sent serially, not in parallel? If so, it fits what I'm seeing in my reproducer project, where the time slowly increases for each subsequent call to a proxy service. It also inhibits the performance of using service proxies for communication between a gateway and the microservices: as thousands of HTTP requests come in, the event bus appears to be a huge bottleneck for communicating with the other microservices in my application.
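[For illustration, the linear growth described above is exactly what strictly serial handling produces. It can be modeled in plain Java with a single-threaded executor standing in for one event loop thread; this is a sketch of the queuing effect, not of the Vert.x internals:]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SerialQueueModel {
    public static void main(String[] args) throws Exception {
        // One worker thread stands in for a single event loop handling calls in order
        ExecutorService loop = Executors.newSingleThreadExecutor();
        int requests = 20;
        long[] latency = new long[requests];

        for (int i = 0; i < requests; i++) {
            final int id = i;
            final long submitted = System.nanoTime();
            loop.submit(() -> {
                try {
                    Thread.sleep(10); // 10ms of "work" per call
                } catch (InterruptedException ignored) { }
                // Observed latency = queue wait + the 10ms of work
                latency[id] = (System.nanoTime() - submitted) / 1_000_000;
            });
        }

        loop.shutdown();
        loop.awaitTermination(10, TimeUnit.SECONDS);

        // Each call waits behind all earlier ones, so latency grows roughly linearly
        System.out.println("first=" + latency[0] + "ms last=" + latency[requests - 1] + "ms");
        System.out.println("last >= first: " + (latency[requests - 1] >= latency[0]));
    }
}
```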

If someone has a better handle on this please let me know.

Jody Schering

Dec 14, 2016, 11:33:32 AM
to vert.x
Hi Thomas, I put together a reproducer; have you had a chance to try it out? Or do you have any suggestions as to why the proxy service is taking as long as it does?

Thanks.

Jody Schering

Dec 14, 2016, 12:25:58 PM
to vert.x
I also found this topic: https://groups.google.com/forum/#!topic/vertx/lJ_c1t92SVw which says to register multiple instances of the service proxy to handle more load, but that's not improving the throughput in my tests.

Jody Schering

Dec 14, 2016, 1:06:55 PM
to vert.x
I updated the reproducer to wrap the test in an executeBlocking. It seems the test itself was impacting the service, and after the change it performs better, but latency still increases linearly, and registering the service multiple times has no effect on performance.

Any help would be much appreciated.

IanP

Dec 15, 2016, 4:26:24 AM
to vert.x
Hi Jody, 

I'm no expert on this, but here's a suggestion: if you pass in the value of i as a parameter when you create the service proxies in your verticle, you could put something like System.out.println("test() invoked on service " + i); in the test() method. That would demonstrate whether the round-robin proxy selection is working. I have also found it useful in the past to print the name of the thread executing the method, to check that work is being scheduled on different threads rather than all running on the same one.

Cheers,
Ian.

Jody Schering

Dec 15, 2016, 12:16:13 PM
to vert.x
Hi Ian, thanks for the suggestion, I'll give it a try. In your testing, are you seeing multiple service proxy instances on one verticle improve throughput?

Thanks

IanP

Dec 15, 2016, 12:18:31 PM
to vert.x
Don't know yet - I haven't got that far in testing!

Jody Schering

Dec 15, 2016, 12:55:58 PM
to vert.x
Hi Ian, thanks again. I ran a test printing out the instance and thread: it is doing round-robin across the service instances, but everything runs on the same event loop thread. So I don't think adding many service proxy instances will be beneficial if you're limited by the number of event loop threads. It seems that scaling the number of verticle instances with the number of cores will provide the most benefit.
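[As a sketch of that approach, assuming the standard Vert.x 3.x deployment API: ServiceVerticle and "service.address" below are placeholders for the verticle that registers the proxy handler and its address. DeploymentOptions.setInstances gives each instance its own event loop thread, and the bus round-robins across them.]

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;

public class ScaleVerticles {

    // Placeholder verticle; in the real app this is the one that registers
    // the service proxy handler with ProxyHelper.
    public static class ServiceVerticle extends AbstractVerticle {
        @Override
        public void start() {
            vertx.eventBus().consumer("service.address",
                msg -> msg.reply("handled on " + Thread.currentThread().getName()));
        }
    }

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        int cores = Runtime.getRuntime().availableProcessors();
        // One instance per core: each gets its own event loop thread
        vertx.deployVerticle(ServiceVerticle.class.getName(),
            new DeploymentOptions().setInstances(cores),
            ar -> {
                System.out.println("deployment succeeded: " + ar.succeeded());
                vertx.close();
            });
    }
}
```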

Julien Viet

Dec 15, 2016, 1:54:03 PM
to ve...@googlegroups.com
Hi Jody,

What do you want to measure, exactly?

Julien


Julien Viet

Dec 15, 2016, 2:40:09 PM
to ve...@googlegroups.com
Looking at your test, it is not accurate, because:

1/ it uses executeBlocking to send message batches, which has an impact on how the messages are sent and received
2/ it creates unnecessary Futures to wait for all the results
3/ it uses System.out to print
4/ it does not run long enough to be JITted
5/ your test does not run 10,000 iterations but 495,099

If you want a test closer to what an HttpServer would do (as you mentioned, the proxy is called from a REST API), then you should run it from the event loop with a simple vertx.runOnContext().
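[A minimal sketch of running work from the event loop that way, assuming the Vert.x 3.x API:]

```java
import io.vertx.core.Vertx;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class RunOnContextExample {
    public static void main(String[] args) throws Exception {
        Vertx vertx = Vertx.vertx();
        CountDownLatch done = new CountDownLatch(1);

        // Schedules the block on an event loop thread, which is where an
        // HttpServer handler (and thus the proxy call) would actually run.
        vertx.runOnContext(v -> {
            System.out.println("running on: " + Thread.currentThread().getName());
            // ...invoke the service proxy and record timings here...
            done.countDown();
        });

        done.await(5, TimeUnit.SECONDS);
        vertx.close();
    }
}
```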

Here is a more correct version that runs 10,000 iterations and measures what you described more accurately.


It prints the min and the max; for instance, on my machine:

################ Completed Test generated: 173ms ################
max 128
min 0

It would be interesting to have a histogram of the latencies (using, for instance, HdrHistogram) to get an idea of the distribution.
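[A sketch of that idea, assuming the HdrHistogram library is on the classpath; the recorded values here are synthetic, just to show the API, whereas the real test would record each proxy call's measured latency:]

```java
import org.HdrHistogram.Histogram;

public class LatencyHistogram {
    public static void main(String[] args) {
        // Track values from 1 up to 60,000,000 (e.g. microseconds up to one
        // minute), with 3 significant digits of precision
        Histogram histogram = new Histogram(60_000_000L, 3);

        // In the real test, record (endNanos - startNanos) / 1000 per proxy
        // reply; here we record synthetic values 1..10000
        for (int i = 1; i <= 10_000; i++) {
            histogram.recordValue(i);
        }

        System.out.println("p50: " + histogram.getValueAtPercentile(50.0));
        System.out.println("p99: " + histogram.getValueAtPercentile(99.0));
        System.out.println("max: " + histogram.getMaxValue());
    }
}
```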

Julien


Jody Schering

Dec 15, 2016, 3:36:50 PM
to vert.x
Hi Julien, 

Thanks for the reply; I appreciate you taking the time to provide an example of the correct way to do this kind of test. I'm not very familiar with vertx.runOnContext, so I'll need to look into it more. I originally did only 10,000 calls, but I wanted to see how the service response time changed as more requests came in. I used executeBlocking because the two loops I had were blocking the event loop while the test was running. Anyway, now I know that wasn't the right way to do it, so I'll make some changes.

Thanks again,
Jody

