Vert.x 2 and 3.x unfortunately is being blocked under heavy load

806 views
Skip to first unread message

Lenders

unread,
May 19, 2016, 3:35:30 AM5/19/16
to vert.x
Hi all,

I've recently done some performance testing with vertx 2 mainly but I believe from looking at the code for vertx 3 it probably has the same problems:

  1. JAXB has blocking issues for deserialize and serialize of objects - This NOT a vertx issue but deep down if you use the code JAXB.marshal() or JAXB.unmarshal() the JAXB context is reloaded occasionally, this is because the JAXBContext is stored in a WeakHashMap and so gets garbage collected. There is a synchronized method in the code. The work around is to create you're own JAXBContext and re-use it so no garbage collection (I put them in a ThreadLocal Map for each class).
  2. The jackson libraries block the event bus thread. I haven't got to the bottom of this entirely, but using a ThreadLocal ObjectMapper improves the number of instances here, This was not in the vertx code but when we used our own ObjectMapper.
  3. The event bus uses UUID.randomUUID() to generate the reply addresses, this unfortunately uses a static SecureRandom and this does some synchronised stuff around the readBytes call which can block the event loop. I think this was blocked because of lack of entropy (yes vertx is that fast), perhaps the code could be changed so that it uses a deterministic algorithm using Atomic longs or something?
  4. There was also a blocked thread on the eventloop, but I don't know if this is normal.


Please understand these were happening when the event bus was handling over 5000 messages a second and introduced a small latency of up to 10 ms so not massively critical (except JAXB one which was really problematic).

We used this code (Vertx 2) to monitor the event loops, it only fires every 500 ms so it's not going to catch them all.

vertx.setPeriodic(pollingInterval, timerId -> {
final long time = System.currentTimeMillis();
final Map<Thread,StackTraceElement[]> map = Thread.getAllStackTraces();
map.keySet().stream()
.filter(thread -> thread.getName().startsWith("vert.x-eventloop-thread-"))
.filter(thread -> Thread.State.BLOCKED.equals(thread.getState()))
.forEach(thread -> {
final String message = "Thread [" + thread.getName() + "] is BLOCKED";
final Exception ex = new IllegalStateException(message);
ex.setStackTrace(map.get(thread));
LOGGER.warn(message, ex);
});
LOGGER.trace("Eventloop threads checked. {} millis", System.currentTimeMillis() - time);
});


Examples:

java.lang.IllegalStateException: Thread [vert.x-eventloop-thread-4] is BLOCKED
        at java.security.SecureRandom.nextBytes(SecureRandom.java:468) ~[na:1.8.0_77]
        at java.util.UUID.randomUUID(UUID.java:145) ~[na:1.8.0_77]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.generateReplyAddress(DefaultEventBus.java:688) ~[vertx-core-2.1.6.jar:n
a]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.sendOrPub(DefaultEventBus.java:706) ~[vertx-core-2.1.6.jar:na]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.sendOrPubWithTimeout(DefaultEventBus.java:674) ~[vertx-core-2.1.6.jar:n
a]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.sendWithTimeout(DefaultEventBus.java:207) ~[vertx-core-2.1.6.jar:na]

Thread [vert.x-eventloop-thread-1] is BLOCKED
java.lang.IllegalStateException: Thread [vert.x-eventloop-thread-1] is BLOCKED
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) ~[na:1.8.0_77]
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) ~[na:1.8.0_77]
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) ~[na:1.8.0_77]
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) ~[na:1.8.0_77]
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) ~[na:1.8.0_77]
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.21.Final.jar

Thread [vert.x-eventloop-thread-2] is BLOCKED
java.lang.IllegalStateException: Thread [vert.x-eventloop-thread-2] is BLOCKED
        at sun.misc.Unsafe.unpark(Native Method) ~[na:1.8.0_77]
        at java.util.concurrent.locks.LockSupport.unpark(LockSupport.java:141) ~[na:1.8.0_77]
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.unparkSuccessor(AbstractQueuedSynchronizer.java:662) ~[na:1.8.0_77]
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1264) ~[na:1.8.0_77]
        at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457) ~[na:1.8.0_77]
        at java.util.concurrent.LinkedBlockingQueue.signalNotEmpty(LinkedBlockingQueue.java:176) ~[na:1.8.0_77]
        at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:430) ~[na:1.8.0_77]
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1361) ~[na:1.8.0_77]
        at org.vertx.java.core.impl.OrderedExecutorFactory$OrderedExecutor.execute(OrderedExecutorFactory.java:111) ~[vertx-core-2.1.6.jar:na]
        at org.vertx.java.core.impl.DefaultContext.executeOnOrderedWorkerExec(DefaultContext.java:156) ~[vertx-core-2.1.6.jar:na]
        at org.vertx.java.core.impl.DefaultVertx.executeBlocking(DefaultVertx.java:402) ~[vertx-core-2.1.6.jar:na]
        at org.vertx.java.spi.cluster.impl.hazelcast.HazelcastAsyncMultiMap.get(HazelcastAsyncMultiMap.java:99) ~[vertx-hazelcast-2.1.6.jar:na]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.sendOrPub(DefaultEventBus.java:731) ~[vertx-core-2.1.6.jar:na]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.sendOrPubWithTimeout(DefaultEventBus.java:674) ~[vertx-core-2.1.6.jar:na]
        at org.vertx.java.core.eventbus.impl.DefaultEventBus.sendWithTimeout(DefaultEventBus.java:135) ~[vertx-core-2.1.6.jar:na]
        at com.odobo.service.achievements.managers.GameAchievementManager.register(GameAchievementManager.java:107) ~[na:na]
        at com.odobo.service.achievements.managers.GameAchievementManager.start(GameAchievementManager.java:102) ~[na:na]
        at org.vertx.java.platform.impl.DefaultPlatformManager$22.run(DefaultPlatformManager.java:1765) ~[vertx-platform-2.1.6.jar:na]
        at org.vertx.java.core.impl.DefaultContext$3.run(DefaultContext.java:175) ~[vertx-core-2.1.6.jar:na]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:370) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.21.Final.jar:4.0.21.Final]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_77]

Peter

Tim Fox

unread,
May 19, 2016, 5:18:55 AM5/19/16
to ve...@googlegroups.com
Hi Peter,


On 19/05/16 08:35, Lenders wrote:
Hi all,

I've recently done some performance testing with vertx 2 mainly but I believe from looking at the code for vertx 3 it probably has the same problems:

  1. JAXB has blocking issues for deserialize and serialize of objects - This NOT a vertx issue but deep down if you use the code JAXB.marshal() or JAXB.unmarshal() the JAXB context is reloaded occasionally, this is because the JAXBContext is stored in a WeakHashMap and so gets garbage collected. There is a synchronized method in the code. The work around is to create you're own JAXBContext and re-use it so no garbage collection (I put them in a ThreadLocal Map for each class).

Vert.x doesn't use JAXB afaik.


  1. The jackson libraries block the event bus thread. I haven't got to the bottom of this entirely, but using a ThreadLocal ObjectMapper improves the number of instances here, This was not in the vertx code but when we used our own ObjectMapper.

If you're using Jackson in your own code and it "blocks the event loop", most probably you're trying to serialize or deserialize a very large object and this just takes some time. I can't see how this is a Vert.x issue.


  1. The event bus uses UUID.randomUUID() to generate the reply addresses, this unfortunately uses a static SecureRandom and this does some synchronised stuff around the readBytes call which can block the event loop. I think this was blocked because of lack of entropy (yes vertx is that fast), perhaps the code could be changed so that it uses a deterministic algorithm using Atomic longs or something?

Yes, this is a known issue with SecureRandom - depending on the OS it can block if there's not enough entropy. In Vert.x 2 we used SecureRandom for all event bus reply addresses, in Vert.x 3 we use a sequence for local addresses and only use SecureRandom for clustered event bus so it should be less of an issue. If you see an issue with Vert.x 3 please file an issue, but I think this was "fixed" quite some time ago.


  1. There was also a blocked thread on the eventloop, but I don't know if this is normal.


Please understand these were happening when the event bus was handling over 5000 messages a second and introduced a small latency of up to 10 ms so not massively critical (except JAXB one which was really problematic).

We used this code (Vertx 2) to monitor the event loops, it only fires every 500 ms so it's not going to catch them all.

vertx.setPeriodic(pollingInterval, timerId -> {
    final long time = System.currentTimeMillis();
    final Map<Thread,StackTraceElement[]> map = Thread.getAllStackTraces();
    map.keySet().stream()
            .filter(thread -> thread.getName().startsWith("vert.x-eventloop-thread-"))
            .filter(thread -> Thread.State.BLOCKED.equals(thread.getState()))
            .forEach(thread -> {
                final String message = "Thread [" + thread.getName() + "] is BLOCKED";
                final Exception ex = new IllegalStateException(message);
                ex.setStackTrace(map.get(thread));
                LOGGER.warn(message, ex);
            });
    LOGGER.trace("Eventloop threads checked. {} millis", System.currentTimeMillis() - time);
});


^ This is not a good way to measure if a thread is "blocked".  What's important for a Vert.x application is an event loop doesn't take too long to deliver an event. Waiting on a monitor can be ok, as long as it doesn't wait too long.

I suggest looking at the "blocked" thread checking code that we use in Vert.x 3 (BlockedThreadChecker.java)

In the traces below, it's perfectly normal to see a thread (very briefly) blocked in epollWait - Netty performs blocking selects in some situations  - this is ok - but if in doubt you could ask the Netty team. I believe the logic around how selecting works is highly optimised for the best performance.

The third stack trace below is simply a lock in LinkedBlockingQueue that is held for a very short time - you'd get this with any fixed size thread pool, and I can't see that it's an issue.

Disclaimer: This message (including any attachments) contains confidential information intended for a specific individual and purpose, and is protected by law. If you are not the intended recipient, you should delete this message and are hereby notified that any disclosure, copying, or distribution of this message, or the taking of any action based on it, is strictly prohibited.  --
You received this message because you are subscribed to the Google Groups "vert.x" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vertx+un...@googlegroups.com.
Visit this group at https://groups.google.com/group/vertx.
To view this discussion on the web, visit https://groups.google.com/d/msgid/vertx/4ac6a002-2f9b-4641-a3d2-ee8f68b51a0e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Lenders

unread,
May 23, 2016, 4:14:55 AM5/23/16
to vert.x
Hi Tim,

Thanks for the comments, 

I did point out that the JAXB was NOT a vert.x issue but one with our code, It was really just a FYI for anyone who has to use it.

The issue with jackson is down again in our code, it to do with using Jackson to look up the Class for deserialization using the JSonTypeIdResolver, again this is a FYI.

I'm happy to hear the event bus issue has been improved in vertx3, I just did a quick scan of the code when I made the post and must have missed that.

I'm not sure any of these things are a Vertx issue to honest, the title of the post is wrong. We did find however that any form of blocking the eventloop can have a massive effect on overall performance and that it was only visible in performance tests, for example the JAXB code issue even though the block was a couple of milliseconds had a cascading effect on the performance.

This post was really just a precautionary tail and was not meant to criticise the excellent framework that is Vertx.

I'm really looking forward to the 3.3.

Peter







Reply all
Reply to author
Forward
0 new messages