IllegalMonitorStateException in jetty client while trying to reconnect to cometd server


Flo

Nov 30, 2017, 5:49:53 AM
to cometd-users
Hi Simone,

I'm not sure whether this belongs on cometd-users or jetty, but since it occurred in a CometD context, I'm posting it here.

We have three CometD (3.1.3) Java clients connected to a CometD server (on virtualized hardware) via WebSocket over localhost. The BayeuxClient is configured with both JettyWebsocketTransport and LongPollingTransport. The CometD server was killed during maintenance work and was restarted later. All three clients failed to reconnect. We saw the following weird exception in two of the client logs (unfortunately at level INFO only):

2017-11-27 15:12:27,187 [pool-3-thread-2] WARN  com.ect.cometd.provider.CometDServiceProvider - Service provider metaNG connection broken: 901:websocket:server error.
2017-11-27 15:14:04,396 [HttpClient@551056181-19] WARN  org.eclipse.jetty.util.thread.QueuedThreadPool -
java.lang.IllegalMonitorStateException: null
        at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:151) ~[?:1.8.0_131]
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261) ~[?:1.8.0_131]
        at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:457) ~[?:1.8.0_131]
        at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:408) ~[jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:531) ~[jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.access$700(QueuedThreadPool.java:47) ~[jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:590) [jetty-util-9.2.22.v20170606.jar:9.2.22.v20170606]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
2017-11-27 15:14:04,401 [HttpClient@551056181-19] WARN  org.eclipse.jetty.util.thread.QueuedThreadPool - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$3@89bc81f in HttpClient@551056181{STARTED,8<=8<=200,i=6,q=0}

The first line is our own output when we notice an unsuccessful connect attempt; the exception apparently happens during the reconnect attempts. It's still Jetty 9.2.22, but I haven't seen substantial code changes in BlockingArrayQueue / QueuedThreadPool since then (at least nothing that would change the behavior in that area, if I'm not mistaken). I wasn't able to reproduce this message, and I wonder through which loophole a different thread could have unlocked the polling thread's lock.
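To restate what the JDK guarantees here: ReentrantLock.unlock() throws IllegalMonitorStateException in the calling thread when that thread is not the owner, so a foreign unlock should blow up in the foreign thread, not in the polling thread. A standalone demo (not our code, just the JDK contract):

    import java.util.concurrent.locks.ReentrantLock;

    public class UnlockDemo {
        public static void main(String[] args) throws Exception {
            ReentrantLock lock = new ReentrantLock();
            lock.lock(); // the main thread owns the lock

            Thread other = new Thread(() -> {
                try {
                    lock.unlock(); // not the owner
                } catch (IllegalMonitorStateException e) {
                    System.out.println("foreign unlock rejected: " + e);
                }
            });
            other.start();
            other.join();

            lock.unlock(); // the owner releases normally
        }
    }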

Another strange thing is that the error actually comes from the HttpClient. Normally, when a CometD server is killed in our setup, we see retries with the WebSocketClient only. I wonder why the client decided to switch to long polling.
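For reference, the BayeuxClient setup looks roughly like this (a simplified sketch, not our actual code; the URL is a placeholder and the option maps are left null). WebSocket is listed first, so long-polling should only kick in as a fallback:

    import org.cometd.client.BayeuxClient;
    import org.cometd.client.transport.ClientTransport;
    import org.cometd.client.transport.LongPollingTransport;
    import org.cometd.websocket.client.JettyWebSocketTransport;
    import org.eclipse.jetty.client.HttpClient;
    import org.eclipse.jetty.websocket.client.WebSocketClient;

    public class ClientSetup {
        public static BayeuxClient connect() throws Exception {
            // HttpClient backs the long-polling transport.
            HttpClient httpClient = new HttpClient();
            httpClient.start();

            // WebSocketClient backs the websocket transport.
            WebSocketClient wsClient = new WebSocketClient();
            wsClient.start();

            // WebSocket is tried first; long-polling is the fallback.
            ClientTransport ws = new JettyWebSocketTransport(null, null, wsClient);
            ClientTransport lp = new LongPollingTransport(null, httpClient);

            BayeuxClient client =
                new BayeuxClient("http://localhost:8080/cometd", ws, lp);
            client.handshake();
            return client;
        }
    }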

The only unusual thing I noticed during local testing (forget about it if it just detracts from the issue described above): when the _threads in the QueuedThreadPool are forcefully interrupted while one thread sits in the idleJobPoll method, the infamous endless loop in the SelectorManager is provoked (a sketch of the interruption hack follows the log lines below), but this might be totally expected in this case:
2017-11-30 09:47:29,093 [WebSocketClient@579294521-28-selector-WebSocketClientSelectorManager@fb9c7aa/0] DEBUG org.eclipse.jetty.io.SelectorManager - Selector loop waiting on select
2017-11-30 09:47:29,093 [WebSocketClient@579294521-28-selector-WebSocketClientSelectorManager@fb9c7aa/0] DEBUG org.eclipse.jetty.io.SelectorManager - Selector loop woken up from select, 0/0 selected
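The interruption itself was forced with a reflection hack roughly like this (test-only; the private field name _threads is taken from the Jetty 9.2.x sources):

    import java.lang.reflect.Field;
    import java.util.Collection;
    import org.eclipse.jetty.util.thread.QueuedThreadPool;

    public class InterruptHack {
        // Interrupt every live pool thread to simulate the adverse timing.
        static void interruptAll(QueuedThreadPool pool) throws Exception {
            Field f = QueuedThreadPool.class.getDeclaredField("_threads");
            f.setAccessible(true);
            @SuppressWarnings("unchecked")
            Collection<Thread> threads = (Collection<Thread>) f.get(pool);
            for (Thread t : threads)
                t.interrupt();
        }
    }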

Best Regards,
Florian

Simone Bordet

Dec 1, 2017, 2:11:41 AM
to cometd-users
Hi,

On Thu, Nov 30, 2017 at 11:49 AM, Flo <fdum...@gmail.com> wrote:
> Hi Simone,
>
> I'm not sure whether this belongs on cometd-users or jetty, but since
> it occurred in a CometD context, I'm posting it here.
>
> We have three CometD (3.1.3) Java clients connected to a CometD server
> (on virtualized hardware) via WebSocket over localhost. The
> BayeuxClient is configured with both JettyWebsocketTransport and
> LongPollingTransport. The CometD server was killed during maintenance
> work and was restarted later. All three clients failed to reconnect.
> We saw the following weird exception in two of the client logs
> (unfortunately at level INFO only):
>
> 2017-11-27 15:12:27,187 [pool-3-thread-2] WARN
> com.ect.cometd.provider.CometDServiceProvider - Service provider metaNG
> connection broken: 901:websocket:server error.
> 2017-11-27 15:14:04,396 [HttpClient@551056181-19] WARN
> org.eclipse.jetty.util.thread.QueuedThreadPool -
> java.lang.IllegalMonitorStateException: null

This seems to be a JVM bug.

In the code we do:

    lock.lock();
    try {
        ...
    } finally {
        lock.unlock();
    }

And the IllegalMonitorStateException is thrown from the finally block.
I don't see how a thread could have entered the finally block without
locking the lock first.

What exact JDK are you using?
Oracle's, OpenJDK's, or Zulu's?

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.

Flo

Dec 1, 2017, 3:04:17 AM
to cometd-users
Hi Simone,

It looks really unbelievable when analyzing the code. There were also no _obvious_ code paths that caught my attention where another thread could get the chance to call _lock.unlock() before the failing thread reaches its own unlock... Or is there some crazy condition where this could actually happen, e.g. a thread gets interrupted at an adverse moment/place, followed by a context switch? But any other theoretical unlock attempt from a different thread should fail first...?
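Even on paper the interruption scenario doesn't seem to produce a mismatch, assuming the timed poll acquires its lock via lockInterruptibly() (which is what BlockingArrayQueue's timed poll appears to do). A sketch of the pattern:

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.ReentrantLock;

    public class PollSketch {
        // The finally block (and thus unlock()) is only reachable after
        // lockInterruptibly() has returned, i.e. after this thread owns
        // the lock. An interrupt before acquisition never reaches unlock(),
        // and when await() is interrupted it re-acquires the lock before
        // throwing, so the unlock() in finally is still an owned release.
        static void poll(ReentrantLock lock, Condition notEmpty, long ms)
                throws InterruptedException {
            lock.lockInterruptibly();
            try {
                notEmpty.await(ms, TimeUnit.MILLISECONDS);
                // ... dequeue an element in the real code ...
            } finally {
                lock.unlock();
            }
        }
    }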

On that server we're currently running the Oracle VM
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

OS is RHEL7 x86_64 on a VMware Virtual Platform. It's a production server with very conservative update cycles.

Best Regards,
Florian

Simone Bordet

Dec 1, 2017, 3:22:57 AM
to cometd-users
Hi,

On Fri, Dec 1, 2017 at 9:04 AM, Flo <fdum...@gmail.com> wrote:
> Hi Simone,
>
> It looks really unbelievable when analyzing the code. There were also
> no _obvious_ code paths that caught my attention where another thread
> could get the chance to call _lock.unlock() before the failing thread
> reaches its own unlock... Or is there some crazy condition where this
> could actually happen, e.g. a thread gets interrupted at an adverse
> moment/place, followed by a context switch? But any other theoretical
> unlock attempt from a different thread should fail first...?
>
> On that server we're currently running the Oracle VM
> java version "1.8.0_131"
> Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
> Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
>
> OS is RHEL7 x86_64 on a VMware Virtual Platform. It's a production
> server with very conservative update cycles.

If you can reliably reproduce the issue, perhaps file a bug report with Oracle?