RabbitMQ Java client: Unable to close channel when a server node is below the disk watermark


Guillaume Darmont

Dec 23, 2014, 5:43:21 AM
to rabbitm...@googlegroups.com

Env : RHEL 6.4, RabbitMQ Server 3.3.5, Erlang 17, Java amqp-client 3.4.2

Always reproduced


Hi RabbitMQ users, 


We currently have a problem when our Java client tries to close a channel while the remote server is blocked. In our case, the server is blocked because free disk space is below the configured watermark.

The Java client seems to wait for a response from the server, but since the server is blocked, it won’t respond.

So we end up in a situation where a client just wants to disconnect from a server because the server is blocked but, since it is blocked, the server won’t let the client disconnect. See the stack traces below.


What I don’t understand is why the close() method waits indefinitely for a reply (see k.getReply(-1); at ChannelN.java line 573). The reply is not even used, and waiting indefinitely is generally not a good idea in any case. The timeout should be configurable.
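
To illustrate the difference between an unbounded and a bounded wait for a reply, here is a minimal sketch using only the JDK (Java 8 or later assumed). `ReplyCell` is a hypothetical stand-in invented for this example; it is not the client's `BlockingCell`, and it does not show how the client is actually implemented.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical one-shot reply holder; NOT the amqp-client BlockingCell. */
class ReplyCell<T> {
    private final CompletableFuture<T> reply = new CompletableFuture<>();

    /** Called by the reader thread when the peer's reply arrives. */
    void set(T value) {
        reply.complete(value);
    }

    /**
     * Waits for the reply. A negative timeout means "wait forever",
     * which is the behavior a getReply(-1) style call would have.
     * A non-negative timeout gives up with TimeoutException instead.
     */
    T get(long timeoutMillis) throws TimeoutException {
        try {
            if (timeoutMillis < 0) {
                return reply.get();
            }
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

With a peer that never answers, `get(-1)` on such a cell never returns, while `get(5000)` throws `TimeoutException` after five seconds and lets the caller decide what to do next.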


What can we do to change this behavior?


Thanks for your answers,

Guillaume



"ServerService Thread Pool -- 48" prio=6 tid=0x000000000cf90800 nid=0x1ed0 in Object.wait() [0x0000000010fbe000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000ce951c68> (a com.rabbitmq.utility.BlockingValueOrException)
    at java.lang.Object.wait(Object.java:503)
    at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:50)
    - locked <0x00000000ce951c68> (a com.rabbitmq.utility.BlockingValueOrException)
    at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:65)
    - locked <0x00000000ce951c68> (a com.rabbitmq.utility.BlockingValueOrException)
    at com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:111)
    - locked <0x00000000ce951c68> (a com.rabbitmq.utility.BlockingValueOrException)
    at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:37)
    at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:349)
    at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:573)
    at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:505)
    at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:498)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.reset(CachingConnectionFactory.java:448)
    - locked <0x00000000cafce148> (a java.util.LinkedList)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$ChannelCachingConnectionProxy.destroy(CachingConnectionFactory.java:660)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.destroy(CachingConnectionFactory.java:425)
    at [...]


"pool-8-thread-1" prio=6 tid=0x000000001456c000 nid=0x2d78 runnable [0x000000001902f000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    - locked <0x00000000cb398af0> (a java.io.BufferedInputStream)
    at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288)
    at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
    at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
    - locked <0x00000000cb398ad0> (a java.io.DataInputStream)
    at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:534)
    at java.lang.Thread.run(Thread.java:745)


Michael Klishin

Dec 23, 2014, 5:56:05 AM
to Guillaume Darmont, rabbitm...@googlegroups.com
A blocked server stops reading from the socket, so it cannot receive a channel.close and send a channel.close-ok.

Use Channel#abort with a timeout.

MK

Guillaume Darmont

Dec 24, 2014, 2:44:47 AM
to rabbitm...@googlegroups.com, guillaum...@gmail.com
Thanks for the reply.

Since we're using spring-amqp, I had to recompile the original code to use the abort() method. But it didn't change anything; the call still blocks (see the stack trace below).
A quick look at the source code shows that the "abort" boolean is not checked before the blocking RPC call, so using abort() instead of close() cannot behave any differently in our case.

BTW, from the client's point of view, the AMQP client only wants to close its local channel, whatever the server state (running or blocked).
IMHO, the case of a blocked server when the client wants to close a channel should be handled internally.
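
As a client-side stopgap, the blocking call can be moved to a helper thread so that the caller waits only for a bounded time. The sketch below uses only the JDK (Java 8 or later assumed); `BoundedClose` and `tryClose` are hypothetical names invented for this example and are not part of amqp-client or spring-amqp. It does not close anything at the protocol level: if the wait is uninterruptible, as the `uninterruptibleGet` frame in the trace suggests, the helper thread stays parked until the server answers or the connection drops.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; NOT part of amqp-client or spring-amqp. */
final class BoundedClose {
    // Daemon threads, so a call that is stuck forever cannot keep the JVM alive.
    private static final ExecutorService CLOSER = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-close");
        t.setDaemon(true);
        return t;
    });

    private BoundedClose() {
    }

    /**
     * Runs closeAction on a helper thread and waits at most timeoutMillis.
     * Returns true if the action ended in time (normally or with an
     * exception), false if the caller gave up or was interrupted.
     */
    static boolean tryClose(Runnable closeAction, long timeoutMillis) {
        Future<?> done = CLOSER.submit(closeAction);
        try {
            done.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // cancel(true) only interrupts; an uninterruptible wait stays parked.
            done.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // The action failed, but it is no longer blocking the caller.
            return true;
        }
    }
}
```

A caller could wrap the close in a `Runnable` that catches the `IOException` thrown by `channel.close()` and pass it to `BoundedClose.tryClose(action, 5000)`. When the result is `false`, the channel should be treated as abandoned rather than closed.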

Thanks for your time, any suggestions are welcome.

Guillaume


"ServerService Thread Pool -- 48" prio=6 tid=0x000000000a47c800 nid=0x23f4 in Object.wait() [0x000000001346e000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000f1ea5b68> (a com.rabbitmq.utility.BlockingValueOrException)
    at java.lang.Object.wait(Object.java:503)
    at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:50)
    - locked <0x00000000f1ea5b68> (a com.rabbitmq.utility.BlockingValueOrException)
    at com.rabbitmq.utility.BlockingCell.get(BlockingCell.java:65)
    - locked <0x00000000f1ea5b68> (a com.rabbitmq.utility.BlockingValueOrException)
    at com.rabbitmq.utility.BlockingCell.uninterruptibleGet(BlockingCell.java:111)
    - locked <0x00000000f1ea5b68> (a com.rabbitmq.utility.BlockingValueOrException)
    at com.rabbitmq.utility.BlockingValueOrException.uninterruptibleGetValue(BlockingValueOrException.java:37)
    at com.rabbitmq.client.impl.AMQChannel$BlockingRpcContinuation.getReply(AMQChannel.java:349)
    at com.rabbitmq.client.impl.ChannelN.close(ChannelN.java:573)
    at com.rabbitmq.client.impl.ChannelN.abort(ChannelN.java:519)
    at com.rabbitmq.client.impl.ChannelN.abort(ChannelN.java:512)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.reset(CachingConnectionFactory.java:448)
    - locked <0x00000000cce44a78> (a java.util.LinkedList)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory$ChannelCachingConnectionProxy.destroy(CachingConnectionFactory.java:660)
    at org.springframework.amqp.rabbit.connection.CachingConnectionFactory.destroy(CachingConnectionFactory.java:425)
    at [...]

Michael Klishin

Dec 24, 2014, 4:40:21 AM
to Guillaume Darmont, rabbitm...@googlegroups.com
On 24 December 2014 at 10:44:49, Guillaume Darmont (guillaum...@gmail.com) wrote:
> BTW, from a client point of view, the amqp client only wants to
> close its local channel, whatever the server state (running
> or blocked).
> IMHO, the case of having a server blocked when the client wants
> to close a channel should be handled internally.

That would make client and server state go out of sync.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ