Hi all,
I think I have discovered a bug in the Java amqp-client.
It happens when a read blocks for more than requestedHeartbeat / 4 while reading a message payload.
VERSIONS:
Java amqp-client 5.8.0 (the issue is also present in the latest version)
Context:
We receive the following exception multiple times a day:
ForgivingExceptionHandler - An unexpected connection driver error occured
java.net.SocketTimeoutException: Read timed out
at java.base/java.net.SocketInputStream.socketRead0(Native Method)
at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1374)
at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:985)
at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:290)
at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:351)
at java.base/java.io.DataInputStream.readFully(DataInputStream.java:200)
at java.base/java.io.DataInputStream.readFully(DataInputStream.java:170)
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:113)
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:184)
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:645)
at java.base/java.lang.Thread.run(Thread.java:829)
I think the problem is on this line:
https://github.com/rabbitmq/rabbitmq-java-client/blob/32dfd0fb338a3ff1db2c79dfeb42c9fd451284a0/src/main/java/com/rabbitmq/client/impl/Frame.java#L112
The problem is that there is no check for SocketTimeoutException there (unlike on line 91), so instead of returning null, the SocketTimeoutException is thrown.
As a result, in AMQConnection.java it jumps to the handleFailure method, which shuts down the client.
If we returned null instead of throwing the exception, the existing logic would correctly check for missed heartbeats and wait two heartbeat intervals before terminating.
Example scenario:
1) User sets requestedHeartbeat to 16s.
2) This will result in setting _socket.setSoTimeout(4000) in SocketFrameHandler.java
3) Reading the payload gets blocked for more than 4s (for example, due to a connection issue)
4) The connection is terminated (even if we haven't missed any heartbeat yet)
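To make the numbers in this scenario concrete, here is a minimal sketch of the arithmetic only. It is not the client's code: the class and method names are made up for illustration, and it assumes the read timeout is requestedHeartbeat / 4 and that the connection should only be given up after two heartbeat intervals, as described above.

```java
public class HeartbeatTimeoutSketch {

    // Assumption: the socket read timeout is a quarter of the heartbeat interval.
    static int readTimeoutMillis(int requestedHeartbeatSeconds) {
        return requestedHeartbeatSeconds * 1000 / 4;
    }

    // Assumption: the connection is given up after two full heartbeat
    // intervals without traffic, so this many consecutive read timeouts
    // would have to be tolerated before terminating.
    static int timeoutsBeforeTermination(int requestedHeartbeatSeconds) {
        int twoIntervalsMillis = 2 * requestedHeartbeatSeconds * 1000;
        return twoIntervalsMillis / readTimeoutMillis(requestedHeartbeatSeconds);
    }

    public static void main(String[] args) {
        int heartbeat = 16; // seconds, as in the scenario
        System.out.println("read timeout: " + readTimeoutMillis(heartbeat) + " ms");            // 4000 ms
        System.out.println("tolerated timeouts: " + timeoutsBeforeTermination(heartbeat));      // 8
    }
}
```

So with a 16s heartbeat, a single 4s stall while reading the payload ends the connection, although by the heartbeat logic it would take 32s of silence.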
Possible Solution:
I think a possible solution is to catch SocketTimeoutException in the readFrom method in Frame.java wherever we access the DataInputStream variable, and return null in that case.
But I am not sure if this is the right approach, because we may lose the data we have already read in the Frame.readFrom method.
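To illustrate the concern about losing data, here is a rough, self-contained sketch of one way a payload read could survive a timeout: the bytes read so far are kept in a small state object, and the read is resumed on the next call. This is only an illustration and not the client's actual code. PartialRead, readInto and FlakyStream are hypothetical names, and FlakyStream only simulates a socket that times out once.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class ResumablePayloadReadSketch {

    /** Hypothetical holder for a payload that may be filled across several attempts. */
    static final class PartialRead {
        final byte[] buffer;
        int filled;
        PartialRead(int size) { buffer = new byte[size]; }
        boolean isComplete() { return filled == buffer.length; }
    }

    /**
     * Reads until the payload is complete. Returns false on a read timeout,
     * keeping the bytes read so far in state so a later call can resume.
     */
    static boolean readInto(InputStream in, PartialRead state) throws IOException {
        while (!state.isComplete()) {
            int n;
            try {
                n = in.read(state.buffer, state.filled, state.buffer.length - state.filled);
            } catch (SocketTimeoutException e) {
                return false; // caller can count this as a missed-heartbeat tick
            }
            if (n < 0) throw new EOFException("stream closed mid-payload");
            state.filled += n;
        }
        return true;
    }

    /** Simulated socket stream: returns small chunks and times out on one chosen call. */
    static final class FlakyStream extends InputStream {
        private final ByteArrayInputStream data;
        private final int chunk;
        private final int timeoutOnCall;
        private int calls;

        FlakyStream(byte[] bytes, int chunk, int timeoutOnCall) {
            this.data = new ByteArrayInputStream(bytes);
            this.chunk = chunk;
            this.timeoutOnCall = timeoutOnCall;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (++calls == timeoutOnCall) throw new SocketTimeoutException("Read timed out");
            return data.read(b, off, Math.min(len, chunk));
        }

        @Override
        public int read() throws IOException {
            byte[] one = new byte[1];
            return read(one, 0, 1) < 0 ? -1 : (one[0] & 0xFF);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new FlakyStream(payload, 2, 2); // second read call times out
        PartialRead state = new PartialRead(payload.length);

        System.out.println("first attempt complete: " + readInto(in, state));  // false
        System.out.println("bytes kept: " + state.filled);                     // 2
        System.out.println("second attempt complete: " + readInto(in, state)); // true
        System.out.println(new String(state.buffer, StandardCharsets.US_ASCII)); // hello
    }
}
```

Whether something like this is workable in the real client depends on how the buffered and TLS stream layers behave after a timeout, which I have not verified.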
Can anyone please confirm this issue?
Regards,
Ondrej