Issues with NIO SSL in java client


Dmitry Andrianov

Oct 13, 2017, 9:30:13 AM
to rabbitmq-users
Hi.

I raised a ticket for an issue in the Java client library: it makes the client randomly throw and abort connections when SSL and NIO mode are used - https://github.com/rabbitmq/rabbitmq-java-client/issues/317

However, there is something else I wanted to discuss which is slightly out of scope for that specific issue.


I see two problems:
1. First of all, 100ms is an arbitrary delay: the thread has to sleep for the whole interval even if data becomes available the very next millisecond.
2. When retryRead fails, SslEngineByteBufferInputStream just throws, complaining that it did not receive data from the network. But it only waited 300ms! What if the data takes longer to arrive?
I understand that this condition is probably very difficult to trigger in real-life use cases, but still, an application-level library should probably not be deciding what timeout is appropriate for data to arrive from the network. There is a whole TCP/IP stack for that.
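
To illustrate the first point, here is a minimal sketch of waiting on a Selector rather than sleeping for a fixed interval. It is not code from the RabbitMQ Java client: the class name SelectorWaitSketch and the method readWhenReady are made up for this example, and a java.nio.channels.Pipe stands in for the TLS socket. The point is only that select() returns as soon as data is readable, so the caller does not pay a fixed 100ms penalty.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical illustration; not code from the RabbitMQ Java client.
final class SelectorWaitSketch {

    // Blocks until the channel is readable or timeoutMillis elapses.
    // Returns the number of bytes read into dst, or -1 if nothing became readable.
    static int readWhenReady(Pipe.SourceChannel source, ByteBuffer dst, long timeoutMillis)
            throws IOException {
        try (Selector selector = Selector.open()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);
            // select() wakes up as soon as data is available; there is no fixed sleep.
            if (selector.select(timeoutMillis) == 0) {
                return -1;
            }
            return source.read(dst);
        }
    }

    public static void main(String[] args) throws Exception {
        Pipe pipe = Pipe.open();
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(20); // simulate data that arrives a little late
                pipe.sink().write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        ByteBuffer dst = ByteBuffer.allocate(16);
        long start = System.nanoTime();
        int n = readWhenReady(pipe.source(), dst, 5_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("read " + n + " bytes after ~" + elapsedMs + " ms");
        writer.join();
    }
}
```

The timeout passed to select() here is only an upper bound chosen by the caller; a real client would more likely stay in its selector loop and let the usual connection-level timeouts decide when to give up.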

I haven't checked the code in full, but I assume this logic is there because you are using the traditional (non-NIO) approach when reading from the socket. That is, some code needs to read a frame: it reads the frame size, then reads all the necessary bytes by "pulling" them from the SslEngineByteBufferInputStream. So NIO is only used to indicate when data is actually available on the socket, and then standard "synchronous" pulling starts.

I would think that NIO is supposed to be used the other way around: your network thread gets data from the socket when the data is available and passes it to a buffering reassembler whose job is to buffer these bits and pieces until a full frame can be built. It then passes that full frame on for processing, and at no point do you need to poll the socket trying to get more data from it. I could be wrong, of course, because I have never really worked with NIO myself, but this is how it reads to me.
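
A rough sketch of such a reassembler, to make the idea concrete. This is not how the Java client is implemented; FrameReassembler and feed are hypothetical names. The only thing taken as given is the AMQP 0-9-1 frame layout: a 7-byte header (type, channel, payload size), the payload, and a closing 0xCE octet. For brevity it returns payloads only and drops the type and channel.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are not taken from the RabbitMQ Java client.
final class FrameReassembler {
    private static final int HEADER_SIZE = 7;  // type (1) + channel (2) + payload size (4)
    private static final int FRAME_END = 0xCE; // AMQP 0-9-1 frame-end octet

    // Holds bytes that do not yet form a complete frame (kept in "write mode").
    private ByteBuffer pending = ByteBuffer.allocate(4096);

    // Accepts whatever bytes arrived and returns the payload of every frame now complete.
    List<byte[]> feed(ByteBuffer chunk) {
        ensureCapacity(chunk.remaining());
        pending.put(chunk);
        pending.flip(); // switch to reading the accumulated bytes

        List<byte[]> frames = new ArrayList<>();
        while (pending.remaining() >= HEADER_SIZE) {
            int start = pending.position();
            int payloadSize = pending.getInt(start + 3); // absolute read, position unchanged
            if (payloadSize < 0) {
                throw new IllegalStateException("invalid payload size");
            }
            long frameSize = (long) HEADER_SIZE + payloadSize + 1;
            if (pending.remaining() < frameSize) {
                break; // incomplete frame: wait for the next chunk instead of polling
            }
            byte[] payload = new byte[payloadSize];
            pending.position(start + HEADER_SIZE);
            pending.get(payload);
            if ((pending.get() & 0xFF) != FRAME_END) {
                throw new IllegalStateException("missing frame-end octet");
            }
            frames.add(payload);
        }
        pending.compact(); // keep any partial frame for the next call
        return frames;
    }

    // Grows the buffer when the incoming chunk does not fit in the free space.
    private void ensureCapacity(int extra) {
        if (pending.remaining() < extra) {
            int needed = pending.position() + extra;
            ByteBuffer bigger = ByteBuffer.allocate(Math.max(pending.capacity() * 2, needed));
            pending.flip();
            bigger.put(pending);
            pending = bigger;
        }
    }

    public static void main(String[] args) {
        byte[] payload = {10, 20, 30};
        ByteBuffer frame = ByteBuffer.allocate(HEADER_SIZE + payload.length + 1);
        frame.put((byte) 1).putShort((short) 0).putInt(payload.length)
             .put(payload).put((byte) FRAME_END);
        byte[] bytes = frame.array();

        FrameReassembler reassembler = new FrameReassembler();
        // The frame arrives in two pieces; only the second call completes it.
        System.out.println(reassembler.feed(ByteBuffer.wrap(bytes, 0, 5)).size());
        System.out.println(reassembler.feed(ByteBuffer.wrap(bytes, 5, bytes.length - 5)).size());
    }
}
```

With this shape the network thread only ever calls feed() with bytes it already has, so a short read just means "no frame yet" rather than a reason to sleep and retry.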

Cheers



Arnaud Cogoluègnes

Oct 16, 2017, 4:20:28 AM
to rabbitm...@googlegroups.com

I raised a ticket for an issue in the Java client library: it makes the client randomly throw and abort connections when SSL and NIO mode are used - https://github.com/rabbitmq/rabbitmq-java-client/issues/317


Thanks for your contribution, Dmitry. Your fix has been merged into 5.0.1 and backported to 4.3.0.
 
However, there is something else I wanted to discuss which is slightly out of scope for that specific issue.


I see two problems:
1. First of all, 100ms is an arbitrary delay: the thread has to sleep for the whole interval even if data becomes available the very next millisecond.
2. When retryRead fails, SslEngineByteBufferInputStream just throws, complaining that it did not receive data from the network. But it only waited 300ms! What if the data takes longer to arrive?
I understand that this condition is probably very difficult to trigger in real-life use cases, but still, an application-level library should probably not be deciding what timeout is appropriate for data to arrive from the network. There is a whole TCP/IP stack for that.


Yes, you're right, this hack is far from ideal. I filed an issue for it [1].
 
I haven't checked the code in full, but I assume this logic is there because you are using the traditional (non-NIO) approach when reading from the socket. That is, some code needs to read a frame: it reads the frame size, then reads all the necessary bytes by "pulling" them from the SslEngineByteBufferInputStream. So NIO is only used to indicate when data is actually available on the socket, and then standard "synchronous" pulling starts.


We haven't measured it yet, but this retry logic should be a corner case and should not be called for every single frame.
 
I would think that NIO is supposed to be used the other way around: your network thread gets data from the socket when the data is available and passes it to a buffering reassembler whose job is to buffer these bits and pieces until a full frame can be built. It then passes that full frame on for processing, and at no point do you need to poll the socket trying to get more data from it. I could be wrong, of course, because I have never really worked with NIO myself, but this is how it reads to me.


Yes, this is how NIO works. In the Java client NIO mode, the NioLoop class [2] is in charge of dealing with this logic (checking sockets for available inbound data, getting frames, and submitting their processing).

Dmitry Andrianov

Oct 16, 2017, 8:53:44 AM
to rabbitmq-users
Thank you, Arnaud.
Just to let you know: while I saw the issue as purely theoretical, it was actually quite easy to reproduce. Roughly one out of three runs of NioTlsUnverifiedConnection.java that I did while testing my changes ended with that "should be reading from the network" exception being thrown. To be fair, my laptop is a bit on the slower side.
It probably does not happen much in real life.

I am glad there is now an issue for that on your GitHub.

Cheers

Arnaud Cogoluègnes

Oct 16, 2017, 9:52:36 AM
to rabbitm...@googlegroups.com
Well, it's not that theoretical as I reproduced it with the new test case you provided in the PR. Out of curiosity, did you experience this issue also in production?

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+unsubscribe@googlegroups.com.
To post to this group, send email to rabbitmq-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dmitry Andrianov

Oct 16, 2017, 5:45:06 PM
to rabbitmq-users
Not that I am aware of.

Michael Klishin

Oct 16, 2017, 6:05:41 PM
to rabbitm...@googlegroups.com
Nonetheless, thank you for contributing (again).




--
MK

Staff Software Engineer, Pivotal/RabbitMQ