I'm once again working with Netty's SCTP support, this time for a
project involving thousands of sensors that are read periodically, with
the data collected on a private network. SCTP's characteristics are pretty
ideal for small wads of metrics being shot over the wire.
In stress-testing the code (in ways that are unlikely to appear in real
life, but still...), I've run into an odd little corner case:
NioSctpChannel does the following to send a message:
    final int writtenBytes = javaChannel().send(nioData, mi);
    return writtenBytes > 0;
while the JDK's SctpChannelImpl, in the send() method that takes a ByteBuffer, does this:
    do {
        n = send(fdVal, buffer, messageInfo);
    } while ((n == IOStatus.INTERRUPTED) && isOpen());
and the send() method that takes a file descriptor winds up calling down into SctpChannelImpl.sendFromNativeBuffer() to do this:
    int written = send0(fd, ((DirectBuffer)bb).address() + pos, rem, addr,
                        port, -1 /*121*/, streamNumber, unordered, ppid);
    if (written > 0)
        bb.position(pos + written);
    return written;
and as best I can tell, send0() does not necessarily write all of the data.
So the client winds up receiving a truncated (but claiming to be complete) SctpMessage.
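To make the failure mode concrete, here is a minimal sketch in plain Java with no Netty or SCTP involved. shortWrite() is a made-up stand-in for send0() that accepts only part of the buffer (an assumption for illustration, not the JDK's actual behavior), and the two check methods are hypothetical names for the "written > 0" test versus a stricter comparison against the expected size:

```java
import java.nio.ByteBuffer;

public class ShortWriteDemo {
    // Hypothetical stand-in for send0(): accepts at most maxBytes, then
    // advances the buffer position like sendFromNativeBuffer() does.
    static int shortWrite(ByteBuffer bb, int maxBytes) {
        int pos = bb.position();
        int written = Math.min(bb.remaining(), maxBytes);
        if (written > 0) {
            bb.position(pos + written);
        }
        return written;
    }

    // The check quoted from NioSctpChannel: any positive count is success.
    static boolean looksWritten(int written) {
        return written > 0;
    }

    // A stricter check: the whole message must have been accepted.
    static boolean fullyWritten(int written, int expected) {
        return written == expected;
    }

    public static void main(String[] args) {
        ByteBuffer msg = ByteBuffer.allocate(100);
        int expected = msg.remaining();    // capture before sending
        int written = shortWrite(msg, 60); // simulate a partial write
        System.out.println("written " + written + " of " + expected);
        System.out.println("written > 0:         " + looksWritten(written));
        System.out.println("written == expected: " + fullyWritten(written, expected));
    }
}
```

With a simulated partial write, the first check reports success while the second does not, which is the mismatch described above.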
I've got a few workarounds:
- Flush on a delay instead of immediately (the test is really artificial
  in that it's slamming the server with messages in a loop)
- Store the actual number of bytes written via a threadlocal and check it
  (discarded this approach - if you aren't calling from an event loop
  thread, the threadlocal is useless, and the below is more reliable)
- Subclass NioSctpChannel, and have doWrite() compute its return value like this:
    int limit = nioData.limit();
    final int writtenBytes = javaChannel().send(nioData, mi);
    return writtenBytes == limit;
  This does still send garbage (the truncated message) across the wire, so
  have the first bytes of each SCTP message contain the length the sender
  expected to send; if the received buffer's length doesn't match, simply
  ignore it (since the sender will know the write failed and will retry it).
This solves the problem, but it is pretty ugly.
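For reference, the length-prefix check boils down to something like the sketch below. It is plain java.nio rather than the actual Netty handler code, and the class name, the method names and the 4-byte big-endian header are all assumptions made for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedFrame {
    // Assumed header: one 4-byte big-endian int holding the payload length.
    static final int HEADER_BYTES = Integer.BYTES;

    // Sender side: prepend the length the sender expects to send.
    static ByteBuffer frame(byte[] payload) {
        ByteBuffer out = ByteBuffer.allocate(HEADER_BYTES + payload.length);
        out.putInt(payload.length);
        out.put(payload);
        out.flip();
        return out;
    }

    // Receiver side: return the payload, or null if the message is
    // truncated (declared length does not match what actually arrived).
    static byte[] unframe(ByteBuffer received) {
        if (received.remaining() < HEADER_BYTES) {
            return null; // not even a complete header
        }
        ByteBuffer view = received.duplicate(); // leave the caller's buffer untouched
        int declared = view.getInt();
        if (declared < 0 || declared != view.remaining()) {
            return null; // length mismatch: ignore and wait for the retry
        }
        byte[] payload = new byte[declared];
        view.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        ByteBuffer full = frame("temp=21.5".getBytes(StandardCharsets.UTF_8));
        ByteBuffer truncated = full.duplicate();
        truncated.limit(full.limit() - 3); // simulate a short write
        System.out.println("full accepted:      " + (unframe(full) != null));
        System.out.println("truncated accepted: " + (unframe(truncated) != null));
    }
}
```

The receiver only trusts a message whose declared length matches the bytes that arrived, so a truncated message is dropped and the sender's retry delivers the real one.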
Any less ugly approaches? Or something dumb that I'm not thinking of that would solve the problem?
Thanks,
Tim