Java network code comparison


shah...@gmail.com

Feb 18, 2013, 12:30:48 AM
to mechanica...@googlegroups.com
Hi All,
I'm attempting to do a comparison of Java's main networking APIs: old IO, NIO and NIO2 and some variations within those APIs. 

My goal is to take up Martin's suggestion of actually doing tests rather than relying on folklore, and specifically to see how to optimize a FIX client library (https://github.com/falconair/FIXClient).

The code is here: https://gist.github.com/falconair/4975243

I would love to get some feedback before I clean it up further. If there are glaring errors or suggestions, please share them. I wrote the program so you can pretty much copy and paste into your editor, name the file correctly and run it (jdk 7). You should then be able to copy and paste the output into Excel to draw some charts.

The tests I ran on my laptop show numbers that are unlikely to surprise the audience here.

For lowest latency (sort by avgLatency), use NIO channels, but instead of blocking or using selection keys, just spin in a loop (I'm not sure how practical this is in a real system). The old InputStream gives the second-best performance, and NIO channels in blocking mode the third.
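
For readers who want to see what "spin in a loop" means here, a minimal sketch follows. It is not the code from the gist: the class name, the method names and the loopback demo are assumptions made for illustration, and the 80-byte buffer is borrowed from the recap later in the thread. The client puts the channel into non-blocking mode and simply calls read() again whenever it returns 0.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

// Hypothetical sketch, not the gist's code.
final class SpinReadSketch {
    private static final int LONG_BYTES = 8;

    /** Busy-spins on a non-blocking channel until count longs have been read. */
    static long[] spinReadLongs(SocketChannel channel, int count) throws IOException {
        channel.configureBlocking(false);
        ByteBuffer buffer = ByteBuffer.allocate(80); // assumed size, taken from the thread
        long[] values = new long[count];
        int received = 0;
        while (received < count) {
            int n = channel.read(buffer); // returns 0 at once when nothing is available
            if (n < 0) {
                throw new IOException("peer closed after " + received + " values");
            }
            if (n == 0) {
                continue; // spin: no blocking, no selector
            }
            buffer.flip();
            while (buffer.remaining() >= LONG_BYTES && received < count) {
                values[received++] = buffer.getLong();
            }
            buffer.compact(); // keep any partial long for the next read
        }
        return values;
    }

    /** Loopback demo: a server thread writes nanoTime values, the client spin-reads them. */
    static long[] demo(final int count) throws Exception {
        try (final ServerSocket server = new ServerSocket(0)) { // port 0 = any free port
            Thread writer = new Thread(new Runnable() {
                @Override
                public void run() {
                    try (Socket s = server.accept()) {
                        OutputStream out = s.getOutputStream();
                        ByteBuffer msg = ByteBuffer.allocate(LONG_BYTES);
                        for (int i = 0; i < count; i++) {
                            msg.clear();
                            msg.putLong(System.nanoTime());
                            out.write(msg.array());
                        }
                        out.flush();
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                }
            });
            writer.start();
            try (SocketChannel channel = SocketChannel.open(new InetSocketAddress(
                    InetAddress.getLoopbackAddress(), server.getLocalPort()))) {
                long[] values = spinReadLongs(channel, count);
                writer.join();
                return values;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("received " + demo(100_000).length + " timestamps");
    }
}
```

The obvious trade-off is that the spinning thread occupies a core even when no data is arriving, which is probably what makes it questionable outside a benchmark.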

For highest throughput (sort by nanosPerByte), use NIO channels while spinning in a loop, then NIO channels in blocking mode then InputStream.

BufferedInputStream is not always better than InputStream, and using Selectors generally performs badly (although I understand the point of selectors is to handle a large number of open connections, something I didn't test here).

(These numbers are from the last run of this code on my machine. Many of them are very close, so don't take this to be the final conclusion.)

Please note that I am mainly interested in testing latency, then throughput. I am NOT (yet) testing the number of clients a server connection can handle. In fact, I am really only testing single-threaded client-side connections. Currently I am just transferring timestamps from server to client, but I plan on sending a more complex message (basically I'll send a FIX message in ASCII, which will need to be parsed into key/value pairs).

I am also thinking of adding two additional choices: allocate vs allocateDirect and convert byte[] to long using a ByteBuffer vs bit masks.
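
To make those two choices concrete, here is a small sketch of the conversions being compared. The class and method names are made up for illustration, and big-endian byte order is assumed because that is ByteBuffer's default. It only shows that the variants agree on the result; it says nothing about which one is faster.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the variants mentioned above; names are illustrative.
final class BytesToLongSketch {

    /** Big-endian decode using shifts and masks; no objects are created. */
    static long viaBitMasks(byte[] b, int offset) {
        return ((b[offset]     & 0xFFL) << 56)
             | ((b[offset + 1] & 0xFFL) << 48)
             | ((b[offset + 2] & 0xFFL) << 40)
             | ((b[offset + 3] & 0xFFL) << 32)
             | ((b[offset + 4] & 0xFFL) << 24)
             | ((b[offset + 5] & 0xFFL) << 16)
             | ((b[offset + 6] & 0xFFL) << 8)
             |  (b[offset + 7] & 0xFFL);
    }

    /** Decode through a heap ByteBuffer view of the array (big-endian by default). */
    static long viaByteBuffer(byte[] b, int offset) {
        return ByteBuffer.wrap(b).getLong(offset); // absolute get, index == array index
    }

    /** Decode through a caller-supplied direct buffer, for the allocateDirect variant. */
    static long viaDirectBuffer(ByteBuffer direct, byte[] b, int offset) {
        direct.clear();
        direct.put(b, offset, 8); // copy the 8 bytes off-heap
        return direct.getLong(0);
    }

    public static void main(String[] args) {
        long now = System.nanoTime();
        byte[] bytes = ByteBuffer.allocate(8).putLong(now).array();
        ByteBuffer direct = ByteBuffer.allocateDirect(8);
        System.out.println(viaBitMasks(bytes, 0) == now);
        System.out.println(viaByteBuffer(bytes, 0) == now);
        System.out.println(viaDirectBuffer(direct, bytes, 0) == now);
    }
}
```

In a benchmark the heap or direct buffer would normally be allocated once and reused, so that allocation cost is not what ends up being measured.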

Kasper Nielsen

Feb 18, 2013, 4:18:33 AM
to mechanica...@googlegroups.com
Hi,

I'm very interested in this, especially in how real network AIO (AsynchronousChannel) stacks up against the selector-based AIO most libraries use.

- Kasper

Peter Lawrey

Feb 18, 2013, 4:34:09 AM
to Kasper Nielsen, mechanica...@googlegroups.com
Hello,
   The way I have implemented low latency FIX with Chronicle is to have multiple connections per busy-waiting thread. The library supports a single thread with multiple busy-waiting tasks with WaitingRunnable.  I use Chronicle to record every packet coming in and out, but also to do the string generation and parsing without producing garbage.  I would expect that if you include the cost of persistence and parsing the text, Chronicle might be a good layer to add.
   Note: if you want ultra low latency you will need a kernel bypass solution like Solarflare.  This can reduce the wire-to-Java latency from 20 microseconds to 5 microseconds, i.e. it makes more of a difference than which method you use to read the data in Java. Note: Solarflare claim a wire-to-C latency of 3.2 microseconds.

Regards,
  Peter.



--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Simone Bordet

Feb 19, 2013, 11:33:46 AM
to shah...@gmail.com, mechanica...@googlegroups.com
Hi,

On Mon, Feb 18, 2013 at 6:30 AM, <shah...@gmail.com> wrote:
> Hi All,
> I'm attempting to do a comparison of Java's main networking APIs: old IO,
> NIO and NIO2 and some variations within those APIs.
>
> My goal is to take up Martin's suggestion of actually doing tests, rather
> than relying on folklore and specifically to see if how to optimize a FIX
> client library (https://github.com/falconair/FIXClient).
>
> The code is here:
> https://gist.github.com/falconair/4975243
>
> I would love to get some feedback before I clean it up further. If there are
> glaring errors or suggestions, please share them. I wrote the program so you
> can pretty much copy and paste into your editor, name the file correctly and
> run it (jdk 7).

Just to recap, your server writes 8 bytes via the Socket output stream
100_000 times, and clients read using a buffer of 80 bytes.
It's a very particular benchmark, but that's ok-ish.

Some comments:

* There is no configuration for TCP_NODELAY, which I think should be
there since you're writing small byte arrays.
Call socket.setTcpNoDelay(true) on both the client and the server
(accepted) socket.
This made a *big* difference for me.

* The selector code is wrong. The contract of selector.select() is
that it returns > 0 if *new* reads are available.
But in the code you just set read interest once, never remove it, and
just read from the channel.
In fact, you will only perform 1 read of 80 bytes and that's it, not
reading the whole content from the server.
Removing the call to selector.select() makes this loop equal to
clientNonBlockedSpinChannel(), with same performance.
I am not sure for your simple benchmark you can compare apples to
apples using the selector.

* In my setup, the JIT stops compiling after 16 or so runs, so I'd say
only after those the JVM is warmed up properly.
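
For reference, a conventional selector read loop is sketched below. This is neither the gist's code nor Jetty's; the class name, method names and loopback demo are assumptions for illustration. It shows the two things usually expected of such a loop: each key is removed from the selected-key set after it is handled, and the channel is read until read() returns 0 before going back to select().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical sketch of a conventional selector loop; not the gist's code.
final class SelectorReadSketch {

    /** Reads until the peer closes the connection; returns the number of bytes read. */
    static long readAll(SocketChannel channel) throws IOException {
        channel.configureBlocking(false);
        ByteBuffer buffer = ByteBuffer.allocate(80); // assumed size, taken from the thread
        long total = 0;
        try (Selector selector = Selector.open()) {
            channel.register(selector, SelectionKey.OP_READ);
            boolean open = true;
            while (open) {
                selector.select(); // blocks until a registered channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // the selector does not clear this set itself
                    if (!key.isValid() || !key.isReadable()) {
                        continue;
                    }
                    int n;
                    // Drain the channel until it has nothing more for now.
                    while ((n = channel.read(buffer)) > 0) {
                        total += n;
                        buffer.clear(); // a real client would decode the bytes here
                    }
                    if (n < 0) { // end of stream
                        key.cancel();
                        open = false;
                    }
                }
            }
        }
        return total;
    }

    /** Loopback demo: the server writes count 8-byte values and closes. */
    static long demo(final int count) throws Exception {
        try (final ServerSocket server = new ServerSocket(0)) {
            Thread writer = new Thread(new Runnable() {
                @Override
                public void run() {
                    try (Socket s = server.accept()) {
                        OutputStream out = s.getOutputStream();
                        ByteBuffer msg = ByteBuffer.allocate(8);
                        for (int i = 0; i < count; i++) {
                            msg.clear();
                            msg.putLong(System.nanoTime());
                            out.write(msg.array());
                        }
                        out.flush();
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                }
            });
            writer.start();
            try (SocketChannel channel = SocketChannel.open(new InetSocketAddress(
                    InetAddress.getLoopbackAddress(), server.getLocalPort()))) {
                long total = readAll(channel);
                writer.join();
                return total;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read " + demo(100_000) + " bytes");
    }
}
```

With a single connection this does strictly more work per read than the spin loop, so it is only a fair comparison if the extra select() call is the thing being measured.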

Just for the record, at Jetty we have selectors performing similarly to
any other "mode" for slightly more complex benchmarks, but sure, it's
complex code to write (well, don't: use Jetty's client and server
libraries instead :)

Finally, a quick note on JDK 7's AsynchronousChannels: we considered
using them for Jetty 9 (JDK 7 based), but after a deeper look we did
not use them.
The problem lies in the AsynchronousChannels API: while it's tons
simpler to use than a Selector for the casual user, it is not
right for a high performance client/server (I stop here to not hijack
this thread).

--
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless. Victoria Livschitz

shah...@gmail.com

Feb 20, 2013, 11:05:32 AM
to mechanica...@googlegroups.com, shah...@gmail.com
Simone,
Thanks for the detailed feedback, very much appreciated.
I've modified the gist (https://gist.github.com/falconair/4975243) with some updated code.

I am sending just 8 bytes (which is just System.nanoTime()) because I only want to test the transfer of data from server to client (the client receives the timestamp and checks it against current nanoTime...and uses this to calculate latency). I actually have another version of this program which sends a FIX protocol message (a protocol used in finance, simple text key value pairs). Once I get this test right, I'll work on the more complex one.

Along with the timestamp, I am now also sending a simple counter. The clients get the data and confirm the value of the counter. This is to make sure I am not accidentally dropping messages. It doesn't look like any of the implementations are dropping any messages.
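
A rough sketch of that check, for anyone following along. The 16-byte layout (8-byte timestamp followed by 8-byte counter), the class name and the method names are assumptions, not taken from the gist. Note that System.nanoTime() values are only comparable within a single JVM, so this style of latency measurement assumes the server and client run in the same process.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; the message layout is an assumption.
final class TimestampedCounterSketch {
    static final int MESSAGE_BYTES = 16; // 8-byte timestamp + 8-byte counter

    /** Server side: writes one message at the buffer's current position. */
    static void encode(ByteBuffer out, long sentNanos, long counter) {
        out.putLong(sentNanos).putLong(counter);
    }

    /** Client side: checks the counter sequence and accumulates latency. */
    static final class Receiver {
        private long expectedCounter = 0;
        private long totalLatencyNanos = 0;

        /** Consumes every complete message; a trailing partial message is left unread. */
        void onData(ByteBuffer in, long nowNanos) {
            while (in.remaining() >= MESSAGE_BYTES) {
                long sentNanos = in.getLong();
                long counter = in.getLong();
                if (counter != expectedCounter) {
                    throw new IllegalStateException(
                            "expected counter " + expectedCounter + " but got " + counter);
                }
                expectedCounter++;
                totalLatencyNanos += nowNanos - sentNanos;
            }
        }

        long messagesReceived() {
            return expectedCounter;
        }

        double avgLatencyNanos() {
            return expectedCounter == 0 ? 0.0 : (double) totalLatencyNanos / expectedCounter;
        }
    }

    public static void main(String[] args) {
        ByteBuffer wire = ByteBuffer.allocate(4 * MESSAGE_BYTES);
        for (long i = 0; i < 4; i++) {
            encode(wire, System.nanoTime(), i);
        }
        wire.flip();
        Receiver receiver = new Receiver();
        receiver.onData(wire, System.nanoTime());
        System.out.println(receiver.messagesReceived() + " messages, avg latency "
                + receiver.avgLatencyNanos() + " ns");
    }
}
```

Because every message read in one batch is stamped with the same receive time, batching inflates the measured latency of the earlier messages in the batch; whether that matters depends on what the benchmark is meant to show.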

There does seem to be _some_ error in there which is causing some sort of overflow error. I'm debugging that now.

Regarding the selector code, it doesn't seem to be dropping any messages. The selector is also, by far, the slowest code; it is nowhere near the 'spin' code. I'm reading more about selectors to get a handle on the bug you reported.

I changed the server to use TCP_NODELAY. I'll probably modify the clients to test both true and false values for TCP_NODELAY; that is not in the test yet.
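
A small sketch of setting the flag on both ends, as suggested earlier in the thread. The class and method names are made up for illustration, and the loopback connection exists only so that the flag can be read back.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch; names are illustrative, not from the gist.
final class TcpNoDelaySketch {

    /** Applies the flag to both ends of a loopback connection; returns {client, accepted}. */
    static boolean[] connectWithNoDelay(boolean noDelay) throws IOException {
        try (ServerSocket server = new ServerSocket(0); // port 0 = any free port
             Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setTcpNoDelay(noDelay);   // client side
            accepted.setTcpNoDelay(noDelay); // server side: set on the accepted socket
            return new boolean[] { client.getTcpNoDelay(), accepted.getTcpNoDelay() };
        }
    }

    public static void main(String[] args) throws IOException {
        for (boolean noDelay : new boolean[] { true, false }) {
            boolean[] flags = connectWithNoDelay(noDelay);
            System.out.println("requested=" + noDelay
                    + " client=" + flags[0] + " accepted=" + flags[1]);
        }
    }
}
```

For the NIO clients, the same option can be set with channel.setOption(StandardSocketOptions.TCP_NODELAY, true) or through channel.socket().setTcpNoDelay(true).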

Hopefully by end of day today I'll be able to display the results in a chart and put them up somewhere.