Aeron stability and transfer of big messages


ivenhov

May 28, 2015, 6:00:34 PM
to mechanica...@googlegroups.com
Hi 

What's the situation with Aeron at the moment?
How stable and bulletproof is it as a replacement for an existing RPC messaging system?
The current implementation is Java on the server side and Java and C++ on the client side, so I'm interested in both implementations.

I would also like to use a fast binary encoding, possibly SBE, along with Aeron.
Is my use case valid for those technologies with messages of around 64kB-128kB?

Todd Montgomery

May 28, 2015, 6:22:25 PM
to mechanica...@googlegroups.com
Martin and I have been working for the last couple of weeks on stability related to Aeron. Overall, it is getting pretty stable for normal use. It has also gotten a good deal faster lately.

The Aeron C++ API is incomplete at the moment and behind on the changes that have been made on the Java side, but it is next up on my list.

SBE and Aeron are integrated now. Using SBE with Aeron is quite simple and straightforward, as both use the Agrona DirectBuffer and MutableDirectBuffer in Java. In C++, it is also quite straightforward.

Aeron will fragment and reassemble as needed for large messages. By default, the max message size is about 8MB, IIRC, but it can be increased by increasing the term size if necessary.

-- Todd

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ivenhov

May 29, 2015, 4:56:03 AM
to mechanica...@googlegroups.com
Thanks Todd
 
> Aeron C++ API is incomplete at the moment and behind on the changes that have been done on the Java side. But is next up on my list to be done.

What features are not currently implemented in the C++ version?
Is there a roadmap I could have a look at?

Regards
Daniel

Todd Montgomery

May 29, 2015, 1:04:59 PM
to mechanica...@googlegroups.com
An official roadmap is not yet available.

In terms of what the C++ API has yet to do (in very rough priority):
- bring LogBuffer, ManyToOneRingBuffer, and Counter semantics in line with recent changes on the Java side
- bring API to/from driver message formats and file formats in line with recent changes on the Java side
- implement managed resource semantics from API connection management in line with recent additions to Java side
- implement the Publication offer/tryClaim semantics, specifically rotation
- implement the Subscription poll semantics, specifically rotation

-- Todd


ivenhov

Jun 7, 2015, 3:43:27 PM
to mechanica...@googlegroups.com
Thanks Todd.

I managed to set up a test environment with Aeron.

I want to test throughput on 10Gbit NICs with bigger messages, 128k.
I used StreamingPublisher and RateSubscriber as described in https://github.com/real-logic/Aeron/wiki/Performance-Testing with 128 message size.
I got 250MB/s.
Then I modified uk.co.real_logic.aeron.samples.BasicPublisher and uk.co.real_logic.aeron.samples.BasicSubscriber so that they reuse a 128kB message and do not print the message content.
I got 350MB/s.
On that test system I can get 9.2Gb/s with iperf without any problems or tuning.

Any idea what parameters or tests I should be using for my case to get closer to 10Gbit?

D.


Martin Thompson

Jun 8, 2015, 6:00:51 AM
to mechanica...@googlegroups.com, iwan....@gmail.com
I've just done some quick tests with the RateSubscriber and StreamingPublisher via local loopback. This will test the limitations of Aeron much more so than the network.

With 128k messages I can get 8.0x10^8 bytes per second with default settings.

If I go with the following settings for the MediaDriver I can then get 1.7x10^9 bytes per second at 128k messages. In fact pretty much any message size above 1k gives this rate.

 -Daeron.mtu.length=16384 -Daeron.socket.so_sndbuf=2097152 -Daeron.socket.so_rcvbuf=2097152 -Daeron.rcv.buffer.length=16384 -Daeron.rcv.initial.window.length=2097152 -Dagrona.disable.bounds.checks=true 

Are you using the latest build from GitHub?

Martin...

Martin Thompson

Jun 8, 2015, 6:02:48 AM
to mechanica...@googlegroups.com
It is also worth adding:

-XX:BiasedLockingStartupDelay=0 

Martin Thompson

Jun 8, 2015, 6:21:33 AM
to mechanica...@googlegroups.com
I noticed the wiki page only had settings for increasing the receive buffer and not the send buffer. I've updated it for both now.


ivenhov

Jun 8, 2015, 7:55:30 AM
to mechanica...@googlegroups.com
Thanks Martin

Much better now, but still not 10Gb.
BTW, when I start MediaDriver I see two cores spinning on both node1 and node2. Is that normal?

What would be the next thing to tune to fill the bandwidth?

MediaDriver on both machines

java -cp aeron-samples/build/libs/aeron-samples-0.1-SNAPSHOT.jar:aeron-driver-0.1-SNAPSHOT.jar:Agrona-0.3.2-SNAPSHOT.jar:aeron-common-0.1-SNAPSHOT.jar -XX:BiasedLockingStartupDelay=0 -Daeron.mtu.length=16384 -Daeron.socket.so_sndbuf=2097152 -Daeron.socket.so_rcvbuf=2097152 -Daeron.rcv.buffer.length=2097152 -Daeron.rcv.initial.window.length=2097152 -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.LowLatencyMediaDriver


java -Daeron.sample.channel=udp://10.173.240.2:40123 -cp ./aeron-samples/build/classes/main:Agrona-0.3.2-SNAPSHOT.jar:aeron-client-0.1-SNAPSHOT.jar:aeron-common-0.1-SNAPSHOT.jar -Daeron.sample.messageLength=131072 -Daeron.sample.messages=500000000 -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.StreamingPublisher

java -Daeron.sample.channel=udp://10.173.240.2:40123 -cp ./aeron-samples/build/classes/main:Agrona-0.3.2-SNAPSHOT.jar:aeron-client-0.1-SNAPSHOT.jar:aeron-common-0.1-SNAPSHOT.jar -Dagrona.disable.bounds.checks=true -Daeron.sample.frameCountLimit=256 uk.co.real_logic.aeron.samples.RateSubscriber

Result:

New connection on udp://10.173.240.2:40123 streamId=10 sessionId=-888791179 at position=0 from 10.173.240.1:58607
1.9e+03 msgs/sec, 2.5e+08 bytes/sec, totals 1919 messages 239 MB
4.7e+03 msgs/sec, 6.2e+08 bytes/sec, totals 6642 messages 830 MB
4.9e+03 msgs/sec, 6.4e+08 bytes/sec, totals 11525 messages 1440 MB
5.1e+03 msgs/sec, 6.7e+08 bytes/sec, totals 16612 messages 2076 MB
5.1e+03 msgs/sec, 6.6e+08 bytes/sec, totals 21666 messages 2708 MB
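
To compare the figures above with the 10Gbit target, the byte rates can be converted to Gbit/s. The following is a minimal sketch, not part of the Aeron samples: the class and method names are made up for illustration, and it assumes 1 Gbit = 10^9 bits and the 131072-byte message length from the publisher command line.

```java
// Hypothetical helper, not part of Aeron: converts RateSubscriber-style
// figures into Gbit/s so they can be compared with the NIC line rate.
final class ThroughputCheck {
    /** Converts a byte rate to gigabits per second (1 Gbit = 1e9 bits). */
    static double gbitPerSec(double bytesPerSec) {
        return bytesPerSec * 8.0 / 1e9;
    }

    /** Byte rate implied by a message rate and a fixed message length. */
    static double impliedBytesPerSec(double msgsPerSec, int messageLength) {
        return msgsPerSec * messageLength;
    }

    public static void main(String[] args) {
        int messageLength = 131072; // -Daeron.sample.messageLength from the post
        // {msgs/sec, bytes/sec} pairs copied from the steady-state output lines.
        double[][] samples = {{4.7e3, 6.2e8}, {4.9e3, 6.4e8}, {5.1e3, 6.7e8}, {5.1e3, 6.6e8}};
        for (double[] s : samples) {
            System.out.printf("reported %.2e bytes/sec (implied %.2e) = %.2f Gbit/s%n",
                s[1], impliedBytesPerSec(s[0], messageLength), gbitPerSec(s[1]));
        }
    }
}
```

Under those assumptions, 6.6e+08 bytes/sec works out to about 5.3 Gbit/s, roughly half of the line rate.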


Java
m@node2:~$ java -version
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)


System properties

m@node2:~$ sysctl net.core

net.core.bpf_jit_enable = 0
net.core.dev_weight = 64
net.core.message_burst = 10
net.core.message_cost = 5
net.core.netdev_budget = 300
net.core.netdev_max_backlog = 1000
net.core.netdev_tstamp_prequeue = 1
net.core.optmem_max = 20480
net.core.rmem_default = 212992
net.core.rmem_max = 2097152
net.core.rps_sock_flow_entries = 0
net.core.somaxconn = 128
net.core.warnings = 1
net.core.wmem_default = 212992
net.core.wmem_max = 2097152
net.core.xfrm_acq_expires = 30
net.core.xfrm_aevent_etime = 10
net.core.xfrm_aevent_rseqth = 2
net.core.xfrm_larval_drop = 1


D.

Martin Thompson

Jun 8, 2015, 8:05:48 AM
to mechanica...@googlegroups.com
Try starting the driver and clients with the following command in front of them and see what you get.

    $ numactl --membind=0 --cpunodebind=0 <your normal command line>

You can play with thread usage via the driver threading mode and idle strategies.





Todd Montgomery

Jun 8, 2015, 11:00:25 AM
to mechanica...@googlegroups.com
Some more suggestions for maxing out Bytes/sec.

You will want to see if the machines, NICs, and switches can be saturated. I would start with TCP and see if TCP can saturate the link and what it takes to make it do so, i.e. send() and receive() call sizes, as well as SO_RCVBUF, SO_SNDBUF, and other settings. Aeron will be different, but this will give you a baseline to extrapolate from.
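
A rough sketch of such a TCP baseline in plain Java is below. It is an illustration only: the class name, method name, and parameter values are assumptions, and it runs sender and receiver in one process over loopback. For a real baseline the two halves would run on separate hosts across the 10GigE link.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical TCP baseline: varies write size and SO_SNDBUF/SO_RCVBUF.
final class TcpBaseline {
    /** Sends totalBytes over loopback TCP and returns the number of bytes received. */
    static long transfer(long totalBytes, int writeSize, int socketBufferSize) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Set before accept() so the accepted socket inherits the SO_RCVBUF request.
            server.setReceiveBufferSize(socketBufferSize);
            Thread sender = new Thread(() -> {
                try (Socket socket = new Socket()) {
                    socket.setSendBufferSize(socketBufferSize); // SO_SNDBUF request
                    socket.connect(server.getLocalSocketAddress());
                    OutputStream out = socket.getOutputStream();
                    byte[] chunk = new byte[writeSize];
                    long remaining = totalBytes;
                    while (remaining > 0) {
                        int n = (int) Math.min(chunk.length, remaining);
                        out.write(chunk, 0, n);
                        remaining -= n;
                    }
                } catch (IOException e) {
                    throw new IllegalStateException(e);
                }
            });
            sender.start();
            long received = 0;
            try (Socket socket = server.accept()) {
                InputStream in = socket.getInputStream();
                byte[] buffer = new byte[writeSize];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    received += n; // count bytes until the sender closes the connection
                }
            }
            sender.join();
            return received;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long total = 256L * 1024 * 1024;
        long start = System.nanoTime();
        long received = transfer(total, 128 * 1024, 2 * 1024 * 1024);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d bytes in %.3f s = %.2f Gbit/s%n",
            received, seconds, received * 8.0 / 1e9 / seconds);
    }
}
```

The operating system may clamp the requested buffer sizes (for example to net.core.rmem_max and net.core.wmem_max on Linux), so the values actually in effect are worth checking with getSendBufferSize() and getReceiveBufferSize().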

128K is probably too big as a message size. You will want to size everything so there is no additional work for fragmentation and reassembly (both IP and Aeron). Also, Aeron has a 24 byte header per fragment. So, to get the most out of everything, you will want messages to fit in the buffers and window in multiples of that (including the headers).
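
The fragment arithmetic can be sketched as follows. This is an illustration under stated assumptions, not Aeron code: it assumes the 24 byte header counts against the Aeron MTU, ignores any frame alignment padding, and uses made-up class and method names.

```java
// Hypothetical arithmetic helper, not part of Aeron.
final class FragmentMath {
    static final int HEADER_LENGTH = 24; // per-fragment header size quoted in the post

    /** Payload bytes per fragment, assuming the header counts against the MTU. */
    static int maxPayload(int mtu) {
        return mtu - HEADER_LENGTH;
    }

    /** Number of fragments needed to carry one message (ceiling division). */
    static int fragmentsFor(int messageLength, int mtu) {
        int payload = maxPayload(mtu);
        return (messageLength + payload - 1) / payload;
    }

    /** Largest message that fills exactly the given number of fragments. */
    static int largestMessageFor(int fragments, int mtu) {
        return fragments * maxPayload(mtu);
    }

    public static void main(String[] args) {
        int mtu = 16384; // -Daeron.mtu.length from the thread
        int messageLength = 131072;
        System.out.println("fragments for 128K message: " + fragmentsFor(messageLength, mtu));
        System.out.println("largest message in 8 fragments: " + largestMessageFor(8, mtu));
    }
}
```

With these assumptions, a 131072-byte message at a 16384-byte MTU spills into a ninth, almost empty fragment, while a 130880-byte message fills exactly eight.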

You will want to size the Aeron MTU to the network MTU (taking IP and UDP headers into account). I would recommend jumbo frames if using 10GigE and Java, as you want to minimize the number of system calls and trade that off against IP fragmentation. What works for loopback might not work over a LAN, as the MTUs are different.
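
The header accounting can be sketched as below, assuming IPv4 without options (20 byte header) and UDP (8 byte header). The class and method names are hypothetical, and any additional constraints Aeron places on its MTU value are not modelled.

```java
// Hypothetical arithmetic helper: IPv4/UDP overhead and IP fragmentation counts.
final class IpFragmentMath {
    static final int IPV4_HEADER = 20; // IPv4 header without options
    static final int UDP_HEADER = 8;

    /** Largest UDP payload that fits in one IP packet at the given network MTU. */
    static int maxUdpPayload(int networkMtu) {
        return networkMtu - IPV4_HEADER - UDP_HEADER;
    }

    /** Number of IP packets needed to carry one UDP datagram with the given payload. */
    static int ipPackets(int udpPayload, int networkMtu) {
        int ipPayload = udpPayload + UDP_HEADER;
        int perPacket = networkMtu - IPV4_HEADER;
        if (ipPayload <= perPacket) {
            return 1; // fits without fragmentation
        }
        // Every fragment except the last must carry a multiple of 8 bytes.
        int perFragment = (perPacket / 8) * 8;
        int beforeLast = ipPayload - perPacket;
        return 1 + (beforeLast + perFragment - 1) / perFragment;
    }

    public static void main(String[] args) {
        int aeronMtu = 16384; // -Daeron.mtu.length from the thread
        for (int networkMtu : new int[] {1500, 9000}) {
            System.out.printf("network MTU %d: max UDP payload %d, IP packets per %d-byte datagram: %d%n",
                networkMtu, maxUdpPayload(networkMtu), aeronMtu, ipPackets(aeronMtu, networkMtu));
        }
    }
}
```

Under these assumptions, each 16384-byte datagram is split into 12 IP packets at a 1500-byte network MTU and 2 with 9000-byte jumbo frames, while a payload of 8972 bytes or less travels unfragmented over jumbo frames.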

