Aeron perf testing


Jan van Oort

Nov 14, 2014, 10:06:38 AM
to mechanica...@googlegroups.com
Hello everyone, 

My use case for Aeron: upload a ( large ) file to Apache HDFS, and while uploading it, "park" its lines somewhere in order to create an index upon the file. Apache HDFS being cluster software, I need a high-performance transport protocol and distributed log to "park" the file's bytes on the cluster, and am in the process of replacing Apache Kafka with Aeron. This afternoon, I broke all my personal speed, throughput and latency records with Aeron. 

The following numbers were obtained on the most powerful of my home machines. The setup was, for now, extremely simple: read a 101 GB file from a device over eSATA, line after line, and publish it as fast as possible to Aeron. A line, in this file ( an enormous log file from a web crawler ), averages 315 bytes.


Machine config: 
Fujitsu TX 200 R7 / Intel E5-2420 ( 1 * 6 cores physical, 1*12 cores logical, 1.9 GHz, 15 MB cache ) / 16 GB RAM PC12300 @1333 MHz / 2 * 300 GB HDD, 7.2k rpm, RAID-0
eSATA device: 4 * 500 GB HDD, 7.2k rpm, RAID 0

OS: Ubuntu Server 14.04.1 LTS ( headless )

JDK: Oracle 1.8.0
JDK settings: vanilla, if not for -Xms8000m -Xmx8000m

I created my own AeronPublisher, which keeps offering and offering a line until the operation goes through, i.e. NO backoff or idle spin strategy: we simply need to jam the entire file through Aeron :-) I also wrote an AeronRateSubscriber for doing the test, very much similar to the one provided in the samples. 

Results: 
=> top throughput flirts with 1,100,000 lines / second = 346.5 MB ( payload ) per second, more than twice that of Apache Kafka on the same config, see RateSubscriberScreenshot
=> latency 99.99 percentile = 49 nanoseconds ( !!! ), see HdrHistogram text file

=> 346.5 MB / second = 1.247 TeraByte / hour; the world indexing record for Apache Lucene being currently at around 280 GB / hour, Aeron puts me into a very comfortable position to try and break the Lucene indexing record :-)
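
The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 MB = 1,000,000 bytes, 1 TB = 1,000,000 MB) and takes the 1,100,000 lines / second and 315 bytes / line from the post; the class and method names are made up for illustration.

```java
// Back-of-the-envelope check of the throughput figures quoted in the post.
final class ThroughputMath
{
    // Payload rate in decimal megabytes (1 MB = 1,000,000 bytes) per second.
    static double megabytesPerSecond( final long linesPerSecond, final int avgLineBytes )
    {
        return ( linesPerSecond * (double) avgLineBytes ) / 1_000_000.0;
    }

    // Converts MB/s to decimal terabytes (1 TB = 1,000,000 MB) per hour.
    static double terabytesPerHour( final double megabytesPerSecond )
    {
        return megabytesPerSecond * 3_600.0 / 1_000_000.0;
    }

    public static void main( final String[] args )
    {
        final double mbPerSec = megabytesPerSecond( 1_100_000L, 315 );
        System.out.printf( "%.1f MB/s, %.3f TB/hour%n", mbPerSec, terabytesPerHour( mbPerSec ) );
    }
}
```

With the posted inputs this reproduces 346.5 MB / second and roughly 1.247 TB / hour, so the two headline numbers are consistent with each other.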

=> I have no explanation yet for the outliers in the attached HdrHistogramScreenshot, which peak at around 111.5 microseconds, that is 2271 times the 99.99 percentile ?


Proof that the Aeron guys are doing a great job ! 
I'll be back ASAP with a "true" clustering perfs benchmark :-D 


RateSubscriberScreenshot.jpg
HdrHistogram.txt

Martin Thompson

Nov 14, 2014, 11:39:57 AM
to mechanica...@googlegroups.com
Thanks for sharing the results.

What network are you testing over? I don't know any that have a RTT measured in 10s of nanoseconds! Something seems a bit fishy with the latency figures.

Do your tests generate any garbage? If so, the GC pauses are likely the cause of the latency spikes.

jHiccup is a good way to baseline how quiet your machine is.



--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Todd Montgomery

Nov 14, 2014, 11:47:11 AM
to mechanica...@googlegroups.com
Awesome stuff! Thanks for sharing!

-- Todd

Jan van Oort

Nov 14, 2014, 12:17:44 PM
to mechanica...@googlegroups.com
Agree. I posted the figures anyway, trusting that you would immediately go for the fishy 0.1xy nanoseconds. I used the standard HdrHistogram, the way you guys use it: 

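                // Time only the hand-off to Aeron: from just before the buffer copy
                // until offer() finally accepts the message.
                t1 = System.nanoTime();
                __BUFFER.putBytes( 0, b );
                boolean result = publication.offer( __BUFFER, 0, b.length );

                // Busy-spin with no backoff until the offer is accepted.
                while ( !result )
                {
                    result = publication.offer( __BUFFER, 0, b.length );
                }
                t2 = System.nanoTime();
                __HISTOGRAM.recordValue( t2 - t1 );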


Did this test just hit a bug in Histogram ? I **did** notice that a lot of multithreading goes on, the test generated a Unix load between 11 and 12, of which exactly 2.06 could be ascribed to the Aeron media driver. 

I used a 10 GigE network, was lucky to borrow some stuff :-) 

I "mavenized" Aeron BTW, and ran through maven exec:java. I know - you guys think differently there. I simply can't live with gradle... 






Fortuna audaces adiuvat - hos solos ? 

Jan van Oort

Nov 14, 2014, 12:19:59 PM
to mechanical-sympathy
Todd, you're welcome. Keep braced for more until the end of the year :-D




Martin Thompson

Nov 14, 2014, 12:21:17 PM
to mechanica...@googlegroups.com
Ah!

You are just measuring the time to offer the message to Aeron, not the time taken to transmit the message. For round trip time (RTT) you need to measure the total time from publisher to subscriber and back again.
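
To make the distinction concrete, here is a minimal sketch that times both quantities for a single message. It is not Aeron code: two in-process queues and an echo thread are stand-ins (assumptions) for the publication, the network and the remote subscriber, and the class name is hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: in-process queues stand in for the network and for Aeron.
final class OfferVersusRoundTrip
{
    // Returns { nanos until the offer was accepted, nanos until the echo came back }.
    static long[] measureOnce()
    {
        final BlockingQueue<byte[]> outbound = new ArrayBlockingQueue<>( 16 );
        final BlockingQueue<byte[]> inbound = new ArrayBlockingQueue<>( 16 );

        // Stand-in for the remote subscriber: echo one message straight back.
        final Thread echo = new Thread( () ->
        {
            try
            {
                inbound.put( outbound.take() );
            }
            catch ( final InterruptedException e )
            {
                Thread.currentThread().interrupt();
            }
        } );
        echo.setDaemon( true );
        echo.start();

        final byte[] message = new byte[ 315 ]; // average line size from the thread

        try
        {
            final long t1 = System.nanoTime();
            while ( !outbound.offer( message ) )
            {
                // busy-spin until accepted, as in the posted test
            }
            final long t2 = System.nanoTime(); // offer accepted: what the posted test timed

            inbound.take();                    // wait for the echo to arrive
            final long t3 = System.nanoTime(); // full round trip

            return new long[] { t2 - t1, t3 - t1 };
        }
        catch ( final InterruptedException e )
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException( "interrupted while waiting for the echo", e );
        }
    }

    public static void main( final String[] args )
    {
        final long[] nanos = measureOnce();
        System.out.println( "offer: " + nanos[ 0 ] + " ns, round trip: " + nanos[ 1 ] + " ns" );
    }
}
```

The first number only covers the copy into a local buffer, which is why it can be tens of nanoseconds; the second includes the hand-over to another thread and back, and over a real network it would also include the wire time.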

Jan van Oort

Nov 14, 2014, 12:30:03 PM
to mechanical-sympathy
That is actually what I *want* to measure. I am uploading a file to HDFS, and need to "park" its lines somewhere. In Aeron, that is. As soon as Aeron says "OK, got it", I trust that the subscriber will [ eventually ? ] consume it -- if the subscriber doesn't, my AeronPublisher will get blocked anyway, right ? 




Martin Thompson

Nov 14, 2014, 12:42:22 PM
to mechanica...@googlegroups.com
> That is actually what I *want* to measure. I am uploading a file to HDFS, and need to "park" its lines somewhere. In Aeron, that is. As soon as Aeron says "OK, got it", I trust that the subscriber will [ eventually ? ] consume it -- if the subscriber doesn't, my AeronPublisher will get blocked anyway, right ? 

If offer returns true then the message is saved in the publication buffer. If the driver is out of process then you are safe even from the client crashing at this stage. The driver will then send this reliably to a single subscriber in the unicast case, or multiple subscribers in the multicast case. If the message is lost in transmission it will be NAK'ed for by the receiver. The receiver will then hold it in the connection buffer until all subscriptions have consumed it. 

Aeron is reliable but not guaranteed delivery. If the subscriber dies and times out then the message delivery will not be retried.  A protocol could be layered on top of Aeron to archive published messages so a subscriber could recover but that is not in the base functionality.

If the subscriber cannot keep up, or dies, then the publication will back up due to flow control and offer will return false.
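
The back-pressure behaviour described here can be sketched with a bounded buffer. This is an analogy, not Aeron's API: `BackPressureSketch` and its methods are hypothetical, and an `ArrayBlockingQueue` stands in for the publication buffer. Bounding the number of retries, instead of spinning forever, is one way for a publisher to notice that the subscriber has stopped consuming.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical stand-in for a publication: a bounded buffer whose offer()
// returns false once the consumer has fallen too far behind.
final class BackPressureSketch
{
    private final ArrayBlockingQueue<byte[]> buffer;

    BackPressureSketch( final int capacity )
    {
        buffer = new ArrayBlockingQueue<>( capacity );
    }

    // Non-blocking: true if the message was accepted, false if the buffer is full.
    boolean offer( final byte[] message )
    {
        return buffer.offer( message );
    }

    // Consumer side: returns the next message, or null if nothing is waiting.
    byte[] poll()
    {
        return buffer.poll();
    }

    // Retries a rejected offer up to maxAttempts times, yielding between attempts.
    // Returns false if the buffer never drained, e.g. because the consumer died.
    boolean offerWithRetry( final byte[] message, final int maxAttempts )
    {
        for ( int attempt = 0; attempt < maxAttempts; attempt++ )
        {
            if ( offer( message ) )
            {
                return true;
            }
            Thread.yield();
        }
        return false;
    }

    public static void main( final String[] args )
    {
        final BackPressureSketch publication = new BackPressureSketch( 2 );
        final byte[] line = new byte[ 315 ];

        System.out.println( publication.offer( line ) );              // accepted
        System.out.println( publication.offer( line ) );              // accepted, buffer now full
        System.out.println( publication.offerWithRetry( line, 3 ) );  // nobody is consuming
    }
}
```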

Todd Montgomery

Nov 14, 2014, 12:42:34 PM
to mechanica...@googlegroups.com
Aeron does not provide transactional guarantees. So, when an offer returns, the message has either been rejected or been placed in the shared memory log buffer to be sent by the driver. The driver does its best to send it as flow control allows.

But Aeron does try its best to get that to the subscribers. It's a best-effort delivery guarantee much like TCP, but slightly better in that it can handle certain glitches in connectivity that TCP can't. But it doesn't guarantee it.

-- Todd

Todd Montgomery

Nov 14, 2014, 12:57:20 PM
to mechanica...@googlegroups.com
I should mention that the protocol design is deliberately done to allow for "guaranteed" delivery without additional frames or operation.

-- Todd

Jan van Oort

Nov 14, 2014, 1:08:48 PM
to mechanical-sympathy
Right. At first sight, it seems this was what I was aiming for. I will need to check, however, if "reliable" is enough, or if I need guaranteed delivery. A transactional guarantee is too strong; there is no need for that ( yet ) in my use case. In brief: I need to figure out a way to handle the case where a subscriber dies. Preferably without the hassle of putting some protocol on top of Aeron. 

Hm. 




ymo

Nov 14, 2014, 3:08:04 PM
to mechanica...@googlegroups.com
Something I like in Kafka is that they put the burden of the "guaranteed" delivery on the subscriber side. The subscriber keeps track of the last message it received/processed. If it crashes, then after coming back alive it connects back to the *store* and asks for all the missed messages to be played back. On the sender side they store all the successfully sent messages to disk and replay them when a new subscriber asks for something outdated. The files are kept on disk for a configurable finite time (say 1/2 weeks) and deleted afterwards.

Assuming that you have a dumb subscriber next to the publisher, you could write the bytes to a file in an append-only fashion and keep track of sent bytes in this way. Playing back these bytes should also be fairly straightforward. If I were writing this, the real questions for me would be:
1) Can Aeron detect when all subscribers are *gone* (crashed) ?
2) How big should the disk storage file be ? 1* logfile size ? Maybe 2* ?
3) Should the storage file be a memory-mapped file as well ? Or a mounted memory file system ?
4) How would I make a new subscriber "join" the list of receivers for a particular Aeron publisher in a live fashion once it has caught up on all the past items ?
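
The append-only store and playback idea could look roughly like the sketch below. Everything in it is an assumption for illustration: the `AppendOnlyStore` class, the length-prefixed record layout, and the use of a byte offset as the position a subscriber remembers. It reopens the file on every call to keep the code short, which a real implementation would avoid.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical append-only store: each record is a 4-byte length followed by UTF-8 bytes.
final class AppendOnlyStore
{
    private final Path file;

    AppendOnlyStore( final Path file )
    {
        this.file = file;
    }

    // Convenience factory for experiments: backs the store with a temporary file.
    static AppendOnlyStore inTempFile()
    {
        try
        {
            final Path path = Files.createTempFile( "parked-lines", ".log" );
            path.toFile().deleteOnExit();
            return new AppendOnlyStore( path );
        }
        catch ( final IOException e )
        {
            throw new UncheckedIOException( e );
        }
    }

    // Appends one record at the end of the file and returns the offset just past it.
    // A subscriber that has processed this record would remember that offset.
    long append( final String line )
    {
        final byte[] bytes = line.getBytes( StandardCharsets.UTF_8 );
        try ( RandomAccessFile raf = new RandomAccessFile( file.toFile(), "rw" ) )
        {
            raf.seek( raf.length() );
            raf.writeInt( bytes.length );
            raf.write( bytes );
            return raf.getFilePointer();
        }
        catch ( final IOException e )
        {
            throw new UncheckedIOException( e );
        }
    }

    // Replays every record stored at or after the given offset, in order.
    List<String> replayFrom( final long offset )
    {
        final List<String> lines = new ArrayList<>();
        try ( RandomAccessFile raf = new RandomAccessFile( file.toFile(), "r" ) )
        {
            raf.seek( offset );
            while ( raf.getFilePointer() < raf.length() )
            {
                final byte[] bytes = new byte[ raf.readInt() ];
                raf.readFully( bytes );
                lines.add( new String( bytes, StandardCharsets.UTF_8 ) );
            }
        }
        catch ( final IOException e )
        {
            throw new UncheckedIOException( e );
        }
        return lines;
    }

    public static void main( final String[] args )
    {
        final AppendOnlyStore store = inTempFile();
        final long resumeOffset = store.append( "line one" );
        store.append( "line two" );
        System.out.println( store.replayFrom( resumeOffset ) ); // only what came after "line one"
    }
}
```

A recovering subscriber would pass the last offset it persisted to `replayFrom` and then switch to the live stream; how to make that switch without a gap is exactly question 4 above and is not addressed by this sketch.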



Richard Warburton

Nov 14, 2014, 4:15:25 PM
to mechanica...@googlegroups.com
Hi,

> Awesome stuff! Thanks for sharing!

+1

Given the use case it might be worth increasing the log buffer size on Aeron in order to allow you to park more data (-Daeron.term.buffer.size=<size in bytes>). Not sure of the exact tradeoffs you're looking for here, but worth thinking about at any rate.
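
As a rough feel for what a given buffer size buys, the sketch below estimates how many average-sized lines fit in a buffer and how long that backlog lasts at the posted rate. It is deliberately naive: it ignores per-message framing overhead and however Aeron actually partitions and windows its log buffers, and the 16 MiB fallback is an arbitrary example value, not a statement about Aeron's default. Only the property name comes from the message above; the class and method names are hypothetical.

```java
// Naive capacity estimate for a "parking" buffer of a given size.
final class ParkingCapacity
{
    // How many average-sized lines fit, ignoring any framing overhead.
    static long linesThatFit( final long bufferBytes, final int avgLineBytes )
    {
        return bufferBytes / avgLineBytes;
    }

    // How long a publisher at the given rate takes to fill that many lines.
    static double secondsOfBacklog( final long lines, final long linesPerSecond )
    {
        return lines / (double) linesPerSecond;
    }

    public static void main( final String[] args )
    {
        // Property name as given in the thread; the fallback is an arbitrary example.
        final long bufferBytes = Long.getLong( "aeron.term.buffer.size", 16L * 1024 * 1024 );
        final long lines = linesThatFit( bufferBytes, 315 );

        System.out.println( lines + " lines, about "
            + secondsOfBacklog( lines, 1_100_000L ) + " s of backlog at 1.1M lines/s" );
    }
}
```

At around 1.1 million lines per second, even tens of megabytes amount to only a fraction of a second of slack, which is worth keeping in mind when deciding how much larger to make the buffer.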

regards,

  Richard Warburton

Jan van Oort

Nov 14, 2014, 5:30:31 PM
to mechanica...@googlegroups.com
Richard, 

+1

gonna try that tomorrow. Just curious as to what effect that creates  :-)




Martin Thompson

Nov 17, 2014, 8:58:09 AM
to mechanica...@googlegroups.com
As Todd and I have mentioned, Aeron is just a pure OSI layer 4 implementation. These sorts of features could be layered on top and have been considered in the design.

Some work I'm doing in the new year is in this space and it will go open source. For now if you really need these types of features then consider something like Kafka or UM Persistence.



Gil Tene

Nov 17, 2014, 4:43:44 PM
to mechanica...@googlegroups.com
For your outlier question: I'd add jHiccup as an agent to the run, and make sure to use the -c flag. This will produce a hiccup log for the process and for the system for the duration of the run. You can then use that information to see if the outlier is caused by (same thing as seen by) the system [the control hiccup log], the process [the hiccup log] or neither. This won't quite tell you what it is, but it will tell you what it isn't...
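
The underlying idea can be illustrated in a few lines: a thread that does nothing but sleep for a fixed interval should wake up on time, so any overshoot is a stall imposed by the JVM or the platform rather than by application code. The sketch below is a simplified illustration of that principle with hypothetical names; it is not jHiccup itself and records only the worst overshoot instead of a full histogram.

```java
// Simplified hiccup meter: measures how late a sleeping thread wakes up.
final class HiccupSketch
{
    // Sleeps intervalMillis per sample and returns the worst overshoot in nanoseconds.
    static long maxHiccupNanos( final int samples, final long intervalMillis )
    {
        long worst = 0;
        for ( int i = 0; i < samples; i++ )
        {
            final long start = System.nanoTime();
            try
            {
                Thread.sleep( intervalMillis );
            }
            catch ( final InterruptedException e )
            {
                Thread.currentThread().interrupt();
                break;
            }
            // Anything beyond the requested sleep time is delay we did not ask for.
            final long overshoot = ( System.nanoTime() - start ) - intervalMillis * 1_000_000L;
            worst = Math.max( worst, overshoot );
        }
        return worst;
    }

    public static void main( final String[] args )
    {
        System.out.println( "worst hiccup: " + maxHiccupNanos( 1_000, 1 ) + " ns" );
    }
}
```

Running something like this alongside the benchmark, and separately on an idle machine, gives a baseline for how much of a 111.5 microsecond outlier could come from the platform alone.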

And BTW, here is your histogram plotted (using http://hdrhistogram.github.io/HdrHistogram/plotFiles.html):
