Hello everyone,
My use case for Aeron: upload a ( large ) file to Apache HDFS and, while uploading it, "park" its lines somewhere in order to build an index over the file. Apache HDFS being cluster software, I need a high-performance transport protocol and distributed log to "park" the file's bytes on the cluster, and I am in the process of replacing Apache Kafka with Aeron. This afternoon, I broke all my personal speed, throughput and latency records with Aeron.
The following numbers were obtained on the most powerful of my home machines. The setup was, for now, extremely simple: read a 101 GB file from a device over eSATA, line after line, and publish it as fast as possible to Aeron. A line, in this file ( an enormous log file from a web crawler ), averages 315 bytes.
Machine config:
Fujitsu TX 200 R7 / Intel E5-2420 ( 1 * 6 cores physical, 1 * 12 cores logical, 1.9 GHz, 15 MB cache ) / 16 GB RAM PC12300 @1333 MHz / 2 * 300 GB HDD, 7.2k rpm, RAID 0
eSATA device: 4 * 500 GB HDD, 7.2k rpm, RAID 0
OS: Ubuntu Server 14.04.1 LTS ( headless )
JDK: Oracle 1.8.0
JDK settings: defaults, apart from -Xms8000m -Xmx8000m
I created my own AeronPublisher, which keeps offering a line over and over until the operation goes through, i.e. NO backoff or idle strategy: we simply need to jam the entire file through Aeron :-) I also wrote an AeronRateSubscriber for the test, very similar to the one provided in the samples.
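For readers who want to picture the "offer until it goes through" loop, here is a minimal sketch. It is not my actual AeronPublisher: the Offerable interface is a hypothetical stand-in that only mimics the convention, used by current Aeron releases, of returning a negative value from offer when the message is not accepted. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of a publisher that retries offer() in a tight loop with no idle strategy. */
final class SpinOfferSketch {

    /**
     * Hypothetical stand-in for a publication (assumption, not the Aeron API itself):
     * returns a non-negative position on success, a negative value on rejection.
     */
    interface Offerable {
        long offer(byte[] buffer, int offset, int length);
    }

    /** Offers one line until it is accepted; returns the number of attempts made. */
    static long offerUntilAccepted(Offerable publication, String line) {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        long attempts = 1;
        while (publication.offer(payload, 0, payload.length) < 0) {
            attempts++; // busy-spin: no back-off, no sleep, no yield
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Fake publication that rejects the first two offers to simulate back pressure.
        int[] rejectionsLeft = {2};
        long[] position = {0};
        Offerable fake = (buf, off, len) -> {
            if (rejectionsLeft[0] > 0) {
                rejectionsLeft[0]--;
                return -1;
            }
            position[0] += len;
            return position[0];
        };
        System.out.println("attempts: " + offerUntilAccepted(fake, "GET /index.html 200"));
    }
}
```

The trade-off of spinning like this is that the publishing thread burns a full core while it waits, which is acceptable here because the only goal is to push the file through as fast as possible.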
Results:
=> top throughput flirts with 1,100,000 lines / second = 346.5 MB ( payload ) per second, more than twice that of Apache Kafka on the same config, see RateSubscriberScreenshot
=> latency at the 99.99th percentile = 49 nanoseconds ( !!! ), see HdrHistogram text file
=> 346.5 MB / second = 1.247 TB / hour; with the world indexing record for Apache Lucene currently at around 280 GB / hour, Aeron puts me in a very comfortable position to try and break the Lucene indexing record :-)
=> I have no explanation yet for the outliers in the attached HdrHistogramScreenshot, which peak at around 111.5 microseconds, that is roughly 2,275 times the 99.99th percentile
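The arithmetic behind these figures can be checked with a few lines of Java. This is only a sketch: the class and method names are made up, and it assumes decimal units ( 1 MB = 10^6 bytes, 1 TB = 10^6 MB ) and the rounded numbers quoted above.

```java
/** Back-of-the-envelope check of the throughput and outlier figures. */
final class ThroughputMath {

    /** Payload throughput in MB/s from a line rate and an average line size. */
    static double megabytesPerSecond(long linesPerSecond, int avgLineBytes) {
        return linesPerSecond * (double) avgLineBytes / 1_000_000.0;
    }

    /** Converts MB/s to TB/hour (3600 seconds per hour, 10^6 MB per TB). */
    static double terabytesPerHour(double mbPerSecond) {
        return mbPerSecond * 3600.0 / 1_000_000.0;
    }

    /** Ratio between an outlier in microseconds and a percentile in nanoseconds. */
    static double outlierRatio(double outlierMicros, double percentileNanos) {
        return outlierMicros * 1000.0 / percentileNanos;
    }

    public static void main(String[] args) {
        double mbPerSec = megabytesPerSecond(1_100_000L, 315);
        System.out.printf("%.1f MB/s%n", mbPerSec);
        System.out.printf("%.3f TB/hour%n", terabytesPerHour(mbPerSec));
        System.out.printf("outlier ratio: %.0f%n", outlierRatio(111.5, 49.0));
    }
}
```

With 1,100,000 lines / second at 315 bytes each this gives 346.5 MB / second and about 1.247 TB / hour; with the rounded 111.5 microseconds and 49 nanoseconds, the outlier ratio comes out near 2,275.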
Proof that the Aeron guys are doing a great job !
I'll be back ASAP with a "true" clustering performance benchmark :-D