Java Chronicle Performance

Nguyen Hoa

Sep 13, 2015, 12:26:05 PM

to Chronicle

Hi team,

Today I ran a performance test comparing Chronicle and Apache Kafka for research purposes. The benchmark (see the attached Excel file) clearly showed that Kafka outperformed Chronicle.

My test machine is:

Laptop: MacBook Pro
Processor: 2.2 GHz Intel Core i7
Memory: 16GB 1600 MHz DDR3
Storage: 106 GB free of 250 GB
Graphics: Intel Iris Pro 1546 MB

For Chronicle, I adapted the following KafkaTestMain into ChronicleTestMain (see attached):

https://github.com/OpenHFT/Chronicle-Queue/blob/master/chronicle/src/test/java/net/openhft/chronicle/comparison/KafkaTestMain.java

And for Kafka, I used the following benchmark command:

bin/kafka-consumer-perf-test.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --messages 50000000 --topic test --threads 1

which can be found at: https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

Could you please take a look at PerformanceTest.xlsx and let me know whether the result is correct, or whether I made mistakes during the experiment?

Thank you,

Hoa.

--------------

FYI: The following two charts are extracted from PerformanceTest.xlsx.

[two chart images]

PerformanceTestForKafkaAndChronicle.xlsx

ChronicleTestMain.java

Peter Lawrey

Sep 13, 2015, 3:38:14 PM

to java-ch...@googlegroups.com

There is one line which needs fixing.

for (int i = 0; i < message_number; i += batch_size) {

Otherwise the number of messages actually written was message_number * batch_size.
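To make the effect of that loop header concrete, here is a minimal sketch. It assumes the test's outer loop writes one full batch per iteration; the class and method names are hypothetical and are not taken from the attached ChronicleTestMain.

```java
final class BatchLoopCount {
    // Counts how many messages are written when the outer loop advances by
    // `step` and every iteration writes one batch of `batchSize` messages.
    static long messagesWritten(int messageNumber, int batchSize, int step) {
        long written = 0;
        for (int i = 0; i < messageNumber; i += step) {
            written += batchSize; // stands in for writing one batch
        }
        return written;
    }

    public static void main(String[] args) {
        int messageNumber = 1_000_000;
        int batchSize = 200;
        // Stepping by 1 writes messageNumber * batchSize messages.
        System.out.println(messagesWritten(messageNumber, batchSize, 1));
        // Stepping by batchSize writes messageNumber messages.
        System.out.println(messagesWritten(messageNumber, batchSize, batchSize));
    }
}
```

With `i++` the Chronicle run does `batch_size` times more work than the Kafka run it is compared against, which is enough to explain a large gap in the first set of charts.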

If you use vanilla chronicle for 10,000,000 messages it takes 0.65 seconds.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

100, 10000000, 200, 651, 2147483647, 100

If you use indexed chronicle for 10,000,000 messages it takes 0.55 seconds.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

100, 10000000, 200, 552, 2147483647, 100

Where Chronicle will be much faster is with smaller batch sizes. Batching adds complexity and a lot of latency, so try without it.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

100, 10000000, 1, 1046, 2147483647, 100

Note how, without batching, Chronicle takes less than twice as long (1046 ms versus 552 ms to 651 ms with batches of 200).

Another test which would be interesting is measuring latency. Batching increases latency dramatically as it has to delay messages to build up a batch. If your message rate varies, you don't know how long it will take to get an optimal batch size, but you will always know when you have one message to send.

If you look at the latency of individual messages, Chronicle should typically be about a microsecond for a message of this size. You can measure latency by adding a timestamp to the start of each message.
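A minimal sketch of the timestamp idea, using only `java.nio.ByteBuffer` rather than the Chronicle API. The class and method names are hypothetical, and the queue write/read in between is left out. It assumes writer and reader run in the same JVM, since `System.nanoTime()` values are not comparable across JVMs.

```java
import java.nio.ByteBuffer;

final class TimestampedMessage {
    // Puts the send time in the first 8 bytes, followed by the payload.
    static ByteBuffer encode(long sendTimeNanos, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + payload.length);
        buf.putLong(sendTimeNanos);
        buf.put(payload);
        buf.flip();
        return buf;
    }

    // Reads the leading timestamp and returns receive time minus send time.
    static long latencyNanos(ByteBuffer message, long receiveTimeNanos) {
        return receiveTimeNanos - message.getLong(0);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[92]; // 8-byte timestamp + 92 bytes = 100-byte message
        ByteBuffer msg = encode(System.nanoTime(), payload);
        // ... the message would be written to and read back from the queue here ...
        long latency = latencyNanos(msg, System.nanoTime());
        System.out.println("latency (ns): " + latency);
    }
}
```

Collecting these per-message values into a histogram, rather than averaging them, shows the tail latencies that batching tends to hide.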

Regards,

Peter.


Peter Lawrey

Sep 13, 2015, 3:55:22 PM

to java-ch...@googlegroups.com

On my system, a batch size of 10 is close to optimal.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

100, 10000000, 10, 626, 2147483647, 100

This represents 1 GB in 0.626 seconds or almost 1.6 GB/s.

Even the single message at a time gets a bandwidth of 956 MB/s
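The bandwidth figures follow directly from the result rows. The sketch below assumes `processing_time` is in milliseconds (consistent with 651 being reported as 0.65 seconds) and uses decimal megabytes; the class name is hypothetical.

```java
final class Bandwidth {
    // Payload bytes written divided by elapsed time, in MB/s (1 MB = 10^6 bytes).
    static double megabytesPerSecond(long messageSize, long messageNumber, long processingTimeMillis) {
        double bytes = (double) messageSize * messageNumber;
        double seconds = processingTimeMillis / 1000.0;
        return bytes / seconds / 1e6;
    }

    public static void main(String[] args) {
        // 100, 10000000, 10, 626: about 1597 MB/s, i.e. almost 1.6 GB/s
        System.out.printf("%.0f MB/s%n", megabytesPerSecond(100, 10_000_000, 626));
        // 100, 10000000, 1, 1046: about 956 MB/s
        System.out.printf("%.0f MB/s%n", megabytesPerSecond(100, 10_000_000, 1046));
    }
}
```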

Also if you want very fast messages you can try much smaller ones. Like 40 byte messages.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

40, 10000000, 1, 842, 2147483647, 100

This writes a small 40-byte message one at a time, and it still reaches 475 MB/s. The test is so short that the code is still warming up: with 100 million messages, the run takes less than 10x as long.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

40, 100000000, 1, 7699, 2147483647, 100

Half a billion messages (batch size 1) took 38 seconds.

message_size, message_number, batch_size, processing_time, flush_count, flush_period

40, 500000000, 1, 37998, 2147483647, 100
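One way to see the warm-up effect is to convert each row into an average cost per message. The sketch below again assumes `processing_time` is in milliseconds; the class name is hypothetical.

```java
final class PerMessageCost {
    // Average time per message in nanoseconds for one result row.
    static double nanosPerMessage(long messageNumber, long processingTimeMillis) {
        return processingTimeMillis * 1_000_000.0 / messageNumber;
    }

    public static void main(String[] args) {
        long[][] rows = {
            {10_000_000L, 842L},
            {100_000_000L, 7_699L},
            {500_000_000L, 37_998L},
        };
        for (long[] row : rows) {
            System.out.printf("%d messages: %.1f ns/message%n",
                    row[0], nanosPerMessage(row[0], row[1]));
        }
    }
}
```

The shortest run comes out at roughly 84 ns per message and the two longer runs at roughly 76 to 77 ns, which is what you would expect if a fixed warm-up cost is being spread over more messages.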

Regards,

Peter.

Peter Lawrey

Sep 13, 2015, 3:58:45 PM

to java-ch...@googlegroups.com

I suggest using the following at the start. This will clear the messages between tests. Without it, the performance of a test might depend on which tests you ran previously, as Chronicle persists all the messages.

ChronicleTools.deleteOnExit(basePath);

I assume Kafka is persisting all the messages as well or it is not a fair comparison.

Otis Gospodnetić

Sep 13, 2015, 10:30:51 PM

to java-ch...@googlegroups.com

Yes, Kafka writes to disk, too.

Hoa - thank you for sharing! It would be great to see the results of the next run after you apply Peter's suggestions.

Thanks,

Otis
--

Monitoring * Alerting * Anomaly Detection * Centralized Log Management

Solr & Elasticsearch Support * http://sematext.com/

Nguyen Hoa

Sep 14, 2015, 12:20:42 AM

to Chronicle

Hi Peter and Otis,

I updated the code following your suggestions, and the results are very promising. Here are our new charts (please have a look at PerformanceTest.xlsx for more details).

[two chart images]

Thank you,

Hoa.

PerformanceTest.xlsx

Ben

Sep 17, 2015, 4:21:24 PM

to Chronicle

Nguyen - during your research of Kafka vs Chronicle, have you found any other compelling reasons (other than throughput/latency) that swung you towards one or the other, e.g. high availability, clustering, etc.?

Thanks,

Ben

Peter Lawrey

Sep 17, 2015, 4:37:23 PM

to java-ch...@googlegroups.com

Chronicle Queue supports HA via replication. This is a simple implementation but it can be suitable for some use cases.

Otis Gospodnetić

Sep 17, 2015, 10:28:02 PM

to java-ch...@googlegroups.com

Hi,

Thank you for sharing, Hoa!

One suggestion: when comparing two solutions it's best to let authors of both projects tune their solution. Otherwise you end up comparing either out of the box configuration or custom, but likely suboptimal configurations.

Otis

Peter Lawrey

Sep 18, 2015, 1:22:25 AM

to java-ch...@googlegroups.com

Agreed, which is why when I compare with alternative solutions I try to point out that I am not the best person to optimise any other solution.

Also, the choice of test will suit different solutions. As we work in the low latency space, you want a batch size of 1 as much as possible, and not surprisingly we do well for this use case, as that is what we optimised for. Batches larger than 1 mean delaying some of the messages so you can build a batch, rather than sending them as soon as they arrive.

Being low latency can suit the reactive space better whereas large batches can suit the traditional transactional space better.
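A rough illustration of the delay batching introduces, assuming messages arrive at a fixed interval and a batch is sent as soon as its last message arrives. Both the model and the names are assumptions for illustration only; they are not measurements of Kafka or Chronicle.

```java
final class BatchingDelay {
    // The first message in a batch waits for the remaining batchSize - 1 arrivals.
    static long worstCaseDelayNanos(int batchSize, long arrivalIntervalNanos) {
        return (batchSize - 1L) * arrivalIntervalNanos;
    }

    // Averaged over all positions in the batch, the wait is half the worst case.
    static double averageDelayNanos(int batchSize, long arrivalIntervalNanos) {
        return (batchSize - 1) / 2.0 * arrivalIntervalNanos;
    }

    public static void main(String[] args) {
        long intervalNanos = 10_000; // hypothetical: one message every 10 microseconds
        for (int batchSize : new int[] {1, 10, 200}) {
            System.out.printf("batch %d: worst %d ns, average %.0f ns%n",
                    batchSize,
                    worstCaseDelayNanos(batchSize, intervalNanos),
                    averageDelayNanos(batchSize, intervalNanos));
        }
    }
}
```

Under this model a batch size of 1 adds no queuing delay at all, while a batch of 200 at that arrival rate holds the first message back by about 2 ms before anything is written.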

Regards, Peter.
