Improving remote throughput for large bursts

Patrik Nordwall

Apr 11, 2014, 2:40:59 AM

to akka...@googlegroups.com

I'm working on improving the throughput of sending messages to a remote system for a scenario that seems to be alarmingly common: sending many messages in one go.

What happens is that the TCP buffer gets full and doesn't accept more writes. Then we must buffer, back off and try again. The first step was to replace the stashing in the endpoint writer with a more efficient internal buffer.

That made things worse for some buffer sizes. The reason is probably that the inefficient stashing accidentally provided the needed backoff.

Now I have implemented an adaptive backoff strategy that seems to be a step in the right direction. I have attached the results of my tests. Throughput is better for all tested combinations and, most importantly, it handles bursts of 300000 messages without degraded throughput or false failure detection.
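
To illustrate the buffer, back off and retry idea described above, here is a minimal sketch in plain Scala. It is not the actual endpoint writer code: `Transport`, `BoundedTransport`, `BackoffWriter` and the backoff constants are hypothetical names and values chosen for illustration. The rule assumed here is to halve the wait when a flush makes progress and to double it (up to a cap) when the transport rejects every write.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a TCP-like transport; not an Akka API.
trait Transport {
  // Returns false when the send buffer is full and the write was rejected.
  def tryWrite(msg: String): Boolean
}

// Toy transport that accepts `capacity` more writes and then rejects.
final class BoundedTransport(var capacity: Int) extends Transport {
  val written = mutable.ArrayBuffer.empty[String]
  def tryWrite(msg: String): Boolean =
    if (capacity > 0) { capacity -= 1; written += msg; true }
    else false
}

// Illustrative writer: the names and the 1 ms / 64 ms bounds are assumptions.
final class BackoffWriter(
    transport: Transport,
    minBackoffMs: Long = 1L,
    maxBackoffMs: Long = 64L) {

  private val buffer = mutable.Queue.empty[String]
  private var backoffMs = minBackoffMs

  def pending: Int = buffer.size

  // Messages are always buffered first; flush() pushes them to the transport.
  def send(msg: String): Unit = buffer.enqueue(msg)

  // Writes until the transport rejects a message or the buffer is empty.
  // Returns the delay (ms) to wait before the next flush, or None if drained.
  def flush(): Option[Long] = {
    var progressed = false
    while (buffer.nonEmpty && transport.tryWrite(buffer.head)) {
      buffer.dequeue()
      progressed = true
    }
    if (buffer.isEmpty) {
      backoffMs = minBackoffMs
      None
    } else {
      // Adapt: shrink the wait after progress, grow it after a full rejection.
      backoffMs =
        if (progressed) math.max(minBackoffMs, backoffMs / 2)
        else math.min(maxBackoffMs, backoffMs * 2)
      Some(backoffMs)
    }
  }
}
```

A caller would schedule the next `flush()` after the returned delay. In an actor this would typically be a scheduled message to self rather than a blocking sleep.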

This is only one of many tests that should be done, but I wanted to share the good news so far.

Cheers,

Patrik

--

Patrik Nordwall
Typesafe - Reactive apps on the JVM
Twitter: @patriknw

remote-bench-result2.pdf

Björn Antonsson

Apr 11, 2014, 3:02:11 AM

to Patrik Nordwall, akka...@googlegroups.com

Awesome improvements.

B/

--
You received this message because you are subscribed to the Google Groups "Akka Developer List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to akka-dev+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--

Björn Antonsson

Typesafe – Reactive Apps on the JVM

twitter: @bantonsson

Roland Kuhn

Apr 11, 2014, 3:22:07 AM

to akka-dev

Great results, you are absolutely right that they must be shared! :-)

Dr. Roland Kuhn
Akka Tech Lead
Typesafe – Reactive apps on the JVM.
twitter: @rolandkuhn

Patrik Nordwall

Apr 25, 2014, 9:28:19 AM

to akka...@googlegroups.com

These improvements have been merged to the master and release-2.3 branches. I also made another improvement that I have high hopes for: heartbeat messages for the remote and cluster death watch now have priority over other messages, which means that they have a better chance of getting through even when bursts of many messages are sent. Heartbeats of the transport failure detector were changed to piggyback on normal message payload, so those should also get through.
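
The effect of giving heartbeats priority can be sketched with two queues, where anything marked as a heartbeat is dequeued before ordinary payload. This is only an illustration in plain Scala and not the code that was merged: `Outbound`, `Heartbeat`, `Payload` and `PrioritizingBuffer` are hypothetical names.

```scala
import scala.collection.mutable

// Hypothetical message types for illustration; not Akka's internal classes.
sealed trait Outbound
final case class Heartbeat(from: String) extends Outbound
final case class Payload(body: String) extends Outbound

// Outgoing buffer in which heartbeats overtake queued payload messages.
final class PrioritizingBuffer {
  private val prioritized = mutable.Queue.empty[Outbound]
  private val normal = mutable.Queue.empty[Outbound]

  def enqueue(msg: Outbound): Unit = msg match {
    case _: Heartbeat => prioritized.enqueue(msg)
    case _            => normal.enqueue(msg)
  }

  // Heartbeats first; FIFO order is kept within each class of message.
  def dequeue(): Option[Outbound] =
    if (prioritized.nonEmpty) Some(prioritized.dequeue())
    else if (normal.nonEmpty) Some(normal.dequeue())
    else None
}
```

With this arrangement a heartbeat enqueued behind a large burst of `Payload` messages is still the next message written, so the failure detector on the other side is less likely to trigger falsely.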

I encourage anyone with heavy usage of akka remote/cluster to try this timestamped snapshot: 2.3-20140425-151510

It is published to the repo http://repo.akka.io/snapshots/
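
For sbt users, trying the snapshot could look roughly like the following `build.sbt` fragment. The resolver URL and version are the ones given above; the resolver label and the choice of the `akka-remote` and `akka-cluster` modules are assumptions, so adjust them to the modules you actually use.

```scala
// build.sbt fragment (sketch): add the snapshot repository mentioned above.
resolvers += "Akka Snapshots" at "http://repo.akka.io/snapshots/"

// Assumed modules; the version is the timestamped snapshot from this message.
libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-remote"  % "2.3-20140425-151510",
  "com.typesafe.akka" %% "akka-cluster" % "2.3-20140425-151510"
)
```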

Cheers,

Patrik
