Concurrency considered bad for high performance trading (LMAX)

Toby Corkindale

Apr 18, 2012, 1:27:32 AM
to Scala Melbourne
This is a very interesting article -- it claims that concurrency in
your application is not good for performance, and goes on to discuss
how an application was split up into numerous components that can run
independently, without locking, and achieve high reliability and
performance (6 million orders per second in a single thread on the
JVM).

It touches upon something mentioned at a meeting a while back --
keeping mutable state in memory and achieving failure recovery by
replaying the logs to return to that state. (It also discusses ways
around some problems with that approach.)
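
To make the replay idea concrete, here is a minimal sketch in Java. It is
not LMAX's code: the Deposit and Ledger names are made up for
illustration, and a real system would write the journal to durable
storage rather than keep it in a list. The point is only that state is
rebuilt by re-applying the journaled events in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplaySketch {
    // Hypothetical input event: adjust an account balance by some amount.
    public static final class Deposit {
        final String account;
        final long amount;

        public Deposit(String account, long amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    // All mutable state lives in memory and is touched by a single thread.
    public static final class Ledger {
        public final Map<String, Long> balances = new HashMap<>();

        public void apply(Deposit e) {
            balances.merge(e.account, e.amount, Long::sum);
        }
    }

    // Recovery: start from empty state and re-apply every journaled event in order.
    public static Ledger replay(List<Deposit> journal) {
        Ledger ledger = new Ledger();
        for (Deposit e : journal) {
            ledger.apply(e);
        }
        return ledger;
    }

    public static void main(String[] args) {
        List<Deposit> journal = new ArrayList<>();
        Ledger live = new Ledger();
        List<Deposit> incoming = Arrays.asList(
                new Deposit("alice", 100),
                new Deposit("bob", 50),
                new Deposit("alice", -30));
        for (Deposit e : incoming) {
            journal.add(e); // record the event first...
            live.apply(e);  // ...then mutate the in-memory state
        }
        // After a simulated crash, the replayed state should match the live one.
        Ledger recovered = replay(journal);
        System.out.println(recovered.balances.equals(live.balances));
    }
}
```

This only works if apply is deterministic, so that the same events in the
same order always produce the same state.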

http://martinfowler.com/articles/lmax.html

-Toby

--
Turning and turning in the widening gyre
The falcon cannot hear the falconer
Things fall apart; the center cannot hold
Mere anarchy is loosed upon the world

Ben Hutchison

Apr 18, 2012, 5:37:19 AM
to scala...@googlegroups.com

Whoa there! Let's look at what LMAX *actually did*: they built a high-performance single-threaded application.

Kudos to them. But the conclusion that concurrency => slow can't be inferred from this one data point.

Ben

Toby Corkindale

Apr 18, 2012, 8:45:32 PM
to scala...@googlegroups.com
On 18 April 2012 19:37, Ben Hutchison <brhut...@gmail.com> wrote:
> Whoa there! Let's look at what LMAX *actually did*: they built a
> high-performance single-threaded application.
>
> Kudos to them. But the conclusion that concurrency => slow can't be
> inferred from this one data point.

Actually, that did seem to be one of the main points. The argument
being that computer hardware only truly runs fast when the CPU is able
to operate within the CPU caches -- which occurs when you don't have
lots of threads running and thus requesting lots of random bits of
memory. Also that concurrent writes to the same memory areas destroy
that performance too.

I'm certainly not taking it as gospel, but I thought it was an
interesting article, and relevant to the recent discussions about
Actors.

Toby

Ishaaq Chandy

Apr 18, 2012, 9:32:08 PM
to scala...@googlegroups.com
I agree - the name of the project "Disruptor" actually emphasises the fact that this is what they are claiming (no idea if it is true, but some of the things they say make sense to me).

I found the QCon video a bit more enlightening than Fowler's essay. http://www.infoq.com/presentations/LMAX

Having said that, after watching the video I meant to have a poke around with Disruptor but that didn't eventuate.

Ishaaq

Ben Hutchison

Apr 18, 2012, 11:20:19 PM
to scala...@googlegroups.com

Toby, the tone of my original reply was maybe a bit sharp, partly from
replying on a smartphone in few words.

I do think LMAX is an interesting, thought-provoking case study -
thanks for posting it. What I did want to emphasize is that LMAX is
strong evidence that you can do single-threaded trading fast (up to
some limit). But the case that you cannot do trading fast concurrently
is more of a secondhand report, i.e. "we tried concurrency and it
didn't work well...".

What I wonder about is why they couldn't partition their data into
several fairly non-contending segments? By market, or by buyer/seller,
or by stock? And use one thread for each segment?
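
A rough sketch of what I mean by one thread per segment, in Java. This
is my own illustration, not anything from the article: the class and
method names are hypothetical, and routing by hash of the stock symbol
is just one possible partitioning rule. Each segment owns its state and
only its own thread ever mutates it, so no locks are needed inside a
segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PartitionSketch {
    // One segment = one thread plus state that only that thread touches.
    static final class Segment {
        final ExecutorService thread = Executors.newSingleThreadExecutor();
        long volume = 0; // plain field: read and written only on this segment's thread
    }

    private final List<Segment> segments = new ArrayList<>();

    public PartitionSketch(int segmentCount) {
        for (int i = 0; i < segmentCount; i++) {
            segments.add(new Segment());
        }
    }

    // Hypothetical routing rule: the same symbol always maps to the same segment.
    public int segmentFor(String symbol) {
        return Math.floorMod(symbol.hashCode(), segments.size());
    }

    // Hand the order to the owning segment's thread; orders for one symbol stay in order.
    public void submit(String symbol, long quantity) {
        Segment s = segments.get(segmentFor(symbol));
        s.thread.execute(() -> s.volume += quantity);
    }

    // Ask each segment's own thread for its total, then stop the threads.
    public long shutdownAndTotal() throws Exception {
        long total = 0;
        for (Segment s : segments) {
            // The query runs after all earlier tasks on that single thread.
            total += s.thread.submit(() -> s.volume).get();
            s.thread.shutdown();
            s.thread.awaitTermination(10, TimeUnit.SECONDS);
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        PartitionSketch exchange = new PartitionSketch(4);
        exchange.submit("BHP", 10);
        exchange.submit("CBA", 5);
        exchange.submit("BHP", 1);
        System.out.println(exchange.shutdownAndTotal());
    }
}
```

The catch is anything that spans segments (say, a trader's total
exposure across stocks), which would need coordination between threads
again.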

Because in a way, the LMAX story (as Martin Fowler tells it - I haven't
watched the vid) is quite a pessimistic one: "we couldn't make
concurrency work, so we had to fall back to a highly tuned single
thread". Fast it may be, but that doesn't leave them any room to move,
any way to scale, when that single thread reaches capacity.

Game programming was in this situation during the noughties; they were
the industry experts in super-fast single threading, with all sorts of
memory layout techniques to get really good caching. But it was a
difficult adjustment to accommodate the rise of multi-core, because
game engines were built around a single-threaded assumption.

-Ben

Bernie Pope

Apr 19, 2012, 7:49:31 AM
to scala...@googlegroups.com
On 19/04/2012, at 10:45 AM, Toby Corkindale wrote:

> Actually, that did seem to be one of the main points. The argument
> being that computer hardware only truly runs fast when the CPU is able
> to operate within the CPU caches

Yes, memory access patterns are very important for modern CPU performance, but ...

> -- which occurs when you don't have
> lots of threads running and thus requesting lots of random bits of
> memory.

… it is also possible to arrange for multiple threads to use the cache effectively.
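
One common way to do that, sketched in Java (my own toy example, with a
made-up ChunkedSum class, not a benchmark): give each thread its own
contiguous slice of the data so it scans memory sequentially, let it
accumulate into a local variable, and have it write to shared memory
only once at the end.

```java
public class ChunkedSum {
    public static long parallelSum(int[] data, int nThreads) throws InterruptedException {
        long[] partial = new long[nThreads];
        Thread[] workers = new Thread[nThreads];
        // Ceiling division so every element falls into exactly one chunk.
        int chunk = (data.length + nThreads - 1) / nThreads;

        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            final int from = Math.min(data.length, id * chunk);
            final int to = Math.min(data.length, from + chunk);
            workers[t] = new Thread(() -> {
                long local = 0; // thread-private accumulator, no sharing in the hot loop
                for (int i = from; i < to; i++) {
                    local += data[i]; // sequential scan over this thread's own slice
                }
                partial[id] = local; // single write to the shared array
            });
            workers[t].start();
        }

        long total = 0;
        for (int t = 0; t < nThreads; t++) {
            workers[t].join(); // join makes the worker's write to partial[t] visible
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 10;
        }
        System.out.println(parallelSum(data, 4));
    }
}
```

The threads never write to the same locations while they are working,
which is the situation Toby's second point is about.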

> Also that concurrent writes to the same memory areas destroy
> that performance too.
>
> I'm certainly not taking it as gospel, but I thought it was an
> interesting article, and relevant to the recent discussions about
> Actors.

Agreed.

As an aside, for anyone interested in the nitty-gritty details of CPU/memory performance, I highly recommend the following paper by Ulrich Drepper:

What Every Programmer Should Know About Memory
http://www.akkadia.org/drepper/cpumemory.pdf

It is a few years old now (2007) but still quite relevant.

Cheers,
Bernie.
