Low latency


Jon Harrop

Nov 30, 2012, 12:52:29 AM
to pragmatic-functional...@googlegroups.com

I worked on stock exchange software last year, and it occurred to me that
low-latency applications are ripe for functional programming: purely
functional data structures are fragmented in the heap, which makes
low-latency incremental and compacting collection much easier. Surprisingly,
this idea doesn't seem to have been pushed much outside Erlang.
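
For concreteness, here is a minimal OCaml sketch of the kind of structure I
mean (an ordinary persistent binary tree, nothing exchange-specific): every
update allocates only a handful of small, fixed-shape cells on the path to
the change and shares the rest, so the heap is a sea of tiny uniform blocks.

(* Persistent binary search tree: insert copies the path, shares the rest. *)
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

let rec insert x = function
  | Leaf -> Node (Leaf, x, Leaf)
  | Node (l, v, r) as t ->
      if x < v then Node (insert x l, v, r)       (* new cells: path only *)
      else if x > v then Node (l, v, insert x r)
      else t                                      (* no change: full sharing *)

(* Ten versions of the tree coexist, sharing most of their nodes. *)
let versions =
  List.fold_left (fun acc x -> insert x (List.hd acc) :: acc)
    [Leaf] [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]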

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com

Malcolm Matalka

Nov 30, 2012, 6:31:59 AM
to Jon Harrop, pragmatic-functional...@googlegroups.com
In what specific way do functional data structures help here? The
particular benefit of Erlang is that each sequential unit (process) has its
own copy of almost all objects, so you never have to scan outside its heap
space to perform a GC. The restrictions in Erlang actually make the GC
basically a straight-out-of-the-book copying-collector implementation.
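
To illustrate what I mean by "straight out of the book": with a per-process
heap the collector only ever sees that process's cells, so a toy
Cheney-style semispace copier like the sketch below captures the basic idea
(the cell type and sizes are invented for the sketch, the allocator has no
overflow check, and the real collector is generational and handles more
object shapes).

(* Toy semispace copying collector over a heap of tagged cells. *)
type cell = Int of int | Pair of int * int | Forward of int

let semisize = 1024
let heap = Array.make (2 * semisize) (Int 0)
let from_base = ref 0 and to_base = ref semisize
let free = ref 0

let alloc c = heap.(!free) <- c; incr free; !free - 1

let collect roots =
  let next = ref !to_base in
  let copy addr =
    match heap.(addr) with
    | Forward a -> a                               (* already evacuated *)
    | c ->
        let a = !next in
        heap.(a) <- c; heap.(addr) <- Forward a; incr next; a
  in
  let roots' = List.map copy roots in
  (* Cheney scan: walk to-space, evacuating the children of each cell. *)
  let scan = ref !to_base in
  while !scan < !next do
    (match heap.(!scan) with
     | Pair (x, y) -> heap.(!scan) <- Pair (copy x, copy y)
     | _ -> ());
    incr scan
  done;
  let b = !from_base in                            (* flip the semispaces *)
  from_base := !to_base; to_base := b; free := !next;
  roots'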

/M

Jon Harrop

Nov 30, 2012, 10:24:12 AM
to pragmatic-functional...@googlegroups.com
A lot of the complexity in low-latency, general-purpose GCs comes from
handling arbitrarily large objects, arrays and thread stacks.

Traditional functional data structures are fragmented into many tiny parts,
which makes it easy to write a traditional Dijkstra-style incremental
tricolor marking phase. CPS can be used to eliminate the stack. If you can
bound the maximum allocation size, you could even use something like Baker's
treadmill for easy real-time collection.
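
As a rough sketch of the marking side (uniform small cells, a grey
worklist, a fixed budget per increment; the write barrier and allocation
color are omitted, and the node type is invented for the example):

(* Dijkstra-style tricolor marking driven in bounded increments. *)
type color = White | Gray | Black
type node = { mutable color : color; mutable children : node list }

let gray_list : node Queue.t = Queue.create ()

let shade n =
  if n.color = White then begin n.color <- Gray; Queue.add n gray_list end

let mark_roots roots = List.iter shade roots

let mark_increment budget =
  (* Blacken at most [budget] gray nodes, then return to the mutator. *)
  let rec go k =
    if k > 0 && not (Queue.is_empty gray_list) then begin
      let n = Queue.pop gray_list in
      List.iter shade n.children;   (* gray the children *)
      n.color <- Black;
      go (k - 1)
    end
  in
  go budget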

Cheers,
Jon.

isaiah perumalla

Dec 2, 2012, 3:46:53 PM
to pragmatic-functional...@googlegroups.com

Hi Jon
Purely functional structures, as you say, are fragmented in the heap, so
would that also mean that CPU cache locality suffers and hence there are
more cache misses than with traditional imperative, array-backed mutable
structures?

Jon Harrop

Dec 4, 2012, 4:18:00 AM
to pragmatic-functional...@googlegroups.com
Yes, exactly. And that means not only ~10x worse serial throughput but also poor scalability on multicores.
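
Here is the shape of a crude test of that (the numbers vary a lot by
machine and working-set size, so take the snippet as illustrative only):
summing 10 million floats stored contiguously in an OCaml float array,
which is flat and unboxed, versus the same floats boxed in an immutable
list, where every element costs a pointer chase into a separate heap block.

let n = 10_000_000

let time label f =
  let t = Sys.time () in
  let r = f () in
  Printf.printf "%s: %.3fs\n%!" label (Sys.time () -. t);
  r

let () =
  let arr = Array.init n float_of_int in   (* flat, unboxed float array *)
  let lst = Array.to_list arr in           (* boxed float + cons cell per element *)
  ignore (time "array fold" (fun () -> Array.fold_left (+.) 0.0 arr));
  ignore (time "list fold " (fun () -> List.fold_left (+.) 0.0 lst))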

Although locality is taken very seriously in research on imperative data
structures and parallel algorithms (particularly for multicore), it has been
almost entirely neglected in the context of purely functional data
structures, so there are (to my knowledge) no theories that deal with this.
I personally believe that this is the Achilles' heel of purely functional
data structures. The problem is exacerbated by the fact that the GC can move
your heap blocks around and change locality under your feet.

So this is about sacrificing throughput to get better latency. As I said to Jack, purely functional data structures have many important advantages but, if you need performance, they are the first thing you want to consider dropping.

As an aside, this is one of the reasons why I wanted to benchmark not only n-queens using immutable linked lists but also rolling immutable red-black trees. The latter will stress the GC in a different way that is also practically important.
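
By "rolling" I mean something like the sketch below: maintain a sliding
window over the input in a persistent tree, adding one key and dropping one
per step, so the heap always contains plenty of medium-lived tree nodes for
the collector to chew on. (Map here is just a stand-in for the red-black
tree; the window size and step count are arbitrary.)

module IntMap = Map.Make (Int)

(* Keep a window of the last [w] keys in a persistent map. *)
let roll w steps =
  let rec go i m =
    if i >= steps then m
    else
      let m = IntMap.add i (float_of_int i) m in
      let m = if i >= w then IntMap.remove (i - w) m else m in
      go (i + 1) m
  in
  go 0 IntMap.empty

let () = ignore (roll 10_000 1_000_000)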

Cheers,
Jon.


Malcolm Matalka

Dec 4, 2012, 5:33:23 AM
to Jon Harrop, pragmatic-functional...@googlegroups.com
There is a big umbrella project called RELEASE in Europe that is trying
to figure out how to write programs for thousands of cores. Currently
Erlang is the big player there. It's a big project, but I'm sure they
will have to deal with cache locality to make it all work:

http://www.release-project.eu/

/M