Who is working on high performance threadsafe mutable data structures in Haskell?

210 views
Skip to first unread message

Ryan Newton

unread,
Oct 27, 2011, 11:45:05 AM10/27/11
to Haskell Cafe, mona...@googlegroups.com, parallel...@googlegroups.com
Hello cafe,

In the context of the monad-par project we're just getting to the point of trying to replace our work stealing Deque's with something more efficient (in Haskell).

Based a quick perusal of Hackage there does not seem to be a lot of work in this area.  Of course, for Haskell the importance of this topic may be diminished relative to pure data structures, but for doing systems-level work like monad par good concurrent data structures are also very important.

We are about to embark on some work to fix this problem for monad-par & Deques, but if there are others working in this vicinity it would be nice to team up.  
   We are going to try both pure Haskell approaches using the new casMutVar# primop as well as wrapping foreign data structures such as those provided by TBB.  There are a whole bunch of issues with the latter -- see Appendix A and help me out if you know how to do this.

My first goal is to get a parameterizable deque interface, implementation, and benchmarks that makes it possible to express  variations of queues including:
  • Double, single, and 1.5-ended (e.g. both ends can be RW or only R or W)
  • Threadsafe or single-threaded on both ends
  • Bounded or unbounded
If you require that at least the "left"-end support Write and the "right" end support Read (excluding, for example stacks), that leaves a configuration space of 32 options.  Some of these 32 options have much better algorithms than others, but a general algorithm (such as Michael & Scott queues) could satisfy all of them while leaving "room to improve".  The TBB guys have considered making their concurrent_queus configurable to this extent but haven't gotten around to it.

Thanks,
  -Ryan


Appendix A: Using foreign data structures with Haskell
----------------------------------------------------------------------------

It's easy to call foreign code in Haskell, but how to use a foreign data structure to store polymorphic Haskell values?

For example we would want a TBB concurrent_queue to store words representing pointers into the Haskell heap.  We would need, for example:
  • (1) to pin arbitrary objects by making StablePtrs
  • (2) to communicate those to a foreign enqueue operation
  • (3) to unpin objects when the pointer is removed from a foreign structure (or use a reference count to determine when to unpin/free the StablePtr)
I don't have any experience here and would appreciate hearing from anyone who does.  Is this a non-starter?  Will the overheads introduced by all the StablePtr ops destroy any gains from using an efficient data structure?

Johan Tibell

unread,
Oct 27, 2011, 6:13:32 PM10/27/11
to parallel...@googlegroups.com, Haskell Cafe, mona...@googlegroups.com, Gregory Collins
Hi,

On Thu, Oct 27, 2011 at 8:45 AM, Ryan Newton <rrne...@gmail.com> wrote:
Based a quick perusal of Hackage there does not seem to be a lot of work in this area.  Of course, for Haskell the importance of this topic may be diminished relative to pure data structures, but for doing systems-level work like monad par good concurrent data structures are also very important.

Gregory Collins and I haven't wanted a fast lock-free hashtable for some time. There's also a priority queue (inside an IORef) in the I/O manager that could use a replacement. Note that this priority queue needs to support access both by priority and key.
 
We are about to embark on some work to fix this problem for monad-par & Deques, but if there are others working in this vicinity it would be nice to team up.  
   We are going to try both pure Haskell approaches using the new casMutVar# primop as well as wrapping foreign data structures such as those provided by TBB.  There are a whole bunch of issues with the latter -- see Appendix A and help me out if you know how to do this.

You could try the FFI approach if it's not too much work but I expect it to perform worse than a native Haskell version. It'd also be bad for the GC, which doesn't like having lots of small pinned objects as they fragment the heap. I'd rather look into what primops/compiler optimizations we're lacking to be able to do this well from within Haskell.

-- Johan

Prabhat Totoo

unread,
Oct 28, 2011, 9:51:54 AM10/28/11
to parallel...@googlegroups.com, Haskell Cafe, mona...@googlegroups.com, rrne...@gmail.com
Hi,
As i mentioned before, i'm looking into inherently parallel data structures for my phd. Okasaki's book on purely functional data structures has quite a few implementations of these structures (though sequential). As a starting point, i'll look into concurrent/parallel impl of his deque implementation. I've been experimenting with par monad as well. I could possibly help here.

I would be happy to collaborate with anyone working in the same/relating direction.


--Prabhat

Simon Marlow

unread,
Oct 31, 2011, 5:25:21 AM10/31/11
to parallel...@googlegroups.com, mona...@googlegroups.com

Hi Ryan,

 

For work stealing in particular, take a look at Karl-Filip Faxen’s Wool library:

 

http://www.sics.se/~kff/wool/

 

He has investigated many different implementations of work-stealing queues, and Wool has a few nice tricks that improve on algorithms like the Chase/Lev work-stealing deque that we use in the GHC RTS.

 

Cheers,

                Simon

Ryan Newton

unread,
Nov 2, 2011, 2:21:22 PM11/2/11
to mona...@googlegroups.com, parallel...@googlegroups.com
Thanks Simon!  Wool looks quite interesting, and I haven't heard of it before.  At first glance it reminds me a bit of Adam Dunkel's protothreads (from the sensor network world), which may be no coincidence since they're at the same institution:

   http://www.sics.se/~adam/pt/

My ultimate goal is an "abstract-deque" parameterizable interface that abstracts over bounded/unbounded, concurrent/non-concurrent, single, 1.5, and double-ended queues, which would include both, say, Michael & Scott linked queues and the Chase-Lev work-stealing deques.  An aggregated queue package could use type families to dispatch to the best available queue implementation for a given configuration.

I've got a couple drafts of how this might work.  They're in different branches here:


One of them uses an associated data type family, the other an unassociated one.  The type family has a bunch (six) phantom type parameters, and the unassociated one allows hiding those parameters at the expense of introducing more type classes.

MegaQueue.hs will be used to aggregate together different queue implementations.  The end result is that someone can create a new queue by setting all the switches on the type, as follows:

 test = do 
     q :: Deque NT T SingleEnd SingleEnd Bound Safe Int <- newQ
     pushL q 33
     x <- tryPopR q
     print x
     return q

With those settings, requiring only single-ended rather than double-ended queues, the above code can use the LinkedQueue (Michael and Scott) implementation included in that repo.  
   That little test, by the way, segfaults for me on Mac OS under GHC 7.2.1 even WITHOUT using casMutVar# (It's using Data.CAS.Fake presently.)  I'll file a bug report after I sanity check it some more.

Disregarding that... the big problem in this case is the inability to create overlapping type family instances.  

In this case what we dearly WANT to do is specialize some subset of the 64-mode configuration space and then have a "fallback".  You can take a look at my struggling in MegaDeque.  Both of my approaches require specifying explicitly the modes that you don't cover.

Worse, we may want to specialize on element type as well.  For example, for simple scalar types (or even Storable types) it may be desirable to use something foreign, like TBB queues.  But in that case, there's no way to enumerate the types NOT specialized.  As far as I know there is no way for type families to accomplish this (specialize, say "Int", and have a fallback for everything else).  In general, is there a recognized work-around for this?  For example, is this a place where using functional dependencies instead might do the trick?

Also, there are some software-engineering issues here with respect to where to put the instances.  It would be nice to put the instances for DequeClass inside the individual implementations (like LinkedQueue.hs).  But that leads to problems because it's really the job of MegaQueue.hs to figure out how to spread implementations over the configuration-space.  And once you put the instances in LinkedQueue.hs there is of course no way to qualify or hide them upon import....

Feedback welcome.

Cheers,
  -Ryan

Daniel Peebles

unread,
Nov 2, 2011, 2:30:13 PM11/2/11
to rrne...@gmail.com, Edward Kmett, mona...@googlegroups.com, parallel...@googlegroups.com
I haven't been following the details here too closely, but I know that Edward Kmett has worked on concurrent deques, so I'm including him on this thread in case he's not on the list.

-Dan

Bas van Dijk

unread,
Nov 9, 2011, 1:01:29 PM11/9/11
to rrne...@gmail.com, mona...@googlegroups.com, parallel...@googlegroups.com
On 2 November 2011 19:21, Ryan Newton <rrne...@gmail.com> wrote:
>  As far as I know there is no way for type families to accomplish this
> (specialize, say "Int", and have a fallback for everything else)

You may want to add yourself to the CC list of ticket:

http://hackage.haskell.org/trac/ghc/ticket/4259

Even better: add a description to the ticket of your use-case of
overlapping type family instances.

Bas

Edward Kmett

unread,
Nov 9, 2011, 1:25:39 PM11/9/11
to Daniel Peebles, rrne...@gmail.com, mona...@googlegroups.com, parallel...@googlegroups.com
On Wed, Nov 2, 2011 at 2:30 PM, Daniel Peebles <pumpk...@gmail.com> wrote:
On Wed, Nov 2, 2011 at 2:21 PM, Ryan Newton <rrne...@gmail.com> wrote:
Disregarding that... the big problem in this case is the inability to create overlapping type family instances.  

In this case what we dearly WANT to do is specialize some subset of the 64-mode configuration space and then have a "fallback".  You can take a look at my struggling in MegaDeque.  Both of my approaches require specifying explicitly the modes that you don't 
 
Worse, we may want to specialize on element type as well.  For example, for simple scalar types (or even Storable types) it may be desirable to use something foreign, like TBB queues.  But in that case, there's no way to enumerate the types NOT specialized.  As far as I know there is no way for type families to accomplish this (specialize, say "Int", and have a fallback for everything else).  In general, is there a recognized work-around for this?  For example, is this a place where using functional dependencies instead might do the trick?

The usual idiom that I use is to define one general purpose Boxed case.

newtype Wrapped a = Wrap { unwrap :: a }
type family Foo a :: *
type instance Foo (Wrapped a) = ...
type instance Foo Int = ...

Then most things just use the newtype wrapper, it requires the end user to manually wrap and unwrap, but you don't have to rely on scarily inconsistent overlapping families with different representations.

-Edward
Reply all
Reply to author
Forward
0 new messages