Way to expose BLACKHOLES through an API?

Ryan Newton

unread,

Nov 7, 2011, 9:50:05 AM11/7/11

to glasgow-haskell-users, parallel-haskell

Hi GHC users,

When implementing certain concurrent systems-level software in Haskell
it is good to be aware of all potentially blocking operations.
Presently, blocking on an MVar is explicit (it only happens when you
do a takeMVar), but blocking on a BLACKHOLE is implicit and can
potentially happen anywhere.

If there are known thunks where we, the programmers, know that
contention might occur, would it be possible to create a variant of
"Control.Monad.Evaluate" that allows us to construct non-blocking
software:

evaluate :: a -> IO a
evaluateNonblocking :: a -> IO (Maybe a)

It would simply return Nothing if the value is BLACKHOLE'd. Of course
it may be helpful to also distinguish the evaluated and unevaluated
states. Further, the above simple version allows data-races (it may
become blackhole'd right after we evaluate). An extreme version would
actively blackhole it to "lock" the thunk... but maybe that's overkill
and there are some other good ideas out there.

A mechanism like the proposed should, for example, allow us to consume
just as much of a lazy Bytestring as has already been computed by a
producer, WITHOUT blocking and waiting on that producer thread, or
migrating the producer computation over to our own thread (blowing its
cache).

Thanks,
-Ryan

Ryan Newton

unread,

Nov 7, 2011, 11:18:02 AM11/7/11

to Jan-Willem Maessen, parallel-haskell, glasgow-haskell-users

Jan voted for the explicit "lockAndBlackhole" version as safer.

I realize that for the Bytestring example all you would want to "gently" consume what is already available is WHNF detection alone. In that scenario you don't want to evaluate anything, just consume what is already evaluated.

I would propose that when you do want to explicitly and actively blackhole something that that call be non-blocking (i.e. if someone else has already blackhole'd you don't wait). So the state machine would go:

tryAcquire x =
case unsafeRTStatus x of
Blackhole -> return Nothing
Unevaluated -> do b <- tryBlackhole x

if b

then return (Just x)

else return Nothing

Evaluated -> return (Just x)

>> It would simply return Nothing if the value is BLACKHOLE'd. Of course
>> it may be helpful to also distinguish the evaluated and unevaluated
>> states. Further, the above simple version allows data-races (it may
>> become blackhole'd right after we evaluate). An extreme version would
>> actively blackhole it to "lock" the thunk... but maybe that's overkill
>> and there are some other good ideas out there.
>

> I'd submit that your latter suggestion is far safer: return Nothing
> unless we successfully blackhole the thunk or find that it's already
> been evaluated. We actually *know* the blocking behavior we'll get,
> and it's behavior we can't easily obtain through any other mechanism
> (eg we'd have to add multiple unsafe indirections through mutable
> cells into the lazy bytestring implementation to obtain the same
> behavior in any other way, and essentially write out the laziness
> longhand losing the benefits of indirection removal and so forth).

>> A mechanism like the proposed should, for example, allow us to consume
>> just as much of a lazy Bytestring as has already been computed by a
>> producer, WITHOUT blocking and waiting on that producer thread, or
>> migrating the producer computation over to our own thread (blowing its
>> cache).
>

> For that you probably want WHNF-or-not detection as well (at least if
> you want to schedule streaming of computation.

Simon Marlow

unread,

Nov 8, 2011, 10:43:17 AM11/8/11

to rrne...@gmail.com, glasgow-haskell-users, parallel-haskell

The problem is that a thunk may depend on other thunks, which may or may
not themselves be BLACKHOLEs. So you might be a long way into
evaluating the argument and have accumulated a deep stack before you
encounter the BLACKHOLE.

Hmm, but there is something you could do. Suppose a thread could be in
a mode in which instead of blocking on a BLACKHOLE it would just throw
an asynchronous exception WouldBlock. Any computation in progress would
be safely abandoned via the usual asynchronous exception mechanism, and
you could catch the exception to implement your evaluateNonBlocking
operation.

I'm not sure this would actually be useful in practice, but it's
certainly doable.

Cheers,
Simon

Duncan Coutts

unread,

Nov 16, 2011, 4:43:27 AM11/16/11

to marl...@gmail.com, rrne...@gmail.com, glasgow-haskell-users, parallel-haskell

On Tue, 2011-11-08 at 15:43 +0000, Simon Marlow wrote:

> Hmm, but there is something you could do. Suppose a thread could be in
> a mode in which instead of blocking on a BLACKHOLE it would just throw
> an asynchronous exception WouldBlock. Any computation in progress would
> be safely abandoned via the usual asynchronous exception mechanism, and
> you could catch the exception to implement your evaluateNonBlocking
> operation.
>
> I'm not sure this would actually be useful in practice, but it's
> certainly doable.

The linux kernel folks have been discussing a similar idea on and off
for the last few years. The idea is to "return" in another thread if the
initial system call blocks.

Perhaps there's an equivalent here. We have an evaluateThingy function
and when the scheduler notices that thread is going to block for some
reason (either any reason or some specific reason) we return from
evaluateThingy with some info about the blocked thread.

The thing that the kernel folks could never decide on was to do with
thread identity: if it was the original thread that blocked and we
return in a new thread, or if the original thread returns and a clone is
the one that blocks.

Or perhaps it's a crazy idea and it would never work at all :-)

Duncan

Simon Marlow

unread,

Nov 16, 2011, 6:07:34 AM11/16/11

to Jean-Marie Gaillourdet, Duncan Coutts, parallel-haskell, glasgow-haskell-users

On 16/11/2011 10:20, Jean-Marie Gaillourdet wrote:
> Hi,

> The difference between the requirements of the Linux kernel folks and the OP is that in the Linux kernel the evaluation has to continue, while in the Haskell case we know that the BLACKHOLE is already evaluated by someone else. I am obviously no expert on the GHC internals, but that is what I understood by reading the papers about the runtime system. So, in GHC I'd say it would make sense to stay in the original thread and throw the exception as Simon Marlow said.

We have a great solution to the thread identity problem already - just
freeze the computation using an asynchronous exception, and return in
the original thread. The freezing process stores the state of the
computation (i.e. the stack) on the heap, where it can be resumed by
just evaluating the same value again.

So I'm still not sure why we would want to do this, and we need a
concrete application to be sure that the design is useful. Ryan had
some application in mind using lazy bytestrings, but I don't think I
really understand how this scheme would help yet.

Cheers,
Simon

Reply all

Reply to author

Forward