[Caml-list] Faking concurrency using Unix forks and pipes (ray tracing results)

Jon Harrop

unread,

Jun 4, 2007, 7:32:05 AM6/4/07

to caml...@yquem.inria.fr

I just got a first working version of .NET style asynchronous invocation
working in OCaml using process forking.

The following OCaml function forks a new process and computes "f x" in that
process, returning a function that blocks and returns the result using
marshalling.

let invoke (f : 'a -> 'b) x : unit -> 'b =
let input, output = Unix.pipe() in
match Unix.fork() with
| 0 ->
Unix.close input;
let output = Unix.out_channel_of_descr output in
Marshal.to_channel output (try `Res(f x) with e -> `Exn e) [];
exit 0
| _ ->
Unix.close output;
let input = Unix.in_channel_of_descr input in
fun () ->
match Marshal.from_channel input with
| `Res x -> x
| `Exn e -> raise e

This function tries to account for reraising exceptions on the parent process
but that is untested.

You can write a higher-order "map" function in terms of invoke like this:

let ( |> ) x f = f x

let map (f : 'a -> 'b) a : 'b array =
Array.map (invoke f) a |>
Array.map (fun f -> f())

When you apply this map to an array, a new process is forked for each element.
As forking is time consuming, you should only apply this to short arrays.

The performance characteristics of this approach are very interesting.
Firstly, I can observe doubled performance on my dual core by invoking two
simple but CPU-intensive operations concurrently:

map fib [|43; 43|]

However, performance is easily degraded using this approach, partly because
forking is expensive but also because of other effects that I do not yet
understand. My original benchmark summed the elements of an array using
fold_left. For some reason, this is extremely inefficient, as if the entire
array is copied.

Anyway, this function is so simple that it took no time to work it into my ray
tracer benchmark. The benefits of concurrency on my dual-core system reduce
the time taken by OCaml from 4s to 3s.

I'll try a concurrent F# version and see how it compares...

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Jonathan Bryant

unread,

Jun 4, 2007, 11:38:57 AM6/4/07

to caml...@yquem.inria.fr

On Jun 4, 2007, at 6:56 AM, Jon Harrop wrote:

>
> When you apply this map to an array, a new process is forked for
> each element.
> As forking is time consuming, you should only apply this to short
> arrays.

It might be worth your while to implement a process pool. I said
before that I fork of one base process from which to fork the others,
and it's not possible to make that process just a controller for a
pool of waiting processes.

--Jonathan

Jonathan Bryant

unread,

Jun 4, 2007, 11:43:23 AM6/4/07

to caml...@yquem.inria.fr

On Jun 4, 2007, at 11:33 AM, Jonathan Bryant wrote:

> process from which to fork the others, and it's not possible to
> make that process just a controller for a

I don't know why I put the "not" in there...

Jonathan Bryant

unread,

Jun 4, 2007, 12:16:22 PM6/4/07

to Benedikt Grundmann, caml...@yquem.inria.fr

On Jun 4, 2007, at 11:50 AM, Benedikt Grundmann wrote:

> Hi Jonathan,
>
> 2007/6/4, Jonathan Bryant <jtbr...@valdosta.edu>:

>>
>> On Jun 4, 2007, at 6:56 AM, Jon Harrop wrote:
>>
>> >
>> > When you apply this map to an array, a new process is forked for
>> > each element.
>> > As forking is time consuming, you should only apply this to short
>> > arrays.
>>
>> It might be worth your while to implement a process pool. I said
>> before that I fork of one base process from which to fork the others,
>> and it's not possible to make that process just a controller for a
>> pool of waiting processes.
>>
>> --Jonathan
>>
>

> Do you also do that in your work for the OSP? I'm asking cause you
> mentioned previously that you want to stay compatible to the ocaml
> threads library... and I just can not figure out how you want to
> implement Thread.create using that base process...

I'm staying with the feel of the threads library, but I'm going to
have to make it incompatible. My API uses functors to build
everything up and then resembles the existing API. I know there has
been some talk on this list about performance issues with that, but I
prefer a clean design over speed (at least at the moment). Mainly
this is because Marshal is not trustworthy, so every message type
needs to have to_ and of_ string functions:

module type Message =
sig
type 'a t
val to_string : 'a t -> string
val of_string : string -> 'a t
end

From there it builds up. I'm actually trying to design it in such a
way that channels aren't limited to UNIX/TCP sockets, and that the
on-wire protocol can be changed as well. Hopefully that will go as
planned.

As for threads, there are actually 3 levels: concurrent (the
existing threads), parallel (forked processes), and remote (forked
process on another machine). The level of concurrency is specified
at thread creation either by using a special version of Thread.create
(such as Thread.concurrent or Thread.parallel) or by an optional
parameter to Thread.create. This allows the programmer to use the
mechanism needed (threads/processes) in the same manner. One
additional change is that threads can either be attached (return a
value), or detached (like they are now and return unit), allowing for
a future-esque usage of threads. Also, since it's internally just
HOFs filled in by the runtime, it allows programs written to run on a
cluster (using remote threads) to run in a single process for testing
(using just concurrent threads), and then moved to a multi-core /
multi-machine environment without any recoding or recompilation.

> BTW: Maybe we should subscribe each others project's mailing list? So
> tha twe can help and learn from each other?
>

I'm already on yours :).

Mine's http://groups.google.com/group/osp2007-northpole

From what you've said, I think we might be covering a lot of the
same ground.

>
> Here is the link to mine:
>
> http://groups.google.com/group/osp2007-econcurrency
>
> Cheers,
>
> Bene
>
>
> --
> Calvin: I try to make everyone's day a little more
> surreal.
>
> (From Calvin & Hobbes)

Jonathan Bryant

unread,

Jun 4, 2007, 12:55:37 PM6/4/07

to Benedikt Grundmann, caml...@yquem.inria.fr

On Jun 4, 2007, at 12:33 PM, Benedikt Grundmann wrote:

>
> It looks a bit more complex, but that's just to avoid big strings in
> case of big messages
> (e.g. with the "simple" interface you end up with the "same" content
> in memory twice).

I think big strings are unavoidable in this case. They can be broken
up at the protocol level for sending, but a large data structure is
going to be marshaled into a big string. As far as same content in
memory twice, that should be the case for pure values even in a
regular OCaml program. As for mutable values, they shouldn't be sent
over a channel to begin with. Once channels are available though,
creating a synchronous mutable cell is only a few lines of code.
(Check out Reppy's book/papers).

> I'm trying to do something similar, but I'm not 100% sure right now
> that it will all work out exactly as I initially thought.. :-) You
> can have a look at my current "interface", I already have a prototype
> commited... src/mailbox.mli is the "main" interface of the library.
> The comments are not totally in sync with the interface right now as
> I'm restructuring a lot these days.

I need to do a commit. I'm using modules which return a record of
functions, and you can compose them to get channels that work over
multiple mediums.

> Okay I'm still a little bit confused how does your Thread.create looks
> like (e.g what is
> its signature)?

Something like:

module Thread =
struct
type con_type =
| Concurrent
| Parallel
| Remote

let create ?(con = Concurrent) f x = (* Create thread *)
let concurrent f x = create ~con:Concurrent f x
let parallel f x = create ~con:Parallel f x
let remote f x = create ~con:Remote f x

let create_attached ?(con = Concurrent) f x = (* Create thread and
return event *)
let concurrent_attached f x = create ~con:Concurrent f x
let parallel_attached f x = create ~con:Parallel f x
let remote_attached f x = create ~con:Remote f x

(* Other thread functions *)
end

I've implemented a good chunk of channels/events so far, but haven't
got to the threads yet, so don't quote me on that exact signature.

--Jonathan

Brian Hurt

unread,

Jun 4, 2007, 2:02:27 PM6/4/07

to Jonathan Bryant, Benedikt Grundmann, caml...@yquem.inria.fr

Jonathan Bryant wrote:

>
> On Jun 4, 2007, at 12:33 PM, Benedikt Grundmann wrote:
>
>>
>> It looks a bit more complex, but that's just to avoid big strings in
>> case of big messages
>> (e.g. with the "simple" interface you end up with the "same" content
>> in memory twice).
>
>
> I think big strings are unavoidable in this case. They can be broken
> up at the protocol level for sending, but a large data structure is
> going to be marshaled into a big string. As far as same content in
> memory twice, that should be the case for pure values even in a
> regular OCaml program. As for mutable values, they shouldn't be sent
> over a channel to begin with. Once channels are available though,
> creating a synchronous mutable cell is only a few lines of code.
> (Check out Reppy's book/papers).

I'm wondering if inversion of control isn't an answer here. The idea is
to have a function of type buf_t -> string -> unit. Instead of building
up a giant string, the of_* functions would call this function on small
strings. Not unlike Buffers. But instead of building up in memory,
these function would fill a buffer and then send it off.

Brian