Google Groups Home
Help | Sign in
Concurrent GC: a good thing?
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  15 messages - Collapse all
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
Jon Harrop  
View profile
 More options Jun 11 2007, 3:39 pm
Newsgroups: comp.lang.functional
From: Jon Harrop <j...@ffconsultancy.com>
Date: Mon, 11 Jun 2007 20:39:30 +0100
Local: Mon, Jun 11 2007 3:39 pm
Subject: Concurrent GC: a good thing?

I have been comparing OCaml and F# recently and .NET's support for
concurrency made F# look preferable.

I had thought that it was not possible to write concurrent programs in OCaml
because its GC is not concurrent. However, I recently realised that you can
actually write concurrent OCaml programs very easily by forking a Unix
process and marshalling the resulting value or exception back to the
parent.

For example, the following "invoke" function spawns a child process
computing f(x) and returns a closure that blocks until the result is ready
and returns it:

let invoke (f : 'a -> 'b) x : unit -> 'b =
  let input, output = Unix.pipe() in
  match Unix.fork() with
  | -1 -> (let v = f x in fun () -> v)
  | 0 ->
      Unix.close input;
      let output = Unix.out_channel_of_descr output in
      Marshal.to_channel output (try `Res(f x) with e -> `Exn e) [];
      close_out output;
      exit 0
  | pid ->
      Unix.close output;
      let input = Unix.in_channel_of_descr input in
      fun () ->
        let v = Marshal.from_channel input in
        ignore (Unix.waitpid [] pid);
        close_in input;
        match v with
        | `Res x -> x
        | `Exn e -> raise e

This is quite a different approach to concurrency and, in particular, each
new process has its own GC.

I have worked this into my ray tracer benchmark and, sure enough, it doubles
performance on my dual-core machine.

In contrast, a parallel F# implementation does not double performance,
perhaps because of context switching between 3 threads (two worker and one
GC) on 2 CPUs.

Given OCaml remains both simpler and faster in all cases, is concurrent GC
worth having? Can it scale as well? Can it be as efficient?

--
Dr Jon D Harrop, Flying Frog Consultancy
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?usenet


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel C. Wang  
View profile
 More options Jun 12 2007, 12:25 am
Newsgroups: comp.lang.functional
From: "Daniel C. Wang" <danwan...@gmail.com>
Date: Mon, 11 Jun 2007 21:25:30 -0700
Local: Tues, Jun 12 2007 12:25 am
Subject: Re: Concurrent GC: a good thing?

Jon Harrop wrote:
> I have been comparing OCaml and F# recently and .NET's support for
> concurrency made F# look preferable.

> I had thought that it was not possible to write concurrent programs in OCaml
> because its GC is not concurrent. However, I recently realised that you can
> actually write concurrent OCaml programs very easily by forking a Unix
> process and marshalling the resulting value or exception back to the
> parent.
{stuff deleted}
> Given OCaml remains both simpler and faster in all cases, is concurrent GC
> worth having? Can it scale as well? Can it be as efficient?

So long as the two threads don't share any common mutable state. I think
two separate processes with their own single-threaded GC will always be
more efficient than a concurrent GC running two threads. The general
case of GCing concurrent threads that share mutable state is
fundamentally more difficult than having two threads that share no state
with separate GCs.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Paul Rubin  
View profile
 More options Jun 12 2007, 12:39 am
Newsgroups: comp.lang.functional
From: Paul Rubin <http://phr...@NOSPAM.invalid>
Date: 11 Jun 2007 21:39:42 -0700
Local: Tues, Jun 12 2007 12:39 am
Subject: Re: Concurrent GC: a good thing?

Jon Harrop <j...@ffconsultancy.com> writes:
> Given OCaml remains both simpler and faster in all cases, is concurrent GC
> worth having? Can it scale as well? Can it be as efficient?

Of course.  Look at the cost of marshalling those large structures
between processes.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jon Harrop  
View profile
 More options Jun 12 2007, 12:44 am
Newsgroups: comp.lang.functional
From: Jon Harrop <j...@ffconsultancy.com>
Date: Tue, 12 Jun 2007 05:44:48 +0100
Local: Tues, Jun 12 2007 12:44 am
Subject: Re: Concurrent GC: a good thing?

Daniel C. Wang wrote:
> So long as the two threads don't share any common mutable state. I think
> two separate processes with their own single-threaded GC will always be
> more efficient than a concurrent GC running two threads. The general
> case of GCing concurrent threads that share mutable state is
> fundamentally more difficult than having two threads that share no state
> with separate GCs.

So concurrent GC makes communication cheaper at the expense of everything
else?

--
Dr Jon D Harrop, Flying Frog Consultancy
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?usenet


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Vityok  
View profile
 More options Jun 12 2007, 2:52 am
Newsgroups: comp.lang.functional
From: Vityok <bob...@ua.fm>
Date: Mon, 11 Jun 2007 23:52:12 -0700
Local: Tues, Jun 12 2007 2:52 am
Subject: Re: Concurrent GC: a good thing?
Jon Harrop        :

> Daniel C. Wang wrote:
> > So long as the two threads don't share any common mutable state. I think
> > two separate processes with their own single-threaded GC will always be
> > more efficient than a concurrent GC running two threads. The general
> > case of GCing concurrent threads that share mutable state is
> > fundamentally more difficult than having two threads that share no state
> > with separate GCs.

> So concurrent GC makes communication cheaper at the expense of everything
> else?

It seems to me, that marshalling can be worked around by implementing
specific serializer/deserializer functions for data structures being
processed.

Thanks to the binary I/O API of the ExtLib[1], I was able to reduce
the size of data produced by marshalling records.  Though, I did not
benchmark performance.

[1]  http://ocaml-lib.sourceforge.net/doc/IO.html

With best regards,
Victor Anyakin


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Paul Rubin  
View profile
 More options Jun 12 2007, 3:05 am
Newsgroups: comp.lang.functional
From: Paul Rubin <http://phr...@NOSPAM.invalid>
Date: 12 Jun 2007 00:05:21 -0700
Local: Tues, Jun 12 2007 3:05 am
Subject: Re: Concurrent GC: a good thing?

Vityok <bob...@ua.fm> writes:
> It seems to me, that marshalling can be worked around by implementing
> specific serializer/deserializer functions for data structures being
> processed.

I'm more familiar with Haskell than ML so forgive me for using Haskell
notation.  Let's say I have a list of pairs (a,b):

   x :: [(Integer, Integer)]

I want to add up all the a's, and I also want to add up all the b's:

   sum(map fst x)
   sum(map snd x)

Because x is a very long list, I want to use separate threads or
processes for these summations, to do them in parallel.

Can any method involving serialization do anywhere near as well as
just letting both threads access x directly?


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel C. Wang  
View profile
 More options Jun 12 2007, 3:54 am
Newsgroups: comp.lang.functional
From: "Daniel C. Wang" <danwan...@gmail.com>
Date: Tue, 12 Jun 2007 00:54:32 -0700
Local: Tues, Jun 12 2007 3:54 am
Subject: Re: Concurrent GC: a good thing?

Jon Harrop wrote:
> Daniel C. Wang wrote:
>> So long as the two threads don't share any common mutable state. I think
>> two separate processes with their own single-threaded GC will always be
>> more efficient than a concurrent GC running two threads. The general
>> case of GCing concurrent threads that share mutable state is
>> fundamentally more difficult than having two threads that share no state
>> with separate GCs.

> So concurrent GC makes communication cheaper at the expense of everything
> else?

That's my basic claim. Note that it only makes shared memory
communication cheaper. If you're going to distribute your job across
machines. It doesn't even help! For the large class of embarrassingly
parallel apps that have little communication costs. The overhead of
concurrent GC which often requires the addition either a read or write
barrier along with some mutator synchronization method. It doesn't seem
worth the trouble, and if you really have a big job you're going to
distribute anyway, so exploiting shared memory isn't going to buy you much.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel C. Wang  
View profile
 More options Jun 12 2007, 4:19 am
Newsgroups: comp.lang.functional
From: "Daniel C. Wang" <danwan...@gmail.com>
Date: Tue, 12 Jun 2007 01:19:47 -0700
Local: Tues, Jun 12 2007 4:19 am
Subject: Re: Concurrent GC: a good thing?
If you have a single list producer that sends the components of each
pair to two consumers you're just marshalling an int. So if you
marshaling on demand in a lazy fashion I see no reason to see why
sharing "x" would be more efficient.

Remember in a shared memory system the data goes over the memory bus to
the cache of each processor. The shared memory model is really just an
interface to message passing hardware. In the end it's all about bits
going over wires. There's nothing magical about shared memory.

I can always decompose my app into a "memory server" that reads
addresses and returns values, and a bunch of concurrent threads that
request locations and receive values back. This will give me the same
algorithmic complexity of any shared memory system.

Of course you could decompose your app in a smarter way and get
potentially better algorithmic complexity and scalability. However, if
you use shared memory you're basically limited to the "memory sever"
model I described. Of course the hardware does a good job of making it
fast. But for really parallel apps you hit a scaling limit and end up
rolling your own message passing in the end to get beyond it.

MPI exists for a reason. BTW this message is from the CAML list is
illustrative of my point.

http://caml.inria.fr/pub/ml-archives/caml-list/2003/07/155910c4eeb09e...


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Tony Finch  
View profile
 More options Jun 12 2007, 6:52 am
Newsgroups: comp.lang.functional
From: Tony Finch <d...@dotat.at>
Date: 12 Jun 2007 11:52:27 +0100 (BST)
Local: Tues, Jun 12 2007 6:52 am
Subject: Re: Concurrent GC: a good thing?

Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
>Jon Harrop <j...@ffconsultancy.com> writes:
>> Given OCaml remains both simpler and faster in all cases, is concurrent GC
>> worth having? Can it scale as well? Can it be as efficient?

>Of course.  Look at the cost of marshalling those large structures
>between processes.

Erlang uses copying to pass messages, even for co-operative threads within
a single OS process, and it isn't crippling. Erlang has a separate GCed
heap for each thread which makes dealing with garbage cheap: small heaps,
often short-lived, are easier to deal with than big sprawling ones. It
also translates well to SMP and distributed settings.

The Erlang system can be run in an alternative mode with a shared heap
and pointer passing for messages, or a hybrid mode where allocated data
is placed on the shared or per-process heap depending on whether static
analysis predicts that it'll be part of a message.

Tony.
--
f.a.n.finch  <d...@dotat.at>  http://dotat.at/
ROCKALL MALIN: EAST OR NORTHEAST BECOMING VARIABLE 3 OR 4. SLIGHT OCCASIONALLY
MODERATE. SHOWERS, FOG PATCHES. MODERATE OR GOOD, OCCASIONALLY VERY POOR.


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jon Harrop  
View profile
 More options Jun 12 2007, 10:13 am
Newsgroups: comp.lang.functional
From: Jon Harrop <j...@ffconsultancy.com>
Date: Tue, 12 Jun 2007 15:13:16 +0100
Local: Tues, Jun 12 2007 10:13 am
Subject: Re: Concurrent GC: a good thing?

That is embarassingly parallel so it needs no serialization => forking will
be faster than concurrent GC.

Indeed, I'm hard pressed to find a practically-important application that
will be better off with a concurrent GC...

--
Dr Jon D Harrop, Flying Frog Consultancy
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?usenet


    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Daniel C. Wang  
View profile
 More options Jun 12 2007, 11:05 am
Newsgroups: comp.lang.functional
From: "Daniel C. Wang" <danwan...@gmail.com>
Date: Tue, 12 Jun 2007 08:05:04 -0700
Local: Tues, Jun 12 2007 11:05 am
Subject: Re: Concurrent GC: a good thing?
Jon Harrop wrote:

{stuff deleted}

> That is embarassingly parallel so it needs no serialization => forking will
> be faster than concurrent GC.

> Indeed, I'm hard pressed to find a practically-important application that
> will be better off with a concurrent GC...

I imagine there are apps where a shared heap can get you a non-trivial
constant factor. However, if you care about scalability the concurrent
GC of a shared heap can only get in your way. I suspect you can prove
the statement above in a formal way.

    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ulf Wiger  
View profile
 More options Jun 13 2007, 5:55 pm
Newsgroups: comp.lang.functional
From: Ulf Wiger <etxu...@cbe.ericsson.se>
Date: Wed, 13 Jun 2007 23:55:58 +0200
Local: Wed, Jun 13 2007 5:55 pm
Subject: Re: Concurrent GC: a good thing?