It seems Moore's law is taking us in the direction of more cores per
microprocessor with less effort placed on exploring ILP. With the
advent of multi-core processors, and their inevitable ubiquity, are
there any plans, considerations, or ideas for a concurrent garbage
collector in Ocaml?
To my knowledge, Caml Light had a concurrent garbage collector under
development by Xavier Leroy but was abandoned due to significant
technical challenges. Prior to that, there appears to have been some
academic research regarding concurrent GC (Doligez, Leroy).
Thanks,
Jim
_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
There's some more recent stuff eg
"Scalable Real-time parallel GC for SMP", Cheng
(sorry lost the URL ..) which suggests an implementation
for ML like language is quite possible. This paper
describes an algorithms for *parallel* collection,
not merely concurrent collection, that is, multiple
threads (multiple CPUs) participating in the collection.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
12 page journal version:
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/guyb/www/papers/gc2001.pdf
Ph.D. thesis:
http://reports-archive.adm.cs.cmu.edu/anon/2001/CMU-CS-01-174.ps
Perry Cheng was a schoolmate of mine. The world is so small.
Best regards,
Andrej
> It seems Moore's law is taking us in the direction of more cores per
> microprocessor with less effort placed on exploring ILP. With the
> advent of multi-core processors, and their inevitable ubiquity, are
> there any plans, considerations, or ideas for a concurrent garbage
> collector in Ocaml?
It's very nice to have a concurrent run-time system, and we know how
to do it (at significant cost), but it's not really worth the trouble
until we have an answer to this question: what programming language
features do we put on top of it?
Shared memory with threads, locks, and conditions just doesn't cut
it, in terms of writing programs that work.
> To my knowledge, Caml Light had a concurrent garbage collector under
> development by Xavier Leroy but was abandoned due to significant
> technical challenges. Prior to that, there appears to have been some
> academic research regarding concurrent GC (Doligez, Leroy).
The development was done by myself, it was done before the
publications, and as part of the academic research (like everything
we do here).
The most significant challenges are in making Posix threads work
under Unix (threads and Unix signals just don't mix well, given the
currently available APIs).
In summary, you shouldn't hold your breath. In my opinion, we
will need some major breakthrough before we can make good use
of multicore architectures in normal programs. In any language.
-- Damien
I understand where you're coming from, but this point of view implies, I
think, an inappropriate division of labor between language designers and
language users. I agree, ordinary shared-state concurrency with threads is
a disaster. But the burden of figuring out how to write concurrent programs
in a reasonable way is not just the responsibility of the language designer
--- library writers can come up with good abstractions to take advantage of
threads as well.
That is, they could do so if they had access to threads with real
concurrency. Right now, ocaml doesn't provide that, and it's a real
weakness of the language. The lack of a concurrent GC, in my opinion,
stifles innovation in this area, at least within the caml world.
That said, I do understand that a concurrent GC is a big technical
challenge, and I can understand why the ocaml team isn't eager to take it on
right now.
Use my sunrpc library. It allows you to do asynchronous calls. One
process can call the other worker processes and wait until all the
results are back. Of course, the calls are type-safe.
There is even now a manager for forking subprocesses called netplex.
I've recently used these libraries to develop a highly parallelized
system that runs on a cluster of seven machines.
Both libraries (rpc and netplex) are part of the not yet released
ocamlnet2 package. You find the sources here:
https://gps.dynxs.de/svn/lib-ocamlnet2/trunk/
A test release is here (this version is used for the mentioned cluster
and very stable):
http://ocaml-programming.de/packages/ocamlnet-2.2test11.tar.gz
Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
ge...@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------
> It's very nice to have a concurrent run-time system, and we know how
> to do it (at significant cost), but it's not really worth the trouble
> until we have an answer to this question: what programming language
> features do we put on top of it?
>
> Shared memory with threads, locks, and conditions just doesn't cut
> it, in terms of writing programs that work.
For writing shared memory concurrent programs in OCaml, the Event
module is a far better fit than locks, mutexes and conditions (oh,
my!). It fits the functional paradigm much better and makes program
design far easier to reason about. Even simple programs can benefit
(design wise) from its use. One of the major advantages is that it's
defined the same way as most of the other major types in the language
(lists, sets, etc.): recursively, so it flows natrually from the
language. For example, in C/C++/Java (shared memory concurrency), an
algorithm such as merge sort is tricky, and parallelizing it is
downright nasty, but in OCaml with the Event module (message passing
concurrency), merge sort borders on trivial, and parallelizing it is
almost as simple because the recursion of the algorithm and the
recursion of the parallelism go hand in hand. Try it and you'll see.
Mutexes, conditions, and locks really only need to be there for
transliterating sequential, array based parallel code, but if you are
doing that, the question really is why? OCaml outperforms C pretty
much everywhere but that, and the true benefits of OCaml in this
situation (cleanliness, safety, readability, etc.) are much better
exploited by rethinking the algorithm using message passing. I feel
that the (possible) slight hit in speed is completely justified by
the numerous other benefits of the Event module.
> I understand where you're coming from, but this point of view
> implies, I think, an inappropriate division of labor between
> language designers and language users. I agree, ordinary shared-
> state concurrency with threads is a disaster. But the burden of
> figuring out how to write concurrent programs in a reasonable way
> is not just the responsibility of the language designer --- library
> writers can come up with good abstractions to take advantage of
> threads as well.
Again, great abstractions to take advantage of these multi-cored
processors is already in the OCaml standard library: the Event
module. All that we need is the ability to actually take advantage
of it.
> ...threads with real concurrency. Right now, ocaml doesn't provide
> that, and it's a real weakness of the language. The lack of a
> concurrent GC, in my opinion, stifles innovation in this area...
Agreed.
--Jonathan Bryant
> It seems Moore's law is taking us in the direction of more cores per
> microprocessor with less effort placed on exploring ILP. With the
> advent of multi-core processors, and their inevitable ubiquity, are
> there any plans, considerations, or ideas for a concurrent garbage
> collector in Ocaml?
Right now, concurrent garbage collection seems to offer significantly
less throughput. I'm not sure if it's worth all the effort.
Another thing that might be interesting is a way to execute multiple
run-times with independent heaps in a single process, with Ada-style
rendezvous betweeen them and a special low-overhead form of
marshalling.
That said, I do understand that a concurrent GC is a big technical
challenge, and I can understand why the ocaml team isn't eager to take it on
right now.
The ocaml team could document the GC interface and modularize
everything, such that the user can choose between different
garbage collectors (at compile time, or even better at
application start).
Some parallel enthusiast might then come up with a parallel GC,
that is good enough for experiments until the right solutions are
available.
Bye,
Hendrik
I doubt this is possible with native code compilers.
AFAIK allocation, write barrier, and int/box flag bit handling
aren't modelled using standard C ABI calls?
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
_______________________________________________
> The ocaml team could document the GC interface and modularize
> everything, such that the user can choose between different
> garbage collectors (at compile time, or even better at
> application start).
The main cost of a concurrent GC is that the application code has to be
changed to cooperate with the GC. So if you want to be able to use the same
application code for both the concurrent and the non-concurrent GC, you end
up paying the price of concurrent GC in both cases :-(
Stefan
Can you explain this in more detail?
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
_______________________________________________
Frankly, I don't think so. Lorenz Huelsbergen and Phil Winterbottom
implemented their VCGC algorithm [1] for the Limbo programming language
under Inferno and for SMLNJ as well, and their use of that concurrent
garbage collection algorithm required no changes to any Limbo or SMLNJ
application code.
[1] http://cm.bell-labs.com/who/lorenz/papers/ismm98.pdf
--
We must remember that we have more power than our enemies to
worsen our fate.
http://stormwyrm.blogspot.com/
> Frankly, I don't think so. Lorenz Huelsbergen and Phil Winterbottom
> implemented their VCGC algorithm [1] for the Limbo programming language
> under Inferno and for SMLNJ as well, and their use of that concurrent
> garbage collection algorithm required no changes to any Limbo or SMLNJ
> application code.
I meant *compiled* application code, of course. I.e. you can't change the
GC after compilation. Admittedly, the VCGC algorithm is sufficiently
non-intrusive on the mutator that it might be OK to pay its cost even
when not using the VCGC algorithm.
Well, any garbage collection algorithm that requires read or write
barriers on the mutator (not just concurrent algorithms I believe) would
need such changes. The VCGC algorithm has a write barrier used only
when references are changed, and given that most Ocaml code (like SML
code) is written in a functional style where references are seldom used,
it seems that it could be an excellent algorithm.
On the other hand, if you're talking about byte-compiled code, it might
be possible to insert any required write or read barriers in the virtual
machine itself. For Limbo this only required Winterbottom and
Huelsbergen to change the Dis virtual machine; all previously compiled
Limbo code ran properly without change once they replaced the older mark
and sweep collector with it.
> The ocaml team could document the GC interface and modularize
> everything, such that the user can choose between different
> garbage collectors (at compile time, or even better at
> application start).
The main cost of a concurrent GC is that the application code has to be
changed to cooperate with the GC. So if you want to be able to use the same
application code for both the concurrent and the non-concurrent GC, you end
up paying the price of concurrent GC in both cases :-(
For a start it would be sufficient to have a compiler switch that
disables all inlining of GC operations and uses some well defined
library interface instead. Nobody would have to pay anything
then. Optizing performance bottemlacks can wait until the first
alternative garbage collector is out.
Hendrik