On Sat, Apr 23, 2011 at 12:41 AM, Kurt Stephens
<kur...@gmail.com> wrote:
Sorry, forgot to reply to you directly.
-KAS
-------- Original Message --------
On 4/14/11 11:57 AM, rogerdpack wrote:
> There was (apparently) a generational GC tried once (that was slower),
> but it wasn't the BDW collector, it was hand made.
AFAIK, generational GC requires efficient write barriers and causes more
fragmentation if collection does not compact objects, which MRI does not
(and cannot) do because it is conservative.
Matz mentioned that using BDW caused ruby to crash, but I don't know how
long ago that was. I suspect that was due to a memory
overflow/corruption bug in ruby that caused BDW to crash, but did not
cause symptoms when using Ruby's gc.c and malloc().
> @Kurt I might be interested in helping, though the internals of ruby's
> GC scare me (such as...how do you mark the roots so that the BDW
> doesn't collect everything?)
BDW doesn't require the registration of any roots. It conservatively
scans all (known) data segments, including data segments within dynamic
libraries (see GC_no_dls in gc.h). This behavior is optional. There's
a bunch of BDW features that are unnecessary for MRI. BDW mark
callbacks might not behave well with REE's zero-copy stacks
limitations. Will need to configure BDW to not replace malloc()/free().
Yeah I did that once I believe. It wasn't too hard to just rip out gc.c and replace it with BDW equivalents. It did seg fault, though, so I'm not sure what was going on. I'd be happy to try it again.
Maybe we can continue the conversation with this group:
http://groups.google.com/group/ruby-optimization?pli=1
ObjectSpace.each_object is difficult to implement with BDW, but it could
be done by hacking BDW itself;
Yeah I don't care much for it either. I guess mod_rails uses it to look up all IO objects before forking, but I'm not sure of anything else.
I started writing SMAL specifically to replace MRI gc.c: minimal,
configurable and extendable. SMAL supports each_object. I'm not a fan
of each_object at all; should probably look at how JRuby handles it.
SMAL would need to be configured to use its mark_queue.c with REE's
zero-copy stacks.
Yeah I would probably do an initial fork from MRI, or do you think forking REE would be better?
A strategy for replacing MRI gc.c with another GC library would be to
create a stub (gc_stub.c) where rb_newobj() calls malloc(), rb_gc() does
nothing, and all the remaining non-static functions are stubbed out.
Then use gc_stub.c as boilerplate/internal API for hooking in other
GCs. Change the original gc.c to conform to the new internal API. Then
hook in other GC libraries.
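A minimal sketch of what such a gc_stub.c could look like, assuming a
table of function pointers as the internal API. Every name below
(gc_backend, gc_set_backend, sketch_newobj, sketch_gc) is hypothetical
and is not taken from MRI; the real rb_newobj() takes no size argument
and returns a VALUE, so the signatures are simplified for illustration.

```c
/* gc_stub.c: illustrative sketch only, not MRI's actual internal API. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backend table; each GC library would fill one in. */
typedef struct gc_backend {
    const char *name;
    void *(*alloc_object)(size_t size);                /* would back rb_newobj() */
    void  (*collect)(void);                            /* would back rb_gc() */
    void  (*add_final)(void *obj, void (*fn)(void *)); /* finalizer hooks */
    void  (*rm_final)(void *obj);
} gc_backend;

/* Counts successful stub allocations, so the stub can be observed. */
static size_t stub_allocated = 0;

/* Stub allocation: plain malloc(), nothing is ever reclaimed. */
static void *stub_alloc_object(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        stub_allocated++;
    return p;
}

/* Stub collection: intentionally does nothing. */
static void stub_collect(void) { }

/* Stub finalizer hooks: accept and ignore the request. */
static void stub_add_final(void *obj, void (*fn)(void *)) { (void)obj; (void)fn; }
static void stub_rm_final(void *obj) { (void)obj; }

static const gc_backend stub_backend = {
    "stub", stub_alloc_object, stub_collect, stub_add_final, stub_rm_final
};

/* The stub is the default until another backend is hooked in. */
static const gc_backend *current_backend = &stub_backend;

/* Hook in another GC (the original gc.c, BDW, SMAL, ...). */
void gc_set_backend(const gc_backend *b)
{
    assert(b != NULL);
    current_backend = b;
}

/* Simplified stand-ins for rb_newobj() and rb_gc(): they only dispatch. */
void *sketch_newobj(size_t size) { return current_backend->alloc_object(size); }
void  sketch_gc(void)            { current_backend->collect(); }

/* Accessor so callers can check how many objects the stub handed out. */
size_t sketch_stub_allocated(void) { return stub_allocated; }
```

With the stub in place, sketch_newobj(40) returns malloc()'ed memory and
sketch_gc() is a no-op; swapping collectors would then mean filling in
another gc_backend and passing it to gc_set_backend().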
The biggest difficulties will be interfacing marking (gc_mark_children()
and friends) and finalizers (add_final(), rm_final()) to other GC
libraries; gc_mark_children() should be refactored.
BDW supports finalizers I think, so last time I tried this I just hooked into that for anything that currently has a finalizer, type-wise. Like I say, it segfaulted, but it wasn't too hard so might be a proof of concept.
So is your first suggestion to try to plug smal into ruby, or to plug in BDW, or to make a generic interface?
Re: SMAL, my first concern is that it basically requires a finalizer for all objects that have any malloc'ed data assigned to them; is that right?
Cheers!
-r