Garbage Collection Tasks

0 views
Skip to first unread message

Dan Sugalski

unread,
Dec 29, 2003, 11:13:24 AM12/29/03
to perl6-i...@perl.org
Well, two tasks actually.

What with threads coming in, some of the limitations of a copying
collector are becoming very apparent. While it's really nice, and
just feels really cool, in the single-threaded case, neither nice nor
cool justifies potentially massive headaches and synchronization
issues when sharing data between interpreters. So, time for some work.

The task is twofold:

1) The garbage collection and memory allocation APIs needs to be
formalized and abstracted a bit. (We're likely most of the way there,
but it's been a while since I've looked and, honestly, the GC/DOD
system's seen some pretty major upheavals since then) It also needs
to live in a vtable of sorts that can hang off the interpreter, so we
can swap it out as need be.

2) We need a more traditional, non-moving GC as an alternative to the
copying collector. A modified mark & sweep'd be fine, but as it'll be
living behind the API it doesn't much matter.

#1 is a bit of code digging and gruntwork combined with a bit of
thought (if the API needs formalizing) and can be done even if you've
no clue what's going on in the GC or DOD.

#2 is a bit more interesting and, if we do it right, means that we'll
end up with a pluggable garbage collection system and may (possibly)
be able to have type 3 threads share object arenas and memory pools,
which'd be rather nice. Even if not, it leaves a good window for
experimentation in allocation and collection, which isn't bad either.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Jeff Clites

unread,
Dec 29, 2003, 2:28:46 PM12/29/03
to Dan Sugalski, perl6-i...@perl.org
On Dec 29, 2003, at 8:13 AM, Dan Sugalski wrote:

> 2) We need a more traditional, non-moving GC as an alternative to the
> copying collector. A modified mark & sweep'd be fine, but as it'll be
> living behind the API it doesn't much matter.

This is really only for the chunks of memory backing strings and such,
right? We're already using mark-and-sweep for PMCs and strings and such
(that is, their headers), right?

As I see it, it's really the allocation that is more complicated with a
mark-and-sweep collector (since you have to search for a correct-sized
free chunk, efficiently)--the collection itself is the easy part.
Actually, it seems like this is just the GC_IS_MALLOC case--that
already gives us non-moving GC, and lets malloc (the system version or
the Lea version) deal with managing the bookkeeping.

> #2 is a bit more interesting and, if we do it right, means that we'll
> end up with a pluggable garbage collection system and may (possibly)
> be able to have type 3 threads share object arenas and memory pools,
> which'd be rather nice. Even if not, it leaves a good window for
> experimentation in allocation and collection, which isn't bad either.

This, combined with the mention of needing to drive this off of a
vtable, means being able to determine the GC algorithm at runtime
instead of compile-time (i.e., that was part of your point). I'm all
for that--it will mean less #if's in the code, and it will make the
code a bit clearer.

JEff

Dan Sugalski

unread,
Dec 29, 2003, 2:48:41 PM12/29/03
to Jeff Clites, perl6-i...@perl.org
At 11:28 AM -0800 12/29/03, Jeff Clites wrote:
>On Dec 29, 2003, at 8:13 AM, Dan Sugalski wrote:
>
>>2) We need a more traditional, non-moving GC as an alternative to
>>the copying collector. A modified mark & sweep'd be fine, but as
>>it'll be living behind the API it doesn't much matter.
>
>This is really only for the chunks of memory backing strings and
>such, right? We're already using mark-and-sweep for PMCs and strings
>and such (that is, their headers), right?

Right, this is for the memory pools backing string bodies and other
variable-length things.

>As I see it, it's really the allocation that is more complicated
>with a mark-and-sweep collector (since you have to search for a
>correct-sized free chunk, efficiently)--the collection itself is the
>easy part. Actually, it seems like this is just the GC_IS_MALLOC
>case--that already gives us non-moving GC, and lets malloc (the
>system version or the Lea version) deal with managing the
>bookkeeping.

Allocation's more complex, but so is deallocation. Properly
maintaining a free list with adjacent piece coalescing can be
non-trivial, and there's the added complication of COW so multiple
string headers may be pointing to the same chunk of memory, so
freeing up one's not a sufficient reason to return the memory to the
free pool.

>>#2 is a bit more interesting and, if we do it right, means that
>>we'll end up with a pluggable garbage collection system and may
>>(possibly) be able to have type 3 threads share object arenas and
>>memory pools, which'd be rather nice. Even if not, it leaves a good
>>window for experimentation in allocation and collection, which
>>isn't bad either.
>
>This, combined with the mention of needing to drive this off of a
>vtable, means being able to determine the GC algorithm at runtime
>instead of compile-time (i.e., that was part of your point). I'm all
>for that--it will mean less #if's in the code, and it will make the
>code a bit clearer.

Yep. Care to take a shot at it?

Jeff Clites

unread,
Dec 30, 2003, 2:23:32 PM12/30/03
to Dan Sugalski, perl6-i...@perl.org
On Dec 29, 2003, at 11:48 AM, Dan Sugalski wrote:

>> As I see it, it's really the allocation that is more complicated with
>> a mark-and-sweep collector (since you have to search for a
>> correct-sized free chunk, efficiently)--the collection itself is the
>> easy part. Actually, it seems like this is just the GC_IS_MALLOC
>> case--that already gives us non-moving GC, and lets malloc (the
>> system version or the Lea version) deal with managing the
>> bookkeeping.
>
> Allocation's more complex, but so is deallocation. Properly
> maintaining a free list with adjacent piece coalescing can be
> non-trivial, and there's the added complication of COW so multiple
> string headers may be pointing to the same chunk of memory, so freeing
> up one's not a sufficient reason to return the memory to the free
> pool.

Yep, I think the Lea allocator takes care of the tricky parts of the
free-list management, and the GC_IS_MALLOC code is already handling the
COW cases.

>>> #2 is a bit more interesting and, if we do it right, means that
>>> we'll end up with a pluggable garbage collection system and may
>>> (possibly) be able to have type 3 threads share object arenas and
>>> memory pools, which'd be rather nice. Even if not, it leaves a good
>>> window for experimentation in allocation and collection, which isn't
>>> bad either.
>>
>> This, combined with the mention of needing to drive this off of a
>> vtable, means being able to determine the GC algorithm at runtime
>> instead of compile-time (i.e., that was part of your point). I'm all
>> for that--it will mean less #if's in the code, and it will make the
>> code a bit clearer.
>
> Yep. Care to take a shot at it?

I'll add it to my to-do list. I'm trying to finish up a few things
which I've left sitting 90% done, so if someone else wants to tackle
this before I get to it then go ahead--just post to the list when you
start so that we don't duplicate efforts (and I'll do the same, if I
get to it and no one else has started).

JEff

Reply all
Reply to author
Forward
0 new messages