Register scanning

Steve Fink

Dec 18, 2002, 2:03:12 AM

to perl6-i...@perl.org

Pardon me for reopening a can of particularly slimy worms, but are we
sure we want to require all architecture/os/compiler combinations to
be able to scan all hardware registers for live pointers? This is
looking more and more problematic. For example, IA64 is kind of
similar to Parrot itself: it has two stacks, one for return addresses
and one for register "windows" (not quite the same as Sparc register
windows). setjmp() will not get all of those; you have to do an
asm("flushrs") at least. There seem to be several ways of flushing the
register windows on the Sparc, depending on exactly what version of
things you have.

For a full description of what you would need to do to support most of
the possibilities out there (but still not be complete), I've attached
the relevant file from the Boehm garbage collector. I don't know if we
can use the code therein. (Apparently the Mono people are using the
Boehm collector, btw.)

I propose that we have configure probes for the variations that we can
support, but have everything else (including miniparrot) fall back to
a registration scheme. In other words, whenever you allocate a
collectible pointer and don't immediately anchor it to something, you
have to pass it to a PIN_POBJ(p) macro, and once it's anchored, you
call RELEASE_POBJ(p). Or something like that. That way, we're always
correct. In configurations that someone cared about enough to
implement register scanning, we will also avoid the common-case
overhead of pin/release (by making the macros expand to nothing.)
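
A minimal sketch of how such a configure switch might look. HAS_REGISTER_SCAN is a hypothetical name for whatever the probe would define, and pin_pobj()/release_pobj() are stand-ins that only count calls here; none of these names come from the Parrot source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical configure result: 1 where register scanning is implemented. */
#ifndef HAS_REGISTER_SCAN
#  define HAS_REGISTER_SCAN 0
#endif

/* Stand-ins for the real registration routines; they only count calls. */
static int pin_calls = 0;
static int release_calls = 0;
static void pin_pobj(void *p)     { (void)p; pin_calls++; }
static void release_pobj(void *p) { (void)p; release_calls++; }

#if HAS_REGISTER_SCAN
/* The collector finds the pointer by itself: no common-case overhead. */
#  define PIN_POBJ(p)     ((void)0)
#  define RELEASE_POBJ(p) ((void)0)
#else
/* Fallback that is correct everywhere: explicit registration. */
#  define PIN_POBJ(p)     pin_pobj(p)
#  define RELEASE_POBJ(p) release_pobj(p)
#endif
```

Calling code is written once against the macros, so the same source is correct on every platform and only pays for registration where scanning is unavailable.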

I'm saying this from a rather naive perspective, since I really don't
know how difficult this would be to implement. But right now we're a
long ways away from being correct on all architectures, and forgive my
cowardly conservative attitude, but I really do think that correctness
trumps performance. :-)

mach_dep.c

Steve Fink

Dec 18, 2002, 2:49:50 AM

to perl6-i...@perl.org

As a more concrete demonstration of what I'm talking about, here's an
implementation of the easy part: the pinning and releasing macros.
(UNPIN would probably be better than RELEASE, huh?) It's a naive
implementation with a low fixed limit on the max number of pinned
objects (10), but at least it's fairly fast. Pinned PMCs could
probably be better done as a prebuilt prefix to the next_for_GC list.
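
For readers without the attachment, here is a rough sketch of what a fixed-limit pin table could look like. It is not the contents of pin.patch; the function names, the swap-with-last removal, and the count_root() callback are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PINNED 10   /* low fixed limit, as described in the message */

static void  *pinned[MAX_PINNED];
static size_t n_pinned = 0;

/* Record p as a GC root until it is unpinned. */
static void pin_pobj(void *p)
{
    assert(n_pinned < MAX_PINNED);   /* naive: overflow is a hard error */
    pinned[n_pinned++] = p;
}

/* Remove p, searching from the top so the usual LIFO case is found first. */
static void unpin_pobj(void *p)
{
    size_t i = n_pinned;
    while (i > 0) {
        --i;
        if (pinned[i] == p) {
            pinned[i] = pinned[--n_pinned];  /* fill the hole with the last entry */
            return;
        }
    }
    assert(!"unpin of an object that was never pinned");
}

/* What a collector's root-marking pass might call for the pinned set. */
static void mark_pinned(void (*mark)(void *))
{
    for (size_t i = 0; i < n_pinned; i++)
        mark(pinned[i]);
}

/* Hypothetical mark callback used only to exercise mark_pinned(). */
static size_t roots_seen = 0;
static void count_root(void *p) { (void)p; roots_seen++; }
```

Both operations touch only a small array, which is why a scheme like this can be fast; the price is the hard limit on simultaneously pinned objects.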

pin.patch

Nicholas Clark

Dec 18, 2002, 4:46:21 AM

to perl6-i...@perl.org

I don't know how relevant this thought is, but it should be possible to make
a dynamic list of such objects, provided it always has at least one free
slot maintained at all times. If the pinning routine finds it is about to
fill the free slot, then it pins the thing passed in, and immediately allocates
more space. This might trigger GC, but it's safe as everything is pinned :-)
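
A sketch of that growable list, under some assumptions: the table starts in static storage so that a free slot exists before any allocation happens, plain malloc() stands in for an allocator that could trigger a GC run, and unpinning is assumed to be strictly LIFO. All names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_SLOTS 4

static void  *initial_slots[INITIAL_SLOTS];  /* guarantees a free slot at startup */
static void **pins     = initial_slots;
static size_t pins_len = 0;                  /* slots in use */
static size_t pins_cap = INITIAL_SLOTS;      /* invariant: pins_len < pins_cap */

/* Double the table. In a real VM this allocation is where GC could start. */
static void pins_grow(void)
{
    size_t new_cap = pins_cap * 2;
    void **bigger = malloc(new_cap * sizeof *bigger);
    assert(bigger != NULL);                  /* sketch: no out-of-memory recovery */
    memcpy(bigger, pins, pins_len * sizeof *pins);
    if (pins != initial_slots)
        free(pins);
    pins = bigger;
    pins_cap = new_cap;
}

static void pin_pobj(void *p)
{
    pins[pins_len++] = p;      /* always fits: one slot is kept free */
    if (pins_len == pins_cap)
        pins_grow();           /* p is already recorded, so a GC here is safe */
}

/* Assumes pins are released in reverse order of pinning. */
static void unpin_pobj(void *p)
{
    assert(pins_len > 0 && pins[pins_len - 1] == p);
    pins_len--;
}
```

The ordering inside pin_pobj() is the whole point: the object goes into the reserved slot first, and only then is more space requested.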

Nicholas Clark

Andy Wardley

Dec 18, 2002, 5:01:24 AM

to Steve Fink, perl6-i...@perl.org

Steve Fink wrote:
> (UNPIN would probably be better than RELEASE, huh?)

Maybe ATTACH / DETACH or ACQUIRE / RELEASE?

A

Jason Gloudon

Dec 18, 2002, 10:32:45 AM

to Steve Fink, perl6-i...@perl.org

We have indeed gone through this before. The last time the dominant argument
was that these types of mark/unmark operations can be mis-used just as readily
as malloc/free, because the programmer has to know when and where to call them.
I'm just repeating this for everyone's benefit, not giving an opinion.

Another approach is to register the address of the "PMC *" variables instead of
registering the pointers themselves. This way you let the collector know where
the automatic variables that may hold PMC pointers are, and before that function
returns, a single call is made to unregister all of those variables. This makes
for simple programming rules, but makes for more overhead everywhere it is used.
This is one way of achieving "accurate garbage collection" without compiler
support.
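
To make the variable-registration idea concrete, here is a small sketch. The PMC struct is a stand-in with just a live flag, the fixed-size root table is a simplification, and roots_enter()/root_register()/roots_leave() are hypothetical names, not an existing Parrot API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC { int live; } PMC;   /* stand-in for the real PMC type */

#define MAX_ROOTS 64
static PMC  **root_vars[MAX_ROOTS];     /* addresses of local "PMC *" variables */
static size_t n_roots = 0;

/* Remember where this function's registrations begin. */
static size_t roots_enter(void) { return n_roots; }

/* Tell the collector that *var may hold a PMC pointer. */
static void root_register(PMC **var)
{
    assert(n_roots < MAX_ROOTS);
    root_vars[n_roots++] = var;
}

/* The single call before returning: drop everything registered since enter. */
static void roots_leave(size_t frame)
{
    assert(frame <= n_roots);
    n_roots = frame;
}

/* Root-marking pass: follows each registered variable to its current value. */
static void mark_roots(void)
{
    for (size_t i = 0; i < n_roots; i++)
        if (*root_vars[i] != NULL)
            (*root_vars[i])->live = 1;
}

/* A function written to the rule; mark_roots() stands in for a GC run. */
static int example_uses_temp(PMC *fresh)
{
    size_t frame = roots_enter();
    PMC *tmp = NULL;
    root_register(&tmp);
    tmp = fresh;       /* assigned after registration; still seen by the collector */
    mark_roots();
    roots_leave(frame);
    return tmp->live;
}
```

Because the collector reads the variable rather than a copy of the pointer, reassigning the local needs no extra call; the overhead is the registration on entry and the one call on exit.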

Using register and stack walking means the garbage collector must be
conservative in considering data unreachable, as random bytes on the stack or
in registers that look like valid PMC pointers must be treated as such. A
single such value can cause the collector to retain an unbounded amount of data
that is genuinely unreachable (and possibly delay destruction). This is the
main problem with "conservative garbage collection". The additional CPU
overhead now shows up when the collector runs because it has to do more work to
decide what is a PMC pointer and what is not.
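
The ambiguity can be shown with a toy scanner. This is only a sketch: it assumes a single fixed PMC pool and a flat address space where converting pointers to uintptr_t preserves offsets, and the names are hypothetical rather than taken from trace_system_stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct PMC { int live; int payload; } PMC;  /* stand-in type */

#define POOL_SIZE 8
static PMC pool[POOL_SIZE];

/* Return the pool object w would point at, or NULL if w cannot be a PMC pointer. */
static PMC *looks_like_pmc(uintptr_t w)
{
    uintptr_t lo = (uintptr_t)&pool[0];
    uintptr_t hi = lo + sizeof pool;
    if (w < lo || w >= hi)
        return NULL;                       /* outside the pool */
    if ((w - lo) % sizeof(PMC) != 0)
        return NULL;                       /* not at the start of an object */
    return &pool[(w - lo) / sizeof(PMC)];
}

/* Scan n words of stack or register contents; mark everything that might be a PMC. */
static size_t scan_words(const uintptr_t *words, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        PMC *p = looks_like_pmc(words[i]);
        if (p != NULL) {
            p->live = 1;                   /* must be kept: it might be a real pointer */
            hits++;
        }
    }
    return hits;
}
```

An integer that happens to hold the same bits as a valid object address passes both checks, so the object it names is retained along with everything reachable from it. Every scanned word also pays for the range and alignment tests, which is the extra collector-time cost mentioned above.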

I personally prefer the accurate collector approach for parrot, because it
makes for more predictable performance with zero platform dependent code. Sun
moved to an accurate collector in their production JVM when they introduced
HotSpot.

In any case, I've sent about 2 iterations of a patch to handle SPARC register
windows. To deal with IA-64 and that one platform with non-contiguous stack
frames, we would have to refactor the trace_system_stack function to trace
contiguous chunks of memory. The platform dependent code for IA-64 et al will
want something like this.

BTW... IA-64 seems like such a non-starter of a platform, that some might argue
it's not worth making it a core platform for parrot at this time.

--
Jason

Leopold Toetsch

Dec 18, 2002, 3:54:29 PM

to Steve Fink, perl6-i...@perl.org

Steve Fink wrote:

> I propose that we have configure probes for the variations that we can
> support, but have everything else (including miniparrot) fall back to
> a registration scheme. In other words, whenever you allocate a
> collectible pointer and don't immediately anchor it to something, you
> have to pass it to a PIN_POBJ(p) macro, and once it's anchored, you
> call RELEASE_POBJ(p).

Just setting the live bit on new PObjs would be enough IMHO. We are
currently running with --gc-debug, which collects dead PObjs on each
allocation.

When the live_FLAG is set on get_free_xx() it will not be collected in
the next GC run - which will occur some 1000 allocations later. The rule
is, you have to anchor the PObj in the meantime.

So, without --gc-debug, the above proposal should be safe, provided we know
that we don't construct 1000 array elements w/o anchoring the array first.
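
A toy model of the live-bit idea, to show why a fresh object survives exactly one GC run. The PObj struct, the four-slot pool, and the "anchored" field (standing in for "reachable from a root") are assumptions for illustration; only the notion of setting the live flag in the get_free_xx() path comes from the message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PObj {
    int in_use;     /* slot is allocated */
    int live;       /* live flag, cleared by each sweep */
    int anchored;   /* stand-in for "reachable from some root" */
} PObj;

#define POOL_SIZE 4
static PObj pool[POOL_SIZE];

/* Allocate a slot and set the live flag so the next GC run spares it. */
static PObj *get_free_pobj(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].live = 1;
            pool[i].anchored = 0;
            return &pool[i];
        }
    }
    return NULL;    /* pool exhausted; a real allocator would collect or grow */
}

/* One GC run: mark what is anchored, free what is not live, reset the flags. */
static size_t gc_run(void)
{
    size_t freed = 0;
    for (size_t i = 0; i < POOL_SIZE; i++)
        if (pool[i].in_use && pool[i].anchored)
            pool[i].live = 1;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use)
            continue;
        if (!pool[i].live) {
            pool[i].in_use = 0;
            freed++;
        }
        pool[i].live = 0;
    }
    return freed;
}
```

The first run after allocation frees nothing because the flag is still set; an object that has not been anchored by the run after that is collected. That one-run grace period is what the "anchor it in the meantime" rule relies on.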

leo