[COMMIT] GC_DEBUG, Some GC Fixes, and Remaining GC Bugs

Mike Lambert

unread,

Aug 12, 2002, 3:56:27 AM8/12/02

to perl6-i...@perl.org

Hey,

I re-added the GC_DEBUG define today, and weeded out a bunch of issues.
For those who don't remember, GC_DEBUG (currently in parrot.h) causes
various limits and settings and logic to be setup such that GC bugs occur
relatively soon after the offending code. It allocates one header at
a time, and performs DOD and collection runs extremely frequently
(effectively, anywhere they could possibly occur if GC_DEBUG weren't
defined.) It's goal is to make GC bugs which appear only in complex
programs...appear in simpler ones as well.

Check the cvs-commit traffic if you're interested in what issues I've
fixed already. From what I can tell, two things remain:
- regexes (these are known to be broken. angel's latest patch should fix
these in theory. Probably not worth spending time on fixing these.)
- hashes (these were recently rewritten to use indices, a step forward,
but they aren't 100% clean yet)
- lexicals (there's one remaining issue on the last test I didn't look
into)
- subs (likely includes all variety of them. Basically, I got the wrong
result on one test, instead of GPF failures like I received on the above
bugs.)
- possibly other that got lost in the noise of the above issues

Anyone more well-versed in these departments than I care to take a look at
the potential problems? Just change GC_DEBUG in parrot.h, and you can be
on your way. :)

Thanks,
Mike Lambert

Jonathan Sillito

unread,

Aug 12, 2002, 12:13:36 PM8/12/02

to Mike Lambert, Perl internals mailing list

Mike,

I tried to have a look at the lexicals and subs issues but I am finding
that with the GC_DEBUG flag set to 1, I *always* get a segmentation
fault, even for simple pasm files like :

set I0, 12
end

or just:

end

Is there something I am doing wrong? I did a fresh checkout about an
hour ago.

???
---
Jonathan Sillito

Steve Fink

unread,

Aug 12, 2002, 3:17:53 PM8/12/02

to Mike Lambert, perl6-i...@perl.org

On Mon, Aug 12, 2002 at 03:56:27AM -0400, Mike Lambert wrote:
> - hashes (these were recently rewritten to use indices, a step forward,
> but they aren't 100% clean yet)

I know of at least one remaining problem with these, but...

> Anyone more well-versed in these departments than I care to take a look at
> the potential problems? Just change GC_DEBUG in parrot.h, and you can be
> on your way. :)

I can't get to it because parrot doesn't survive past initialization
for me. When it creates the Array PMC for userargv, it allocates the
PMC first and then the buffer for the array's data. During the
buffer's creation, it does a collection that wipes out the PMC. My
lo_var_ptr and hi_var_ptr are set to reasonable-sounding values at the
top of trace_system_stack(), but I haven't been able to track it
farther yet. Oh, and I do have your recent patch to set
interpreter->lo_var_ptr early.

The userargv PMC is not anchored other than in the C stack, because it
dies in the pmc_new() creation process before the assignment to P0 can
run.

Sean O'Rourke

unread,

Aug 12, 2002, 3:34:28 PM8/12/02

to Steve Fink, Mike Lambert, perl6-i...@perl.org

On Mon, 12 Aug 2002, Steve Fink wrote:
> farther yet. Oh, and I do have your recent patch to set
> interpreter->lo_var_ptr early.

How early is "early"? It looks like setting lo_var_ptr in Parrot_runcode
instead of runops should be necessary/sufficient. If not, does
initializing it to the address of a local in main make this go away?

/s

Jason Gloudon

unread,

Aug 12, 2002, 4:04:40 PM8/12/02

to Mike Lambert, perl6-i...@perl.org

On Mon, Aug 12, 2002 at 03:56:27AM -0400, Mike Lambert wrote:

Here is a workaround for one hash related GC bug.

--
Jason

hash.patch

Steve Fink

unread,

Aug 12, 2002, 4:08:30 PM8/12/02

to Sean O'Rourke, Mike Lambert, perl6-i...@perl.org

Nope. lo_var_ptr was getting set plenty early. This was the problem,
which I'll apply once I make a few more problems go away:

diff -p -u -r1.11 dod.c
--- dod.c 12 Aug 2002 07:46:35 -0000 1.11
+++ dod.c 12 Aug 2002 20:07:48 -0000
@@ -350,9 +350,9 @@ trace_system_stack(struct Parrot_Interp
cur_var_ptr = (size_t)( (ptrdiff_t)cur_var_ptr + direction * PARROT_PTR_ALIGNMENT )
) {
size_t ptr = *(size_t *)cur_var_ptr;
- if (pmc_min < ptr && ptr < pmc_max && is_pmc_ptr(interpreter,(void *)ptr)) {
+ if (pmc_min <= ptr && ptr <= pmc_max && is_pmc_ptr(interpreter,(void *)ptr)) {
last = mark_used((PMC *)ptr, last);
- } else if (buffer_min < ptr && ptr < buffer_max && is_buffer_ptr(interpreter,(void *)ptr)) {
+ } else if (buffer_min <= ptr && ptr <= buffer_max && is_buffer_ptr(interpreter,(void *)ptr)) {
buffer_lives((Buffer *)ptr);
}
}

Steve Fink

unread,

Aug 12, 2002, 4:13:26 PM8/12/02

to Jason Gloudon, perl6-i...@perl.org

On Mon, Aug 12, 2002 at 04:04:40PM -0400, Jason Gloudon wrote:
> On Mon, Aug 12, 2002 at 03:56:27AM -0400, Mike Lambert wrote:
>
> Here is a workaround for one hash related GC bug.

Nice catch. That, in combination with the patch I just posted, fixes
all but 4 of the failures for me.

Mike Lambert

unread,

Aug 12, 2002, 4:23:14 PM8/12/02

to Steve Fink, perl6-i...@perl.org

> > Anyone more well-versed in these departments than I care to take a look at
> > the potential problems? Just change GC_DEBUG in parrot.h, and you can be
> > on your way. :)
>
> I can't get to it because parrot doesn't survive past initialization
> for me. When it creates the Array PMC for userargv, it allocates the
> PMC first and then the buffer for the array's data. During the
> buffer's creation, it does a collection that wipes out the PMC. My
> lo_var_ptr and hi_var_ptr are set to reasonable-sounding values at the
> top of trace_system_stack(), but I haven't been able to track it
> farther yet. Oh, and I do have your recent patch to set
> interpreter->lo_var_ptr early.
>
> The userargv PMC is not anchored other than in the C stack, because it
> dies in the pmc_new() creation process before the assignment to P0 can
> run.

Weird. I had to move the lo_var_ptr initialization code to runcode instead
of runops, in order to avoid collecting the ARGV pmc. The new code looks
like:

void *dummy_ptr;
PMC *userargv;

Is it possible that some systems might put dummy_ptr higher in memory than
userargv, thus causing userargv to become prematurely collected? If
so, there are three options:
- make two dummy ptrs, and choose the lesser of the two.
- set the dummy ptr to userargv, and hope we don't add two
header variables. ;)
- force the setting of lo_var_ptr upon the 'main' code in test_main.c,
above all possible functions.

I think 1 is easiest, but 3 does have the advantage of allowing the user
to do GC stuff outside of the parrot execution loop, like allocating
global variables (like argv, but app-specific), etc. Of course, it also
imposees additional coding overhead on the embedding programmer.

Mike Lambert

Steve Fink

unread,

Aug 12, 2002, 4:41:09 PM8/12/02

to Mike Lambert, perl6-i...@perl.org

On Mon, Aug 12, 2002 at 04:23:14PM -0400, Mike Lambert wrote:
> Weird. I had to move the lo_var_ptr initialization code to runcode instead
> of runops, in order to avoid collecting the ARGV pmc. The new code looks
> like:
>
> void *dummy_ptr;
> PMC *userargv;

You sure? That's what I thought at first too, but then I realized that
(to be userargv)->vtable wasn't surviving the return from pmc_new().
So when it crashes, userargv is filled in, but at the point where GC
scrozzles the PMC, userargv hasn't been assigned yet so it's address
doesn't need to be within the stack range. It's actually the variable
'pmc' inside of pmc_new() that holds onto the newly-created PMC.

> Is it possible that some systems might put dummy_ptr higher in memory than
> userargv, thus causing userargv to become prematurely collected? If
> so, there are three options:
> - make two dummy ptrs, and choose the lesser of the two.

I think anything like this will just get defeated by an optimizing
compiler on some platform.

> - set the dummy ptr to userargv, and hope we don't add two
> header variables. ;)

Ick. But since this *does* keep the problem isolated to Parrot code,
it might be the way to go. If we want another variable, we can call a
set_stack_top() function for each one, and it'll keep the minimum.

> - force the setting of lo_var_ptr upon the 'main' code in test_main.c,
> above all possible functions.

I have a patch that adds an argument to Parrot_init for this purpose.

> I think 1 is easiest, but 3 does have the advantage of allowing the user
> to do GC stuff outside of the parrot execution loop, like allocating
> global variables (like argv, but app-specific), etc. Of course, it also
> imposees additional coding overhead on the embedding programmer.

Is main() even guaranteed to be early enough (assuming the order of
local variable declaration doesn't determine the order of stack
usage)? Actually, it's not really main(), it's whatever function
creates an interpreter and calls Parrot_init().

I can easily imagine a function that creates an interpreter, creates a
PMC (say a hashtable describing the application or something), does
something with the PMC using the interpreter, then shuts down the
interpreter. All in the same function. Describing what's wrong with
this and how to avoid it is kind of complicated. (But perhaps
unavoidable.)

Nicholas Clark

unread,

Aug 12, 2002, 4:48:56 PM8/12/02

to Jason Gloudon, Mike Lambert, perl6-i...@perl.org

On Mon, Aug 12, 2002 at 04:04:40PM -0400, Jason Gloudon wrote:

> On Mon, Aug 12, 2002 at 03:56:27AM -0400, Mike Lambert wrote:
>
> Here is a workaround for one hash related GC bug.

^^^^^^^^^^

> + volatile Buffer *for_gc;
> /* hash->buffer.flags |= BUFFER_report_FLAG; */
>
> /* We rely on the fact that expand_hash() will be called before
> @@ -295,7 +296,12 @@
> hash->max_chain = (HashIndex) -1;
>
> hash->entries = 0;
> - hash->bucket_pool = new_buffer_header(interpreter);
> +
> + /* In order to keep the bucket_pool header from being
> + * collected from expand_hash(), a pointer to it is put on
> + * the system stack where the stack walking code can find it */
> + for_gc = hash->bucket_pool = new_buffer_header(interpreter);
> +

This is a case of a temporary allocated value being only in a CPU register?
If so, I believe that this is no solution of the cause, merely working round
a symptom. The stack walking code needs custom assembler to find all values in
the CPU registers that may not be on the stack.

We cannot control correct third party code that is linked against parrot.
So we cannot expect the solution to GC crashes to be "oh, you need to
get the source to libfoo and declare a variable volatile so that the stack
walking code can see it"

If I've misunderstood the cause of the above problem, oops, sorry.
But I believe that we still need to check for pointers in the CPU registers,
because on a register rich CPU is is not impossible that something has the
only pointer to an in use value still only in a CPU register.

Nicholas Clark
--
Even better than the real thing: http://nms-cgi.sourceforge.net/

Jason Gloudon

unread,

Aug 12, 2002, 6:29:32 PM8/12/02

to Nicholas Clark, Mike Lambert, perl6-i...@perl.org

On Mon, Aug 12, 2002 at 09:48:56PM +0100, Nicholas Clark wrote:

> This is a case of a temporary allocated value being only in a CPU register?

Not in this case. The link to the hash structure from the PMC is not
established, which prevents the custom mark routine in the PMC from identifying
the buffer in question. Another way of fixing this is to change new_hash to
accept a pointer to the PMC's ->data pointer so it can initialize this before
doing any further allocation.

> But I believe that we still need to check for pointers in the CPU registers,
> because on a register rich CPU is is not impossible that something has the
> only pointer to an in use value still only in a CPU register.

Yes. This is definitely necessary. We also need to flush register windows on
platforms like SPARC, IA-64.

--
Jason

Jason Gloudon

unread,

Aug 12, 2002, 7:23:26 PM8/12/02

to perl6-i...@perl.org, Nicholas Clark, Mike Lambert

Here is another way of resolving this, as I mentioned in the last message.
Having new_hash() initialize the PMC's data pointer so that the hash is
immediately visible to the collector.

--
Jason

hash.patch

Jason Gloudon

unread,

Aug 12, 2002, 8:23:06 PM8/12/02

to perl6-i...@perl.org, Nicholas Clark, Mike Lambert

The last patch had an issue I didn't see on the test I was working against.
mark_hash() assumes the buffer_pool is always initialized, but with the patch
this is no longer the case.

--
Jason

hash2.patch

Jason Greene

unread,

Aug 12, 2002, 10:02:04 PM8/12/02

to Jason Gloudon, perl6-i...@perl.org, Nicholas Clark, Mike Lambert

Here is one additional check...

-Jason (The other one)

> ----
>

> Index: parrot/hash.c
> ===================================================================
> RCS file: /cvs/public/parrot/hash.c,v
> retrieving revision 1.19
> diff -u -r1.19 hash.c
> --- parrot/hash.c 7 Aug 2002 20:27:53 -0000 1.19
> +++ parrot/hash.c 13 Aug 2002 00:20:39 -0000
> @@ -126,7 +126,9 @@
> HashIndex i;
>
> buffer_lives((Buffer *)hash);
> - buffer_lives(hash->bucket_pool);
> + if(hash->bucket_pool){
> + buffer_lives(hash->bucket_pool);
> + }
>
> for (i = 0; i <= hash->max_chain; i++) {
> HASHBUCKET *bucket = lookupBucket(hash, i);
> @@ -281,10 +283,12 @@
> return NULL;
> }
>
> -HASH *
> -new_hash(Interp *interpreter)
> +void
> +new_hash(Interp *interpreter, HASH **hash_ptr)
> {
> HASH *hash = (HASH *)new_bufferlike_header(interpreter, sizeof(*hash));
> + *hash_ptr = hash;
> +

> /* hash->buffer.flags |= BUFFER_report_FLAG; */
>
> /* We rely on the fact that expand_hash() will be called before

> @@ -295,11 +299,13 @@

> hash->max_chain = (HashIndex) -1;
>
> hash->entries = 0;

> +
> + /* Ensure mark_hash doesn't try to mark the buffer live */
> + hash->bucket_pool = NULL;
> hash->bucket_pool = new_buffer_header(interpreter);
> /* hash->bucket_pool->flags |= BUFFER_report_FLAG; */
> hash->free_list = NULLBucketIndex;
> expand_hash(interpreter, hash);
> - return hash;
> }
>
> /*=for api key hash_size
> @@ -412,9 +418,10 @@
>
> HASH *
> hash_clone(struct Parrot_Interp * interp, HASH * hash) {
> - HASH * ret = new_hash(interp);
> + HASH *ret;
> BucketIndex* table = (BucketIndex*) hash->buffer.bufstart;
> BucketIndex i;
> + new_hash(interp, &ret);
> for (i = 0; i <= hash->max_chain; i++) {
> HASHBUCKET * b = lookupBucket(hash, i);
> while (b) {
> Index: parrot/classes/perlhash.pmc
> ===================================================================
> RCS file: /cvs/public/parrot/classes/perlhash.pmc,v
> retrieving revision 1.25
> diff -u -r1.25 perlhash.pmc
> --- parrot/classes/perlhash.pmc 12 Aug 2002 07:47:06 -0000 1.25
> +++ parrot/classes/perlhash.pmc 13 Aug 2002 00:20:40 -0000
> @@ -61,7 +61,7 @@
> undef->flags |= PMC_constant_FLAG;
> }
> SELF->flags |= PMC_custom_mark_FLAG;
> - SELF->data = new_hash(INTERP);
> + new_hash(INTERP, (HASH **)&SELF->data);
> }
>
> /* The end of used parameter is passed into the mark_used function of
> Index: parrot/include/parrot/hash.h
> ===================================================================
> RCS file: /cvs/public/parrot/include/parrot/hash.h,v
> retrieving revision 1.6
> diff -u -r1.6 hash.h
> --- parrot/include/parrot/hash.h 7 Aug 2002 19:02:06 -0000 1.6
> +++ parrot/include/parrot/hash.h 13 Aug 2002 00:20:40 -0000
> @@ -20,7 +20,7 @@
> /* HASH is really a hashtable, but 'hash' is standard perl nomenclature. */
> typedef struct _hash HASH;
>
> -HASH *new_hash(Interp * interpreter);
> +void new_hash(Interp * interpreter, HASH **hash_ptr);
> HASH *hash_clone(Interp * interpreter, HASH * hash);
> INTVAL hash_size(Interp * interpreter, HASH *hash);
> void hash_set_size(Interp * interpreter, HASH *hash, UINTVAL size);
>
>

hash3.patch

Jason Greene

unread,

Aug 12, 2002, 11:38:33 PM8/12/02

to Jason Greene, Jason Gloudon, perl6-i...@perl.org, Nicholas Clark, Mike Lambert

Sorry, this version now includes the missing modifications to hash.h.

-Jason

hash4.patch

Jason Greene

unread,

Aug 13, 2002, 12:07:37 AM8/13/02

to Jason Greene, Jason Gloudon, perl6-i...@perl.org, Nicholas Clark, Mike Lambert

One more safety check (fixes another crash bug). Hopefully this is the
last patch patch.

-Jason

> ----
>

> Index: hash.c

> ===================================================================
> RCS file: /cvs/public/parrot/hash.c,v
> retrieving revision 1.19
> diff -u -r1.19 hash.c

> --- hash.c 7 Aug 2002 20:27:53 -0000 1.19
> +++ hash.c 13 Aug 2002 03:30:56 -0000
> @@ -126,8 +126,14 @@

> HashIndex i;
>
> buffer_lives((Buffer *)hash);
> - buffer_lives(hash->bucket_pool);

> -

> + if(hash->bucket_pool){
> + buffer_lives(hash->bucket_pool);
> + }

> +
> + if (hash->buffer.bufstart == NULL) {
> + return end_of_used_list;
> + }

> +
> for (i = 0; i <= hash->max_chain; i++) {
> HASHBUCKET *bucket = lookupBucket(hash, i);

> while (bucket) {
> @@ -281,10 +287,12 @@

> return NULL;
> }
>
> -HASH *
> -new_hash(Interp *interpreter)
> +void
> +new_hash(Interp *interpreter, HASH **hash_ptr)
> {
> HASH *hash = (HASH *)new_bufferlike_header(interpreter, sizeof(*hash));
> + *hash_ptr = hash;
> +
> /* hash->buffer.flags |= BUFFER_report_FLAG; */
>
> /* We rely on the fact that expand_hash() will be called before

> @@ -295,11 +303,13 @@

> hash->max_chain = (HashIndex) -1;
>
> hash->entries = 0;
> +
> + /* Ensure mark_hash doesn't try to mark the buffer live */
> + hash->bucket_pool = NULL;
> hash->bucket_pool = new_buffer_header(interpreter);
> /* hash->bucket_pool->flags |= BUFFER_report_FLAG; */
> hash->free_list = NULLBucketIndex;
> expand_hash(interpreter, hash);
> - return hash;
> }
>
> /*=for api key hash_size

> @@ -412,9 +422,10 @@

>
> HASH *
> hash_clone(struct Parrot_Interp * interp, HASH * hash) {
> - HASH * ret = new_hash(interp);
> + HASH *ret;
> BucketIndex* table = (BucketIndex*) hash->buffer.bufstart;
> BucketIndex i;
> + new_hash(interp, &ret);
> for (i = 0; i <= hash->max_chain; i++) {
> HASHBUCKET * b = lookupBucket(hash, i);
> while (b) {

> Index: classes/perlhash.pmc

> ===================================================================
> RCS file: /cvs/public/parrot/classes/perlhash.pmc,v
> retrieving revision 1.25
> diff -u -r1.25 perlhash.pmc

> --- classes/perlhash.pmc 12 Aug 2002 07:47:06 -0000 1.25
> +++ classes/perlhash.pmc 13 Aug 2002 03:30:57 -0000

> @@ -61,7 +61,7 @@
> undef->flags |= PMC_constant_FLAG;
> }
> SELF->flags |= PMC_custom_mark_FLAG;
> - SELF->data = new_hash(INTERP);
> + new_hash(INTERP, (HASH **)&SELF->data);
> }
>
> /* The end of used parameter is passed into the mark_used function of

> Index: include/parrot/hash.h

> ===================================================================
> RCS file: /cvs/public/parrot/include/parrot/hash.h,v
> retrieving revision 1.6
> diff -u -r1.6 hash.h

> --- include/parrot/hash.h 7 Aug 2002 19:02:06 -0000 1.6
> +++ include/parrot/hash.h 13 Aug 2002 03:30:57 -0000

hash5.patch

Dan Sugalski

unread,

Aug 13, 2002, 2:21:12 AM8/13/02

to Nicholas Clark, Jason Gloudon, Mike Lambert, perl6-i...@perl.org

At 9:48 PM +0100 8/12/02, Nicholas Clark wrote:
>On Mon, Aug 12, 2002 at 04:04:40PM -0400, Jason Gloudon wrote:
>> On Mon, Aug 12, 2002 at 03:56:27AM -0400, Mike Lambert wrote:
>>
>> Here is a workaround for one hash related GC bug.
> ^^^^^^^^^^
>
>> + volatile Buffer *for_gc;
>> /* hash->buffer.flags |= BUFFER_report_FLAG; */
>>
>> /* We rely on the fact that expand_hash() will be called before
>> @@ -295,7 +296,12 @@
>> hash->max_chain = (HashIndex) -1;
>>
>> hash->entries = 0;
>> - hash->bucket_pool = new_buffer_header(interpreter);
>> +
>> + /* In order to keep the bucket_pool header from being
>> + * collected from expand_hash(), a pointer to it is put on
>> + * the system stack where the stack walking code can find it */
>> + for_gc = hash->bucket_pool = new_buffer_header(interpreter);
>> +
>
>This is a case of a temporary allocated value being only in a CPU register?
>If so, I believe that this is no solution of the cause, merely working round
>a symptom. The stack walking code needs custom assembler to find all values in
>the CPU registers that may not be on the stack.

It may also be the case where the values are on the stack, but at a
spot where the GC won't look, as they're allocated in the code that's
calling into the interpreter and not rooted. Which means we *must*
root the argv PMCs quickly.

>If I've misunderstood the cause of the above problem, oops, sorry.
>But I believe that we still need to check for pointers in the CPU registers,
>because on a register rich CPU is is not impossible that something has the
>only pointer to an in use value still only in a CPU register.

Yup. We have to do this, so we might as well go for it now.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Steve Fink

unread,

Aug 13, 2002, 7:14:19 PM8/13/02

to Mike Lambert, perl6-i...@perl.org

In tracing down the next crash bug using GC_DEBUG, I found that the
following code in new_stack is unsafe:

stack->buffer = new_buffer_header(interpreter);
Parrot_allocate(interpreter, stack->buffer,
sizeof(Stack_Entry_t) * STACK_CHUNK_DEPTH);

A GC can be triggered by Parrot_allocate that frees the newly created
buffer. This shows up when creating interpreter->ctx.pad_stack.

This is really the same problem as the hash->buffer bug. The system
stack walker will find the stack variable, but won't know how to look
inside it to find the buffer.

I can think of the following ways of fixing this (these are all
rehashes of things we've used or discussed in the past):

1. Also putting stack->buffer into a local variable and praying that
it doesn't get optimized away
2. Disabling GC during the Parrot_allocate() call
3. Passing in an extra OUT parameter to new_stack so we can anchor the
stack before the Parrot_allocate() call.
4. Tagging the newly created number with a generation counter so it
won't get freed until after the next opcode is run (in this case,
that'll be the first opcode). (This requires implementing Peter
Gibbs' (?) idea of not freeing things created in the current generation.)
5. Stuffing the new buffer into a temporary holding area that is
added to the root set.
6. Setting a neonate flag on the buffer that'll get automatically get
cleared at an appropriate time.
7. Setting an immune flag on the buffer and manually clearing it after
anchoring the stack to the root set.
8. Wrapping the stack in a PMC (so the stack struct would go into a
->data buffer and would be traversed with a custom mark routine) and
being very, very careful about the order of initialization.

Any votes?

I have also come to the conclusion that tracking down these memory
bugs is way too difficult right now. I tracked the above problem back
from a seg fault, which was resulting from a garbage value in the byte
code stream, which was triggered by adding a PMC to the free list
until I added a bunch of debugging code. With the debugging code, I
could see that the dead userargv PMC (that wasn't really dead because
it was on the system stack even though it would never get used again)
pointed to a PMC with a bogus type, but also the memory corruption was
now triggered by a stack push where the stack and some PMC incorrectly
shared a buffer.

Or, in summary: these kinds of problems result in memory corruption
that is unlikely to be immediately fatal.

I wonder if we could add something to GC_DEBUG that would mark all
dead Buffers and PMCs in an easily recognizable way, and then check
all live PMCs and Buffers to be sure that nothing they point to is
marked as being dead. (Right now, they all just unconditionally mark
things as "live for this DOD pass".)

Steve Fink

unread,

Aug 13, 2002, 10:28:55 PM8/13/02

to Mike Lambert, perl6-i...@perl.org

This is a patch that clears up all of the GC_DEBUG-revealed bugs, at
least on my machine. As I enumerated in my previous email, there are a
lot of different ways to fix these sorts of problems, and this just
kinda picks one for each problem encountered. This patch includes the
stuff done by Jason {Gloudon,Greene}.

In summary, this patch

- Adds a parameter to Parrot_init() to allow setting the stack top.
- Fixes a bug in dod.c where the first PMC might be skipped (changed < to <=)
- Adds some NULL checks in mark_hash()
- Adds an OUT parameter to new_hash() so the hash is anchored to the root set
while it is being constructed.
- Adds an OUT parameter to hash_clone() for early anchoring.
- Repeatedly calls lookupBucket() while cloning a hash in case things moved
- In interpreter.c, asserts that a few of the early buffer creations do not
return the same buffer (provides early warning of GC mischief)
- Adds an OUT parameter to rx_allocate_info() for early anchoring.
- Makes a major change to the Pointer PMC: the previously unused ->cache area
is now used to hold a pointer to a custom mark routine that will get fired
during PMC traversal. Previously, Pointers had the PMC_private_GC_FLAG set,
but nothing ever looked at it. With this change, Pointers behave as they
always did unless something externally sets the ->cache.struct_val field
(in other words, there is no vtable entry for setting the mark routine,
and the PMC's custom mark routine does nothing if that field is NULL.)
- Reorders the rx_allocinfo opcode to assign things in the correct order and
fill in the ->cache.struct_val field of the Pointer PMC it creates.
- Briefly disables DOD while a stack is being created so allocating the contents
of the stack buffer doesn't destroy the unanchored buffer header.
- Sets the PMC_custom_mark_FLAG on a cloned PerlHash PMC.
- Fixes a comment in the cloning test case in t/pmc/perlhash.t

Somebody gimme a cookie.

If the rx info object is going away, then obviously those parts of the
patch need not be applied. But in the meantime, it's nice to have a
Parrot that doesn't crash.

I'm not going to apply this patch yet because I'm sure someone will
disagree with how it fixes some or all of these bugs. So would that
someone please speak up? Thanks.

patch

Daniel Grunblatt

unread,

Aug 13, 2002, 11:05:53 PM8/13/02

to ja...@inetgurus.net, perl6-i...@perl.org

On 12 Aug 2002, Jason Greene wrote:

> One more safety check (fixes another crash bug). Hopefully this is the
> last patch patch.

Applied, thanks.

Daniel Grunblatt.

Jason Greene

unread,

Aug 13, 2002, 11:10:51 PM8/13/02

to Steve Fink, Mike Lambert, perl6-i...@perl.org

> I have also come to the conclusion that tracking down these memory
> bugs is way too difficult right now. I tracked the above problem back
> from a seg fault, which was resulting from a garbage value in the byte
> code stream, which was triggered by adding a PMC to the free list
> until I added a bunch of debugging code. With the debugging code, I
> could see that the dead userargv PMC (that wasn't really dead because
> it was on the system stack even though it would never get used again)
> pointed to a PMC with a bogus type, but also the memory corruption was
> now triggered by a stack push where the stack and some PMC incorrectly
> shared a buffer.
>
> Or, in summary: these kinds of problems result in memory corruption
> that is unlikely to be immediately fatal.
>
> I wonder if we could add something to GC_DEBUG that would mark all
> dead Buffers and PMCs in an easily recognizable way, and then check
> all live PMCs and Buffers to be sure that nothing they point to is
> marked as being dead. (Right now, they all just unconditionally mark
> things as "live for this DOD pass".)

+1 to something like this, I have been spent too many hours at the
debugger trying to figure out some of these GC problems (one of them
involving large buffers I still haven't solved)

-Jason

Mike Lambert

unread,

Aug 14, 2002, 3:16:20 AM8/14/02

to Steve Fink, perl6-i...@perl.org

> Somebody gimme a cookie.

/me hands Steve a cookie.

> If the rx info object is going away, then obviously those parts of the
> patch need not be applied. But in the meantime, it's nice to have a
> Parrot that doesn't crash.

I agree. My disclaimer about the regex code in my original email was to
suggest that we didn't need to focus on the rx issues, but if you've
already done it... :)

> I'm not going to apply this patch yet because I'm sure someone will
> disagree with how it fixes some or all of these bugs. So would that
> someone please speak up? Thanks.

I suppose that someone is me, although there might be other someones.

> In summary, this patch
>

> - Adds an OUT parameter to new_hash() so the hash is anchored to the root set
> while it is being constructed.
> - Adds an OUT parameter to hash_clone() for early anchoring.

> - Adds an OUT parameter to rx_allocate_info() for early anchoring.

> - Briefly disables DOD while a stack is being created so allocating the contents
> of the stack buffer doesn't destroy the unanchored buffer header.

These are needed for now. However, when we get that buffer/pmc unification,
we should be able to make mark() methods in the header pools. Then, with
support for non-pmc-wrapped buffers, we can find references to them
on the system stack, and call their mark() method directly, avoiding
the above hoops. At least, that's my hope. Is it possible to mark the
above code with some XXX tag so that we can re-address it when we get the
unification in place?

> - Makes a major change to the Pointer PMC: the previously unused ->cache area
> is now used to hold a pointer to a custom mark routine that will get fired
> during PMC traversal. Previously, Pointers had the PMC_private_GC_FLAG set,
> but nothing ever looked at it. With this change, Pointers behave as they
> always did unless something externally sets the ->cache.struct_val field
> (in other words, there is no vtable entry for setting the mark routine,
> and the PMC's custom mark routine does nothing if that field is NULL.)
>
> - Reorders the rx_allocinfo opcode to assign things in the correct order and
> fill in the ->cache.struct_val field of the Pointer PMC it creates.

These are a bit hackish, but I agree they are needed to solve our GC_DEBUG
problems (and by extension, "real-world Parrot programs" ;). Both of these
should also be able to "go away" with the unification, so see previous
paragraph. :)

I think I'm going to make GC_DEBUG a parameter of the interpreter, and
allow it to be turned on/off via opcodes. Then we could force our test
suite to use GC_DEBUG to root out GC problems a lot sooner than they
otherwise would. Fixing all GC_DEBUG problems would help allow this kind
of testing to be part of the standard test suite.

> - In interpreter.c, asserts that a few of the early buffer creations do not
> return the same buffer (provides early warning of GC mischief)

Oooh, nice! :)

The rest of the things you listed, which I didn't comment on are, imo,
perfectly fine.

In conclusion, I don't have any objections to this patch, although it
would be nice if "XXX Unification" markers were included in places that
needed to be addressed later.

Mike Lambert

Steve Fink

unread,

Aug 15, 2002, 7:17:35 PM8/15/02

to Mike Lambert, perl6-i...@perl.org

On Wed, Aug 14, 2002 at 03:16:20AM -0400, Mike Lambert wrote:
> > If the rx info object is going away, then obviously those parts of the
> > patch need not be applied. But in the meantime, it's nice to have a
> > Parrot that doesn't crash.
>
> I agree. My disclaimer about the regex code in my original email was to
> suggest that we didn't need to focus on the rx issues, but if you've
> already done it... :)

Yes, I knew that when I tackled it, but after climbing the painful
learning curve to fix all the other problems, I figured I might as
well nail this one down too while I was in there and in the right
mindset (or maybe it was just that I didn't want to delete my zillions
of debugging printouts quite yet...)

> > In summary, this patch
> >
> > - Adds an OUT parameter to new_hash() so the hash is anchored to the root set
> > while it is being constructed.
> > - Adds an OUT parameter to hash_clone() for early anchoring.
> > - Adds an OUT parameter to rx_allocate_info() for early anchoring.
> > - Briefly disables DOD while a stack is being created so allocating the contents
> > of the stack buffer doesn't destroy the unanchored buffer header.
>
> These are needed for now. However, when we get that buffer/pmc unification,
> we should be able to make mark() methods in the header pools. Then, with
> support for non-pmc-wrapped buffers, we can find references to them
> on the system stack, and call their mark() method directly, avoiding
> the above hoops. At least, that's my hope. Is it possible to mark the
> above code with some XXX tag so that we can re-address it when we get the
> unification in place?

That will fix some of these problems, but not all. For one, the
Pointer PMC needs a custom mark routine in order to work properly
*after* the initial creation, because whatever the Pointer PMC points
to may contain PMCs and Buffers that probably won't be on the system
stack after creation. So some way to define a custom mark routine will
still be needed (or we'll need to restrict Pointer PMCs to point to
objects that are not subject to garbage collection, either because
they are allocated externally to Parrot, or they are anchored in some
other way.)

Also, the rxinfo struct and the Stack_Chunk_t struct are both
allocated with mem_sys_allocate, so they won't be helped by making
Buffers smarter. The problem is that the system stack walk only sees
the address of the struct, and doesn't know how to delve inside it. I
don't know if there's any good reason why they can't be allocated with
new_bufferlike_header (that's how the hash struct works), in which
case the smarter Buffers would fix the problem. Or was this what you
were intending? If so, then I'll plaster each of those with a FIXME
comment. Otherwise, I'll leave them as-is.