Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Infant mortality

5 views
Skip to first unread message

Benjamin Goldberg

unread,
Aug 3, 2003, 11:05:31 PM8/3/03
to perl6-i...@perl.org
I was recently reading the following:

http://www.parrotcode.org/docs/dev/infant.dev.html

It's missing some things. One of which is the (currently used?) way of
preventing infant mortality: anchor right away, or else turn off DoD
until the new object isn't needed.

This document doesn't mention another technique, which was mentioned
recently:

http://groups.google.com/groups?
selm=m2k7asg87i.fsf%40helium.physik.uni-kl.de

, at the "use a linked list of frames" part.


Another similar idea (one which I thought of myself, so feel free to
shoot it down!) is to use a generational system, with the "current
generation" as a value on the C stack, passed as an argument after the
interpreter. That is, something like:

foo(struct ParrotInterp *interpreter, int generation, ...)
{
PMC * temp = bar(interpreter, generation);
baz(interpreter, generation+1);
}

Because inside baz(), generation is a higher value than it was when temp
was created, a DOD run inside of baz() won't kill foo.

During a DOD run, any PMC with a generation less than or equal to the
current generation is considered live. Any PMC with a generation
greater than the current generation gets it's generation set to 0.

Like the linked list scheme, this works through longjmps and recursive
run_cores, and it's much simpler for the user, too: just add one to the
generation to prevent all temporaries in scope from being freed.

It similarly has the drawback of altering the signature of every parrot
function.

There's another drawback I can think of... consider:

foo(struct ParrotInterp *interpreter, int generation, ...)
{
PMC * temp = bar(interpreter, generation);
baz(interpreter, generation+1);
qux(interpreter, generation+1);
}

If baz creates a temporary object and returns, then qux performs a DOD,
baz's (dead) object won't get cleaned up.

This could be solved by keeping a stack of newly created objects, and
providing some sort of generational_dod_helper() function, which would
do something like:
while( neonates && neonates->top->generation > current_generation ) {
neonates->top->generation = 0;
neonates = neonates->next;
}
, and calling that in foo between baz and qux. (And maybe sometimes at
toplevel, between opcodes... at times when the generation count in a
"normal" generation count scheme (with a global counter) would be
incremented) You lost a bit of simplicity, by having to call this
function occcasionally, but it can save a bit of memory.

--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}

Juergen Boemmels

unread,
Aug 4, 2003, 9:28:25 AM8/4/03
to Benjamin Goldberg, perl6-i...@perl.org
Benjamin Goldberg <ben.go...@hotpop.com> writes:

> I was recently reading the following:
>
> http://www.parrotcode.org/docs/dev/infant.dev.html
>
> It's missing some things. One of which is the (currently used?) way of
> preventing infant mortality: anchor right away, or else turn off DoD
> until the new object isn't needed.

This is Solution 3: Explicit root set augmentation

Its not explained in so much details than the other solutions.

> This document doesn't mention another technique, which was mentioned
> recently:
>
> http://groups.google.com/groups?
> selm=m2k7asg87i.fsf%40helium.physik.uni-kl.de
>
> , at the "use a linked list of frames" part.

In principle this is a variant of the explicit root set
augmentation. The linked list of frames is part of the root set. The
big advantages of this list is that return and longjmp automatically
drop the now unused objects, because each partial list is stored in
the C-stack-frames.

> Another similar idea (one which I thought of myself, so feel free to
> shoot it down!) is to use a generational system, with the "current
> generation" as a value on the C stack, passed as an argument after the
> interpreter. That is, something like:
>
> foo(struct ParrotInterp *interpreter, int generation, ...)
> {
> PMC * temp = bar(interpreter, generation);
> baz(interpreter, generation+1);
> }
>
> Because inside baz(), generation is a higher value than it was when temp
> was created, a DOD run inside of baz() won't kill foo.

This is a solution similar to Solution 2 / Variant 4: Generation
count.

> During a DOD run, any PMC with a generation less than or equal to the
> current generation is considered live.

You can even use mark the current generation as free. If a function
wants to keep data it uses Parrot_do_DOD(interp, generation + 1). If
it does not have any data to keep it calls Parrot_do_DOD(interp,
generation).

> Any PMC with a generation
> greater than the current generation gets it's generation set to 0.

You mean is marked as free. An anchored object should be set to 0.

> Like the linked list scheme, this works through longjmps and recursive
> run_cores, and it's much simpler for the user, too: just add one to the
> generation to prevent all temporaries in scope from being freed.
>
> It similarly has the drawback of altering the signature of every parrot
> function.

If this will lead to exact and fast DOD we might bite the bullet.

> There's another drawback I can think of... consider:
>
> foo(struct ParrotInterp *interpreter, int generation, ...)
> {
> PMC * temp = bar(interpreter, generation);
> baz(interpreter, generation+1);
> qux(interpreter, generation+1);
> }
>
> If baz creates a temporary object and returns, then qux performs a DOD,
> baz's (dead) object won't get cleaned up.

When asume the current generation as free (less than instead of less
equal) this is not problem. Only if qux calls another function with an
increased generation count without a prior call to Parrot_do_DOD the
temporaries of baz will survive. But this can be solved by:

foo(struct ParrotInterp *interpreter, int generation, ...)
{
PMC * temp = bar(interpreter, generation);
baz(interpreter, generation+1);

Parrot_do_DOD(interpreter, generation+1); /* free the temporaries baz */
qux(interpreter, generation+1);
}

> This could be solved by keeping a stack of newly created objects, and
> providing some sort of generational_dod_helper() function, which would
> do something like:
> while( neonates && neonates->top->generation > current_generation ) {
> neonates->top->generation = 0;
> neonates = neonates->next;
> }
> , and calling that in foo between baz and qux. (And maybe sometimes at
> toplevel, between opcodes... at times when the generation count in a
> "normal" generation count scheme (with a global counter) would be
> incremented) You lost a bit of simplicity, by having to call this
> function occcasionally, but it can save a bit of memory.

In principle your generational_dod_helper is the sweep-function of the
garbage-collector (or in parrot its called free_unused_objects). If
you are sure that no objects have been anchored since the last mark no
further mark is necessary (Parrot_do_DOD). But I'm not sure if you can
guarantee that.

I have the feeling that extending the signature of all
Parrot-functions will remove the need of walking the C-Stack
entirely. If this will be the linked list of frames or the generation
count is more or less a matter of taste: The generation count forces
some intermediate DOD-runs, whereas the linked list of stack
frames makes the creation of temporaries a little bit more
complicated.

bye
boe
--
Juergen Boemmels boem...@physik.uni-kl.de
Fachbereich Physik Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47

Dan Sugalski

unread,
Aug 4, 2003, 10:22:02 AM8/4/03
to Juergen Boemmels, Benjamin Goldberg, perl6-i...@perl.org
At 3:28 PM +0200 8/4/03, Juergen Boemmels wrote:
>I have the feeling that extending the signature of all
>Parrot-functions will remove the need of walking the C-Stack
>entirely. If this will be the linked list of frames or the generation
>count is more or less a matter of taste: The generation count forces
>some intermediate DOD-runs, whereas the linked list of stack
>frames makes the creation of temporaries a little bit more
>complicated.

I'm still not convinced that any of the non-stackwalking alternatives
are really worth it. Stackwalking as a higher per-run cost (as stack
elements are more expensive to evaluate) and potential portability
issues (as we can't guarantee stacks on a platform, and may have fun
getting register contents) but has the advantage of having no
per-allocation anchoring expense and reducing the number of places in
the source that have to be specially designed to not die too soon.

We have a solution, and it works. The garbage collector certainly has
things that can be done to it, but at this point we'd be better
served by work on a generational collector, or teaching the dead
sweep to build a tree of dead objects for ordered destruction.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Benjamin Goldberg

unread,
Aug 4, 2003, 2:54:03 PM8/4/03
to perl6-i...@perl.org

Actually, I meant generation set to MAX_INT, not 0.

Marking pmcs as free happens at the end of DOD. Marking pmcs as live or
dead happens earlier. I was thinking of something like:

foreach(pmc in all_pmcs) {
if( pmc->generation <= current_generation )
mark pmc as live
if( pmc is an aggregate )
add pmc to list of pmcs to be traced
/* don't alter it's generation */
else
mark pmc as dead
/* and if it becomes alive again, then */
/* on the next DOD, it shouldn't be considered */
/* alive due to being neonate/on the stack */
pmc->generation = MAX_INT;
}
foreach(pmc in root set) {
if( pmc is marked as live ) next;
mark pmc as live
if( pmc is an aggregate )
add pmc to list of pmcs to be traced
}
trace the pmcs in the list of pmcs to be traced
foreach(pmc in all_pmcs) {
if( pmc isn't live )
add pmc to the free list
}

> > Like the linked list scheme, this works through longjmps and recursive
> > run_cores, and it's much simpler for the user, too: just add one to
> > the generation to prevent all temporaries in scope from being freed.
> >
> > It similarly has the drawback of altering the signature of every
> > parrot function.
>
> If this will lead to exact and fast DOD we might bite the bullet.

Actually, I just thought of a seriously cool idea. If the C stack
always grows up, or always grows down, then we can use the address of
the local variable holding the interpreter, as the generation number.
Thus, we can avoid changing the prototype of all our functions. Sortof
like how one write an alloca() library function.

> > There's another drawback I can think of... consider:
> >
> > foo(struct ParrotInterp *interpreter, int generation, ...)
> > {
> > PMC * temp = bar(interpreter, generation);
> > baz(interpreter, generation+1);
> > qux(interpreter, generation+1);
> > }
> >
> > If baz creates a temporary object and returns, then qux performs a
> > DOD, baz's (dead) object won't get cleaned up.
>
> When asume the current generation as free (less than instead of less
> equal) this is not problem. Only if qux calls another function with an
> increased generation count without a prior call to Parrot_do_DOD the
> temporaries of baz will survive. But this can be solved by:
>
> foo(struct ParrotInterp *interpreter, int generation, ...)
> {
> PMC * temp = bar(interpreter, generation);
> baz(interpreter, generation+1);
> Parrot_do_DOD(interpreter, generation+1);
> /* free the temporaries baz */
> qux(interpreter, generation+1);
> }
>
> > This could be solved by keeping a stack of newly created objects, and
> > providing some sort of generational_dod_helper() function, which would
> > do something like:
> > while( neonates && neonates->top->generation > current_generation ) {
> > neonates->top->generation = 0;

Again, that should be set to MAX_INT, not 0.

> > neonates = neonates->next;
> > }
> > , and calling that in foo between baz and qux. (And maybe sometimes
> > at toplevel, between opcodes... at times when the generation count in
> > a "normal" generation count scheme (with a global counter) would be
> > incremented) You lost a bit of simplicity, by having to call this
> > function occcasionally, but it can save a bit of memory.
>
> In principle your generational_dod_helper is the sweep-function of the
> garbage-collector (or in parrot its called free_unused_objects).

Except that generational_dod_helper is much simpler and faster -- it
doesn't mark anything as alive or free, it only adjusts the generation
of those pmcs that were created in C functions which we have since
returned from.

(Hmm, that code should be setting neonates->top->generation to MAX_INT,
not zero. I think.)

> If you are sure that no objects have been anchored since the last mark
> no further mark is necessary (Parrot_do_DOD). But I'm not sure if you
> can guarantee that.
>
> I have the feeling that extending the signature of all
> Parrot-functions will remove the need of walking the C-Stack
> entirely. If this will be the linked list of frames or the generation
> count is more or less a matter of taste: The generation count forces
> some intermediate DOD-runs, whereas the linked list of stack
> frames makes the creation of temporaries a little bit more
> complicated.

The generation count doesn't *force* intermediate DOD-runs... or at
least, not a *full* DOD, anyway.

Benjamin Goldberg

unread,
Aug 4, 2003, 3:05:17 PM8/4/03
to perl6-i...@perl.org

Having applied a bit more thought, having the generation field as part
of the PMC isn't all that great -- it makes PMCs larger, but it's really
only needed for new/neonate pmcs.

Instead of attatching the generation directly to the pmc, have a global
(per-interpreter) stack of neonate pmcs. Each stack entry contains both
the generation it was added, and the pointer to the pmc.

When a pmc is created, it does something like:
while( size(neonate_stack) && top(neonate_stack).gen > cur_gen )
pop(neonate_stack);
push(neonate_stack, { self, cur_gen });

During DOD, we do:

foreach(pmc in all_pmcs)
mark pmc as dead
while( size(neonate_stack) && top(neonate_stack).gen > cur_gen )
pop(neonate_stack)
foreach( pmc in root_set and on the neonate_stack ) {
if( pmc is alive ) next;


mark pmc as live
if( pmc is an aggregate )

add pmc to the list of pmcs to be traced


}
foreach( pmc in all_pmcs )

if( pmc is dead )
add pmc to free list;

Juergen Boemmels

unread,
Aug 5, 2003, 6:01:51 AM8/5/03
to Benjamin Goldberg, perl6-i...@perl.org
Benjamin Goldberg <ben.go...@hotpop.com> writes:

> Actually, I meant generation set to MAX_INT, not 0.
>
> Marking pmcs as free happens at the end of DOD. Marking pmcs as live or
> dead happens earlier. I was thinking of something like:
>
> foreach(pmc in all_pmcs) {

...
> }
> foreach(pmc in root set) {
...


> }
> trace the pmcs in the list of pmcs to be traced
> foreach(pmc in all_pmcs) {

...
> }

These are two complete sweeps over the allocated memory. But its
necessary.

> Actually, I just thought of a seriously cool idea. If the C stack
> always grows up, or always grows down, then we can use the address of
> the local variable holding the interpreter, as the generation number.
> Thus, we can avoid changing the prototype of all our functions. Sortof
> like how one write an alloca() library function.

Cool idea. This way the generation is the depth of the call chain, and
not the depth of calls with temporaries. A deeply nested DOD run will
not very likely free any temporaries. This increases the need for DOD
runs in the run-loop.

> Except that generational_dod_helper is much simpler and faster -- it
> doesn't mark anything as alive or free, it only adjusts the generation
> of those pmcs that were created in C functions which we have since
> returned from.

Its still one full sweep.

> The generation count doesn't *force* intermediate DOD-runs... or at
> least, not a *full* DOD, anyway.

But it needs intermediate sweeps.

Benjamin Goldberg

unread,
Aug 7, 2003, 11:43:03 PM8/7/03
to perl6-i...@perl.org
Juergen Boemmels wrote:
>
> Benjamin Goldberg <ben.go...@hotpop.com> writes:
[snip]

> > Except that generational_dod_helper is much simpler and faster -- it
> > doesn't mark anything as alive or free, it only adjusts the generation
> > of those pmcs that were created in C functions which we have since
> > returned from.
>
> Its still one full sweep.

Did you read my other message in this thread?

It could/should be much simpler: only a sweep of the stack of newly
allocated pmcs; and since the stack only grows at one end, it's *only*
those newly allocated pmcs whose depths are greater than our current
depth.

> > The generation count doesn't *force* intermediate DOD-runs... or at
> > least, not a *full* DOD, anyway.
>
> But it needs intermediate sweeps.

But with the generation-number kept seperate from the pmc in a stack,
these intermediate sweeps can be very fast.

0 new messages