
Infant mortality


Steve Fink

Dec 30, 2002, 7:00:55 PM
to perl6-i...@perl.org
Many of our tinderbox failures result from architecture-specific
shortcomings in our current root set scanning code. The whole
stackwalk/register scan is also rather slow, as Peter Gibbs showed a
few decades back. (Okay, so it was probably only about a year ago.) In
pondering the problem, I found myself reinventing several solutions
that have already been proposed, and in many cases implemented, by
other people. So I decided to summarize the various approaches in
hopes that something will jump out at someone. I would be grateful if
interested people could check this over and

(1) tell me what I have wrong or am overlooking (especially with
respect to longjmp() -- I have the feeling that it's a bigger
problem than I realize),

(2) point out what's wrong with my "variant 5: generational
stack",

and/or

(3) propose something else that solves the whole problem neatly.

infant.dev

Jerome Vouillon

Dec 31, 2002, 7:39:06 AM
to Steve Fink, perl6-i...@perl.org
On Mon, Dec 30, 2002 at 04:00:55PM -0800, Steve Fink wrote:
> =head1 Solution 3: Explicit root set augmentation
>
> A final possible solution is to provide a mechanism to temporarily
> anchor an otherwise unanchored object to the root set. (eg, have an
> array of objects associated with the interpreter that are all
> considered to be part of the root set.) This has pretty much the same
> advantages and disadvantages of explicit neonate flag setting:
>
> + Simple
> + Fast DOD
> - Slow for unanchored temporaries
> - Sometimes slow for anchored objects (depending on whether they need
> to be temporarily anchored before the final anchoring)

What do you mean by slow here?

> - Easy to forget to remove temporaries from the root set
> - Easy to double-anchor objects and forget to remove the temporary
> anchoring
> - longjmp() can bypass the unanchoring

The temporary objects could be stored in a stack, which is popped when
leaving the current function (both with normal exits and longjmp).
This should make it a lot less likely to forget the unanchoring.
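Sketched in C, the idea might look like this (a toy illustration, not Parrot's actual API; all names here are made up):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INFANTS 256

/* Hypothetical interpreter slice: a stack of temporarily anchored
 * "infant" objects that DOD treats as part of the root set. */
typedef struct {
    void  *infants[MAX_INFANTS];
    size_t infant_top;
} Interp;

/* Anchor a freshly created object so a DOD run will see it. */
static void anchor_infant(Interp *in, void *obj) {
    assert(in->infant_top < MAX_INFANTS);
    in->infants[in->infant_top++] = obj;
}

/* On function entry, remember the current stack top... */
static size_t infant_mark(const Interp *in) { return in->infant_top; }

/* ...and on any exit (normal return or longjmp recovery),
 * pop back to the remembered mark in one step. */
static void infant_release(Interp *in, size_t mark) {
    in->infant_top = mark;
}
```

Because the anchors are strictly LIFO, unanchoring is a single pointer reset rather than a search through the whole stack.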

-- Jerome

Leopold Toetsch

Dec 31, 2002, 6:59:29 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

> ... So I decided to summarize the various approaches in


> hopes that something will jump out at someone.


Great document, thanks for all the work of summarizing this.


> (2) point out what's wrong with my "variant 5: generational
> stack",


I think it just moves the problems around with a lot of overhead. E.g.
cloning a PerlArray of 10^6 entries would need 1000 generations (when
1000 PMCs are allocated in one bunch), which would need an ever-growing
generation stack (using memory too), leading to a probably non-linear
slowdown in DOD runs.
The ~1000 DOD runs won't give you any resources (except for the first),
so it would be cheaper to disable DOD altogether for e.g. clone - or
call do_dod_run explicitly and then disable it.


> (3) propose something else that solves the whole problem neatly.


I think, we will need a mixture of:

> In this case, the solution is simple: resize the array, if necessary,
> before creating the PMC.


i.e. reorder code where possible, and anchor early,
and ...


> =head2 Variant 3: clear during DOD
>
> The neonate flag is cleared during DOD when an object is encountered
> during the recursive root set traversal. (Leopold Toetsch's trick of
> setting the live_FLAG during creation is equivalent to this variation,
> I think.)
>
> + Simple
> + Fast DOD (DOD already manipulates the flags)
> - If there are multiple DOD runs before the object is anchored or
> dies, it will be prematurely freed


IMHO it should be possible to guarantee that there are no consecutive
DOD runs for certain operations. The major problem is cloning of
aggregates, which could take arbitrary resources (though the aggregates
normally know how big they are).
A second problem is: a DOD run can be triggered from two different code
paths: get_free_object->more_traceable_objects->DOD and (for the copying
allocator) by a shortage in the memory_pool from mem_allocate() [1].

By rearranging code and disabling DOD during clone (when needed), it
should be possible to avoid walking the processor stack/regs altogether.

[1] which BTW seems bogus to me: pool_compact() may be called (when
there is enough reclaimable space), without having a DOD run immediately
before it. So e.g. the on_free_list flag might not reflect current usage
of a buffer.

leo

Steve Fink

Dec 31, 2002, 9:32:42 PM
to Jerome Vouillon, perl6-i...@perl.org
On Dec-31, Jerome Vouillon wrote:
> On Mon, Dec 30, 2002 at 04:00:55PM -0800, Steve Fink wrote:
> > =head1 Solution 3: Explicit root set augmentation
> >
> > A final possible solution is to provide a mechanism to temporarily
> > anchor an otherwise unanchored object to the root set. (eg, have an
> > array of objects associated with the interpreter that are all
> > considered to be part of the root set.) This has pretty much the same
> > advantages and disadvantages of explicit neonate flag setting:
> >
> > + Simple
> > + Fast DOD
> > - Slow for unanchored temporaries
> > - Sometimes slow for anchored objects (depending on whether they need
> > to be temporarily anchored before the final anchoring)
>
> What do you mean by slow here?

For both of those, "slow" refers to the need to append/push infant
objects onto the root set. This will slow down the common case of no
DOD, because they'll need to be added to the root set and removed,
without their membership ever being looked at. Objects which end up
anchored may or may not have to pay that cost, because you may be able
to anchor them before there is any possibility of a DOD run.

As to whether that is truly slow or not -- well, I personally probably
wouldn't worry about it, but I think this very concern has caused this
proposal to be rejected in the past. It's certainly a good candidate
for better benchmarking. If all objects have space for a pointer in
them (eg the next_for_GC link), then it seems quick enough to push
them onto the beginning of a list during allocation. Removing them
from the list is harder, since you'd need to traverse the whole stack.
Although you might be able to eliminate that problem by using both
temporary anchoring and flags, so that you could mark them as "no
longer temporarily anchored" with a single bit flip and strip them all
out in one pass during DOD...

Oops, I think I'm overdoing it again. Your later comment seems correct
-- they'll always be manipulated LIFO anyway, so you can just keep a
top-of-stack pointer before doing anything with them.

> > - Easy to forget to remove temporaries from the root set
> > - Easy to double-anchor objects and forget to remove the temporary
> > anchoring
> > - longjmp() can bypass the unanchoring
>
> The temporary objects could be stored in a stack, which is popped when
> leaving the current function (both with normal exits and longjmp).
> This should make it a lot less likely to forget the unanchoring.

How do you do this with longjmp? I could see chaining another handler
onto the longjmp context so that longjmp would backtrack through all
of these allocations, but that would require allocating space for
another context. And allocating space further slows down the common
case...

longjmp is really the major blocker for this option, so if you can
work around it, then maybe we'll have something. Your stack-based
approach sounds plausible as a way to make it easier to handle the
bookkeeping, as long as you don't do it with macros. :-) Maybe wrapper
functions?

static inline retval some_op()...

retval wrap_some_op(interp, args...)
{
Pobj *stacktop = interp->infants;
retval rv = some_op(interp, args);
interp->infants = stacktop;
return rv;
}

Brent Dax

Jan 1, 2003, 1:59:01 AM
to Steve Fink, Jerome Vouillon, perl6-i...@perl.org
Steve Fink:
# > > - Easy to forget to remove temporaries from the root set
# > > - Easy to double-anchor objects and forget to remove the
# temporary
# > > anchoring
# > > - longjmp() can bypass the unanchoring
# >
# > The temporary objects could be stored in a stack, which is
# popped when
# > leaving the current function (both with normal exits and longjmp).
# > This should make it a lot less likely to forget the unanchoring.
#
# How do you do this with longjmp? I could see chaining another
# handler onto the longjmp context so that longjmp would
# backtrack through all of these allocations, but that would
# require allocating space for another context. And allocating
# space further slows down the common case...
#
# longjmp is really the major blocker for this option, so if
# you can work around it, then maybe we'll have something. Your
# stack-based approach sounds plausible as a way to make it
# easier to handle the bookkeeping, as long as you don't do it
# with macros. :-) Maybe wrapper functions?

Pseudocode:

void some_func(struct Parrot_Interp *interpreter) {
anchor(interpreter, a);
anchor(interpreter, b);

TRY(interpreter) {
something_that_might_throw_an_exception();
}
CATCH(interpreter) {

}

unanchor(interpreter, b);
unanchor(interpreter, a);
}

#define TRY(interp) if(setup_exception((interp)))
#define CATCH(interp) else

BOOLVAL setup_exception(struct Parrot_Interp *interpreter) {
Parrot_jmpbuf jb=NULL;
//setup jmpbuf...

if(setjmp(jmpbuf)) {
jb=stack_pop(interpreter, interpreter->jb_stack);
//Everything above us naturally falls off
interpreter->anchor_stack_top=jb->anchor_stack_top;
}
else {
Parrot_jmpbuf *jb=malloc(sizeof(Parrot_jmpbuf));

jb->real_jmpbuf=jmpbuf;
jb->anchor_stack_top=interpreter->anchor_stack_top;
...
stack_push(interpreter, interpreter->jb_stack, jb);
}
}

Yeah, I know, macros are evil, but they make this code *soooo* much
prettier...

--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

"If you want to propagate an outrageously evil idea, your conclusion
must be brazenly clear, but your proof unintelligible."
--Ayn Rand, explaining how today's philosophies came to be

Leopold Toetsch

Jan 1, 2003, 9:18:54 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

> Many of our tinderbox failures result from architecture-specific
> shortcomings in our current root set scanning code.


Here is a test result (i386/linux) *without* --gc-debug and without
trace_system_areas:

Failed Test Stat Wstat Total Fail Failed List of Failed
-------------------------------------------------------------------------------
t/op/string.t 1 256 99 1 1.01% 95
t/pmc/perlhash.t 2 512 24 2 8.33% 8-9
9 subtests skipped.
Failed 2/42 test scripts, 95.24% okay. 3/584 subtests failed, 99.49%
okay.

I know that things get worse with optimization and with more complex
programs. But overall it doesn't look that bad. All 3 test failures seem
to be caused by string_concat. If string_concat had a **dest_ptr like
e.g. string_repeat, we could anchor the created string early, which
should fix the problem.

When looking at PMCs, I<new> already anchors the newly created PMC.
The clone functions could be redone like:

    Parrot_do_dod_run();     // get free headers if possible
    Parrot_block_DOD/GC();
    do_clone
    Parrot_unblock_DOD/GC

This would also be faster, because there are no DOD runs during clone
(which wouldn't yield more free headers anyway).

I think that by either anchoring a newly created header early or by
blocking DOD/GC we could solve the infant mortality problem.
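As a toy illustration of that block/unblock discipline (all names hypothetical; a counter rather than a boolean flag, so blocked regions could nest):

```c
#include <assert.h>

/* Hypothetical interpreter slice for sketching DOD blocking. */
typedef struct {
    int dod_block_level;   /* > 0 means DOD is suppressed */
    int dod_runs;
} Interp;

static void block_DOD(Interp *in)   { in->dod_block_level++; }
static void unblock_DOD(Interp *in) {
    assert(in->dod_block_level > 0);
    in->dod_block_level--;
}

/* Called wherever allocation would normally trigger a DOD run. */
static void maybe_dod_run(Interp *in) {
    if (in->dod_block_level == 0)
        in->dod_runs++;    /* stand-in for a real root set trace */
}

/* Leopold's proposed shape: one run up front, none during the copy. */
static void clone_with_blocked_DOD(Interp *in, int allocations) {
    maybe_dod_run(in);           /* reclaim free headers first    */
    block_DOD(in);
    for (int i = 0; i < allocations; i++)
        maybe_dod_run(in);       /* allocations inside the clone  */
    unblock_DOD(in);
}
```

However many headers the clone allocates, exactly one DOD run happens, and it happens before any infant exists.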

leo


Steve Fink

Jan 1, 2003, 1:57:03 PM
to Brent Dax, Jerome Vouillon, perl6-i...@perl.org
On Dec-31, Brent Dax wrote:
> Steve Fink:

> # How do you do this with longjmp? I could see chaining another
> # handler onto the longjmp context so that longjmp would
> # backtrack through all of these allocations, but that would
> # require allocating space for another context. And allocating
> # space further slows down the common case...
> # .
> # .
> # .
>
> Pseudocode:
> .
> .
> .

> else {
> Parrot_jmpbuf *jb=malloc(sizeof(Parrot_jmpbuf));
>
> jb->real_jmpbuf=jmpbuf;
> jb->anchor_stack_top=interpreter->anchor_stack_top;
> ...
> stack_push(interpreter, interpreter->jb_stack, jb);
> }
> }

Your pseudocode illustrates exactly what I was talking about --
including a memory allocation. And it's a memory allocation that will
really only be used in the case when a DOD is triggered, which is
exactly when we're low on memory...

It is perhaps not a fatal flaw, if we can bound the number of chained
longjmp buffers. But it does still slow down the common case.

> Yeah, I know, macros are evil, but they make this code *soooo* much
> prettier...

Actually, it seems to me if in your pseudocode you just renamed
setup_exception() to try(), then you don't really need the macro:

if (try(interp)) {
...code...
} else {
Parrot_jmpbuf...
}

Oops. Except I think your code, with or without macros, has a bug --
setup_exception adds another stackframe, then returns, then enters
something_that_might_throw_an_exception() which tramples
setup_exception()'s frame. So if you longjmp back, the
setup_exception() frame (including eg the 'interpreter' param) is
trashed.

So I think you're right -- you'd need a macro, but the macro would be
the entire body of the setup_exception() function. I knew there was a
reason setjmp/longjmp have always scared me.
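A compilable toy of the macro version (nothing here is Parrot API; the anchor stack is reduced to a single counter): because TRY expands setjmp() directly into the protected function's own body, the saved context lives in a frame that is still intact when the longjmp fires, and the anchor-stack top can be restored safely.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical infant-anchor stack, reduced to its top-of-stack index. */
static size_t anchor_top;
static jmp_buf current_handler;

/* setjmp() is expanded into the caller's frame, not a helper's. */
#define TRY(jb) if (setjmp(jb) == 0)
#define CATCH   else

static void something_that_throws(void) {
    anchor_top += 3;               /* leak some temporary anchors...   */
    longjmp(current_handler, 1);   /* ...then bail past the unanchoring */
}

static int demo(void) {
    size_t saved = anchor_top;     /* remember before entering the region */
    TRY(current_handler) {
        something_that_throws();
        return 0;                  /* not reached */
    } CATCH {
        anchor_top = saved;        /* longjmp path restores the stack top */
        return 1;
    }
}
```

Note that `saved` is assigned before setjmp() and never modified afterwards, so its value is well-defined on the longjmp return path.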

Steve Fink

Jan 1, 2003, 2:00:09 PM
to Leopold Toetsch, perl6-i...@perl.org
On Jan-01, Leopold Toetsch wrote:
> Steve Fink wrote:
>
> >Many of our tinderbox failures result from architecture-specific
> >shortcomings in our current root set scanning code.
>
>
> Here is a test result (i386/linux) *without* --gc-debug and without
> trace_system_areas:
> ...
>
> I know, that things get worse with optimization and with more complex
> programs.

And with other architectures that make heavier use of registers --
i386 may be an unrealistic example, since not much can even fit into
the available register set.

> When looking at PMCs, I<new> does anchor the newly created PMC already.
> The clone functions could be redone like:
> Parrot_do_dod_run(); // get free headers if possible
> Parrot_block_DOD/GC();
> do_clone
> Parrot_unblock_DOD/GC
> This would also be faster, because there are no DOD runs during clone
> (which wouldn't yield more free headers anyway).

But wouldn't that force a DOD run on every clone? Normally, DOD should
be extremely rare compared to clones.

Leopold Toetsch

Jan 1, 2003, 3:36:58 PM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

> On Jan-01, Leopold Toetsch wrote:


>>I know, that things get worse with optimization and with more complex
>>programs.

> And with other architectures that make heavier use of registers --
> i386 may be an unrealistic example, since not much can even fit into
> the available register set.


Yep. The goal can only be to either have *all* code paths that might
trigger DOD checked, or to use one of the mentioned strategies. Though
reducing the possibilities of infant mortality by early anchoring is
IMHO always a good thing: it would e.g. help active pinning.


>>The clone functions could be redone like:
>> Parrot_do_dod_run(); // get free headers if possible
>> Parrot_block_DOD/GC();
>> do_clone
>> Parrot_unblock_DOD/GC
>>This would also be faster, because there are no DOD runs during clone
>>(which wouldn't yield more free headers anyway).

> But wouldn't that force a DOD run on every clone? Normally, DOD should
> be extremely rare compared to clones.


With the above code, yes. But classes can estimate how many resources
would be needed to e.g. clone a PerlArray of 100 elements. When the
needed header count is far beyond the allocation of HEADERS_PER_ALLOC,
the above strategy is a win. Cloning small aggregates or scalars could
be done without an explicit Parrot_do_dod_run when there are enough
headers on the free list.

So we would have additionally something like:

if (pmc->vtable->needed_headers_for_clone() > pmc_pool->replenish_level)
do_dod_run()

possibly checking other pools too, which is a lot cheaper than e.g.
100 unnecessary DOD runs while cloning a huge array.
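The header-count heuristic above could be sketched like this (all names and numbers hypothetical, not Parrot's actual pool structures):

```c
#include <assert.h>

/* Hypothetical header pool: how many free headers it holds, and
 * roughly how many one allocation/DOD cycle can be expected to yield. */
typedef struct {
    int free_headers;
    int replenish_level;
    int dod_runs;
} Pool;

static void do_dod_run(Pool *p) {
    p->dod_runs++;
    p->free_headers += p->replenish_level;  /* stand-in for a real sweep */
}

/* Decide up front whether a clone needing `needed` headers justifies
 * one DOD run before DOD is blocked for the copy itself. */
static void prepare_clone(Pool *p, int needed) {
    if (needed > p->replenish_level)
        do_dod_run(p);
}
```

Small clones skip the run entirely; only a clone whose appetite exceeds what one replenish cycle provides pays for a single up-front DOD.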

leo

Matt Fowles

Jan 1, 2003, 4:47:14 PM
to Leopold Toetsch, Steve Fink, perl6-i...@perl.org
All~

> >>The clone functions could be redone like:
> >> Parrot_do_dod_run(); // get free headers if possible
> >> Parrot_block_DOD/GC();
> >> do_clone
> >> Parrot_unblock_DOD/GC
> >>This would also be faster, because there are no DOD runs during clone
> >>(which wouldn't yield more free headers anyway).
>
> > But wouldn't that force a DOD run on every clone? Normally, DOD should
> > be extremely rare compared to clones.

You could instead have a function Parrot_do_at_most_N_DOD/GC(int n) that
would never do a DOD run if passed 0, do at most 1 if passed 1, 2 for 2,
etc., and do DOD runs whenever needed if passed -1.

Thus the code would be

Parrot_do_at_most_N_DOD/GC(1);
do_clone
Parrot_unblock_DOD/GC(-1);

This would allow it to do one DOD run if it realized that it needed it part
way through the clone, but would never do more. This could also allow more
flexible control of the GC if one wanted it.
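A minimal sketch of such a budgeted collector (hypothetical names; the slash in Parrot_do_at_most_N_DOD/GC is dropped since it isn't a legal C identifier):

```c
#include <assert.h>

/* Hypothetical interpreter slice carrying a DOD budget:
 * -1 = unlimited, 0 = none, N = at most N runs until reset. */
typedef struct {
    int dod_budget;
    int dod_runs;
} Interp;

static void Parrot_do_at_most_N_DOD(Interp *in, int n) {
    in->dod_budget = n;
}

/* Called at every would-be collection point. */
static int try_dod_run(Interp *in) {
    if (in->dod_budget == 0)
        return 0;                /* budget exhausted: skip the run */
    if (in->dod_budget > 0)
        in->dod_budget--;        /* spend one unit of the budget   */
    in->dod_runs++;              /* stand-in for a real DOD run    */
    return 1;
}
```

With a budget of 1 around a clone, the collector may fire once if headers run short partway through, but can never fire repeatedly.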

Matt
----
"Computer Science is merely the post-Turing decline of Formal Systems
Theory"
-???

Steve Fink

Jan 1, 2003, 10:11:10 PM
to Leopold Toetsch, perl6-i...@perl.org
On Dec-31, Leopold Toetsch wrote:
> Steve Fink wrote:
>
> >... So I decided to summarize the various approaches in
> >hopes that something will jump out at someone.
>
>
> Great document, thanks for all the work of summarizing this.
>
>
> > (2) point out what's wrong with my "variant 5: generational
> > stack",
>
>
> I think it just moves the problems around with a lot of overhead. E.g.
> cloning a PerlArray of 10^6 entries would need 1000 generations (when
> 1000 PMCs are allocated in one bunch), which would need an ever-growing
> generation stack (using memory too), leading to a probably non-linear
> slowdown in DOD runs.

I don't understand. The outer clone belongs to generation N. Then if
you called clone() individually on each PMC within the array, then by
default you wouldn't create any more generations at all. If you were
worried that you might accumulate too much garbage while doing all
those clones (which is certainly reasonable), then you could put each
sub-clone into its own generation, but the stack would still only get
2 deep (each sub-clone would be surrounded by a generation push/pop).
I don't understand how batching up the PMC allocations changes that
analysis. You could certainly subdivide the clone into as many
generations as you want, but they'd still only ever be 2 deep.

I was thinking that for the most part only complete opcodes would get
their own generation, and only recursive calls to the interpreter to
execute other chunks of PASM would normally push onto the generational
stack. So by default, PerlArray.clone would all fit into one
generation unless the PMCs within the PerlArray implement their clone
vtable entries with PASM code.
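That 2-deep behaviour can be illustrated with a toy generation stack (purely illustrative, not a proposed API):

```c
#include <assert.h>

/* Toy generational stack: each generation would collect the infants
 * created while it is on top; here we only track the stack depth. */
typedef struct {
    int depth;
    int max_depth;
} GenStack;

static void gen_push(GenStack *g) {
    g->depth++;
    if (g->depth > g->max_depth)
        g->max_depth = g->depth;
}

static void gen_pop(GenStack *g) {
    assert(g->depth > 0);
    g->depth--;
}

/* The outer clone owns one generation; each sub-clone pushes and pops
 * its own, so the stack never exceeds two entries no matter how many
 * elements the aggregate has. */
static void clone_aggregate(GenStack *g, int elements) {
    gen_push(g);                   /* generation N: the clone itself */
    for (int i = 0; i < elements; i++) {
        gen_push(g);               /* generation N+1: one sub-clone  */
        gen_pop(g);
    }
    gen_pop(g);
}
```

Even a 10^6-element clone only ever has two generations live at once, so the stack itself stays constant-sized.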

> I think, we will need a mixture of:
>
> >In this case, the solution is simple: resize the array, if necessary,
> >before creating the PMC.
>
>
> i.e. reorder code where possible, and anchor early,
> and ...

I'm starting to think you're right.

> >=head2 Variant 3: clear during DOD
> >
> >The neonate flag is cleared during DOD when an object is encountered
> >during the recursive root set traversal. (Leopold Toetsch's trick of
> >setting the live_FLAG during creation is equivalent to this variation,
> >I think.)
> >
> > + Simple
> > + Fast DOD (DOD already manipulates the flags)
> > - If there are multiple DOD runs before the object is anchored or
> > dies, it will be prematurely freed
>
> IMHO it should be possible to guarantee, that there are no consecutive
> DOD runs for certain operations. The major problem is cloning of
> aggregates which could take arbitrary resources (though the aggregates
> normally know how big they are).

I'm not convinced that's ever possible. What if an object overrides
its clone()? Then when an aggregate containing it is cloned, it could
do pretty much anything...

Wait. Dumb question: is clone a shallow or deep copy?

> By rearranging code and disabling DOD during clone (when needed), it
> should be possible to avoid walking the processor stack/regs altogether.

Yes, I'm hoping there's some solution where things that have full
knowledge of what might happen can make use of it and gain speed in
the common case, while there's still a safe mechanism for things that
are too difficult or impossible to predict in advance. So we can start
with the easy approach, and then optimize them into the more detailed
approach if needed. Or something. My hands are getting tired from all
the waving around.

Steve Fink

Jan 1, 2003, 10:12:01 PM
to Leopold Toetsch, perl6-i...@perl.org
On Jan-01, Leopold Toetsch wrote:
> Steve Fink wrote:
>
> >On Jan-01, Leopold Toetsch wrote:
>
>
> >>I know, that things get worse with optimization and with more complex
> >>programs.
>
> >And with other architectures that make heavier use of registers --
> >i386 may be an unrealistic example, since not much can even fit into
> >the available register set.
>
>
> Yep. The goal can only be, to either have *all* code paths that might
> trigger DOD checked, or use one of the mentioned strategies. Though
> reducing possibilities of infant mortatility by early anchoring is IMHO
> always a good thing: it would e.g. help active pinning.

Could you expand on that? What's "active pinning"?

Steve Fink

Jan 1, 2003, 10:46:42 PM
to perl6-i...@perl.org
Another (maybe silly) possibility suggested itself to me based on a
private mail re: infant mortality from Thomas Whateley: could we try
optimistic allocations?

Say we have a single memory backtracking jump buffer in the
interpreter. Then, before creating objects that cannot be easily
anchored to the root set, do a setjmp() and proceed ahead
optimistically under the assumption that no DOD runs will occur. You
must be careful here not to do anything with side effects that can't
be undone easily. Then, if DOD is triggered before you can anchor or
forget about the allocated objects, the interpreter does a longjmp()
back to your backtrack point:

somefunc() {
int dod_blocked = 0;
if (setjmp(interp->dod_backtrack_jmpbuf)) {
undo any partial state
do_DOD_run()
block DOD
dod_blocked = 1;
}

...code...

interp->dod_backtrack_jmpbuf = NULL;
if (dod_blocked) unblock DOD;
}

It adds a single setjmp() to the common case, which may be
prohibitively slow for all I know. Although you can use the same
optimistic approach without longjmp(); you'd just have to write your
code so that it undoes the partial state if any object allocation call
fails:

somefunc() {
interp->on_alloc_fail = RETURN_NULL;
.
.
.
if (alloc_object() == NULL) {
undo everything
do_DOD_run
interp->on_alloc_fail = CRASH_AND_BURN
start over
}
.
.
.
interp->on_alloc_fail = DO_DOD_RUN /* the default */
}

This way, the common case isn't slowed at all.

Both feel too complex to be workable, though. I just thought I'd throw
it out there to round out the set of possibilities.

Steve Fink

Jan 1, 2003, 11:09:05 PM
to perl6-i...@perl.org
To anyone out there who is thinking of a Grand Unified Infant
Mortality Solution, here's another thing that vastly complicates
things, and that we don't yet have a workable solution for: prompt
finalization.

my %x;
{ # start scope
my $timer1 = Timer->new();
my $timer2 = Timer->new();
$x{timer} = $timer2;
...
} # $timer1->DESTROY should be called
# $timer2->DESTROY should NOT be called

Right now, the only correct implementation of this that has been
proposed is to do a full DOD run at the end of every single scope. We
can do minor optimizations of that, by for example suppressing the DOD
if no object that implements a DESTROY method is live, but that's it.
At the moment, this threatens to make perl6 the *slowest* language
hosted by Parrot! (Since we haven't implemented any other languages
yet that need this.) It is possible that this problem will be fixed by
weakening the finalization guarantees in the language definition, but
it's definitely something to worry about.
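The "suppress the DOD when no DESTROYable object is live" optimization mentioned above amounts to a simple counter. A hedged sketch, not an implemented Parrot mechanism:

```c
#include <assert.h>

/* Hypothetical interpreter slice: count live objects that define
 * DESTROY, and only pay for a scope-exit DOD when that count is > 0. */
typedef struct {
    int destroy_needed;   /* live objects with a DESTROY method */
    int dod_runs;
} Interp;

static void note_destroyable_created(Interp *in) {
    in->destroy_needed++;
}

static void note_destroyable_collected(Interp *in) {
    assert(in->destroy_needed > 0);
    in->destroy_needed--;
}

/* Called at the end of every scope. */
static void scope_exit(Interp *in) {
    if (in->destroy_needed > 0)
        in->dod_runs++;   /* stand-in for the full DOD that prompt
                           * finalization would require */
}
```

Scopes that never touch a DESTROYable object exit for free; only scopes overlapping the lifetime of such an object pay for the run.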

The connection to the infant mortality problem is just that some
solutions might make this easier. Or harder. For example, the current
conservative stackwalk breaks even the painfully slow solution: there
might be a stale pointer somewhere on the C stack (or in a register,
or a saved register window) that keeps the DESTROYable object alive
unnaturally.

Leopold Toetsch

Jan 2, 2003, 5:25:54 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

> On Dec-31, Leopold Toetsch wrote:


>>I think, it moves the problems just around with a lot of overhead. E.g.
>>cloning a PerlArray of 10^6 entries would need 1000 generations

> I don't understand. The outer clone belongs to generation N. Then if


> you called clone() individually on each PMC within the array, then by
> default you wouldn't create any more generations at all.


Ah, ok. I thought that at each DOD run a new generation is started. But
you seem to manage the generation count manually.


> I'm not convinced that's ever possible. What if an object overrides
> its clone()?


The vtable->clone in core.ops would be surrounded by my proposed
do_dod_run/block_DOD, so the clone function could be anything.


> Wait. Dumb question: is clone a shallow or deep copy?


Deep and recursive. BTW recursive marking was declined because of stack
usage - what about clone?


>>By rearranging code and disabling DOD during clone (when needed), it
>>should be possible to avoid walking the processor stack/regs altogether.

> Yes, I'm hoping there's some solution where things that have full
> knowledge of what might happen can make use of it and gain speed in
> the common case, while there's still a safe mechanism for things that
> are too difficult or impossible to predict in advance. So we can start
> with the easy approach, and then optimize them into the more detailed
> approach if needed. Or something. My hands are getting tired from all
> the waving around.

I think its time to do some coding.
leo

Leopold Toetsch

Jan 2, 2003, 6:05:34 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

> To anyone out there who is thinking of a Grand Unified Infant
> Mortality Solution, here's another thing that vastly complicates
> things, and that we don't yet have a workable solution for yet: prompt
> finalization.
>
> my %x;
> { # start scope
> my $timer1 = Timer->new();
> my $timer2 = Timer->new();
> $x{timer} = $timer2;
> ...
> } # $timer1->DESTROY should be called
> # $timer2->DESTROY should NOT be called
>
> Right now, the only correct implementation of this that has been
> proposed is to do a full DOD run at the end of every single scope.


Active destroying (did I say: "We don't need it" ;-)

The question is: does the HL know that $timer1 needs to be destroyed,
while $timer2 is still referenced outside the current scope?

> ... We


> can do minor optimizations of that, by for example suppressing the DOD
> if no object that implements a DESTROY method is live, but that's it.


Or a more specific destroy flag, or a separate object list for objects
which are not allowed to live beyond the end of a scope?


> At the moment, this threatens to make perl6 the *slowest* language
> hosted by Parrot! (Since we haven't implemented any other languages
> yet that need this.)


Doing a full DOD run at the end of a scope is not an option.

> ... It is possible that this problem will be fixed by


> weakening the finalization guarantees in the language definition, but
> it's definitely something to worry about.


Weakening guarantees is probably not an option: file handles have the
same problem - they should AFAIK be closed, when going out of scope.


> The connection to the infant mortality problem is just that some
> solutions might make this easier. Or harder. For example, the current
> conservative stackwalk breaks even the painfully slow solution: there
> might be a stale pointer somewhere on the C stack (or in a register,
> or a saved register window) that keeps the DESTROYable object alive
> unnaturally.

... which seems to indicate that stack/register walking is unusable for
such objects, which must not be kept alive.

leo

Leopold Toetsch

Jan 2, 2003, 5:53:55 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

> Another (maybe silly) possibility suggested itself to me based on a
> private mail re: infant mortality from Thomas Whateley: could we try
> optimistic allocations?

> if (alloc_object() == NULL) {
> undo everything
> do_DOD_run
> interp->on_alloc_fail = CRASH_AND_BURN
> start over
> }

What if e.g. clone needs more headers than one DOD run can provide? With
CRASH_AND_BURN it would not succeed. With RETURN_NULL it would start
over and over again, until objects_per_alloc increases the needed header
count.

leo


Leopold Toetsch

Jan 2, 2003, 5:38:18 AM
to Matt Fowles, Steve Fink, perl6-i...@perl.org
Matt Fowles wrote:

> All~


>
> Parrot_do_at_most_N_DOD/GC(1);
> do_clone
> Parrot_unblock_DOD/GC(-1);
>
> This would allow it to do one DOD run if it realized that it needed it part
> way through the clone, but would never do more.


Very interesting approach. Yep. No need to check for probably needed
resources. But the problem here is that this works only for the approach
"setting the live (or neonate) FLAG for new objects - only one DOD run
allowed".

The problem with this approach is that you have far too many live
objects lying around. DOD runs are rare.

I did test it with bench_newp.pasm: loops per second decreased from ~770
to 7; the PMC count was 100-fold, so DOD runs took a long time.

This would only work in combination with resetting the live_FLAG for
already anchored objects, which shares the problem with other neonate
flag approaches: how and when to reset the flag.


> Matt

leo

Leopold Toetsch

Jan 2, 2003, 5:41:36 AM
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:


Proposed "Solution 3: Explicit root set augmentation"


leo


Jason Gloudon

Jan 2, 2003, 11:17:12 AM
to Steve Fink, perl6-i...@perl.org
On Wed, Jan 01, 2003 at 08:09:05PM -0800, Steve Fink wrote:

> To anyone out there who is thinking of a Grand Unified Infant
> Mortality Solution, here's another thing that vastly complicates
> things, and that we don't yet have a workable solution for yet: prompt
> finalization.

Timely finalization has no perfect solution even under the best
conditions (compiler support, etc.); this is why Java and .NET offer no
promises about timely finalization.

.
.

> The connection to the infant mortality problem is just that some
> solutions might make this easier. Or harder. For example, the current
> conservative stackwalk breaks even the painfully slow solution: there

As long as you're using a tracing collector, you cannot provide a general
guarantee of timely finalization without running a full sweep of the 'heap'.
Even if you do not have to walk the hardware stack that's too costly to be
practical.

The best partial solution to early finalization is compile-time tracing of
possible references by the compiler which can explicitly generate the
appropriate DESTROY calls.

--
Jason

Leopold Toetsch

Jan 2, 2003, 11:56:48 AM
to Jason Gloudon, Steve Fink, perl6-i...@perl.org
Jason Gloudon wrote:

> On Wed, Jan 01, 2003 at 08:09:05PM -0800, Steve Fink wrote:
>
>
>>To anyone out there who is thinking of a Grand Unified Infant
>>Mortality Solution, here's another thing that vastly complicates
>>things, and that we don't yet have a workable solution for yet: prompt
>>finalization.

> As long as you're using a tracing collector, you cannot provide a general
> guarantee of timely finalization without running a full sweep of the 'heap'.
> Even if you do not have to walk the hardware stack, that's too costly to be
> practical.


Yep. See google: "active destruction group:perl.perl6.internals", Subject
"DOD etc"


> The best partial solution to early finalization is compile-time tracing of
> possible references by the compiler which can explicitly generate the
> appropriate DESTROY calls.

... if the compiler can track all usage, which I doubt.

leo

Angel Faus

unread,
Jan 2, 2003, 4:24:13 PM1/2/03
to Jason Gloudon, perl6-i...@perl.org
>
> The best partial solution to early finalization is compile-time
> tracing of possible references by the compiler which can explicitly
> generate the appropriate DESTROY calls.
>

What about refcounting + real GC?

Refcounting will free most of the objects as soon as possible, and the
real GC system makes sure that even cyclical references are
eventually freed.

-angel
(who is not really paying attention at all)

Dan Sugalski

Jan 2, 2003, 2:04:52 PM1/2/03
to perl6-i...@perl.org
At 10:24 PM +0100 1/2/03, Angel Faus wrote:
> >
>> The best partial solution to early finalization is compile-time
>> tracing of possible references by the compiler which can explicitly
>> generate the appropriate DESTROY calls.
>>
>
>What about refcounting + real GC?

Yech, refcounting is one of the things we're trying to avoid. The
cost and error-proneness are its big downfall.

Anyway, to respond to the whole thread here, as I dig through my mail backlog:

1) We're not generally guaranteeing timely destruction
2) If something *wants* timely destruction, it can push a DOD run as
a scope exit action. If something needing scope exit cleanup actually
escapes its construction scope, odds are it'll live for some
indeterminate time anyway, so that's fine
3) If #2 isn't sufficient, we can consider having a "needs timely
cleanup" flag with corresponding interpreter counter, and on scope
exit if the counter is non-zero a DOD run can be scheduled, with the
interpreter handling the counter.

While I don't really like walking the system stack and probing the
registers on DOD runs, the cost seems generally lower and less
error-prone than explicit save/unsave actions that happen every time
a PMC or buffer is allocated, and I'd like to start working in some
sort of adaptive allocation/collection code so the DOD/GC doesn't
fire off every time there's temporary resource starvation.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Leopold Toetsch

Jan 2, 2003, 2:12:45 PM1/2/03
to af...@corp.vlex.com, Jason Gloudon, perl6-i...@perl.org
Angel Faus wrote:

>
> What about refcounting + real GC?


As soon as you start to refcount objects that need active destruction,
you end up refcounting all objects (a timer may be stored in an
aggregate, passed to subs, may have references, ...).

So this is not an option, short of changing everything to refcounting,

which is a PITA.


> -angel

leo


Nicholas Clark

Jan 2, 2003, 3:15:02 PM1/2/03
to Leopold Toetsch, af...@corp.vlex.com, Jason Gloudon, perl6-i...@perl.org
On Thu, Jan 02, 2003 at 08:12:45PM +0100, Leopold Toetsch wrote:
> Angel Faus wrote:
>
> >
> >What about refcounting + real GC?
>
>
> As soon as you start to refcount objects, which need active destroying,
> you end up refcounting all objects (a timer may be stored in an
> aggregate, passed to subs, may have references, ...).

Well, I thought that you only needed to refcount anything (and everything)
that currently contains an object that needs active destruction (where
references, subs and anything else that holds a reference counts as
"containing", because I can't think of a better word).

Effectively everything currently holding an object needing active
destruction becomes an object needing active destruction, (for the duration,
but no longer), and so on all the way back up to the root set.

Whether in practical terms this turns out to mean virtually everything
anyway, I don't know.
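That propagation can be sketched with hypothetical structures (a single `parent` pointer standing in for whatever "contains" an object; real containment is of course a graph, not a chain):

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: storing an object that needs active destruction into a
 * container taints the container, and the taint propagates to its
 * container, and so on up toward the root set. Illustrative only. */
typedef struct Obj {
    struct Obj *parent;        /* whatever "contains" this object */
    int needs_active_destroy;  /* timer-like object, or tainted container */
} Obj;

/* Walk up the containment chain, flagging everything not yet flagged. */
static void mark_needs_destroy(Obj *o) {
    for (; o != NULL && !o->needs_active_destroy; o = o->parent)
        o->needs_active_destroy = 1;
}
```

This is exactly where the "virtually everything" worry comes from: one timer stored deep in a structure taints the whole path above it.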

> So this is not an option, except changing everything to refcounting,
>
> which is a PITA.

Quite.

We're not guaranteeing perfection, though, are we? Just better than perl5?

[ie things wanting timely destruction get their destructor called at the
correct time, unless you've been silly enough to send them off into a
refcount loop. In which case they will get spotted at some point probably
before program exit, compared with perl5 which only spots them if your
program exits]

Nicholas Clark

Leopold Toetsch

Jan 2, 2003, 4:10:40 PM1/2/03
to Nicholas Clark, perl6-i...@perl.org
Nicholas Clark wrote:

> On Thu, Jan 02, 2003 at 08:12:45PM +0100, Leopold Toetsch wrote:


[ refcounting ]


> Effectively everything currently holding an object needing active
> destruction becomes an object needing active destruction, (for the duration,
> but no longer), and so on all the way back up to the root set.
>
> Whether in practical terms this turns out to mean virtually everything
> anyway, I don't know.


As you don't know where such an object may be stored, you have the
"virtually everything" case.


> We're not guaranteeing perfection, though, are we? Just better than perl5?


Dan did outline our strategy, which seems very reasonable to me.
If timers or such have to be destroyed at the end of scope, the HL is
responsible for telling Parrot about this, either by an explicit "sweep"
call, or - with more coding on Parrot's side - by some counting of "must
destroy objects" + a full DOD run.


> [ie things wanting timely destruction get their destructor called at the
> correct time, unless you've been silly enough to send them off into a
> refcount loop. In which case they will get spotted at some point probably
> before program exit, compared with perl5 which only spots them if your
> program exits]


Yep. A reference to an object is a reference the programmer of such code
has to be aware of. And - Parrot doesn't suffer from refcount loops.


> Nicholas Clark

leo


bk...@cornell.edu

Jan 2, 2003, 6:52:19 PM1/2/03
to Dan Sugalski, perl6-i...@perl.org
Here's a proposal w.r.t. adaptive GC (it's a variant of
the NEW/BABY/DONTGCME/whatever flag): all buffers are
created with this flag, but we only clear the flag on
DOD runs when we _know_ it's safe (e.g. between opcodes);
if we find we're getting low on resources, we queue a do-DOD-now
event to be run next time we check for pending events. The
plus is that there are no "what if we do DOD twice before
this object gets rooted" issues; the minus is that we
either have to fail allocations prematurely or give up
on guaranteed resource limits. On the other hand, if
someone _knows_ they're generating a lot of garbage in
a loop (in C, not Parrot), they are always free to do
a DOD run (if needed) when they know it's safe.
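A minimal sketch of that scheme, with made-up names (`FLAG_BABY`, `safe_point`) rather than Parrot's real flags and event queue:

```c
#include <assert.h>

/* Sketch of the proposal: buffers start flagged BABY; the flag is only
 * cleared during DOD runs taken at known-safe points (between opcodes),
 * and low-resource conditions queue a do-DOD-now event instead of
 * sweeping immediately. Hypothetical names throughout. */
enum { FLAG_BABY = 1 };

typedef struct { int flags; } Buffer;

typedef struct {
    int dod_event_pending;
    int low_on_resources;
} Interp;

static Buffer new_buffer(void) {
    Buffer b = { FLAG_BABY };   /* every new buffer starts protected */
    return b;
}

/* On allocation pressure: queue the event, run it later at a safe point. */
static void maybe_queue_dod(Interp *i) {
    if (i->low_on_resources)
        i->dod_event_pending = 1;
}

/* At a safe point (between opcodes): run the DOD and clear BABY flags,
 * since nothing can be mid-construction here. */
static void safe_point(Interp *i, Buffer *bufs, int n) {
    int k;
    if (!i->dod_event_pending)
        return;
    for (k = 0; k < n; k++)
        bufs[k].flags &= ~FLAG_BABY;   /* ... plus the actual DOD sweep */
    i->dod_event_pending = 0;
}
```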

-- BKS

Dan Sugalski

Jan 3, 2003, 1:56:06 AM1/3/03
to bk...@cornell.edu, perl6-i...@perl.org

This doesn't work well at all with recursive calls into an
interpreter, which is an issue. It also has some potentially large
overhead, as we need to do stuff between every opcode, when generally
we're going to be triggering runs when we're in the middle of an
opcode.

Jerome Vouillon

Jan 3, 2003, 2:45:28 PM1/3/03
to Steve Fink, Jerome Vouillon, perl6-i...@perl.org
On Tue, Dec 31, 2002 at 06:32:42PM -0800, Steve Fink wrote:
> On Dec-31, Jerome Vouillon wrote:
> > The temporary objects could be stored in a stack, which is popped when
> > leaving the current function (both with normal exits and longjmp).
> > This should make it a lot less likely to forget the unanchoring.
>
> How do you do this with longjmp? I could see chaining another handler
> onto the longjmp context so that longjmp would backtrack through all
> of these allocations, but that would require allocating space for
> another context. And allocating space further slows down the common
> case...

I'm thinking of something like this.

First, PMCs are not stored directly in C variables, but only in "PMC
references" (in the field "ref"). A stack of PMC references is
maintained, using the field "previous". The GC will consider all the
PMCs in the stack as part of the root.

typedef struct _PMC_ref {
    PMC * ref;
    struct _PMC_ref * previous;
} PMC_ref;

These PMC references need to be initialized: the PMC pointer is set to
a valid value and they are pushed onto the stack. (This function
could take a list of PMC references instead of just one.)

void init_pmc_ref (Parrot_Interp *interp, PMC_ref *ref) {
    ref->ref = NULL;
    ref->previous = interp->c_roots;
    interp->c_roots = ref;
}

A "scope" datastructure is used to save the current stack top, and
restore it later.

typedef struct {
    struct Parrot_Interp *interp;
    PMC_ref * saved_top;
} scope;

void enter_scope (Parrot_Interp *interp, scope * s) {
    s->interp = interp;
    s->saved_top = interp->c_roots;
}

void exit_scope (scope * s) {
    s->interp->c_roots = s->saved_top;
}

Then, the body of a C function would look something like this:

/* Create a new scope */
scope s;
enter_scope(interp, &s);
/* Initialize the PMC references */
PMC_ref a;
init_pmc_ref(interp, &a);
...
/* Exit the scope and return */
exit_scope(&s);
return;

I don't think there is any difficulty with exception handling (setjmp
returns 0 on the direct call, non-zero after a longjmp):

scope s;
jmp_buf buf;
enter_scope(interp, &s);
if (setjmp(buf) == 0) {
    ... /* Something that might throw an exception */
    exit_scope(&s);
} else {
    exit_scope(&s);
    ... /* Handle the exception */
}
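Put together, the scheme sketched above can be mocked up self-contained (with PMC left opaque and Parrot_Interp reduced to just its c_roots field, neither of which matches the real Parrot structures) to show the stack discipline:

```c
#include <assert.h>
#include <stddef.h>

/* Self-contained mock of the PMC-reference stack: enter_scope saves the
 * stack top, init_pmc_ref pushes, exit_scope pops everything the scope
 * pushed in one assignment. Mock types, not Parrot's real ones. */
typedef struct PMC PMC;  /* opaque here */

typedef struct _PMC_ref {
    PMC *ref;
    struct _PMC_ref *previous;
} PMC_ref;

typedef struct Parrot_Interp {
    PMC_ref *c_roots;   /* top of the C-roots stack */
} Parrot_Interp;

typedef struct {
    Parrot_Interp *interp;
    PMC_ref *saved_top;
} scope;

static void init_pmc_ref(Parrot_Interp *interp, PMC_ref *ref) {
    ref->ref = NULL;
    ref->previous = interp->c_roots;
    interp->c_roots = ref;
}

static void enter_scope(Parrot_Interp *interp, scope *s) {
    s->interp = interp;
    s->saved_top = interp->c_roots;
}

static void exit_scope(scope *s) {
    s->interp->c_roots = s->saved_top;
}
```

The GC would walk the chain from `interp->c_roots` through `previous` and treat every non-NULL `ref` field as a root.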

-- Jerome

Steve Fink

Jan 3, 2003, 10:29:48 PM1/3/03
to Leopold Toetsch, perl6-i...@perl.org
On Jan-02, Leopold Toetsch wrote:
> Steve Fink wrote:
>
> >On Dec-31, Leopold Toetsch wrote:
>
>
> >>I think, it moves the problems just around with a lot of overhead. E.g.
> >>cloning a PerlArray of 10^6 entries would need 1000 generations
>
> >I don't understand. The outer clone belongs to generation N. Then if
> >you called clone() individually on each PMC within the array, then by
> >default you wouldn't create any more generations at all.
>
Ah, ok. I thought that at each DOD run a new generation is started. But
you seem to manage the generation count manually.

I was assuming the generation would be automatically incremented per
op, either in the runops loop or (if that is too slow) in the
implementation of selected ops.
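As a sketch (illustrative names only, not the actual "variant 5" design), a per-op generation stamp might look like this: each header records the generation it was allocated in, and the DOD treats any header born in the current generation as implicitly live, whether or not a root reaches it yet.

```c
#include <assert.h>

/* Sketch of per-op generation stamping for infant protection. */
typedef struct {
    unsigned alloc_gen;   /* generation this header was allocated in */
} Header;

static unsigned current_gen = 0;

/* Bumped per op (in the runops loop, or only in selected ops). */
static void new_op(void) { current_gen++; }

static Header alloc_header(void) {
    Header h = { 0 };
    h.alloc_gen = current_gen;   /* stamp at allocation time */
    return h;
}

/* During DOD: headers from the current generation are considered live
 * even if unanchored, protecting mid-construction infants. */
static int is_infant(const Header *h) {
    return h->alloc_gen == current_gen;
}
```

The correctness hinge is exactly the point under discussion: where `new_op()` gets called, and what it costs there.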

Leopold Toetsch

Jan 4, 2003, 12:01:48 PM1/4/03
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:


> I was assuming the generation would be automatically incremented per
> op, either in the runops loop or (if that is too slow) in the
> implementation of selected ops.


In the runops loop it would be slow, and not very practical for e.g. JIT or
CGoto. But e.g. branch instructions would be a good point, though we
have a lot of these.

leo


Steve Fink

Jan 4, 2003, 8:15:55 PM1/4/03
to Leopold Toetsch, perl6-i...@perl.org
On Jan-02, Leopold Toetsch wrote:

I think it's reasonable to not guarantee 100% perfect memory usage, if
it makes the problem significantly easier. _How much_ to relax that
constraint is another question. I was thinking that it might be okay
to say that any single operation (even those that could conceivably
trigger many other operations) must fit all of its allocations within
the total amount of available memory when it starts. Or, in short, I'm
saying that maybe it's all right for it to crash even though it could
actually succeed if you ran a DOD immediately before every single
allocation.

A single opcode is probably too constraining, though. We'd probably
have to do better for operations that can trigger arbitrary numbers of
other operations. I'm not convinced enough of the utility of the
general approach, though, so I'm not going to try to figure out how to
make the granularity smaller.

Leopold Toetsch

Jan 4, 2003, 10:11:26 PM1/4/03
to Steve Fink, perl6-i...@perl.org
Steve Fink wrote:

It's late here, but I'll try an answer ;-)


> ... I was thinking that it might be okay
> to say that any single operation (even those that could conceivably
> trigger many other operations) must fit all of its allocations within
> the total amount of available memory when it starts.


We don't know that. There could be a clone of a PerlArray with size 10^6
elements. Or one element containing another element which is a
PerlArray nested (until (program (syntax (or (I (dont (know (ends));-)))))

> ... Or, in short, I'm
> saying that maybe it's all right for it to crash even though it could
> actually succeed if you ran a DOD immediately before every single
> allocation.


We can only increase allocated resources by some heuristic. It was by
4, now by 1.75, but you can never allocate for the future; you might
have more crashes - and a DOD run does not help when you just clone and
have no free headers.
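The growth heuristic being referred to, as a sketch with illustrative names (the 4 and 1.75 factors come from the discussion; the function is not Parrot's actual code):

```c
#include <assert.h>

/* Sketch of a fixed-factor pool growth heuristic: when the header pool
 * runs dry, grow it by a constant ratio (here 1.75, expressed as 7/4 to
 * stay in integer arithmetic). Illustrative only. */
#define GROWTH_NUM 7
#define GROWTH_DEN 4

static unsigned grown_pool_size(unsigned current) {
    unsigned next = current * GROWTH_NUM / GROWTH_DEN;
    return next > current ? next : current + 1;  /* always make progress */
}
```

The point stands either way: no fixed factor can anticipate a single op that suddenly needs 10^6 headers.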

You just get increasing crash numbers causing more DOD runs and bigger
allocations, until you reach the needed header limit.

Or for short, it doesn't work (all IMHO of course).

leo

Leopold Toetsch

Jan 9, 2003, 7:46:21 AM1/9/03
to Dan Sugalski, perl6-i...@perl.org
Leopold Toetsch wrote:

> Steve Fink wrote:
>
>> (3) propose something else that solves the whole problem neatly.


>
> i.e. reorder code where possible, and anchor early,

RFC: changing the clone op functionality slightly:

In a first step, clone does a pmc_new{,noinit} of a PMC of the desired type.
This is common to all classes. Then class-specific deep copying jumps in.
I would like to change the clone functionality slightly, to:

inline op clone(out PMC, in PMC) {
    $1 = pmc_new_noinit(interpreter, $2->vtable->type());
    $2->vtable->clone(interpreter, $2, $1);
    goto NEXT();
}

So, the newly created PMC would be anchored to the root set immediately.
Cloning e.g. a KEY, which currently would be b0rken, could then be done
w/o disabling DOD.

leo

Dan Sugalski

Jan 9, 2003, 2:14:13 PM1/9/03
to Leopold Toetsch, perl6-i...@perl.org

Works for me. If it preserves the semantics we want to express, and
it looks like it does, I'm just fine with it.
