(1) tell me what I have wrong or am overlooking (especially with
respect to longjmp() -- I have the feeling that it's a bigger
problem than I realize),
(2) point out what's wrong with my "variant 5: generational
stack",
and/or
(3) propose something else that solves the whole problem neatly.
What do you mean by slow here?
> - Easy to forget to remove temporaries from the root set
> - Easy to double-anchor objects and forget to remove the temporary
> anchoring
> - longjmp() can bypass the unanchoring
The temporary objects could be stored in a stack, which is popped when
leaving the current function (both with normal exits and longjmp).
This should make it a lot less likely to forget the unanchoring.
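A minimal sketch of that idea, assuming a per-interpreter temporaries
stack (the infants array and infants_top index are made-up names):

    /* after allocating a not-yet-anchored object: */
    void push_infant(struct Parrot_Interp *interpreter, PMC *pmc)
    {
        interpreter->infants[interpreter->infants_top++] = pmc;
    }

    /* on leaving the function, normally or while unwinding for a
       longjmp(), restore the top saved on entry: */
    void pop_infants_to(struct Parrot_Interp *interpreter, int saved_top)
    {
        interpreter->infants_top = saved_top;
    }

DOD would then treat infants[0 .. infants_top-1] as part of the root set.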
-- Jerome
> ... So I decided to summarize the various approaches in
> hopes that something will jump out at someone.
Great document, thanks for all the work of summarizing this.
> (2) point out what's wrong with my "variant 5: generational
> stack",
I think it just moves the problems around, with a lot of overhead. E.g.
cloning a PerlArray of 10^6 entries would need 1000 generations (when
1000 PMCs are allocated in one bunch), which would need an ever-growing
generation stack (using memory too), leading to a probably non-linear
slowdown in DOD runs.
The ~1000 DOD runs won't give you any resources (except for the first),
so it would be cheaper to disable DOD altogether for e.g. clone - or
call do_dod_run explicitly and then disable it.
> (3) propose something else that solves the whole problem neatly.
I think we will need a mixture of:
> In this case, the solution is simple: resize the array, if necessary,
> before creating the PMC.
i.e. reorder code where possible, and anchor early,
and ...
> =head2 Variant 3: clear during DOD
>
> The neonate flag is cleared during DOD when an object is encountered
> during the recursive root set traversal. (Leopold Toetsch's trick of
> setting the live_FLAG during creation is equivalent to this variation,
> I think.)
>
> + Simple
> + Fast DOD (DOD already manipulates the flags)
> - If there are multiple DOD runs before the object is anchored or
> dies, it will be prematurely freed
IMHO it should be possible to guarantee that there are no consecutive
DOD runs for certain operations. The major problem is cloning of
aggregates which could take arbitrary resources (though the aggregates
normally know how big they are).
A second problem is: a DOD run can be triggered from two different code
paths: get_free_object->more_traceable_objects->DOD and (for the copying
allocator) by a shortage in the memory_pool from mem_allocate() [1].
By rearranging code and disabling DOD during clone (when needed), it
should be possible to avoid walking the processor stack/regs altogether.
[1] which BTW seems bogus to me: pool_compact() may be called (when
there is enough reclaimable space) without having a DOD run immediately
before it. So e.g. the on_free_list flag might not reflect the current usage
of a buffer.
leo
For both of those, "slow" refers to the need to append/push infant
objects onto the root set. This will slow down the common case of no
DOD, because they'll need to be added to the root set and removed,
without their membership ever being looked at. Objects which end up
anchored may or may not have to pay that cost, because you may be able
to anchor them before there is any possibility of a DOD run.
As to whether that is truly slow or not -- well, I personally probably
wouldn't worry about it, but I think this very concern has caused this
proposal to be rejected in the past. It's certainly a good candidate
for better benchmarking. If all objects have space for a pointer in
them (e.g. the next_for_GC link), then it seems quick enough to push
them onto the beginning of a list during allocation. Removing them
from the list is harder, since you'd need to traverse the whole list.
Although you might be able to eliminate that problem by using both
temporary anchoring and flags, so that you could mark them as "no
longer temporarily anchored" with a single bit flip and strip them all
out in one pass during DOD...
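For illustration, that allocation-time push could be as cheap as this
(using the next_for_GC link mentioned above; the infant_list interpreter
field is made up):

    /* in the allocator, right after grabbing a free header: */
    pmc->next_for_GC = interpreter->infant_list;
    interpreter->infant_list = pmc;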
Oops, I think I'm overdoing it again. Your later comment seems correct
-- they'll always be manipulated LIFO anyway, so you can just keep a
top-of-stack pointer before doing anything with them.
> > - Easy to forget to remove temporaries from the root set
> > - Easy to double-anchor objects and forget to remove the temporary
> > anchoring
> > - longjmp() can bypass the unanchoring
>
> The temporary objects could be stored in a stack, which is popped when
> leaving the current function (both with normal exits and longjmp).
> This should make it a lot less likely to forget the unanchoring.
How do you do this with longjmp? I could see chaining another handler
onto the longjmp context so that longjmp would backtrack through all
of these allocations, but that would require allocating space for
another context. And allocating space further slows down the common
case...
longjmp is really the major blocker for this option, so if you can
work around it, then maybe we'll have something. Your stack-based
approach sounds plausible as a way to make it easier to handle the
bookkeeping, as long as you don't do it with macros. :-) Maybe wrapper
functions?
static inline retval some_op()...

retval wrap_some_op(interp, args...)
{
    Pobj *stacktop = interp->infants;
    retval rv = some_op(interp, args);
    interp->infants = stacktop;
    return rv;
}
Pseudocode:

void some_func(struct Parrot_Interp *interpreter) {
    anchor(interpreter, a);
    anchor(interpreter, b);

    TRY(interpreter) {
        something_that_might_throw_an_exception();
    }
    CATCH(interpreter) {
    }

    unanchor(interpreter, b);
    unanchor(interpreter, a);
}
#define TRY(interp) if(setup_exception((interp)))
#define CATCH(interp) else

BOOLVAL setup_exception(struct Parrot_Interp *interpreter) {
    Parrot_jmpbuf *jb;
    jmp_buf jmpbuf;

    if (setjmp(jmpbuf)) {
        /* longjmp() brought us back here */
        jb = stack_pop(interpreter, interpreter->jb_stack);
        /* Everything above us naturally falls off */
        interpreter->anchor_stack_top = jb->anchor_stack_top;
        return 0;   /* run the CATCH block */
    }
    else {
        jb = malloc(sizeof(Parrot_jmpbuf));
        memcpy(jb->real_jmpbuf, jmpbuf, sizeof(jmp_buf));
        jb->anchor_stack_top = interpreter->anchor_stack_top;
        ...
        stack_push(interpreter, interpreter->jb_stack, jb);
        return 1;   /* run the TRY block */
    }
}
Yeah, I know, macros are evil, but they make this code *soooo* much
prettier...
--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)
"If you want to propagate an outrageously evil idea, your conclusion
must be brazenly clear, but your proof unintelligible."
--Ayn Rand, explaining how today's philosophies came to be
> Many of our tinderbox failures result from architecture-specific
> shortcomings in our current root set scanning code.
Here is a test result (i386/linux) *without* --gc-debug and without
trace_system_areas:
Failed Test       Stat Wstat Total Fail  Failed  List of Failed
---------------------------------------------------------------
t/op/string.t        1   256    99    1   1.01%  95
t/pmc/perlhash.t     2   512    24    2   8.33%  8-9

9 subtests skipped.
Failed 2/42 test scripts, 95.24% okay. 3/584 subtests failed, 99.49% okay.
I know that things get worse with optimization and with more complex
programs. But overall it doesn't look that bad. All 3 test failures seem
to be caused by string_concat. If string_concat had a **dest_ptr
argument like e.g. string_repeat, we could anchor the created string
early, which should fix the problem.
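The change would look roughly like this (a sketch; the STRING **
destination argument on string_repeat is per the current source, the
string_concat form is the proposal):

    /* existing, for comparison: */
    STRING *string_repeat(struct Parrot_Interp *, const STRING *,
                          UINTVAL num, STRING **dest);

    /* proposed: the caller passes an already-rooted location, so the
       result is anchored the moment it exists */
    STRING *string_concat(struct Parrot_Interp *, STRING *a, STRING *b,
                          STRING **dest);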
When looking at PMCs, I<new> does anchor the newly created PMC already.
The clone functions could be redone like:
    Parrot_do_dod_run();      /* get free headers if possible */
    Parrot_block_DOD/GC();
    do_clone
    Parrot_unblock_DOD/GC
This would also be faster, because there are no DOD runs during clone
(which wouldn't yield more free headers anyway).
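In C, such a wrapper might look like this (a minimal sketch: the
block/unblock functions are the existing ones, the wrapper name is made
up, and I assume clone returns the copy):

    PMC *clone_with_dod_blocked(struct Parrot_Interp *interpreter, PMC *self)
    {
        PMC *dest;

        Parrot_do_dod_run(interpreter);  /* harvest free headers up front */
        Parrot_block_DOD(interpreter);   /* no DOD can fire mid-clone */
        Parrot_block_GC(interpreter);

        dest = self->vtable->clone(interpreter, self);

        Parrot_unblock_GC(interpreter);
        Parrot_unblock_DOD(interpreter);
        return dest;
    }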
I think that by either anchoring a newly created header early or by
blocking DOD/GC we could solve the infant mortality problem.
leo
Your pseudocode illustrates exactly what I was talking about --
including a memory allocation. And it's a memory allocation that will
really only be used in the case when a DOD is triggered, which is
exactly when we're low on memory...
It is perhaps not a fatal flaw, if we can bound the number of chained
longjmp buffers. But it does still slow down the common case.
> Yeah, I know, macros are evil, but they make this code *soooo* much
> prettier...
Actually, it seems to me if in your pseudocode you just renamed
setup_exception() to try(), then you don't really need the macro:
if (try(interp)) {
    ...code...
} else {
    Parrot_jmpbuf...
}
Oops. Except I think your code, with or without macros, has a bug --
setup_exception adds another stackframe, then returns, then enters
something_that_might_throw_an_exception() which tramples
setup_exception()'s frame. So if you longjmp back, the
setup_exception() frame (including eg the 'interpreter' param) is
trashed.
So I think you're right -- you'd need a macro, but the macro would be
the entire body of the setup_exception() function. I knew there was a
reason setjmp/longjmp have always scared me.
And with other architectures that make heavier use of registers --
i386 may be an unrealistic example, since not much can even fit into
the available register set.
> When looking at PMCs, I<new> does anchor the newly created PMC already.
> The clone functions could be redone like:
> Parrot_do_dod_run(); // get free headers if possible
> Parrot_block_DOD/GC();
> do_clone
> Parrot_unblock_DOD/GC
> This would also be faster, because there are no DOD runs during clone
> (which wouldn't yield more free headers anyway).
But wouldn't that force a DOD run on every clone? Normally, DOD should
be extremely rare compared to clones.
> On Jan-01, Leopold Toetsch wrote:
>>I know that things get worse with optimization and with more complex
>>programs.
> And with other architectures that make heavier use of registers --
> i386 may be an unrealistic example, since not much can even fit into
> the available register set.
Yep. The goal can only be to either have *all* code paths that might
trigger DOD checked, or to use one of the mentioned strategies. Though
reducing the possibilities of infant mortality by early anchoring is
IMHO always a good thing: it would e.g. help active pinning.
>>The clone functions could be redone like:
>> Parrot_do_dod_run(); // get free headers if possible
>> Parrot_block_DOD/GC();
>> do_clone
>> Parrot_unblock_DOD/GC
>>This would also be faster, because there are no DOD runs during clone
>>(which wouldn't yield more free headers anyway).
> But wouldn't that force a DOD run on every clone? Normally, DOD should
> be extremely rare compared to clones.
With the above code, yes. But classes can estimate how many resources
would be needed to e.g. clone a PerlArray of 100 elements. When the
needed header count is far beyond the allocation of HEADERS_PER_ALLOC,
the above strategy is a win. Cloning small aggregates or scalars could
be done without an explicit Parrot_do_dod_run when there are enough
headers on the free list.
So we would additionally have something like:

    if (pmc->vtable->needed_headers_for_clone() > pmc_pool->replenish_level)
        do_dod_run();

possibly checking other pools too, which is a lot cheaper than e.g.
100 unnecessary DOD runs during the cloning of a huge array.
leo
> >>The clone functions could be redone like:
> >> Parrot_do_dod_run(); // get free headers if possible
> >> Parrot_block_DOD/GC();
> >> do_clone
> >> Parrot_unblock_DOD/GC
> >>This would also be faster, because there are no DOD runs during clone
> >>(which wouldn't yield more free headers anyway).
>
> > But wouldn't that force a DOD run on every clone? Normally, DOD should
> > be extremely rare compared to clones.
You could instead have a function Parrot_do_at_most_N_DOD/GC(int n) that
would never do a DOD run if passed 0, do at most 1 if passed 1, 2 for 2,
etc., and do DOD runs whenever needed if passed -1.
Thus the code would be

    Parrot_do_at_most_N_DOD/GC(1);
    do_clone
    Parrot_do_at_most_N_DOD/GC(-1);
This would allow it to do one DOD run if it realized that it needed it part
way through the clone, but would never do more. This could also allow more
flexible control of the GC if one wanted it.
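A sketch of how such an allowance could work (the dod_runs_allowed
field and both function names here are hypothetical):

    void Parrot_do_at_most_N_DOD(struct Parrot_Interp *interpreter, int n)
    {
        interpreter->dod_runs_allowed = n;    /* -1 == unlimited */
    }

    /* consulted wherever a DOD run would otherwise be triggered: */
    int dod_may_run(struct Parrot_Interp *interpreter)
    {
        if (interpreter->dod_runs_allowed == -1)
            return 1;                         /* unrestricted */
        if (interpreter->dod_runs_allowed == 0)
            return 0;                         /* blocked */
        interpreter->dod_runs_allowed--;      /* use up one allowance */
        return 1;
    }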
Matt
----
"Computer Science is merely the post-Turing decline of Formal Systems
Theory"
-???
I don't understand. The outer clone belongs to generation N. Then if
you called clone() individually on each PMC within the array, then by
default you wouldn't create any more generations at all. If you were
worried that you might accumulate too much garbage while doing all
those clones (which is certainly reasonable), then you could put each
sub-clone into its own generation, but the stack would still only get
2 deep (each sub-clone would be surrounded by a generation push/pop).
I don't understand how batching up the PMC allocations changes that
analysis. You could certainly subdivide the clone into as many
generations as you want, but they'd still only ever be 2 deep.
I was thinking that for the most part only complete opcodes would get
their own generation, and only recursive calls to the interpreter to
execute other chunks of PASM would normally push onto the generational
stack. So by default, PerlArray.clone would all fit into one
generation unless the PMCs within the PerlArray implement their clone
vtable entries with PASM code.
> I think, we will need a mixture of:
>
> >In this case, the solution is simple: resize the array, if necessary,
> >before creating the PMC.
>
>
> i.e. reorder code where possible, and anchor early,
> and ...
I'm starting to think you're right.
> >=head2 Variant 3: clear during DOD
> >
> >The neonate flag is cleared during DOD when an object is encountered
> >during the recursive root set traversal. (Leopold Toetsch's trick of
> >setting the live_FLAG during creation is equivalent to this variation,
> >I think.)
> >
> > + Simple
> > + Fast DOD (DOD already manipulates the flags)
> > - If there are multiple DOD runs before the object is anchored or
> > dies, it will be prematurely freed
>
> IMHO it should be possible to guarantee that there are no consecutive
> DOD runs for certain operations. The major problem is cloning of
> aggregates which could take arbitrary resources (though the aggregates
> normally know how big they are).
I'm not convinced that's ever possible. What if an object overrides
its clone()? Then when an aggregate containing it is cloned, it could
do pretty much anything...
Wait. Dumb question: is clone a shallow or deep copy?
> By rearranging code and disabling DOD during clone (when needed), it
> should be possible to avoid walking the processor stack/regs altogether.
Yes, I'm hoping there's some solution where things that have full
knowledge of what might happen can make use of it and gain speed in
the common case, while there's still a safe mechanism for things that
are too difficult or impossible to predict in advance. So we can start
with the easy approach, and then optimize them into the more detailed
approach if needed. Or something. My hands are getting tired from all
the waving around.
Could you expand on that? What's "active pinning"?
Say we have a single memory backtracking jump buffer in the
interpreter. Then, before creating objects that cannot be easily
anchored to the root set, do a setjmp() and proceed ahead
optimistically under the assumption that no DOD runs will occur. You
must be careful here not to do anything with side effects that can't
be undone easily. Then, if DOD is triggered before you can anchor or
forget about the allocated objects, the interpreter does a longjmp()
back to your backtrack point:
somefunc() {
    int dod_blocked = 0;
    if (setjmp(interp->dod_backtrack_jmpbuf)) {
        undo any partial state
        do_DOD_run()
        block DOD
        dod_blocked = 1;
    }
    ...code...
    interp->dod_backtrack_jmpbuf = NULL;
    if (dod_blocked) unblock DOD;
}
It adds a single setjmp() to the common case, which may be
prohibitively slow for all I know. Although you can use the same
optimistic approach without longjmp(); you'd just have to write your
code so that it undoes the partial state if any object allocation call
fails:
somefunc() {
    interp->on_alloc_fail = RETURN_NULL;
    ...
    if (alloc_object() == NULL) {
        undo everything
        do_DOD_run
        interp->on_alloc_fail = CRASH_AND_BURN
        start over
    }
    ...
    interp->on_alloc_fail = DO_DOD_RUN /* the default */
}
This way, the common case isn't slowed at all.
Both feel too complex to be workable, though. I just thought I'd throw
it out there to round out the set of possibilities.
my %x;
{ # start scope
    my $timer1 = Timer->new();
    my $timer2 = Timer->new();
    $x{timer} = $timer2;
    ...
} # $timer1->DESTROY should be called
  # $timer2->DESTROY should NOT be called
Right now, the only correct implementation of this that has been
proposed is to do a full DOD run at the end of every single scope. We
can do minor optimizations of that, by for example suppressing the DOD
if no object that implements a DESTROY method is live, but that's it.
At the moment, this threatens to make perl6 the *slowest* language
hosted by Parrot! (Since we haven't implemented any other languages
yet that need this.) It is possible that this problem will be fixed by
weakening the finalization guarantees in the language definition, but
it's definitely something to worry about.
The connection to the infant mortality problem is just that some
solutions might make this easier. Or harder. For example, the current
conservative stackwalk breaks even the painfully slow solution: there
might be a stale pointer somewhere on the C stack (or in a register,
or a saved register window) that keeps the DESTROYable object alive
unnaturally.
> On Dec-31, Leopold Toetsch wrote:
>>I think it just moves the problems around, with a lot of overhead. E.g.
>>cloning a PerlArray of 10^6 entries would need 1000 generations
> I don't understand. The outer clone belongs to generation N. Then if
> you called clone() individually on each PMC within the array, then by
> default you wouldn't create any more generations at all.
Ah, OK. I thought that at each DOD run a new generation is started. But
you seem to manage the generation count manually.
> I'm not convinced that's ever possible. What if an object overrides
> its clone()?
The vtable->clone in core.ops would be surrounded by my proposed
do_dod_run/block_DOD, so the clone function could be anything.
> Wait. Dumb question: is clone a shallow or deep copy?
Deep and recursive. BTW, recursive marking was declined because of stack
usage - what about clone?
>>By rearranging code and disabling DOD during clone (when needed), it
>>should be possible to avoid walking the processor stack/regs altogether.
> Yes, I'm hoping there's some solution where things that have full
> knowledge of what might happen can make use of it and gain speed in
> the common case, while there's still a safe mechanism for things that
> are too difficult or impossible to predict in advance. So we can start
> with the easy approach, and then optimize them into the more detailed
> approach if needed. Or something. My hands are getting tired from all
> the waving around.
I think it's time to do some coding.
leo
> To anyone out there who is thinking of a Grand Unified Infant
> Mortality Solution, here's another thing that vastly complicates
> things, and that we don't yet have a workable solution for yet: prompt
> finalization.
>
> my %x;
> { # start scope
> my $timer1 = Timer->new();
> my $timer2 = Timer->new();
> $x{timer} = $timer2;
> ...
> } # $timer1->DESTROY should be called
> # $timer2->DESTROY should NOT be called
>
> Right now, the only correct implementation of this that has been
> proposed is to do a full DOD run at the end of every single scope.
Active destroying (did I say: "We don't need it" ;-)
The question is: does the HL know that $timer1 needs to be destroyed,
while $timer2 is still referenced outside the current scope?
> ... We
> can do minor optimizations of that, by for example suppressing the DOD
> if no object that implements a DESTROY method is live, but that's it.
Or a more specific destroy flag, or a separate object list for objects
which are not allowed to live beyond the end of a scope?
> At the moment, this threatens to make perl6 the *slowest* language
> hosted by Parrot! (Since we haven't implemented any other languages
> yet that need this.)
Doing a full DOD run at the end of a scope is not an option.
> ... It is possible that this problem will be fixed by
> weakening the finalization guarantees in the language definition, but
> it's definitely something to worry about.
Weakening guarantees is probably not an option: file handles have the
same problem - they should AFAIK be closed when going out of scope.
> The connection to the infant mortality problem is just that some
> solutions might make this easier. Or harder. For example, the current
> conservative stackwalk breaks even the painfully slow solution: there
> might be a stale pointer somewhere on the C stack (or in a register,
> or a saved register window) that keeps the DESTROYable object alive
> unnaturally.
... which seems to indicate that stack/register walking is unusable for
such objects, which are not allowed to stay live.
leo
> Another (maybe silly) possibility suggested itself to me based on a
> private mail re: infant mortality from Thomas Whateley: could we try
> optimistic allocations?
> if (alloc_object() == NULL) {
> undo everything
> do_DOD_run
> interp->on_alloc_fail = CRASH_AND_BURN
> start over
> }
What if e.g. clone needs more headers than one DOD run can provide? With
CRASH_AND_BURN it would not succeed. With RETURN_NULL it would start
over and over again, until objects_per_alloc grows to the needed header
count.
leo
> All~
>
> Parrot_do_at_most_N_DOD/GC(1);
> do_clone
> Parrot_do_at_most_N_DOD/GC(-1);
>
> This would allow it to do one DOD run if it realized that it needed it part
> way through the clone, but would never do more.
Very interesting approach. Yep. No need to check for probably-needed
resources. But the problem here is that this works only for the approach
"set the live (or neonate) FLAG for new objects - only one DOD run
allowed".
The problem with this approach is that you have far too many live
objects lying around. DOD runs are rare.
I did test it with bench_newp.pasm: loops per second decreased from ~770
to 7; the PMC count was 100-fold, so DOD runs took a long time.
This would work only in combination with resetting the live_FLAG for
already anchored objects, which shares the problem with other neonate
flag approaches: how and when to reset the flag.
> Matt
leo
Proposed "Solution 3: Explicity root set augmentation"
leo
> To anyone out there who is thinking of a Grand Unified Infant
> Mortality Solution, here's another thing that vastly complicates
> things, and that we don't yet have a workable solution for yet: prompt
> finalization.
Timely finalization has no perfect solution even under the best conditions
(compiler support, etc.), which is why Java and .NET offer no promises
about timely finalization.
.
.
> The connection to the infant mortality problem is just that some
> solutions might make this easier. Or harder. For example, the current
> conservative stackwalk breaks even the painfully slow solution: there
As long as you're using a tracing collector, you cannot provide a general
guarantee of timely finalization without running a full sweep of the 'heap'.
Even if you do not have to walk the hardware stack that's too costly to be
practical.
The best partial solution to early finalization is compile-time tracing of
possible references by the compiler which can explicitly generate the
appropriate DESTROY calls.
--
Jason
> On Wed, Jan 01, 2003 at 08:09:05PM -0800, Steve Fink wrote:
>
>
>>To anyone out there who is thinking of a Grand Unified Infant
>>Mortality Solution, here's another thing that vastly complicates
>>things, and that we don't yet have a workable solution for yet: prompt
>>finalization.
> As long as you're using a tracing collector, you cannot provide a general
> guarantee of timely finalization without running a full sweep of the 'heap'.
> Even if you do not have to walk the hardware stack that's too costly to be
> practical.
Yep. See google: "active destruction group:perl.perl6.internals", Subject
"DOD etc".
> The best partial solution to early finalization is compile-time tracing of
> possible references by the compiler which can explicitly generate the
> appropriate DESTROY calls.
... if the compiler can track all usage, which I doubt.
leo
What about refcounting + real GC?
refcounting will free most of the objects as soon as possible, and the
real GC system makes sure that even cyclical references are
eventually freed.
-angel
(who is not really paying attention at all)
Yech, refcounting is one of the things we're trying to avoid. The
cost and error-prone-ness is its big downfall.
Anyway, to respond to the whole thread here, as I dig through my mail backlog:
1) We're not generally guaranteeing timely destruction
2) If something *wants* timely destruction, it can push a DOD run as
a scope exit action. If something needing scope exit cleanup actually
escapes its construction scope, odds are it'll live for some
indeterminate time anyway, so that's fine
3) If #2 isn't sufficient, we can consider having a "needs timely
cleanup" flag with corresponding interpreter counter, and on scope
exit if the counter is non-zero a DOD run can be scheduled, with the
interpreter handling the counter.
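A sketch of what #3 could look like (the flag macro and counter field
are hypothetical names in the style of the existing PObj flag macros):

    /* at creation time, if the class has a DESTROY: */
    PObj_needs_timely_cleanup_SET(pmc);
    interpreter->timely_cleanup_count++;   /* decremented on destruction */

    /* emitted (or checked) at scope exit: */
    if (interpreter->timely_cleanup_count > 0)
        Parrot_do_dod_run(interpreter);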
While I don't really like walking the system stack and probing the
registers on DOD runs, the cost seems generally lower and less
error-prone than explicit save/unsave actions that happen every time
a PMC or buffer is allocated, and I'd like to start working in some
sort of adaptive allocation/collection code so the DOD/GC doesn't
fire off every time there's temporary resource starvation.
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
>
> What about refcounting + real GC?
As soon as you start to refcount objects which need active destroying,
you end up refcounting all objects (a timer may be stored in an
aggregate, passed to subs, may have references, ...).
So this is not an option, except changing everything to refcounting,
which is a PITA.
> -angel
leo
Well, I thought that you only needed to refcount anything (and everything)
that currently contains an object that needs active destroying (where
references, subs and anything else that holds a reference counts as
"containing", because I can't think of a better word)
Effectively everything currently holding an object needing active
destruction becomes an object needing active destruction (for the
duration, but no longer), and so on all the way back up to the root set.
Whether in practical terms this turns out to mean virtually everything
anyway, I don't know.
> So this is not an option, except changing everything to refcounting,
>
> which is a PITA.
Quite.
We're not guaranteeing perfection, though, are we? Just better than perl5?
[ie things wanting timely destruction get their destructor called at the
correct time, unless you've been silly enough to send them off into a
refcount loop. In which case they will get spotted at some point probably
before program exit, compared with perl5 which only spots them if your
program exits]
Nicholas Clark
> On Thu, Jan 02, 2003 at 08:12:45PM +0100, Leopold Toetsch wrote:
[ refcounting ]
> Effectively everything currently holding an object needing active
> destruction becomes an object needing active destruction, (for the duration,
> but no longer), and so on all the way back up to the root set.
>
> Whether in practical terms this turns out to mean virtually everything
> anyway, I don't know.
As you don't know where such an object may be stored, you have the
"virtually everything".
> We're not guaranteeing perfection, though, are we? Just better than perl5?
Dan did outline our strategy, which seems very reasonable to me.
If timers or such have to be destroyed at the end of a scope, the HL is
responsible for telling Parrot about this, either by an explicit "sweep"
call, or - with more coding on Parrot's side - by some counting of "must
destroy" objects + a full DOD run.
> [ie things wanting timely destruction get their destructor called at the
> correct time, unless you've been silly enough to send them off into a
> refcount loop. In which case they will get spotted at some point probably
> before program exit, compared with perl5 which only spots them if your
> program exits]
Yep. A reference to an object is a reference the programmer of such code
has to be aware of. And - Parrot doesn't suffer from refcount loops.
> Nicholas Clark
leo
This doesn't work well at all with recursive calls into an
interpreter, which is an issue. It also has some potentially large
overhead as we need to do stuff between every opcode, when generally
we're going to be triggering runs when we're in the middle of an
opcode.
I'm thinking of something like this.
First, PMCs are not stored directly in C variables, but only in "PMC
references" (in the field "ref"). A stack of PMC references is
maintained, using the field "previous". The GC will consider all the
PMCs in the stack as part of the root set.
typedef struct _PMC_ref {
    PMC             *ref;
    struct _PMC_ref *previous;
} PMC_ref;
These PMC references need to be initialized: the PMC pointer is set to
a valid value and they are pushed onto the stack. (This function
could take a list of PMC references instead of just one.)
void init_pmc_ref(struct Parrot_Interp *interp, PMC_ref *ref) {
    ref->ref = NULL;
    ref->previous = interp->c_roots;
    interp->c_roots = ref;
}
A "scope" datastructure is used to save the current stack top, and
restore it later.
typedef struct {
    struct Parrot_Interp *interp;
    PMC_ref              *saved_top;
} scope;

void enter_scope(struct Parrot_Interp *interp, scope *s) {
    s->interp = interp;
    s->saved_top = interp->c_roots;
}

void exit_scope(scope *s) {
    s->interp->c_roots = s->saved_top;
}
Then, the body of a C function would look something like this:

/* Create a new scope */
scope s;
enter_scope(interp, &s);

/* Initialize the PMC references */
PMC_ref a;
init_pmc_ref(interp, &a);
...

/* Exit the scope and return */
exit_scope(&s);
return;
I don't think there is any difficulty with exception handling:

scope s;
jmp_buf buf;
enter_scope(interp, &s);
if (setjmp(buf)) {
    exit_scope(&s);
    ... /* Handle the exception */
} else {
    ... /* Something that might throw an exception */
    exit_scope(&s);
}

(setjmp() returns 0 on the initial call, so the code that might throw
goes in the else branch.)
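During DOD, the chain would then be walked as part of the root set,
roughly like this (pobject_lives is the existing mark routine; the rest
follows the structures above):

    PMC_ref *r;
    for (r = interp->c_roots; r != NULL; r = r->previous)
        if (r->ref)
            pobject_lives(interp, (PObj *)r->ref);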
-- Jerome
I was assuming the generation would be automatically incremented per
op, either in the runops loop or (if that is too slow) in the
implementation of selected ops.
> I was assuming the generation would be automatically incremented per
> op, either in the runops loop or (if that is too slow) in the
> implementation of selected ops.
In the runops loop it would be slow, and not very practical for e.g. JIT
or CGoto. But e.g. branch instructions would be a good point, though we
have a lot of these.
leo
I think it's reasonable to not guarantee 100% perfect memory usage, if
it makes the problem significantly easier. _How much_ to relax that
constraint is another question. I was thinking that it might be okay
to say that any single operation (even those that could conceivably
trigger many other operations) must fit all of its allocations within
the total amount of available memory when it starts. Or, in short, I'm
saying that maybe it's all right for it to crash even though it could
actually succeed if you ran a DOD immediately before every single
allocation.
A single opcode is probably too constraining, though. We'd probably
have to do better for operations that can trigger arbitrary numbers of
other operations. I'm not convinced enough of the utility of the
general approach, though, so I'm not going to try to figure out how to
make the granularity smaller.
It's late here, but I'll try an answer ;-)
> ... I was thinking that it might be okay
> to say that any single operation (even those that could conceivably
> trigger many other operations) must fit all of its allocations within
> the total amount of available memory when it starts.
We don't know that. There could be a clone of a PerlArray with 10^6
elements. Or one element containing another element which is a PerlArray,
nested (until (program (syntax (or (I (don't (know (ends));-)))))
> ... Or, in short, I'm
> saying that maybe it's all right for it to crash even though it could
> actually succeed if you ran a DOD immediately before every single
> allocation.
We can only increase allocation resources by some heuristic. It was by
4, now by 1.75, but you can never allocate for the future; you might
have more crashes - and a DOD run does not help when you just clone and
have no free headers.
You just get increasing crash numbers causing more DOD runs and bigger
allocations, until you reach the needed header limit.
Or for short, it doesn't work (all IMHO of course).
leo
> Steve Fink wrote:
>
>> (3) propose something else that solves the whole problem neatly.
>
> i.e. reorder code where possible, and anchor early,
RFC: changing the clone op functionality slightly:
In the first step, clone does a pmc_new{,noinit} of a PMC of the desired
type. This is common to all classes. Then class-specific deep copying
jumps in. I would like to change the clone functionality slightly to be:
inline op clone(out PMC, in PMC) {
    $1 = pmc_new_noinit(interpreter, $2->vtable->type());
    $2->vtable->clone(interpreter, $2, $1);
    goto NEXT();
}
So the newly created PMC would be anchored to the root set immediately.
Cloning e.g. a KEY, which currently would be b0rken, could then be done
w/o disabling DOD.
leo
Works for me. If it preserves the semantics we want to express, and
it looks like it does, I'm just fine with it.