Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Timely Destruction: An efficient, complete solution

12 views
Skip to first unread message

Luke Palmer

unread,
Aug 17, 2003, 7:48:14 AM8/17/03
to perl6-i...@perl.org
Here comes that ever-reincarnating thread again, sorry.

This is a proposal for an efficient solution to the timely destruction
problem, which doesn't use refcounting, and fits in to the current
scheme pretty well.

It is based on the fact that 90% of the time (or more), all objects
needing timely destruction will be found live. It uses a priority
method to try to find these objects quickly, and ceases if it does.
This behavior is only carried out if the DOD was invoked by C<sweep 0>.

There is an external list of objects needing timely destruction, which
is not walked by DOD. Each object has a DOD_high_priority_FLAG. Each
time an impatient object is created, its high_priority_FLAG is set.

As everything is walked, if an object with this flag set is encountered,
whichever thing referenced it also gets the flag set[1].

The DOD queue has two segments: high_priority and low_priority.
high_priority is always drained and processed first. When this portion
of the queue is completely drained, the external list of impatient
objects is checked. If every object has been marked live, DOD
terminates (and GC is not allowed to run, because that would result in a
lot of wrongly murdered objects). If there are dead objects in that
list, DOD continues and does a full sweep.

When an impatient object is destructed, it might be good to reset all
high_priority_FLAGs, except for other impatient objects, so there is
nothing being walked at high priority that doesn't deserve it.

That's it. The first few times C<sweep 0> is run, it will take just as
long as C<sweep 1>, but the priorities will propogate down to things in
the root set, and it will begin to speed up. And the algorithmic
complexity never exceeds that of a regular sweep[2].

Luke

[1] This probably has to be done when the object is enqueued, not
dequeued. I don't know whether this impacts performance significantly.
See [2].

[2] Although a single cycle of the algorithm becomes more complex, so it
will slow things down a little when it's not doing any good. But at the
expense of a little templating, these checks could be eliminated when
the sweep wasn't triggered by C<sweep 0>.

Benjamin Goldberg

unread,
Aug 17, 2003, 8:52:37 PM8/17/03
to perl6-i...@perl.org
Luke Palmer wrote:
>
> Here comes that ever-reincarnating thread again, sorry.
>
> This is a proposal for an efficient solution to the timely destruction
> problem, which doesn't use refcounting, and fits in to the current
> scheme pretty well.
>
> It is based on the fact that 90% of the time (or more), all objects
> needing timely destruction will be found live. It uses a priority
> method to try to find these objects quickly, and ceases if it does.
> This behavior is only carried out if the DOD was invoked by C<sweep 0>.
>
> There is an external list of objects needing timely destruction, which
> is not walked by DOD. Each object has a DOD_high_priority_FLAG. Each
> time an impatient object is created, its high_priority_FLAG is set.
>
> As everything is walked, if an object with this flag set is encountered,
> whichever thing referenced it also gets the flag set[1].
>
> The DOD queue has two segments: high_priority and low_priority.
> high_priority is always drained and processed first.

All this, I understand so far.

> When this portion of the queue is completely drained, the external list
> of impatient objects is checked. If every object has been marked live,
> DOD terminates

This, I don't understand. Why do we stop DoD here? Shouldn't we try
and free up some of the non-impatient objects?

[insert sound of gears turning]

OIC... if this is only for "sweep 0", then we haven't really *asked* to
try and free up the impatient objects; we've only asked to free up those
objects needing timely destruction.

> (and GC is not allowed to run, because that would result
> in a lot of wrongly murdered objects).

I don't understand this, though. Why would GC murder something, if that
something has been marked as live? Or do you mean, "if we stoped DoD in
the middle, then obviously don't follow it with the GC that normally
follows a DOD." That makes sense, though I think that skipping GC after
an *incomplete* DOD is a sufficiently obvious thing that it needn't be
mentioned ;)

> If there are dead objects in that list,
> DOD continues and does a full sweep.

When you say, "DOD continues," I assume you mean that we now drain the
"low priority" queue. I assume that it's *this* part of the DOD, when,
if we now encounter something with a high_priority flag set, we set the
high_priority flag of the object which can reach it. (Certainly not
when we're draining the high priority queue: all the things doing the
reaching there already have the high_priority flag set on them).

> When an impatient object is destructed, it might be good to reset all
> high_priority_FLAGs, except for other impatient objects, so there is
> nothing being walked at high priority that doesn't deserve it.

Or, set a global (per-interpreter) flag indicating that an impatient
object has been destructed, and all high_priority_FLAGs need to be reset
at the beginning of the next DoD run.

> That's it. The first few times C<sweep 0> is run, it will take just as
> long as C<sweep 1>, but the priorities will propogate down to things in
> the root set, and it will begin to speed up. And the algorithmic
> complexity never exceeds that of a regular sweep[2].

I do see a potential problem: Suppose that a highly visible PMC (one
reachable through lots of objects) is able to reach a PMC needing timely
destruction, and then a sweep 0 is done, and afterwards that highly
visible PMC changes so it no longer can reach the object needing timely
destruction. It still has it's high_priority flag set, and this will
propogate into the many things which will reach it, eventually going
into the root set.

And the more things which (incorrectly) end up in the high priority
queue instead of the low priority queue, the closer it's speed will be
to a regular sweep.

I can think of a workaround, but it would cost more memory.
Specifically, make the "priority" an actual number, which decreases with
the distance away from an object needing timely destruction (e.g., one
less than the max of the priorities of the objects we can reach). If an
object needs timely destruction, it's priority is fixed at MAX_INT.
PMCs which used to be able to reach things needing timely destruction,
but no longer can, will thus no longer have high priorities.

An advantage of this is that if an object needing timely destruction is
destructed, then we no longer need to clear the priorities of all PMCs;
most of the priorities (the ones leading to other objects needing timely
destruction) are still good, and the priorities going to the destructed
PMc will eventually decrease.

An improvment that can be made, if numeric priorities are used, would be
to make the "high priority queue" into an actual priority queue.
Although the algorithmic complexity will be higher than that of a
regular sweep, the objects needing timely destruction will be reached
faster, since the search is directed straight at them.

================

For any scheme (yours scheme, my suggested modification, or anything
else), a potential improvment to the speed of "sweep 0" would be to keep
a counter of the number of objects which need timely destruction that
have been marked as alive. Once this number reaches the total number of
objects needing timely destruction, we can abort the DOD.

--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}

Dave Mitchell

unread,
Aug 19, 2003, 6:46:32 PM8/19/03
to Luke Palmer, perl6-i...@perl.org
On Sun, Aug 17, 2003 at 05:48:14AM -0600, Luke Palmer wrote:
> Here comes that ever-reincarnating thread again, sorry.
>
> This is a proposal for an efficient solution to the timely destruction
> problem, which doesn't use refcounting, and fits in to the current
> scheme pretty well.

I don't quite follow all the below (probably mainly due to my
infamiliarity with parrot's DOD/GC stuff).

Would it be possible to illuminate it by explaining how the following
Perl5 code (presumably being executed by Ponie/Parrot) would ensure that
the two destructors are called at the places marked:

{
my $ref;
{
my $obj1 = Foo->new(...);
my $obj2 = Foo->new(...);
$ref = $obj1;
} # $obj2->DESTROY called here
# ...
} # $obj1->DESTROY called here
# ...

--
print+qq&$}$"$/$s$,$*${$}$g$s$@$.$q$,$:$.$q$^$,$@$*$~$;$.$q$m&if+map{m,^\d{0\,},,${$::{$'}}=chr($"+=$&||1)}q&10m22,42}6:17*2~2.3@3;^2$g3q/s"&=~m*\d\*.*g

Benjamin Goldberg

unread,
Aug 20, 2003, 6:40:51 PM8/20/03
to perl6-i...@perl.org

Dave Mitchell wrote:
>
> On Sun, Aug 17, 2003 at 05:48:14AM -0600, Luke Palmer wrote:
> > Here comes that ever-reincarnating thread again, sorry.
> >
> > This is a proposal for an efficient solution to the timely destruction
> > problem, which doesn't use refcounting, and fits in to the current
> > scheme pretty well.
>
> I don't quite follow all the below (probably mainly due to my
> infamiliarity with parrot's DOD/GC stuff).
>
> Would it be possible to illuminate it by explaining how the following
> Perl5 code (presumably being executed by Ponie/Parrot) would ensure that
> the two destructors are called at the places marked:
>
> {
> my $ref;
> {
> my $obj1 = Foo->new(...);
> my $obj2 = Foo->new(...);
> $ref = $obj1;
> } # $obj2->DESTROY called here
> # ...
> } # $obj1->DESTROY called here
> # ...

At each of the two end-of-scope braces, a "sweep 0" instruction would be
emmited. This performs Dead Object Detection, which at the first "}"
detects that $obj2 is dead, and at the second "}" detects that $obj1 is
unreachable.

Dave Mitchell

unread,
Aug 20, 2003, 6:57:15 PM8/20/03
to Benjamin Goldberg, perl6-i...@perl.org
On Wed, Aug 20, 2003 at 06:40:51PM -0400, Benjamin Goldberg wrote:
> Dave Mitchell wrote:
> >
> > On Sun, Aug 17, 2003 at 05:48:14AM -0600, Luke Palmer wrote:
> > > Here comes that ever-reincarnating thread again, sorry.
> > >
> > > This is a proposal for an efficient solution to the timely destruction
> > > problem, which doesn't use refcounting, and fits in to the current
> > > scheme pretty well.
> >
> > I don't quite follow all the below (probably mainly due to my
> > infamiliarity with parrot's DOD/GC stuff).
> >
> > Would it be possible to illuminate it by explaining how the following
> > Perl5 code (presumably being executed by Ponie/Parrot) would ensure that
> > the two destructors are called at the places marked:
> >
> > {
> > my $ref;
> > {
> > my $obj1 = Foo->new(...);
> > my $obj2 = Foo->new(...);
> > $ref = $obj1;
> > } # $obj2->DESTROY called here
> > # ...
> > } # $obj1->DESTROY called here
> > # ...
>
> At each of the two end-of-scope braces, a "sweep 0" instruction would be
> emmited. This performs Dead Object Detection, which at the first "}"
> detects that $obj2 is dead, and at the second "}" detects that $obj1 is
> unreachable.

Yeah, I understood that bit. It was all the stuff about lists of
high-priority and low-priority objects etc that I got lost in. In
particular how does it detect that $obj2 should be kept alive at the end
of the inner block withoput doing a *full* DoD sweep? Since I presume
that's what the proposal was intended to avoid.

--
"There's something wrong with our bloody ships today, Chatfield."
Admiral Beatty at the Battle of Jutland, 31st May 1916.

Benjamin Goldberg

unread,
Aug 20, 2003, 8:57:11 PM8/20/03
to perl6-i...@perl.org

(For the sake of my fingers, please read any "ntd" as "needing timely
destruction", and "owntd" below as "object which needs timely
destruction.")

Well, *most* of the time, when we reach a "}", *nothing* has gone out of
scope. Or at least, no owntds have gone out of scope.

Thus, on *those* occasions (when nothing ntd has gone out of scope), we
want the "sweep 0" instruction to execute really really fast, not
slowing our interpreter down at all.

When we normally do dead object detection, when we discover an object
which is reachable, which we hadn't previously marked as reachable, we
would just add it to a simple queue -- a linked list. I forget if this
list is a fifo or stack, but it doesn't really matter... the point is,
the order we scan our objects depends soley on the order in which we see
them.

Luke's proposal is that we change to *two* queues -- one queue
containing objects which are believed to be able to reach owntds (this
is the high priority queue), and a second queue for objects which we
don't think are able to reach any owntds (this is the low priority
queue).

*Assuming* that no owntds have gone out of scope, then by scanning the
high priority queue first, we *should* (we hope) be able to (quickly)
mark all of the owntds as being alive (just by scanning the high
priority queue), and then quickly abort the DOD.

> In particular how does it detect that $obj2 should be kept alive at the
> end of the inner block withoput doing a *full* DoD sweep?

ITYM, s/kept alive/destructed/.

Alas, in those situations when an owntd goes out of scope, a full DoD
sweep will be needed.

After sweeping the high priority queue, and discovering there are some
owntds not yet marked as alive, we continue with the low priority queue,
and if they're *still* not all marked as alive, we continue with garbage
collection and firing off of destructors.

But this situation *isn't* the majority case.

> Since I presume that's what the proposal was intended to avoid.

It's intended to speed up the majority case, when no owntds have gone
out of scope, and thus no destructors need be called.

Steve Fink

unread,
Sep 5, 2003, 3:24:05 AM9/5/03
to Benjamin Goldberg, perl6-i...@perl.org
(I am way, way behind on reading this mailing list.)

Assuming I understand your scheme properly, it seems incorrect to me.

An object must be finalized during the first 'sweep 0' after it is no
longer reachable from the root set. (And no object may be finalized
while it is still reachable, but your scheme preserves that most
fundamental property of GC.)

So let's say I have objects A, B, and C, where A references B references
C. C requires timely finalization. After one DOD run, we have B and C in
the high priority set. Now A becomes unreachable and we do a 'sweep 0'.
We scan the high priority set and reach C, so we finish. But there are
no references to B, and hence no references to C, and yet we did not
finalize C. Oops.

Am I misunderstanding something?

Still, the basic approach seems promising. The key insight is that we
can terminate a 'sweep 0' as soon as all impatient objects have been
found, and we can use the knowledge gained from one DOD run to decrease
the expected time required to find all those impatient objects in the
next run.

Perhaps we keep (arbitrary) back pointers as we do DOD, and when we
reach an object requiring timely finalization, we store the entire chain
from the root set somewhere, as well as setting your high priority flag
on every object in the chain? Then, 'sweep 0' would do a full DOD,
except that whenever we reach a high priority object, we collect all of
the children together and check whether the next link in the stored
chain from that object is present. If so, it is visited first.
Otherwise, it's a normal full DOD, except as in your scheme we terminate
early once all impatient objects have been marked live.

Any chain roots in the root set should be visited before the rest of the
root set, of course. I am visualizing a DOD that works by always
collecting all of the children at one level before recursing into any of
them, and possibly choosing one or more of those children to visit
before the rest.

Whoops. Upon further reflection, I realize that this has a rather large
flaw -- you need to somehow jump over to a whole different search space
whenever you reach an impatient object (other than the last), because
otherwise the depth-first ordering will make you visit just about
everything else in the graph anyway.

Umm... explicitly try to walk each of the saved chains in turn? I'm
worried that the root nodes in the chains will quickly move a node or
two away from the root set between 'sweep 0' calls, but perhaps that
fear is unreasonable.

Try to save a "continuation" whenever you start chasing an impatient
rabbit down its hole, and jump to that saved search spot when you find
the impatient object? (But remember, you'd still have to come back to
the original search area if you don't catch enough rabbits.) That would
be a bit tricky, since many chains will share a common prefix, so you
have to figure out which continuation to choose that won't skip other
rabbit trails.

Wait... are we still constructing an explicit linked list of objects to
visit? Then perhaps instead of appending discovered children to the end
of the list (or the front -- are we doing BFS or DFS?), we use the "high
priority" flag -- objects with the flag set get inserted at the front of
the list; everything else gets inserted at the end. This naturally
results in all the stored chains being visited first (assuming they're
still valid), but still visiting every object only once. We don't even
need to explicitly store the chains anywhere, we just have to traverse
the back pointers from impatient objects and set the high priority flag
on everything encountered on the path back to the root set.

Ok, that seems simple enough that either there's some huge problem with
it, or it's a standard technique that everyone besides me already knows.
Which is it? :-)


Luke Palmer

unread,
Sep 5, 2003, 4:00:01 AM9/5/03
to Steve Fink, Benjamin Goldberg, perl6-i...@perl.org
Steve Fink writes:
> (I am way, way behind on reading this mailing list.)
>
> Assuming I understand your scheme properly, it seems incorrect to me.
>
> An object must be finalized during the first 'sweep 0' after it is no
> longer reachable from the root set. (And no object may be finalized
> while it is still reachable, but your scheme preserves that most
> fundamental property of GC.)
>
> So let's say I have objects A, B, and C, where A references B references
> C. C requires timely finalization. After one DOD run, we have B and C in
> the high priority set. Now A becomes unreachable and we do a 'sweep 0'.
> We scan the high priority set and reach C, so we finish. But there are
> no references to B, and hence no references to C, and yet we did not
> finalize C. Oops.
>
> Am I misunderstanding something?

Yes. See below. :-)

>
> [back pointers]
> [nevermind]
>
> [append high priority to front rather than back]


>
> Ok, that seems simple enough that either there's some huge problem with
> it, or it's a standard technique that everyone besides me already knows.
> Which is it? :-)

It's that it's exactly the same as my proposal, but with a simpler
implementation[1].

The thing you misunderstood was that the queue is emptied each DOD run.
So B couldn't get in the queue, because DOD would have to visit A first,
and A couldn't be reached.

The reason I proposed two seperate queues is so when the high priority
one was drained completely, you could check against the list of
impatient objects. But as Benjamin pointed out, you only need keep a
counter: how many impatient objects exist/how many have been found.
So there's no need to compilicate things with two seperate queues.

Also as Benjamin pointed out, we could switch to a full-blown priority
queue to improve performance in the long term. I don't much like that,
not because of the memory, but because heap operations are quite a bit
slower than queue operations for a little optimization like that. Maybe
the object could reset its high priority flag if it doesn't parent any
high priority objects....

Hmm.

In any case, seeing that depth first case (see the footnote) has given
me even more hope that I won't be agonizing over scope exit.

Luke


[1] Well, not I<exactly> the same. The high priority objects are
traversed depth-first rather than breadth-first, which may actually turn
out to be a win. Consider there only being one impatient object: after
a few DODs, the high priority flags mark a "path" down the tree, since
only the parent inherits the flag. If the search is depth-first, the
DOD will run directly down the path toward the object.

Leopold Toetsch

unread,
Sep 5, 2003, 5:23:20 AM9/5/03
to Luke Palmer, perl6-i...@perl.org
Luke Palmer <fibo...@babylonia.flatirons.org> wrote:

> In any case, seeing that depth first case (see the footnote) has given
> me even more hope that I won't be agonizing over scope exit.

Can you summarize your scheme again please WRT this and other
enhancements. I'm somewhat lost in all the improvements that were
proposed since your original.

> Luke

TIA
leo

Luke Palmer

unread,
Sep 5, 2003, 7:38:02 AM9/5/03
to Leopold Toetsch, perl6-i...@perl.org

Alright, here's a patch that implements it. The current scheme is
actually quite a bit simpler than the one I originally proposed. It
should provide much better performance, too.

Bear with me, because I don't know my way around the DOD very well.
First, do I increment num_early_DOD_PMCs in the right place? That's
where it was set to 1 before...

Also, the critical block that makes this an optimization is missing,
marked with an XXX. I didn't implement it because I don't really know
the control flow of DOD, so I didn't know how to abort cleanly. Can
someone give me some hints here?

Other than those few, but unignorable, caveats, Enjoy!


Index: include/parrot/interpreter.h
===================================================================
RCS file: /cvs/public/parrot/include/parrot/interpreter.h,v
retrieving revision 1.85
diff -u -r1.85 interpreter.h
--- include/parrot/interpreter.h 28 Aug 2003 15:26:24 -0000 1.85
+++ include/parrot/interpreter.h 5 Sep 2003 11:28:24 -0000
@@ -188,14 +188,19 @@

/* per interpreter global vars */
INTVAL world_inited; /* Parrot_init is done */
- PMC *mark_ptr; /* last PMC marked used in DOD runs */
PMC *iglobals; /* SArray of PMCs, containing: */
/* 0: PMC *Parrot_base_classname_hash; hash containing name->base_type */
/* 1: PMC *Parrot_compreg_hash; hash containing assembler/compilers */
/* 2: PMC *Argv; list of argv */
/* 3: PMC *Env; hash_like Env PMC */
/* 4: PMC *ParrotInterpreter that's me */
- int has_early_DOD_PMCs; /* Flag that some want immediate destruction */
+ UINTVAL num_early_DOD_PMCs; /* how many want immediate destruction */
+ UINTVAL DOD_early_PMCs_seen; /* how many that want immediate destruction
+ * has DOD seen */
+ PMC *dod_mark_ptr; /* last PMC marked used in DOD runs */
+ PMC *dod_trace_ptr; /* last PMC trace_children was called on */
+ int lazy_dod; /* flag indicating whether we should stop
+ when we find all impatient pmcs */
} Interp;

/* &gen_from_enum(iglobals.pasm) */
Index: include/parrot/pobj.h
===================================================================
RCS file: /cvs/public/parrot/include/parrot/pobj.h,v
retrieving revision 1.28
diff -u -r1.28 pobj.h
--- include/parrot/pobj.h 28 Aug 2003 13:44:25 -0000 1.28
+++ include/parrot/pobj.h 5 Sep 2003 11:28:24 -0000
@@ -198,7 +198,7 @@
PObj_active_destroy_FLAG = 1 << 22,
/* For debugging, report when this buffer gets moved around */
PObj_report_FLAG = 1 << 23,
-
+
/* PMC specific FLAGs */

/* Set to true if the PMC data pointer points to something that
@@ -218,12 +218,14 @@
*/
b_PObj_is_special_PMC_FLAG = 1 << 26,

- b_PObj_needs_early_DOD_FLAG = 1 << 27,
+ /* This PObj is connected by some route to a "needs_early_DOD" object */
+ PObj_high_priority_DOD_FLAG = 1 << 27,
+ b_PObj_needs_early_DOD_FLAG = (1 << 27 | 1 << 28),

/* True if the PMC is a class */
- PObj_is_class_FLAG = 1 << 28,
+ PObj_is_class_FLAG = 1 << 29,
/* True if the PMC is a parrot object */
- PObj_is_object_FLAG = 1 << 29
+ PObj_is_object_FLAG = 1 << 30

} PObj_flags;

@@ -340,6 +342,10 @@
#define PObj_report_TEST(o) PObj_flag_TEST(report, o)
#define PObj_report_SET(o) PObj_flag_SET(report, o)
#define PObj_report_CLEAR(o) PObj_flag_CLEAR(report, o)
+
+#define PObj_high_priority_DOD_TEST(o) PObj_flag_TEST(high_priority_DOD, o)
+#define PObj_high_priority_DOD_SET(o) PObj_flag_SET(high_priority_DOD, o)
+#define PObj_high_priority_DOD_CLEAR(o) PObj_flag_CLEAR(high_priority_DOD, o)

#define PObj_on_free_list_TEST(o) DOD_flag_TEST(on_free_list, o)
#define PObj_on_free_list_SET(o) DOD_flag_SET(on_free_list, o)
Index: include/parrot/dod.h
===================================================================
RCS file: /cvs/public/parrot/include/parrot/dod.h,v
retrieving revision 1.11
diff -u -r1.11 dod.h
--- include/parrot/dod.h 28 Jul 2003 13:38:00 -0000 1.11
+++ include/parrot/dod.h 5 Sep 2003 11:28:24 -0000
@@ -40,7 +40,12 @@
#define Parrot_is_blocked_GC(interpreter) \
((interpreter)->GC_block_level)

-void Parrot_do_dod_run(struct Parrot_Interp *, int trace_stack);
+enum {
+ DOD_trace_stack_FLAG = 1 << 0,
+ DOD_lazy_FLAG = 1 << 1
+};
+
+void Parrot_do_dod_run(struct Parrot_Interp *, UINTVAL flags);
void trace_system_areas(struct Parrot_Interp *);
void trace_mem_block(struct Parrot_Interp *, size_t, size_t);

Index: dod.c
===================================================================
RCS file: /cvs/public/parrot/dod.c,v
retrieving revision 1.70
diff -u -r1.70 dod.c
--- dod.c 28 Aug 2003 16:56:15 -0000 1.70
+++ dod.c 5 Sep 2003 11:28:24 -0000
@@ -43,19 +43,34 @@
size_t ns = n >> ARENA_FLAG_SHIFT;
UINTVAL nm = (n & ARENA_FLAG_MASK) << 2;
UINTVAL *dod_flags = arena->dod_flags + ns;
- if (*dod_flags & ((PObj_on_free_list_FLAG | PObj_live_FLAG) << nm))
+ if (*dod_flags & PObj_on_free_list_FLAG << nm)
+ return;
+ if (PObj_high_priority_DOD_TEST(obj) && interpreter->dod_trace_ptr)
+ /* set obj's parent to high priority */
+ PObj_high_priority_DOD_SET(interpreter->dod_trace_ptr);
+ if (*dod_flags & PObj_live_FLAG << nm)
return;
++arena->live_objects;
*dod_flags |= PObj_live_FLAG << nm;

if (*dod_flags & (PObj_is_special_PMC_FLAG << nm)) {
if (((PMC*)obj)->pmc_ext) {
- /* put it on the end of the list */
- interpreter->mark_ptr->next_for_GC = (PMC *)obj;
+ if (PObj_high_priority_DOD_TEST(obj)
+ && ++interpreter->DOD_early_PMCs_seen
+ && interpreter->dod_trace_ptr) {
+ /* put it at the head of the list */
+ ((PMC*)obj)->next_for_GC = interpreter->dod_trace_ptr
+ ->next_for_GC->next_for_GC;
+ interpreter->dod_trace_ptr->next_for_GC = (PMC*)obj;
+ }
+ else {
+ /* put it on the end of the list */
+ interpreter->dod_mark_ptr->next_for_GC = (PMC *)obj;

- /* Explicitly make the tail of the linked list be
- * self-referential */
- interpreter->mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ /* Explicitly make the tail of the linked list be
+ * self-referential */
+ interpreter->dod_mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ }
}
else if (PObj_custom_mark_TEST(obj))
VTABLE_mark(interpreter, (PMC *) obj);
@@ -84,6 +99,8 @@
}
# endif
#endif
+ if (PObj_high_priority_DOD_TEST(obj) && interpreter->dod_trace_ptr)
+ PObj_high_priority_DOD_SET(interpreter->dod_trace_ptr);
/* mark it live */
PObj_live_SET(obj);
/* if object is a PMC and contains buffers or PMCs, then attach
@@ -91,12 +108,22 @@
*/
if (PObj_is_special_PMC_TEST(obj)) {
if (((PMC*)obj)->pmc_ext) {
- /* put it on the end of the list */
- interpreter->mark_ptr->next_for_GC = (PMC *)obj;
+ if (PObj_high_priority_DOD_TEST(obj)
+ && ++interpreter->DOD_early_PMCs_seen
+ && interpreter->dod_trace_ptr) {
+ /* put it at the head of the list */
+ ((PMC*)obj)->next_for_GC = interpreter->dod_trace_ptr
+ ->next_for_GC->next_for_GC;
+ interpreter->dod_trace_ptr->next_for_GC = (PMC*)obj;
+ }
+ else {
+ /* put it on the end of the list */
+ interpreter->dod_mark_ptr->next_for_GC = (PMC *)obj;

- /* Explicitly make the tail of the linked list be
- * self-referential */
- interpreter->mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ /* Explicitly make the tail of the linked list be
+ * self-referential */
+ interpreter->dod_mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ }
}
else if (PObj_custom_mark_TEST(obj))
VTABLE_mark(interpreter, (PMC *) obj);
@@ -134,7 +161,7 @@
struct Stash *stash = 0;

/* We have to start somewhere, the interpreter globals is a good place */
- interpreter->mark_ptr = current = interpreter->iglobals;
+ interpreter->dod_mark_ptr = current = interpreter->iglobals;

/* mark it as used */
pobject_lives(interpreter, (PObj *)interpreter->iglobals);
@@ -204,7 +231,13 @@
unsigned i = 0;
UINTVAL mask = PObj_is_PMC_ptr_FLAG | PObj_is_buffer_ptr_FLAG
| PObj_custom_mark_FLAG;
-
+
+ if (interpreter->DOD_early_PMCs_seen >= interpreter->num_early_DOD_PMCs)
+ ; /* XXX (lp): abort DOD and set a flag saying "no GC allowed" */
+
+ PObj_high_priority_DOD_CLEAR(current);
+ interpreter->dod_trace_ptr = current;
+
for (; current != prev; current = current->next_for_GC) {
UINTVAL bits = PObj_get_FLAGS(current) & mask;

@@ -454,7 +487,7 @@
#endif

/* We have no impatient things. Yet. */
- interpreter->has_early_DOD_PMCs = 0;
+ interpreter->num_early_DOD_PMCs = 0;

/* Run through all the buffer header pools and mark */
for (cur_arena = pool->last_Arena;
@@ -499,11 +532,11 @@
total_used++;
#if ARENA_DOD_FLAGS
if ((*dod_flags & (PObj_needs_early_DOD_FLAG << nm)))
- interpreter->has_early_DOD_PMCs = 1;
+ ++interpreter->num_early_DOD_PMCs;
#else
PObj_live_CLEAR(b);
if (PObj_needs_early_DOD_TEST(b))
- interpreter->has_early_DOD_PMCs = 1;
+ ++interpreter->num_early_DOD_PMCs;
#endif
}
else {
@@ -699,7 +732,7 @@

/* See if we can find some unused headers */
void
-Parrot_do_dod_run(struct Parrot_Interp *interpreter, int trace_stack)
+Parrot_do_dod_run(struct Parrot_Interp *interpreter, UINTVAL flags)
{
struct Small_Object_Pool *header_pool;
int j;
@@ -711,6 +744,8 @@
}
Parrot_block_DOD(interpreter);

+ interpreter->lazy_dod = flags & DOD_lazy_FLAG;
+ interpreter->dod_trace_ptr = NULL;
#if ARENA_DOD_FLAGS
clear_live_counter(interpreter, interpreter->arena_base->pmc_pool);
for (j = 0; j < (INTVAL)interpreter->arena_base->num_sized; j++) {
@@ -720,7 +755,7 @@
}
#endif
/* Now go trace the PMCs */
- trace_active_PMCs(interpreter, trace_stack);
+ trace_active_PMCs(interpreter, flags & DOD_trace_stack_FLAG);

/* And the buffers */
trace_active_buffers(interpreter);
@@ -730,7 +765,7 @@
* marking everything, if something was missed
* not - these could also be stale objects
*/
- if (trace_stack) {
+ if (flags & DOD_trace_stack_FLAG) {
# if ! DISABLE_GC_DEBUG
CONSERVATIVE_POINTER_CHASING = 1;
# endif
@@ -763,6 +798,7 @@
}
/* Note it */
interpreter->dod_runs++;
+ interpreter->dod_trace_ptr = NULL;
Parrot_unblock_DOD(interpreter);
return;
}
Index: interpreter.c
===================================================================
RCS file: /cvs/public/parrot/interpreter.c,v
retrieving revision 1.199
diff -u -r1.199 interpreter.c
--- interpreter.c 28 Aug 2003 15:26:21 -0000 1.199
+++ interpreter.c 5 Sep 2003 11:28:24 -0000
@@ -865,7 +865,7 @@
*/
Parrot_IOData_mark(interpreter, interpreter->piodata);

- if (interpreter->has_early_DOD_PMCs)
+ if (interpreter->num_early_DOD_PMCs)
free_unused_pobjects(interpreter, interpreter->arena_base->pmc_pool);

/* we destroy all child interpreters and the last one too,
Index: resources.c
===================================================================
RCS file: /cvs/public/parrot/resources.c,v
retrieving revision 1.108
diff -u -r1.108 resources.c
--- resources.c 28 Jul 2003 13:37:55 -0000 1.108
+++ resources.c 5 Sep 2003 11:28:24 -0000
@@ -106,13 +106,13 @@
interpreter->mem_allocs_since_last_collect++;
}
if (0 && GC_DEBUG(interpreter)) {
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);
if (pool->compact) {
(*pool->compact) (interpreter, pool);
}
}
if (pool->top_block->free < size) {
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);
/* Compact the pool if allowed and worthwhile */
if (pool->compact) {
/* don't bother reclaiming if it's just chicken feed */
Index: smallobject.c
===================================================================
RCS file: /cvs/public/parrot/smallobject.c,v
retrieving revision 1.26
diff -u -r1.26 smallobject.c
--- smallobject.c 26 Aug 2003 10:05:44 -0000 1.26
+++ smallobject.c 5 Sep 2003 11:28:24 -0000
@@ -68,7 +68,7 @@
if (pool->skip)
pool->skip = 0;
else {
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);
if (pool->num_free_objects <= pool->replenish_level)
pool->skip = 1;
}
Index: string.c
===================================================================
RCS file: /cvs/public/parrot/string.c,v
retrieving revision 1.143
diff -u -r1.143 string.c
--- string.c 27 Aug 2003 16:27:28 -0000 1.143
+++ string.c 5 Sep 2003 11:28:24 -0000
@@ -943,7 +943,7 @@

/* It's easy to forget that string comparison can trigger GC */
if (interpreter && GC_DEBUG(interpreter))
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);

if (s1->type != s2->type || s1->encoding != s2->encoding) {
s1 = string_transcode(interpreter, s1, NULL, string_unicode_type,
@@ -1018,7 +1018,7 @@

/* trigger GC for debug */
if (interpreter && GC_DEBUG(interpreter))
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);

if (s1->type != s2->type || s1->encoding != s2->encoding) {
s1 = string_transcode(interpreter, s1, NULL, string_unicode_type,
@@ -1080,7 +1080,7 @@

/* trigger GC for debug */
if (interpreter && GC_DEBUG(interpreter))
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);

if (s1 && s2) {
if (s1->type != s2->type || s1->encoding != s2->encoding) {
@@ -1162,7 +1162,7 @@

/* trigger GC for debug */
if (interpreter && GC_DEBUG(interpreter))
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);

if (s1 && s2) {
if (s1->type != s2->type || s1->encoding != s2->encoding) {
Index: classes/timer.pmc
===================================================================
RCS file: /cvs/public/parrot/classes/timer.pmc,v
retrieving revision 1.7
diff -u -r1.7 timer.pmc
--- classes/timer.pmc 25 Aug 2003 09:46:23 -0000 1.7
+++ classes/timer.pmc 5 Sep 2003 11:28:24 -0000
@@ -253,7 +253,7 @@
t->self = SELF;
SELF->cache.struct_val = t;
PObj_active_destroy_SET(SELF);
- interpreter->has_early_DOD_PMCs = 1;
+ interpreter->num_early_DOD_PMCs++;
}

void init_pmc(PMC *init) {
Index: io/io.c
===================================================================
RCS file: /cvs/public/parrot/io/io.c,v
retrieving revision 1.53
diff -u -r1.53 io.c
--- io/io.c 2 Sep 2003 16:41:14 -0000 1.53
+++ io/io.c 5 Sep 2003 11:28:24 -0000
@@ -748,12 +748,12 @@
INTVAL i;
ParrotIOTable table = piodata->table;

- /* XXX boe: Parrot_really_destroy might call us with mark_ptr not
+ /* XXX boe: Parrot_really_destroy might call us with dod_mark_ptr not
* set. This is neccessary until destruction ordering prevents
* the premature destruction of the standardhandles
*/
- if (!interpreter->mark_ptr)
- interpreter->mark_ptr = table[0];
+ if (!interpreter->dod_mark_ptr)
+ interpreter->dod_mark_ptr = table[0];

for (i = 0; i < PIO_NR_OPEN; i++) {
if (table[i]) {
Index: core.ops
===================================================================
RCS file: /cvs/public/parrot/core.ops,v
retrieving revision 1.324
diff -u -r1.324 core.ops
--- core.ops 29 Aug 2003 11:30:17 -0000 1.324
+++ core.ops 5 Sep 2003 11:28:24 -0000
@@ -837,8 +837,11 @@
=cut

op sweep(inconst INT) {
- if ($1 || interpreter->has_early_DOD_PMCs)
+ if ($1)
Parrot_do_dod_run(interpreter, 0);
+ else
+ if (interpreter->num_early_DOD_PMCs)
+ Parrot_do_dod_run(interpreter, DOD_lazy_FLAG);
goto NEXT();
}

@@ -906,7 +909,7 @@

op needs_destroy(in PMC) {
PObj_needs_early_DOD_SET($1);
- interpreter->has_early_DOD_PMCs = 1;
+ interpreter->num_early_DOD_PMCs = 1;
goto NEXT();
}

Leopold Toetsch

unread,
Sep 5, 2003, 10:47:10 AM9/5/03
to Luke Palmer, perl6-i...@perl.org
Luke Palmer <fibo...@babylonia.flatirons.org> wrote:

> Leopold Toetsch writes:
>>
>> Can you summarize your scheme again please WRT this and other
>> enhancements. I'm somewhat lost in all the improvements that were
>> proposed since your original.

> Alright, here's a patch that implements it.

Wow.

Some remarks:
First of course did you test & benchmark it? Please ...

(Allocate 10^x PMCs into some aggregates. Then: set active object(s) in
the root set, then do 2 or 3 "sweep 1" for cache settling, then time a
"sweep 1" against a "sweep 0". The same with some active objects deeper
buried inside a nested structure)

> Bear with me, because I don't know my way around the DOD very well.
> First, do I increment num_early_DOD_PMCs in the right place? That's
> where it was set to 1 before...

Looks good.
I didn't look, but do you decrement the counter, when such an object
gets destroyed?

> Also, the critical block that makes this an optimization is missing,
> marked with an XXX. I didn't implement it because I don't really know
> the control flow of DOD, so I didn't know how to abort cleanly. Can
> someone give me some hints here?

It should probably be inside trace_children's loop. Then, when you have
all your num_early_DOD_PMCs seen, you just leave the loop for the lazy
case. That's it.

Please note: you don't have to worry about GC. It might be inefficient
to move already dead object buffers around, but this is the way its done
currently. (Another thing to benchmark - though)

> Other than those few, but unignorable, caveats, Enjoy!

> + && interpreter->dod_trace_ptr) {


> + /* put it at the head of the list */

> + }


> + else {
> + /* put it on the end of the list */

/Me thinks, Dan will not like that part. Seems to break his
serialization scheme.

leo

Luke Palmer

unread,
Sep 10, 2003, 12:52:32 AM9/10/03
to Leopold Toetsch, perl6-i...@perl.org
Okay, after some major changes, here's the second revision of my patch.
This one is fully functional.

On my system, it creates over a 10x speedup for lazy DOD runs!

(I'll post the benchmark program if someone wants; it's pretty long)

Luke


Index: core.ops
===================================================================
RCS file: /cvs/public/parrot/core.ops,v
retrieving revision 1.324
diff -u -r1.324 core.ops
--- core.ops 29 Aug 2003 11:30:17 -0000 1.324

+++ core.ops 10 Sep 2003 04:37:06 -0000


@@ -837,8 +837,11 @@
=cut

op sweep(inconst INT) {
- if ($1 || interpreter->has_early_DOD_PMCs)
+ if ($1)
Parrot_do_dod_run(interpreter, 0);
+ else
+ if (interpreter->num_early_DOD_PMCs)
+ Parrot_do_dod_run(interpreter, DOD_lazy_FLAG);
goto NEXT();
}

@@ -906,8 +909,8 @@



op needs_destroy(in PMC) {
PObj_needs_early_DOD_SET($1);
- interpreter->has_early_DOD_PMCs = 1;

- goto NEXT();
+ interpreter->num_early_DOD_PMCs++;
+ goto NEXT();
}

=back


Index: dod.c
===================================================================
RCS file: /cvs/public/parrot/dod.c,v
retrieving revision 1.70
diff -u -r1.70 dod.c
--- dod.c 28 Aug 2003 16:56:15 -0000 1.70

+++ dod.c 10 Sep 2003 04:37:06 -0000
@@ -31,7 +31,7 @@
#endif

static size_t find_common_mask(size_t val1, size_t val2);
-static void trace_children(struct Parrot_Interp *interpreter, PMC *current);
+static int trace_children(struct Parrot_Interp *interpreter, PMC *current);

#if ARENA_DOD_FLAGS

@@ -43,23 +43,49 @@


size_t ns = n >> ARENA_FLAG_SHIFT;
UINTVAL nm = (n & ARENA_FLAG_MASK) << 2;
UINTVAL *dod_flags = arena->dod_flags + ns;
- if (*dod_flags & ((PObj_on_free_list_FLAG | PObj_live_FLAG) << nm))
+
+ if (*dod_flags & PObj_on_free_list_FLAG << nm)
+ return;
+ if (PObj_high_priority_DOD_TEST(obj) && interpreter->dod_trace_ptr)
+ /* set obj's parent to high priority */
+ PObj_high_priority_DOD_SET(interpreter->dod_trace_ptr);
+ if (*dod_flags & PObj_live_FLAG << nm)
return;
++arena->live_objects;
*dod_flags |= PObj_live_FLAG << nm;

+ if (PObj_needs_early_DOD_TEST(obj))
+ ++interpreter->DOD_early_PMCs_seen;
+

if (*dod_flags & (PObj_is_special_PMC_FLAG << nm)) {
if (((PMC*)obj)->pmc_ext) {
- /* put it on the end of the list */
- interpreter->mark_ptr->next_for_GC = (PMC *)obj;
+ if (PObj_high_priority_DOD_TEST(obj)

+ && interpreter->dod_trace_ptr) {
+ PMC* tptr = interpreter->dod_trace_ptr;
+ /* Ugh, we have to check for self reference, otherwise
+ * we get a nasty circular loop. But, since whenever
+ * this block is executed, we're getting a nice
+ * optimization, I figure it's okay */
+ if (tptr->next_for_GC == tptr) {
+ ((PMC*)obj)->next_for_GC = (PMC*)obj;


+ }
+ else {
+ /* put it at the head of the list */

+ ((PMC*)obj)->next_for_GC = tptr->next_for_GC;
+ }


+ interpreter->dod_trace_ptr->next_for_GC = (PMC*)obj;

+ }
+ else {
+ /* put it on the end of the list */

+ interpreter->dod_mark_ptr->next_for_GC = (PMC *)obj;

- /* Explicitly make the tail of the linked list be
- * self-referential */
- interpreter->mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ /* Explicitly make the tail of the linked list be
+ * self-referential */
+ interpreter->dod_mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ }
}
else if (PObj_custom_mark_TEST(obj))
VTABLE_mark(interpreter, (PMC *) obj);

- return;
}
}

@@ -84,23 +110,42 @@


}
# endif
#endif
+ if (PObj_high_priority_DOD_TEST(obj) && interpreter->dod_trace_ptr)
+ PObj_high_priority_DOD_SET(interpreter->dod_trace_ptr);
/* mark it live */
PObj_live_SET(obj);
/* if object is a PMC and contains buffers or PMCs, then attach

* the PMC to the chained mark list
*/
- if (PObj_is_special_PMC_TEST(obj)) {
+ if (*dod_flags & (PObj_is_special_PMC_FLAG << nm)) {


if (((PMC*)obj)->pmc_ext) {
- /* put it on the end of the list */
- interpreter->mark_ptr->next_for_GC = (PMC *)obj;
+ if (PObj_high_priority_DOD_TEST(obj)

+ && interpreter->dod_trace_ptr) {
+ PMC* tptr = interpreter->dod_trace_ptr;
+ /* Ugh, we have to check for self reference, otherwise
+ * we get a nasty circular loop. But, since whenever
+ * this block is executed, we're getting a nice
+ * optimization, I figure it's okay */
+ if (tptr->next_for_GC == tptr) {
+ ((PMC*)obj)->next_for_GC = (PMC*)obj;


+ }
+ else {
+ /* put it at the head of the list */

+ ((PMC*)obj)->next_for_GC = tptr->next_for_GC;
+ }


+ interpreter->dod_trace_ptr->next_for_GC = (PMC*)obj;

+ }
+ else {
+ /* put it on the end of the list */

+ interpreter->dod_mark_ptr->next_for_GC = (PMC *)obj;

- /* Explicitly make the tail of the linked list be
- * self-referential */
- interpreter->mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ /* Explicitly make the tail of the linked list be
+ * self-referential */
+ interpreter->dod_mark_ptr = ((PMC*)obj)->next_for_GC = (PMC *)obj;
+ }
}
else if (PObj_custom_mark_TEST(obj))
VTABLE_mark(interpreter, (PMC *) obj);

- return;
}
#if GC_VERBOSE
/* buffer GC_DEBUG stuff */
@@ -117,8 +162,10 @@
#endif


-/* Do a full trace run and mark all the PMCs as active if they are */
-static void
+/* Do a full trace run and mark all the PMCs as active if they are.
+ * Returns whether the run wasn't aborted (whether it is safe to
+ * proceed with GC) */
+static int
trace_active_PMCs(struct Parrot_Interp *interpreter, int trace_stack)
{
PMC *current;
@@ -134,7 +181,7 @@


struct Stash *stash = 0;

/* We have to start somewhere, the interpreter globals is a good place */
- interpreter->mark_ptr = current = interpreter->iglobals;
+ interpreter->dod_mark_ptr = current = interpreter->iglobals;

/* mark it as used */
pobject_lives(interpreter, (PObj *)interpreter->iglobals);

@@ -194,20 +241,30 @@
#endif
/* Okay, we've marked the whole root set, and should have a good-sized
* list 'o things to look at. Run through it */
- trace_children(interpreter, current);
+ return trace_children(interpreter, current);
}

-static void
+/* Returns whether the process wasn't aborted. */
+static int
trace_children(struct Parrot_Interp *interpreter, PMC *current)
{
PMC *prev = NULL;


unsigned i = 0;
UINTVAL mask = PObj_is_PMC_ptr_FLAG | PObj_is_buffer_ptr_FLAG
| PObj_custom_mark_FLAG;
-

+ int lazy_dod = interpreter->lazy_dod;

+
for (; current != prev; current = current->next_for_GC) {
UINTVAL bits = PObj_get_FLAGS(current) & mask;

+ if (lazy_dod &&
+ interpreter->DOD_early_PMCs_seen >= interpreter->num_early_DOD_PMCs) {
+ return 0;
+ }


+ interpreter->dod_trace_ptr = current;

+ if (!PObj_needs_early_DOD_TEST(current))
+ PObj_high_priority_DOD_CLEAR(current);
+
/* mark properties */
if (current->metadata) {
pobject_lives(interpreter, (PObj *)current->metadata);
@@ -250,6 +307,7 @@

prev = current;
}
+ return 1;
}

/* Scan any buffers in S registers and other non-PMC places and mark
@@ -453,9 +511,6 @@
UINTVAL free_arenas = 0, old_total_used = 0;
#endif

- /* We have no impatient things. Yet. */


- interpreter->has_early_DOD_PMCs = 0;

-


/* Run through all the buffer header pools and mark */
for (cur_arena = pool->last_Arena;

NULL != cur_arena; cur_arena = cur_arena->prev) {
@@ -497,13 +552,8 @@
{
/* its live */
total_used++;
-#if ARENA_DOD_FLAGS
- if ((*dod_flags & (PObj_needs_early_DOD_FLAG << nm)))


- interpreter->has_early_DOD_PMCs = 1;

-#else
+#if !ARENA_DOD_FLAGS
PObj_live_CLEAR(b);
- if (PObj_needs_early_DOD_TEST(b))


- interpreter->has_early_DOD_PMCs = 1;

#endif
}
else {
@@ -517,6 +567,8 @@
if (PObj_is_PMC_TEST(b)) {
/* then destroy it here
*/
+ if (PObj_high_priority_DOD_TEST(b))
+ --interpreter->num_early_DOD_PMCs;
if (PObj_active_destroy_TEST(b))
VTABLE_destroy(interpreter, (PMC *)b);

@@ -699,7 +751,7 @@



/* See if we can find some unused headers */
void
-Parrot_do_dod_run(struct Parrot_Interp *interpreter, int trace_stack)
+Parrot_do_dod_run(struct Parrot_Interp *interpreter, UINTVAL flags)
{
struct Small_Object_Pool *header_pool;
int j;

@@ -711,6 +763,8 @@


}
Parrot_block_DOD(interpreter);

+ interpreter->lazy_dod = flags & DOD_lazy_FLAG;
+ interpreter->dod_trace_ptr = NULL;
#if ARENA_DOD_FLAGS
clear_live_counter(interpreter, interpreter->arena_base->pmc_pool);
for (j = 0; j < (INTVAL)interpreter->arena_base->num_sized; j++) {

@@ -720,49 +774,51 @@


}
#endif
/* Now go trace the PMCs */
- trace_active_PMCs(interpreter, trace_stack);

+ if (trace_active_PMCs(interpreter, flags & DOD_trace_stack_FLAG)) {

- /* And the buffers */
- trace_active_buffers(interpreter);
+ /* And the buffers */
+ trace_active_buffers(interpreter);
#if !TRACE_SYSTEM_AREAS
# if GC_VERBOSE
- /* whe, we don't trace stack and registers, we check after
- * marking everything, if something was missed
- * not - these could also be stale objects
- */
- if (trace_stack) {
+ /* whe, we don't trace stack and registers, we check after
+ * marking everything, if something was missed
+ * not - these could also be stale objects
+ */


+ if (flags & DOD_trace_stack_FLAG) {
# if ! DISABLE_GC_DEBUG

- CONSERVATIVE_POINTER_CHASING = 1;
+ CONSERVATIVE_POINTER_CHASING = 1;
# endif
- trace_system_areas(interpreter);
+ trace_system_areas(interpreter);
# if ! DISABLE_GC_DEBUG
- CONSERVATIVE_POINTER_CHASING = 0;
+ CONSERVATIVE_POINTER_CHASING = 0;
# endif
- }
+ }
# endif
#endif

- /* Now put unused PMCs on the free list */
- header_pool = interpreter->arena_base->pmc_pool;
- free_unused_pobjects(interpreter, header_pool);
- total_free += header_pool->num_free_objects;
-
- /* And unused buffers on the free list */
- for (j = 0; j < (INTVAL)interpreter->arena_base->num_sized; j++) {
- header_pool = interpreter->arena_base->sized_header_pools[j];
- if (header_pool) {
+ /* Now put unused PMCs on the free list */
+ header_pool = interpreter->arena_base->pmc_pool;
+ free_unused_pobjects(interpreter, header_pool);
+ total_free += header_pool->num_free_objects;
+
+ /* And unused buffers on the free list */
+ for (j = 0; j < (INTVAL)interpreter->arena_base->num_sized; j++) {
+ header_pool = interpreter->arena_base->sized_header_pools[j];
+ if (header_pool) {
#ifdef GC_IS_MALLOC
- used_cow(interpreter, header_pool, 0);
+ used_cow(interpreter, header_pool, 0);
#endif
- free_unused_pobjects(interpreter, header_pool);
- total_free += header_pool->num_free_objects;
+ free_unused_pobjects(interpreter, header_pool);
+ total_free += header_pool->num_free_objects;
#ifdef GC_IS_MALLOC
- clear_cow(interpreter, header_pool, 0);
+ clear_cow(interpreter, header_pool, 0);
#endif
+ }


}
}
/* Note it */
interpreter->dod_runs++;
+ interpreter->dod_trace_ptr = NULL;
Parrot_unblock_DOD(interpreter);
return;
}
Index: interpreter.c
===================================================================
RCS file: /cvs/public/parrot/interpreter.c,v
retrieving revision 1.199
diff -u -r1.199 interpreter.c
--- interpreter.c 28 Aug 2003 15:26:21 -0000 1.199

+++ interpreter.c 10 Sep 2003 04:37:06 -0000


@@ -865,7 +865,7 @@
*/
Parrot_IOData_mark(interpreter, interpreter->piodata);

- if (interpreter->has_early_DOD_PMCs)
+ if (interpreter->num_early_DOD_PMCs)
free_unused_pobjects(interpreter, interpreter->arena_base->pmc_pool);

/* we destroy all child interpreters and the last one too,

@@ -1006,6 +1006,9 @@
break;
case TOTAL_COPIED:
ret = interpreter->memory_collected;
+ break;
+ case IMPATIENT_PMCS:
+ ret = interpreter->num_early_DOD_PMCs;
break;
}
return ret;


Index: resources.c
===================================================================
RCS file: /cvs/public/parrot/resources.c,v
retrieving revision 1.108
diff -u -r1.108 resources.c
--- resources.c 28 Jul 2003 13:37:55 -0000 1.108

+++ resources.c 10 Sep 2003 04:37:07 -0000


@@ -106,13 +106,13 @@
interpreter->mem_allocs_since_last_collect++;
}
if (0 && GC_DEBUG(interpreter)) {
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);
if (pool->compact) {
(*pool->compact) (interpreter, pool);
}
}
if (pool->top_block->free < size) {
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);
/* Compact the pool if allowed and worthwhile */
if (pool->compact) {
/* don't bother reclaiming if it's just chicken feed */
Index: smallobject.c
===================================================================
RCS file: /cvs/public/parrot/smallobject.c,v
retrieving revision 1.26
diff -u -r1.26 smallobject.c
--- smallobject.c 26 Aug 2003 10:05:44 -0000 1.26

+++ smallobject.c 10 Sep 2003 04:37:07 -0000


@@ -68,7 +68,7 @@
if (pool->skip)
pool->skip = 0;
else {
- Parrot_do_dod_run(interpreter, 1);
+ Parrot_do_dod_run(interpreter, DOD_trace_stack_FLAG);
if (pool->num_free_objects <= pool->replenish_level)
pool->skip = 1;
}
Index: string.c
===================================================================
RCS file: /cvs/public/parrot/string.c,v
retrieving revision 1.143
diff -u -r1.143 string.c
--- string.c 27 Aug 2003 16:27:28 -0000 1.143

+++ string.c 10 Sep 2003 04:37:07 -0000

+++ classes/timer.pmc 10 Sep 2003 04:37:07 -0000


@@ -253,7 +253,7 @@
t->self = SELF;
SELF->cache.struct_val = t;
PObj_active_destroy_SET(SELF);
- interpreter->has_early_DOD_PMCs = 1;
+ interpreter->num_early_DOD_PMCs++;
}

void init_pmc(PMC *init) {

Index: include/parrot/dod.h
===================================================================
RCS file: /cvs/public/parrot/include/parrot/dod.h,v
retrieving revision 1.11
diff -u -r1.11 dod.h
--- include/parrot/dod.h 28 Jul 2003 13:38:00 -0000 1.11

+++ include/parrot/dod.h 10 Sep 2003 04:37:07 -0000


@@ -40,7 +40,12 @@
#define Parrot_is_blocked_GC(interpreter) \
((interpreter)->GC_block_level)

-void Parrot_do_dod_run(struct Parrot_Interp *, int trace_stack);
+enum {
+ DOD_trace_stack_FLAG = 1 << 0,
+ DOD_lazy_FLAG = 1 << 1
+};
+
+void Parrot_do_dod_run(struct Parrot_Interp *, UINTVAL flags);
void trace_system_areas(struct Parrot_Interp *);
void trace_mem_block(struct Parrot_Interp *, size_t, size_t);

Index: include/parrot/interpreter.h
===================================================================
RCS file: /cvs/public/parrot/include/parrot/interpreter.h,v
retrieving revision 1.85
diff -u -r1.85 interpreter.h
--- include/parrot/interpreter.h 28 Aug 2003 15:26:24 -0000 1.85

+++ include/parrot/interpreter.h 10 Sep 2003 04:37:07 -0000

+++ include/parrot/pobj.h 10 Sep 2003 04:37:07 -0000


@@ -198,7 +198,7 @@
PObj_active_destroy_FLAG = 1 << 22,
/* For debugging, report when this buffer gets moved around */
PObj_report_FLAG = 1 << 23,
-
+
/* PMC specific FLAGs */

/* Set to true if the PMC data pointer points to something that
@@ -218,12 +218,14 @@
*/
b_PObj_is_special_PMC_FLAG = 1 << 26,

- b_PObj_needs_early_DOD_FLAG = 1 << 27,
+ /* This PObj is connected by some route to a "needs_early_DOD" object */
+ PObj_high_priority_DOD_FLAG = 1 << 27,

+ PObj_needs_early_DOD_FLAG = (1 << 27 | 1 << 28),



/* True if the PMC is a class */
- PObj_is_class_FLAG = 1 << 28,
+ PObj_is_class_FLAG = 1 << 29,
/* True if the PMC is a parrot object */
- PObj_is_object_FLAG = 1 << 29
+ PObj_is_object_FLAG = 1 << 30

} PObj_flags;

@@ -240,7 +242,6 @@
# define d_PObj_live_FLAG 0x01
# define d_PObj_on_free_list_FLAG 0x02
# define d_PObj_is_special_PMC_FLAG 0x04
-# define d_PObj_needs_early_DOD_FLAG 0x08

/*
* arenas are constant sized ~32 byte object size, ~128K objects
@@ -297,14 +298,12 @@
# define PObj_live_FLAG d_PObj_live_FLAG
# define PObj_on_free_list_FLAG d_PObj_on_free_list_FLAG
# define PObj_is_special_PMC_FLAG d_PObj_is_special_PMC_FLAG
-# define PObj_needs_early_DOD_FLAG d_PObj_needs_early_DOD_FLAG

#else

# define PObj_live_FLAG b_PObj_live_FLAG
# define PObj_on_free_list_FLAG b_PObj_on_free_list_FLAG
# define PObj_is_special_PMC_FLAG b_PObj_is_special_PMC_FLAG
-# define PObj_needs_early_DOD_FLAG b_PObj_needs_early_DOD_FLAG

# define DOD_flag_TEST(flag, o) PObj_flag_TEST(flag, o)
# define DOD_flag_SET(flag, o) PObj_flag_SET(flag, o)
@@ -341,6 +340,10 @@


#define PObj_report_SET(o) PObj_flag_SET(report, o)
#define PObj_report_CLEAR(o) PObj_flag_CLEAR(report, o)

+#define PObj_high_priority_DOD_TEST(o) PObj_flag_TEST(high_priority_DOD, o)
+#define PObj_high_priority_DOD_SET(o) PObj_flag_SET(high_priority_DOD, o)
+#define PObj_high_priority_DOD_CLEAR(o) PObj_flag_CLEAR(high_priority_DOD, o)

+


#define PObj_on_free_list_TEST(o) DOD_flag_TEST(on_free_list, o)
#define PObj_on_free_list_SET(o) DOD_flag_SET(on_free_list, o)

#define PObj_on_free_list_CLEAR(o) DOD_flag_CLEAR(on_free_list, o)
@@ -361,9 +364,9 @@
#define PObj_sysmem_SET(o) PObj_flag_SET(sysmem, o)
#define PObj_sysmem_CLEAR(o) PObj_flag_CLEAR(sysmem, o)

-#define PObj_needs_early_DOD_TEST(o) DOD_flag_TEST(needs_early_DOD, o)
-#define PObj_needs_early_DOD_SET(o) DOD_flag_SET(needs_early_DOD, o)
-#define PObj_needs_early_DOD_CLEAR(o) DOD_flag_CLEAR(needs_early_DOD, o)
+#define PObj_needs_early_DOD_TEST(o) PObj_flag_TEST(needs_early_DOD, o)
+#define PObj_needs_early_DOD_SET(o) PObj_flag_SET(needs_early_DOD, o)
+#define PObj_needs_early_DOD_CLEAR(o) PObj_flag_CLEAR(needs_early_DOD, o)

#define PObj_special_SET(flag, o) do { \
PObj_flag_SET(flag, o); \
Index: include/parrot/resources.h
===================================================================
RCS file: /cvs/public/parrot/include/parrot/resources.h,v
retrieving revision 1.44
diff -u -r1.44 resources.h
--- include/parrot/resources.h 21 Jul 2003 18:00:42 -0000 1.44
+++ include/parrot/resources.h 10 Sep 2003 04:37:07 -0000
@@ -82,6 +82,7 @@
#define HEADER_ALLOCS_SINCE_COLLECT 8
#define MEM_ALLOCS_SINCE_COLLECT 9
#define TOTAL_COPIED 10
+#define IMPATIENT_PMCS 11

/* &end_gen */



Index: io/io.c
===================================================================
RCS file: /cvs/public/parrot/io/io.c,v
retrieving revision 1.53
diff -u -r1.53 io.c
--- io/io.c 2 Sep 2003 16:41:14 -0000 1.53

+++ io/io.c 10 Sep 2003 04:37:07 -0000


@@ -748,12 +748,12 @@
INTVAL i;
ParrotIOTable table = piodata->table;

- /* XXX boe: Parrot_really_destroy might call us with mark_ptr not

+ /* XXX boe: Parrot_really_destroy might call us with pmc_mark_ptr not


* set. This is neccessary until destruction ordering prevents
* the premature destruction of the standardhandles
*/
- if (!interpreter->mark_ptr)
- interpreter->mark_ptr = table[0];
+ if (!interpreter->dod_mark_ptr)
+ interpreter->dod_mark_ptr = table[0];

for (i = 0; i < PIO_NR_OPEN; i++) {
if (table[i]) {

Index: t/op/gc.t
===================================================================
RCS file: /cvs/public/parrot/t/op/gc.t,v
retrieving revision 1.5
diff -u -r1.5 gc.t
--- t/op/gc.t 1 Jul 2003 23:08:14 -0000 1.5
+++ t/op/gc.t 10 Sep 2003 04:37:07 -0000
@@ -35,10 +35,10 @@
interpinfo I1, 2 # How many DOD runs have we done already?
new P0, .PerlUndef
needs_destroy P0
+ new P0, .PerlUndef # kill object
sweep 0
interpinfo I2, 2 # Should be one more now
sub I3, I2, I1
- new P0, .PerlUndef # kill 1st object
sweep 0
interpinfo I4, 2 # Should be same as last
sub I5, I4, I2

Daniel Grunblatt

unread,
Sep 10, 2003, 12:54:57 AM9/10/03
to Luke Palmer, perl6-i...@perl.org
On Wednesday 10 September 2003 01:52, Luke Palmer wrote:
> Okay, after some major changes, here's the second revision of my patch.
> This one is fully functional.
>
> On my system, it creates over a 10x speedup for lazy DOD runs!
Yay!

>
> (I'll post the benchmark program if someone wants; it's pretty long)

What about puting it under examples/benchmarks/ ?

>
> Luke

Daniel

Leopold Toetsch

unread,
Sep 10, 2003, 5:47:19 AM9/10/03
to Luke Palmer, perl6-i...@perl.org
Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
> Okay, after some major changes, here's the second revision of my patch.
> This one is fully functional.

> On my system, it creates over a 10x speedup for lazy DOD runs!

We need that!!!1

> (I'll post the benchmark program if someone wants; it's pretty long)

Yes, please.

> Luke

leo

Dan Sugalski

unread,
Sep 10, 2003, 11:04:59 AM9/10/03
to Luke Palmer, Leopold Toetsch, perl6-i...@perl.org
On Tue, 9 Sep 2003, Luke Palmer wrote:

> Okay, after some major changes, here's the second revision of my patch.
> This one is fully functional.
>
> On my system, it creates over a 10x speedup for lazy DOD runs!

What's it do for non-lazy runs?

> (I'll post the benchmark program if someone wants; it's pretty long)

Please do. Better yet, commit it to the benchmark directory.

I'd like a test program for this. It doesn't necessarily have to be a .t
test (though that wouldn't hurt) but I'd like it to try as hard as it
possibly can to break this code. You've written it, so go over the code
and look for edge cases, weak spots, and other dodgy bits, and try and
construct a test to stress them.

If this all works out, we can commit the changes. (At some point we should
talk about what sorts of source support would be needed for proper
generational GC, but lets get this in first)

Dan

0 new messages