More on stackwalking

David Robins

May 5, 2003, 3:28:43 PM
to perl6-i...@perl.org
I read Dan's stackwalking blog entry
(http://www.sidhe.org/~dan/blog/archives/000174.html) and it was just what I
expected stackwalking to be, except it left me with a few questions, like
how do you differentiate between:

PMC* pmc = PMC_new(...); /* a PMC pointer */
int number = 0x80041234; /* an int that looks very much like a pointer */

Not being completely helpless, I started poking around the source a bit,
beginning with a quick grep for 'walk the stack', and decided to post my
'findings' here for others that didn't know all this already and were
curious. Here's the path:

<GC run invoked 'somehow' (grep for Parrot_do_dod_run())>
dod.c: Parrot_do_dod_run()
dod.c: trace_active_PMCs()
cpu_dep.c: trace_system_areas()
cpu_dep.c: trace_system_stack() - gets start/end of stack
dod.c: trace_mem_block()

trace_mem_block() is where interesting things start to happen; to start
with, the interpreter is asked for the start/end of the PMC and buffer
memory areas, and then it just walks the memory area, stepping by the size
of a pointer, and for each value, checks:

(1) whether it masks to a calculated prefix (this is an optimization,
presumably because & is much faster than < and > twice each)

(2) whether it's within either the PMC or buffer memory range

(3) whether it passes header.c's is_pmc_ptr/is_buffer_ptr checks, which do a
more fine-grained version of (2) but also check alignment (see
smallobject.c's contained_in_pool()), since both PMCs and buffers have
fixed-size headers

If these all pass, the pointer is considered to be a valid PMC pointer and
dod.c's frodo_lives(), er, pobject_lives() is used to make it 'live'.
pobject_lives() won't mark an object live twice, and it skips PMCs on the
free list (accomplished by a quick flags check; at this point it's safe to
do this since we know we have some sort of PMC header).

So, we weed out the obvious non-pointers (values that can't possibly be a
PMC pointer) by range checking, alignment checking, and flag checking.
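
The checks above can be sketched in C roughly as follows. This is only an illustration, not Parrot's code: the Header struct, the pool array, and the names looks_like_header() and trace_block() are made up for the example, and the prefix-mask shortcut from (1) is left out since it only speeds up the range test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size object header; Parrot's real PMC layout differs. */
typedef struct Header {
    unsigned live;          /* set once the object has been marked */
    unsigned on_free_list;  /* set while the slot is unallocated */
    void *payload;
} Header;

#define POOL_SIZE 8
static Header pool[POOL_SIZE];

/* Checks (2) and (3): inside the pool and aligned to a header boundary. */
int looks_like_header(uintptr_t value)
{
    uintptr_t lo = (uintptr_t)&pool[0];
    uintptr_t hi = (uintptr_t)&pool[POOL_SIZE];
    if (value < lo || value >= hi)
        return 0;
    return (value - lo) % sizeof(Header) == 0;
}

/* Walk a block one pointer-sized word at a time and mark every word that
 * passes the checks. Returns the number of headers newly marked. */
size_t trace_block(const uintptr_t *start, const uintptr_t *end)
{
    size_t marked = 0;
    assert(start <= end);
    for (const uintptr_t *p = start; p < end; ++p) {
        if (!looks_like_header(*p))
            continue;
        Header *h = (Header *)*p;
        if (h->on_free_list || h->live)
            continue;       /* free slots and already-marked objects are skipped */
        h->live = 1;
        ++marked;
    }
    return marked;
}
```

Given a block holding the address of pool[2], a zero, a misaligned address, and the address of a free-list slot, only pool[2] gets marked. The scanner would behave identically if that first word were an int that merely happened to hold the same bits.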

False positives (counting a value as a PMC when it's an innocent integer,
part of a struct, etc.) are, as far as I can see, not eliminated (so the
answer to the original question is "you don't"). We don't care if a PMC
lives slightly longer than it should* - GC timing isn't guaranteed anyway.

* Or do we? Suppose the destructor for a PMC closes a file. The last ref
to the PMC goes away, so it's _eligible_ for GC (but not collected yet),
and then that file is reopened... could this be messy?

The rest of GC is fairly easy: mark the nonempty registers, interpreter
stacks, and (?) globals; these become part of the "root set" and anything
reachable from them is marked "live", the rest of the world is free for
reuse (tidily eliminating the circular references problem perl5 has,
although at the cost of not knowing when destructors are called - ?).
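
To see why marking from a root set disposes of circular references, here is a small C sketch. It assumes each object holds at most one outgoing reference, and the Obj type and the mark_from() and sweep() names are invented for the example rather than taken from Parrot.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj {
    struct Obj *ref;   /* at most one outgoing reference, for brevity */
    int live;
} Obj;

/* Mark everything reachable from one root. The live check stops the walk
 * at an object that was already visited, so cycles terminate. */
void mark_from(Obj *root)
{
    for (Obj *o = root; o != NULL && !o->live; o = o->ref)
        o->live = 1;
}

/* Count the objects a sweep would reclaim and clear the marks for the
 * next run. */
size_t sweep(Obj *pool, size_t n)
{
    size_t dead = 0;
    assert(pool != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (!pool[i].live)
            ++dead;
        pool[i].live = 0;
    }
    return dead;
}
```

With a root pointing at one object and two other objects pointing only at each other, the pair is never marked and the sweep reclaims both. Under plain refcounting each of the two would keep a count of 1 forever.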

I also got curious about destruction; it's done in free_unused_pobjects()
which is also invoked from Parrot_do_dod_run(). Question: if object A uses
(has a pointer to) object B, and then the variable pointing to A is unset,
is there any guarantee that B's destructor is called before A's (it doesn't
look like it...)?

Hope this is helpful to some... apologies if it's already been hashed to
death.

(As an aside, a search on the mailing list archives would be a nice addition
to http://dev.perl.org/perl6/lists/.)

--
Dave
Isa. 40:31

Dan Sugalski

May 5, 2003, 5:07:17 PM
to David Robins, perl6-i...@perl.org
At 3:28 PM -0400 5/5/03, David Robins wrote:
>False positives (counting a value as a PMC when it's an innocent integer,
>part of a struct, etc.) are, as far as I can see, not eliminated (so the
>answer to the original question is "you don't"). We don't care if a PMC
>lives slightly longer than it should* - GC timing isn't guaranteed anyway.
>
>* Or do we? Suppose the destructor for a PMC closes a file. The last ref
> to the PMC goes away, so it's _eligible_ for GC (but not collected yet),
> and then that file is reopened... could this be messy?

You've hit the one part of the DOD system that's conservative--the
stackwalking code. Since we can't be sure whether something is or
isn't a pointer we assume it is.

Yes, this does mean that in some odd and unusual circumstances a
PMC/String/Buffer will live on well past the point it ought to.
There's not a whole lot to be done about this, since something living
too long is better than something getting cleaned up too soon. The
one advantage we do have is that the system stack is generally
shallow, so there's not that much data to wade through, and hence few
chances for false positives.

It is perfectly reasonable for a language compiler to explicitly
destroy PMCs if they know that destruction is viable, though I'm not
sure how often you'd actually want to do that.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Benjamin Goldberg

May 9, 2003, 6:12:26 PM
to perl6-i...@perl.org

Dan Sugalski wrote:
>
> At 3:28 PM -0400 5/5/03, David Robins wrote:
> >False positives (counting a value as a PMC when it's an innocent integer,
> >part of a struct, etc.) are, as far as I can see, not eliminated (so the
> >answer to the original question is "you don't"). We don't care if a PMC
> >lives slightly longer than it should* - GC timing isn't guaranteed anyway.
> >
> >* Or do we? Suppose the destructor for a PMC closes a file. The last ref
> > to the PMC goes away, so it's _eligible_ for GC (but not collected yet),
> > and then that file is reopened... could this be messy?

Socket- and File- handles are the archetypical examples -- one generally
needs/wants them to be cleaned up in a timely manner.

If it's a plain file, and it's opened for reading, it's *relatively*
harmless. There's one fewer fd available, but most modern OSs have
enough fds for this not to be a concern. If the file was locked for
reading, and the next open is for writing, then this could indeed be a
problem -- deadlock could occur.

If the file is /dev/something, and the device reacts to being closed,
then obviously not closing the handle when it *should* get closed could
be a problem.

If it's a plain file, opened for writing, then there are two possible
sources of messiness -- if there's an unflushed output buffer, or if the
file is locked, either mandatorily (required by the OS), or manually
(due to an flock() type call).

> You've hit the one part of the DOD system that's conservative--the
> stackwalking code. Since we can't be sure whether something is or
> isn't a pointer we assume it is.
>
> Yes, this does mean that in some odd and unusual circumstances a
> PMC/String/Buffer will live on well past the point it ought to.

For Strings and Buffers, this is entirely(?) harmless.

> There's not a whole lot to be done about this, since something living
> too long is better than something getting cleaned up too soon. The
> one advantage we do have is that the system stack is generally
> shallow, so there's not that much data to wade through, and hence few
> chances for false positives.
>
> It is perfectly reasonable for a language compiler to explicitly
> destroy PMCs if they know that destruction is viable, though I'm not
> sure how often you'd actually want to do that.

--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}

Dave Mitchell

May 10, 2003, 5:02:42 PM
to Benjamin Goldberg, perl6-i...@perl.org
On Fri, May 09, 2003 at 06:12:26PM -0400, Benjamin Goldberg wrote:
> > At 3:28 PM -0400 5/5/03, David Robins wrote:
> > >* Or do we? Suppose the destructor for a PMC closes a file. The last ref
> > > to the PMC goes away, so it's _eligible_ for GC (but not collected yet),
> > > and then that file is reopened... could this be messy?
>
> Socket- and File- handles are the archetypical examples -- one generally
> needs/wants them to be cleaned up in a timely manner.

I thought that parrot, by using GC rather than refcounting, had abandoned
all attempts at timely calling of destructors on scope exit?

--
Never do today what you can put off till tomorrow.

Graham Barr

May 11, 2003, 3:19:39 AM
to Benjamin Goldberg, perl6-i...@perl.org
On Fri, May 09, 2003 at 06:12:26PM -0400, Benjamin Goldberg wrote:
>
>
> Dan Sugalski wrote:
> >
> > At 3:28 PM -0400 5/5/03, David Robins wrote:
> > >False positives (counting a value as a PMC when it's an innocent integer,
> > >part of a struct, etc.) are, as far as I can see, not eliminated (so the
> > >answer to the original question is "you don't"). We don't care if a PMC
> > >lives slightly longer than it should* - GC timing isn't guaranteed anyway.
> > >
> > >* Or do we? Suppose the destructor for a PMC closes a file. The last ref
> > > to the PMC goes away, so it's _eligible_ for GC (but not collected yet),
> > > and then that file is reopened... could this be messy?
>
> Socket- and File- handles are the archetypical examples -- one generally
> needs/wants them to be cleaned up in a timely manner.

Anyone who depends on GC for the closing of file handles asks for all they
get. If they need the handles closed in a timely manner then they should
close them explicitly.

Graham.

Hildo Biersma

May 11, 2003, 12:37:23 PM
to Dave Mitchell, Benjamin Goldberg, perl6-i...@perl.org
>>>>> "Dave" == Dave Mitchell <da...@fdgroup.com> writes:

Dave> On Fri, May 09, 2003 at 06:12:26PM -0400, Benjamin Goldberg wrote:
>> > At 3:28 PM -0400 5/5/03, David Robins wrote:
>> > >* Or do we? Suppose the destructor for a PMC closes a file. The last ref
>> > > to the PMC goes away, so it's _eligible_ for GC (but not collected yet),
>> > > and then that file is reopened... could this be messy?
>>
>> Socket- and File- handles are the archetypical examples -- one generally
>> needs/wants them to be cleaned up in a timely manner.

Dave> I thought that parrot, by using GC rather than refcounting, had
Dave> abandoned all attempts at timely calling of destructors on scope
Dave> exit?

That doesn't mean that languages running on Parrot cannot enforce
this.

For example, a high-level language can put in "check ref count and
close on zero" code on leaving any context scope that has defined a
file descriptor (plus maybe also the enclosing subroutine). If that
is combined with a similar check on assignment to variables of
file-handle class, you can pretty much guarantee this behavior even
with GC.

Dave Mitchell

May 11, 2003, 12:56:32 PM
to Hildo....@morganstanley.com, Benjamin Goldberg, perl6-i...@perl.org

Yeah, but there's no ref count to check. So unless you want to do a full DOD
run at the end of each scope, I can't see how a HLL can achieve
destroy-on-scope-exit.

Dave.

--
print+qq&$}$"$/$s$,$*${$}$g$s$@$.$q$,$:$.$q$^$,$@$*$~$;$.$q$m&if+map{m,^\d{0\,},,${$::{$'}}=chr($"+=$&||1)}q&10m22,42}6:17*2~2.3@3;^2$g3q/s"&=~m*\d\*.*g

Benjamin Goldberg

May 11, 2003, 9:07:09 PM
to perl6-i...@perl.org
Hildo Biersma wrote:
>
> Dave wrote:

In other words, all refcounting needs to be done manually?

Silly question, if I have code like:

my FileHandle $x = ...;
# refcount of the handle object is 1
my FileHandle $y = $x;
# refcount of the handle object is now 2
my $z = $x;
# What is the refcount of the handle object now?
undef $x; undef $y;
# What is the refcount of the handle object now?
undef $z;
# Should that have caused the handle to be destructed,
# or merely enabled GC to destruct it, "eventually"?

We would *like* undef()ing $z to instantly destruct the object; however,
since $z isn't a variable of a type that's known to be refcounted, no
instructions would have been generated to decrement the reference count
of the object contained in it when that undef instruction occurs.

I suspect that if we want timely destruction of objects, *not only* does
manual refcounting have to be done on those objects, but we *cannot*
assign a refcounted value into a non-refcounted variable, since the two
behaviors are (I suspect) logically incompatible.
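
The mismatch can be traced with a small C model. This is a sketch under one reading of the example above: only variables declared with the FileHandle type generate refcount traffic, and an untyped variable generates none at all. The Handle and Var types and the var_assign() and var_undef() helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Handle {
    int refcount;
    int is_open;
} Handle;

typedef struct Var {
    Handle *value;
    int counted;   /* 1 for "my FileHandle $x", 0 for a plain "my $z" */
} Var;

/* Bind a handle to a variable; only typed variables bump the count. */
void var_assign(Var *v, Handle *h)
{
    assert(v != NULL && h != NULL);
    v->value = h;
    if (v->counted)
        h->refcount++;
}

/* undef a variable; only typed variables drop the count, and the handle
 * is closed when the count reaches zero. */
void var_undef(Var *v)
{
    assert(v != NULL);
    Handle *h = v->value;
    v->value = NULL;
    if (h != NULL && v->counted && --h->refcount == 0)
        h->is_open = 0;   /* stands in for closing the file */
}
```

Running the sequence from the example ($x and $y typed, $z untyped) leaves the count at 2 after all three assignments. After undef on $x and $y the count is zero and the handle is closed, while $z still refers to it.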


On such objects, the garbage collector should not be touching them until
some (potentially long) time after the refcount has gone to zero, and
should merely be freeing memory... not actually closing files/sockets.
If, at the time that the GC destructs the object, the file descriptor
*is* open or the refcount is nonzero, then something has gone wrong.

PS: Should refcounting be done with a new opcode, or via a method?

Dan Sugalski

May 15, 2003, 7:57:56 AM
to Benjamin Goldberg, perl6-i...@perl.org
At 9:07 PM -0400 5/11/03, Benjamin Goldberg wrote:
>Hildo Biersma wrote:
>>
>> Dave wrote:
>>
>> Dave> Benjamin Goldberg wrote:
>> >> > At 3:28 PM -0400 5/5/03, David Robins wrote:
>> >> > >* Or do we? Suppose the destructor for a PMC closes a file. The
>> >> > > last ref to the PMC goes away, so it's _eligible_ for GC (but
>> >> > > not collected yet), and then that file is reopened... could
>> >> > > this be messy?
>> >>
>> >> Socket- and File- handles are the archetypical examples -- one
>> >> generally needs/wants them to be cleaned up in a timely manner.
>>
>> Dave> I thought that parrot, by using GC rather than refcounting, had
>> Dave> abandoned all attempts at timely calling of destructors on scope
>> Dave> exit?
>>
>> That doesn't mean that languages running on Parrot cannot enforce
>> this.
>>
>> For example, a high-level language can put in "check ref count and
>> close on zero" code on leaving any context scope that has defined a
>> file descriptor (plus maybe also the enclosing subroutine). If that
>> is combined with a similar check on assignment to variables of
>> file-handle class, you can pretty much guarantee this behavior even
>> with GC.
>
>In other words, all refcounting needs to be done manually?

For parrot it'd be more like triggering a DOD run on scope exit if
the scope allocated a variable that may have an active destructor and
go out of scope on scope exit. (Though with continuations there's
always the question of when scope *really* exits...) The current plan
is to provide a means for the allocator for these sorts of variables
to push a DOD-run-trigger on the call stack so a DOD run is triggered
when the scope is cleaned up.
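
A rough C sketch of the idea, assuming the trigger is nothing more than a per-scope flag that an allocator can set on the innermost active scope. The Interp struct and the scope_enter(), request_dod_on_exit(), and scope_exit() names are invented here, and continuations are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 64

typedef struct Interp {
    int want_dod[MAX_DEPTH];  /* one flag per active scope */
    size_t depth;
    unsigned dod_runs;
} Interp;

void scope_enter(Interp *in)
{
    assert(in->depth < MAX_DEPTH);
    in->want_dod[in->depth++] = 0;
}

/* Called by the allocator of a variable with an active destructor:
 * asks for a DOD run when the current scope is cleaned up. */
void request_dod_on_exit(Interp *in)
{
    assert(in->depth > 0);
    in->want_dod[in->depth - 1] = 1;
}

void scope_exit(Interp *in)
{
    assert(in->depth > 0);
    if (in->want_dod[--in->depth])
        in->dod_runs++;       /* stands in for a real DOD run */
}
```

Only scopes that asked for it pay for a run: an inner scope that sets the flag triggers one DOD run on exit, and an enclosing scope that never set it exits without one.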

Dave Mitchell

May 15, 2003, 9:02:29 AM
to Dan Sugalski, Benjamin Goldberg, perl6-i...@perl.org
On Thu, May 15, 2003 at 07:57:56AM -0400, Dan Sugalski wrote:
> For parrot it'd be more like triggering a DOD run on scope exit if
> the scope allocated a variable that may have an active destructor and
> go out of scope on scope exit. (Though with continuations there's
> always the question of when scope *really* exits...) The current plan
> is to provide a means for the allocator for these sorts of variables
> to push a DOD-run-trigger on the call stack so a DOD run is triggered
> when the scope is cleaned up.

But surely for the typical perl6-written-in-a-perl5-style program,
ie where you don't do lots of explicit typing of vars and subs, most
vars may have an active destructor, and therefore most programs
will need to be hit with a DOD run at every scope exit ?
ie

{ my $x = foo() } # perhaps $x has a destructor? Who knows?

--
"Strange women lying in ponds distributing swords is no basis for a system
of government. Supreme executive power derives from a mandate from the
masses, not from some farcical aquatic ceremony."
Dennis - Monty Python and the Holy Grail.

Dan Sugalski

May 15, 2003, 9:22:43 AM
to Dave Mitchell, Benjamin Goldberg, perl6-i...@perl.org
At 2:02 PM +0100 5/15/03, Dave Mitchell wrote:
>On Thu, May 15, 2003 at 07:57:56AM -0400, Dan Sugalski wrote:
>> For parrot it'd be more like triggering a DOD run on scope exit if
>> the scope allocated a variable that may have an active destructor and
>> go out of scope on scope exit. (Though with continuations there's
>> always the question of when scope *really* exits...) The current plan
>> is to provide a means for the allocator for these sorts of variables
>> to push a DOD-run-trigger on the call stack so a DOD run is triggered
>> when the scope is cleaned up.
>
>But surely for the typical perl6-written-in-a-perl5-style program,
>ie where you don't do lots of explicit typing of vars and subs, most
>vars may have an active destructor, and therefore most programs
>will need to be hit with a DOD run at every scope exit ?
>ie
>
> { my $x = foo() } # perhaps $x has a destructor? Who knows?

Potentially, yeah, but generally they can be ignored. It's the
(relatively) few things that affect the outside world--filehandles,
db handles, expensive resource handles--that may need explicit
cleanup quickly, and they can push a request on the stack. I'd not
expect normal destructors to need that, rather have it a special case.

Which then argues for mutable continuation objects, since that's what
you'd need to access--the stack in the continuation object your code
got. Which has a number of Evil Possibilities that I'm only beginning
to realize. (Muahahahahahaha! (In case that wasn't obvious :))

Dave Mitchell

May 15, 2003, 9:56:27 AM
to Dan Sugalski, Benjamin Goldberg, perl6-i...@perl.org
On Thu, May 15, 2003 at 09:22:43AM -0400, Dan Sugalski wrote:
> At 2:02 PM +0100 5/15/03, Dave Mitchell wrote:
> >But surely for the typical perl6-written-in-a-perl5-style program,
> >ie where you don't do lots of explicit typing of vars and subs, most
> >vars may have an active destructor, and therefore most programs
> >will need to be hit with a DOD run at every scope exit ?
> >ie
> >
> > { my $x = foo() } # perhaps $x has a destructor? Who knows?
>
> Potentially, yeah, but generally they can be ignored. It's the
> (relatively) few things that affect the outside world--filehandles,
> db handles, expensive resource handles--that may need explicit
> cleanup quickly, and they can push a request on the stack. I'd not
> expect normal destructors to need that, rather have it a special case.

Yes, but neither parrot nor the perl6 compiler can know which are the
special cases. Therefore it has to DOD on scope exit for *all* non-typed
lexicals, just in case it *might* hold a filehandle object. Or am I
missing something?

--
You never really learn to swear until you learn to drive.

Dan Sugalski

May 15, 2003, 10:10:51 AM
to Dave Mitchell, Benjamin Goldberg, perl6-i...@perl.org

Yep. You're missing the non-automaticness of this. If the filehandle
class decides that cleanup on scope exit is the right thing to do,
then the constructor for that class pushes a scope exit action into
its caller. Parrot doesn't do that automatically.

Dave Mitchell

May 15, 2003, 10:19:12 AM
to Dan Sugalski, Benjamin Goldberg, perl6-i...@perl.org
On Thu, May 15, 2003 at 10:10:51AM -0400, Dan Sugalski wrote:
> At 2:56 PM +0100 5/15/03, Dave Mitchell wrote:
> >Yes, but neither parrot nor the perl6 compiler can know which are the
> >special cases. Therefore it has to DOD on scope exit for *all* non-typed
> >lexicals, just in case it *might* hold a filehandle object. Or am I
> >missing something?
>
> Yep. You're missing the non-automaticness of this. If the filehandle
> class decides that cleanup on scope exit is the right thing to do,
> then the constructor for that class pushes a scope exit action into
> its caller. Parrot doesn't do that automatically.

I don't see that that can work:

    sub new_fh {
        my $fh = IO::Filehandle->new(...);
        .... do some locking or other value-added stuff here ...
        return $fh;
    } # filehandle destroyed here ...?

    {
        my $fh = new_fh();
        ....
    } # .. or here ?

If IO::Filehandle pushes some scope exit stuff onto the caller,
then the object will get destroyed on the exit from new_fh, which is
wrong.



--
Technology is dominated by two types of people: those who understand what
they do not manage, and those who manage what they do not understand.

Dan Sugalski

May 15, 2003, 10:31:50 AM
to Dave Mitchell, Benjamin Goldberg, perl6-i...@perl.org
At 3:19 PM +0100 5/15/03, Dave Mitchell wrote:
>On Thu, May 15, 2003 at 10:10:51AM -0400, Dan Sugalski wrote:
>> At 2:56 PM +0100 5/15/03, Dave Mitchell wrote:
>> >Yes, but neither parrot nor the perl6 compiler can know which are the
>> >special cases. Therefore it has to DOD on scope exit for *all* non-typed
>> >lexicals, just in case it *might* hold a filehandle object. Or am I
>> >missing something?
>>
>> Yep. You're missing the non-automaticness of this. If the filehandle
>> class decides that cleanup on scope exit is the right thing to do,
>> then the constructor for that class pushes a scope exit action into
>> its caller. Parrot doesn't do that automatically.
>
>I don't see that that can work:
>
> sub new_fh {
> my $fh = IO::Filehandle->new(...);
> .... do some locking or other value-added stuff here ...
> return $fh;
> } # filehandle destroyed here ...?
>
> {
> my $fh = new_fh();
> ....
> } # .. or here ?
>
>If IO::Filehandle pushes some scope exit stuff onto the caller,
>then the object will get destroyed on the exit from new_fh, which is
>wrong.

Nononononono, you misunderstand. We're not pushing a "kill this
variable" action, we're pushing a "do a DOD run" action. All that
will do is go check to see if there are dead objects.
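
The difference can be shown with a toy C model. It assumes reachability has already been worked out by a mark phase and is simply set by hand here; the Handle struct and the dod_run() function are hypothetical stand-ins, not Parrot's API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Handle {
    int reachable;  /* would come from the mark phase; set by hand here */
    int is_open;
} Handle;

/* Close every open handle that nothing refers to any more.
 * Returns how many handles were closed. */
size_t dod_run(Handle *handles, size_t n)
{
    size_t closed = 0;
    assert(handles != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (handles[i].is_open && !handles[i].reachable) {
            handles[i].is_open = 0;
            ++closed;
        }
    }
    return closed;
}
```

A run triggered on exit from new_fh() leaves the returned handle alone, because the caller still refers to it. The handle is closed only by a run that happens after the last reference is gone.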

Dave Mitchell

May 15, 2003, 10:40:50 AM
to Dan Sugalski, Benjamin Goldberg, perl6-i...@perl.org
On Thu, May 15, 2003 at 10:31:50AM -0400, Dan Sugalski wrote:
> At 3:19 PM +0100 5/15/03, Dave Mitchell wrote:
> >I don't see that that can work:
> >
> > sub new_fh {
> > my $fh = IO::Filehandle->new(...);
> > .... do some locking or other value-added stuff here ...
> > return $fh;
> > } # filehandle destroyed here ...?
> >
> > {
> > my $fh = new_fh();
> > ....
> > } # .. or here ?
> >
> >If IO::Filehandle pushes some scope exit stuff onto the caller,
> >then the object will get destroyed on the exit from new_fh, which is
> >wrong.
>
> Nononononono, you misunderstand. We're not pushing a "kill this
> variable" action, we're pushing a "do a DOD run" action. All that
> will do is go check to see if there are dead objects.

Ok, so that causes a DOD run on exit from new_fh(). All well and good.
But we also need a DOD run on exit from the { my $fh = new_fh(); } block.
Or indeed from any block that the filehandle object may eventually
directly or indirectly get passed back to. And I don't see how that can
happen.

--
Justice is when you get what you deserve.
Law is when you get what you pay for.

Piers Cawley

May 15, 2003, 3:10:44 PM
to Dan Sugalski, Benjamin Goldberg, perl6-i...@perl.org
Dave Mitchell <da...@fdgroup.com> writes:

As soon as a filehandle is instantiated (with some possible caveats
for things like C<open FH: "path/to/file" -> $line {...}> (ie, a
method that creates a file handle, iterates over it line by line
passing each line to the block, then closes the filehandle)) then the
filehandle class tells Perl to do a DOD run every time a scope is
exited from now until the program stops (though having a filehandle
decrement the 'open filehandles' count on closure, and have the class
withdraw its request for DOD on every scope exit when that count hits
zero makes sense). Something like:

    class Handle {
        my WholeNumber $open_handles = 0;
        sub open {
            ...
            Perl.request_frequent_DOD(Handle) unless $open_handles++;
        }
        sub close {
            ...
            Perl.request_lazy_DOD(Handle) unless --$open_handles;
        }
    }

    class Perl {

        my %timely_destruction_requests;

        sub request_frequent_DOD ($requester) {
            Scope.do_DOD_on_exit unless %timely_destruction_requests.keys;
            %timely_destruction_requests{$requester} = 1;
        }

        sub request_lazy_DOD ($requester) {
            delete %timely_destruction_requests{$requester};
            Scope.no_DOD_on_exit unless %timely_destruction_requests.keys;
        }
    }

--
Piers

Luke Palmer

May 15, 2003, 5:17:22 PM
to pdca...@bofh.org.uk, d...@sidhe.org, ben.go...@hotpop.com, perl6-i...@perl.org

You mean

for <open "path/to/file"> -> $line {...}

Right?

> then the filehandle class tells Perl to do a DOD run every time a
> scope is exited from now until the program stops (though having a
> filehandle decrement the 'open filehandles' count on closure, and
> have the class withdraw its request for DOD on every scope exit when
> that count hits zero makes sense). Something like:
>
> class Handle {
> my WholeNumber $open_handles = 0;
> sub open {
> ...
> Perl.request_frequent_DOD(Handle) unless $open_handles++;
> }
> sub close {
> ...
> Perl.request_lazy_DOD(Handle) unless --$open_handles;
> }
> }
>
> class Perl {
>
> my %timely_destruction_requests;
>
> sub request_frequent_DOD ($requester) {
> Scope.do_DOD_on_exit unless %timely_destruction_requests.keys
> %timely_destruction_requests{$requester} = 1;
> }
>
> sub request_lazy_DOD ($requester) {
> delete %timely_destruction_requests{$requester}
> Scope.no_DOD_on_exit unless %timely_destruction_requests.keys;
> }
> }

Um. If someone calls C<close> on a filehandle, it shouldn't do DOD,
it should close the file. I may just be misunderstanding your
code/description, but this:

    sub recurse($count) {
        if $count > 0 {
            recurse($count-1);
            recurse($count-1);
        }
    }

    sub foo() {
        my $fh = open "path/to/file";
        recurse(15);
        print ~<$fh>;
    }

    foo;

Had better not do 32769 DOD runs.
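
For a sense of scale, the number of subroutine returns in the example can be worked out with a small recurrence. The C sketch below assumes a DOD run is charged once per return from recurse() and that the if block is not counted as a scope of its own. The exact figure depends on which scopes are counted, and recurse_calls() is a name made up for this illustration.

```c
#include <assert.h>

/* Number of calls to recurse() made by recurse(count), including the
 * outermost one: calls(0) = 1, calls(n) = 1 + 2 * calls(n - 1). */
unsigned long recurse_calls(unsigned count)
{
    assert(count < 31);   /* keeps the result within 32 bits */
    if (count == 0)
        return 1;
    return 1 + 2 * recurse_calls(count - 1);
}
```

Under those assumptions recurse_calls(15) comes to 65535 returns, plus one more for foo itself.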

Why don't we just force open() to do a DOD run before it opens
anything? Then those nasty sync problems go (mostly) down the drain.
IPC might still have a problem with it... but I wonder if it would be
so bad in that case just to tell people to close() their handles
themselves.
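
A sketch of that collect-then-open idea in C, assuming a small fixed table of handles and a hand-set reachable flag in place of a real mark phase. All names here (table, dod_run(), open_count(), open_after_dod()) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HANDLES 4

typedef struct Handle {
    const char *path;
    int is_open;
    int reachable;   /* set by hand; a real collector would compute it */
} Handle;

static Handle table[MAX_HANDLES];

/* Stand-in for a DOD run: close handles nothing refers to. */
void dod_run(void)
{
    for (size_t i = 0; i < MAX_HANDLES; ++i)
        if (table[i].is_open && !table[i].reachable)
            table[i].is_open = 0;
}

/* Count the handles currently open on a path. */
size_t open_count(const char *path)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_HANDLES; ++i)
        if (table[i].is_open && strcmp(table[i].path, path) == 0)
            ++n;
    return n;
}

/* Collect first, then open. Returns NULL when the table is full. */
Handle *open_after_dod(const char *path)
{
    assert(path != NULL);
    dod_run();
    for (size_t i = 0; i < MAX_HANDLES; ++i) {
        if (!table[i].is_open) {
            table[i].path = path;
            table[i].is_open = 1;
            table[i].reachable = 1;
            return &table[i];
        }
    }
    return NULL;
}
```

A handle whose last reference has gone away stays open until the next open call, at which point it is closed before the new handle is created. The stale and fresh handles on the same path therefore never overlap, though anything waiting on the stale one between the two opens still waits.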

Luke

Piers Cawley

May 15, 2003, 5:53:01 PM
to Luke Palmer, d...@sidhe.org, ben.go...@hotpop.com, perl6-i...@perl.org
Luke Palmer <fibo...@babylonia.flatirons.org> writes:

No. I mean:

    method Handle::open (String $path, &block) {
        my $fh = $_.open($path);
        for <$fh> -> { &block($_) }
        close $fh;
    }

ie: Something which explicitly opens and closes the filehandle and
doesn't need any DESTROY semantics.


>> then the filehandle class tells Perl to do a DOD run every time a
>> scope is exited from now until the program stops (though having a
>> filehandle decrement the 'open filehandles' count on closure, and
>> have the class withdraw its request for DOD on every scope exit when
>> that count hits zero makes sense). Something like:
>>
>> class Handle {
>> my WholeNumber $open_handles = 0;
>> sub open {
>> ...
>> Perl.request_frequent_DOD(Handle) unless $open_handles++;
>> }
>> sub close {
>> ...
>> Perl.request_lazy_DOD(Handle) unless --$open_handles;
>> }
>> }
>>
>> class Perl {
>>
>> my %timely_destruction_requests;
>>
>> sub request_frequent_DOD ($requester) {
>> Scope.do_DOD_on_exit unless %timely_destruction_requests.keys
>> %timely_destruction_requests{$requester} = 1;
>> }
>>
>> sub request_lazy_DOD ($requester) {
>> delete %timely_destruction_requests{$requester}
>> Scope.no_DOD_on_exit unless %timely_destruction_requests.keys;
>> }
>> }
>
> Um. If someone calls C<close> on a filehandle, it shouldn't do DOD,

Well, as written it doesn't do a DOD. It says "If that was the last
filehandle being closed, stop doing DODs at the end of every scope".

> it should close the file. I may just be misunderstanding your
> code/description, but this:
>
> sub recurse($count) {
> if $count > 0 {
> recurse($count-1);
> recurse($count-1);
> }
> }
>
> sub foo() {
> my $fh = open "path/to/file";
> recurse(15);
> print ~<$fh>;
> }
>
> foo;
>
> Had better not do 32769 DOD runs.

It will do. Unless you have a solution to the Halting Problem written
down in a margin somewhere.

> Why don't we just force open() to do a DOD run before it opens
> anything? Then those nasty sync problems go (mostly) down the drain.
> IPC might still have a problem with it... but I wonder if it would be
> so bad in that case just to tell people to close() their handles
> themselves.

Personally I'd lean towards telling people to close their handles in
all cases and having done with it (well, that and providing a bunch of
helper methods which do the file handling for them, see any Smalltalk
image/Ruby for examples...). But that's not been the way Perl does it
in the past.

--
Piers

David Robins

May 15, 2003, 5:11:01 PM
to Piers Cawley, perl6-i...@perl.org
On Thu, 15 May 2003, Piers Cawley wrote:

> Personally I'd lean towards telling people to close their handles in
> all cases and having done with it (well, that and providing a bunch of
> helper methods which do the file handling for them, see any Smalltalk
> image/Ruby for examples...). But that's not been the way Perl does it
> in the past.

It's a little late in the game to change the GC but has a method like
Python's refcount-and-GC-unreachables (http://arctrix.com/nas/python/gc/)
been considered?

I think having timely (automatic) destruction is a very nice feature, and
wouldn't like to see it left out.

(What a can of worms I've opened here....)

Dave
Isa. 40:31

Luke Palmer

May 15, 2003, 6:30:17 PM
to pdca...@bofh.org.uk, d...@sidhe.org, ben.go...@hotpop.com, perl6-i...@perl.org
> > You mean
> >
> > for <open "path/to/file"> -> $line {...}
> >
> > Right?
>
> No. I mean:
>
> method Handle::open (String $path, &block) {
> my $fh = $_.open($path);
> for <$fh> -> { &block($_) }
> close $fh;
> }
>
> ie: Something which explicitly opens and closes the filehandle and
> doesn't need any DESTROY semantics.

Ahh.

> > it should close the file. I may just be misunderstanding your
> > code/description, but this:
> >
> > sub recurse($count) {
> > if $count > 0 {
> > recurse($count-1);
> > recurse($count-1);
> > }
> > }
> >
> > sub foo() {
> > my $fh = open "path/to/file";
> > recurse(15);
> > print ~<$fh>;
> > }
> >
> > foo;
> >
> > Had better not do 32769 DOD runs.
>
> It will do. Unless you have a solution to the Halting Problem written
> down in a margin somewhere.

Alright, there is a better solution than this. This is not the
perfect solution even if DOD took zero time, BTW. The reference to
the handle could get killed in the middle of a scope.

I think it would be good to associate handles with a special "dodme"
container. Upon every scope exit it would call a DOD. We need a
container that has a scope exit method anyway, because of
hypotheticals' siblings. Then in the recursion, since the recursing
sub has no idea the handle even exists, it doesn't need to run DOD.

If you end up storing a handle out in a data structure somewhere,
you're never going to get "timely" destruction semantics, because
there's simply no way of knowing when to run DOD except after every
statement. And I, personally, think that's fine. The C<close>
function has to get some use.

> > Why don't we just force open() to do a DOD run before it opens
> > anything? Then those nasty sync problems go (mostly) down the drain.
> > IPC might still have a problem with it... but I wonder if it would be
> > so bad in that case just to tell people to close() their handles
> > themselves.
>
> Personally I'd lean towards telling people to close their handles in
> all cases and having done with it (well, that and providing a bunch of
> helper methods which do the file handling for them, see any Smalltalk
> image/Ruby for examples...). But that's not been the way Perl does it
> in the past.

I'm not looking for a perfect solution to this problem, because I'm
quite certain one doesn't exist. Even refcounting has its drawbacks
(apparently so many that we've decided not to use it anymore). So I'm
looking for one that does what people want in the most cases without
being too inefficient or changing Parrot too much.

I call running DOD after *every* scope exit while there is a
filehandle open too inefficient. It's not a fundamental problem,
because there's only a few cases that need it. I'm sure people would
rather close handles themselves than pay that kind of efficiency
price. But let's make it so they have to do neither.

</rant>

Luke

Benjamin Goldberg

May 16, 2003, 3:32:12 AM5/16/03
to perl6-i...@perl.org
Dan Sugalski wrote:
> Benjamin Goldberg wrote:
>> Hildo Biersma wrote:
>>> Dave wrote:
>>>> Benjamin Goldberg wrote:
>>>>> At 3:28 PM -0400 5/5/03, David Robins wrote:
>>>>>> * Or do we? Suppose the destructor for a PMC closes a file. The
>>>>>> last ref to the PMC goes away, so it's _eligible_ for GC (but
>>>>>> not collected yet), and then that file is reopened... could
>>>>>> this be messy?
>>>>>
>>>>> Socket- and File- handles are the archetypical examples -- one
>>>>> generally needs/wants them to be cleaned up in a timely manner.
>>>
>>>> I thought that parrot, by using GC rather than refcounting, had
>>>> abandoned all attempts at timely calling of destructors on scope
>>>> exit?
>>>
>>> That doesn't mean that languages running on Parrot cannot enforce
>>> this.
>>>
>>> For example, a high-level language can put in "check ref count and
>>> close on zero" code on leaving any context scope that has defined a
>>> file descriptor (plus maybe also the enclosing subroutine). If that
>>> is combined with a similar check on assignment to variables of
>>> file-handle class, you can pretty much guarantee this behavior even
>>> with GC.
>>
>> In other words, all refcounting needs to be done manually?
>
> For parrot it'd be more like triggering a DOD run on scope exit if
> the scope allocated a variable that may have an active destructor and
> go out of scope on scope exit. (Though with continuations there's
> always the question of when scope *really* exits...) The current plan
> is to provide a means for the allocator for these sorts of variables
> to push a DOD-run-trigger on the call stack so a DOD run is triggered
> when the scope is cleaned up.

I've thought of a way of getting a combination of refcounting and DoD,
but I'm not sure how good it is.

First, there'd be a property/trait, "is refcounted", which can be
attached to variables. When assignments are made between two variables
with that trait, or when a variable with that trait is explicitly
undef()ed, refcounting is done a la perl5.

Maybe in slightly more places, since in perl5, we don't refcount values
on the parameter/return stack; in perl6, any function parameter or
return value without the "is rw" property could probably be refcounted,
I think.

When the contents of a refcounted variable are assigned to the contents
of a non-refcounted variable, something special now happens... A flag
(call this BIT_A for simplicity's sake) on the object gets set,
indicating that it *may* need to be cleaned up by the GC, instead of
pure-refcounting.

At the beginning of a DoD run, all objects with the BIT_A flag get a
second flag (call it BIT_B) set on them. Every time the DoD run comes
across a refcounted object, we check whether the variable that the
object was seen in has the "is refcounted" trait, and if *not*, clear
BIT_B. At the end of the DoD run, we copy the BIT_Bs into the BIT_As of
those objects which are still alive.

Any time a refcounted variable goes out of scope, the refcount of the
object in it gets decremented. If that refcount goes to 0, and it does
*not* have a BIT_A flag set, then it gets destructed *immediately*.

If an object's refcount reaches 0 due to its having been in a variable which
just went out of scope, and if it *does* have a BIT_A flag set... then
we trigger a DoD run, which might or might not result in the object
being cleaned up.

And for an additional sanity check, if a refcounted object is found to
be unreachable at the end of a DoD run, but it has a nonzero refcount,
then we throw an exception.
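
A minimal sketch of the scheme described above, written in Python purely for illustration; nothing here is Parrot code, and the names Obj, Var and Interp are made up. It models only the parts that decide *when* destruction happens: the refcount kept by "is refcounted" variables, the BIT_A flag set when a plain variable takes a reference, and the sanity check. Instead of the separate BIT_B pass, it recomputes BIT_A at the end of each run from whether any plain variable still holds the object. That is one reading of the intent, not a literal transcription.

```python
class Obj:
    """A value that wants timely destruction, e.g. a filehandle."""
    def __init__(self, name):
        self.name = name
        self.refcount = 0       # held by this many 'is refcounted' variables
        self.bit_a = False      # may also be held by a plain variable
        self.destroyed = False


class Var:
    def __init__(self, refcounted):
        self.refcounted = refcounted
        self.value = None


class Interp:
    def __init__(self):
        self.live_vars = []     # root set: variables currently in scope
        self.objects = []       # every object not yet destroyed
        self.dod_runs = 0

    def new_obj(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def declare(self, refcounted):
        var = Var(refcounted)
        self.live_vars.append(var)
        return var

    def assign(self, var, obj):
        old, var.value = var.value, obj
        if var.refcounted:
            obj.refcount += 1
            if old is not None:
                self._release(old)
        else:
            obj.bit_a = True    # the refcount alone can no longer be trusted

    def leave_scope(self, var):
        self.live_vars.remove(var)
        obj, var.value = var.value, None
        if var.refcounted and obj is not None:
            self._release(obj)

    def _release(self, obj):
        obj.refcount -= 1
        if obj.refcount == 0:
            if obj.bit_a:
                self.dod_run()      # a plain variable might still hold it
            else:
                self._destroy(obj)  # pure refcounting: destroy immediately

    def dod_run(self):
        self.dod_runs += 1
        held = [v for v in self.live_vars if v.value is not None]
        reachable = {v.value for v in held}
        in_plain = {v.value for v in held if not v.refcounted}
        for obj in list(self.objects):
            if obj not in reachable:
                if obj.refcount != 0:
                    raise RuntimeError(obj.name + ": unreachable, refcount != 0")
                self._destroy(obj)
            else:
                obj.bit_a = obj in in_plain   # refresh the flag

    def _destroy(self, obj):
        obj.destroyed = True
        self.objects.remove(obj)
```

A value that only ever lives in refcounted variables is destroyed on the spot with no DoD run at all. A value that has leaked into a plain variable costs one DoD run each time its refcount drops to zero.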

Does this sound like a reasonable idea, or am I off my rocker?

Stéphane Payrard

May 16, 2003, 9:24:01 AM5/16/03
to perl6-i...@perl.org
On Thu, May 15, 2003 at 10:10:51AM -0400, Dan Sugalski wrote:
> At 2:56 PM +0100 5/15/03, Dave Mitchell wrote:
> >On Thu, May 15, 2003 at 09:22:43AM -0400, Dan Sugalski wrote:
> >> At 2:02 PM +0100 5/15/03, Dave Mitchell wrote:
> >> >But surely for the typical perl6-written-in-a-perl5-style program,
> >> >ie where you don't do lots of explicit typing of vars and subs, most
> >> >vars may have an active destructor, and therefore most programs
> >> >will need to be hit with a DOD run at every scope exit ?
> >> >ie
> >> >
> >> > { my $x = foo() } # perhaps $x has a destructor? Who knows?
> >>
> >> Potentially, yeah, but generally they can be ignored. It's the
> >> (relatively) few things that affect the outside world--filehandles,
> >> db handles, expensive resource handles--that may need explicit
> >> cleanup quickly, and they can push a request on the stack. I'd not
> >> expect normal destructors to need that, rather have it a special case.
> >
> >Yes, but neither parrot nor the perl6 compiler can know which are the
> >special cases. Therefore it has to DOD on scope exit for *all* non-typed
> >lexicals, just in case it *might* hold a filehandle object. Or am I
> >missing something?

If a variable is not referenced outside the scope where it has been
declared, is it possible that the compiler would emit the code to free
the variable? For these special cases, it would buy us clean-up at
scope exit. I suspect that, most of the time, these are the very
cases when people expect that behavior.

This certainly flies in the face of implementation purity but that has
never been a concern in Perl circles that are more concerned by
pragmatism.

I am not sure what the impact of exceptions or continuations on such a
possibility would be.

--
stef

Garrett Goebel

May 16, 2003, 10:13:20 AM5/16/03
to Piers Cawley, Luke Palmer, dbro...@davidrobins.net, d...@sidhe.org, perl6-i...@perl.org
Aren't you all talking past one another? It sounds like Dan's already agreed
to make a special case for vars which need timely destruction. The real
issue isn't whether or not to allow the special case, but how to minimize
the cost.

Dan Sugalski wrote:
>
> For parrot it'd be more like triggering a DOD run on scope
> exit if the scope allocated a variable that may have an active
> destructor and go out of scope on scope exit. (Though with
> continuations there's always the question of when scope
> *really* exits...) The current plan is to provide a means for
> the allocator for these sorts of variables to push a DOD-run-
> trigger on the call stack so a DOD run is triggered when the
> scope is cleaned up.

Okay. So a "special" var which goes out-of-scope and "may have an active
destructor" winds up triggering a DOD-run. I.e., the var's allocator pushes
a DOD-run-trigger on the call stack.

Does it follow that the var's destructor if called explicitly before the
scope exits would pop the DOD-run-trigger off the call stack? So we can skip
the extra DOD-run if we know there's no active destructor...

Oh, and if one allocates several special vars in a given scope without
calling their destructors explicitly... is parrot really going to push a
DOD-run-trigger on the call stack for each special var or each scope
containing special vars?

--
Garrett Goebel
IS Development Specialist

ScriptPro Direct: 913.403.5261
5828 Reeds Road Main: 913.384.1008
Mission, KS 66202 Fax: 913.384.2180
www.scriptpro.com garrett at scriptpro dot com

Luke Palmer

May 16, 2003, 2:28:55 PM5/16/03
to st...@payrard.net, perl6-i...@perl.org
> If a variable is not referenced outside the scope where it has been
> declared, is it possible that the compiler would emit the code to free
> the variable? For these special cases, it would buy us clean-up at
> scope exit. I suspect that, most of the time, these are the very
> cases when people expect that behavior.

Well, provided nobody closes on the scope, or accesses its lexical pad,
or any of that stuff that we can't guarantee. The destructor DOD push
handles this special case and several more general cases.

> This certainly flies in the face of implementation purity but that has
> never been a concern in Perl circles that are more concerned by
> pragmatism.

I think we're willing to trade purity for a little bit of DWIMity.
Perl has done that in the past, successfully.

> I am not sure what the impact of exceptions or continuations on such a
> possibility would be.

Exceptions are easy---you clean up if the stack is unwound through the
sub, just like in C++ or any other language with exceptions.

Continuations are a kind of super closure, so the method breaks down
there as well. Again, destructor DOD push works pretty well with
these.

Luke

Benjamin Goldberg

May 16, 2003, 9:06:57 PM5/16/03
to perl6-i...@perl.org

Stéphane Payrard wrote:
>
> On Thu, May 15, 2003 at 10:10:51AM -0400, Dan Sugalski wrote:
> > At 2:56 PM +0100 5/15/03, Dave Mitchell wrote:
> > >On Thu, May 15, 2003 at 09:22:43AM -0400, Dan Sugalski wrote:
> > >> At 2:02 PM +0100 5/15/03, Dave Mitchell wrote:
> > >> >But surely for the typical perl6-written-in-a-perl5-style program,
> > >> >ie where you don't do lots of explicit typing of vars and subs, most
> > >> >vars may have an active destructor, and therefore most programs
> > >> >will need to be hit with a DOD run at every scope exit ?
> > >> >ie
> > >> >
> > >> > { my $x = foo() } # perhaps $x has a destructor? Who knows?
> > >>
> > >> Potentially, yeah, but generally they can be ignored. It's the
> > >> (relatively) few things that affect the outside world--filehandles,
> > >> db handles, expensive resource handles--that may need explicit
> > >> cleanup quickly, and they can push a request on the stack. I'd not
> > >> expect normal destructors to need that, rather have it a special case.
> > >
> > >Yes, but neither parrot nor the perl6 compiler can know which are the
> > >special cases. Therefore it has to DOD on scope exit for *all* non-typed
> > >lexicals, just in case it *might* hold a filehandle object. Or am I
> > >missing something?
>
> If a variable is not referenced outside the scope where it has been
> declared, is it possible that the compiler would emit the code to free
> the variable?

Please don't confuse value and variable, cause that can make the
discussion more confused than it already is :P

With my idea, if the value is never referenced through a variable
which doesn't have the 'is refcounted' property, then the instant it
goes out of scope everywhere it's referenced, the value gets cleaned
up.

In your case, which is a subset of that, yes indeed it would get freed
as soon as the variable goes out of scope.

> For these special cases, it would buy us clean-up at
> scope exit. I suspect that, most of the time, these are the very
> cases when people expect that behavior.
>
> This certainly flies in the face of implementation purity but that has
> never been a concern in Perl circles that are more concerned by
> pragmatism.

IIRC, in perl6, we're allowed to write optree filters which are like
source filters, but are passed the optree instead of text source, and
which then mangle that optree.

Thus, dealing with objects which need timely destruction could be
done with pure-perl6 code -- first a user-defined 'is refcounted'
property sets a flag on each variable, then the optree filter is used
to insert appropriate "perform refcount-inc", "perform refcount-dec",
"push onto the perform-on-scope-exit stack a 'perform refcount-dec'
request", and "mark as potentially needing to be killed through a DoD
run" code at appropriate locations.

> I am not sure what the impact of exceptions or continuations on such a
> possibility would be.

As Luke Palmer said, when there's an exception, every callback that's in
the perform-on-scope-exit stack, up to the scope where the exception
got caught, gets run.
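
The unwinding behaviour can be sketched in Python, using contextlib.ExitStack from the standard library as a stand-in for the perform-on-scope-exit stack. This is only an analogy, not how Parrot or perl6 would implement it, and the function names and log strings are made up for illustration.

```python
from contextlib import ExitStack


def run():
    log = []

    def inner():
        with ExitStack() as scope:
            # Stand-in for "push a 'perform refcount-dec' request".
            scope.callback(log.append, "inner exit")
            raise RuntimeError("boom")

    def outer():
        with ExitStack() as scope:
            scope.callback(log.append, "outer exit")
            inner()

    with ExitStack() as scope:
        scope.callback(log.append, "catching scope exit")
        try:
            outer()
        except RuntimeError:
            log.append("caught")
        at_catch = list(log)    # what had run by the time we caught it
    return at_catch, log
```

Every scope the exception passes through runs its callbacks, innermost first. The scope that catches the exception keeps its own callbacks pending until it exits normally.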

> > Yep. You're missing the non-automaticness of this. If the filehandle
> > class decides that cleanup on scope exit is the right thing to do,
> > then the constructor for that class pushes a scope exit action into
> > its caller. Parrot doesn't do that automatically.

Right -- it's the perl6 compiler's task to insert appropriate code to
do this, not parrot's.

The only parrotish thing which is relevant here is that a language might
want to define its own opcodes for things it does very frequently, and
we need to have a way of switching between opcode lookup tables for when
we're running bits and pieces of code which were written in different
languages.

Once that's done, perl6 for parrot can define one opcode each for
"refcountinc Px", "refcountdec Px", etc., and python for parrot can
define its own versions of those opcodes, and parrot will call the
correct implementation when it sees a "refcountinc" opcode in a perl6
subroutine or in a python subroutine.
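
A toy illustration of per-language opcode tables, in Python. Everything in it is hypothetical: the table layout, the Sub class and the op bodies are invented to show the dispatch idea, and say nothing about how Parrot actually stores or switches opcode tables.

```python
def perl6_refcountinc(px):
    px["refs"] = px.get("refs", 0) + 1
    return "perl6"      # report which implementation ran


def perl6_refcountdec(px):
    px["refs"] = px.get("refs", 0) - 1
    return "perl6"


def python_refcountinc(px):
    px["refs"] = px.get("refs", 0) + 1
    return "python"


def python_refcountdec(px):
    px["refs"] = px.get("refs", 0) - 1
    return "python"


# One lookup table per source language; same opcode names, different bodies.
OPCODE_TABLES = {
    "perl6": {"refcountinc": perl6_refcountinc,
              "refcountdec": perl6_refcountdec},
    "python": {"refcountinc": python_refcountinc,
               "refcountdec": python_refcountdec},
}


class Sub:
    def __init__(self, language, code):
        self.language = language
        self.code = code    # list of (opcode name, operand) pairs


def run(sub):
    # The table is chosen once per sub, based on the language it came from.
    table = OPCODE_TABLES[sub.language]
    return [table[op](operand) for op, operand in sub.code]
```

The same opcode name resolves to a different function depending on which language's sub is executing.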

Fortunately, Dan Sugalski has plans for bundling up the opcode function
table as part of the continuation object when we do makecontext/callcc.

Benjamin Goldberg

May 16, 2003, 9:10:55 PM5/16/03
to perl6-i...@perl.org
Garrett Goebel wrote:
[snip]

> Oh, and if one allocates several special vars in a given scope without
> calling their destructors explicitly... is parrot really going to push
> a DOD-run-trigger on the call stack for each special var or each scope
> containing special vars?

Parrot will do whatever it's told to do -- the real question is, will
the *perl6 compiler* insert code to push a DOD-run-trigger on the call
stack for each ....?

Probably not -- perl6 should be smart enough to avoid pushing two or
more DOD-run-triggers on the call stack within a given scope.
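
A rough sketch of what "one trigger per scope" could look like, in Python. The Scope and CallStack classes are invented for illustration. The pending counter, which lets an explicit destructor call cancel the need for a DOD run, is an assumption prompted by Garrett's question, not something anyone has said Parrot will do.

```python
class Scope:
    def __init__(self):
        self.exit_actions = []      # run in LIFO order when the scope ends
        self.trigger_pushed = False
        self.pending = 0            # special vars not yet explicitly destroyed


class CallStack:
    def __init__(self):
        self.scopes = []
        self.dod_runs = 0

    def enter_scope(self):
        self.scopes.append(Scope())

    def special_var_created(self):
        scope = self.scopes[-1]
        scope.pending += 1
        if not scope.trigger_pushed:    # at most one trigger per scope
            scope.trigger_pushed = True
            scope.exit_actions.append(lambda: self._maybe_dod(scope))

    def special_var_destroyed(self):
        # Called when a destructor is invoked explicitly before scope exit.
        self.scopes[-1].pending -= 1

    def leave_scope(self):
        scope = self.scopes.pop()
        while scope.exit_actions:
            scope.exit_actions.pop()()

    def _maybe_dod(self, scope):
        if scope.pending > 0:
            self.dod_runs += 1          # stand-in for a real DOD run
```

Three special vars in one scope cost a single DOD run at exit. A scope whose special vars were all destroyed explicitly, or which never had any, costs none.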

Piers Cawley

May 19, 2003, 6:48:35 AM5/19/03
to Luke Palmer, d...@sidhe.org, ben.go...@hotpop.com, perl6-i...@perl.org
Luke Palmer <fibo...@babylonia.flatirons.org> writes:
> Alright, there is a better solution than this. This is not the
> perfect solution even if DOD took zero time, BTW. The reference to
> the handle could get killed in the middle of a scope.

Which is fine. Perl 5 guarantees that the DESTROY stuff will be
triggered at the end of the containing scope.

> I think it would be good to associate handles with a special "dodme"
> container. Upon every scope exit it would trigger a DOD run. We need a
> container that has a scope exit method anyway, because of
> hypotheticals' siblings. Then in the recursion, since the recursing
> sub has no idea the handle even exists, it doesn't need to run DOD.
>
> If you end up storing a handle out in a data structure somewhere,
> you're never going to get "timely" destruction semantics, because
> there's simply no way of knowing when to run DOD except after every
> statement. And I, personally, think that's fine. The C<close>
> function has to get some use.

Ah... gotcha. Actually, it should be possible to write things so that:

$arbitrary_untyped_variable = $something_which_needs_timely_destruction;

copies the DODme property to $arbitrary_untyped_variable as
well. As you point out, things become rather more complicated when you
do, say:

@foo[0] = $needs_timely_destruction;

Though there's probably enough information kicking around for the
DODme to get attached in the 'right' place. Trouble is, things start
getting complicated rather quickly.

>> > Why don't we just force open() to do a DOD run before it opens
>> > anything? Then those nasty sync problems go (mostly) down the drain.
>> > IPC might still have a problem with it... but I wonder if it would be
>> > so bad in that case just to tell people to close() their handles
>> > themselves.
>>
>> Personally I'd lean towards telling people to close their handles in
>> all cases and having done with it (well, that and providing a bunch of
>> helper methods which do the file handling for them, see any Smalltalk
>> image/Ruby for examples...). But that's not been the way Perl does it
>> in the past.
>
> I'm not looking for a perfect solution to this problem, because I'm
> quite certain one doesn't exist. Even refcounting has its drawbacks
> (apparently so many that we've decided not to use it anymore). So I'm
> looking for one that does what people want in the most cases without
> being too inefficient or changing Parrot too much.

According to at least one reference I've read, the only thing that
refcounting has going for it is that it's simple to
implement. It apparently turns out to be computationally expensive
because of the overhead of updating and checking refcounts at the end
of every scope. ISTR it can really mess with cache integrity as
well...

Anyhow, I'd argue that if we're going to commit to doing the same
thing as Perl 5 does regarding timely destruction, then that's exactly
what we should commit to. That means that filehandles held in deeply
nested datastructures should see the same timely destruction that
they'd see if they were held in a simple scalar, and that means either
DOD runs at every scope exit so long as there are such objects in the
pool, or fun with property propagation, or 'pseudo DOD runs' where
first you walk the tree with the variables that are going out of scope
as your rootset and trigger a full DOD run iff you reach something
with a DODme property (actually, that may not be an awful idea...).
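
The 'pseudo DOD run' can be sketched like this in Python. The Node class and the function names are hypothetical; the point is only that the walk starts from the variables going out of scope rather than from the full rootset, and that the expensive full run happens only if that walk reaches something flagged DODme.

```python
class Node:
    """Any value: a scalar, an array, a hash, a filehandle..."""
    def __init__(self, dodme=False, children=()):
        self.dodme = dodme              # needs timely destruction
        self.children = list(children)


def reaches_dodme(roots):
    """Walk everything reachable from roots; stop at the first DODme node."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if id(node) in seen:            # cope with cyclic structures
            continue
        seen.add(id(node))
        if node.dodme:
            return True
        stack.extend(node.children)
    return False


def on_scope_exit(outgoing_vars, full_dod_run):
    """Trigger a full DOD run iff a DODme object hangs off the dying vars."""
    if reaches_dodme(outgoing_vars):
        full_dod_run()
        return True
    return False
```

The cost of the small walk is proportional to what the dying variables can reach, so a scope holding only plain numbers and strings pays almost nothing, while a filehandle buried three levels deep in a data structure still gets noticed.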

> I call running DOD after *every* scope exit while there is a
> filehandle open too inefficient. It's not a fundamental problem,
> because there's only a few cases that need it. I'm sure people would
> rather close handles themselves than pay that kind of efficiency
> price. But let's make it so they have to do neither.

Well, I'd argue that we should provide a bunch of the same kind of
utility methods one sees with the likes of Ruby and Smalltalk that
encapsulate the open/close stuff in sensible methods which preempt the
need for timely destruction without forcing the programmer to close
files explicitly.
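
For what such a utility method might look like, here is a small Python sketch in the spirit of Ruby's File.open with a block. The with_file name and its signature are made up for illustration; a perl6 version would presumably take a closure in much the same way.

```python
import os
import tempfile


def with_file(path, mode, body):
    """Open path, hand the handle to body, and always close it afterwards."""
    fh = open(path, mode)
    try:
        return body(fh)
    finally:
        fh.close()      # runs on normal return and on exceptions alike


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    seen = []

    def read_all(fh):
        seen.append(fh)     # keep the handle so we can inspect it later
        return fh.read()

    try:
        with_file(path, "w", lambda fh: fh.write("hello\n"))
        text = with_file(path, "r", read_all)
        return text, seen[0].closed
    finally:
        os.remove(path)
```

The caller never holds the handle outside the helper, so the file is closed at a known point without any help from the garbage collector.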

--
Piers
