Lazy DOD

Luke Palmer

May 24, 2003, 7:37:15 AM

to perl6-i...@perl.org

Consider this program:

my $fh = open 'foobar';
main_program_which_has_nothing_to_do_with_fh();
print $fh: "Done\n";

Under the current Lazy DOD, every single scope exit in the main
program will suffer.

So, how about we have a similar flag on PMCs to replace the current
one (even if it's named the same thing :-) which will trigger a DOD
run whenever *a* reference is lost to this PMC. This allows code that
doesn't need timely destruction (even if there's still a variable
around that wants it, which will probably *always* be the case) not to
pay for it.

The costs of such a thing require checking a PMC flag in every op
which has an out pmc argument. Is this acceptable?
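
The cost in question can be pictured with a toy model. This is a minimal Perl 5 sketch, not Parrot code: `new_pmc`, `set_register`, the `timely` flag and the `$dod_runs` counter are all hypothetical names, and a "DOD run" is only counted, not performed.

```perl
use strict;
use warnings;

# Toy model only: none of these names are Parrot APIs.
my @registers;        # stand-in for the PMC registers
my $dod_runs = 0;     # how many DOD runs would have been triggered

sub new_pmc {
    my (%args) = @_;
    return { name => $args{name}, timely => $args{timely} ? 1 : 0 };
}

# Every op with an "out" PMC argument would need a check like this one.
sub set_register {
    my ($index, $pmc) = @_;
    my $old = $registers[$index];
    # The extra cost: one flag test each time a register is overwritten.
    $dod_runs++ if $old && $old->{timely};
    $registers[$index] = $pmc;
    return;
}

set_register(0, new_pmc(name => 'counter'));
set_register(0, new_pmc(name => 'fh', timely => 1));   # overwrites an unflagged PMC
set_register(0, new_pmc(name => 'other'));             # overwrites the flagged PMC
print "DOD runs triggered: $dod_runs\n";
```

Only the last assignment loses a reference to a flagged PMC, so only it would trigger a run. The flag test itself is paid on every assignment.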

Luke

Leopold Toetsch

May 24, 2003, 9:34:53 AM

to Luke Palmer, perl6-i...@perl.org

Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
> Consider this program:

> my $fh = open 'foobar';
> main_program_which_has_nothing_to_do_with_fh();
> print $fh: "Done\n";

> Under the current Lazy DOD, every single scope exit in the main
> program will suffer.

I don't think so. Why should the perl6 compiler insert lazysweep opcodes
in code where no filehandle is used? When no filehandle is used, none
can go out of scope, so...

It could be different when $fh goes into main_prog... but, as $fh is
used again after main_prog, it's also senseless to insert lazysweep
opcodes.

> The costs of such a thing require checking a PMC flag in every op
> which has an out pmc argument. Is this acceptable?

IMHO too expensive.

> Luke

leo

Graham Barr

May 25, 2003, 3:24:52 AM

to Leopold Toetsch, Luke Palmer, perl6-i...@perl.org

On Sat, May 24, 2003 at 03:34:53PM +0200, Leopold Toetsch wrote:
> Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
> > Consider this program:
>
> > my $fh = open 'foobar';
> > main_program_which_has_nothing_to_do_with_fh();
> > print $fh: "Done\n";
>
> > Under the current Lazy DOD, every single scope exit in the main
> > program will suffer.
>
> I don't think so. Why should the perl6 compiler insert lazysweep opcodes
> in code, where no filehandle is used? When no filehandle is used, no one
> can get out of scope, so...

Really ?

sub foo (%hash) {
...
delete %hash{value_is_a_filehandle};
...
}

Not a filehandle in sight.

Just because code does not explicitly use a filehandle does not mean that one
will not be affected by its actions. If you want timely destruction of filehandles
then you must call DOD on every scope exit, unless you know that a scope cannot
affect anything outside the variables it introduces itself, and such scopes are very rare.
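
The same effect can be watched in Perl 5, where refcounting makes the destruction immediate. A minimal sketch: `LoggedHandle` is a hypothetical stand-in for a filehandle whose `DESTROY` plays the role of close(), and `foo` takes a hash reference rather than using the Perl 6 signature above.

```perl
use strict;
use warnings;

my @events;   # destruction log

{
    # Hypothetical stand-in for a filehandle; DESTROY plays the role of close().
    package LoggedHandle;
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @events, "closed $self->{name}" }
}

sub foo {
    my ($hash) = @_;
    delete $hash->{value_is_a_filehandle};   # no filehandle in sight
    return;
}

my %registry = (value_is_a_filehandle => LoggedHandle->new('foobar'));
my $before = scalar @events;   # the hash still references the handle
foo(\%registry);
my $after = scalar @events;    # the last reference was dropped inside foo()
print "destroyed before call: $before, after call: $after\n";
```

Nothing in `foo` names a filehandle, yet the handle is destroyed inside it.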

> > The costs of such a thing require checking a PMC flag in every op
> > which has an out pmc argument. Is this acceptable?
>
> IMHO too expensive.

I have yet to see an inexpensive solution to timely destruction.

Graham.

Luke Palmer

May 25, 2003, 5:06:05 AM

to gb...@pobox.com, l...@toetsch.at, perl6-i...@perl.org

> On Sat, May 24, 2003 at 03:34:53PM +0200, Leopold Toetsch wrote:
> > Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
> > > Consider this program:
> >
> > > my $fh = open 'foobar';
> > > main_program_which_has_nothing_to_do_with_fh();
> > > print $fh: "Done\n";
> >
> > > Under the current Lazy DOD, every single scope exit in the main
> > > program will suffer.
> >
> > I don't think so. Why should the perl6 compiler insert lazysweep opcodes
> > in code, where no filehandle is used? When no filehandle is used, no one
> > can get out of scope, so...
>
> Really ?
>
> sub foo (%hash) {
> ...
> delete %hash{value_is_a_filehandle};
> ...
> }
>
> Not a filehandle in sight.

Plus, since Perl is untyped, it has no idea whether something is or
isn't a filehandle at code generation time.

So, my DOD on loss of reference might get this case, if the filehandle
was loaded into a register before being deleted.... Then when the
register lost it, it would DOD.

There are, of course, cases where this fails (as, I believe, does any
timely destruction method without refcounting, and even refcounting
can't handle circular thingies). But the idea behind this, er, idea
is that it will get 95% of the cases correct, and let a regular DOD
catch the other 5% (eventually).
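
The remark about circular structures is easy to reproduce in Perl 5. A minimal sketch, with a hypothetical `Node` class whose `DESTROY` only records its name; `weaken` comes from the core module Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my @destroyed;   # names of nodes whose DESTROY has run

{
    package Node;   # hypothetical class, for illustration only
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @destroyed, $self->{name} }
}

{
    my $x = Node->new('x');
    my $y = Node->new('y');
    $x->{peer} = $y;
    $y->{peer} = $x;              # cycle: x <-> y
}
my $after_cycle = scalar @destroyed;   # refcounts never reached zero

{
    my $p = Node->new('p');
    my $q = Node->new('q');
    $p->{peer} = $q;
    $q->{peer} = $p;
    weaken($q->{peer});           # the back-reference no longer counts
}
my $after_weak = scalar @destroyed;    # both nodes destroyed at scope exit

print "destroyed after plain cycle: $after_cycle, after weakened cycle: $after_weak\n";
```

The first pair is only reclaimed at interpreter shutdown, which is the kind of case a tracing DOD run picks up and refcounting alone does not.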

> Just because code does not explicitly use a filehandle does not mean
> that one will not be affected by its actions. If you want timely
> destruction of filehandles then you must call DOD on every scope
> exit, unless you know that a scope cannot affect anything outside
> the variables it introduces itself, which are very rare

Unless you're a good functional programmer... but that's not the
average Perl hacker.

> > > The costs of such a thing require checking a PMC flag in every op
> > > which has an out pmc argument. Is this acceptable?
> >
> > IMHO too expensive.
>
> I have yet to see an inexpensive solution to timely destruction.

I'd personally trade timely destruction in favor of speed, but I also
see an argument for strong semantics. Whatever's best in the eyes of
the king...

Also, as I've said before, the problem with filehandles (neglecting
IPC) is trivially solved by adding a DOD before each C<open>. But
it's no good as a general solution.

Luke

Leopold Toetsch

May 25, 2003, 6:47:04 AM

to Luke Palmer, perl6-i...@perl.org

Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
>> On Sat, May 24, 2003 at 03:34:53PM +0200, Leopold Toetsch wrote:
>> > Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
>> > > Consider this program:
>> >
>> > > my $fh = open 'foobar';
>> > > main_program_which_has_nothing_to_do_with_fh();
>> > > print $fh: "Done\n";
>> >
>> > > Under the current Lazy DOD, every single scope exit in the main
>> > > program will suffer.
>> >
>> > I don't think so. Why should the perl6 compiler insert lazysweep opcodes
>> > in code, where no filehandle is used? When no filehandle is used, no one
>> > can get out of scope, so...
>>
>> Really ?
>>
>> sub foo (%hash) {
>> ...
>> delete %hash{value_is_a_filehandle};
>> ...
>> }
>>
>> Not a filehandle in sight.

> Plus, since Perl is untyped, it has no idea whether something is or
> isn't a filehandle at code generation time.

Still, my second argument counts, I think. As $fh is used in line 1 and
line 3, the filehandle can't become unused in the main routine in the
middle. Untyped or not, $fh is known to hold a filehandle, because it's
used as the LHS of an "open".

But of course for a slightly more complicated example the compiler will
be unable to follow the usage of the filehandle so that the lazysweeps
(or refcounting or whatever) are necessary.

> Luke

leo

Gopal V

May 25, 2003, 9:33:29 AM

to perl6-i...@perl.org

If memory serves me right, Leopold Toetsch wrote:
> But of course for a slightly more complicated example the compiler will
> be unable to follow the usage of the filehandle so that the lazysweeps
> (or refcounting or whatever) are necessary.

Following the type is a halting problem... which is why some
runtimes like the JVM and IL enforce that all paths to a location produce
the same local & stack types.

IMHO the best option here would be to let the programmer close the file
handle explicitly and otherwise catch them in lazy sweeps.

"If you want timely destruction, do it yourself"... ;-)

Of course, it might be just that I hate things which are hard to
control or trace and debug...

(lacking in parrot jargon, I assume a DOD is the counterpart of a
Finalize run, correct me if I'm wrong)

Gopal
--
The difference between insanity and genius is measured by success

Leopold Toetsch

May 25, 2003, 11:23:20 AM

to Gopal V, perl6-i...@perl.org

Gopal V <gopa...@symonds.net> wrote:

> IMHO the best option here would be to let the programmer close the file
> handle explicitly and else to catch them in lazy sweeps .

> "If you want timely destruction , do it yourself"... ;-)

If you want a fast program ... probably. But since Perl specifies that files
are closed on scope exit (when the last reference to the file handle goes
out of scope), perl and parrot must have some means to handle this.

> (lacking in parrot jargon , I assume a DOD is the counterpart of a
> Finalize run, correct me if I'm wrong)

DOD = Dead Object Detection. When a PMC is found unused ("dead") and has
the active_destroy_FLAG set, then the destroy handler gets called, which
frees PMC specific memory or closes a file or shuts down a timer. This
is similar to the destructor in C++. A DOD run normally is not done on
scope exit, but when there is a shortage in some resources.
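
As a rough picture of that description, here is a toy mark-and-sweep pass in Perl 5. It is a sketch under loud assumptions: `@arena`, `@roots`, `new_pmc` and `dod_run` are illustrative names, not Parrot data structures, and the "destroy handler" merely records the object's name.

```perl
use strict;
use warnings;

# Toy model: "arena" and "roots" are illustrative, not Parrot internals.
my @arena;    # every allocated object
my @roots;    # objects reachable directly (registers, lexical pads, ...)
my @closed;   # names of objects whose destroy handler ran

sub new_pmc {
    my (%args) = @_;
    my $pmc = {
        name           => $args{name},
        refs           => [],                            # outgoing references
        active_destroy => $args{active_destroy} ? 1 : 0,
        live           => 0,
    };
    push @arena, $pmc;
    return $pmc;
}

sub mark {
    my ($pmc) = @_;
    return if $pmc->{live};
    $pmc->{live} = 1;
    mark($_) for @{ $pmc->{refs} };
    return;
}

sub dod_run {
    $_->{live} = 0 for @arena;
    mark($_) for @roots;
    my @survivors;
    for my $pmc (@arena) {
        if ($pmc->{live}) { push @survivors, $pmc; next }
        # Dead object: only flagged ones get their destroy handler called.
        push @closed, $pmc->{name} if $pmc->{active_destroy};
    }
    @arena = @survivors;
    return;
}

my $pad = new_pmc(name => 'pad');
my $fh  = new_pmc(name => 'fh', active_destroy => 1);
my $num = new_pmc(name => 'num');
push @{ $pad->{refs} }, $fh, $num;
push @roots, $pad;

dod_run();                      # everything reachable: nothing destroyed
@{ $pad->{refs} } = ($num);     # drop the only reference to fh
my $before = scalar @closed;    # no DOD run has happened since the drop
dod_run();                      # fh is found dead, its handler runs
print "closed before run: $before, closed after run: @closed\n";
```

The handle becomes unreachable when the reference is dropped, but nothing happens to it until the next run. That gap is what the timely-destruction discussion is about.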

> Gopal

leo

Gopal V

May 25, 2003, 3:18:11 PM

to perl6-i...@perl.org

If memory serves me right, Leopold Toetsch wrote:

> If you want a fast program ... probably. As Perl defines, that files are
> closed on scope exit (when the last reference to the file handle gets
> out of scope), perl and parrot must have some means to handle this.

Let me get this straight... "on" scope exit? Or just "after" scope
exit? And if it's possible, how does Perl 5 handle this sort of
behaviour?

I had a look to see if it really works that way with perl5 ...

open $somefile,"<test" or die;

And did an strace of perl5's run

read(3, "open $somefile,\"<test\" or die;\n", 4096) = 31
read(3, "", 4096) = 0
close(3) = 0
munmap(0x40027000, 4096) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
open("test", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
fcntl64(0x3, 0x2, 0x1, 0) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
_exit(0) = ?

I could not find where the "test" is closed ..

> DOD = Dead Object Detection. When a PMC is found unused ("dead") and has
> the active_destroy_FLAG set, then the destroy handler gets called, which
> frees PMC specific memory or closes a file or shuts down a timer. This
> is similar to the destructor in C++. A DOD run normally is not done on
> scope exit, but when there is a shortage in some resources.

Yes, a finalizer... Like, I could have a finalizer and a deallocator,
of which the finalizer is called before deallocation by the garbage
collector. Running around and calling the destroy handler is exactly
what I meant.

Another TLA entered into database :-)

> leo

Dan Sugalski

May 25, 2003, 3:39:16 PM

to Luke Palmer, perl6-i...@perl.org

At 5:37 AM -0600 5/24/03, Luke Palmer wrote:
>Consider this program:
>
> my $fh = open 'foobar';
> main_program_which_has_nothing_to_do_with_fh();
> print $fh: "Done\n";
>
>Under the current Lazy DOD, every single scope exit in the main
>program will suffer.

Maybe, yep.

>So, how about we have a similar flag on PMCs to replace the current
>one (even if it's named the same thing :-) which will trigger a DOD
>run whenever *a* reference is lost to this PMC. This allows code that
>doesn't need timely destruction (even if there's still a variable
>around that wants it, which will probably *always* be the case) not to
>pay for it.
>
>The costs of such a thing require checking a PMC flag in every op
>which has an out pmc argument. Is this acceptable?

Nope, unfortunately not. I think everyone's gone over the problems,
plus it puts the burden on sections of code that don't guarantee any
sort of timely destruction, which I'd like to avoid.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Dan Sugalski

May 25, 2003, 3:50:39 PM

to Gopal V, perl6-i...@perl.org

At 7:03 PM +0530 5/25/03, Gopal V wrote:
>If memory serves me right, Leopold Toetsch wrote:
>> But of course for a slightly more complicated example the compiler will
>> be unable to follow the usage of the filehandle so that the lazysweeps
>> (or refcounting or whatever) are necessary.
>
>Following the type is a halting problem ... which is why some of the
>languages like JVM and IL enforce that all paths to a location produce
>the same local & stack types ..
>
>IMHO the best option here would be to let the programmer close the file
>handle explicitly and else to catch them in lazy sweeps .
>
>"If you want timely destruction , do it yourself"... ;-)

Love to, but we can't. Perl 5 effectively guarantees timely
destruction. If Larry mandates that for Perl 6, then we have to do
it, like it or not. Or at least provide a way that perl 5 and 6 code
can make it happen, which I think the scheme I checked in does. The
fact that it can be turned *off* is the nice bit. The fact that full
sweeps can be darned pricey is the not-nice bit. A generational GC
scheme might make this better, I'm not sure.

>Of course , it might be just that I hate things which are hard to
>control or trace and debug...
>
>(lacking in parrot jargon , I assume a DOD is the counterpart of a
> Finalize run, correct me if I'm wrong)

It's when we sweep the arenas looking for objects that are unreferenced.

Dan Sugalski

May 25, 2003, 3:57:31 PM

to Gopal V, perl6-i...@perl.org

At 12:48 AM +0530 5/26/03, Gopal V wrote:
>If memory serves me right, Leopold Toetsch wrote:
>> If you want a fast program ... probably. As Perl defines, that files are
>> closed on scope exit (when the last reference to the file handle gets
>> out of scope), perl and parrot must have some means to handle this.
>
>Let me get this straight ... "on" scope exit ? or just "after" scope
>exit ? ... And if it's possible , how does Perl 5 handle this sort of
>behaviour ?.

It's part of scope exit. When the scratchpad for the lexicals gets
deleted on scope exit (because exiting decrements its refcount to
zero) all the refcounts of the variables in the scratchpad get
decremented, and any of those that go to zero get cleaned up.

Perl generally doesn't do anything special when you fall off the end
of the world. It's a shortcut, of sorts. Try throwing the code in a
block, making sure you use a lexical filehandle, and see what happens.
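
That experiment can be written as a short Perl 5 script. A sketch, assuming a POSIX system where open() hands out the lowest free descriptor; the scratch file from File::Temp is only there so the example does not depend on an existing path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so the example does not depend on any existing path.
my ($tmp, $path) = tempfile(UNLINK => 1);

my ($first_fd, $second_fd);
{
    open my $fh, '<', $path or die "open: $!";
    $first_fd = fileno $fh;
}   # the last reference to $fh is dropped here, so perl closes the descriptor

{
    open my $fh, '<', $path or die "open: $!";
    $second_fd = fileno $fh;   # lowest free descriptor at this point
}

print $first_fd == $second_fd ? "descriptor reused\n" : "descriptor not reused\n";
```

If the first handle were still open after its block, the second open would have to get a different descriptor number. Running the script under strace should show the close() between the two open() calls.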

Leopold Toetsch

May 25, 2003, 4:05:13 PM

to Gopal V, perl6-i...@perl.org

Gopal V <gopa...@symonds.net> wrote:
> If memory serves me right, Leopold Toetsch wrote:
>> If you want a fast program ... probably. As Perl defines, that files are
>> closed on scope exit (when the last reference to the file handle gets
>> out of scope), perl and parrot must have some means to handle this.

> Let me get this straight ... "on" scope exit ? or just "after" scope
> exit ? ... And if it's possible , how does Perl 5 handle this sort of
> behaviour ?.

#v+

$ strace perl -e '{ my $f; open $f, "1" } print "ok\n"'
...
open("1", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=3773, ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
close(3) = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000
ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0
write(1, "ok\n", 3ok

#v-

I don't know what perl5 does when dying, but this isn't the normal
case.

leo

Graham Barr

May 26, 2003, 4:22:51 AM

to Leopold Toetsch, Luke Palmer, perl6-i...@perl.org

On Sun, May 25, 2003 at 12:47:04PM +0200, Leopold Toetsch wrote:
> Still my second argument counts, I think. As $fh is used in line 1 and
> line 3 the filehandle can't get unused in the main routine in the
> middle. Untyped or not, $fh is known to hold a filehandle, because its
> used as the LHS of an "open".

True. But that main routine could be in a different module that was compiled
separately. So the compiler at that time has no idea whether what it is doing
will affect filehandles, so it would have to insert DOD runs just in case.

Graham.

Sean O'Rourke

May 26, 2003, 10:04:37 AM

to perl6-i...@perl.org

This can be less painful if we divide file handle use into two cases.
First, we have cases where you only care that your process doesn't run out
of file descriptors. In this case, we can treat fds, like memory, as just
another resource, and garbage collect them. Second, we have cases where
closing the file sends a signal to the outside world, e.g. network
sockets. In these cases, it seems reasonable to either require people to
do an explicit close(), or have them add an explicit POST block when the
file is opened. Since most applications fall into the first category, it
might be worth separating them. As long as no POST blocks are in play, we
can afford to garbage collect file descriptors only when necessary.

To get back to nearly where perl 5 is, we could also hide the POST block
for sockets down in the Socket module, and most people would just see

use Sockets autoclose => 1;

Yeah, this doesn't "solve the problem" -- I don't think there's any
solution -- but it hopefully shows that it's not nearly as pervasive as
people seem to think.

/s

Dan Sugalski

May 26, 2003, 1:31:10 PM

to Graham Barr, Leopold Toetsch, Luke Palmer, perl6-i...@perl.org

Yep. One of the compiler switches will disable strict end-of-scope
checking, so modules that don't care don't have to emit the code to
do the checking.

Dan Sugalski

May 26, 2003, 1:33:20 PM

to Sean O'Rourke, perl6-i...@perl.org

At 7:04 AM -0700 5/26/03, Sean O'Rourke wrote:
>This can be less painful if we divide file handle use into two cases.
>First, we have cases where you only care that your process doesn't run out
>of file descriptors. In this case, we can treat fds, like memory, as just
>another resource, and garbage collect them. Second, we have cases where
>closing the file sends a signal to the outside world, e.g. network
>sockets. In these cases, it seems reasonable to either require people to
>do an explicit close(), or have them add an explicit POST block when the
>file is opened. Since most applications fall into the first category, it
>might be worth separating them. As long as no POST blocks are in play, we
>can afford to garbage collect file descriptors only when necessary.

I think what we're going to have to do is have a way to mark
filehandles as either eager for destruction or lazy for destruction.
(I'm not sure which, it depends on the default Larry chooses) That
way there'll at least be some way to mark a filehandle as not needing
immediate destruction, so that if there aren't any
immediate-destruction filehandles (or other objects) around then we
don't trigger the DOD with the lazysweep op.
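
One way to picture that scheme is a counter of objects that currently want immediate destruction. The sketch below is a toy in Perl 5, not the code that was checked in: `mark_eager`, `lazysweep`, and both counters are hypothetical names, and a full DOD run is only counted.

```perl
use strict;
use warnings;

# Toy model: the counters and sub names are assumptions, not Parrot's API.
my $eager_objects = 0;   # live objects that asked for immediate destruction
my $dod_runs      = 0;   # full sweeps actually performed

sub mark_eager { $eager_objects++; return }

# What a compiler-inserted lazysweep at scope exit would boil down to.
sub lazysweep {
    return 0 unless $eager_objects;   # cheap path: nobody needs timely destruction
    $dod_runs++;                      # expensive path: a full DOD run
    return 1;
}

lazysweep() for 1 .. 1000;            # scope exits with no eager objects around
my $runs_without_eager = $dod_runs;

mark_eager();                         # e.g. a socket or a locked file is opened
lazysweep() for 1 .. 3;               # now every scope exit pays for a sweep

print "sweeps: $runs_without_eager without eager objects, $dod_runs with one\n";
```

With no eager objects, each scope exit costs one test. A single eager object is enough to turn every lazysweep into a full run, which is the pricey case.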

Paul Johnson

May 26, 2003, 12:19:54 PM

to Sean O'Rourke, perl6-i...@perl.org

Sean O'Rourke said:

> This can be less painful if we divide file handle use into two cases.

[ Possible solution to the problem of timely destruction of file handles. ]

Don't forget (not to imply that you have) that whilst filehandles may
deserve special treatment, they are simply a specific example of a general
problem which also needs to be solved.

I hope I am not driving this discussion round in circles, but are there
any problems with simply adding reference counting only to those objects
which have destructors and/or which request it?

--
Paul Johnson - pa...@pjcj.net
http://www.pjcj.net

Dan Sugalski

May 26, 2003, 3:11:25 PM

to Paul Johnson, Sean O'Rourke, perl6-i...@perl.org

Yes. Search back through the archives, the discussion is in there.
(Might also be in the docs or FAQ--if not I think I may add it)

Tim Bunce

May 26, 2003, 4:19:48 PM

to Dan Sugalski, Paul Johnson, Sean O'Rourke, perl6-i...@perl.org

On Mon, May 26, 2003 at 03:11:25PM -0400, Dan Sugalski wrote:
> At 6:19 PM +0200 5/26/03, Paul Johnson wrote:
> >
> >I hope I am not driving this discussion round in circles, but are there
> >any problems with simply adding reference counting only to those objects
> >which have destructors and/or which request it?
>
> Yes. Search back through the archives, the discussion is in there.
> (Might also be in the docs or FAQ--if not I think I may add it)

I think that would be a good idea.

Tim [who'll need to ponder DBI handle issues one day...]

Brent Dax

May 27, 2003, 3:22:09 AM

to l...@toetsch.at, Luke Palmer, perl6-i...@perl.org

Leopold Toetsch:
# Still my second argument counts, I think. As $fh is used in
# line 1 and line 3 the filehandle can't get unused in the main
# routine in the middle. Untyped or not, $fh is known to hold a
# filehandle, because its used as the LHS of an "open".

What makes you so sure that the filehandle will live on until line 3?

sub main_program_which_has_nothing_to_do_with_fh() {
...
#Badly named, as it turns out.
$fh=open "> otherfile";
...
}

--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

>How do you "test" this 'God' to "prove" it is who it says it is?
"If you're God, you know exactly what it would take to convince me. Do
that."
--Marc Fleury on alt.atheism

Brent Dax

May 27, 2003, 3:24:26 AM

to Sean O'Rourke, perl6-i...@perl.org

Sean O'Rourke:
# First, we have cases where you only care that your process
# doesn't run out of file descriptors. In this case, we can
# treat fds, like memory, as just another resource, and garbage
# collect them. Second, we have cases where closing the file
# sends a signal to the outside world, e.g. network sockets.

Even standard filehandles can be in the second case--consider a locked
file, for example. You can't just divide the world into "files" and
"everything else".

Leopold Toetsch

May 27, 2003, 5:40:07 AM

to Brent Dax, Luke Palmer, perl6-i...@perl.org

Brent Dax wrote:

> Leopold Toetsch:
> # Still my second argument counts, I think. As $fh is used in
> # line 1 and line 3 the filehandle can't get unused in the main
> # routine in the middle. Untyped or not, $fh is known to hold a
> # filehandle, because its used as the LHS of an "open".
>
> What makes you so sure that the filehandle will live on until line 3?
>
> sub main_program_which_has_nothing_to_do_with_fh() {
> ...
> #Badly named, as it turns out.
> $fh=open "> otherfile";
> ...
> }

You are cheating ;-)
main_program_which_has_nothing_to_do_with_fh
^^^^^^^^^^^^^^^^^^^^^

But anyway, this was a special case, where the compiler could follow the
life range of the file handle. Normally we will need lazysweep opcodes.

leo

Nicholas Clark

Aug 24, 2003, 11:56:56 AM

to Dan Sugalski, Sean O'Rourke, perl6-i...@perl.org

On Mon, May 26, 2003 at 01:33:20PM -0400, Dan Sugalski wrote:

> I think what we're going to have to do is have a way to mark
> filehandles as either eager for destruction or lazy for destruction.
> (I'm not sure which, it depends on the default Larry chooses) That
> way there'll at least be some way to mark a filehandle as not needing
> immediate destruction, so that if there aren't any
> immediate-destruction filehandles (or other objects) around then we
> don't trigger the DOD with the lazysweep op.

I would have hoped that simply putting "lazy" on any filehandle opened
read-only on a regular file would be good enough as a win. And if anything
places a lock on that file handle then it turns "eager". Everything else
starts out eager, and (obviously) lazy handles are garbage collected
like memory.
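
Written down as a rule, that default might look like the following Perl 5 sketch. `destruction_policy` and its `mode`, `regular` and `locked` arguments are hypothetical; nothing like it is claimed to exist in Parrot.

```perl
use strict;
use warnings;

# Hypothetical helper: classifies a handle under the rule proposed above.
sub destruction_policy {
    my (%h) = @_;
    return 'eager' if $h{locked};                       # a lock promotes to eager
    return 'lazy'  if $h{mode} eq '<' && $h{regular};   # read-only regular file
    return 'eager';                                     # everything else
}

print destruction_policy(mode => '<',  regular => 1), "\n";
print destruction_policy(mode => '<',  regular => 1, locked => 1), "\n";
print destruction_policy(mode => '>',  regular => 1), "\n";
print destruction_policy(mode => '+<', regular => 0), "\n";
```

Only the first call, a read-only handle on an unlocked regular file, comes out lazy. The other three are eager.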

But until we have real sized programs running I don't think we can tell,
so it's probably too early to get too far into thinking about this.
Time would be better spent helping Jürgen Bömmels with the IO rewrite.

Nicholas Clark