    my $fh = open 'foobar';
    main_program_which_has_nothing_to_do_with_fh();
    print $fh: "Done\n";
Under the current Lazy DOD, every single scope exit in the main program will suffer.
So, how about we have a similar flag on PMCs to replace the current one (even if it's named the same thing :-) which will trigger a DOD run whenever *a* reference is lost to this PMC. This allows code that doesn't need timely destruction (even if there's still a variable around that wants it, which will probably *always* be the case) not to pay for it.
The cost of such a thing is a PMC flag check in every op which has an out PMC argument. Is this acceptable?
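A minimal Python sketch of the trade-off in this proposal. It is a toy model, not Parrot code: the `Interp` class, the `timely` attribute and the counters are all hypothetical names, and a "DOD run" is represented only by a counter. It shows that the DOD is triggered just once, when the flagged PMC loses a reference, while the flag check is paid by every op that writes a PMC register.

```python
class PMC:
    def __init__(self, name, timely=False):
        self.name = name
        self.timely = timely  # hypothetical "needs timely destruction" flag


class Interp:
    def __init__(self, nregs=4):
        self.regs = [None] * nregs
        self.flag_checks = 0  # how often the per-op check was paid
        self.dod_runs = 0     # how often a DOD run would have been triggered

    def set_pmc(self, idx, value):
        """Stand-in for any op with an out PMC argument."""
        old = self.regs[idx]
        self.flag_checks += 1  # every such op pays for the check
        self.regs[idx] = value
        if old is not None and old.timely:
            self.dod_runs += 1  # a reference to a flagged PMC was lost


interp = Interp()
interp.set_pmc(0, PMC("fh", timely=True))  # $fh lives in register 0
for i in range(1000):                      # unrelated work in register 1
    interp.set_pmc(1, PMC(f"tmp{i}"))
interp.set_pmc(0, None)                    # the register loses $fh

print(interp.flag_checks, interp.dod_runs)
```

In this model the unrelated loop never triggers a DOD run, but it still performs a thousand flag checks, which is the cost being asked about.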
Luke Palmer <fibon...@babylonia.flatirons.org> wrote: > Consider this program: > my $fh = open 'foobar'; > main_program_which_has_nothing_to_do_with_fh(); > print $fh: "Done\n"; > Under the current Lazy DOD, every single scope exit in the main > program will suffer.
I don't think so. Why should the perl6 compiler insert lazysweep opcodes in code where no filehandle is used? When no filehandle is used, none can go out of scope, so...
It could be different when $fh goes into main_prog... but, as $fh is used again after main_prog, it's also senseless to insert lazysweep opcodes.
> The costs of such a thing require checking a PMC flag in every op > which has an out pmc argument. Is this acceptable?
On Sat, May 24, 2003 at 03:34:53PM +0200, Leopold Toetsch wrote: > Luke Palmer <fibon...@babylonia.flatirons.org> wrote: > > Consider this program:
> > my $fh = open 'foobar'; > > main_program_which_has_nothing_to_do_with_fh(); > > print $fh: "Done\n";
> > Under the current Lazy DOD, every single scope exit in the main > > program will suffer.
> I don't think so. Why should the perl6 compiler insert lazysweep opcodes > in code, where no filehandle is used? When no filehandle is used, no one > can get out of scope, so...
Really?
    sub foo (%hash) {
        ...
        delete %hash{value_is_a_filehandle};
        ...
    }
Not a filehandle in sight.
Just because code does not explicitly use a filehandle does not mean that one will not be affected by its actions. If you want timely destruction of filehandles then you must call DOD on every scope exit, unless you know that a scope cannot affect anything outside the variables it introduces itself, and such scopes are very rare.
> > The costs of such a thing require checking a PMC flag in every op > > which has an out pmc argument. Is this acceptable?
> IMHO too expensive.
I have yet to see an inexpensive solution to timely destruction.
> On Sat, May 24, 2003 at 03:34:53PM +0200, Leopold Toetsch wrote: > > Luke Palmer <fibon...@babylonia.flatirons.org> wrote: > > > Consider this program:
> > > my $fh = open 'foobar'; > > > main_program_which_has_nothing_to_do_with_fh(); > > > print $fh: "Done\n";
> > > Under the current Lazy DOD, every single scope exit in the main > > > program will suffer.
> > I don't think so. Why should the perl6 compiler insert lazysweep opcodes > > in code, where no filehandle is used? When no filehandle is used, no one > > can get out of scope, so...
Plus, since Perl is untyped, it has no idea whether something is or isn't a filehandle at code generation time.
So, my DOD on loss of reference might get this case, if the filehandle was loaded into a register before being deleted.... Then when the register lost it, it would DOD.
There are, of course, cases where this fails (as, I believe, does any timely destruction method without refcounting, and even refcounting can't handle circular thingies). But the idea behind this, er, idea is that it will get 95% of the cases correct, and let a regular DOD catch the other 5% (eventually).
> Just because code does not explicitly use a filehandle does not mean > that one will not be affected by its actions. If you want timely > destruction of filehandles then you must call DOD on every scope > exit, unless you know that a scope cannot affect anything outside > the variables it introduces itself, which are very rare
Unless you're a good functional programmer... but that's not the average Perl hacker.
> > > The costs of such a thing require checking a PMC flag in every op > > > which has an out pmc argument. Is this acceptable?
> > IMHO too expensive.
> I have yet to see an inexpensive solution to timely destruction.
I'd personally trade timely destruction in favor of speed, but I also see an argument for strong semantics. Whatever's best in the eyes of the king...
Also, as I've said before, the problem with filehandles (neglecting IPC) is trivially solved by adding a DOD before each C<open>. But it's no good as a general solution.
Luke Palmer <fibon...@babylonia.flatirons.org> wrote: >> On Sat, May 24, 2003 at 03:34:53PM +0200, Leopold Toetsch wrote: >> > Luke Palmer <fibon...@babylonia.flatirons.org> wrote: >> > > Consider this program:
>> > > my $fh = open 'foobar'; >> > > main_program_which_has_nothing_to_do_with_fh(); >> > > print $fh: "Done\n";
>> > > Under the current Lazy DOD, every single scope exit in the main >> > > program will suffer.
>> > I don't think so. Why should the perl6 compiler insert lazysweep opcodes >> > in code, where no filehandle is used? When no filehandle is used, no one >> > can get out of scope, so...
>> Not a filehandle in sight. > Plus, since Perl is untyped, it has no idea whether something is or > isn't a filehandle at code generation time.
Still my second argument counts, I think. As $fh is used in line 1 and line 3, the filehandle can't become unused in the main routine in the middle. Untyped or not, $fh is known to hold a filehandle, because it's used as the LHS of an "open".
But of course for a slightly more complicated example the compiler will be unable to follow the usage of the filehandle so that the lazysweeps (or refcounting or whatever) are necessary.
> But of course for a slightly more complicated example the compiler will > be unable to follow the usage of the filehandle so that the lazysweeps > (or refcounting or whatever) are necessary.
Following the type is a halting problem ... which is why some platforms like the JVM and IL enforce that all paths to a location produce the same local & stack types.
IMHO the best option here would be to let the programmer close the file handle explicitly and else to catch them in lazy sweeps .
"If you want timely destruction , do it yourself"... ;-)
Of course , it might be just that I hate things which are hard to control or trace and debug...
(lacking in parrot jargon , I assume a DOD is the counterpart of a Finalize run, correct me if I'm wrong)
Gopal -- The difference between insanity and genius is measured by success
Gopal V <gopal...@symonds.net> wrote: > IMHO the best option here would be to let the programmer close the file > handle explicitly and else to catch them in lazy sweeps . > "If you want timely destruction , do it yourself"... ;-)
If you want a fast program ... probably. As Perl defines that files are closed on scope exit (when the last reference to the file handle goes out of scope), perl and parrot must have some means to handle this.
> (lacking in parrot jargon , I assume a DOD is the counterpart of a > Finalize run, correct me if I'm wrong)
DOD = Dead Object Detection. When a PMC is found unused ("dead") and has the active_destroy_FLAG set, then the destroy handler gets called, which frees PMC specific memory or closes a file or shuts down a timer. This is similar to the destructor in C++. A DOD run normally is not done on scope exit, but when there is a shortage in some resources.
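The description above can be sketched as a toy mark-and-sweep pass in Python. This is an illustration under stated assumptions, not Parrot's implementation: the `PMC` class, the `arena` list, the `roots` argument and the `active_destroy` attribute (modelling the active_destroy_FLAG) are hypothetical stand-ins.

```python
class PMC:
    def __init__(self, name, active_destroy=False, destroy=None):
        self.name = name
        self.children = []                    # PMCs this one refers to
        self.active_destroy = active_destroy  # models active_destroy_FLAG
        self.destroy = destroy                # destroy handler, if any
        self.live = False


def dod_run(arena, roots):
    """Mark everything reachable from roots, then destroy the rest."""
    for pmc in arena:
        pmc.live = False
    pending = list(roots)
    while pending:
        pmc = pending.pop()
        if not pmc.live:
            pmc.live = True
            pending.extend(pmc.children)
    survivors = []
    for pmc in arena:
        if pmc.live:
            survivors.append(pmc)
        elif pmc.active_destroy and pmc.destroy is not None:
            pmc.destroy(pmc)  # e.g. close the file, stop the timer
    return survivors


closed = []
fh = PMC("fh", active_destroy=True, destroy=lambda p: closed.append(p.name))
text = PMC("text")
holder = PMC("holder")
holder.children.append(text)

# Only `holder` is a root, so `fh` is dead and its handler runs.
arena = dod_run([fh, text, holder], roots=[holder])
print(closed, [p.name for p in arena])
```

Note that nothing happens to `fh` until `dod_run` is actually called, which is why a DOD triggered only by resource shortage gives no timing guarantee.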
> If you want a fast program ... probably. As Perl defines, that files are > closed on scope exit (when the last reference to the file handle gets > out of scope), perl and parrot must have some means to handle this.
Let me get this straight ... "on" scope exit ? or just "after" scope exit ? ... And if it's possible , how does Perl 5 handle this sort of behaviour ?.
I had a look to see if it really works that way with perl5 ...
> DOD = Dead Object Detection. When a PMC is found unused ("dead") and has > the active_destroy_FLAG set, then the destroy handler gets called, which > frees PMC specific memory or closes a file or shuts down a timer. This > is similar to the destructor in C++. A DOD run normally is not done on > scope exit, but when there is a shortage in some resources.
Yes, a finalizer ... Like , I could have a finalizer and a deallocator of which the finalizer is called before deallocation by the garbage collector . Running around and calling the destroy handler is exactly what I meant.
Another TLA entered into database :-)
> leo
Gopal
> my $fh = open 'foobar'; > main_program_which_has_nothing_to_do_with_fh(); > print $fh: "Done\n";
>Under the current Lazy DOD, every single scope exit in the main >program will suffer.
Maybe, yep.
>So, how about we have a similar flag on PMCs to replace the current >one (even if it's named the same thing :-) which will trigger a DOD >run whenever *a* reference is lost to this PMC. This allows code that >doesn't need timely destruction (even if there's still a variable >around that wants it, which will probably *always* be the case) not to >pay for it.
>The costs of such a thing require checking a PMC flag in every op >which has an out pmc argument. Is this acceptable?
Nope, unfortunately not. I think everyone's gone over the problems, plus it puts the burden on sections of code that don't guarantee any sort of timely destruction, which I'd like to avoid. -- Dan
--------------------------------------"it's like this"------------------- Dan Sugalski even samurai d...@sidhe.org have teddy bears and even teddy bears get drunk
>If memory serves me right, Leopold Toetsch wrote: >> But of course for a slightly more complicated example the compiler will >> be unable to follow the usage of the filehandle so that the lazysweeps >> (or refcounting or whatever) are necessary.
>Following the type is a halting problem ... which is why some of the >languages like JVM and IL enforce that all paths to a location produce >the same local & stack types ..
>IMHO the best option here would be to let the programmer close the file >handle explicitly and else to catch them in lazy sweeps .
>"If you want timely destruction , do it yourself"... ;-)
Love to, but we can't. Perl 5 effectively guarantees timely destruction. If Larry mandates that for Perl 6, then we have to do it, like it or not. Or at least provide a way that perl 5 and 6 code can make it happen, which I think the scheme I checked in does. The fact that it can be turned *off* is the nice bit. The fact that full sweeps can be darned pricey is the not-nice bit. A generational GC scheme might make this better, I'm not sure.
>Of course , it might be just that I hate things which are hard to >control or trace and debug...
>(lacking in parrot jargon , I assume a DOD is the counterpart of a > Finalize run, correct me if I'm wrong)
It's when we sweep the arenas looking for objects that are unreferenced. -- Dan
>If memory serves me right, Leopold Toetsch wrote: >> If you want a fast program ... probably. As Perl defines, that files are >> closed on scope exit (when the last reference to the file handle gets >> out of scope), perl and parrot must have some means to handle this.
>Let me get this straight ... "on" scope exit ? or just "after" scope >exit ? ... And if it's possible , how does Perl 5 handle this sort of >behaviour ?.
It's part of scope exit. When the scratchpad for the lexicals gets deleted on scope exit (because exiting decrements its refcount to zero) all the refcounts of the variables in the scratchpad get decremented, and any of those that go to zero get cleaned up.
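The refcounting behaviour described above can be modelled with a small Python sketch. The `Value` and `Scratchpad` classes are hypothetical and only imitate the bookkeeping; they are not Perl 5 internals.

```python
class Value:
    def __init__(self, name, log):
        self.name = name
        self.refcount = 0
        self.log = log

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            # Cleanup happens immediately, as part of the decrement.
            self.log.append(f"destroy {self.name}")


class Scratchpad:
    """Toy model of the pad holding a scope's lexicals."""

    def __init__(self):
        self.vars = {}

    def bind(self, name, value):
        value.incref()
        self.vars[name] = value

    def exit_scope(self):
        # Deleting the pad drops one reference to each lexical.
        for value in self.vars.values():
            value.decref()
        self.vars.clear()


log = []
fh = Value("fh", log)
shared = Value("shared", log)
shared.incref()            # also referenced from an outer scope

pad = Scratchpad()
pad.bind("$fh", fh)
pad.bind("$x", shared)

log.append("leaving block")
pad.exit_scope()
log.append("after block")
print(log)
```

In this model the filehandle is destroyed between "leaving block" and "after block", while the value that is still referenced elsewhere survives.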
Perl generally doesn't do anything special when you fall off the end of the world. It's a shortcut, of sorts. Try throwing the code in a block, making sure you use a lexical filehandle, and see what happens. -- Dan
Gopal V <gopal...@symonds.net> wrote: > If memory serves me right, Leopold Toetsch wrote: >> If you want a fast program ... probably. As Perl defines, that files are >> closed on scope exit (when the last reference to the file handle gets >> out of scope), perl and parrot must have some means to handle this. > Let me get this straight ... "on" scope exit ? or just "after" scope > exit ? ... And if it's possible , how does Perl 5 handle this sort of > behaviour ?.
On Sun, May 25, 2003 at 12:47:04PM +0200, Leopold Toetsch wrote: > Still my second argument counts, I think. As $fh is used in line 1 and > line 3 the filehandle can't get unused in the main routine in the > middle. Untyped or not, $fh is known to hold a filehandle, because it's > used as the LHS of an "open".
True. But that main routine could be in a different module that was compiled separately. So the compiler at that time has no idea whether what it is doing will affect filehandles, and would have to insert DOD runs just in case.
This can be less painful if we divide file handle use into two cases. First, we have cases where you only care that your process doesn't run out of file descriptors. In this case, we can treat fds, like memory, as just another resource, and garbage collect them. Second, we have cases where closing the file sends a signal to the outside world, e.g. network sockets. In these cases, it seems reasonable to either require people to do an explicit close(), or have them add an explicit POST block when the file is opened. Since most applications fall into the first category, it might be worth separating them. As long as no POST blocks are in play, we can afford to garbage collect file descriptors only when necessary.
To get back to nearly where perl 5 is, we could also hide the POST block for sockets down in the Socket module, and most people would just see
use Sockets autoclose => 1;
Yeah, this doesn't "solve the problem" -- I don't think there's any solution -- but it hopefully shows that it's not nearly as pervasive as people seem to think.
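A rough Python sketch of the first case in this message, where file descriptors are treated like memory and collected only when they run out. Everything here is hypothetical: the `Interp` class, the `fd_limit` parameter and the `reachable` marker are illustration devices, and the collector simply drops unreachable handles.

```python
class Interp:
    def __init__(self, fd_limit):
        self.fd_limit = fd_limit
        self.open_handles = []
        self.dod_runs = 0

    def dod_run(self):
        # Stand-in for a collection: close every unreachable handle.
        self.dod_runs += 1
        self.open_handles = [h for h in self.open_handles if h["reachable"]]

    def open(self, name):
        if len(self.open_handles) >= self.fd_limit:
            self.dod_run()  # collect only when descriptors are short
        if len(self.open_handles) >= self.fd_limit:
            raise OSError("too many open files")
        handle = {"name": name, "reachable": True}
        self.open_handles.append(handle)
        return handle


interp = Interp(fd_limit=2)
for i in range(5):
    h = interp.open(f"file{i}")
    h["reachable"] = False  # the program drops the handle right away

print(interp.dod_runs, len(interp.open_handles))
```

With a limit of two descriptors, five opens cost two collections rather than one per scope exit. This only helps handles whose closing has no externally visible effect.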
>On Sun, May 25, 2003 at 12:47:04PM +0200, Leopold Toetsch wrote: >> Still my second argument counts, I think. As $fh is used in line 1 and >> line 3 the filehandle can't get unused in the main routine in the >> middle. Untyped or not, $fh is known to hold a filehandle, because it's >> used as the LHS of an "open".
>True. But that main routine could be in a different module that was compiled >separately. So the compiler at that time has no idea if what it is doing >will affect filehandles, so would have to insert DOD runs just in case.
Yep. One of the compiler switches will disable strict end-of-scope checking, so modules that don't care don't have to emit the code to do the checking. -- Dan
>This can be less painful if we divide file handle use into two cases. >First, we have cases where you only care that your process doesn't run out >of file descriptors. In this case, we can treat fds, like memory, as just >another resource, and garbage collect them. Second, we have cases where >closing the file sends a signal to the outside world, e.g. network >sockets. In these cases, it seems reasonable to either require people to >do an explicit close(), or have them add an explicit POST block when the >file is opened. Since most applications fall into the first category, it >might be worth separating them. As long as no POST blocks are in play, we >can afford to garbage collect file descriptors only when necessary.
I think what we're going to have to do is have a way to mark filehandles as either eager for destruction or lazy for destruction. (I'm not sure which, it depends on the default Larry chooses) That way there'll at least be some way to mark a filehandle as not needing immediate destruction, so that if there aren't any immediate-destruction filehandles (or other objects) around then we don't trigger the DOD with the lazysweep op. -- Dan
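One way to picture the eager/lazy marking is the Python sketch below. It assumes the interpreter keeps a count of live eager objects, which is an assumption made for illustration and not something stated in the thread; `Interp`, `Handle`, `eager_count` and the `reachable` marker are hypothetical names.

```python
class Handle:
    def __init__(self, name, eager):
        self.name = name
        self.eager = eager      # True: wants immediate destruction
        self.reachable = True   # toy stand-in for liveness


class Interp:
    def __init__(self):
        self.handles = []
        self.eager_count = 0    # live objects needing timely destruction
        self.dod_runs = 0
        self.closed = []

    def open(self, name, eager):
        handle = Handle(name, eager)
        self.handles.append(handle)
        if eager:
            self.eager_count += 1
        return handle

    def dod_run(self):
        self.dod_runs += 1
        for handle in self.handles:
            if not handle.reachable:
                self.closed.append(handle.name)
                if handle.eager:
                    self.eager_count -= 1
        self.handles = [h for h in self.handles if h.reachable]

    def lazysweep(self):
        """Emitted at scope exit; does nothing unless eager objects exist."""
        if self.eager_count > 0:
            self.dod_run()


interp = Interp()
logfile = interp.open("log", eager=False)
logfile.reachable = False
for _ in range(100):
    interp.lazysweep()          # no eager objects: all no-ops
runs_without_eager = interp.dod_runs

sock = interp.open("sock", eager=True)
sock.reachable = False
interp.lazysweep()              # one real DOD run
interp.lazysweep()              # no eager objects left: no-op again
print(runs_without_eager, interp.dod_runs, interp.closed)
```

The hundred scope exits with only a lazy handle around cost a counter comparison each, and the full sweep is paid once the eager socket appears.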
> This can be less painful if we divide file handle use into two cases.
[ Possible solution to the problem of timely destruction of file handles. ]
Don't forget (not to imply that you have) that whilst filehandles may deserve special treatment, they are simply a specific example of a general problem which also needs to be solved.
I hope I am not driving this discussion round in circles, but are there any problems with simply adding reference counting only to those objects which have destructors and/or which request it?
>> This can be less painful if we divide file handle use into two cases.
>[ Possible solution to the problem of timely destruction of file handles. ]
>Don't forget (not to imply that you have) that whilst filehandles may >deserve special treatment, they are simply a specific example of a general >problem which also needs to be solved.
>I hope I am not driving this discussion round in circles, but are there >any problems with simply adding reference counting only to those objects >which have destructors and/or which request it?
Yes. Search back through the archives, the discussion is in there. (Might also be in the docs or FAQ--if not I think I may add it) -- Dan
On Mon, May 26, 2003 at 03:11:25PM -0400, Dan Sugalski wrote: > At 6:19 PM +0200 5/26/03, Paul Johnson wrote:
> >I hope I am not driving this discussion round in circles, but are there > >any problems with simply adding reference counting only to those objects > >which have destructors and/or which request it?
> Yes. Search back through the archives, the discussion is in there. > (Might also be in the docs or FAQ--if not I think I may add it)
I think that would be a good idea.
Tim [who'll need to ponder DBI handle issues one day...]
Leopold Toetsch: # Still my second argument counts, I think. As $fh is used in # line 1 and line 3 the filehandle can't get unused in the main # routine in the middle. Untyped or not, $fh is known to hold a # filehandle, because it's used as the LHS of an "open".
What makes you so sure that the filehandle will live on until line 3?
    sub main_program_which_has_nothing_to_do_with_fh() {
        ...
        #Badly named, as it turns out.
        $fh=open "> otherfile";
        ...
    }
Sean O'Rourke: # First, we have cases where you only care that your process # doesn't run out of file descriptors. In this case, we can # treat fds, like memory, as just another resource, and garbage # collect them. Second, we have cases where closing the file # sends a signal to the outside world, e.g. network sockets.
Even standard filehandles can be in the second case--consider a locked file, for example. You can't just divide the world into "files" and "everything else".
Brent Dax wrote: > Leopold Toetsch: > # Still my second argument counts, I think. As $fh is used in > # line 1 and line 3 the filehandle can't get unused in the main > # routine in the middle. Untyped or not, $fh is known to hold a > # filehandle, because it's used as the LHS of an "open".
> What makes you so sure that the filehandle will live on until line 3?
> sub main_program_which_has_nothing_to_do_with_fh() {
>     ...
>     #Badly named, as it turns out.
>     $fh=open "> otherfile";
>     ...
> }

You are cheating ;-)

    main_program_which_has_nothing_to_do_with_fh
                           ^^^^^^^^^^^^^^^^^^^^^
But anyway, this was a special case, where the compiler could follow the life range of the file handle. Normally we will need lazysweep opcodes.
On Mon, May 26, 2003 at 01:33:20PM -0400, Dan Sugalski wrote: > I think what we're going to have to do is have a way to mark > filehandles as either eager for destruction or lazy for destruction. > (I'm not sure which, it depends on the default Larry chooses) That > way there'll at least be some way to mark a filehandle as not needing > immediate destruction, so that if there aren't any > immediate-destruction filehandles (or other objects) around then we > don't trigger the DOD with the lazysweep op.
I would have hoped that simply putting "lazy" on any filehandle opened read only on a regular file would be good enough as a win. And if anything places a lock on that file handle then it turns "eager". Everything else starts out eager, and (obviously) lazy handles are garbage collected like memory.
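The default suggested here can be written down as a small rule. The Python sketch below is only an illustration of that rule: the `Handle` class, the mode strings and the `is_regular_file` parameter are hypothetical, and no real file is opened or locked.

```python
def initial_policy(mode, is_regular_file):
    """Read-only handles on regular files start lazy; all others eager."""
    if mode == "r" and is_regular_file:
        return "lazy"
    return "eager"


class Handle:
    def __init__(self, mode, is_regular_file=True):
        self.policy = initial_policy(mode, is_regular_file)

    def lock(self):
        # A lock is visible to other processes, so closing must be timely.
        self.policy = "eager"


reader = Handle("r")
writer = Handle("w")
socket_like = Handle("r", is_regular_file=False)
policies_before = (reader.policy, writer.policy, socket_like.policy)

reader.lock()
print(policies_before, reader.policy)
```

The promotion in `lock` covers the locked-file case raised earlier in the thread, where even a plain file needs prompt closing.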
But until we have real sized programs running I don't think we can tell, so it's probably too early to get too far into thinking about this. Time would be better spent helping Jürgen Bömmels with the IO rewrite.