Ignorant wumpus? Or scoping problem?

Glenn Linderman

Nov 25, 2003, 5:10:00 PM
to perl5 porters
I asked this question in the perl-win...@activestate.com forum, but
apparently there isn't the right level of expertise there to provide a
definitive answer, so I'm trying here... if there is a better forum for
this, please inform me, and I'll go there.

Prelude:

Using Perl 5.8.0 ActiveState build 805 on Windows 2000 SP4

I have this script that is 1.2 MB long.... rather than post the whole
thing, I've made a test case... surprisingly I was able to get it down
to 60 lines. I guess it could have even been a little smaller...

So I'm writing an interpreter, in Perl, for another language (no, this
is a real project, not an attempt to bring a Pentium 4 to its knees).
The other language is not available on Windows, and my collection of
machines that run it is dying... being about 15 years old and orphaned
about 12 years. The basic idea is to provide the building blocks of the
other language inside perl as perl functions, and then to translate each
syntactical construct of the other language into perl syntax, thus
building up a perl function that is functionally equivalent to the
function in the other language. The perl functions are first built into
strings, and then "eval"ed into existence. Of course, as with all
automated translation, the resultant Perl is not particularly beautiful
to behold, but it is somewhat decipherable for debugging, and with the
source of the other language available, it should be an OK environment.

I have the translator mostly finished, and was starting to attempt to
execute some code from the other language (and it is the source code in
the other language that is the bulk of the Perl script).

Questions, problems, and misunderstandings:

1) Do the initial "use strict" and "use warnings" apply to the eval'd
code, or do I need to include them in each eval'd string as shown?
Initially, I didn't generate the "use strict" and "use warnings" into
each eval'd string, but after I ran into some problems, I went back and
added them. Didn't change the problem. And from what I can read, I
think I don't really need to do that, that the eval's should inherit the
"use strict" and "use warnings" from the lexical scope in which the eval
exists.

2) The variables of interest are all lexicals in the scope of the
closure where function foo is defined... $scaler, @array, and %hash.
The functions, with strict and warnings enabled, seem to compile fine.
But when function eval_printall runs, the variables seem to all be
undefined, even though they all have been given values in the code
immediately after their creation, and even though those values are
visible to function printall. I would expect that either the variables
would not be visible at all, and function eval_printall would fail to
compile OR (more likely, from my reading of the Camel book) that the
function would compile successfully, the variables would be visible, and
the values of the variables available. This last seems not to be the
case, however.

Does anyone have a clue what is going on here? Is this a perl bug? Do
I just need some additional magic option? or what?

Interestingly, if you comment out the line in function eval_printall
which prints the scaler, you don't even get a warning message... the
variables apparently all appear to be undef, and the way the access to
@array and %hash are written, they simply don't have values, so nothing
gets printed... without warning. This really confused me last night,
and until I made this test case this morning.

Not that I'm not still confused, but if all the variables appear undef,
it explains why I got some of the symptoms I got.


3) If this is a bug, or a technical question no one here can answer,
what forum should I turn to next?


Pared down test case code:
(you should be able to pipe this message to perl -x to execute the test
case)

#!perl -w
use strict;
use warnings;

{
my ( $scaler, @array, %hash );
$scaler = 33;
push @array, 1, 22, 333, 4444;
$hash{'key'} = 'value';
$hash{'value'} = 'key';
$hash{'unique'} = 'plain';

sub printall
{ print "scaler: $scaler\n";
foreach my $ix ( 0 .. $#array )
{ print "array: $ix $array[ $ix ]\n"
}
foreach my $key ( keys %hash )
{ print "hash key/value: $key/$hash{ $key }\n";
}
}

my $outerstring = <<'END_OF_OUTER_STRING';
use strict;
use warnings;
my $innerstring = <<'END_OF_INNER_STRING';
use strict;
use warnings;
sub eval_printall
{ print "eval_printall coming up! Watch this space!\n";
print "eval scaler: $scaler\n";
foreach my $ix ( 0 .. $#array )
{ print "eval array: $ix $array[ $ix ]\n"
}
foreach my $key ( keys %hash )
{ print "eval hash key/value: $key/$hash{ $key }\n";
}
}
1;
END_OF_INNER_STRING

print "before eval\n";
& printall();

eval $innerstring;

print "after eval\n";
& printall();
& eval_printall();
print "after eval_printall\n";
& printall();
END_OF_OUTER_STRING

sub foo
{ eval $outerstring;
}
}

& foo ();
__END__

--
Glenn -- http://nevcal.com/
===========================
Like almost everyone, I receive a lot of spam every day, much of it
offering to help me get out of debt or get rich quick. It's ridiculous.
-- Bill Gates

And here is why it is ridiculous:
The division that includes Windows posted an operating profit of $2.26
billion on revenue of $2.81 billion.
--from Reuters via
http://biz.yahoo.com/rc/031113/tech_microsoft_msn_1.html

So that's profit of over 400% of investment... with a bit more
investment in Windows technology, particularly in the area of
reliability, the profit percentage might go down, but so might the bugs
and security problems? Seems like it would be a reasonable tradeoff.
WalMart earnings are 3.4% of investment.

Dave Mitchell

Nov 26, 2003, 5:38:17 AM
to Glenn Linderman, perl5 porters
On Tue, Nov 25, 2003 at 02:10:00PM -0800, Glenn Linderman wrote:
> 1) Do the initial "use strict" and "use warnings" apply to the eval'd
> code

Yes, they are lexically scoped, eg

$ perl -e 'use strict; eval q{$x=1}; print $@'
Global symbol "$x" requires explicit package name at (eval 1) line 1.
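
To make the lexical scoping concrete, here is a minimal sketch (the variable names are invented for illustration and are not from the original test case). The same kind of string either fails or compiles depending on which pragmas are in effect at the point where the eval statement itself appears:

```perl
use strict;
use warnings;

# strict 'vars' is in effect here, so the eval'd code inherits it and
# refuses to compile the undeclared variable.
my $strict_ok  = eval q{ $undeclared_one = 1; 1 };
my $strict_err = $@;

my ($lax_ok, $lax_err);
{
    # Inside this block strict 'vars' is switched off, and the eval
    # inherits that setting instead.
    no strict 'vars';
    $lax_ok  = eval q{ $undeclared_two = 1; 1 };
    $lax_err = $@;
}

print "under strict: ",
    ($strict_ok ? "compiled\n" : "failed: $strict_err");
print "without strict 'vars': ",
    ($lax_ok ? "compiled\n" : "failed: $lax_err");
```

So repeating "use strict" inside each generated string should be harmless but unnecessary.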

> 2) The variables of interest are all lexicals in the scope of the
> closure where function foo is defined... $scaler, @array, and %hash.
> The functions, with strict and warnings enabled, seem to compile fine.
> But when function eval_printall runs, the variables seem to all be
> undefined

Closures work as follows: when a sub is created, it captures the current
instances of any lexical variables that are referred to inside the sub,
but which are declared outside that sub. The use of eval often delays this
capturing, so that the required lexicals are no longer available.

eg this simple example:


{
my $x = 1;
sub f1 { print "f1: x=$x\n" }
sub f2 { eval 'print "f2: x=$x\n"' }
}

f1;
f2;

produces the following output:

f1: x=1
Use of uninitialized value in concatenation (.) or string at (eval 1) line 1.
f2: x=

What happens here is that when f1 is compiled, the compiler notices that
the sub f1 makes mention of the outer lexical $x, so f1 gets its own
private reference to that variable. When f2 is compiled, it has no such
mention of $x, so it doesn't also capture $x.

After the { } block is exited, the interpreter discards the current (and
only) instance of $x. When f1 is later called, it still has its private
copy of $x, and so can print out its value. When the eval is compiled via
f2, f2 hasn't got a private copy of $x, so the eval tries to grab the
value of the 'real' $x, which is now undef.

In the current development version of perl, you actually get a warning
when this happens:

f1: x=1
Variable "$x" is not available at (eval 1) line 1.
Use of uninitialized value in concatenation (.) or string at (eval 1) line 1.
f2: x=


The moral of this tale is to be careful with evals. They often do funny
things because the compiler can't know in advance what the eval might
contain.

--
"I do not resent criticism, even when, for the sake of emphasis,
it parts for the time with reality".
-- Winston Churchill, House of Commons, 22nd Jan 1941.

Glenn Linderman

Nov 26, 2003, 5:23:35 PM
to Dave Mitchell, perl5 porters
On approximately 11/26/2003 2:38 AM, came the following characters from
the keyboard of Dave Mitchell:

> On Tue, Nov 25, 2003 at 02:10:00PM -0800, Glenn Linderman wrote:
>
>>1) Do the initial "use strict" and "use warnings" apply to the eval'd
>>code
>
>
> Yes, they are lexically scoped, eg
>
> $ perl -e 'use strict; eval q{$x=1}; print $@'
> Global symbol "$x" requires explicit package name at (eval 1) line 1.

Thanks for the clarification. This is what I thought happened, but when
I ran into the other problems I began to wonder.

>>2) The variables of interest are all lexicals in the scope of the
>>closure where function foo is defined... $scaler, @array, and %hash.
>>The functions, with strict and warnings enabled, seem to compile fine.
>>But when function eval_printall runs, the variables seem to all be
>>undefined
>
>
> Closures work as follows: when a sub is created, it captures the current
> instances of any lexical variables that are referred to inside the sub,
> but which are declared outside that sub. The use of eval often delays this
> capturing, so that the required lexicals are no longer available.
>
> eg this simple example:
>
>
> {
> my $x = 1;
> sub f1 { print "f1: x=$x\n" }
> sub f2 { eval 'print "f2: x=$x\n"' }
> }
>
> f1;
> f2;
>
> produces the following output:
>
> f1: x=1
> Use of uninitialized value in concatenation (.) or string at (eval 1) line 1.
> f2: x=
>
> What happens here is that when f1 is compiled, the compiler notices that
> the sub f1 makes mention of the outer lexical $x, so f1 gets its own
> private reference to that variable. When f2 is compiled, it has no such
> mention of $x, so it doesn't also capture $x.

A follow-on question here: when f1 gets its own private reference to $x,
I'm assuming that it would still share the value with any other subs
defined in that block that would also reference $x. That is, that each
such sub would get its own reference, but only one value would exist,
and the subs could communicate through that variable if they chose to.
Experimentally, this seems to be the case.

> After the { } block is exited, the interpreter discards the current (and
> only) instance of $x. When f1 is later called, it still has its private
> copy of $x, and so can print out its value. When the eval is compiled via
> f2, f2 hasn't got a private copy of $x, so the eval tries to grab the
> value of the 'real' $x, which is now undef.
>
> In the current development version of perl, you actually get a warning
> when this happens:
>
> f1: x=1
> Variable "$x" is not available at (eval 1) line 1.
> Use of uninitialized value in concatenation (.) or string at (eval 1) line 1.
> f2: x=

OK, this warning sure would have been helpful to my understanding this
issue... in fact, why is it only a warning, instead of an error... if
the variable really doesn't exist, and use strict is in effect, should
it not be an error? Although, I'd rather have it work, as you can read
below....

> The moral of this tale is to be careful with evals. They often do funny
> things because the compiler can't know in advance what the eval might
> contain.

Sure, I understand the compiler can't know in advance what the eval
might contain.... but I guess I would have expected, from comments in
the Camel about eval being evaluated in the lexical context where the
eval call appears, that the variables in the surrounding block would be
available.

It seems like it would be possible to make it work that way, and that
working that way would produce more useful results.

The corresponding cost of doing so would seem to be that instead of a
function squirrelling away a reference to the variables it uses to make
the closure, that it would squirrel away a reference to the whole
collection of lexicals (do they call that the pad?) available at the
scope of its definition. Yes, this would cost more in memory
consumption if only a few variables are referenced but many defined...
on the other hand, one doesn't normally define subs inside blocks
containing lexical variables, unless the intention is to define a
closure... and if one is defining dozens or hundreds of subs within that
closure (as I am doing), saving the reference to the whole collection
might actually save space. And if one is defining dozens or hundreds of
subs within that closure by executing "eval" with a function in that
closure, it would have a much better chance of doing The Right Thing for
those eval'd functions.

So I suppose one _could_ implement both techniques, the current one of
just saving the referenced variables, and my suggested one, of saving
everything, perhaps turning from one technique to the other via a
pragmatic module... or heuristically, switching to saving everything for
any function that contains an eval, so that everything would be there
for the eval'd code to reference.

Does anyone think this behavior of Perl/closures/evals is a bug? One
that is worth fixing? Or even possible to fix as outlined above? Or
that will ever get fixed? Or possibly even documented better, so the
trap can be avoided? The warning in the development version does help
with avoiding the trap, I guess. I guess the fact that the warning was
added implies that someone besides myself was surprised by the current
behavior....

For the moment, I've chosen the workaround of moving all the variables
that I intended to have inside the closure, into global space. This
caused a few naming collisions, which I was able to resolve fairly
quickly, because all the source code of concern was authored by me :)
This workaround, based on your helpful explanation, is, in fact, a
complete 100% cure for the issue... but it suffers a bit in modularity
vs the intended collection of functions sharing the variables of a
closure. I suppose an alternative solution would be to wrap a different
package around those definitions, which would give me a 2nd global scope
to contain those variables, which would have avoided the naming
collisions, but also forced me to prefix calls to each of the hundreds
of functions defined in the package with the package name (or to
export them all).
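
As a rough illustration of the package-based workaround, here is a minimal sketch. The package name Interp::State and the helper define_sub are hypothetical, invented only for this example. Because package variables are never freed at block exit, code that is eval'd later can still see them:

```perl
use strict;
use warnings;

package Interp::State;    # hypothetical package holding the shared state

our $scalar = 33;
our @array  = (1, 22, 333, 4444);

# Hypothetical helper: compiles generated source inside this package,
# where the 'our' declarations above are visible.
sub define_sub {
    my ($source) = @_;
    eval $source;
    die $@ if $@;
    return;
}

package main;

# The generated sub is compiled long after the declarations ran, yet it
# still sees the package variables.
Interp::State::define_sub(q{
    sub eval_sum {
        my $total = $scalar;
        $total += $_ for @array;
        return $total;
    }
});

print "sum: ", Interp::State::eval_sum(), "\n";
```

The cost is the one described above: callers outside the package have to qualify the generated function names or import them.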

Dave Mitchell

Nov 26, 2003, 6:03:11 PM
to Glenn Linderman, perl5 porters
On Wed, Nov 26, 2003 at 02:23:35PM -0800, Glenn Linderman wrote:
> On approximately 11/26/2003 2:38 AM, came the following characters from
> the keyboard of Dave Mitchell:

Each time a block is entered that contains a my declaration, a new instance
of that variable is created (in internals terminology, a new SV is
created). When subs 'capture' a lexical at compile time, they create a
pointer to the current instance. When the block is exited, the block's
reference to the instance is deleted, and if nothing else (such as a
closure) has a reference to it, it is freed. So this instance may indeed
be shared between subs, as in

sub new_counter {
my $x = 0;
return sub {$x}, sub {$x++}, sub {$x--};
}
my ($sub_val, $sub_inc, $sub_dec) = new_counter;
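
A short usage sketch of the same new_counter may help. The variable names for the two sets of subs are invented for illustration. The three subs returned by one call share a single instance of $x, while a second call gets a fresh instance:

```perl
use strict;
use warnings;

sub new_counter {
    my $x = 0;
    return sub { $x }, sub { $x++ }, sub { $x-- };
}

# Each call to new_counter enters the sub body again, so each set of
# three subs shares its own private instance of $x.
my ($val_a, $inc_a, $dec_a) = new_counter();
my ($val_b, $inc_b, $dec_b) = new_counter();

$inc_a->() for 1 .. 3;    # counter a: 0 -> 3
$dec_a->();               # counter a: 3 -> 2
$inc_b->();               # counter b: 0 -> 1

print "counter a: ", $val_a->(), "\n";
print "counter b: ", $val_b->(), "\n";
```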



> >After the { } block is exited, the interpreter discards the current (and
> >only) instance of $x. When f1 is later called, it still has its private
> >copy of $x, and so can print out its value. When the eval is compiled via
> >f2, f2 hasn't got a private copy of $x, so the eval tries to grab the
> >value of the 'real' $x, which is now undef.
> >
> >In the current development version of perl, you actually get a warning
> >when this happens:
> >
> >f1: x=1
> >Variable "$x" is not available at (eval 1) line 1.
> >Use of uninitialized value in concatenation (.) or string at (eval 1) line
> >1.
> >f2: x=
>
> OK, this warning sure would have been helpful to my understanding this
> issue... in fact, why is it only a warning, instead of an error... if
> the variable really doesn't exist, and use strict is in effect, should
> it not be an error? Although, I'd rather have it work, as you can read
> below....

Because it would probably break too much existing code.



> >The moral of this tale is to be careful with evals. They often do funny
> >things because the compiler can't know in advance what the eval might
> >contain.
>
> Sure, I understand the compiler can't know in advance what the eval
> might contain.... but I guess I would have expected, from comments in
> the Camel about eval being evaluated in the lexical context where the
> eval call appears, that the variables in the surrounding block would be
> available.
>
> It seems like it would be possible to make it work that way, and that
> working that way would produce more useful results.

Consider the following:

sub X::DESTROY { print "X::DESTROY called\n" }

{
my $x = bless {}, 'X';
}
print "outside block\n";

This outputs

X::DESTROY called
outside block

because $x is destroyed as soon as the block is exited. I hope you'll
agree this is the expected behaviour. Now let's modify it a bit:

sub X::DESTROY { print "X::DESTROY called\n" }

{
my $x = bless {}, 'X';
sub f2 { eval 'print "x=$x\n"' }
}
print "outside block\n";
f2;

Here f2 and the eval are called after the first and only instance of $x
has already been destroyed. What would you like the eval to do at this
point?




> Does anyone think this behavior of Perl/closures/evals is a bug? One
> that is worth fixing? Or even possible to fix as outlined above? Or
> that will ever get fixed? Or possibly even documented better, so the
> trap can be avoided?

It's not a bug, but it needs to be documented better. I made a start on
improving the documentation, but life got in the way.

[ By the end of today I've got to write a risk assessment and a get-in
schedule for a production of Macbeth that I foolishly agreed to get
involved in. It's already 11pm, and I haven't started yet. Instead,
I'm engaged in the displacement activity of replying to emails on p5p ;-) ]

> The warning in the development version does help
> with avoiding the trap, I guess. I guess the fact that the warning was
> added implies that someone besides myself was surprised by the current
> behavior....

There were a lot of bugs in the closure code up until 5.8.0. A lot of
these have been fixed in 5.8.1, and even more have been fixed in what will
be 5.10.0. I added the warning as part of a fix that correctly handles
the case when the variable should no longer be available (previously
the undef value you got may have been mysteriously shared in odd places).

> For the moment, I've chosen the workaround of moving all the variables
> that I intended to have inside the closure, into global space. This
> caused a few naming collisions, which I was able to resolve fairly
> quickly, because all the source code of concern was authored by me :)
> This workaround, based on your helpful explanation, is, in fact, a
> complete 100% cure for the issue... but it suffers a bit in modularity
> vs the intended collection of functions sharing the variables of a
> closure. I suppose an alternative solution would be to wrap a different
> package around those definitions, which would give me a 2nd global scope
> to contain those variables, which would have avoided the naming
> collisions, but also forced me to prefix calls to each of the hundreds
> of functions defined in the package with the package name (or to
> export them all).

Another approach is to ensure that the sub that calls the eval
mentions all the lexicals that need to be captured and preserved beyond
their normal lifespan, eg

{
my ($a,$b,$c);
sub do_eval {
{ no warnings; $a; $b; $c } # capture lexicals
eval $_[0];
}
}

I don't know whether this would be practical in your situation though.
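
For reference, a runnable sketch of this approach, with invented variable names and some error handling added. The only requirement is that the sub mention each lexical somewhere in its body, so that it captures them when it is compiled:

```perl
use strict;
use warnings;

{
    my $count = 2;
    my @names = ('alpha', 'beta');

    sub do_eval {
        # Mentioning the lexicals is enough for do_eval to capture them;
        # the bare block only silences the "useless use" warnings.
        { no warnings; $count; @names }
        my $value = eval $_[0];
        die $@ if $@;
        return $value;
    }
}

# The block has been exited, but do_eval still holds the captured
# instances, so the eval'd code can see them.
print do_eval('"$count names: @names"'), "\n";
```

Any lexical that is not mentioned in do_eval stays invisible to the eval'd code, so the list has to be kept in step with the declarations by hand.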

Dave.

--
"The GPL violates the U.S. Constitution, together with copyright,
antitrust and export control laws"
-- SCO smoking crack again.

Glenn Linderman

Nov 26, 2003, 7:10:17 PM
to Dave Mitchell, perl5 porters
On approximately 11/26/2003 3:03 PM, came the following characters from
the keyboard of Dave Mitchell:
> On Wed, Nov 26, 2003 at 02:23:35PM -0800, Glenn Linderman wrote:
>
>>A follow-on question here: when f1 gets its own private reference to $x,
>>I'm assuming that it would still share the value with any other subs
>>defined in that block that would also reference $x. That is, that each
>>such sub would get its own reference, but only one value would exist,
>>and the subs could communicate through that variable if they chose to.
>>Experimentally, this seems to be the case.
>
>
> Each time a block is entered that contains a my declaration, a new instance
> of that variable is created (in internals terminology, a new SV is
> created). When subs 'capture' a lexical at compile time, they create a
> pointer to the current instance. When the block is exited, the block's
> reference to the instance is deleted, and if nothing else (such as a
> closure) has a reference to it, it is freed. So this instance may indeed
> be shared between subs, as in
>
> sub new_counter {
> my $x = 0;
> return sub {$x}, sub {$x++}, sub {$x--};
> }
> my ($sub_val, $sub_inc, $sub_dec) = new_counter;

Good. Thanks for the confirmation on this one too.

>>OK, this warning sure would have been helpful to my understanding this
>>issue... in fact, why is it only a warning, instead of an error... if
>>the variable really doesn't exist, and use strict is in effect, should
>>it not be an error? Although, I'd rather have it work, as you can read
>>below....
>
> Because it would probably break too much existing code.

Perhaps so, but likely it would only break already broken code, like
the code that I was trying to make work. In any case, $SIG{__WARN__} lets
us convert warnings to errors, so ...
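
A minimal sketch of that conversion, assuming a hypothetical helper named run_snippet. The handler is installed with local so that it only affects the eval'd code, and it turns any warning raised while the string is compiled or run into an exception:

```perl
use strict;
use warnings;

{
    my $x = 1;

    # run_snippet never mentions $x, so it does not capture it.
    sub run_snippet {
        my ($code) = @_;
        # Promote every warning raised by the eval'd code to an error.
        local $SIG{__WARN__} = sub { die "promoted to error: $_[0]" };
        my $value = eval $code;
        die $@ if $@;
        return $value;
    }
}

# By now the block has been exited, so $x is no longer available to the
# eval; whichever warning that triggers is turned into an exception.
my $value = eval { run_snippet('"x=$x"') };
my $error = $@;
print defined $value ? "got $value\n" : "failed: $error";
```

Which warning fires first depends on the Perl version: the "not available" warning where it exists, otherwise the uninitialized-value warning.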

Well, under the theory that functions containing evals and declared
within closures should capture all lexicals, $x wouldn't have been
destroyed yet. So I guess I'd expect it to do the same thing as the
case you didn't include, which is:

sub X::DESTROY { print "X::DESTROY called\n" }
{
my $x = bless {}, 'X';

sub f2 { print "x=$x\n" }


}
print "outside block\n";
f2;

In other words, the fact that the code is eval'd or not eval'd should
not affect the running of the code. Of course, this argues that for
consistency, even functions that only reference one of 500 lexicals in
the block should preserve all the lexicals, because if any of them were
recoded to contain an eval, then some of the side-effects of destroying
the other lexicals could change in timing. But if you expect the
lexical to be destroyed, why did you declare it inside the closure?
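
The difference in destruction timing between the two cases can be recorded explicitly. This is only a sketch with invented names (Tracked, kept_name). It logs events into an array rather than printing from DESTROY:

```perl
use strict;
use warnings;

my @events;
sub Tracked::DESTROY { push @events, "destroy $_[0]{name}" }

{
    # Nothing captures $plain, so it is freed when the block exits.
    my $plain = bless { name => 'plain' }, 'Tracked';
}
push @events, 'left first block';

{
    # kept_name mentions $kept directly, so it captures this instance
    # and keeps the object alive after the block exits.
    my $kept = bless { name => 'kept' }, 'Tracked';
    sub kept_name { return $kept->{name} }
}
push @events, 'left second block';
push @events, 'kept_name returns ' . kept_name();

print "$_\n" for @events;
```

The captured object is only destroyed when the program shuts down, which is the lifetime extension being debated here.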

>>Does anyone think this behavior of Perl/closures/evals is a bug? One
>>that is worth fixing? Or even possible to fix as outlined above? Or
>>that will ever get fixed? Or possibly even documented better, so the
>>trap can be avoided?
>
>
> It's not a bug, but it needs to be documented better. I made a start on
> improving the documentation, but life got in the way.

Yes, well the difference between a bug and a feature is whether they are
documented to work that way. Documentation in this area is very
sparse, so I guess that means that using it at all is a bug :)

> [ By the end of today I've got to write a risk assessment and a get-in
> schedule for a production of Macbeth that I foolishly agreed to get
> involved in. It's already 11pm, and I haven't started yet. Instead,
> I'm engaged in the displacement activity of replying to emails on p5p ;-) ]

A much appreciated displacement activity from my point of view. This is
a major project (for me) that is being affected, so I want to understand
what is what, before getting too far in the wrong direction.

In playing with the Perl debugger some, trying to see what was going on,
I think I observed that it is also affected by this sort of behavior, or
exacerbates this sort of behavior within programs that use evals, by
adding (I guess) another level of eval somewhere along the line? I
usually debug with print statements, so I'm not familiar enough with the
debugger to be sure of that. I need to do some more experimentation in
that area.

> Another approach is to ensure that the sub that calls the eval
> mentions all the lexicals that need to be captured and preserved beyond
> their normal lifespan, eg
>
> {
> my ($a,$b,$c);
> sub do_eval {
> { no warnings; $a; $b; $c } # capture lexicals
> eval $_[0];
> }
> }
>
> I don't know whether this would be practical in your situation though.

Yes, I thought of that, and I also thought of making additional subs as
accessors for each of the variables in the closure, which would also fix
the problem, by using the accessor functions in the evals.

These workarounds all seem uglier than preserving all the variables in
the closure automatically.
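
For comparison, the accessor idea might look like the sketch below. The names get_scalar, get_array and run_generated are hypothetical. The accessors are ordinary subs that mention the lexicals directly, so they capture them at compile time, and the generated code calls the accessors instead of naming the variables:

```perl
use strict;
use warnings;

{
    my $scalar = 33;
    my @array  = (1, 22, 333, 4444);

    # Each accessor mentions its variable, so each one captures it.
    sub get_scalar { return $scalar }
    sub get_array  { return @array }
}

# This sub mentions none of the lexicals; the eval'd code reaches them
# only through the accessors.
sub run_generated {
    my $value = eval $_[0];
    die $@ if $@;
    return $value;
}

print run_generated('join ",", get_scalar(), get_array()'), "\n";
```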

Dave Mitchell

Nov 26, 2003, 7:30:55 PM
to Glenn Linderman, perl5 porters
On Wed, Nov 26, 2003 at 04:10:17PM -0800, Glenn Linderman wrote:
> On approximately 11/26/2003 3:03 PM, came the following characters from
> the keyboard of Dave Mitchell:

But what you're proposing is that the mere presence of an eval statement
somewhere in your code suddenly makes all lexical variables within the
visibility of that eval immortal or semi-immortal. This is a major
change, would break lots of code, and would really annoy people. Many of
the closure bugs I fixed for 5.8.1 were ones relating to lexical variables
not being freed soon enough; now none of them would be freed!

I think a more sensible approach is to just state that the delayed
compilation effect of an eval simply means that some things may no
longer be available, in the same way the early compilation afforded by
BEGIN simply means that some things may not yet be available.
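
The BEGIN half of that analogy can be seen in a few lines. This is a sketch with invented variable names: the BEGIN block runs as soon as it has been compiled, so it sees the variable but not yet the value assigned to it at run time:

```perl
use strict;
use warnings;

my $greeting = 'hello';    # the assignment only happens at run time
my $seen_at_compile_time;

BEGIN {
    # This runs during compilation, before the assignment above has
    # executed, so $greeting exists but is still undefined.
    $seen_at_compile_time = defined $greeting ? $greeting : '(undef)';
}

print "BEGIN saw: $seen_at_compile_time\n";
print "run time sees: $greeting\n";
```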

> >[ By the end of today I've got to write a risk assessment and a get-in
> >schedule for a production of Macbeth that I foolishly agreed to get
> >involved in. It's already 11pm, and I haven't started yet. Instead,
> >I'm engaged in the displacement activity of replying to emails on p5p ;-) ]
>
> A much appreciated displacement activity from my point of view. This is
> a major project (for me) that is being affected, so I want to understand
> what is what, before getting too far in the wrong direction.

I've now written the bit where I point out that it's a good idea for the
actors not to kill themselves with the swords which we're letting them
play with. (Or words to that effect.)

> In playing with the Perl debugger some, trying to see what was going on,
> I think I observed that it is also affected by this sort of behavior, or
> exacerbates this sort of behavior within programs that use evals, by
> adding (I guess) another level of eval somewhere along the line?

Yes, before 5.10, the evals done by the debugger can have some
side-effects.


--
print+qq&$}$"$/$s$,$*${$}$g$s$@$.$q$,$:$.$q$^$,$@$*$~$;$.$q$m&if+map{m,^\d{0\,},,${$::{$'}}=chr($"+=$&||1)}q&10m22,42}6:17*2~2.3@3;^2$g3q/s"&=~m*\d\*.*g

Glenn Linderman

Nov 27, 2003, 2:05:29 AM
to Dave Mitchell, perl5 porters
On approximately 11/26/2003 4:30 PM, came the following characters from
the keyboard of Dave Mitchell:

Not quite. What I'm proposing is that the mere presence of an eval
statement within a sub defined in a closure, should make all lexical
variables within the visibility of that sub immortal or semi-immortal
(Your words, I'm not sure of all the implications of those terms, but
that sounds sort of correct to this fellow that has read perlguts, but
not used it).

I don't know how most people code Perl, but I generally don't define
subs except at the outer level of nesting, unless I intend it to be a
closure, and then I generally only have one or two variables defined in
that same block. This current project was the first time I've ventured
beyond one or two variables, and also the first time I am using eval in
a closure. And, that is clearly venturing into bug (not well
documented) territory.

> I think a more sensible approach is to just state that the delayed
> compilation effect of an eval simply means that some things may no
> longer be available, in the same way the early compilation afforded by
> BEGIN simply means that some things may not yet be available.

I'm not sure it is more sensible, but it is probably easier to implement
and harder to describe, and seems much less DWIMish... the description
would have to point out that lexical variables in a given scope are
sometimes treated one way, and sometimes another way, depending on the
contents of the code in the functions defined in that scope.

My naive understanding of closures from page 133 of Camel III is "The
eval STRING operator also works as a nested scope, since the code in the
eval can see its caller's lexicals (as long as the names aren't hidden
by identical declarations within the eval's own scope)." ... " If a
block evals a string that creates an anonymous subroutine, the
subroutine becomes a closure with FULL ACCESS to the lexicals of both
the eval and the block, even after the eval and the block have exited."

Of course, that is only Larry Wall, Tom Christiansen, and Jon Orwant
speaking... what do they know about how Perl works ??? You might send
them a note for Camel IV, though, although Camel IV will probably cover
Perl 6... [Tongue in cheek, I shouldn't do that to you, because you've
been very helpful in confirming that the issue I've run into is an
issue, and describing how it does work, as opposed to how I'd expect it
to work from what I have found to read.... and whether it gets fixed or
documented in the future, I'm coding today, and do appreciate the help
in understanding The Way Things Are.]

>>In playing with the Perl debugger some, trying to see what was going on,
>>I think I observed that it is also affected by this sort of behavior, or
>> exacerbates this sort of behavior within programs that use evals, by
>>adding (I guess) another level of eval somewhere along the line?
>
>
> Yes, before 5.10, the evals done by the debugger can have some
> side-effects.

And, once again, thanks for confirming that.

Alan Burlison

Nov 27, 2003, 5:05:44 AM
to Glenn Linderman, Dave Mitchell, perl5 porters
Glenn Linderman wrote:

> My naive understanding of closures from page 133 of Camel III is "The
> eval STRING operator also works as a nested scope, since the code in the
> eval can see its caller's lexicals (as long as the names aren't hidden
> by identical declarations within the eval's own scope)." ... " If a
> block evals a string that creates an anonymous subroutine, the
> subroutine becomes a closure with FULL ACCESS to the lexicals of both
> the eval and the block, even after the eval and the block have exited."

But that's precisely what you are *not* doing. The eval of $innerstring is
deferred until you eval $outerstring, and by that time the variables that
you refer to have already gone out of scope and have been destroyed. The
behaviour you observe is exactly as documented above.
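
For contrast, this is a minimal sketch of the situation the Camel passage does describe, with invented names. The eval runs while the enclosing block is still executing, so the lexical is live when the new sub is compiled and gets captured:

```perl
use strict;
use warnings;

my $reader;
{
    my $x = 42;
    # The eval runs while the block is still executing, so $x is live
    # and the new anonymous sub captures it.
    $reader = eval 'sub { return $x }';
    die $@ if $@;
}

# The block has exited, but the closure still holds its instance of $x.
print "x=", $reader->(), "\n";
```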

If you are worried about namespace pollution by the global variables, why
not put them in a separate package?

--
Alan Burlison
--

Dave Mitchell

Nov 27, 2003, 9:34:48 AM
to Glenn Linderman, perl5 porters
On Wed, Nov 26, 2003 at 11:05:29PM -0800, Glenn Linderman wrote:
> On approximately 11/26/2003 4:30 PM, came the following characters from
> the keyboard of Dave Mitchell:
> >But what you're proposing is that the mere presence of an eval statement
> >somewhere in your code suddenly makes all lexical variables within the
> >visibility of that eval immortal or semi-immortal. This is a major
> >change, would break lots of code, and would really annoy people. Many of
> >the closure bugs I fixed for 5.8.1 were ones relating to lexical variables
> >not being freed soon enough; now none of them would be freed!
>
> Not quite. What I'm proposing is that the mere presence of an eval
> statement within a sub defined in a closure,

I think your terminology is wrong here. The sub is not defined in a
closure (a closure is just a sub that refers to outer lexicals); I think
you meant "sub defined within an enclosing block".

> should make all lexical
> variables within the visibility of that sub immortal or semi-immortal
> (Your words, I'm not sure of all the implications of those terms, but
> that sounds sort of correct to this fellow that has read perlguts, but
> not used it).
>
> I don't know how most people code Perl, but I generally don't define
> subs except at the outer level of nesting, unless I intend it to be a
> closure, and then I generally only have one or two variables defined in
> that same block. This current project was the first time I've ventured
> beyond one or two variables, and also the first time I am using eval in
> a closure. And, that is clearly venturing into bug (not well
> documented) territory.

Consider the following file, Foo.pm:

package Foo;
sub DESTROY {...}
my $x = bless {};
my $y = bless {};
sub f { $x }

Here, f() is a closure too, even though it isn't enclosed by surrounding
braces. When you do a C<use Foo>, DESTROY is called for $y as soon as the
C<use> is completed, even before the rest of the main program has been
compiled, while $x lives on until the end of the program's execution.

>
> >I think a more sensible approach is to just state that the delayed
> >compilation effect of an eval simply means that some things may no
> >longer be available, in the same way the early compilation afforded by
> >BEGIN simply means that some things may not yet be available.
>
> I'm not sure it is more sensible, but it is probably easier to implement
> and harder to describe, and seems much less DWIMish... the description
> would have to point out that lexical variables in a given scope are
> sometimes treated one way, and sometimes another way, depending on the
> contents of the code in the functions defined in that scope.

IMHO the current behaviour is consistent, if underdocumented.

I think at this point we will have to agree to disagree.

Regards,

Dave.

--
In the 70's we wore flares because we didn't know any better.
What possible excuse does the current generation have?

Glenn Linderman

Nov 27, 2003, 12:29:09 PM11/27/03
to Dave Mitchell, perl5 porters
On approximately 11/27/2003 6:34 AM, came the following characters from
the keyboard of Dave Mitchell:
> On Wed, Nov 26, 2003 at 11:05:29PM -0800, Glenn Linderman wrote:
>
>>On approximately 11/26/2003 4:30 PM, came the following characters from
>>the keyboard of Dave Mitchell:
>>
>>>But what you're proposing is that the mere presence of an eval statement
>>>somewhere in your code suddenly makes all lexical variables within the
>>>visibility of that eval immortal or semi-immortal. This is a major
>>>change, would break lots of code, and would really annoy people. Many of
>>>the closure bugs I fixed for 5.8.1 were ones relating to lexical variables
>>>not being freed soon enough; now none of them would be freed!
>>
>>Not quite. What I'm proposing is that the mere presence of an eval
>>statement within a sub defined in a closure,
>
>
> I think your terminology is wrong here. The sub is not defined in a
> closure (a closure is just a sub that refers to outer lexicals); I think
> you meant "sub defined within an enclosing block"

I can agree that your description is what I was referring to.

>>should make all lexical
>>variables within the visibility of that sub immortal or semi-immortal
>>(Your words, I'm not sure of all the implications of those terms, but
>>that sounds sort of correct to this fellow that has read perlguts, but
>>not used it).
>>
>>I don't know how most people code Perl, but I generally don't define
>>subs except at the outer level of nesting, unless I intend it to be a
>>closure, and then I generally only have one or two variables defined in
>>that same block. This current project was the first time I've ventured
>>beyond one or two variables, and also the first time I am using eval in
>>a closure. And, that is clearly venturing into bug (not well
>>documented) territory.
>
>
> Consider the following file, Foo.pm:
>
> package Foo;
> sub DESTROY {...}
> my $x = bless {};
> my $y = bless {};
> sub f { $x }
>
> Here, f() is a closure too, even though it isn't enclosed by surrounding
> braces. When you do a C<use Foo>, DESTROY is called for $y as soon as the
> C<use> is completed, even before the rest of the main program has been
> compiled, while $x lives on until the end of the program's execution.

Hmm. Yes, that broadens my mental picture of a closure. Of course,
since $y isn't referenced, it probably shouldn't even be defined
(certainly the shown definition is useless, but perhaps there are some
desirable side effects of doing something similar in certain cases).

If $y weren't defined, it wouldn't be a problem if it were kept around :)

But I can also see that because of things like this, a lot more stuff
could be kept around than seems to be necessary...

But if it is the case that $y is intended to live a very transient life,
wouldn't it be better to make that explicit?

package Foo;
sub DESTROY {...}
my $x = bless {};
{
my $y = bless {};
}
sub f { $x }

Then keeping all the lexicals for use by evals would be less
problematical... and the existing Camel descriptions would be leading
(that is intended to be the opposite of misleading, here :) ).

>>>I think a more sensible approach is to just state that the delayed
>>>compilation effect of an eval simply means that some things may no
>>>longer be available, in the same way the early compilation afforded by
>>>BEGIN simply means that some things may not yet be available.
>>
>>I'm not sure it is more sensible, but it is probably easier to implement
>>and harder to describe, and seems much less DWIMish... the description
>>would have to point out that lexical variables in a given scope are
>>sometimes treated one way, and sometimes another way, depending on the
>>contents of the code in the functions defined in that scope.
>
>
> IMHO the current behaviour is consistent, if underdocumented.

Oh, I agree that there are a number of ways to make it consistent
(and a number of definitions of "current" too, I'm assuming you are
referring to 5.10 where there are a number of bug fixes that hopefully
reach consistency)... and that it is underdocumented. Perhaps even
somewhat erroneously documented, per the Camel passages I quoted... they
certainly lead the naive user to think that all the lexicals will be
available to evals.

> I think at this point we will have to agree to disagree.

About what is more DWIMish, perhaps so. About how easy it would be to
document things one way vs. the other, perhaps so. About how it
actually works, you are much more the expert than I, and I appreciate
your time and responses, they were very helpful and educational.

> Regards,
>
> Dave.

Glenn Linderman

Nov 27, 2003, 12:52:11 PM11/27/03
to Alan Burlison, Dave Mitchell, perl5 porters
On approximately 11/27/2003 2:05 AM, came the following characters from
the keyboard of Alan Burlison:

> Glenn Linderman wrote:
>
>> My naive understanding of closures from page 133 of Camel III is "The
>> eval STRING operator also works as a nested scope, since the code in
>> the eval can see its caller's lexicals (as long as the names aren't
>> hidden by identical declarations within the eval's own scope)." ... "
>> If a block evals a string that creates an anonymous subroutine, the
>> subroutine becomes a closure with FULL ACCESS to the lexicals of both
>> the eval and the block, even after the eval and the block have exited."
>
> But that's precisely what you are *not* doing. The eval of $innerstring
> is deferred until you eval $outerstring, and by that time the variables
> that you refer to have already gone out of scope and have been
> destroyed. The behaviour you observe is exactly as documented above.

"eval works as a nested scope" ... "the code in the eval can see its
caller's lexicals"... hmmm. "Nested scope", in every language
definition I have ever seen before, means that the innermost scope can
see the variables defined in each outer enclosing scope... not just its
immediate parent.

So yes, my example has multiple nested scopes. In order of hierarchy,
let's name the scopes:

file scope
anonymous block scope
eval $outerstring scope
eval $innerstring scope

So if an eval STRING operator also works as a nested scope, then it
should have access to the variables declared in each of the enclosing
scopes. But it doesn't, as my code demonstrates, and Dave Mitchell
concurs based on the implementation.

So while the desired implementation (according to Dave's emails in this
thread) may be much closer to the current implementation, rather than
what is documented here, I strongly disagree that the behavior I am
observing is what is documented in these passages quoted from Camel, or
that the term "nested scope" can appropriately be used to describe the
current implementation behavior.

> If you are worried about namespace pollution by the global variables,
> why not put them in a separate package?

This may be the best alternative to reach my goals, among TWTDI that are
presently available and presently provide the functionality desired,
although by resolving the namespace conflicts, I have been able to
continue making progress. But I'd like others not to fall into the trap
that I did, so I'd like to see either the documentation clarified to
describe the limitations of the implementation, or the implementation
enhanced to do what the documentation currently says. Of course the
docs shipped with perl are much less explicit regarding the description
of the scope of an eval than the Camel is... which one should be
considered definitive?

Stas Bekman

Nov 28, 2003, 4:10:12 AM11/28/03
to Dave Mitchell, perl5 porters
Dave, I took the liberty to add your explanation almost verbatim to perlref.
You may want to polish it later, but I think it's pretty important to put that
info in. And your explanation and the example were pretty clear to me. I
mainly stressed that it's the string form of eval (it doesn't apply to eval {} )

Also I hope that I put it in the right place in perlref, as it explains
closures in two places.

Here is the patch:

--- pod/perlref.pod.orig 2003-11-28 00:58:48.000000000 -0800
+++ pod/perlref.pod 2003-11-28 01:06:17.000000000 -0800
@@ -195,6 +195,51 @@
continue to work as they have always worked. Closure is not something
that most Perl programmers need trouble themselves about to begin with.

+Note that lexical variables used inside the string form of C<eval()>
+might not create a closure if not references elsewhere in the
+subroutine. This is because only when the string is eval'ed at
+run-time, can perl see that a variable is references from within the
+the eval'ed string. At compile time it's not known whether there is a
+lexical variable inside the string or not. For example:
+
+ {
+ my $x = 1;
+ sub f1 { print "f1: x=$x\n" }
+ sub f2 { eval 'print "f2: x=$x\n"' }
+ }
+
+ f1;
+ f2;
+
+produces the following output:
+
+ f1: x=1
+ Use of uninitialized value in concatenation (.) or string at (eval 1) line 1.
+ f2: x=
+
+What happens here is that when f1 is compiled, the compiler notices
+that the sub f1 makes mention of the outer lexical $x, so f1 gets its
+own private reference to that variable. When f2 is compiled, it has no
+such mention of $x, so it doesn't also capture $x.
+
+After the { } block is exited, the interpreter discards the current
+(and only) instance of $x. When f1 is later called, it still has its
+private copy of $x, and so can print out its value. When the eval is
+compiled via f2, f2 hasn't got a private copy of $x, so the eval tries
+to grab the value of the 'real' $x, which is now undef.
+
+In the current development version of perl, you get a warning when
+this happens:
+
+ f1: x=1
+ Variable "$x" is not available at (eval 1) line 1.
+ Use of uninitialized value in concatenation (.) or string at (eval 1) line 1.
+ f2: x=
+
+Therefore be careful when wanting to create a closure and using the
+string form of eval.
+
+
=item 5.

References are often returned by special subroutines called constructors.


__________________________________________________________________
Stas Bekman JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide ---> http://perl.apache.org
mailto:st...@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org http://ticketmaster.com

Brad Baxter

Nov 28, 2003, 9:39:59 AM11/28/03
to perl5 porters
On Wed, 26 Nov 2003, Glenn Linderman wrote:
> I'm not sure it is more sensible, but it is probably easier to implement
> and harder to describe, and seems much less DWIMish... the description
> would have to point out that lexical variables in a given scope are
> sometimes treated one way, and sometimes another way, depending on the
> contents of the code in the functions defined in that scope.
>
> My naive understanding of closures from page 133 of Camel III is "The
> eval STRING operator also works as a nested scope, since the code in the
> eval can see its caller's lexicals (as long as the names aren't hidden
> by identical declarations within the eval's own scope)." ... " If a
> block evals a string that creates an anonymous subroutine, the
> subroutine becomes a closure with FULL ACCESS to the lexicals of both
> the eval and the block, even after the eval and the block have exited."

I'm sorry, but I don't agree that the passage you've quoted would lead to
your conclusion about the visibility of the lexicals. Your code defines a
subroutine that contains an eval string. The passage describes what
happens when such a subroutine is called. The fact that the passage also
mentions the possibility of the evalled code defining a subroutine is a
completely different situation. I don't know, maybe you're suggesting
something like "The eval STRING operator also works as a nested scope [at
the time the eval is called], since the code ...", but to me that's
inferred by, "can see its caller's lexicals". The big difference, I
think, is between "defined" and "called".

Regards,

Brad

Brad Baxter

Nov 28, 2003, 9:45:58 AM11/28/03
to perl5 porters
On Fri, 28 Nov 2003, Stas Bekman wrote:

> +Note that lexical variables used inside the string form of C<eval()>
> +might not create a closure if not references elsewhere in the

+might not create a closure if not referenced elsewhere in the
^

> +subroutine. This is because only when the string is eval'ed at
> +run-time, can perl see that a variable is references from within the

+run-time, can perl see that a variable is referenced from within the
^

Regards,

Brad

Brad Baxter

Nov 28, 2003, 11:57:37 AM11/28/03
to perl5 porters
On Fri, 28 Nov 2003, Brad Baxter wrote:
> I'm sorry, but I don't agree that the passage you've quoted would lead to
> your conclusion about the visibility of the lexicals. Your code defines a
> subroutine that contains an eval string. The passage describes what
> happens when such a subroutine is called. The fact that the passage also
> mentions the possibility of the evalled code defining a subroutine is a
> completely different situation. I don't know, maybe you're suggesting
> something like "The eval STRING operator also works as a nested scope [at
> the time the eval is called], since the code ...", but to me that's
> inferred by, "can see its caller's lexicals". The big difference, I
> think, is between "defined" and "called".

Hmmm, I confess I expected the code below to print "x=2". The fact that
it doesn't makes me think I ought not to have spoken up. :-)

{
my $x = 1;
sub f1 { no warnings; $x; eval 'print "f1: x=$x\n"' }
sub f2 { eval 'print "f2: x=$x\n"'; no warnings; $x; }
}

my $x = 2;
f1;
f2;
____

f1: x=1
f2: x=1


Best regards,

Brad


Stas Bekman

Nov 28, 2003, 12:23:59 PM11/28/03
to Brad Baxter, perl5 porters
Brad Baxter wrote:

> +might not create a closure if not referenced elsewhere in the

...


> +run-time, can perl see that a variable is referenced from within the

Thanks Brad, here is the fixed patch:

--- pod/perlref.pod.orig 2003-11-28 00:58:48.000000000 -0800
+++ pod/perlref.pod 2003-11-28 09:22:51.000000000 -0800
@@ -195,6 +195,51 @@
continue to work as they have always worked. Closure is not something
that most Perl programmers need trouble themselves about to begin with.

+Note that lexical variables used inside the string form of C<eval()>
+might not create a closure if not referenced elsewhere in the
+subroutine. This is because only when the string is eval'ed at
+run-time, can perl see that a variable is referenced from within the
Glenn Linderman

Nov 28, 2003, 2:02:10 PM11/28/03
to Brad Baxter, perl5 porters
On approximately 11/28/2003 8:57 AM, came the following characters from
the keyboard of Brad Baxter:

Thanks for being honest, Brad, and actually writing code to test your
assumptions. This is a very arcane area that I have stumbled into, it
seems.

The statement in the Camel seems very understandable from a naive point
of view, and it is clear that the implementation doesn't provide a
complete nested scope. Yet that seems to be the intention. Of course,
since the implementation appears to have never actually implemented what
the Camel says, Dave and Alan seem to be concerned about breaking
behavior that has been part of the implementation for many years, rather
than conforming to the description in the Camel.

As it seems there are other ways for me to achieve my goals in my
current project, and since I'm not at a point of time availability or
knowledge of Perl internals to contribute patches for the issue (to see
just how much code would break if the Camel description were
implemented), I'm content to see either code or documentation modified,
so as to eliminate the trap that I fell into... I did read both Camel
and pod, Camel was clear (but doesn't match the implementation), pod was
nearly silent on the topic. I appreciate Stas updating the perlref.pod
with Dave's description of how things really work... had that been in
there before, I would have obtained a better understanding of how it
works before trying it, and getting very confused with the results.

The terms "closure" and "nested scope" seem to have very pure
definitions, but the current Perl implementation doesn't seem to be
particularly pure with regards to evals. So the description of how Perl
works can use the terms "closure" and "nested scope" as a first
approximation of how things work, but Dave's description of how things
actually work, which is somewhat more limited, is very necessary to
understanding the current situation.

I would hope that the implementation would eventually migrate to the
pure definition listed in the Camel as that is much easier to
understand, and provides a much cleaner basis for coding.

Stas Bekman

Nov 28, 2003, 4:40:56 PM11/28/03
to Glenn Linderman, Brad Baxter, perl5 porters
Glenn Linderman wrote:
[...]

> The terms "closure" and "nested scope" seem to have very pure
> definitions, but the current Perl implementation doesn't seem to be
> particularly pure with regards to evals. So the description of how Perl
> works can use the terms "closure" and "nested scope" as a first
> approximation of how things work, but Dave's description of how things
> actually work, which is somewhat more limited, is very necessary to
> understanding the current situation.

In the mod_perl world users of Apache::Registry have to deal with closure
problems quite often, since most people aren't aware that what registry does
is wrap your cgi script into a 'sub handler { }' wrapper, turning any
other sub foo {} in the cgi script into a closure if it references any
lexical variable defined on what used to be the top level of the cgi script.
Therefore we have a special document dedicated to this kind of tricky Perl issue:
http://perl.apache.org/docs/general/perl_reference/perl_reference.html#my___Scoped_Variable_in_Nested_Subroutines
http://perl.apache.org/docs/general/perl_reference/perl_reference.html#Understanding_Closures____the_Easy_Way
http://perl.apache.org/docs/general/perl_reference/perl_reference.html#When_You_Cannot_Get_Rid_of_The_Inner_Subroutine

I find the following section to be the most helpful to understanding closures:
http://perl.apache.org/docs/general/perl_reference/perl_reference.html#Understanding_Closures____the_Easy_Way
If I remember correctly it was Randal Schwartz who suggested this technique to
tell closures from non-closures.

Dave Mitchell

Nov 28, 2003, 4:47:51 PM11/28/03
to Glenn Linderman, Brad Baxter, perl5 porters

AAAARRRRRRRRRGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHH !!!!!!!!!!!!!!!!!!!!!!!!

The implementation provides a completely correct nested scope. It is not
broken, it does not require fixing. I am not advocating not touching
broken behaviour for backwards compatibility reasons, I am happy that the
current behaviour is correct and sensible.

At all times Perl follows the following five basic rules:

1. Any *use* of a lexical variable is, at *compile* time, matched against
the nearest *lexically* (not dynamically) matching 'my' declaration.

2. Each time a block with a 'my' declaration is entered, a new instance
of that lexical is created, and each time the block is exited, that
instance is discarded (of course, if something else holds a reference
to it, then the actual thing itself continues to exist, it is just not
accessible via the lexical name). The one proviso to this is that the
first instance exists from the moment of creation of the sub or file it
resides in, rather than from entry into the block during first execution.

3. At the creation time of a sub that references an outer lexical,
that sub captures the current instance of that lexical. If there is no
currently valid instance, a warning is issued, and a new undef value is
'captured' instead. For named subs, creation time equals compilation time;
for anonymous subs, creation time is later, when you execute the 'sub'
bit.

4. For our purposes, a string eval is just a sub that is compiled once,
executed once, then discarded.

5. When a sub has captured an instance, any mention of '$x' which is
lexically contained within that sub, such as in an eval or nested
anonymous sub, will see that captured instance rather than the outer one.

Okay, I've skipped a few subtleties, such as what constitutes a
sub, and what happens after you undef a sub, but that's the gist of it.

Under those rules, the code above should clearly print out 1, because:
under rule 1, the $x in f1 and f2 is always matched to the declaration on
the line above them; under rule 3, f1 and f2 capture the first instance of
$x at their compile time; during execution this first instance is assigned
the value 1; later when the evals are compiled, the '$x' in the evals are
matched to the instance captured by f1 and f2, which has the value 1.
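
Rules 2 and 3 can be seen in isolation with a small sketch. The helper name make_counter is hypothetical and chosen only for illustration: each call enters the sub body afresh and gets a new instance of the lexical, and each anonymous sub captures whichever instance is current when its 'sub' expression executes.

```perl
use strict;
use warnings;

# Hypothetical helper: every call creates a new instance of $count (rule 2).
sub make_counter {
    my $count = 0;
    # The anonymous sub is created when this 'sub' expression executes,
    # and captures the instance of $count current at that moment (rule 3).
    return sub { return ++$count };
}

my $first  = make_counter();
my $second = make_counter();

$first->();
$first->();
my $first_total  = $first->();     # third call on the first counter
my $second_total = $second->();    # first call on the second counter

print "first=$first_total second=$second_total\n";
```

The two counters advance independently because they hold different instances of $count, even though both were matched to the same 'my' declaration at compile time (rule 1).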

As regards to the Camel text on p.133:

"the code in the eval can see its caller's lexicals"

{
my $x = 1;
eval 'print "x=$x\n"'; # prints 1
}

"Anonymous subroutines can likewise access any lexical variables from their
caller's scopes; if they do so, they're what are known as closures."

our $a;
{
my $x = 1;
$a = sub { print "x=$x\n" };
}
$a->(); # prints 1

"Combining those two notions, if a block evals a string that creates an
anonymous subroutine, the subroutine becomes a closure with full access to
the lexicals of both the eval and the block, even after the eval and the
block have exited."

our $a;
{
my $x1 = 1;
eval q[
my $x2 = 2;
$a = sub { print "x1=$x1, x2=$x2\n" }
]
}
$a->(); # prints 1, 2


So, the camel text is not incorrect.
Of course, the Camel text glosses over the complications of what happens
when the lexical is not declared in a *directly* enclosing block to the
eval, such as

{ my $x; sub f { eval '$x' } }

but then we've all already agreed that this behaviour is currently
underdocumented.

This is really going to be my final word on the subject.
I've got a theatre production which will soak up this weekend and my
evenings next week, so I'm probably not going to be responding to emails
much.

Dave.

--
In my day, we used to edit the inodes by hand. With magnets.

Dave Mitchell

Nov 28, 2003, 4:53:37 PM11/28/03
to Stas Bekman, Brad Baxter, perl5 porters
On Fri, Nov 28, 2003 at 09:23:59AM -0800, Stas Bekman wrote:
> Brad Baxter wrote:
>
> >+might not create a closure if not referenced elsewhere in the
> ...
> >+run-time, can perl see that a variable is referenced from within the
>
> Thanks Brad, here is the fixed patch:

Thanks for this Stas. Given that I'm already in the process of trying
to update the documentation on closures, I'd suggest that this isn't
applied yet, but that instead I'll try to incorporate it into what I'm
doing. If after a few weeks I've failed to deliver the goods, then I'll
apply it anyway.


Dave.

--
"Strange women lying in ponds distributing swords is no basis for a system
of government. Supreme executive power derives from a mandate from the
masses, not from some farcical aquatic ceremony."
-- Dennis - Monty Python and the Holy Grail.

Rafael Garcia-Suarez

Nov 28, 2003, 4:59:10 PM11/28/03
to Dave Mitchell, Glenn Linderman, Brad Baxter, perl5 porters
Dave Mitchell wrote:
>
> The implementation provides a completely correct nested scope. It is not
> broken, it does not require fixing. I am not advocating not touching
> broken behaviour for backwards compatibility reasons, I am happy that the
> current behaviour is correct and sensible.

If my advice has some weight, I'm happy to say that I'm 100% with Dave
here.

(Dave, thanks for writing this, I was just going to write a similar
reply, but much less clear:)

Stas Bekman

Nov 28, 2003, 5:24:37 PM11/28/03
to Dave Mitchell, Brad Baxter, perl5 porters
Dave Mitchell wrote:

> Thanks for this Stas. Given that I'm already in the process of trying
> to update the documentation on closures, I'd suggest that this isn't
> applied yet, but that instead I'll try to incorporate it into what I'm
> doing. If after a few weeks I've failed to deliver the goods, then I'll
> apply it anyway.

Sure, I'll let you handle this then ;) Thanks Dave.

Glenn Linderman

Nov 28, 2003, 5:46:48 PM11/28/03
to Stas Bekman, Brad Baxter, perl5 porters
On approximately 11/28/2003 1:40 PM, came the following characters from
the keyboard of Stas Bekman:

> Glenn Linderman wrote:
> [...]
>
>> The terms "closure" and "nested scope" seem to have very pure
>> definitions, but the current Perl implementation doesn't seem to be
>> particularly pure with regards to evals. So the description of how
>> Perl works can use the terms "closure" and "nested scope" as a first
>> approximation of how things work, but Dave's description of how things
>> actually work, which is somewhat more limited, is very necessary to
>> understanding the current situation.
>
> In the mod_perl world users of Apache::Registry have to deal with
> closure problems quite often. Since most people aren't aware that what
> registry does is wrapping your cgi script into a 'sub handler { }'
> wrapper, turning any other sub foo {} in the cgi script into a closure
> if it references any lexical variable defined on what used to be a top
> level of the cgi script. Therefore we have a special document dedicated
> to this kind of tricky Perl issues:
> http://perl.apache.org/docs/general/perl_reference/perl_reference.html#my___Scoped_Variable_in_Nested_Subroutines
>
> http://perl.apache.org/docs/general/perl_reference/perl_reference.html#Understanding_Closures____the_Easy_Way
>
> http://perl.apache.org/docs/general/perl_reference/perl_reference.html#When_You_Cannot_Get_Rid_of_The_Inner_Subroutine
>
>
> I find the following section to be the most helpful to understanding
> closures:
> http://perl.apache.org/docs/general/perl_reference/perl_reference.html#Understanding_Closures____the_Easy_Way
>
> If I remember correctly it was Randal Schwartz who suggested this
> technique to tell closures from non-closures.

Thanks, Stas, more documentation is always better than less
documentation, and that looks like a very nice explanation of some parts
of Perl closures, and some issues when using Apache, which I haven't yet
run into, but as my web host uses Apache, I may someday run into them...
I'll have to read the whole document at some point -- in my cursory
searches of Apache documentation thus far, I hadn't yet found that
document.

The sections you mentioned don't address issues with mixing in eval
STRING constructs, which seems to be the root of the discussion at hand,
however. I wonder if the Apache::Registry affects that issue at all,
like the Perl debugger (prior to version 5.10) does. I don't have the
bandwidth or need to pursue that line of thought, at present.

Stas Bekman

Nov 28, 2003, 6:27:26 PM11/28/03
to Glenn Linderman, Brad Baxter, perl5 porters
Glenn Linderman wrote:
[...]
>> http://perl.apache.org/docs/general/perl_reference/perl_reference.html#Understanding_Closures____the_Easy_Way
[...]

> The sections you mentioned don't address issues with mixing in eval
> STRING constructs, which seems to be the root of the discussion at hand,
> however. I wonder if the Apache::Registry affects that issue at all,
> like the Perl debugger (prior to version 5.10) does. I don't have the
> bandwidth or need to pursue that line of thought, at present.

I think it's quite relevant as it does show you whether sub { eval '$x' } is a
closure or not (it isn't). e.g.:

1)
% perl -le 'for (1..2) { my $x; push @a, sub { eval q[$x] } }; print join
"\n", @a '
CODE(0x8064fdc)
CODE(0x8064fdc)

2)
% perl -le 'for (1..2) { my $x; push @a, sub { eval qq[$x] } }; print join
"\n", @a '
CODE(0x804c65c)
CODE(0x8065064)

You can clearly see that eval 'string' (1) creates no closure, because the
compiler can't see $x, whereas when it sees $x (2) it creates a closure.

3)
% perl -le 'for (1..2) { my $x; push @a, sub { $x; eval q[$x] } }; print join
"\n", @a '
CODE(0x804c65c)
CODE(0x806508c)

and here you can see that we have a closure, because $x is referenced outside
of eval string.
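
The same observation suggests a workaround for named subs, sketched below under the assumption that a bare mention of the variable is enough for the compiler to notice it. The sub name show_x is hypothetical; the pattern follows the f1 example posted earlier in this thread.

```perl
use strict;
use warnings;

{
    my $x = 1;

    # Hypothetical example sub: the bare mention of $x is visible to the
    # compiler, so the named sub captures $x when it is compiled. The
    # string eval then resolves its '$x' to that captured instance.
    sub show_x {
        no warnings 'void';    # silence "Useless use of private variable"
        $x;
        return eval q[$x];
    }
}

my $seen = show_x();
print "x=$seen\n";
```

Without the bare `$x;` line the sub would have nothing to capture at compile time, which is the case the rest of this thread is about.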

Glenn Linderman

Nov 28, 2003, 7:08:04 PM11/28/03
to Stas Bekman, Brad Baxter, perl5 porters
On approximately 11/28/2003 3:27 PM, came the following characters from
the keyboard of Stas Bekman:

> Glenn Linderman wrote:


> [...]
>
>>> http://perl.apache.org/docs/general/perl_reference/perl_reference.html#Understanding_Closures____the_Easy_Way
>
>
> [...]
>
>> The sections you mentioned don't address issues with mixing in eval
>> STRING constructs, which seems to be the root of the discussion at
>> hand, however. I wonder if the Apache::Registry affects that issue at
>> all, like the Perl debugger (prior to version 5.10) does. I don't
>> have the bandwidth or need to pursue that line of thought, at present.
>
>
> I think it's quite relevant as it does show you whether sub { eval '$x'
> } is a closure or not (it isn't). e.g.:

Ah yes, OK, none of the examples in the document used evals, though, and
the mentioned sections of the document don't discuss that topic at all.
But yes, I can see now how the technique is relevant... but in the
cases of concern we'd already figured out that the variables were not
being captured, and were mostly discussing why. This could have been a
handy tool in my arsenal when debugging, if I had known about it at the
time, but actually, because of the number of variables in my closure,
I'm not sure I would have been able to discriminate which variables were
captured or exactly when.

--

Glenn Linderman

Nov 28, 2003, 8:34:44 PM11/28/03
to Dave Mitchell, Brad Baxter, perl5 porters
On approximately 11/28/2003 1:47 PM, came the following characters from
the keyboard of Dave Mitchell:

Dave, sorry for all the frustration here, but I really appreciate your
sticking with it this far, and coming up with the 5 points below. Maybe
now I have it... this is very enlightening, even more so than your
discussions up to now.

> The implementation provides a completely correct nested scope. It is not
> broken, it does not require fixing. I am not advocating not touching
> broken behaviour for backwards compatibility reasons, I am happy that the
> current behaviour is correct and sensible.

I apologize for attributing emotions to you which were not yours, but
rather based on my still incomplete understanding of what was happening.

> At all times Perl follows the following five basic rules:
>
> 1. Any *use* of a lexical variable is, at *compile* time, matched against
> the nearest *lexically* (not dynamically) matching 'my' declaration.
>
> 2. Each time a block with a 'my' declaration is entered, a new instance
> of that lexical is created, and each time the block is exited, that
> instance is discarded (of course, if something else holds a reference
> to it, then the actual thing itself continues to exist, it is just not
> accessible via the lexical name). The one proviso to this is that the
> first instance exists from the moment of creation of the sub or file it
> resides in, rather than from entry into the block during first execution.

By creation in that last sentence, I assume you mean compilation... at
least in the half referring to the file. Or maybe you really mean
"creation of the scope for a sub or file" ??? So a little wordsmithing
might improve the clarity of that sentence, but once again, I think I
have it.

> 3. At the creation time of a sub that references an outer lexical,
> that sub captures the current instance of that lexical. If there is no
> currently valid instance, a warning is issued, and a new undef value is
> 'captured' instead. For named subs, creation time equals compilation time;
> for anonymous subs, creation time is later, when you execute the 'sub'
> bit.

"currently valid instance" could use a little more explanation. Does
this mean "instance that is not undef"? From my new understanding based
on this set of 5 rules, I would think that all instances are valid, but
some are undef. So the warning for capturing an outer lexical that is
undef, is kind of like the warning for operating with a numeric or
string operator on a variable that is undef ... only in this case the
operation is "capture variable from outer scope for later use", and the
warning (and now I think I understand why you didn't make it an error)
is appropriate because there isn't much need to capture undef values for
later use, one could obtain lots of undef values later! The benefit of
capturing the variable, is when it has a particular value that should be
preserved.

> 4. For our purposes, a string eval is just a sub that is compiled once,
> executed once, then discarded.
>
> 5. When a sub has captured an instance, any mention of '$x' which is
> lexically contained within that sub, such as in an eval or nested
> anonymous sub, will see that captured instance rather than the outer one.

Thank you!

Right. And I think the key point of confusion for me was (and I hope it
stays "was", and that I am not still confused) that because the
variables are lexical, they are recreated each time the block is
entered... but *because the block is not always entered at the top*,
code like

    our $a;
    {
        my $x1 = 1;

        sub f1
        {
            my $x2 = 2;
            eval q[print "x1=$x1, x2=$x2\n" ];
        }
    }
    & f1(); # prints x1=, x2=2

bypasses the initialization statement for $x1 when the block is
reentered by the call. And although a variable that is "captured" for a
closure seems to be "persistent", in fact it is only persistent in the
subblock(s) that captured it.
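
A hedged sketch of that case, assuming a perl that includes the later closure fixes (5.8.0 itself may behave differently). The variables $inside, $outside and $seen are invented for the illustration. It calls f1 once while the block's instance of $x1 is live and once after the block has been left:

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($inside, $outside);
{
    my $x1 = 1;

    sub f1 {
        my $x2 = 2;
        # f1 never mentions $x1 outside the string, so nothing is captured
        # when f1 is compiled. The lookup happens each time the eval runs.
        my $seen = eval q[ $x1 ];
        return 'x1=' . (defined $seen ? $seen : '(undef)') . ", x2=$x2";
    }

    $inside = f1();     # called while the block's instance of $x1 is live
}
$outside = f1();        # called after the block has been left

print "inside:  $inside\n";
print "outside: $outside\n";
```

On such a perl the first call should report x1=1 and the second x1=(undef), with a 'Variable "$x1" is not available' warning collected in @warnings.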

> This is really going to be my final word on the subject.
> I've got a theatre production which will soak up this weekend and my
> evenings next week, so I'm probably not going to be responding to emails
> much.

Thanks again for writing down these rules. I read in a different
subthread that you were working on the documentation, and if you can get
the meat of this discussion into the documentation, I think it will go a
long way towards keeping future wumpuses from being quite as ignorant as
I. I would be delighted to read/proof-read/comment on such future
documentation, if you think it would help, although the real proof of
the documentation is if a future ignorant wumpus can read it and be
educated, without causing you the frustration that I have.

> Dave.

Dave Mitchell

Nov 29, 2003, 8:27:34 AM
to Glenn Linderman, Brad Baxter, perl5 porters
On Fri, Nov 28, 2003 at 05:34:44PM -0800, Glenn Linderman wrote:
> On approximately 11/28/2003 1:47 PM, came the following characters from
> the keyboard of Dave Mitchell:
> >At all times Perl follows the following five basic rules:
> >
> >1. Any *use* of a lexical variable is, at *compile* time, matched against
> >the nearest *lexically* (not dynamically) matching 'my' declaration.
> >
> >2. Each time a block with a 'my' declaration is entered, a new instance
> >of that lexical is created, and each time the block is exited, that
> >instance is discarded (of course, if something else holds a reference
> >to it, then the actual thing itself continues to exist, it is just not
> >accessible via the lexical name). The one proviso to this is that the
> >first instance exists from the moment of creation of the sub or file it
> >resides in, rather than from entry into the block during first execution.
>
> By creation in that last sentence, I assume you mean compilation... at
> least in the half referring to the file. Or maybe you really mean
> "creation of the scope for a sub or file" ??? So a little wordsmithing
> might improve the clarity of that sentence, but once again, I think I
> have it.

Rule 3 below tries to distinguish between compilation and creation -
mainly important for anonymous subs, which can be created multiple times.

> >3. At the creation time of a sub that references an outer lexical,
> >that sub captures the current instance of that lexical. If there is no
> >currently valid instance, a warning is issued, and a new undef value is
> >'captured' instead. For named subs, creation time equals compilation time;
> >for anonymous subs, creation time is later, when you execute the 'sub'
> >bit.
>
> "currently valid instance" could use a little more explanation. Does
> this mean "instance that is not undef"?

No, I mean where an instance has been thrown away, and a new instance has
not yet been created, eg

sub f { my $x = 1 }
f(); # during this call, the first instance of $x exists
# at this point, no instance exists
f(); # during this call, the second instance of $x exists

Normally this is just academic, but when you call a function that does
an eval, you can end up trying to access the lexical, but no instance of
it currently exists, hence a) you get the warning about it not being
available, b) Perl gives you a new undef value to play with instead; this
value is not shared with any of the instances (in bleedperl, anyway).

> Thanks again for writing down these rules. I read in a different
> subthread that you were working on the documentation, and if you can get
> the meat of this discussion into the documentation, I think it will go a
> long way towards keeping future wumpuses from being quite as ignorant as
> I. I would be delighted to read/proof-read/comment on such future
> documentation, if you think it would help,

Okay, will do.

> although the real proof of
> the documentation is if a future ignorant wumpus can read it and be
> educated, without causing you the frustration that I have.

Regards,

Dave.

--
Little fly, thy summer's play my thoughtless hand
has terminated with extreme prejudice.
(with apologies to William Blake)

Glenn Linderman

Nov 29, 2003, 1:04:19 PM
to Dave Mitchell, Brad Baxter, perl5 porters
On approximately 11/29/2003 5:27 AM, came the following characters from
the keyboard of Dave Mitchell:

> On Fri, Nov 28, 2003 at 05:34:44PM -0800, Glenn Linderman wrote:
>
>>On approximately 11/28/2003 1:47 PM, came the following characters from
>>the keyboard of Dave Mitchell:
>>
>>>At all times Perl follows the following five basic rules:
>>>
>>>1. Any *use* of a lexical variable is, at *compile* time, matched against
>>>the nearest *lexically* (not dynamically) matching 'my' declaration.
>>>
>>>2. Each time a block with a 'my' declaration is entered, a new instance
>>>of that lexical is created, and each time the block is exited, that
>>>instance is discarded (of course, if something else holds a reference
>>>to it, then the actual thing itself continues to exist, it is just not
>>>accessible via the lexical name). The one proviso to this is that the
>>>first instance exists from the moment of creation of the sub or file it
>>>resides in, rather than from entry into the block during first execution.
>>
>>By creation in that last sentence, I assume you mean compilation... at
>>least in the half referring to the file. Or maybe you really mean
>>"creation of the scope for a sub or file" ??? So a little wordsmithing
>>might improve the clarity of that sentence, but once again, I think I
>>have it.
>
>
> Rule 3 below tries to distinguish between compilation and creation -
> mainly important for anonymous subs, which can be created multiple times.

And yes it does, for named and anonymous subs, but not for files...
Clearly the creation of a file has no effect on capturing instances...
one has to feed it to Perl first... so that is where I'm still a little
fuzzy as to your meaning. I suspect that by "creation time for a file"
you are referring to its compilation time, just like for named subs...
the time that the scope for the entity (file, named sub, anonymous sub)
is created. For the first two, the scope is created at compilation
time, for the latter, the scope is created when you execute the 'sub'
bit. Yes? And so I still think the "creation of the scope for a sub or
file" phrase that I used above still could apply? Or just add words
about the "file" case to rule 3.

>>>3. At the creation time of a sub that references an outer lexical,
>>>that sub captures the current instance of that lexical. If there is no
>>>currently valid instance, a warning is issued, and a new undef value is
>>>'captured' instead. For named subs, creation time equals compilation time;
>>>for anonymous subs, creation time is later, when you execute the 'sub'
>>>bit.
>>
>>"currently valid instance" could use a little more explanation. Does
>>this mean "instance that is not undef"?
>
>
> No, I mean where an instance has been thrown away, and a new instance has
> not yet been created, eg
>
> sub f { my $x = 1 }
> f(); # during this call, the first instance of $x exists
> # at this point, no instance exists
> f(); # during this call, the second instance of $x exists
>
> Normally this is just academic, but when you call a function that does
> an eval, you can end up trying to access the lexical, but no instance of
> it currently exists, hence a) you get the warning about it not being
> available, b) Perl gives you a new undef value to play with instead; this
> value is not shared with any of the instances (in bleedperl, anyway).

The code you mention doesn't illustrate this point completely, as far as
I can tell... but from the words I think we need one more nested scope
to illustrate it completely?

    sub f
    { my $x = 1;
      sub g
      { eval '$x';
      }
    }

    & f(); # during this call, the first instance of $x exists
           # at this point no instance exists
    & g(); # during this call, an undef $x is created, and warning happens
           # at this point no instance exists
    & f(); # during this call, the second instance of $x exists

OK, if I've got that now, the warning is even more useful than I
thought, because it won't flag "currently undef" instances when
captured, as I thought yesterday.
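
For anyone who wants to watch the warning happen, here is a rough runnable version of the f/g case. It assumes a perl with the later closure fixes, and the warning-collecting handler and the $seen variable are additions made only for this sketch:

```perl
use strict;
use warnings;

# Collect warnings so they can be printed after the fact.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

sub f {
    my $x = 1;
    sub g { return eval q[ $x ] }   # named sub, mentions $x only in a string
    return $x;
}

f();                 # an instance of $x exists only during this call
my $seen = g();      # f is not running, so there is no live instance

print defined $seen ? "g saw $seen\n" : "g saw undef\n";
print "warned: $_" for @warnings;
```

The value g gets back is undef because no instance was available to look at, not because some instance happened to hold undef.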

>>Thanks again for writing down these rules. I read in a different
>>subthread that you were working on the documentation, and if you can get
>>the meat of this discussion into the documentation, I think it will go a
>>long way towards keeping future wumpuses from being quite as ignorant as
>>I. I would be delighted to read/proof-read/comment on such future
>>documentation, if you think it would help,
>
>
> Okay, will do.
>
>
>>although the real proof of
>>the documentation is if a future ignorant wumpus can read it and be
>>educated, without causing you the frustration that I have.
>
>
> Regards,
>
> Dave.
>

--

Dave Mitchell

Nov 29, 2003, 2:12:11 PM
to Glenn Linderman, Brad Baxter, perl5 porters
On Sat, Nov 29, 2003 at 10:04:19AM -0800, Glenn Linderman wrote:
> On approximately 11/29/2003 5:27 AM, came the following characters from
> >Rule 3 below tries to distinguish between compilation and creation -
> >mainly important for anonymous subs, which can be created multiple times.
>
> And yes it does, for named and anonymous subs, but not for files...
> Clearly the creation of a file has no effect on capturing instances...
> one has to feed it to Perl first... so that is where I'm still a little
> fuzzy as to your meaning. I suspect that by "creation time for a file"
> you are referring to its compilation time, just like for named subs...
> the time that the scope for the entity (file, named sub, anonymous sub)
> is created. For the first two, the scope is created at compilation
> time, for the latter, the scope is created when you execute the 'sub'
> bit. Yes? And so I still think the "creation of the scope for a sub or
> file" phrase that I used above still could apply? Or just add words
> about the "file" case to rule 3.

There are a whole bunch of things that act as subs: ie a separate chunk
of code that can be compiled and invoked. These include named subs, anon
subs, files (excluding any embedded subs), evals, and formats. Of these,
all except anon subs and formats have their creation at the same time as
compilation. Anon subs are created when you execute the actual 'sub'
expression, and formats are created when you execute the 'write'.
(This 'creation' business is purely a convenient label to indicate when
outer lexicals are captured.)
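
A small sketch of the anon-sub case, with made-up names (@subs, @got). The 'sub' expression is compiled once but executed three times, so three subs are created, and each captures the instance of $x that is current at that moment:

```perl
use strict;
use warnings;

my @subs;
for my $i (1 .. 3) {
    my $x = $i * 10;               # new instance on every entry to the block
    push @subs, sub { return $x }; # creation happens here, at run time
}

my @got = map { $_->() } @subs;
print "@got\n";                    # prints: 10 20 30
```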

Yep, that's correct.
When g() is called, no instance exists.

--
"Emacs isn't a bad OS once you get used to it.
It just lacks a decent editor."

Glenn Linderman

Nov 29, 2003, 2:18:36 PM
to Dave Mitchell, Brad Baxter, perl5 porters
On approximately 11/29/2003 11:12 AM, came the following characters from
the keyboard of Dave Mitchell:

Great! I think we've reached closure closure! (I couldn't resist!)

Now you can do your theater stuff in real peace, knowing that you've
educated the ignorant wumpus! Thanks again, I've learned a lot, about
something I thought I understood, but didn't.
