
Throwing lexicals


Aaron Sherman

Sep 9, 2002, 9:48:47 AM
to Perl6 Language List
I was thinking about regular expressions and hypotheticals again this
weekend, and something was bothering me quite a lot. How do rules create
hypotheticals?

Since a rule behaves like a closure, I can see how it could gain access
to existing lexicals, if it's declared inside of the same scope:

my $x = 1;
/ $x /;

Good so far, now we change said lexical:

my $x = 1;
/ $x := (2) /;

Ok, got that. Now what about lexicals that aren't declared:

/ $x := (2) /;

This bothers me. Yes, I can see how you could do it via %MY:: or however
that's spelled (I'm typing this in a Vermont B&B, so I can't quite go
check A2), but opening up the stack of my caller and mucking around with
their contents seems rather rude. As Pink Floyd once said, "get your
hands off of my stack!"

So, here's an immodest proposal for kinder and gentler stack mucking.
Essentially, these lexicals are out-of-band return values, and Perl 6
already has a mechanism for out-of-band return information: properties.
When you look at it like this, what you want to do is "throw" these
lexicals up the stack to your caller and let them do the right thing
with them. Here's a non-regexp version of what I'm describing:

sub a {
    my $x = 1;
    b();
    print "x=$x y=$y\n";
}
sub b {
    my $x = 2;
    my $y = 3;
    return undef but lexicals(x=>$x, y=>$y);
}

This has an unfortunate consequence: The existing lexical C<$x> gets
stomped with the new value of 2. While this might be what you wanted in
some cases, it's probably not a very good idea to allow it in general.
So, let's say that that would generate a warning (error?), and in order
to allow it, you would have to associate a property with your existing
lexical:

my $x is volatile = 1;

I stole this property name from C, where it means that the variable's
value might be stomped by some external side-effect, which is exactly
what it means here. Every subroutine invocation would have to check
return values for a lexicals property and instantiate any variables as
needed. Variables created this way would be considered volatile so that:

b(); b();

Would not generate warnings or errors about stomping the first call's
variables with the second's.
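
A minimal sketch of the check described above, written in Python because the Perl 6 syntax in this thread is speculative. It is a model only, not an implementation of anything in A5: the names Thrown, Scope, StompError and call are hypothetical, and the caller's lexical pad is stood in for by a plain dict. A call instantiates whatever lexicals the callee throws, refuses to stomp an existing variable unless it was declared volatile, and marks thrown variables as volatile so a repeated call is accepted.

```python
class Thrown:
    """A return value carrying out-of-band 'lexicals' (hypothetical model)."""
    def __init__(self, value, **lexicals):
        self.value = value
        self.lexicals = lexicals


class StompError(Exception):
    """Raised when a thrown lexical would overwrite a non-volatile variable."""


class Scope:
    """Models one caller's lexical pad: a dict plus a set of volatile names."""
    def __init__(self):
        self.vars = {}
        self.volatile = set()

    def declare(self, name, value, volatile=False):
        self.vars[name] = value
        if volatile:
            self.volatile.add(name)

    def call(self, func, *args):
        """Call func, then instantiate any lexicals it threw back."""
        result = func(*args)
        if not isinstance(result, Thrown):
            return result
        # Check everything first so a refused call leaves the scope untouched.
        for name in result.lexicals:
            if name in self.vars and name not in self.volatile:
                raise StompError(f"thrown lexical would stomp ${name}")
        for name, value in result.lexicals.items():
            self.vars[name] = value
            self.volatile.add(name)  # thrown variables are themselves volatile
        return result.value


def b():
    # Stands in for: return undef but lexicals(x=>$x, y=>$y);
    return Thrown(None, x=2, y=3)


a = Scope()
a.declare("x", 1, volatile=True)  # my $x is volatile = 1;
a.call(b)
a.call(b)                         # b(); b(); is accepted
print(a.vars)                     # {'x': 2, 'y': 3}
```

Without volatile=True on the declaration of x, the first call raises StompError and the scope is left unchanged.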

Going back to patterns, this gives us an added bonus. It not only
explains the behavior of hypotheticals, but also of subexpression
placeholders, which are created when the pattern returns:

$self but lexicals(0=>$self, 1=> $self.{1}, 2=> $self.{2}, etc...)

That yields the side effect that you can say:

sub match_digits($string //= $_) {
    return / (\d+) /;
}
if match_digits("The time is 12:03") {
    print $1;
}

I think this is a very clean and simple way to get everything that
patterns were supposed to do plus a lot of added benefit for things
like:

sub getpwuid(int $uid //= $_) {
    %pwd = external("getpwuid",$uid);
    return %pwd but lexicals(%pwd);
}
getpwuid($<);
print "I am $user from $dir, and I have a secret ($passwd)\n";

You should be able to "protect" yourself from these side effects. There
are two ways to do that:

{getpwuid($<)}

or

getpwuid($<) but lexicals();

I would expect either one of those to work, though the second is a bit
of magic in terms of order of events.


--
Aaron Sherman <a...@ajs.com>
http://www.ajs.com/~ajs

Luke Palmer

Sep 9, 2002, 3:12:24 PM
to Aaron Sherman, Perl6 Language List
> Going back to patterns, this gives us an added bonus. It not only
> explains the behavior of hypotheticals, but also of subexpression
> placeholders, which are created when the pattern returns:
>
> $self but lexicals(0=>$self, 1=> $self.{1}, 2=> $self.{2}, etc...)
>
> That yields the side effect that you can say:
>
> sub match_digits($string //= $_) {
> return / (\d+) /;
> }
> if match_digits("The time is 12:03") {
> print $1;
> }
>
> I think this is a very clean and simple way to get everything that
> patterns were supposed to do plus a lot of added benefit for things
> like:
>
> sub getpwuid(int $uid //= $_) {
> %pwd = external("getpwuid",$uid);
> return %pwd but lexicals(%pwd);
> }
> getpwuid($<);
> print "I am $user from $dir, and I have a secret ($passwd)\n";
>
> You should be able to "protect" yourself from these side effects. There
> are two ways to do that:
>
> {getpwuid($<)}
>
> or
>
> getpwuid($<) but lexicals();
>
> I would expect either one of those to work, though the second is a bit
> of magic in terms of order of events.

This does bring up an interesting point. I think your solution is an
interesting idea, but not really necessary. But consider this:

my $date;
# lots of code
sub foo {
    #lots more code
    sub bar {
        #lots more code
        m/ $date := <date> /;
    }
}

This is terrible. Calling foo which calls bar mysteriously overwrites
$date? "Why is $date changing?" the programmer asks. He does an
exhaustive search through his code and finally says "ohhhhhh," and has to
change all references to the inner $date to something like $mydate.

This is obviously a problem (unless I misunderstood how hypotheticals
change their surrounding scope). For a solution, let's just look how we
do it in subroutines.

my $date;
sub foo {
    my $date = 'Jul 17, 1984';
    # ...
}

Oh. Duh. Why don't we have such a mechanism for matches?

m/ my $date := <date> /

is ambiguous to the eyes. But I think it's necessary to have a lexical
scoping mechanism for matches, as to avoid the problem I addressed above.
Any ideas?

Luke

Me

Sep 9, 2002, 3:14:25 PM
to Luke Palmer, Aaron Sherman, Perl6 Language List
I may be missing your point, but based on my somewhat
fuzzy understanding:

> Oh. Duh. Why don't we have such a mechanism for matches?
>
> m/ my $date := <date> /
>
> is ambiguous to the eyes. But I think it's necessary to have a
> lexical scoping mechanism for matches

The above would at least have to be:

m/ { my $date := <date> } /

(otherwise the 'my ' and ':=' would be matched literally.)

And you can of course do that.

But you won't be able to access $date outside the closure.

Hence the introduction of let:

m/ { let $date := <date> } /

which makes (a symbol table like entry for) $date available
somewhere via the match object.

And has the additional effect that $date (I think the whole
variable/entry, but at the very least its value) disappears
if the match backtracks over the closure.

Right?
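
A rough Python model of that backtracking behaviour, assuming a toy matcher rather than the real rule engine. The function match_alternatives and its (name, literal) step format are hypothetical. Each alternative records the names it binds on a trail, and when the alternative fails those bindings are removed again, which is the "disappears if the match backtracks" part.

```python
def match_alternatives(text, alternatives):
    """Try each alternative in order; each is a list of (name, literal) steps.

    A step with a name binds that name hypothetically when its literal
    matches. If a later step fails, the bindings made by that alternative
    are undone before the next alternative is tried.
    """
    bindings = {}
    for steps in alternatives:
        trail = []  # names bound while trying this alternative
        pos = 0
        matched = True
        for name, literal in steps:
            if not text.startswith(literal, pos):
                matched = False
                break
            pos += len(literal)
            if name is not None:
                bindings[name] = literal
                trail.append(name)
        if matched:
            return bindings
        for name in trail:  # backtrack: hypothetical bindings disappear
            bindings.pop(name, None)
    return None


result = match_alternatives(
    "2002-09-10",
    [
        [("date", "2002"), (None, "/")],  # binds date, then fails on "/"
        [("year", "2002"), (None, "-")],  # succeeds
    ],
)
print(result)  # {'year': '2002'}
```

The failed first alternative leaves no trace of its date binding in the result.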

--
ralph

Andrew Wilson

Sep 9, 2002, 3:52:24 PM
to Me
On Mon, Sep 09, 2002 at 02:14:25PM -0500, Me wrote:
> Hence the introduction of let:
>
> m/ { let $date := <date> } /
>
> which makes (a symbol table like entry for) $date available
> somewhere via the match object.

Somewhere? It appears in the namespace of the caller.
Apparently there is no way to use someone else's grammar and prevent it
trashing your namespace.

> And has the additional effect that $date (I think the whole
> variable/entry, but at the very least its value) disappears
> if the match backtracks over the closure.
>
> Right?

As I understand it that is the intent.

Perhaps what we need is something like @EXPORT and @EXPORT_OK for
hypotheticals in grammars. I'm not suggesting those as names, but
something along those lines. It would give the caller some degree of
protection if they had to specifically ask you to overwrite their
variables with hypothetically bound results.

I think if we don't do something like that it's going to make other
people's grammars very hard to use. You will have to read them and know
which variables they will mess with before you start.

andrew
--
Scorpio: (Oct. 24 - Nov. 21)
It's been almost three decades, but you think you're finally beginning to
recover from the long, national nightmare of Vietnam movies.

Luke Palmer

Sep 9, 2002, 4:13:55 PM
to Andrew Wilson, Me, perl6-l...@perl.org
On Mon, 9 Sep 2002, Andrew Wilson wrote:

> On Mon, Sep 09, 2002 at 02:14:25PM -0500, Me wrote:
> > Hence the introduction of let:
> >
> > m/ { let $date := <date> } /
> >
> > which makes (a symbol table like entry for) $date available
> > somewhere via the match object.
>
> Somewhere? It appears in the namespace of the caller.
> Apparently there is no way to use someone else's grammar and prevent it
> trashing your namespace.

Err.. I don't think so.

# Date.pm
grammar Date;
my $date;
rule date_rule { $date := <something> }

# uses_date.p6 (hmm.. I wonder what a nice extension would be...)
use Date;
my $date;
m/ <Date::date_rule> /;

This would mess with $Date::date, not $main::date. If there was no
$Date::date, it wouldn't mess with anything, and it would store
the return value of <something> in $0{date}.
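
That resolution order can be modelled in a few lines of Python. This is a sketch under assumptions, not how rules are specified: bind_capture is a hypothetical helper and the scopes are plain dicts. A binding goes to the scope where the rule was defined if the name exists there, otherwise onto the match object, and the caller's scope is never consulted.

```python
def bind_capture(name, value, grammar_scope, match_object):
    """Bind in the rule's defining scope if the name exists there,
    otherwise store the value on the match object."""
    if name in grammar_scope:
        grammar_scope[name] = value
        return "grammar"
    match_object[name] = value
    return "match"


date_grammar = {"date": None}         # stands in for `my $date` in Date.pm
caller_scope = {"date": "untouched"}  # stands in for the caller's `my $date`
match = {}                            # stands in for $0

bind_capture("date", "Sep 9, 2002", date_grammar, match)  # hits $Date::date
bind_capture("year", "2002", date_grammar, match)         # lands in $0{year}

print(date_grammar)   # {'date': 'Sep 9, 2002'}
print(match)          # {'year': '2002'}
print(caller_scope)   # {'date': 'untouched'}
```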

I'm talking about just in the same namespace, how do we keep rules from
messing with file-scoped (or any-scoped, for that matter) lexicals or
globals. How do we get rule- or closure-scoped lexicals that are put into
$0?

Which of these are legal, and would provide a solution?

/ <something> { let my $date = $something } /
/ <something> { $0{date} = $something } /

If either, I guess I have no complaints. I'll be angry if the latter
isn't legal. Still, they seem a little bit hack-like....

Luke

Andrew Wilson

Sep 9, 2002, 4:38:17 PM
to Luke Palmer, Me, perl6-l...@perl.org
On Mon, Sep 09, 2002 at 02:13:55PM -0600, Luke Palmer wrote:
> Err.. I don't think so.
>
> # Date.pm
> grammar Date;
> my $date;
> rule date_rule { $date := <something> }
>
> # uses_date.p6 (hmm.. I wonder what a nice extension would be...)
> use Date;
> my $date;
> m/ <Date::date_rule> /;
>
> This would mess with $Date::date, not $main::date. If there was no
> $Date::date, it wouldn't mess with anything, and it would store
> the return value of <something> in $0{date}.

Ok, That makes a great deal more sense. I was confused by the
discussion of dynamic scope.

andrew
--
Cancer: (June 22 - July 22)
You will soon find yourself entangled in a messy accident with a knife thrower,
although drunk driving, not knife throwing, is actually the real issue.

Aaron Sherman

Sep 9, 2002, 4:55:12 PM
to Luke Palmer, Perl6 Language List
On Mon, 2002-09-09 at 15:12, Luke Palmer wrote:
> > Going back to patterns, this gives us an added bonus. It not only
> > explains the behavior of hypotheticals, but also of subexpression
> > placeholders, which are created when the pattern returns:
[...]

> > I think this is a very clean and simple way to get everything that
> > patterns were supposed to do plus a lot of added benefit for things
> > like:

> This does bring up an interesting point. I think your solution is an
> interesting idea, but not really necessary. But consider this:

Before we consider your concern (which I will address below), why is
this "not really necessary"? As I see it, here are the needs addressed:

1. Creating lexically scoped variables in caller's namespace
2. Protecting existing lexicals from unexpected side-effects.

One of those is mandated by A5, and one of those is, IMHO, requisite for
maintainable programming.

We also achieve the following:

1. Limiting TCL-upvar-like manipulation of caller's stack
2. Allowing "pass-through" subroutines

This last bit is kind of crucial, IMHO. It makes a lot of sense to me
that this would work:

sub match_digits($str //= $_) { /(\d+)/ }
if match_digits { print $1 }

And if C<$0> contains a property that causes the lexicals to be created
upon return, then it would (because match_digits just returns C<$0> to
the caller).

> my $date;
> # lots of code
> sub foo {
> #lots more code
> sub bar {
> #lots more code
> m/ $date := <date> /;
> }
> }

If you used my suggestion, this would produce a warning or error
depending on strictness. That would have to be "my volatile $date" to
allow a thrown lexical to stomp it, in which case

> Oh. Duh. Why don't we have such a mechanism for matches?

My question exactly.

There is more that you can do once you can throw lexicals. For example,
you could provide a property for subroutines and rules which asserts the
lexicals which it can throw:

rule date is declaring($date) { # or is that declaring('date')?
    $date := (<parse_date>)
}
# or
sub stat($filename//=$_) is declaring($mtime, $ctime, ...) {
    # ...
    return %statstruct but lexicals(%statstruct);
}

Now, the compiler can generate stomping warnings at compile-time instead
of just at run time.
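
To illustrate what such a compile-time check might amount to, here is a small Python sketch. The function stomp_warnings and its arguments are hypothetical. Given the names a callee declares it can throw and the lexicals visible at the call site, it reports each name that would be stomped and is not marked volatile, all without running the callee.

```python
def stomp_warnings(declared, caller_lexicals, volatile=frozenset()):
    """Return one warning per declared name that would overwrite a
    non-volatile lexical already present in the caller."""
    return [
        f"call may stomp ${name}"
        for name in declared
        if name in caller_lexicals and name not in volatile
    ]


# Stands in for: sub stat(...) is declaring($mtime, $ctime, ...)
declared_by_stat = ["mtime", "ctime"]

print(stomp_warnings(declared_by_stat, {"mtime", "path"}))
# ['call may stomp $mtime']
print(stomp_warnings(declared_by_stat, {"mtime", "path"}, volatile={"mtime"}))
# []
```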

Me

Sep 10, 2002, 3:52:20 AM
to Luke Palmer, Andrew Wilson, perl6-l...@perl.org
> I'm talking about just in the same namespace, how
> do we keep rules from messing with file-scoped
> (or any-scoped, for that matter) lexicals or globals.
> How do we get rule- or closure-scoped lexicals
> that are put into $0?

How about something like the following rework of
the capture/hypotheticals thing:

You're allowed to declare variables beginning
with a 0 inside rules, eg $0foo or @0bar.

Rules by default capture to $0rulename.

Variables that begin with a digit are rule variables,
all rule variables are always hypothetical, no other
variables are hypothetical.

Drop the 'let' keyword.

--
ralph

Damian Conway

Sep 10, 2002, 6:49:51 AM
to perl6-l...@perl.org
Luke Palmer fretted:

> This is terrible. Calling foo which calls bar mysteriously overwrites
> $date? "Why is $date changing?" the programmer asks. He does an
> exhaustive search through his code and finally says "ohhhhhh," and has to
> change all references to the inner $date to something like $mydate.
>
> This is obviously a problem (unless I misunderstood how hypotheticals
> change their surrounding scope). For a solution, let's just look how we
> do it in subroutines.
>
> my $date;
> sub foo {
> my $date = 'Jul 17, 1984';
> # ...
> }
>
> Oh. Duh. Why don't we have such a mechanism for matches?
>
> m/ my $date := <date> /
>
> is ambiguous to the eyes. But I think it's necessary to have a lexical
> scoping mechanism for matches, as to avoid the problem I addressed above.
> Any ideas?

Yep. Allison and I have two rather nice solutions to this problem, and as soon
as we run them by Larry, and get his agreement on one or both of them,
I'll describe them here in detail.

Rest assured that we see the problem, and we're working on a solution -- or
perhaps two -- that will satisfy everyone's concerns in this area (or, at
least, make everybody equally unhappy ;-)

Damian

