More A5/E5 questions

Luke Palmer

Sep 7, 2002, 1:27:59 AM

to du...@pobox.com, perl6-l...@perl.org

Answering to the best of my knowledge.

On Sat, 7 Sep 2002, Jonathan Scott Duff wrote:

> Question #2:
>
> Why are we storing the hypothetical's sigil in the match object?

I think it's to differentiate the different namespaces (scalar, array,
hash) within the match object's hash. Personally, I don't like it, and
think that people should just not do:

/ $var := <foo> , @var := <bar>* /

Because it's dumb.

> Question #3:
>
> Related to question #2, if I didn't use hypotheticals, how would I
> access the Nth match of a repetition? For instance, in E5, there's an
> example that looks like this:
>
> rule file { ^ @adonises := <hunk>* $ }
>
> If I didn't have the hypothetical @adonises, how would I retrieve the
> 3rd hunk matched? Would I need to write it like so:
>
> rule file { ^ <hunks> $ }
> rule hunks :e { (<hunk>) }

No. I think you can do this:

/ <hunk>* / # maybe / (<hunk>)* /
$0{hunk}[2]

Luke
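
As a rough analogy for "the Nth match of a repetition", here is a minimal Python sketch. It is not Perl 6 and does not use the rule engine discussed in this thread; the HUNK pattern and the match_file helper are hypothetical stand-ins. Python's re module keeps only the last repetition of a repeated group, so the sketch collects every repetition into a list by hand, which is roughly the shape an array of hunk match objects would have.

```python
import re

# Hypothetical stand-in for the <hunk> rule: one line starting with '+' or '-'.
HUNK = re.compile(r"[+-][^\n]*\n")

def match_file(text):
    """Return a dict shaped like a match object, {'hunk': [m0, m1, ...]}, or None."""
    hunks = []
    pos = 0
    while pos < len(text):
        m = HUNK.match(text, pos)
        if m is None:
            return None  # the whole input must consist of hunks, as in ^ <hunk>* $
        hunks.append(m)
        pos = m.end()
    return {"hunk": hunks}

result = match_file("+one\n-two\n+three\n+four\n")
third = result["hunk"][2].group()  # index 2 is the third hunk
```

With zero-based indexing, the third hunk lives at index 2 of the list stored under the "hunk" key.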

Jonathan Scott Duff

Sep 7, 2002, 1:09:50 AM

to perl6-l...@perl.org

Question #1:

If \n matches any one of the platform-specific newline character
sequences, does that mean that if I have a string like this[*]:

"foo bar baz\rfoo bar baz\nfoo bar bar\r\n"

that \n will match in 3 places? How do you tell perl that you only
want \n to match a specific newline sequence? And if \n does match in
3 places in that string, does that mean that ^^ and $$ will also match
in 3 places?

[*] In my string \r and \n are substituting for carriage return
and line feed respectively.

Question #2:

Why are we storing the hypothetical's sigil in the match object?

Question #3:

Related to question #2, if I didn't use hypotheticals, how would I
access the Nth match of a repetition? For instance, in E5, there's an
example that looks like this:

rule file { ^ @adonises := <hunk>* $ }

If I didn't have the hypothetical @adonises, how would I retrieve the
3rd hunk matched? Would I need to write it like so:

rule file { ^ <hunks> $ }
rule hunks :e { (<hunk>) }

and then access it with $0{file}{hunks}[2] ?

-Scott
--
Jonathan Scott Duff
du...@cbi.tamucc.edu
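
Question #1 can be made concrete with a small Python sketch. This only illustrates the question and does not say what Perl 6 will do: it assumes "any newline sequence" means CR LF, a lone CR, or a lone LF, and the pattern names are made up for the example.

```python
import re

# Assumption: "any newline sequence" means CR LF, lone CR, or lone LF.
# CR LF is listed first so it is consumed as one newline, not two.
ANY_NEWLINE = re.compile(r"\r\n|\r|\n")

text = "foo bar baz\rfoo bar baz\nfoo bar bar\r\n"

# Every place where the generic newline pattern matches: (offset, matched text).
matches = [(m.start(), m.group()) for m in ANY_NEWLINE.finditer(text)]

# Restricting the match to one specific sequence: LF only, not preceded by CR.
LF_ONLY = re.compile(r"(?<!\r)\n")
lf_matches = [m.start() for m in LF_ONLY.finditer(text)]
```

Under that assumption the generic pattern matches in three places (the CR, the LF, and the CR LF pair), while the restricted pattern matches only the lone LF.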

Nicholas Clark

Sep 7, 2002, 5:36:06 AM

to Luke Palmer, du...@pobox.com, perl6-l...@perl.org

On Fri, Sep 06, 2002 at 11:27:59PM -0600, Luke Palmer wrote:
>
> Answering to the best of my knowledge.
>
> On Sat, 7 Sep 2002, Jonathan Scott Duff wrote:
>
> > Question #2:
> >
> > Why are we storing the hypothetical's sigil in the match object?
>
> I think it's to differentiate the different namespaces (scalar, array,
> hash) within the match object's hash. Personally, I don't like it, and
> think that people should just not do:
>
> / $var := <foo> , @var := <bar>* /
>
> Because it's dumb.

Related, I think: no-one answered my question about what happens when I
define

sub dumb ($var, @var) {
...
}

and then call it with the pair var=>$thing

It's ambiguous, because (if I understand perl6 correctly) arrays will
auto-convert to array refs if required, so there's no simple way to decide
which parameter that ought to bind to.

Presumably for clarity it is better to store the sigil in the match object,
because if I've understood perl6 symbol tables correctly they are going to
be storing names-with-sigils. (To allow the elimination of typeglobs)

Nicholas Clark
--
Even better than the real thing: http://nms-cgi.sourceforge.net/

David Helgason

Sep 9, 2002, 6:32:22 AM

to Jonathan Scott Duff, perl6-l...@perl.org

Jonathan Scott Duff wrote:
> Question #3:
>
> Related to question #2, if I didn't use hypotheticals, how would I
> access the Nth match of a repetition? For instance, in E5, there's an
> example that looks like this:
>
> rule file { ^ @adonises := <hunk>* $ }
>
> If I didn't have the hypothetical @adonises, how would I retrieve the
> 3rd hunk matched? Would I need to write it like so:
>
> rule file { ^ <hunks> $ }
> rule hunks :e { (<hunk>) }
>
> and then access it with $0{file}{hunks}[2] ?

For a while worries about this have been brewing in my mind.

[worry #1]
The hypothetical 'variables' we bind to aren't really variables but keys to a hash. Thus they shouldn't have sigils in their names at all.

Ok, that may give us problems with giving rules context, but maybe we could simplify that, such that rules always got scalar context to work in (that's pretty close to the truth anyhow since their results are being stored in a hash - for some definition of 'truth').

Then maybe we could differentiate between building a match object and capturing data into variables that are defined in a higher scope.

So

/^ <hunks> $ /
(or alternately (the binding just changes the name
of the key in the match object))
/^ myhunk := <hunks> $ /

and

/^ $hunk := <hunks> $ /

would do different things, since the first only builds a match object, and the second only binds $hunk.

[worry #2]
Since $0 has only a rather vague relation to $1..$n, maybe its name isn't that relevant. Especially since we'll be indexing into it all the time. Maybe $MATCH, $RESULT, $RX .... (those names aren't convincing me either, sorry).

If nothing else, this would at least get rid of "one more cryptically named variable".

David
--
www.panmedia.dk - Excellent Perl consulting for Denmark, Scandinavia, the World

Damian Conway

Sep 8, 2002, 3:50:45 PM

to perl6-l...@perl.org

Nicholas Clark wrote:

> Related, I think: no-one answered my question about what happens when I
> define
>
> sub dumb ($var, @var) {
> ...
> }
>
> and then call it with the pair var=>$thing

Exception, probably. Perhaps the error would be something like:

"Dumb ambiguous binding of dumb named parameter ("var") at demo.pl line 1. Dummy."

;-)

> Presumably for clarity it is better to store the sigil in the match object,
> because if I've understood perl6 symbol tables correctly they are going to
> be storing names-with-sigils. (To allow the elimination of typeglobs)

The part about sigils being part of symbol table keys is indeed correct.

Damian
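
The ambiguity can be modelled with a small Python sketch, assuming parameters are stored under sigil-qualified names (as symbol table keys are said to be) while the pair supplies a bare name. The AmbiguousBinding class and bind_named function are hypothetical names for this illustration, not anything from Perl 6.

```python
class AmbiguousBinding(Exception):
    """Raised when a bare name could bind to more than one parameter."""

def bind_named(params, name, value):
    """Bind a bare-named argument to the one parameter it matches.

    params: parameter names including their sigils, e.g. ['$var', '@var'].
    name:   the pair's key without a sigil, e.g. 'var'.
    """
    # Compare the bare name against each parameter name with its sigil removed.
    candidates = [p for p in params if p[1:] == name]
    if not candidates:
        raise KeyError(name)
    if len(candidates) > 1:
        raise AmbiguousBinding(f"named parameter {name!r} matches {candidates}")
    return {candidates[0]: value}

# Unambiguous: only '@var' has the bare name 'var'.
ok = bind_named(["$x", "@var"], "var", [1, 2])

# Ambiguous: both '$var' and '@var' share the bare name 'var'.
try:
    bind_named(["$var", "@var"], "var", 1)
    outcome = "bound"
except AmbiguousBinding:
    outcome = "exception"
```

In this model the keys '$var' and '@var' never collide in the table itself; the clash appears only when a caller drops the sigil.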

Damian Conway

Sep 9, 2002, 11:08:33 AM

to perl6-l...@perl.org

David Helgason wrote:

> [worry #1]
> The hypothetical 'variables' we bind to aren't really variables but keys to a hash.

Welcome to Perl 6. Where *no* variable is really a variable, but all are keys to
a hash (which is known as the symbol table) ;-)

> Thus they shouldn't have sigils in their names at all.

But they do in the Perl 6 symbol table.

> Then maybe we could differentiate between building a match object and
> capturing data into variables that are defined in a higher scope.
>
> So
>
> /^ <hunks> $ /
> (or alternately (the binding just changes the name
> of the key in the match object))
> /^ myhunk := <hunks> $ /
>
> and
>
> /^ $hunk := <hunks> $ /

This *is* an interesting point. Allison and I have discussed this point
at some length and have come up with a rather neat solution that we'll be
discussing with Larry ASAP. I'll report back as soon as I can.

> [worry #2]
> Since $0 has only a rather vague relation to $1..$n, maybe its name isn't that relevant.
> Especially since we'll be indexing into it all the time. Maybe $MATCH, $RESULT, $RX ....
> (those names aren't convincing me either, sorry).

I still think $0 is the right name for it.

> If nothing else, this would at least get rid of "one more cryptically named variable".

But only at the expense of adding one more arbitrarily named variable. :-(

Damian

Nicholas Clark

Sep 9, 2002, 11:19:40 AM

to Damian Conway, perl6-l...@perl.org

On Sun, Sep 08, 2002 at 09:50:45PM +0200, Damian Conway wrote:
> Nicholas Clark wrote:
>
> > Related, I think: no-one answered my question about what happens when I
> > define
> >
> > sub dumb ($var, @var) {
> > ...
> > }
> >
> > and then call it with the pair var=>$thing
>
> Exception, probably. Perhaps the error would be something like:
>
> "Dumb ambiguous binding of dumb named parameter ("var") at demo.pl line 1. Dummy."
>
> ;-)

What happens if I call a function (maybe not my dumb function above)
with the pair ('$var' => value), i.e. with the sigil already in the name of
the pair? Presumably it "just works" for a normal function.

So in this case, can I disambiguate things for my dumb function?

And if I call a function

crazy ('$param' => $value, 'param' => $other_value);

presumably that also throws some sort of exception about ambiguity?

Nicholas Clark

David Helgason

Sep 9, 2002, 11:32:09 AM

to Damian Conway, perl6-l...@perl.org

Damian Conway wrote:

>> [worry #1]
>> The hypothetical 'variables' we bind to aren't really variables
>> but keys to a hash.

>Welcome to Perl 6. Where *no* variable is really a variable, but
> all are keys to a hash (which is known as the symbol table) ;-)

Ok, you're obviously right. But $0{'$foobar'} still hurts my eyes,
not to mention how mysterious it may look to newbies trying to
cope with $h{$foo} and $h{foo} as well - unless we are really good
at educating them (but we will be!).

>> [worry #2]
>> Since $0 has only a rather vague relation to $1..$n, maybe its
>> name isn't that relevant. Especially since we'll be indexing
>> into it all the time. Maybe $MATCH, $RESULT, $RX .... (those
>> names aren't convincing me either, sorry).

>I still think $0 is the right name for it.

>> If nothing else, this would at least get rid of "one more
>> cryptically named variable".

>But only at the expense of adding one more arbitrarily named variable. :-(

Come to think of it, why have a named variable at all? If the
match object gets returned anyhow there is no need for a cleverly
named magical variable ($0, $MATCH, ...).

David
--
www.panmedia.dk - Ingenious perl consulting in Denmark, Scandinavia & the World

Damian Conway

Sep 9, 2002, 11:51:56 AM

to perl6-l...@perl.org

David Helgason wrote:

> Come to think of it, why have a named variable at all? If the
> match object gets returned anyhow there is no need for a cleverly
> named magical variable ($0, $MATCH, ...).

Probably for the same reason that we have $1, $2, $_, etc.
Because people are lazy. :-)

Damian

Jonathan Scott Duff

Sep 9, 2002, 12:51:13 PM

to Damian Conway, perl6-l...@perl.org

On Mon, Sep 09, 2002 at 03:52:30PM +0200, Damian Conway wrote:
> Hi Scott,
>
> You asked (off-list):

Oops, that should've been on-list so that everyone can benefit from my
ignorance :-)

> > Then how do I tell ^^ and $$ to only match just after and just before
> > my platform specific newline sequence? ^^ and $$ seem less useful if
> > I can't do that. (Maybe it's just an erroneous assumption on my part
> > but \n, ^^, and $$ all seem intimately related)
>
> Only in that they use the same set of values when looking for a line
> terminator. You could consider ^^ and $$ to be abbreviations of:
>
> <after <nl>|^>
> <before <nl>|$>
>
> where <nl> matches any newline sequence.
>
> So if you wanted a platform-dependent version, you'd write:
>
> rule sol { <after \c[CR]|^> }
> rule eol { <before \c[CR]|$> }
>
> and use those.

Okay, that makes sense. Or, presumably, I could lexically redefine ^^
and $$ just like I can any other operator.
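
The idea of building start-of-line and end-of-line anchors from a lookbehind and a lookahead can be tried out in Python. This is only an analogy using Python's re module, not Perl 6 rules, and the line_anchors helper is a made-up name for the sketch.

```python
import re

def line_anchors(newline):
    """Build (start_of_line, end_of_line) regex fragments for one newline sequence."""
    nl = re.escape(newline)
    sol = rf"(?:(?<={nl})|\A)"  # just after the newline, or at start of string
    eol = rf"(?={nl}|\Z)"       # just before the newline, or at end of string
    return sol, eol

# Treat only a carriage return as the line terminator.
sol, eol = line_anchors("\r")
text = "one\rtwo\nstill two\rthree"

# DOTALL lets '.' cross the LF, which is ordinary data under this definition.
lines = re.findall(rf"{sol}(.*?){eol}", text, flags=re.DOTALL)
```

Because only CR counts as a terminator here, the LF in the middle stays inside the second line instead of splitting it.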

> > Then, I'll ask again, why are we storing the sigil in the match object
> > for explicit hypotheticals? The difference between
> >
> > $0{file}{hunk}[2] # and
> > $0{file}{'@adonises'}[2]
> >
> > seems unnecessary. Or, how is
> >
> > rule file { ^ @adonises := <hunk>* $ }
> >
> > different from
> >
> > rule adonises { <hunk> }
> > rule file { ^ <adonises>* $ }
> >
> > aside from the extra indirection?
>
> Yes, they're different. Explicitly binding to @adonises binds the (dereferenced)
> reference to the array of C<hunk> match objects. The implicit binding
> of C<< <adonises>* >> binds the (un-dereferenced) reference to the array
> of match objects.
>
> That is:
>
> rule file { ^ <adonises>* $ }
>
> is like:
>
> rule file { ^ $adonises:=<adonises>* $ }

Aha! Thanks. Can you pop back in time real quick and add these
comments to E5? :)

But it still doesn't make sense to me that we are storing the sigils.
I mean, I thought the whole point of things like:

$aref = @array;
$aref[2] = 5;

was so that we wouldn't have to know or care about the type of thing
we're dealing with and that Perl would just Do The Right Thing. It
seems that by storing the sigils in the match object, we're back to
partitioning rather than unifying.
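
The trade-off can be sketched in Python by modelling the match object as a plain dict whose keys either keep the sigil or do not. This is a toy model under that assumption, and the lookup helper is hypothetical; it shows what a sigil-blind lookup buys and where it breaks.

```python
def lookup(match, name):
    """Find a capture by bare name, ignoring any sigil on the stored key."""
    hits = [k for k in match if k.lstrip("$@%") == name]
    if len(hits) != 1:
        raise KeyError(f"{name!r} matches {hits}")
    return match[hits[0]]

hunks = ["hunk-a", "hunk-b", "hunk-c"]

# Explicit binding: the key keeps its sigil, as in $0{file}{'@adonises'}.
explicit = {"@adonises": hunks}

# Implicit binding: the key is the bare rule name, as in $0{file}{hunk}.
implicit = {"hunk": hunks}

# A sigil-blind lookup treats both spellings the same way.
third_explicit = lookup(explicit, "adonises")[2]
third_implicit = lookup(implicit, "hunk")[2]

# But it cannot tell '$var' from '@var' when both are captured in one rule.
clash = {"$var": "foo", "@var": ["bar", "bar"]}
try:
    lookup(clash, "var")
    clash_result = "found"
except KeyError:
    clash_result = "ambiguous"
```

Dropping the sigil unifies access in the common case, at the cost of an ambiguity whenever one rule binds both a scalar and an array of the same name.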