context matters

Jerry Gay

Nov 14, 2005, 9:59:51 PM

to p6c

while adding some shiny new pge tests for return context, i came
across this PIRism:

using keyed string access to the match object
##...
rulesub = p6rule('$<A>:=(.)')
match = rulesub('abc')
.local string res
res = match['A']
print res
## prints: a

using keyed integer access to the match object
##...
rulesub = p6rule('$<A>:=(.)')
match = rulesub('abc')
.local string res
res = match[0]
print res
## errors: Null PMC access in get_string()

this is fixed by
##...
rulesub = p6rule('$<A>:=(.)')
match = rulesub('abc')
.local string res
$P0 = match[0]
res = $P0
print res
## prints: a

it seems that in keyed string access to the match object, the result
is returned directly as a string. in keyed integer access to the match
object, an intermediate pmc must be used. although the workaround is
simple, the lack of symmetry seems odd. is this due to PIR
restrictions, or to PGE implementation?

~jerry

Patrick R. Michaud

Nov 15, 2005, 12:51:10 AM

to jerry gay, p6c

On Mon, Nov 14, 2005 at 06:59:51PM -0800, jerry gay wrote:
> while adding some shiny new pge tests for return context, i came
> across this PIRism:
>

> ##...
> rulesub = p6rule('$<A>:=(.)')
> match = rulesub('abc')
> .local string res
> res = match['A']
> print res
> ## prints: a
>
> using keyed integer access to the match object
> ##...
> rulesub = p6rule('$<A>:=(.)')
> match = rulesub('abc')
> .local string res
> res = match[0]
> print res
> ## errors: Null PMC access in get_string()

> [...]

> it seems that in keyed string access to the match object, the result
> is returned directly as a string. in keyed integer access to the match
> object, an intermediate pmc must be used. although the workaround is
> simple, the lack of symmetry seems odd. is this due to PIR
> restrictions, or to PGE implementation?

Well, I suppose it can be argued either way. First, note that
PGE::Match is a subclass of Hash, so one has available all of the
(non-integer) keyed methods by default. The PGE::Match object
then overloads some (but currently not all) of the *_keyed_int
methods to be able to provide access to the array component of the
Match object.

Thus, while PGE::Match currently defines a C<__get_pmc_keyed_int>
method, it doesn't yet define a C<__get_string_keyed_int> method.
So, a statement like

.local string res
.local pmc match
res = match[0]

is defaulting to using the inherited op from the Hash class, and
since there's not an entry at the 0 key in the hash (as opposed to
the array) you get the null PMC.
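
To make the fall-through concrete, here is a rough Python model of the dispatch (it is not PIR and not PGE's actual code). The class names `ModelHash` and `ModelMatch`, the `NullPMCAccess` exception, and the use of `None` as a stand-in for a null PMC are all assumptions made for illustration; only the method names echo the vtable names mentioned above.

```python
class NullPMCAccess(Exception):
    """Hypothetical stand-in for Parrot's 'Null PMC access' error."""


class ModelHash:
    """Toy model of a Hash with separate typed, keyed accessors."""

    def __init__(self):
        self.hash = {}

    def get_pmc_keyed(self, key):
        # None plays the role of a null PMC when the key is absent.
        return self.hash.get(key)

    def get_string_keyed(self, key):
        value = self.get_pmc_keyed(key)
        if value is None:
            raise NullPMCAccess("Null PMC access in get_string()")
        return str(value)

    # In this model, integer keys default to ordinary hash lookups.
    def get_pmc_keyed_int(self, index):
        return self.get_pmc_keyed(index)

    def get_string_keyed_int(self, index):
        return self.get_string_keyed(index)


class ModelMatch(ModelHash):
    """Toy model of a match object: a hash plus a separate array component."""

    def __init__(self):
        super().__init__()
        self.array = []

    # Only the PMC-returning integer accessor is overridden, as in the thread.
    def get_pmc_keyed_int(self, index):
        if 0 <= index < len(self.array):
            return self.array[index]
        return None


match = ModelMatch()
match.hash["A"] = "a"      # named capture lives in the hash
match.array.append("a")    # positional capture lives in the array

print(match.get_string_keyed("A"))        # string access by name works
print(str(match.get_pmc_keyed_int(0)))    # the intermediate-PMC workaround works
try:
    match.get_string_keyed_int(0)         # inherited accessor looks in the hash
except NullPMCAccess as err:
    print("error:", err)
```

The last call fails in the model for the same reason described above: the inherited string accessor never consults the array component.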

I guess I should go ahead and provide methods in the match objects
for the other *_keyed_int operations, if only to avoid this sort of
confusion and the need to store things to an intermediate pmc.

Another possibility may be to simply use the hash for all captures,
including the "array" captures, thus removing the numbers from
being valid keys. Something I read in S05 leads me to believe
that $/<1> and $/[1] are actually the same, in which case we might
not need the separate "array component" and *_keyed_int methods
for Match objects.
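
As a rough illustration of that single-store idea, the following Python sketch keeps every capture in one dictionary and treats `1` and `"1"` as the same key. The class name `UnifiedCaptures` and the normalization rule are assumptions for the sake of the example, not anything S05 or PGE specifies.

```python
class UnifiedCaptures:
    """Toy capture store where positional and named captures share one dict."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _normalize(key):
        # Assumption: a string of ASCII digits names the same capture as the
        # corresponding integer, so <1> and [1] would address one entry.
        if isinstance(key, str) and key.isascii() and key.isdigit():
            return int(key)
        return key

    def __setitem__(self, key, value):
        self._store[self._normalize(key)] = value

    def __getitem__(self, key):
        return self._store[self._normalize(key)]


caps = UnifiedCaptures()
caps[0] = "a"      # positional capture
caps["A"] = "a"    # named capture
print(caps["0"], caps[0], caps["A"])
```

With one store there is no second lookup path to forget to override, at the cost of numeric strings no longer being usable as distinct named keys.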

Pm

Jerry Gay

Nov 15, 2005, 1:26:05 PM

to Patrick R. Michaud, p6c

On 11/14/05, Patrick R. Michaud <pmic...@pobox.com> wrote:
> On Mon, Nov 14, 2005 at 06:59:51PM -0800, jerry gay wrote:
> > it seems that in keyed string access to the match object, the result
> > is returned directly as a string. in keyed integer access to the match
> > object, an intermediate pmc must be used. although the workaround is
> > simple, the lack of symmetry seems odd. is this due to PIR
> > restrictions, or to PGE implementation?
>
> Well, I suppose it can be argued either way. First, note that
> PGE::Match is a subclass of Hash, so one has available all of the
> (non-integer) keyed methods by default. The PGE::Match object
> then overloads some (but currently not all) of the *_keyed_int
> methods to be able to provide access to the array component of the
> Match object.
>
> Thus, while PGE::Match currently defines a C<__get_pmc_keyed_int>
> method, it doesn't yet define a C<__get_string_keyed_int> method.
> So, a statement like
>
> .local string res
> .local pmc match
> res = match[0]
>
> is defaulting to using the inherited op from the Hash class, and
> since there's not an entry at the 0 key in the hash (as opposed to
> the array) you get the null PMC.
>

it seems to me it could inherit from Array as well, but it may not be
a precise fit.

> I guess I should go ahead and provide methods in the match objects
> for the other *_keyed_int operations, if only to avoid this sort of
> confusion and the need to store things to an intermediate pmc.
>

this is probably the better way to go, and seems easy enough to
implement (and test.) :)
i'll take a stab at it, if you don't mind.

> Another possibility may be to simply use the hash for all captures,
> including the "array" captures, thus removing the numbers from
> being valid keys. Something I read in S05 leads me to believe
> that $/<1> and $/[1] are actually the same, in which case we might
> not need the separate "array component" and *_keyed_int methods
> for Match objects.
>

if my read of S05 is correct, i believe $<1> (and $1) equates to
$/[0]. of course, you may have a newer copy than i do. ;)

i still have some tests to write for match return values, i'll try to
get some tests in for the above based on a close reading of S05.

~jerry

Patrick R. Michaud

Nov 15, 2005, 1:32:38 PM

to jerry gay, p6c

On Tue, Nov 15, 2005 at 10:26:05AM -0800, jerry gay wrote:
> > Thus, while PGE::Match currently defines a C<__get_pmc_keyed_int>
> > method, it doesn't yet define a C<__get_string_keyed_int> method.
> > So, a statement like
> >
> > .local string res
> > .local pmc match
> > res = match[0]
> >
> > is defaulting to using the inherited op from the Hash class, and
> > since there's not an entry at the 0 key in the hash (as opposed to
> > the array) you get the null PMC.
> >
> it seems to me it could inherit from Array as well, but it may not be
> a precise fit.

Worse, I think the two might interact in strange and undesirable
ways.

> > I guess I should go ahead and provide methods in the match objects
> > for the other *_keyed_int operations, if only to avoid this sort of
> > confusion and the need to store things to an intermediate pmc.
> >
> this is probably the better way to go, and seems easy enough to
> implement (and test.) :)
> i'll take a stab at it, if you don't mind.

Sure, that'd be great!

> if my read of S05 is correct, i believe $<1> (and $1) equates to
> $/[0]. of course, you may have a newer copy than i do. ;)

I have a newer copy. $<0>, $0, and $/[0] are now all the same.

Pm

Larry Wall

Nov 15, 2005, 3:28:30 PM

to p6c, perl6-l...@perl.org

On Tue, Nov 15, 2005 at 12:32:38PM -0600, Patrick R. Michaud wrote:

: On Tue, Nov 15, 2005 at 10:26:05AM -0800, jerry gay wrote:
: > > Thus, while PGE::Match currently defines a C<__get_pmc_keyed_int>
: > > method, it doesn't yet define a C<__get_string_keyed_int> method.
: > > So, a statement like
: > >
: > > .local string res
: > > .local pmc match
: > > res = match[0]
: > >
: > > is defaulting to using the inherited op from the Hash class, and
: > > since there's not an entry at the 0 key in the hash (as opposed to
: > > the array) you get the null PMC.
: > >
: > it seems to me it could inherit from Array as well, but it may not be
: > a precise fit.
:
: Worse, I think the two might interact in strange and undesirable
: ways.

Inheritance is wrong here anyway. We need some kind of basic Tree node
object that *does* Hash, Array, and Item, but isn't any of them.

Think about how you'd want to represent XML, for instance:

    ~$obj       name of tag, probably
    +$obj       number of elements?
    +$obj[]     number of elements?
    +$obj{}     number of attributes?
    $obj[]      ordered child elements
    $obj{}      unordered attributes

But the scalar values don't match up with how Match objects work, so it
would likely have to be:

    ~$obj       representation of entire <tag>...</tag>.
    +$obj       +~$obj (0 with warning?)

Another approach would be to say that we make Hash smart enough to
behave like an array or a scalar in context, and then we write

    ~%obj       name of tag, probably
    +%obj       number of attributes?
    +%obj[]     number of elements?
    %obj[]      elements
    %obj{}      attributes

But then hashes would have to store scalars and arrays as "hidden"
keys, and we still have an inconsistent scalar interface. Plus it
smacks of pseudo-hashery.

Yet another approach is to reinvent typeglobish objects (but without
confusing them with symbol table entries.) But we've stolen the *
sigil since then. And it might be more readable to simply be able
to declare highlanderish variables such that

my Node $obj;
my @obj ::= $obj[];
my %obj ::= $obj{};

And otherwise we just stick with $ sigil and semantics. Basically,
match objects are ordinary objects that merely *contain* other types,
while providing Str, Int, Num, Array and Hash roles.
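
A minimal Python sketch of that containment idea follows. The `Node` class and its field names are hypothetical, and Python's special methods only loosely approximate the Str, Int, Array and Hash roles; returning 0 for non-numeric text is one possible reading of "0 with warning?", without the warning.

```python
class Node:
    """Toy tree node that contains a list and a dict rather than inheriting from either."""

    def __init__(self, text, children=(), attributes=None):
        self.text = text
        self.children = list(children)
        self.attributes = dict(attributes or {})

    def __str__(self):
        # Roughly ~$obj: the string view of the whole node.
        return self.text

    def __int__(self):
        # Roughly +$obj as +~$obj: numify the string view, falling back to 0.
        try:
            return int(self.text)
        except ValueError:
            return 0

    def __getitem__(self, key):
        # Integer keys go to the ordered children, other keys to the attributes.
        if isinstance(key, int):
            return self.children[key]
        return self.attributes[key]


child = Node("b")
node = Node("42", children=[child], attributes={"id": "x"})
print(str(node), int(node), str(node[0]), node["id"])
```

The node answers array-style and hash-style questions, yet `isinstance(node, (list, dict))` is false, which is the point of "does" rather than "is".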

Of course, we could give syntactic relief in just the declaration
on the order of

my ?obj; # the '?' is negotiable, of course

that implies the creation of a highlander variable. Outside the
declaration you'd only be able to use one of the real sigils.
Interestingly, though, that kind of implies that ^obj as an rvalue
would give the type of $obj in that scope.

One interesting question is, if you said

my ?obj := %random_hash;

whether it would try to emulate the $ and @ views or merely fail, or
something in between, like returning null lists and undefined values.
Presumably &obj would likely fail unless ?obj contained a code object
of some sort. It would make sense to allow tests for "exists &obj"
and such.

And then maybe we'd be talking about the ?/ variable rather than the
$/ variable. And we'd get @/ and %/, FWIW. Of course, none of this
highlander stuff buys you anything as soon as you go down a level in
the tree (unless you realias the child nodes). To my mind the main
benefit of declaring something like ?obj rather than $obj is that
you are documenting the expected polymorphism, and only secondarily
that you're claiming all the local "obj" namespaces.

[Followups to p6l.]

Larry

Patrick R. Michaud

Nov 15, 2005, 3:38:32 PM

to p6c

On Tue, Nov 15, 2005 at 12:28:30PM -0800, Larry Wall wrote:
> On Tue, Nov 15, 2005 at 12:32:38PM -0600, Patrick R. Michaud wrote:
> : On Tue, Nov 15, 2005 at 10:26:05AM -0800, jerry gay wrote:
> : > > Thus, while PGE::Match currently defines a C<__get_pmc_keyed_int>
> : > > method, it doesn't yet define a C<__get_string_keyed_int> method.

> [...]

> Inheritance is wrong here anyway. We need some kind of basic Tree node
> object that *does* Hash, Array, and Item, but isn't any of them.

Agreed; I just went with Hash for the time being as a cheap
implementation approach, and provided interfaces that
allow Match objects to act in the various roles we have defined
thus far. I fully expect the actual type and implementation of
Match will change as we define and understand the problem space
a bit better.

The rest of the items in Larry's message I leave for p6l. :-)

Pm

Jerry Gay

Nov 16, 2005, 5:36:59 PM

to Patrick R. Michaud, p6c

On 11/15/05, Patrick R. Michaud <pmic...@pobox.com> wrote:
> On Tue, Nov 15, 2005 at 10:26:05AM -0800, jerry gay wrote:
> > > I guess I should go ahead and provide methods in the match objects
> > > for the other *_keyed_int operations, if only to avoid this sort of
> > > confusion and the need to store things to an intermediate pmc.
> > >
> > this is probably the better way to go, and seems easy enough to
> > implement (and test.) :)
> > i'll take a stab at it, if you don't mind.
>
> Sure, that'd be great!
>

'__get_string_keyed_int' is now implemented for PGE::Match objects.
'number' and 'integer' variants should also make their way into the
code--i'll add them (along with some appropriate tests) next chance i
have.
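
For readers following along, one plausible shape for these typed variants is sketched below in Python. This is not the committed PGE code: the `FixedMatch` class, the `_require` helper, and the `LookupError` are hypothetical, and the sketch simply assumes each typed getter fetches the capture through the PMC accessor and then converts it.

```python
class FixedMatch:
    """Toy model of a match object's array component with typed integer-keyed getters."""

    def __init__(self, positional):
        self.array = list(positional)

    def get_pmc_keyed_int(self, index):
        # None stands in for a null PMC when the capture does not exist.
        if 0 <= index < len(self.array):
            return self.array[index]
        return None

    def _require(self, index):
        # Hypothetical helper: fail clearly instead of converting a null value.
        value = self.get_pmc_keyed_int(index)
        if value is None:
            raise LookupError(f"no positional capture at index {index}")
        return value

    def get_string_keyed_int(self, index):
        return str(self._require(index))

    def get_integer_keyed_int(self, index):
        return int(self._require(index))

    def get_number_keyed_int(self, index):
        return float(self._require(index))


match = FixedMatch(["42"])
res = match.get_string_keyed_int(0)   # no intermediate variable needed
print(res)
```

Each typed getter here does what the earlier workaround did by hand: fetch the capture object first, then convert it.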

~jerry