Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Patterns

0 views
Skip to first unread message

Luke Palmer

unread,
Jan 5, 2007, 3:47:18 PM1/5/07
to perl6language
I propose that we remove the following two lines from the smart match
table in S03:

Hash Any hash entry existence exists $_{$x}
Array Any array contains item* any($_) === $x

These are the two lines with Any on the right side. I want to remove
these so that we can have an extensible ~~ operator.

You can think of $x ~~ $y as saying "$x matches $y", i.e. $y is some
data pattern and we are asking if $x conforms to it. I want to be
able to add my own kinds of patterns, and entries with Any in the
right side undermine this.

Why? Let's say I created a pattern which implemented another textual
matcher alongside Perl's regexes, which was more convenient for some
things (I shouldn't have to convince you that this is possible and
useful). Let's call it Irregex. I want it to work just like Perl's
regexes; i.e. I want it so that if you ask whether an array matches
it, it will match against the elements of the array as though they
were characters. So I just overload ~~ to work on (Array, Irregex)
and that overrules (Array, Any) so there is no MMD ambiguity.

However! Let's say I'm using a payroll library (or something, sorry
for the mundane examples; I'm trying to keep it realistic) which sends
a notification if any user's status matches a pattern, and it is
implemented like so:

method notify(Any|Junction $pattern) {
if @.users>>.<status> ~~ $pattern {
# send notification
}
}

This is perfectly fine, elegant code given the ~~ table. But it
breaks since we have added our new pattern type! It doesn't check if
any user's status matches, it checks if the users' statuses'
concatenated considered as atoms matches, totally wrong!

So, I'm essentially asking for ~~ to be singly-dispatched based on its
right argument, which gets to pick how to interpret the left one. In
order to encourage this usage, I propose a design like so:

role Pattern {
method match($item) {...}
}
# Code, Bool, Undef, Whatever, Num, Junction, Str, Hash, ... ...
all do Pattern

sub *infix:<~~> ($item, Pattern $pat) {
$pat.match($item);
}

So to add a new pattern, you just implement the Pattern role, and you are safe.

If you are very attached do those two rows of the smart match table,
we can reimplement them as Pattern objects, at the expense of some
unreadability:

given @array {
when 42 { ... }
}

Becomes:

given @array {
when contains 42 { ... }
}

etc.

Luke

Larry Wall

unread,
Jan 5, 2007, 5:48:46 PM1/5/07
to perl6language
On Fri, Jan 05, 2007 at 08:47:18PM +0000, Luke Palmer wrote:
: I propose that we remove the following two lines from the smart match

: table in S03:
:
: Hash Any hash entry existence exists $_{$x}
: Array Any array contains item* any($_) === $x
:
: These are the two lines with Any on the right side. I want to remove
: these so that we can have an extensible ~~ operator.
:
: You can think of $x ~~ $y as saying "$x matches $y", i.e. $y is some
: data pattern and we are asking if $x conforms to it. I want to be
: able to add my own kinds of patterns, and entries with Any in the
: right side undermine this.

It's much worse than just those two entries, because any entry not marked
with * is currently considered reversible. However, I think you're
the path toward sanity.

: Why? Let's say I created a pattern which implemented another textual


: matcher alongside Perl's regexes, which was more convenient for some
: things (I shouldn't have to convince you that this is possible and
: useful). Let's call it Irregex. I want it to work just like Perl's
: regexes; i.e. I want it so that if you ask whether an array matches
: it, it will match against the elements of the array as though they
: were characters. So I just overload ~~ to work on (Array, Irregex)
: and that overrules (Array, Any) so there is no MMD ambiguity.
:
: However! Let's say I'm using a payroll library (or something, sorry
: for the mundane examples; I'm trying to keep it realistic) which sends
: a notification if any user's status matches a pattern, and it is
: implemented like so:
:
: method notify(Any|Junction $pattern) {
: if @.users>>.<status> ~~ $pattern {
: # send notification
: }
: }
:
: This is perfectly fine, elegant code given the ~~ table. But it
: breaks since we have added our new pattern type! It doesn't check if
: any user's status matches, it checks if the users' statuses'
: concatenated considered as atoms matches, totally wrong!

I'm not so worried about extensibility as I am about consistency between
compile time and run time, such that the optimizer can optimize without
surprising the user. But this also argues against any anonymous Any on
the right. If all the when expressions are Num, the optimizer wants to
be able to evaluate the given in a numeric context whether or not you
put + on the front of it.

: So, I'm essentially asking for ~~ to be singly-dispatched based on its


: right argument, which gets to pick how to interpret the left one. In
: order to encourage this usage, I propose a design like so:
:
: role Pattern {
: method match($item) {...}
: }
: # Code, Bool, Undef, Whatever, Num, Junction, Str, Hash, ... ...
: all do Pattern
:
: sub *infix:<~~> ($item, Pattern $pat) {
: $pat.match($item);

: }

One little problem here is that .match is currently defined the reverse
of that for the method form of m//.

: So to add a new pattern, you just implement the Pattern role, and you are

: safe.
:
: If you are very attached do those two rows of the smart match table,
: we can reimplement them as Pattern objects, at the expense of some
: unreadability:
:
: given @array {
: when 42 { ... }
: }
:
: Becomes:
:
: given @array {
: when contains 42 { ... }
: }
:
: etc.

Rather than special casing array matching, I'd say we want a general form
for explicitly reversing the figure/ground. If .match demands a pattern on
the left, then

given @array {
when .match(42) {...}
}

will already reverse the test to mean 42 ~~ @array. As I say, we need to
figure out what to do with "foo".match(/bar/) though.

Larry

Larry Wall

unread,
Jan 5, 2007, 9:08:35 PM1/5/07
to perl6language
At the moment I'm leaning toward:

$pattern.accepts($thing)
$pattern.rejects($thing)

and going the other way

$thing ~~ $pattern
$thing.match($pattern)
$thing.subst($pattern)

and most generally, something like:

$thing.does($pattern)
$thing.when($pattern)

By the way, the Pattern type is already called Selector in the
synopses. Or perhaps Selector is a bit more general. Arguably Pattern
is the role that defines .accepts and .rejects, but Selector is
really the Pattern|Junction type you use in a signature. But we can't
just have Pattern does Junction or everything turns into a junction. :/

Or maybe they should be the other way around, where Selector is the role
and Pattern is the sig type. Or maybe the sig type should be Where so
we can say

multi grep (Where $where, *@list) {...}

and call it with :where{...} or where => {...}. Selector is kind of
a stupid name in any event.

Anyway, that gives us:

given $pattern {
when .accepts(42) {...}
}

which given typical usage patterns of switch statements is probably
adequately huffmanized, unless we want to go for something shorter
than accepts/rejects, like

acc/rej
pix/nix
ok/bogus

Of course, all of the .rejects() variants could simply be spelled
.accepts().not, but that seem unnecessarily cruel.

Larry

Jonathan Lang

unread,
Jan 6, 2007, 9:49:26 PM1/6/07
to perl6language
Larry Wall wrote:
> Anyway, that gives us:
>
> given $pattern {
> when .accepts(42) {...}
> }
>
> which given typical usage patterns of switch statements is probably
> adequately huffmanized, unless we want to go for something shorter
> than accepts/rejects, like
>
> acc/rej
> pix/nix
> ok/bogus

...at which point, legibility becomes an issue. .accepts() and
.rejects() seem to work well enough on a case-by-case basis - although
it occurs to me that someone who intends to put the pattern first
within a given/when construct is probably going to want to do so for
every case, or very nearly so:

given $pattern {
when .accepts($a) { ... }
when .accepts($b) { ... }
when .accepts($c) { ... }
when .accepts($d) { ... }
}

It would be nice if there were some way to "factor out" the .accepts()
in the above code. Maybe something like:

given $pattern {
when $a { ... } # if $a ~~ $pattern { ... ; next }
when $b { ... } # if $b ~~ $pattern { ... ; next }
when $c { ... } # if $c ~~ $pattern { ... ; next }
when $d { ... } # if $d ~~ $pattern { ... ; next }
} is backward;

Although I don't like the fact that you have to wait until reading the
whole block to find out that the cases are being processed inside-out.

--

Would it be possible to put a closure on the left and an argument list
on the right, with the effect being that the closure gets called using
the argument list? This would let you do something like:

given { ^x ~~ $pattern } {
when $a { ... } # if $a ~~ $pattern { ... ; next }
when $b { ... } # if $b ~~ $pattern { ... ; next }
when $c { ... } # if $c ~~ $pattern { ... ; next }
when $d { ... } # if $d ~~ $pattern { ... ; next }
}

or

given { ^x <= $number < ^y } {
when 1, 2 { ... } $ if 1 <= $number < 2 { ... ; next }
when 2, 3 { ... } $ if 2 <= $number < 3 { ... ; next }
}

(Yes, I know that this second case could be handled more flexibly
using right-side Ranges. TIMTOWTDI.)

The main problem that I have with this idea is its rigidity. The
_only_ way that the topic can be used in the above is as a test for
the 'when' clauses; and it messes with the 'when *' idiom. (But then,
any code that follows the final 'when' in a block is already the 'if
all else fails...' code; 'when *' is merely syntactic sugar to
highlight this fact.)

--

It would be nice to have something that is to 'given' as 'loop' is to
'for', letting you specify a topic and a separate case tester up
front:

test $pattern, { $^x ~~ $_ } {
when $a { ... } # if $a ~~ $pattern { ... ; next }
when $b { ... } # if $b ~~ $pattern { ... ; next }
when $c { ... } # if $c ~~ $pattern { ... ; next }
when $d { ... } # if $d ~~ $pattern { ... ; next }
}

Or, to avoid additional keywords, just say that a 'given' statement
with two arguments uses the first argument as its topic and the second
argument as the case tester to replace '{ $_ ~~ $^x }' in 'when'
clauses. Better, use 'when' as is, but introduce a 'test' alternative
that is customized by the second parameter:

given $pattern, { $^x ~~ $_ } {
test $a { ... } # if $a ~~ $pattern { ... ; next }
test $b { ... } # if $b ~~ $pattern { ... ; next }
test $c { ... } # if $c ~~ $pattern { ... ; next }
test $d { ... } # if $d ~~ $pattern { ... ; next }
when * { ... } # if $pattern ~~ * { ... ; next }
}

That way, 'given' and 'when' still behave exactly as currently
described; the case tester only comes into play when you explicitly
ask for it.

--

At this point, though, the case tester is looking like it might be
better as a property of the 'given' block:

given $pattern {
TEST { $^x ~~ $_ }
test $a { ... } # if $a ~~ $pattern { ... ; next }
test $b { ... } # if $b ~~ $pattern { ... ; next }
test $c { ... } # if $c ~~ $pattern { ... ; next }
test $d { ... } # if $d ~~ $pattern { ... ; next }
when * { ... } # if $pattern ~~ * { ... ; next }
}

Or go ahead and let TEST change the behavior of 'when' within the
block, and rely on nested blocks to allow different kinds of tests
within a 'given' block:

given $pattern
{
{
TEST -> $x { $x ~~ $_ }
when $a { ... } # if $a ~~ $pattern { ... ; next }
when $b { ... } # if $b ~~ $pattern { ... ; next }
when $c { ... } # if $c ~~ $pattern { ... ; next }
when $d { ... } # if $d ~~ $pattern { ... ; next }
}
when * { ... } # if $pattern ~~ * { ... ; next }
}

...and someone silly enough to want to frequently change the case
tester within a given block is advised to use 'if' instead.

What do you think?

--
Jonathan "Dataweaver" Lang

Ashley Winters

unread,
Jan 7, 2007, 12:01:15 AM1/7/07
to perl6language
On 1/5/07, Larry Wall <la...@wall.org> wrote:
> Anyway, that gives us:
>
> given $pattern {
> when .accepts(42) {...}
> }

I think this type of usage should be encouraged with a bit more
huffmanization. My first thought would be to add C<against> to invert
the arguments to ~~ versus C<with>.

given @something {
when $this { ... } # @something ~~ $this
against $that { ... } # $that ~~ @something
}

That would help keep the ~~ DWIM table from trying to guess on which
side you really wanted @something on.

- Ashley Winters

Larry Wall

unread,
Jan 7, 2007, 12:28:43 AM1/7/07
to perl6language
Lots of interesting ideas. But I don't think the reverse-test
situation will arise all that frequently. How 'bout we let the user
just say:

my macro statement_control:<test> { "when .accepts: " }

or some such...

Larry

Jonathan Lang

unread,
Jan 7, 2007, 6:33:19 AM1/7/07
to perl6language

IIRC, you don't even need .accepts for this. You could just say

given $pattern {
when $a ~~ $_ { ... }
when $b ~~ $_ { ... }
when $c ~~ $_ { ... }
}

...since an explicit smart-match is a boolean expression (it is, isn't
it?), and boolean expressions implicitly smart-match according to
their truth. Right?

--
Jonathan "Dataweaver" Lang

Larry Wall

unread,
Jan 7, 2007, 10:23:28 AM1/7/07
to perl6language
On Sun, Jan 07, 2007 at 03:33:19AM -0800, Jonathan Lang wrote:
: IIRC, you don't even need .accepts for this. You could just say

:
: given $pattern {
: when $a ~~ $_ { ... }
: when $b ~~ $_ { ... }
: when $c ~~ $_ { ... }
: }
:
: ...since an explicit smart-match is a boolean expression (it is, isn't
: it?), and boolean expressions implicitly smart-match according to
: their truth. Right?

Very nearly, except that ~~ merely promises to return something
that can be used as a boolean, not something that is of Bool type.
And, in fact, for a normal regex you'd get a Match object, which as
a form of Capture would then try to compare itself back to $pattern,
which is not what you want. So you'd have to force it:

given $pattern {
when true $a ~~ $_ { ... }
when true $b ~~ $_ { ... }
when true $c ~~ $_ { ... }
}

Larry

0 new messages