[svn:perl6-synopsis] r13519 - doc/trunk/design/syn

la...@cvs.perl.org

Jan 8, 2007, 5:35:44 AM
to perl6-l...@perl.org
Author: larry
Date: Mon Jan 8 02:35:42 2007
New Revision: 13519

Modified:
doc/trunk/design/syn/S03.pod

Log:
A bunch more "tough love" applied to the smartmatching semantics.
Change $x notation to X notation to better reflect metasyntactic nature.
Num and Str as patterns now consistently force + and ~ context for
optimizability. They no longer "autogrep" anything.
Arrays now always notionally match the entire list, but can use * as wildcard.
Added table of deprecated semantics and new notations to get the same effect
using pattern retyping, * wildcarding, or just ordinary methods.
Attempted to clarify when buffers can be used as strings.
Renamed LazyStr to LazyCat, which now only cats in string context.
Unified treatment of sets and hash keys under junctive methods.


Modified: doc/trunk/design/syn/S03.pod
==============================================================================
--- doc/trunk/design/syn/S03.pod (original)
+++ doc/trunk/design/syn/S03.pod Mon Jan 8 02:35:42 2007
@@ -12,9 +12,9 @@

Maintainer: Larry Wall <la...@wall.org>
Date: 8 Mar 2004
- Last Modified: 7 Jan 2007
+ Last Modified: 8 Jan 2007
Number: 3
- Version: 86
+ Version: 87

=head1 Changes to Perl 5 operators

@@ -601,7 +601,7 @@
compilation unit). Smart matching is generally done on the current
"topic", that is, on C<$_>. In the table below, C<$_> represents the
left side of the C<~~> operator, or the argument to a C<given>,
-or to any other topicalizer. C<$x> represents the pattern to be
+or to any other topicalizer. C<X> represents the pattern to be
matched against on the right side of C<~~>, or after a C<when>.

The first section contains privileged syntax; if a match can be done
@@ -615,105 +615,80 @@
is still somewhat privileged, insofar as the C<~~> operator is one
of the few operators in Perl that does not use multiple dispatch.
Instead, type-based smart matches singly dispatch to an underlying
-method belonging to the C<$x> pattern object.
+method belonging to the C<X> pattern object.

In other words, smart matches are dispatched first on the basis of the
-pattern's form or type (the C<$x> below), and then that pattern itself
+pattern's form or type (the C<X> below), and then that pattern itself
decides whether and how to pay attention to the type of the topic
(C<$_>). So the second column below is really the primary column.
The C<Any> entries in the first column indicate a pattern that either
doesn't care about the type of the topic, or that picks that entry
as a default because the more specific types listed above it didn't match.

-    $_        $x        Type of Match Implied   Match if
-    ======    =====     =====================   =============
-    Any       Code:($)  scalar sub truth        $x($_)
-    Any       Code:()   simple closure truth    $x() (ignoring $_)
-    Any       undef     undefined               not defined $_
+    $_        X         Type of Match Implied   Match if (given $_)
+    ======    =====     =====================   ===================
+    Any       Code:($)  scalar sub truth        X($_)
+    Any       Code:()   simple closure truth    X() (ignoring $_)
+    Any       undef     undefined               not .defined
     Any       *         block signature match   block successfully binds to |$_
-    Any       .foo      method truth            ?any($_.foo)
-    Any       .foo(...) method truth            ?any($_.foo(...))
-    Any       .(...)    list sub call truth     ?any($_(...))
-    Any       .[...]    array value slice truth ?any($_[...])
-    Any       .{...}    hash value slice truth  ?any($_{...})
-    Any       .<...>    hash value slice truth  ?any($_<...>)
-
-    Any       Bool      simple truth            $x.true given $_
-
-    Num       Num       numeric equality        $_ == $x
-    Capture   Num       numeric equality        +$_ == $x
-    Array     Num       array contains number   any(@$_) == $x
-    Hash      Num       hash key existence      $_.exists($x)
-    Byte      Num       numeric equality        +$_ == $x
-    Any       Num       numeric equality        +$_ == $x
-
-    Str       Str       string equality         $_ eq $x
-    Capture   Str       string equality         ~$_ eq $x
-    Array     Str       array contains string   any(@$_) eq $x
-    Hash      Str       hash key existence      $_.exists($x)
-    Byte      Str       string equality         ~$_ eq $x
-    Any       Str       string equality         ~$_ eq $x
-
-    Buf       Buf       buffer equality         $_ eq $x
-    Str       Buf       string equality         $_ eq Str($x)
-    Array     Buf       arrays are comparable   $_ »===« @$x
-    Hash      Buf       hash key existence      $_.exists($x)
-    Any       Buf       buffer equality         Buf($_) eq $x
-
-    Buf       Byte      buffer contains byte    $_.match(/$x/)
-    Str       Byte      string contains byte    Buf($_).match(/$x/)
-
-    Str       Char      string contains char    $_.match(/$x/)
-    Buf       Char      string contains char    Str($_).match(/$x/)
-
-    Set       Set       identical sets          $_ === $x
-    Hash      Set       hash keys same set      $_.keys === $x
-    Array     Set       array equiv to set      Set($_) === $x
-    Any       Set       identical sets          Set($_) === $x
-
-    Array     Array     arrays are comparable   $_ »===« $x
-    Buf       Array     arrays are comparable   @$_ »===« $x
-    Str       Array     array contains string   any(@$x) eq $_
-    Num       Array     array contains number   any(@$x) == $_
-    Hash      Array     hash slice exists       $_.exists(any(@$x))
-    Scalar    Array     array contains object   any(@$x) === $_
-    Set       Array     array equiv to set      $_ === Set($x)
-    Any       Array     lists are comparable    @$_ »===« $x
-
-    Hash      Hash      hash keys same set      $_.keys === $x.keys
-    Set       Hash      hash keys same set      $_ === $x.keys
-    Array     Hash      hash slice existence    $x.exists(any @$_)
-    Regex     Hash      hash key grep           any($_.keys) === /$x/
-    Scalar    Hash      hash entry existence    $x.exists($_)
-    Any       Hash      hash slice existence    $x.exists(any @$_)
-
-    Str       Regex     string pattern match    $_.match($x)
-    Hash      Regex     hash key grep           any($_.keys) === /$x/
-    Array     Regex     match array as string   cat(@$_).match($x)
-    Any       Regex     pattern match           $_.match($x)
-
-    Num       Range     in numeric range        $x.min <= $_ <= $x.max (mod ^'s)
-    Str       Range     in string range         $x.min le $_ le $x.max (mod ^'s)
-    Any       Range     in generic range        [!after] $x.min,$_,$x.max (etc.)
-
-    Any       Type      type membership         $_.does($x)
-
-    Signature Signature sig compatibility       $_ is a subset of $x ???
-    Code      Signature sig compatibility       $_.sig is a subset of $x ???
-    Capture   Signature parameters bindable     $_ could bind to $x (doesn't!)
-    Any       Signature parameters bindable     |$_ could bind to $x (doesn't!)
-
-    Signature Capture   parameters bindable     $x could bind to $_
-
-    Set       Scalar    set member exists       any($_.keys) === $x
-    Hash      Scalar    hash key exists         any($_.keys) === $x
-    Array     Scalar    array contains item     any(@$_) === $x
-    Scalar    Scalar    scalars are identical   $_ === $x
+    Any       .foo      method truth            ?X i.e. ?.foo
+    Any       .foo(...) method truth            ?X i.e. ?.foo
+    Any       .(...)    sub call truth          ?X i.e. ?.(...)
+    Any       .[...]    array value slice truth ?all(X) i.e. ?all(.[...])
+    Any       .{...}    hash value slice truth  ?all(X) i.e. ?all(.{...})
+    Any       .<...>    hash value slice truth  ?all(X) i.e. ?all(.<...>)
+
+    Any       Bool      simple truth            X
+    Any       Num       numeric equality        +$_ == X
+    Any       Str       string equality         ~$_ eq X
+
+    Set       Set       identical sets          $_ === X
+    Hash      Set       hash keys same set      $_.keys === X
+    Any       Set       force set comparison    Set($_) === X
+    Set       Subset    subset                  .any === X.all
+    Hash      Subset    subset of hash keys     .any === X.all
+    Any       Subset    force set comparison    .Set.any === X.all
+    Set       Superset  superset                .any === X.all
+    Hash      Superset  superset of hash keys   .any === X.all
+    Any       Superset  force set comparison    .Set.any === X.all
+
+    Array     Array     arrays are comparable   $_ «===» X (dwims * wildcards!)
+    Set       Array     array equiv to set      $_ === Set(X)
+    Any       Array     lists are comparable    @$_ «===» X
+
+    Hash      Hash      hash keys same set      $_.keys === X.keys
+    Set       Hash      hash keys same set      $_ === X.keys
+    Array     Hash      hash slice existence    X.exists(any @$_)
+    Regex     Hash      hash key grep           any($_.keys) === /X/
+    Scalar    Hash      hash entry existence    X.exists($_)
+    Any       Hash      hash slice existence    X.exists(any @$_)
+
+    Str       Regex     string pattern match    .match(X)
+    Hash      Regex     hash key "boolean grep" .any.match(/X/)
+    Array     Regex     array "boolean grep"    .any.match(/X/)
+    Any       Regex     pattern match           .match(X)
+
+    Num       Range     in numeric range        X.min <= $_ <= X.max (mod ^'s)
+    Str       Range     in string range         X.min le $_ le X.max (mod ^'s)
+    Any       Range     in generic range        [!after] X.min,$_,X.max (etc.)
+
+    Any       Type      type membership         $_.does(X)
+
+    Signature Signature sig compatibility       $_ is a subset of X ???
+    Code      Signature sig compatibility       $_.sig is a subset of X ???
+    Capture   Signature parameters bindable     $_ could bind to X (doesn't!)
+    Any       Signature parameters bindable     |$_ could bind to X (doesn't!)
+
+    Signature Capture   parameters bindable     X could bind to $_
+
+    Any       Any       scalars are identical   $_ === X
+
+The final rule is applied only if no other pattern type claims X.

All smartmatch types are scalarized; both C<~~> and C<given>/C<when>
provide scalar contexts to their arguments, and autothread any
junctive matches so that the eventual dispatch to C<.accepts> never
-sees anything "plural". So both C<$_> and C<$x> above are potentially
+sees anything "plural". So both C<$_> and C<X> above are potentially
container objects that are treated as scalars. (You may hyperize
C<~~> explicitly, though. In this case all smartmatching is done
using the type-based dispatch to C<.accepts>, not the form-based
@@ -721,11 +696,11 @@

The exact form of the underlying type-based method dispatch is:

- $x.accepts($_) # for ~~
- $x.rejects($_) # for !~~
+ X.accepts($_) # for ~~
+ X.rejects($_) # for !~~

As a single dispatch call this pays attention only to the type of
-C<$x> initially. The C<accepts> method interface is defined by the
+C<X> initially. The C<accepts> method interface is defined by the
C<Pattern> role. Any class composing the C<Pattern> role may choose
to provide a single C<accepts> method to handle everything, which
corresponds to those pattern types that have only one entry with
@@ -747,15 +722,28 @@
KeySet KeyBag KeyHash Hash
Class Subset Enum Role Type
Subst Grammar Regex
- Buf Char LazyStr Str
+ Char LazyCat Str
Int UInt etc. Num
Match Capture
+ Byte Str or Int
+ Buf Str or Array of Int

(Note, however, that these mappings can be overridden by explicit
definition of the appropriate C<accepts> and C<rejects> methods.
If the redefinition occurs at compile time prior to analysis of the
smart match then the information is also available to the optimizer.)

+A C<Buf> type containing any bytes or integers outside the ASCII
+range may silently promote to a C<Str> type for pattern matching if
+and only if its relationship to Unicode is clearly declared or typed.
+This type information might come from an input filehandle, or the
+C<Buf> role may be a parametric type that allows you to instantiate
+buffers with various known encodings. In the absence of such typing
+information, you may still do pattern matching against the buffer, but
+(apart from assuming the lowest 7 bits represent ASCII) any attempt
+to treat the buffer as other than a sequence of integers is erroneous,
+and warnings may be generously issued.
+
Matching against a C<Grammar> object will call the C<TOP> method
defined in the grammar. The C<TOP> method may either be a rule
itself, or may call the actual top rule automatically. How the
@@ -794,12 +782,12 @@
call to the underlying C<accepts> method using $_ as the pattern.
For example:

-    $_        $value    Type of Match Wanted    What to use on the right
-    ======    ======    ====================    ========================
-    Code      Any       scalar sub truth        .accepts($value) or .($value)
-    Range     Any       in range                .accepts($value)
-    Type      Any       type membership         .accepts($value) or .does($value)
-    Regex     Any       pattern match           .accepts($value)
+    $_        X         Type of Match Wanted    What to use on the right
+    ======    ===       ====================    ========================
+    Code      Any       scalar sub truth        .accepts(X) or .(X)
+    Range     Any       in range                .accepts(X)
+    Type      Any       type membership         .accepts(X) or .does(X)
+    Regex     Any       pattern match           .accepts(X)
     etc.

Similar tricks will allow you to bend the default matching rules for
@@ -819,6 +807,37 @@
accepts $c { ... }
}

+Various proposed-but-deprecated smartmatch behaviors may be easily
+(and we hope, more readably) emulated as follows:
+
+    $_        X         Type of Match Wanted    What to use on the right
+    ======    ===       ====================    ========================
+    Array     Num       array element truth     .[X]
+    Array     Num       array contains number   *,X,*
+    Array     Str       array contains string   *,X,*
+    Array     Seq       array begins with seq   X,*
+    Array     Seq       array contains seq      *,X,*
+    Array     Seq       array ends with seq     *,X
+    Hash      Str       hash element truth      .{X}
+    Hash      Str       hash key existence      .exists(X)
+    Hash      Num       hash element truth      .{X}
+    Hash      Num       hash key existence      .exists(X)
+    Buf       Int       buffer contains int     .match(X)
+    Str       Char      string contains char    .match(X)
+    Str       Str       string contains string  .match(X)
+    Array     Scalar    array contains item     .any === X
+    Str       Array     array contains string   X.any
+    Num       Array     array contains number   X.any
+    Scalar    Array     array contains object   X.any
+    Hash      Array     hash slice exists       .exists(X.all) .exists(X.any)
+    Any       Set       Subset relation         Subset(X)
+    Any       Hash      Subset relation         Subset(X)
+    Any       Set       Superset relation       Superset(X)
+    Any       Hash      Superset relation       Superset(X)
+    Any       Set       Sets intersect          .exists(X.any)
+    Set       Array     Subset relation         X,* # (conjectured)
+    Array     Regex     match array as string   .cat.match(X)
+
Boolean expressions are those known to return a boolean value, such
as comparisons, or the unary C<?> operator. They may reference C<$_>
explicitly or implicitly. If they don't reference C<$_> at all, that's
@@ -840,6 +859,10 @@

Better, just use an C<if> statement.

+Note also that regex matching does I<not> return a C<Bool>, but merely
+a C<Match> object that can be used as a boolean value. Use an explicit
+C<?> or C<true> to force a C<Bool> value if desired.
+
The primary use of the C<~~> operator is to return a boolean value in
a boolean context. However, for certain operands such as regular
expressions, use of the operator within scalar or list context transfers
@@ -855,8 +878,8 @@
the replication count of those unique keys. (Obviously, a C<Set> can
have only 0 or 1 replication because of the guarantee of uniqueness).

-The C<LazyStr> type allows you to have an infinitely extensible string.
-You can match an array or iterator by feeding it to a C<LazyStr>,
+The C<LazyCat> type allows you to have an infinitely extensible string.
+You can match an array or iterator by feeding it to a C<LazyCat>,
which is essentially a C<Str> interface over an iterator of some sort.
Then a C<Regex> can be used against it as if it were an ordinary
string. The C<Regex> engine can ask the string if it has more
@@ -867,23 +890,25 @@
the whole string, it may feel compelled to slurp in the rest of
the string, which may or may not be expeditious.)

-The C<cat> operator in scalar context takes a (potentially lazy) list
-and returns a C<LazyStr> object, so you can search a gather like this:
+The C<cat> operator takes a (potentially lazy) list and returns a
+C<LazyCat> object. In string context this coerces each of its elements
+to strings lazily, and behaves as a string of indeterminate length.
+You can search a gather like this:

my $lazystr := cat gather for @foo { take .bar }

$lazystr ~~ /pattern/;

-The C<LazyStr> interface allows the regex to match element boundaries
+The C<LazyCat> interface allows the regex to match element boundaries
with the C<< <,> >> assertion, and the C<StrPos> objects returned by
the match can be broken down into elements index and position within
that list element. If the underlying data structure is a mutable
array, changes to the array (such as by C<shift> or C<pop>) are tracked
-by the C<LazyStr> so that the element numbers remain correct. Strings,
+by the C<LazyCat> so that the element numbers remain correct. Strings,
arrays, lists, sequences, captures, and tree nodes can all be pattern
matched by regexes or by signatures more or less interchangeably.
However, the structure searched is not guaranteed to maintain a C<.pos>
-unless you are searching a C<Str> or C<LazyStr>.
+unless you are searching a C<Str> or C<LazyCat>.

=head1 Meta operators

@@ -1517,6 +1542,11 @@

will not complain if $b happens to contain a junction at runtime.

+Junctive methods on arrays, lists, and sets work just like the
+corresponding list operators. However, junctive methods on a hash
+make a junction of only the hash's keys. Use the listop form (or an
+explicit C<.pairs>) to make a junction of pairs.
+
=head1 Chained comparisons

Perl 6 supports the natural extension to the comparison operators,

Darren Duncan

Jan 8, 2007, 6:01:05 AM
to perl6-l...@perl.org
At 2:35 AM -0800 1/8/07, la...@cvs.develooper.com wrote:
>+    Set       Set       identical sets          $_ === X
>+    Hash      Set       hash keys same set      $_.keys === X
>+    Any       Set       force set comparison    Set($_) === X
>+    Set       Subset    subset                  .any === X.all
>+    Hash      Subset    subset of hash keys     .any === X.all
>+    Any       Subset    force set comparison    .Set.any === X.all
>+    Set       Superset  superset                .any === X.all
>+    Hash      Superset  superset of hash keys   .any === X.all
>+    Any       Superset  force set comparison    .Set.any === X.all

With the last 3 (Superset), shouldn't it be ".all === X.any", which
is the opposite of what Subset has?
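
For readers checking the two candidate forms, here is a minimal sketch, in Python rather than Perl 6, of one possible expansion of each junctive comparison over plain collections. It assumes that the `all` side becomes the outer quantifier when an `any` junction is compared with an `all` junction; the thread does not state the expansion order, and the function names are made up for illustration.

```python
def any_eq_all(topic, pattern):
    """Model of `.any === X.all` (assumed expansion):
    every pattern element equals some topic element."""
    return all(any(t == x for t in topic) for x in pattern)


def all_eq_any(topic, pattern):
    """Model of `.all === X.any` (assumed expansion):
    every topic element equals some pattern element."""
    return all(any(t == x for x in pattern) for t in topic)


small, big = {2, 3}, {1, 2, 3}

# `.any === X.all` agrees with "topic contains the pattern".
print(any_eq_all(big, small), big >= small)

# `.all === X.any` agrees with "topic is contained in the pattern".
print(all_eq_any(small, big), small <= big)

# Swapping the roles makes both forms fail for these sets.
print(any_eq_all(small, big), all_eq_any(big, small))
```

Under that assumption the two forms are mirror images of each other, so which one belongs under Subset and which under Superset depends only on whether the pattern is read as the container or as the contained set.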

>@@ -747,15 +722,28 @@
> KeySet KeyBag KeyHash Hash
> Class Subset Enum Role Type
> Subst Grammar Regex
>- Buf Char LazyStr Str
>+ Char LazyCat Str
> Int UInt etc. Num
> Match Capture
>+ Byte Str or Int
>+ Buf Str or Array of Int

Possible omission-typo.

Should the lower quoted section mention Superset as it does Subset,
like the top quoted section already does with this update?

-- Darren Duncan

Luke Palmer

Jan 8, 2007, 7:54:30 AM
to la...@cvs.perl.org, perl6-l...@perl.org
On 1/8/07, la...@cvs.perl.org <la...@cvs.perl.org> wrote:
> +    Set       Subset    subset                  .any === X.all
> +    Set       Superset  superset                .any === X.all

I think these should be reversed. Since function application is
commonly read "of", this:

Set(2,3) ~~ Subset(1,2,3)

is likely to be read as "The set of (2,3) is a subset of (1,2,3)". Similarly:

given $set {
when Subset(1,2,3) {...}
}

is likely to be read "when it's a subset of (1,2,3)".
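
The reading proposed here can be sketched as follows, in Python rather than Perl 6. The sketch assumes the pattern alone decides the match through an `accepts` method, mirroring the `X.accepts($_)` dispatch described in the synopsis. The `Subset` and `Superset` classes and the `smartmatch` helper are hypothetical stand-ins, not an existing API.

```python
class Subset:
    """Hypothetical pattern: accepts a topic that is a subset of its members."""

    def __init__(self, *members):
        self.members = frozenset(members)

    def accepts(self, topic):
        return frozenset(topic) <= self.members


class Superset:
    """Hypothetical pattern: accepts a topic that is a superset of its members."""

    def __init__(self, *members):
        self.members = frozenset(members)

    def accepts(self, topic):
        return frozenset(topic) >= self.members


def smartmatch(topic, pattern):
    # Single dispatch on the pattern only, standing in for `topic ~~ pattern`.
    return pattern.accepts(topic)


# "The set of (2,3) is a subset of (1,2,3)"
print(smartmatch({2, 3}, Subset(1, 2, 3)))

# "The set of (1,2,3,4) is a superset of (1,2,3)"
print(smartmatch({1, 2, 3, 4}, Superset(1, 2, 3)))

# 4 is not among the pattern's members, so the subset match fails.
print(smartmatch({2, 4}, Subset(1, 2, 3)))
```

With this reading the pattern name describes the topic, which is what makes `when Subset(1,2,3)` read naturally as "when it's a subset of (1,2,3)".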

Luke

Larry Wall

Jan 8, 2007, 12:57:24 PM
to perl6-l...@perl.org

Okay, that's two strikes against the type fakery approach, since
we also have confusion with the "subset" declarator. I think maybe
we should go with .contains and .containedby or some other possibly
shorter synonym. Or maybe .contains and .exists need to be unified,
though the fact that I can't think of the other direction just
points out the fact that .exists has a poor valence linguistically
for expressing subsetness. Maybe we should change .exists to .contains.
Hmm...

Larry
