Modified:
doc/trunk/design/syn/S02.pod
Log:
More long dot cleanup.
Modified: doc/trunk/design/syn/S02.pod
==============================================================================
--- doc/trunk/design/syn/S02.pod (original)
+++ doc/trunk/design/syn/S02.pod Fri Apr 7 13:04:37 2006
@@ -12,9 +12,9 @@
Maintainer: Larry Wall <la...@wall.org>
Date: 10 Aug 2004
- Last Modified: 6 Apr 2006
+ Last Modified: 7 Apr 2006
Number: 2
- Version: 20
+ Version: 21
This document summarizes Apocalypse 2, which covers small-scale
lexical items and typological issues. (These Synopses also contain
@@ -72,7 +72,9 @@
and ends with dots and contains whitespace and commentary between the dots.
The pattern for "long dot" is C<< m:p/\.+ \s<ws> \./ >>. (A minor
consequence of this is that the C<< postfix:<...> >> operator should not
-be followed by whitespace.)
+be followed by whitespace. Also, if you put space after the C<..> range
+operator, it should have space before it as well. But you already do it
+that way, right?)
For instance, if you were to add your own C<< infix:<++> >> operator,
then it must have space before it. The normal autoincrementing
@@ -111,6 +113,13 @@
But you'll have to be sure to always put whitespace in front of it, or
it would be interpreted as a postfix method call instead.)
+The long dot form of the C<...> postfix is C<0. ...> rather than
+C<0. ....> because the long dot eats the first dot after the whitespace.
+It does not follow that you can write C<0....> because that would
+take the first three dots under the longest token rule. (The long dot
+does not count as a longer token because the longest-token rule only
+applies to the fixed prefix of any rule with variable components.)
+
=item *
Single-line comments work as in Perl 5, starting with a C<#> character
@@ -381,10 +390,11 @@
Subscripts now consistently dereference the container produced by
whatever was to their left. Whitespace is not allowed between a
variable name and its subscript. However, there is a corresponding
-B<dot> form of each subscript (C<@foo.[1]> and C<%bar.{'a'}>) which
-allows optional whitespace after the dot (except when interpolating).
-Constant string subscripts may be placed in angles, so C<%bar.{'a'}>
-may also be written as C<< %bar<a> >> or C<< %bar.<a> >>.
+B<dot> form of each subscript (C<@foo.[1]> and C<%bar.{'a'}>).
+There is also a "long dot" form which allows optional whitespace
+between dots. (The long dot is not allowed when interpolating). Constant
+string subscripts may be placed in angles, so C<%bar.{'a'}> may also
+be written as C<< %bar<a> >> or C<< %bar.<a> >>.
=item *
@@ -473,9 +483,11 @@
&foo($arg1, $arg2);
Whitespace is not allowed before the parens, but there is a corresponding
-C<.()> operator, which allows you to insert optional whitespace after the dot:
+C<.()> operator, plus a "long dot" form which allows you to insert optional whitespace between dots:
- &foo. ($arg1, $arg2);
+ &foo. .($arg1, $arg2);
+ &foo... #comment
+ .($arg1, $arg2);
=item *
@@ -1337,20 +1349,21 @@
(Unlike in Perl 5, where version numbers didn't autoquote.)
You can also use the :key($value) form to quote the keys of option
-pairs. To align values of option pairs, you may not use the
-dot postfix forms:
+pairs. To align values of option pairs, you may use the
+"long dot" postfix forms:
- :longkey. ($value)
- :shortkey. <string>
- :fookey. { $^a <=> $^b }
+ :longkey. .($value)
+ :shortkey. .<string>
+ :fookey. .{ $^a <=> $^b }
These will be interpreted as
- :longkey(1). ($value)
- :shortkey(1). <string>
- :fookey(1). { $^a <=> $^b }
+ :longkey($value)
+ :shortkey<string>
+ :fookey{ $^a <=> $^b }
-You just have to put spaces inside the parenthesis form to align things.
+But note that C<..> is not a long dot because at least one internal space
+is required to differentiate from the range operator.
=item *
Yes, before anyone else points it out to me, that still doesn't quite
make sense, insofar as the long-dot rule has to take precedence over
postfix ... if there is trailing whitespace. I think the long-dot rule
is built into the parser rather than falling out of the longest-token
rule. And the long-dot rule has to look ahead to see if there is
whitespace following. Seems a lot more benign than the previous forms
of lookahead though. Definitely easier to parse visually, I think.
Larry
> : +It does not follow that you can write C<0....> because that would
> : +take the first three dots under the longest token rule. (The long dot
> : +does not count as a longer token because the longest-token rule only
> : +applies to the fixed prefix of any rule with variable components.)
>
> Yes, before anyone else points it out to me, that still doesn't quite
> make sense, insofar as the long-dot rule has to take precedence over
What's not making sense to me is why it's not:
The long dot form of the C<...> postfix is C<0. ....> rather than
C<0. ...> because the long dot eats the first dot after the whitespace.
(i.e. with the 3 and 4 dots swapped), given that the long dot eats the first
dot: 3 eat 1 leaves 2, whereas 4 eat 1 leaves 3.
Nicholas Clark
It can certainly be argued either way. It kind of depends on what
you think about the first dot of '$x...' with respect to the next
two dots, and whether the long dot rule ends with \. or <before \.>.
I tend to think of the "long dot" as a substitute for the first dot
in any dotted short form, not as a prefix. To me it seems easier
to teach as a "long dot" (which is dotty on both ends) than as a dot
extender (which is dotty at the front and spacey at the back, but just
happens to require that the next thing be a dot.)
Larry
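
For readers following the dot arithmetic, here is a rough sketch of the two readings discussed above, written in Python rather than Perl 6. It is an approximation under stated assumptions, not the parser's actual behaviour: the regex mimics the quoted pattern m:p/\.+ \s<ws> \./, <ws> is assumed to cover whitespace and "#" comments, and the position anchoring implied by :p is ignored. The names LONG_DOT and collapse_long_dot are made up for this illustration.

```python
import re

# Approximation of m:p/\.+ \s<ws> \./ :
# one or more dots, one mandatory whitespace character, then any further
# whitespace or "#" comments (assumed meaning of <ws>), then a closing dot.
LONG_DOT = re.compile(r"\.+\s(?:\s|#[^\n]*)*\.")


def collapse_long_dot(src: str, reading: str) -> str:
    """Rewrite every long dot in src according to one of two readings.

    "substitute": the long dot stands in for the first dot of the short
                  form, so the whole match collapses to a single ".".
    "prefix":     the long dot is a pure prefix that eats one dot, so the
                  whole match is dropped and what follows is the operator.
    """
    if reading == "substitute":
        return LONG_DOT.sub(".", src)
    if reading == "prefix":
        return LONG_DOT.sub("", src)
    raise ValueError("reading must be 'substitute' or 'prefix'")


if __name__ == "__main__":
    # Under the "substitute" reading, three dots after the space suffice.
    print(collapse_long_dot("0. ...", "substitute"))
    # Under the "prefix" reading, four dots are needed to leave "...".
    print(collapse_long_dot("0. ....", "prefix"))
    # The call example from the patch collapses to the short dot form.
    print(collapse_long_dot("&foo. .($arg1, $arg2);", "substitute"))
```

Under the "substitute" reading C<0. ...> comes out as C<0...>, which matches the wording in the patch; under the "prefix" reading the same result needs C<0. ....>, which is the swapped form asked about above. Input with no whitespace between the dots, such as C<0....> or a spaced-out range like C<1 .. 5>, is left alone by this sketch because the pattern requires whitespace followed by a closing dot.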
> before anyone else points it out to me
<hihi>
> I think the long-dot
> rule is built into the parser rather than falling out of the
> longest-token rule.
I think so too, but why then cling to the dot?
s:p5/[\][#][^\]*[#][\]// (does not match \#\ )
The backslash is not (or not always) the proper character for this.
s:p5/([.^\])[#].*?[#]\1/$1/
thing.# #.xFF(1,2,3)
thing.\# #\xFF
(0 .. ^# #^42)
It's a prophet. :)
--
Groet, Ruud