Luke
J> And for symmetry, can we get 0d and \d for decimal, for those cases
J> where you want to be explicit?
in a regex \d is a digit, so that isn't a good idea. it would be better
to require \0d. the others also need a base designator character so
decimals don't need a shortcut. and why would we need 0d123 as a
literal? there are no ambiguities and decimal is the simple and obvious
default. i just don't think we need symmetry in all possible cases.
J> I think \777 should be chr(777). As should \0d777, should you want to
J> document that it's really not octal. (Important mostly the first year
J> after the first release.)
too much history with \777 being octal. i think that should be a compile
time error (die) as it is illegal. but p5 currently just stops parsing
when it sees an out of range char. this is a silent bug IMO. at least a
warning should be generated. but the other side will want support for a
single char value and not requiring leading pad 0's.
perl -e'print "\xQW"' | od -c
0000000 \0 Q W
perl -e'print "\xaQW"' | od -c
0000000 \n Q W
perl -e'print "\777"' | od -c
0000000 307 277
i don't know what is happening there.
so if you have no valid value chars or are out of range (as with \777),
then i would want to know as i made a mistake. leading pad 0's can be
skipped if some legal value is found.
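
A note on the \777 output above, offered as a hedged reading and not as a statement about perl's internals: octal 777 is 511, which is code point U+01FF, and the two bytes that od prints (307 277) match the UTF-8 encoding of that code point. So the result is consistent with \777 being treated as a wide character and not as an out-of-range byte. Python is used here only to check the arithmetic; the helper name is made up for illustration.

```python
def utf8_bytes_in_octal(octal_digits):
    """Return the UTF-8 bytes of the code point written in octal, as octal strings."""
    code_point = int(octal_digits, 8)            # "777" -> 511 (U+01FF)
    encoded = chr(code_point).encode("utf-8")    # encode that code point as UTF-8
    return [format(byte, "o") for byte in encoded]

print(utf8_bytes_in_octal("777"))  # ['307', '277']
```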
uri
It's probably best to make them completely illegal for now. At some future
point we might consider relaxing 0777 to be decimal, but that would be
possible only after the current octal usage pretty much dies out in the
culture, and that probably won't happen any time soon. Though I do truly
believe that the age of octals has passed.
Note that \1 is also illegal now in rules, since backrefs are done with $1.
I guess another question is what we do when we see \012. We probably
need to make that illegal too. It's far too soon to change over
to interpreting that as "{chr(0)}12". We'll have a hard enough
time convincing everyone to write "\o15\o12" instead of "\015\012".
Maybe we can get people to write "\xd\xa" now. But we really need to
put leading zeroes back to radix-neutral, or our great-grandchildren
will curse our lack of foresight.
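
As a rough sketch of the "make them illegal" idea, the check below flags backslash-digit escapes such as \012, \777 and \1 while leaving \o15, \xd and a bare \0 alone. Whether a bare \0 stays legal is an assumption here, and the function name is hypothetical. The sketch also ignores escaped backslashes, which a real lexer would have to handle.

```python
import re

# A backslash followed by 1-9 (and any further digits), or by 0 plus more digits.
# A lone "\0" is deliberately not matched (assumption: bare NUL stays legal).
_BARE_NUMERIC = re.compile(r"\\(?:[1-9][0-9]*|0[0-9]+)")

def bare_numeric_escapes(text):
    """Return every old-style numeric escape found in text."""
    return _BARE_NUMERIC.findall(text)

print(bare_numeric_escapes(r"\015\012"))         # old octal style: both flagged
print(bare_numeric_escapes(r"\o15\o12 \xd\xa"))  # explicit radix: nothing flagged
```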
Larry
In a rule, whitespace is a very good disambiguator.
> it would be better to require \0d.
I think nullbyte-d is rather likely to occur.
> why would we need 0d123 as a literal?
Symmetry.
0x10
0o10
0b10
0d10
Note that it isn't strictly *needed*. But if you take necessity to extremes,
0x, 0o and 0b are redundant too.
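
To make the symmetry argument concrete, here is a minimal sketch of a literal parser in which 0d sits alongside 0x, 0o and 0b. It is only an illustration of the proposal: the function name is invented, and rejecting a leading zero without a radix letter is an assumption taken from the "completely illegal for now" suggestion earlier in the thread.

```python
RADIX = {"x": 16, "o": 8, "b": 2, "d": 10}

def parse_literal(text):
    """Parse 0x/0o/0b/0d literals; plain digits are decimal."""
    if len(text) > 2 and text[0] == "0" and text[1] in RADIX:
        return int(text[2:], RADIX[text[1]])
    if len(text) > 1 and text[0] == "0":
        # Assumption: something like 0777 is rejected instead of being read as octal.
        raise ValueError("leading zero without a radix letter: " + text)
    return int(text, 10)

for literal in ("0x10", "0o10", "0b10", "0d10", "10"):
    print(literal, parse_literal(literal))
```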
Juerd
> And for symmetry, can we get 0d and \d for decimal, for those cases
> where you want to be explicit?
> I think \777 should be chr(777). As should \0d777, should you want to
> document that it's really not octal. (Important mostly the first year
> after the first release.)
I don't think you can assume it'll only be confusing for a year. For
one thing, the "\nnn for octal character code nnn" did not originate
with Perl; it (still) works in C, the UNIX shells, and other
programming languages with a similar heritage.
I'm more than willing to give up compatibility on the "leading zero
means octal in numeric literals" front, because that has never been anything
but a source of confusion, but I'm less sanguine about switching '\nnn'
from octal to decimal. I agree that '\nnn' should *not* be interpreted
as octal anymore, or else nobody will change their habits. So perhaps
either making it completely illegal, or at least providing a warning
while interpreting it as decimal, would be the way to go.
Incidentally, will \o, \x, and the hypothetical \d still work without
curlies for a certain number of digits but require curlies for larger
numbers? I'd rather see consistency there.
-Mark
It won't matter then anyway...Perl 25 code will come straight from our
brainwaves:
[ASCII art: a starship]
Brett :)
Well, we switched to square brackets for those to avoid closure
confusion, and the consistent rule is that it will parse as many
valid digits as possible, and you need the square brackets only if
the following character would be mistaken for a valid digit. So
\x1abcd is a valid Unicode codepoint somewhere in Plane 1. It can
also be written \x[1abcd].
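
The rule just described (take as many valid digits as possible, with square brackets only needed to stop early) can be sketched roughly as follows. This is an illustration of the rule for \x only, not the actual parser; the function name is hypothetical, and a value beyond the Unicode range would raise ValueError in this sketch.

```python
import re

# \x[digits] with explicit brackets, or \x followed by the longest run of hex digits.
_HEX_ESCAPE = re.compile(r"\\x(?:\[([0-9A-Fa-f]+)\]|([0-9A-Fa-f]+))")

def expand_hex_escapes(text):
    """Replace each \\x escape in text with the character it names."""
    def replace(match):
        digits = match.group(1) or match.group(2)
        return chr(int(digits, 16))
    return _HEX_ESCAPE.sub(replace, text)

print(len(expand_hex_escapes(r"\x1abcd")))    # one character, U+1ABCD
print(len(expand_hex_escapes(r"\x[1a]bcd")))  # U+001A followed by "bcd"
```

With the brackets, the digits after the closing bracket are ordinary text; without them, all five hex digits are consumed.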
I don't think \d is gonna work. Maybe \x is short for \0x and that
also gives us \0o, \0d and \0b, plus any other radix we come up with,
assuming we decide it isn't overly ambiguous with bare \0.
Larry
Works for me. So when you really do want a \0 in the middle of a string
followed by a lowercase letter, how do you indicate that? Something
like \0b0b for a NUL followed by a lowercase 'b'? Or maybe, despite
the "no \nnn" rule, we could allow \0 to be followed by any number of
0s, which are ignored but prevent a following letter from being
interpreted as a radix key. After all, zero is zero in any base
(or at least any base it makes sense to use for code points). So
to get a NUL followed by a 'b', you could use:
1. \00b (or \000b, or \000000000000000000b, etc)
2. \0b0b, \0d0b, \0o0b, \0x[0]b, etc
3. (maybe) \0b as long as the character after the b, if any, isn't a 0 or 1
?
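
The three options above can be sketched as one small expander, purely to see how they interact. Everything here is an assumption layered on the proposal: the function name, the lowercase-only radix keys, and the choice to fall back to a bare NUL when no valid digit follows the radix letter (option 3).

```python
RADIX_DIGITS = {"x": "0123456789abcdef", "d": "0123456789",
                "o": "01234567", "b": "01"}

def expand_zero_escapes(text):
    """Expand \\0-style escapes: padded NUL, \\0<radix>digits, or bare NUL."""
    out, i = [], 0
    while i < len(text):
        if not text.startswith("\\0", i):
            out.append(text[i])
            i += 1
            continue
        i += 2                                   # skip the backslash and first zero
        if i < len(text) and text[i] == "0":
            while i < len(text) and text[i] == "0":
                i += 1                           # option 1: extra zeros are padding
            out.append("\0")
            continue
        digits = RADIX_DIGITS.get(text[i]) if i < len(text) else None
        if digits:
            j = i + 1
            bracketed = text.startswith("[", j)
            if bracketed:
                j += 1
            k = j
            while k < len(text) and text[k].lower() in digits:
                k += 1
            if k > j and (not bracketed or text.startswith("]", k)):
                # option 2: radix key plus at least one valid digit
                out.append(chr(int(text[j:k], len(digits))))
                i = k + 1 if bracketed else k
                continue
        out.append("\0")                         # option 3: bare NUL, letter kept as text
    return "".join(out)

for source in (r"\00b", r"\0b0b", r"\0x[0]b", r"\0bz", r"\0d777"):
    print(source, [hex(ord(ch)) for ch in expand_zero_escapes(source)])
```

Note that under these rules \0x0b would be a single character (hex 0b), which is why the bracketed form \0x[0]b is needed for NUL followed by 'b'.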