
perl6-language@perl.org


Miko O'Sullivan

Aug 1, 2002, 6:02:14 PM
to perl6-l...@perl.org
This is a small collection of ideas for the Perl6 language. Think of this
posting as a light and refreshing summer fruit salad, composed of three
ideas to while away the time during this August lull in perl6-language.


--------------------------------------------------------
Give split an option to keep the delimiters in the returned array

I often find that I want to split an expression, but I don't want to get rid
of the delimiters. For example, I've been parsing a lot of SQL lately, and
I find myself needing to split expressions like this:

rank=?

It would be really groovy if that expression could be split with the
delimiters in place, something like this:

@tokens = split _/[?=*-+]/, $sql, keep=>'all';

and get back an array with these values: ('rank', '=', '?')

But that raises a problem: what if the expression is this (note the spaces):

rank = ?

In that case I would want the = and ? but I wouldn't want the spaces. A
slightly different option could keep just stuff in parens:

@tokens = split _/\s*([?=*-+])\s*/, $sql, keep=>'parens';


--------------------------------------------------------
Set preferred boolean string for scope

In Perl5, if you use a boolean expression (e.g. $x==$y) you get back 1 for
true and an empty string for false. That makes sense, of course, but I've
always preferred 1 for true and 0 for false. I generally use exactly
those two values for true and false in my databases, and I find I'm forever
writing things like ($x==$y ? 1 : 0) to tidy up my booleans.

It would be cool if in Perl6 you could indicate the preferred default values
of true and false for a given namespace or scope, something like this:

use BooleanValues TRUE=>1, FALSE=>0;

Perl itself could respect these requests in expressions like $x==$y.
Functions that declare themselves as booleans can return anything they like,
but the results would be translated into the caller's preferred true/false
values. If no such values are indicated for a namespace then whatever the
function returns is returned.

--------------------------------------------------------
Push with []

Our friends over in PHP have a nifty little way of saying "push this onto
the end of the array". You simply assign the value to the array using an
empty index. In Perl6 it could look like this:

@arr[] = $var;

The expression above would be exactly equivalent to

push @arr, $var;

I've always found the first form more intuitive: it feels like I'm assigning
something. It's a paradigm issue... I'm not suggesting that we get rid of
push, just that we create this additional form that allows the programmer to
think of it in a different way.

Uri Guttman

Aug 1, 2002, 6:17:11 PM
to Miko O'Sullivan, perl6-l...@perl.org
>>>>> "MO" == Miko O'Sullivan <mosul...@crtinc.com> writes:

MO> Give split an option to keep the delimiters in the returned array

perl5 can already do that. just wrap the delim part in parens and split
will return them. also by using a lookahead/behind as the regex split
won't strip out that text and it will be returned but attached to the
text next to the delimiter.
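
For reference, a minimal Perl 5 sketch of the two techniques described above. The sample string and variable names are assumptions chosen for illustration, not taken from the thread:

```perl
use strict;
use warnings;

my $list = 'x=1;y=2;z=3';

# Capturing parens in the pattern: each delimiter comes back as its own field.
my @captured = split /(;)/, $list;

# Zero-width lookbehind: split just after each delimiter, so the delimiter
# stays attached to the end of the preceding field.
my @behind = split /(?<=;)/, $list;

# Zero-width lookahead: split just before each delimiter, so the delimiter
# stays attached to the start of the following field.
my @ahead = split /(?=;)/, $list;

print join('|', @captured), "\n";   # x=1|;|y=2|;|z=3
print join('|', @behind),   "\n";   # x=1;|y=2;|z=3
print join('|', @ahead),    "\n";   # x=1|;y=2|;z=3
```

Which form fits depends on whether the delimiters are wanted as separate tokens or as part of their neighbours.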

MO> Set preferred boolean string for scope

MO> In Perl5, if you use a boolean expression (e.g. $x==$y) you get back 1 for
MO> true and an empty string for false. That makes sense, of course, but I've
MO> always preferred 1 for true and 0 for false. I generally use exactly
MO> those two values for true and false in my databases, and I find I'm forever
MO> writing things like ($x==$y ? 1 : 0) to tidy up my booleans.

do these instead:

$bool += 0 ;
($x == $y) + 0

:)

MO> Push with []

MO> @arr[] = $var;

MO> The expression above would be exactly equivalent to

MO> push @arr, $var;

push is just fine with me.

uri

--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
----- Stem and Perl Development, Systems Architecture, Design and Coding ----
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org

Dave Mitchell

Aug 1, 2002, 6:11:42 PM
to Miko O'Sullivan, perl6-l...@perl.org
On Thu, Aug 01, 2002 at 06:02:14PM -0400, Miko O'Sullivan wrote:
> It would be really groovy if that expression could be split with the
> delimiters in place, something like this:
>
> @tokens = split _/[?=*-+]/, $sql, keep=>'all';
>
> and get back an array with these values: ('rank', '=', '?')
>
> But that raises a problem: what if the expression is this (note the spaces):
>
> rank = ?
>
> In that case I would want the = and ? but I wouldn't want the spaces. A
> slightly different option could keep just stuff in parens:
>
> @tokens = split _/\s*([?=*-+])\s*/, $sql, keep=>'parens';

But perl5 already does this:

$ perl -le 'print join "|", split /\s*([?=*-+])\s*/, "rank = ?"'
rank|=||?
$

Dave.

--
You live and learn (although usually you just live).

Damian Conway

Aug 1, 2002, 6:30:05 PM
to Miko O'Sullivan, perl6-l...@perl.org
Miko O'Sullivan suggested:


> Give split an option to keep the delimiters in the returned array

As Dave mentioned, this already happens if you capture within the
split pattern.


> --------------------------------------------------------
> Set preferred boolean string for scope

It's possible that Perl 6 will have built-in functions C<true> and C<false>.
When called without arguments, they will return the standard true and false
values (1 and "") respectively. If that is the case, then to dynamically
change them, you'd just write:

{
temp sub false() {0}
# etc.
}

Then, if the built-ins were all defined to use C<true> and C<false> to
return true and false values, you'd have exactly the control you need.

Though I must say I can't see the real need for this. Especially when
you can prefix any boolean expression with unary + and ensure that
any ""s are converted to 0's anyway.


> --------------------------------------------------------
> Push with []
>
> Our friends over in PHP have a nifty little way of saying "push this onto
> the end of the array". You simply assign the value to the array using an
> empty index. In Perl6 it could look like this:
>
> @arr[] = $var;

I have to admit that I don't find that syntax very intuitive.

Besides, in Perl 5 the same functionality is just:

$arr[@arr] = $var;

In Perl 6, that would be:

@arr[+@arr] = $var;

or:

@arr[@arr.length] = $var;

or maybe just :

@arr[.length] = $var;

(if an array were to be made the topic inside its own accessor brackets).
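
A minimal Perl 5 sketch of the index-assignment form next to push. The array names are illustrative assumptions, and only the Perl 5 spelling is shown, since the Perl 6 forms above were still speculative:

```perl
use strict;
use warnings;

my @by_index = ('a', 'b');
my @by_push  = ('a', 'b');

# Used as a subscript, @by_index evaluates to its element count, which is
# one past the last valid index, so this assignment appends.
$by_index[@by_index] = 'c';

push @by_push, 'c';

print "@by_index\n";   # a b c
print "@by_push\n";    # a b c
```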

Damian


PS: Thanks for the ideas, Miko! :-)

Graham Barr

Aug 1, 2002, 6:32:28 PM
to Miko O'Sullivan, perl6-l...@perl.org
On Thu, Aug 01, 2002 at 06:02:14PM -0400, Miko O'Sullivan wrote:
> This is a small collection of ideas for the Perl6 language. Think of this
> posting as a light and refreshing summer fruit salad, composed of three
> ideas to while away the time during this August lull in perl6-language.
>
>
> --------------------------------------------------------
> Give split an option to keep the delimiters in the returned array
>
> I often find that I want to split an expression, but I don't want to get rid
> of the delimiters. For example, I've been parsing a lot of SQL lately, and
> I find myself needing to split expressions like this:
>
> rank=?
>
> It would be really groovy if that expression could be split with the
> delimiters in place, something like this:
>
> @tokens = split _/[?=*-+]/, $sql, keep=>'all';

Try using

@tokens = split /([?=*-+])/, $sql;

> and get back an array with these values: ('rank', '=', '?')
>
> But that raises a problem: what if the expression is this (note the spaces):
>
> rank = ?
>
> In that case I would want the = and ? but I wouldn't want the spaces. A
> slightly different option could keep just stuff in parens:
>

> @tokens = split /\s*([?=*-+])\s*/, $sql, keep=>'parens';

@tokens = split /\s*([?=*-+])\s*/, $sql;

already does, in perl 5, what you want.

Graham.

Miko O'Sullivan

Aug 1, 2002, 7:04:55 PM
to perl6-l...@perl.org
From: "Dave Mitchell" <da...@fdgroup.com>

> But perl5 already does this:

Dave gets the "First to Point Out the Feature Exists" award. I knew that
out of three ideas I'd be lucky if just one of them was actually a new
feature idea.

I might still say that the parens don't make things quite obvious... what if
I need to use parens for a complex regex but *don't* want the delimiters?
But I'm not sure that it's worth changing if it already exists in some form.
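
For the case raised here, Perl 5's non-capturing group (?:...) allows grouping inside a split pattern without returning the delimiters. A minimal sketch, with a sample string and variable names that are assumptions for illustration:

```perl
use strict;
use warnings;

my $sql = 'rank = ? and name = ?';

# (?: ... ) groups without capturing, so split discards the delimiters.
my @plain = split /\s*(?:=|\band\b)\s*/, $sql;

# Plain ( ... ) captures, so split returns the delimiters as fields too.
my @kept = split /\s*(=|\band\b)\s*/, $sql;

print join('|', @plain), "\n";   # rank|?|name|?
print join('|', @kept),  "\n";   # rank|=|?|and|name|=|?
```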


From: "Damian Conway" <dam...@conway.org>
[talking about the boolean representation thing]
> Though I must say I can't see the real need for this. Especially when
> you can prefix any boolean expression with unary + and ensure that
> any ""s are converted to 0's anyway.

.... what would "true" (the string) be converted to? Here's my point more
explicitly: in a boolean context, there's no need to get any specific string
(0, 1, "yup") as long as it correctly expresses true or false. It's when
you convert a boolean into a string or number that it becomes convenient to
define how they are represented by default. Yes, of course there are already
ways to change a variable from [some representation of false] to 0, but by
giving a slick way to default the string, a lot of ?? :: type stuff can be
done away with.

> {
> temp sub false() {0}
> # etc.
> }

That sounds like a great way to do it. A follow up question, then: would it
be easy enough to accomplish that in a use-type format? I.e., something
like I said earlier:

use StrictBoolean TRUE=>1, FALSE=>0;

or even just let it default:

use StrictBoolean;

> > @arr[] = $var;
> >
> I have to admit that I don't find that syntax very intuitive.
> Besides, in Perl 5 the same functionality is just:
>
> $arr[@arr] = $var;
>
> In Perl 6, that would be:
>
> @arr[+@arr] = $var;
>
> or:
>
> @arr[@arr.length] = $var;
>
> or maybe just :
>
> @arr[.length] = $var;

The issue of what is more intuitive is of course highly subjective, but I
would argue that for several reasons the more concise version is more
intuitive to the population at large:

- Generally, shorter is better as long as it isn't ambiguous (maybe it is
ambiguous, what do you think?)

- It doesn't get bogged down in what the stuff in the braces means

- Those +s and ?s and _s are going to take some getting used to. People
will probably learn this little shortcut faster than they learn the new
casting symbols.

- There's already a huge population of programmers out there who already use
this notation. I frankly admit that I think of PHP as a great idea that
wasn't done quite right. I'd love to see the PHPers of the world migrate to
Perl.


Oh, and sorry about the subject line in the previous email... my
cut-n-pasting was off target.

-Miko

David Wheeler

Aug 1, 2002, 7:09:52 PM
to Mark J. Reed, perl6-l...@perl.org
On Thursday, August 1, 2002, at 04:05 PM, Mark J. Reed wrote:

> Having the subscript operator change the topic is, IMHO, a rather strong
> violation of the principle of least surprise.

I'm inclined to agree. I think I'd much rather not have it change there,
since I'll frequently do stuff like this:

my %hash;
for qw(one two three) {
%hash{$_} = 1; # $_ should *not* == %hash here!
}

Regards,

David

--
David Wheeler AIM: dwTheory
da...@wheeler.net ICQ: 15726394
http://david.wheeler.net/ Yahoo!: dew7e
Jabber: The...@jabber.org

Mark J. Reed

Aug 1, 2002, 7:05:35 PM
to perl6-l...@perl.org
On Fri, Aug 02, 2002 at 08:30:05AM +1000, Damian Conway wrote:
> @arr[@arr.length] = $var;
>
> or maybe just :
>
> @arr[.length] = $var;
>
> (if an array were to be made the topic inside its own accessor brackets).
I know this idea was just thrown in there, but I find that I really dislike
it. I can see many more uses for the current topic than for array-as-topic
in index expressions. I can easily envision instance methods wishing
to use instance variables as [part of] indices, for instance.

Having the subscript operator change the topic is, IMHO, a rather strong
violation of the principle of least surprise.

--
Mark REED | CNN Internet Technology
1 CNN Center Rm SW0831G | mark...@cnn.com
Atlanta, GA 30348 USA | +1 404 827 4754
--
Why isn't there a special name for the tops of your feet?
-- Lily Tomlin

Dave Mitchell

Aug 1, 2002, 8:00:10 PM
to Uri Guttman, Miko O'Sullivan, perl6-l...@perl.org
On Thu, Aug 01, 2002 at 06:17:11PM -0400, Uri Guttman wrote:
> do these instead:
>
> $bool += 0 ;
> ($x == $y) + 0

or even

$x == $y || 0
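
A minimal Perl 5 sketch comparing the forms suggested so far. The variable names and the 'true' sample string are assumptions for illustration:

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);

my $raw     = ($x == $y);           # false: the empty string
my $added   = ($x == $y) + 0;       # numified with + 0
my $or_zero = ($x == $y) || 0;      # falls through to 0 when false
my $ternary = ($x == $y) ? 1 : 0;   # the form the original post wants to avoid

# The forms differ once the value is an arbitrary true scalar:
# "+ 0" numifies the string, while "|| 0" passes it through untouched.
my $word     = 'true';
my $numified = do { no warnings 'numeric'; $word + 0 };
my $passed   = $word || 0;

printf "raw='%s' added='%s' or_zero='%s' ternary='%s'\n",
    $raw, $added, $or_zero, $ternary;
printf "numified='%s' passed='%s'\n", $numified, $passed;
```

So "+ 0" and "|| 0" agree on the result of a comparison, but neither maps an arbitrary true string to 1.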

--
Never do today what you can put off till tomorrow.

Damian Conway

Aug 1, 2002, 9:25:27 PM
to Miko O'Sullivan, perl6-l...@perl.org
Miko O'Sullivan asked:

> .... what would "true" (the string) be converted to?

In a numeric context: 0 (as in Perl 5).


> Here's my point more
> explicitly: in a boolean context, there's no need to get any specific string
> (0, 1, "yup") as long as it correctly expresses true or false. It's when
> you convert a boolean into a string or number that it becomes convenient to
> define how they are represented by default. Yes, of course there are already
> ways to change a variable from [some representation of false] to 0, but by
> giving a slick way to default the string, a lot of ?? :: type stuff can be
> done away with.

Well, given that Perl 6 has an actual BOOL subtype, maybe C<true> and C<false>
could return objects with:

* a Boolean conversion to 1/""
* a string conversion to "true"/"false"
* a numeric conversion to 1/0

> That sounds like a great way to do it. A follow up question, then: would it
> be easy enough to accomplish that in a use-type format? I.e., something
> like I said earlier:
>
> use StrictBoolean TRUE=>1, FALSE=>0;
>
> or even just let it default:
>
> use StrictBoolean;

Yes. In Perl 6 C<use> statements will, by default, be lexical in effect.
The module itself *might* look like this:

module StrictBoolean;

my @truth = ({TRUE=>1, FALSE=>0});

sub import ($class, *%beauty) { push @truth, \%beauty }
sub unimport { pop @truth }

sub true is exported { return @truth[-1]{TRUE} }
sub false is exported { return @truth[-1]{FALSE} }


> The issue of what is more intuitive is of course highly subjective, but I
> would argue that for several reasons the more concise version is more
> intuitive to the population at large:

Quite possibly. This is why I was so subjunctive about that option. And why
I'm happy to leave such decisions to Larry. His track-record on DWIMity is
exceptional. :-)


> - There's already a huge population of programmers out there who already use
> this notation. I frankly admit that I think of PHP as a great idea that
> wasn't done quite right.

I agree. Including that notation! ;-)


Damian

Damian Conway

Aug 1, 2002, 9:32:23 PM
to David Wheeler, Mark J. Reed, perl6-l...@perl.org
>> Having the subscript operator change the topic is, IMHO, a rather strong
>> violation of the principle of least surprise.
>
> I'm inclined to agree. I think I'd much rather not have it change there,
> since I'll frequently do stuff like this:
>
> my %hash;
> for qw(one two three) {
> %hash{$_} = 1; # $_ should *not* == %hash here!
> }
>

Yes, this is one of the reasons we're hesitant to do it. Even though it does
give you lovely slice idioms like:

@public = %hash{ /^(<-[_]>.*)/ };

And, if we *did* go that way, your example would become:

my %hash;
for qw(one two three) -> $key {
%hash{$key} = 1;
}

which doesn't seem a terribly high price to pay.

On the other hand, too many topicalizing contexts actually reduce the
intrinsic value of topics themselves, since the current topic then doesn't
hang around long enough to actually be useful. I suspect that is the clincher
against this idea.

Damian

Christian Renz

Aug 1, 2002, 6:24:22 PM
to perl6-l...@perl.org
perl 5 already does that:

print "'$_' " foreach split /(=)/, "rank=?";
print "\n";
print "'$_' " foreach split /\s*(=)\s*/, "rank = ?";
print "\n";

# Output:
# 'rank' '=' '?'
# 'rank' '=' '?'

Greetings,
Christian

--
cr...@web42.com - http://www.web42.com/crenz/ - http://www.web42.com/

"Faith (...) is the art of holding onto things your reason has once
accepted, in spite of your changing moods." -- C.S. Lewis, Mere Christianity

Miko O'Sullivan

Aug 2, 2002, 8:33:22 AM
to perl6-l...@perl.org
> > - There's already a huge population of programmers out there who already use
> > this notation. I frankly admit that I think of PHP as a great idea that
> > wasn't done quite right.
>
> I agree. Including that notation! ;-)

Touche. Darn it's difficult disagreeing with pithy people. :-)

OK, would that notation ( @arr[] = $var ) be something that could be added
by a module, in the same way that operators and /* */ will be addable? I
don't know exactly what the syntax for adding /* */ will be, but if you can
say to the preprocessor something like s#/*#=comment#g then perhaps you can
also say something like s#\[\s*\]\s*=#binpush#g and then also define binpush
as an operator.

-Miko

Trey Harris

Aug 2, 2002, 8:53:51 AM
to Miko O'Sullivan, perl6-l...@perl.org
In a message dated Fri, 2 Aug 2002, Miko O'Sullivan writes:
> OK, would that notation ( @arr[] = $var ) be something that could be added
> by a module, in the same way that operators and /* */ will be addable?

I don't think we've seen too much about how Larry plans to do
Perl-munging-Perl except that we know it will be much more easily
possible, and it will be based upon the grammars we saw in A5.

That said, you could add this syntax fairly easily in Perl 5 with source
filters (well, as easily as source filters ever are). I'd be highly
surprised if that ability went away in Perl 6.

You've often asked this list, "will doing X in a module be possible?"
Consider the things that Damian's already done with modules in Perl 5. I
think Damian's involvement in Perl 6 if nothing else will ensure that, no
matter what X stands for, the answer will be "yes." :-)

Trey

(With the possible exception of modules that disobey the laws of physics,
but I'm not putting anything past Larry... no strict 'physics' ;)

Larry Wall

Aug 2, 2002, 12:32:37 PM
to Nicholas Clark, Trey Harris, Miko O'Sullivan, perl6-l...@perl.org
On Fri, 2 Aug 2002, Nicholas Clark wrote:
: On Fri, Aug 02, 2002 at 08:53:51AM -0400, Trey Harris wrote:
: > (With the possible exception of modules that disobey the laws of physics,

: > but I'm not putting anything past Larry... no strict 'physics' ;)
:
: Yay!
:
: $ cat infinite_compression.pl
: #!/usr/local/bin/perl6
: use strict; # Hopefully this triggers the p5 to p6 convertor.
: use warnings;
: no strict 'physics';
: use Compress::SnakeOil;
:
: while my $infile (@ARGV) {
: my $outfile = "$infile.inf";
: compress_file (infile => $infile, outfile => $outfile, level => "Infinite");
: die "Problem compressing $infile to $outfile" unless -z $outfile;
: }
: __END__
:
:
: I do hope that works. :-)

Infinite compression works great. Unfortunately, it's lossy.
And done right, it requires infinite time.

Larry

Nicholas Clark

Aug 2, 2002, 10:41:48 AM
to Trey Harris, Miko O'Sullivan, perl6-l...@perl.org
On Fri, Aug 02, 2002 at 08:53:51AM -0400, Trey Harris wrote:
> You've often asked this list, "will doing X in a module be possible?"
> Consider the things that Damian's already done with modules in Perl 5. I
> think Damian's involvement in Perl 6 if nothing else will ensure that, no
> matter what X stands for, the answer will be "yes." :-)
>
> Trey
>
> (With the possible exception of modules that disobey the laws of physics,
> but I'm not putting anything past Larry... no strict 'physics' ;)

Yay!

$ cat infinite_compression.pl
#!/usr/local/bin/perl6
use strict; # Hopefully this triggers the p5 to p6 convertor.
use warnings;
no strict 'physics';
use Compress::SnakeOil;

while my $infile (@ARGV) {
my $outfile = "$infile.inf";
compress_file (infile => $infile, outfile => $outfile, level => "Infinite");
die "Problem compressing $infile to $outfile" unless -z $outfile;
}
__END__


I do hope that works. :-)

Nicholas Clark

Dan Sugalski

Aug 2, 2002, 12:51:58 PM
to Trey Harris, Miko O'Sullivan, perl6-l...@perl.org
At 8:53 AM -0400 8/2/02, Trey Harris wrote:
>(With the possible exception of modules that disobey the laws of physics,
>but I'm not putting anything past Larry... no strict 'physics' ;)

Yeek! Hopefully Larry'll forbear--while he may be able to pull that
one off, I'm afraid I'm not up to the task... :)
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Damian Conway

Aug 2, 2002, 6:21:24 PM
to Miko O'Sullivan, perl6-l...@perl.org
Miko O'Sullivan wrote:

> OK, would that notation ( @arr[] = $var ) be something that could be added
> by a module, in the same way that operators and /* */ will be addable? I
> don't know exactly what the syntax for adding /* */ will be

Something like this:

grammar Perl::With::Ugly::C::Comments is Perl {

rule ws { <Perl::ws> | <ugly_c_comment> }

rule ugly_c_comment {
/\* [ .*? <ugly_c_comment>? ]*? \*/
{ let $0 := " " }
}
}

caller{MY}.parser(Perl::With::Ugly::C::Comments);


> but if you can
> say to the preprocessor something like s#/*#=comment#g then perhaps you can
> also say something like s#\[\s*\]\s*=#binpush#g and then also define binpush
> as an operator.

You could rebuild the lexical parser grammar as above (to allow the lamentable
C<@arr[] = $scalar> syntax), or you could just create a new operator with
something like:

module BinaryPush;

my sub operator:<-- is exported (@array is rw, $scalar) {
push @array, $scalar;
}

# and elsewhere...

use BinaryPush;

@arr <-- $val;

Damian

Chip Salzenberg

Aug 2, 2002, 7:36:09 PM
to Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
According to Damian Conway:

> {
> temp sub false() {0}
> # etc.
> }

I'm a bit concerned about what that would do to subroutines in other
modules called during the block's execution. Perhaps "my sub" instead?

PS: I wonder if the names would be &FALSE and &TRUE to avoid polluting
the non-all-caps namespace ... ?
--
Chip Salzenberg - a.k.a. - <ch...@pobox.com>
"It furthers one to have somewhere to go."

Dave Storrs

Aug 3, 2002, 12:26:11 AM
to Damian Conway, Miko O'Sullivan, perl6-l...@perl.org

On Sat, 3 Aug 2002, Damian Conway wrote:

> > don't know exactly what the syntax for adding /* */ will be
>
> Something like this:
>
> grammar Perl::With::Ugly::C::Comments is Perl {
>
> rule ws { <Perl::ws> | <ugly_c_comment> }
>
> rule ugly_c_comment {
> /\* [ .*? <ugly_c_comment>? ]*? \*/
> { let $0 := " " }
> }
> }
>
> caller{MY}.parser(Perl::With::Ugly::C::Comments);


I'm still having trouble getting my head around the new
grammar-construction rules. Three questions:

1) Am I right that anything inside a "rule" block is considered to be
inside a regex? If not, why didn't you have to write:
rule ugly_c_comment {
/
\/ \* [ .*? <ugly_c_comment>? ]*? \* \/
{ let $0 := " " }
/
}

2) As written, I believe that the ugly_c_comment rule would permit nested
comments (that is, /* /**/ */), but would break if the comments were
improperly nested (e.g., /* /* */). Is that correct?

3) The rule will replace the comment with a single, literal space. Why is
this replacement necessary...isn't it sufficient to simply define it as
whitespace, as was done above?

Dave Storrs

Ken Fox

Aug 3, 2002, 12:53:46 PM
to Dave Storrs, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
Dave Storrs wrote:
> why didn't you have to write:
>
> rule ugly_c_comment {
> /
> \/ \* [ .*? <ugly_c_comment>? ]*? \* \/
> { let $0 := " " }
> /
> }

Think of the curly braces as the regex quotes. If "{" is the quote
then there's nothing special about "/" and it doesn't need to be
escaped. Also, I don't think you want spaces between "/" and "*"
because "/ *" isn't a comment delimiter.

> 2) As written, I believe that the ugly_c_comment rule would permit nested
> comments (that is, /* /**/ */), but would break if the comments were
> improperly nested (e.g., /* /* */). Is that correct?

It wouldn't fail, but it would scan to EOF and then back track.
Basically the inner <ugly_c_comment> succeeds and then the rest
of the file is scanned for <'*/'>. When that fails, the regex
back tracks to the inner <ugly_c_comment>, fails that and then
skips the unbalanced "/*" with .*?. I'd like to add ::: to fail
the entire comment if the inner comment fails, but I'm not sure
how to do it. Does this work?

/\* [ .*? | <ugly_c_comment> ::: ]*? \*/

> 3) The rule will replace the comment with a single, literal space. Why is
> this replacement necessary...isn't it sufficient to simply define it as
> whitespace, as was done above?

Probably. I think it's a hold-over from thinking of parser vs lexer,
but that may not be true depending on how the rest of the grammar
uses white space. IMHO the value bound to the white space production
should be the actual text (the comment in this case).

- Ken

Uri Guttman

Aug 3, 2002, 1:44:28 PM
to Ken Fox, Dave Storrs, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
>>>>> "KF" == Ken Fox <kf...@vulpes.com> writes:

> Dave Storrs wrote:
>> why didn't you have to write:
>>
>> rule ugly_c_comment {
>> /
>> \/ \* [ .*? <ugly_c_comment>? ]*? \* \/
>> { let $0 := " " }
>> /
>> }

> Think of the curly braces as the regex quotes. If "{" is the quote
> then there's nothing special about "/" and it doesn't need to be
> escaped. Also, I don't think you want spaces between "/" and "*"
> because "/ *" isn't a comment delimiter.

but remember that whitespace is ignored as the /x mode is on all the
time.
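
For comparison, a minimal Perl 5 sketch of how the existing /x modifier treats whitespace in a pattern. It only illustrates Perl 5 behaviour, on the assumption that Perl 6 rules behave analogously as described above; the variable names are illustrative:

```perl
use strict;
use warnings;

# Without /x, the space in the pattern is a literal character to match.
my $literal_hits_spaced = ('foo bar' =~ /foo bar/) ? 1 : 0;    # 1
my $literal_hits_joined = ('foobar'  =~ /foo bar/) ? 1 : 0;    # 0

# With /x, unescaped whitespace in the pattern is ignored.
my $x_hits_spaced = ('foo bar' =~ /foo bar/x) ? 1 : 0;         # 0
my $x_hits_joined = ('foobar'  =~ /foo bar/x) ? 1 : 0;         # 1

# Under /x, a space has to be spelled out, for example as \s.
my $x_explicit = ('foo bar' =~ /foo \s bar/x) ? 1 : 0;         # 1

print "$literal_hits_spaced $literal_hits_joined ",
      "$x_hits_spaced $x_hits_joined $x_explicit\n";
```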

Dave Storrs

Aug 3, 2002, 1:44:19 PM
to Ken Fox, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org

On Sat, 3 Aug 2002, Ken Fox wrote:

> Dave Storrs wrote:
> > why didn't you have to write:
> >
> > rule ugly_c_comment {
> > /
> > \/ \* [ .*? <ugly_c_comment>? ]*? \* \/
> > { let $0 := " " }
> > /
> > }
>
> Think of the curly braces as the regex quotes. If "{" is the quote
> then there's nothing special about "/" and it doesn't need to be
> escaped.


Ok, good. Then it *does* work the way I thought. Thanks.


> Also, I don't think you want spaces between "/" and "*"
> because "/ *" isn't a comment delimiter.


True, but as I understand it, literal whitespace in a regex is no
longer significant...so writing "/ *" in a regex is equivalent to writing
"/*" or "/ *" etc. In order to match an actual "/ *", you would need
to write "/\s+*".

Actually, this is one thing that has troubled me about the new
regex rules, and I've mentioned it before. I would still like for there
to be a "reverse /x" switch, that would tell the regex that I want it to
treat whitespace literally...if for no other reason than because it would
reduce line noise in regexen. In most situations you probably wouldn't
want it, but I can think of occasions when you would.


Dave Storrs

Ken Fox

Aug 3, 2002, 2:55:37 PM
to Uri Guttman, Dave Storrs, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
Uri Guttman wrote:
> but remember that whitespace is ignored as the /x mode is on
> all the time.

Whoops, yeah. For some reason I kept literal mode on when
reading the spaces between two literals.

The rules {foo bar} and {foobar} are the same, but some
very low level part of my brain is resisting that. I have
no trouble with {foo | bar} and {foo|bar} though. Is this
a standard issue defect or should I complain to my parents?

- Ken

Stephen Rawls

Aug 3, 2002, 2:05:31 PM
to Dave Storrs, Ken Fox, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
--- Dave Storrs <dst...@dstorrs.com> wrote:
> Actually, this is one thing that has troubled me
> about the new regex rules, and I've mentioned it
> before. I would still like for there to be
> a "reverse /x" switch, that would tell the
> regex that I want it to treat whitespace
> literally

Doesn't the :w option do that?

:w/one two/ translates to /one \s+ two/

cheers,
Stephen Rawls


Miko O'Sullivan

Aug 4, 2002, 12:28:38 PM
to perl6-l...@perl.org
From: "Damian Conway" <dam...@conway.org>

> > .... what would "true" (the string) be converted to?
>
> In a numeric context: 0 (as in Perl 5).


.... which was my point. You wouldn't want to cast any ol' scalar as a
number just to get 1 or 0 representations of TRUE or FALSE... that wouldn't
DWIM.

-Miko

Dave Storrs

Aug 4, 2002, 4:51:52 PM
to Stephen Rawls, Ken Fox, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org

On Sat, 3 Aug 2002, Stephen Rawls wrote:

> --- Dave Storrs <dst...@dstorrs.com> wrote:
> > Actually, this is one thing that has troubled me
> > about the new regex rules, and I've mentioned it
> > before. I would still like for there to be
> > a "reverse /x" switch, that would tell the
> > regex that I want it to treat whitespace
> > literally
>
> Doesn't the :w option do that?
>
> :w/one two/ translates to /one \s+ two/


Not exactly. The regex you showed would match any of these (using
underscores for spaces for clarity):

"one_two"
"one__two"
"one______________________two"


I may be wrong about the need for this...certainly the :w option would be
preferred over "reverse x" in most cases. This could just be a case of
kneejerk "waddya *mean* the computa is gonna figger out what I meant?!
It's a computa, how can it be smart enuf ta figger anyt'ing out?!". In a
year or two, I may regret ever raising this issue.


Dave Storrs

Dave Storrs

Aug 4, 2002, 4:52:48 PM
to Ken Fox, Uri Guttman, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org


Well, _I_ certainly have the same problem. I'm not sure if that makes it
standard issue.


Dave Storrs

Damian Conway

Aug 4, 2002, 7:07:16 PM
to perl6-l...@perl.org
Chip wrote:


>> temp sub false() {0}
>
> I'm a bit concerned about what that would do to subroutines in other
> modules called during the block's execution.

Err...that was the point. They specifically wanted other subroutines that are
called within that scope and that return true or false values to return those
particular true and false values.

But, yes, normally I would strongly prefer lexically scoped subroutines,
so as to minimize the effect of local variations of semantics.


> PS: I wonder if the names would be &FALSE and &TRUE to avoid polluting
> the non-all-caps namespace ... ?

Quite possibly. If, of course, Larry likes the notion at all.

Damian


Chip Salzenberg

Aug 4, 2002, 7:46:24 PM
to Damian Conway, perl6-l...@perl.org
According to Damian Conway:

> Chip wrote:
> >> temp sub false() {0}
> >
> >I'm a bit concerned about what that would do to subroutines in other
> >modules called during the block's execution.
>
> Err...that was the point. They specifically wanted other subroutines that
> are called within that scope and that return true or false values to return
> those particular true and false values.

Gee, I don't think I really want outside code to be able to change the
value of "1 != 0" ... we're not actually going to allow that already via
overloading, are we? (/me braces for impact)

Chip Salzenberg

Aug 4, 2002, 8:01:59 PM
to Julie R. Wheeler, Damian Conway, perl6-l...@perl.org
According to Julie R. Wheeler:
> I think that was just a simple example of how one could effectively
> use temp subroutines.

Well, that's nothing new as a feature, you can already do it in Perl 5:

{
local *Pkg::func = sub { temporary_stuff_here };
# ...
}

I'm more concerned about the implications of modifying the meaning of
base operators on values of base types.
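
For readers who want to see the behaviour Chip is describing, here is a minimal Perl 5 sketch. The package and sub names (Pkg, func, call_func) are made up for illustration. It shows that a "local" override of a package sub is dynamic: any code called while the block is running sees the temporary version, and the original comes back when the block exits.

```perl
use strict;
use warnings;

package Pkg;
sub func      { return 'original' }
sub call_func { return func() }     # stands in for code in some other module

package main;

sub demo_local_override {
    my @seen = (Pkg::call_func());                  # before the block
    {
        no warnings 'redefine';
        local *Pkg::func = sub { return 'temporary' };
        push @seen, Pkg::call_func();               # the callee sees the override
    }
    push @seen, Pkg::call_func();                   # original restored on exit
    return @seen;
}

print join(', ', demo_local_override()), "\n";
```

The override reaches Pkg::call_func even though that sub was written with no knowledge of it, which is the "effect on other modules" being debated in this thread.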

Damian Conway

Aug 4, 2002, 10:04:13 PM
to perl6-l...@perl.org
Dave Storrs wrote:


> I'm still having trouble getting my head around the new
> grammar-construction rules. Three questions:
>
> 1) Am I right that anything inside a "rule" block is considered to be
> inside a regex?

A rule *is* a regex. We're calling them "rules" because (a) as expressions
they're no longer "regular", and (b) we want them to live inside grammars.


> 2) As written, I believe that the ugly_c_comment rule would permit nested
> comments (that is, /* /**/ */), but would break if the comments were
> improperly nested (e.g., /* /* */). Is that correct?

Yes. Though it wouldn't "break" on improper nesting, merely fail.
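
As a rough Perl 5 analogue of the nesting behaviour in this answer (not the Perl 6 rule itself), the sketch below uses a named recursive pattern, which needs Perl 5.10 or later. The sub name is_c_comment is hypothetical, and the pattern assumes that plain comment text may not itself contain "/*"; that assumption is what makes an improperly nested comment fail to match.

```perl
use strict;
use warnings;

# Recursive pattern: a comment may contain complete nested comments.
my $comment = qr{
    (?<comment>
        /\*                        # opening delimiter
        (?:
            (?&comment)            # a complete nested comment
          |
            (?! /\* | \*/ ) .      # or one character that starts neither delimiter
        )*
        \*/                        # closing delimiter
    )
}xs;

# Returns 1 if the whole string is one (possibly nested) comment, else 0.
sub is_c_comment {
    my ($text) = @_;
    return $text =~ /\A $comment \z/x ? 1 : 0;
}

for my $text ('/* plain */', '/* /**/ */', '/* /* */') {
    printf "%-12s %s\n", $text, is_c_comment($text) ? 'matches' : 'fails';
}
```

The third string is the improperly nested case from the question: the match simply fails, it does not raise an error.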


> 3) The rule will replace the comment with a single, literal space. Why is
> this replacement necessary...isn't it sufficient to simply define it as
> whitespace, as was done above?

No. The whitespace might be important in some applications. Better to map the
complete definition of a C comment (i.e., "...and equivalent to a single
whitespace character"). Of course, it might be even better to represent the
lexical element as an object:

rule ugly_c_comment {
    /\* ([ .*? <ugly_c_comment>? ]*?) \*/
    { let $0 := C_Comment->new(text=>$1, means=>" ") }
}

There will presumably be some means like that for hooking into and extending
the parse trees that the Perl parser builds, though I don't yet know exactly
what it will be.

Damian

Damian Conway

Aug 4, 2002, 10:14:20 PM
to perl6-l...@perl.org
>>Err...that was the point. They specifically wanted other subroutines that
>>are called within that scope and that return true or false values to return
>>those particular true and false values.
>
>
> Gee, I don't think I really want outside code to be able to change the
> value of "1 != 0"

I don't think anyone was suggesting that. What was being toyed with was the
fact that Perl built-ins (usually) return 1 for true and "" for false. Miko
wanted to change that, so I demonstrated one possible mechanism for doing so.


> ... we're not actually going to allow that already via
> overloading, are we? (/me braces for impact)

Incoming! Of *course* we are:

my sub operator:!=($x,$y) { return 1 } # Bwah-ha-ha-ha-ha!!!!!

We're not going to *recommend* it, but we're not going to prevent people
who might need to be able to do that from doing it. ;-)

Damian


Julie R. Wheeler

Aug 4, 2002, 7:53:31 PM
to Chip Salzenberg, Damian Conway, perl6-l...@perl.org
On Sunday, August 4, 2002, at 04:46 PM, Chip Salzenberg wrote:

> Gee, I don't think I really want outside code to be able to change the
> value of "1 != 0" ... we're not actually going to allow that already via
> overloading, are we? (/me braces for impact)

Yeah, baby! I think that was just a simple example of how one could
effectively use temp subroutines. I don't doubt that you could think of
lots of other great uses. But like anything this powerful, one must be
aware of the power so that one doesn't cut one's own head off with the
swiss-army-chainsaw.

I like this syntax a *lot*.

David

--
David Wheeler AIM: dwTheory
da...@wheeler.net ICQ: 15726394
http://david.wheeler.net/ Yahoo!: dew7e
Jabber: The...@jabber.org

Ken Fox

Aug 5, 2002, 9:34:33 AM
to Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
Damian Conway wrote:
> temp sub false() {0}

How does perl know when to inline constant subs
and when to leave them for possible run-time
behavior changes?
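
For comparison, this is how Perl 5 handles the same question; it says nothing about what Perl 6 will do. A sub with an empty prototype and a constant body is inlined when the calling code is compiled, so a later run-time override does not reach calls that were already compiled. The sub names below are hypothetical.

```perl
use strict;
use warnings;

sub false_const() { 0 }     # () prototype + constant body: inlined at compile time
sub false_plain   { 0 }     # ordinary sub: looked up each time it is called

sub read_const { return false_const }      # compiled as the literal 0
sub read_plain { return false_plain() }

sub demo_override {
    no warnings 'redefine';
    local *false_const = sub () { 'F' };
    local *false_plain = sub    { 'F' };
    return (read_const(), read_plain());
}

print join(', ', demo_override()), "\n";
```

In Perl 5 the already-inlined call keeps returning 0 while the ordinary call picks up the override, so the two mechanisms do not combine cleanly there.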

- Ken

mosul...@crtinc.com

Aug 5, 2002, 9:59:36 AM
to perl6-l...@perl.org
From: Chip Salzenberg ch...@pobox.com

> Gee, I don't think I really want outside code to be able
> to change the value of "1 != 0" ... we're not actually
> going to allow that already via overloading, are we?


Ugh, we've really lost the meaning of what I was suggesting. Let me
restate.

In Perl6 there are going to be objects called TRUE and FALSE. They are
what will be returned by boolean statements and functions. 1 == 0 returns
a FALSE object.

Now, like all objects, TRUE and FALSE stringify to something. If Perl 6
follows Perl 5, then they will stringify to 1 and an empty string. So far
so good, that seems to do for most people.

However, for me that's always been an inconvenient stringification, and I
suspect there are other people who would prefer a different stringification
too. In my databases, for true/false fields, I use 1 and 0. My
code has a lot of lines like this:

$sth->execute($member ? 1 : 0);

and in Perl6 they will be even uglier

$sth.execute $member ?? 1 :: 0;
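
To make the Perl 5 behaviour concrete, here is a small sketch of what a comparison result looks like as a string and as a number, and of the "? 1 : 0" normalization mentioned above. The helper name bool_to_int is hypothetical, and no database is involved.

```perl
use strict;
use warnings;

# Hypothetical helper: normalize any Perl truth value to 1 or 0.
sub bool_to_int {
    my ($value) = @_;
    return $value ? 1 : 0;
}

my $true  = (1 == 1);
my $false = (1 == 0);

printf "true  as a string: '%s'\n", $true;
printf "false as a string: '%s'\n", $false;
printf "false as a number: %d\n",   $false;
printf "normalized: %d and %d\n", bool_to_int($true), bool_to_int($false);
```

The built-in false value is an empty string that also counts as the number 0, which is why it needs tidying before it goes into a column that expects 0 or 1 as text.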

So here's my simple suggestion (for which Damian theorized a very nice
simple implementation): on a lexical scope basis, be able to set your
preferred stringification of TRUE and FALSE. Then you will be able to save
TRUE and FALSE to your database however you like:

temp sub true {1}
temp sub false {0}

or even

temp sub true {'T'}
temp sub false {'F' is false}

OK, sorry about being verbose. I rest my case.

-Miko



Chip Salzenberg

Aug 5, 2002, 10:58:55 AM
to Damian Conway, perl6-l...@perl.org
According to Damian Conway:
> Chip:

> >... we're not actually going to allow that already via
> >overloading, are we? (/me braces for impact)
>
> Incoming! Of *course* we are:
>
> my sub operator:!=($x,$y) { return 1 } # Bwah-ha-ha-ha-ha!!!!!
>

Just to be clear, that's supposed to have universal effect if defined
in e.g. class SCALAR?

Chip Salzenberg

Aug 5, 2002, 11:07:03 AM
to mosul...@crtinc.com, perl6-l...@perl.org
According to mosul...@crtinc.com:

> temp sub true {'T'}
> temp sub false {'F' is false}

Yes, I got that, and I even kind of like it. It's the idea that this
might work as a "temp sub", rather than a "my sub", that I'm not at
all sanguine about.

Why, it's almost surely bad even for your sample scenario! Imagine
what might happen inside &DBI::execute if the values of boolean ops
change globally. You'll break simple stuff like:

@socket[$hostname eq 'localhost']

Yes, if you somehow manage to fix numification you'll be OK with this
specific case, but I hope the principle is clear: Lexical overrides
good, global overrides bad^Wextremely hazardous.
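
A Perl 5 rendering of the array-index idiom may help show what is at stake. The array contents and sub names are invented for illustration, and the 'T'/'F' variant only simulates what a changed truth value would do; note that Perl 5 writes the element access as $socket[...] rather than @socket[...].

```perl
use strict;
use warnings;

my @socket = ('remote-socket', 'local-socket');   # index 0 = false, 1 = true

# The idiom as it works today: the comparison numifies to 0 or 1.
sub pick_socket {
    my ($hostname) = @_;
    return $socket[ $hostname eq 'localhost' ];
}

# Simulation: the same lookup if truth values were the strings 'T' and 'F'.
sub pick_socket_tf {
    my ($hostname) = @_;
    my $flag = $hostname eq 'localhost' ? 'T' : 'F';
    no warnings 'numeric';          # 'T' and 'F' both numify to 0
    return $socket[$flag];
}

printf "standard truth values: %s\n", pick_socket('localhost');
printf "'T'/'F' truth values:  %s\n", pick_socket_tf('localhost');
```

With string truth values that lack a sensible numification, both outcomes index element 0, so the lookup silently returns the wrong socket.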

Stephen Rawls

Aug 5, 2002, 12:42:22 PM
to Dave Storrs, Ken Fox, Damian Conway, Miko O'Sullivan, perl6-l...@perl.org
>> Doesn't the :w option do that?
>> :w/one two/ translates to /one \s+ two/

>Not exactly. The regex you showed would match any of these (using
>underscores for spaces for clarity):

>"one_two", "one__two", "one______________________two"

Ah, ok. This is from Synopsis 5, this should be what you want:

A leading ' indicates an interpolated literal match (including whitespace):
/ <'match this exactly (whitespace matters)'> /
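
The distinction can be tried out in Perl 5 today. The sketch below is only an analogue and does not use the Perl 6 rule syntax: \s+ plays the part of the flexible whitespace match, and \Q...\E the part of the exact literal. The sub names are hypothetical.

```perl
use strict;
use warnings;

# Flexible: any run of whitespace between the two words.
sub matches_loose {
    my ($text) = @_;
    return $text =~ /one\s+two/ ? 1 : 0;
}

# Exact: the literal string, whitespace included.
sub matches_exact {
    my ($text) = @_;
    my $literal = 'one two';
    return $text =~ /\Q$literal\E/ ? 1 : 0;
}

for my $text ('one two', 'one  two', "one\ttwo") {
    printf "loose=%d exact=%d\n", matches_loose($text), matches_exact($text);
}
```

Only the single-space string satisfies the exact form, while all three satisfy the flexible one.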

cheers,
Stephen Rawls

Damian Conway

Aug 5, 2002, 6:46:31 PM
to perl6-l...@perl.org
Chip Salzenberg asked:

>> my sub operator:!=($x,$y) { return 1 } # Bwah-ha-ha-ha-ha!!!!!
>>
>
>
> Just to be clear, that's supposed to have universal effect if defined
> in e.g. class SCALAR?

No. The C<my> makes it lexical.

AFAIK, there is no universal overloading of operators.
Only lexical and class-based.
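
Perl 5 already has the class-based half of this via the overload pragma, and a short sketch shows why it is not universal. The class name Always::Different is made up for illustration; the overloaded operator applies only when one of its operands is an object of that class.

```perl
use strict;
use warnings;

package Always::Different;            # hypothetical class, for illustration only
use overload
    '!=' => sub { 1 },                # objects of this class always compare unequal
    '==' => sub { 0 };

sub new { my ($class) = @_; return bless {}, $class }

package main;

sub object_differs_from_itself {
    my $object = Always::Different->new;
    return ($object != $object) ? 1 : 0;
}

sub numbers_differ {
    my ($x, $y) = @_;
    return ($x != $y) ? 1 : 0;
}

printf "object != itself: %d\n", object_differs_from_itself();
printf "1 != 1: %d\n", numbers_differ(1, 1);
printf "1 != 0: %d\n", numbers_differ(1, 0);
```

Plain numbers keep their ordinary meaning of != because neither operand belongs to the overloading class.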

Damian


Damian Conway

Aug 5, 2002, 6:59:47 PM
to perl6-l...@perl.org, Larry Wall
Chip observed:

> Yes, I got that, and I even kind of like it. It's the idea that this
> might work as a "temp sub", rather than a "my sub", that I'm not at
> all sanguine about.

See my comment below.

> Why, it's almost surely bad even for your sample scenario! Imagine
> what might happen inside &DBI::execute if the values of boolean ops
> change globally. You'll break simple stuff like:
>
> @socket[$hostname eq 'localhost']

Code like that (which relies on the poorly specified standard values of
truth and falsehood) *deserves* to break. See my argument at the end of
this message.


> Yes, if you somehow manage to fix numification you'll be OK with this
> specific case, but I hope the principle is clear: Lexical overrides
> good, global overrides bad^Wextremely hazardous.

I *thoroughly* agree. I totally agree.

But that won't stop me from telling people how to do bad things when they ask. ;-)


BTW, I would strongly argue that Perl ought to have a *proper* boolean type.
In the same way it has proper numeric and string types. There should be
built-ins C<true> and C<false> that return the canonical true and false
values. These values should have numeric conversions (to 1 and 0) and string
conversions (to C<"true"> and C<"false" but false>). They should be used
by all built-ins (and preferably in user code as well).
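
One way to picture such a type is with Perl 5's overload pragma. This is a sketch under stated assumptions, not a proposal for how Perl 6 would implement it: the class name My::Bool and the helper subs are hypothetical, and the object simply supplies separate boolean, numeric and string conversions.

```perl
use strict;
use warnings;

package My::Bool;                     # hypothetical class name
use overload
    'bool' => sub { ${ $_[0] } },                       # truth value
    '0+'   => sub { ${ $_[0] } ? 1 : 0 },               # numeric conversion
    '""'   => sub { ${ $_[0] } ? 'true' : 'false' },    # string conversion
    fallback => 1;

sub new {
    my ($class, $value) = @_;
    my $flag = $value ? 1 : 0;
    return bless \$flag, $class;
}

package main;

sub true_value  { return My::Bool->new(1) }
sub false_value { return My::Bool->new(0) }

my $false = false_value();
printf "as a string: %s\n", $false;
printf "as a number: %d\n", $false + 0;
printf "in a condition: %s\n", $false ? 'taken' : 'not taken';
```

The false object prints as "false" yet still fails a condition, which a plain Perl 5 string "false" would not do.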


Damian


Joe Gottman

Aug 5, 2002, 7:35:18 PM
to perl6-l...@perl.org

----- Original Message -----
From: "Stephen Rawls" <s.r...@larc.nasa.gov>
>
> Ah, ok. This is from Synopsis 5, this should be what you want:
>
> A leading ' indicates an interpolated literal match (including whitespace):
> / <'match this exactly (whitespace matters)'> /


Is the closing quote necessary? What would happen if I attempted to
create a rule that looked like <'foo bar> ?

Joe Gottman


Chip Salzenberg

Aug 5, 2002, 8:09:51 PM
to Damian Conway, perl6-l...@perl.org, Larry Wall
(about time we had a proper Subject)

According to Damian Conway:
> Chip observed:


> >You'll break simple stuff like:
> > @socket[$hostname eq 'localhost']
>
> Code like that (which relies on the poorly specified standard values of
> truth and falsehood) *deserves* to break.

Gee, I wouldn't call +1/'' "poorly specified". Those values go back
to Perl's prehistory, at least as far back as I can remember. And
their numified values 1/0 are from C and go back, what, 30 years?
I don't think code depending on Perl's earliest design decisions,
echoing design of one of Perl's primary sources, 'deserves to break'.

I can see permitting C<temp sub FALSE> on principle. But its
predictably bad effects aren't the fault of reasonable module authors.

I'd like to point out that C<temp sub FALSE> is likely to be awfully
darned expensive, because it'll disable lots of constant expression
folding and possibly other optimizations as well.

> BTW, I would strongly argue that Perl ought to have a *proper* boolean
> type. In the same way it has proper numeric and string types. There should
> be built-ins C<true> and C<false> that return the canonical true and false
> values. These values should have numeric conversions (to 1 and 0) and
> string conversions (to C<"true"> and C<"false" but false>). They should be
> used by all built-ins (and preferably in user code as well).

Well, I can't really argue with you here. There should be C<true> and
C<false> because it's just cheesy to have to write C<1==1> and C<1==0>.

Peter Scott

Aug 5, 2002, 8:43:41 PM
to Damian Conway, perl6-l...@perl.org
At 08:59 AM 8/6/02 +1000, Damian Conway wrote:
>BTW, I would strongly argue that Perl ought to have a *proper* boolean
>type. In the same way it has proper numeric and string types. There
>should be built-ins C<true> and C<false> that return the canonical
>true and false values. These values should have numeric conversions
>(to 1 and 0) and string conversions (to C<"true"> and C<"false" but
>false>). They should be used
>by all built-ins (and preferably in user code as well).

Hmm. This stringification would be locale-sensitive? I feel an attack
of the vapors coming on...
--
Peter Scott
Pacific Systems Design Technologies

Chip Salzenberg

Aug 5, 2002, 8:10:28 PM
to Damian Conway, perl6-l...@perl.org
According to Damian Conway:

> AFAIK, there is no universal overloading of operators.
> Only lexical and class-based.

OK, that makes me a happy camper.

John Siracusa

Aug 5, 2002, 9:54:18 PM
to Perl 6 Language
On 8/5/02 6:59 PM, Damian Conway wrote:
> BTW, I would strongly argue that Perl ought to have a *proper* boolean type.
> In the same way it has proper numeric and string types. There should be
> built-ins C<true> and C<false> that return the canonical true and false
> values. These values should have numeric conversions (to 1 and 0) and string
> conversions (to C<"true"> and C<"false" but false>). They should be used
> by all built-ins (and preferably in user code as well).

I agree, but I thought that was already planned(?) Or at least the part
about true and false builtins...

-John

Deborah Ariel Pickett

Aug 5, 2002, 10:09:18 PM
to Damian Conway, perl6-l...@perl.org, Larry Wall
Damian wrote:
> BTW, I would strongly argue that Perl ought to have a *proper* boolean type.
> In the same way it has proper numeric and string types. There should be
> built-ins C<true> and C<false> that return the canonical true and false
> values. These values should have numeric conversions (to 1 and 0) and string
> conversions (to C<"true"> and C<"false" but false>). They should be used
> by all built-ins (and preferably in user code as well).

Yes, but . . .

Let me digress: there is a lesson to be learned about boolean
comparisons from C. C has had a long history of "0 is false, nonzero is true"
and representing booleans as int. Perl has, for better or worse,
inherited that.

Some C programmers get antsy about if-conditions and while-conditions
using an int value as a boolean, and start doing silly things like this:

#define FALSE (0)
#define TRUE (1) /* or (!FALSE) */

and then wonder why, when they changed this:

if (intValue)

to this:

if (intValue == TRUE)

everything broke.

C99 tried to fix this by inventing the _Bool type, but it doesn't solve
the above, because this:

#include <stdbool.h>
if (intValue == true)

promotes the _Bool value true into an int before comparing.

This all comes about because C has multiple values that are true, and
very few of them actually are "equal to" true in the sense that
novice programmers think.

Perl has the same problem; in fact, it's worse because Perl has multiple
values for false too.
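
The Perl 5 side of this can be shown directly. In the sketch below the constant TRUE and the two helper subs are hypothetical; the point is only that many values are true in a condition without being equal to any one "true" constant, and that several different values are false.

```perl
use strict;
use warnings;

use constant TRUE => 1;

# True in boolean context?
sub is_truthy {
    my ($value) = @_;
    return $value ? 1 : 0;
}

# Numerically equal to the TRUE constant?
sub equals_true {
    my ($value) = @_;
    no warnings 'numeric';      # 'abc' is not a number; silence that warning here
    return ($value == TRUE) ? 1 : 0;
}

for my $value (1, 2, -1, '0.0', 'abc') {
    printf "%-6s truthy=%d  ==TRUE=%d\n",
        "'$value'", is_truthy($value), equals_true($value);
}

# Perl 5's false values: 0, '0', the empty string and undef.
my @false_values = (0, '0', '', undef);
my $false_count  = grep { !$_ } @false_values;
print "$false_count different false values\n";
```

Every value in the first list passes an "if", but only 1 compares equal to TRUE, which is the same trap as "if (intValue == TRUE)" in C.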

Perhaps a solution is to treat "==" and "!=" specially in boolean
context, so that if either of its arguments is boolean, the other is
evaluated in boolean context too. To me that seems broken and
un-Perl-ish, but it would fix the problem. (Assuming that you can even
tell what type an argument is before assigning context to the other
argument. Not all boolean arguments are as simple as "true" or "false".)

Of course, the best solution is to not write code like that. But the
moment you treat boolean things as a special case of integer things,
people are going to start bending the rules. All the cases we've seen
here, including that comparison-as-an-array-index beauty, are using
booleans for something that they were not created for.

All I'm saying is that we need to cover these cases in Perl so that we
don't repeat the same mistakes as C.

--
Debbie Pickett http://www.csse.monash.edu.au/~debbiep deb...@csse.monash.edu.au
"Who is that girl I see, staring straight back at me? Why is my reflection
someone I don't know? Must I pretend that I'm someone else for all time? When
will my reflection show who I am inside?" - Reflection (pop version), _Mulan_

Dave Storrs

Aug 11, 2002, 5:06:45 PM
to Stephen Rawls, The Perl6 Language List

Ah! Ok, yes, I had missed that. Thanks, this is exactly what I wanted.

Dave
