Proposal for the "purge" command as the opposite of "grep" in the same way
that "unless" is the opposite of "if".
DETAILS
I've lately been doing a lot of greps in which I want to keep all the
elements in an array that do *not* match some rule. For example, suppose
I have a list of members of a club, and I want to remove (i.e. "purge")
from the list everybody for whom the "quit" property is true. With grep
it's done like this:
@members = grep {! $_->{'quit'}} @members;
Obviously that works well enough, but just like "unless" somehow
simplifies the logic by removing that leading !, "purge" can simplify the
array filter:
@members = purge {$_->{'quit'}} @members;
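For comparison, this exact keep/discard pair exists in present-day Ruby's Enumerable as select and reject. A minimal sketch of the club-roster example (the member data is hypothetical):

```ruby
# Hypothetical roster; each member carries a :quit flag.
members = [
  { name: "Ann",  quit: false },
  { name: "Bob",  quit: true  },
  { name: "Cass", quit: false },
]

# grep-with-! style: keep members for whom the block is false...
kept  = members.select { |m| !m[:quit] }
# ...versus the "purge" style: discard members for whom the block is true.
kept2 = members.reject { |m| m[:quit] }

kept.map { |m| m[:name] }   # => ["Ann", "Cass"]
```

The two calls are equivalent; reject simply moves the negation out of the block, which is the whole point of the proposal.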
FWIW, I came up with "purge" because my first inclination was to spell
"grep" backwards: "perg". :-)
-miko
Miko O'Sullivan
Programmer Analyst
Rescue Mission of Roanoke
I like it.
But reading it reminded me of another common thing I do
with grep: partitioning a list into equivalence classes.
a simple case:
@pass = grep {$_->ok} @candidates;
@fail = grep {! $_->ok} @candidates;
This could perhaps be expressed as:
(@pass, @fail) = unzip { $_->ok } @candidates;
A more general mechanism might be:
%results = partition
    { $_->pass ? "pass" : $_->fail ? "fail" : "unknown" }
    @candidates;
print "pass: @{%results{pass}}";
print "fail: @{%results{fail}}";
print "unknown: @{%results{unknown}}";
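Both shapes proposed here have direct analogues in present-day Ruby's Enumerable, which may help pin down the intended semantics: partition does the two-way pass/fail unzip, and group_by builds the keyed hash of buckets. A sketch with hypothetical candidates (a nil score standing in for "unknown"):

```ruby
# Hypothetical candidates: a score of nil means "unknown".
Candidate  = Struct.new(:name, :score)
candidates = [
  Candidate.new("a", 90),
  Candidate.new("b", 40),
  Candidate.new("c", nil),
  Candidate.new("d", 75),
]

# Two-way split in one pass: truthy bucket first, falsy bucket second.
pass, fail = candidates.partition { |c| c.score && c.score >= 60 }

# N-way split keyed by whatever the block returns.
results = candidates.group_by do |c|
  if c.score.nil?      then "unknown"
  elsif c.score >= 60  then "pass"
  else                      "fail"
  end
end

results["pass"].map(&:name)   # => ["a", "d"]
```

Note the ordering partition settles on: the truthy bucket comes first, matching the @pass, @fail order written above.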
Dave.
While "purge" is cute, it certainly is not obvious what it does. Of
course neither is "grep" unless you are an aging unix guru...
How about something which is at least obvious to someone who knows what
grep is, such as "vgrep" or "grep:v"?
Or maybe that's not any better than "grep !(...)".
~ John Williams
For reference, Ruby uses .select and .reject.
--
3rd Law of Computing:
Anything that can go wr
fortune: Segmentation violation -- Core dumped
> On Wed, 4 Dec 2002, Miko O'Sullivan wrote:
>>
>> FWIW, I came up with "purge" because my first inclination was to spell
>> "grep" backwards: "perg". :-)
>
> While "purge" is cute, it certainly is not obvious what it does. Of
> course neither is "grep" unless you are an aging unix guru...
The idea certainly has merit, though. It _is_ a quite common operation.
What about "divvy" (or are we already using that for something else?)
my(@a,@b) = divvy { ... } @c;
Other possibilities from the ol' thesaurus: C<allot>, C<deal>, C<dole>,
C<dispense>.
<thinking aloud...>
Note that this does not generalize for cases > 2. If you want to split
things into, say, three different lists, or five, you have to use a
'given', and it gets less pleasant. Perhaps a C<divvy> can be a
derivation of C<given> or C<for> by "dividing the streams", either like
this:
my(@a,@b,@c,@d) = divvy {
    /foo/ ::
    /bar/ ::
    /zap/ ::
} @source;
or this (?):
divvy( @source; /foo/ :: /bar/ :: /zap/ ) -> @a, @b, @c, @d;
where C<::> is whatever delimiter we deem appropriate, and an empty
test is taken as the "otherwise" case.
Just pondering. Seems like a useful variation on the whole C<given>
vs. C<grep> vs. C<for> theme, though.
MikeL
--
Adam Lopresto (ad...@cec.wustl.edu)
http://cec.wustl.edu/~adam/
I love apathy with a passion.
--Jamin Gray
> While "purge" is cute, it certainly is not obvious what it does. Of
> course neither is "grep" unless you are an aging unix guru...
>
> How about something which is at least obvious to someone who knows what
> grep is, such as "vgrep" or "grep:v"?
How about my original inclination: "perg"? It just screams out "the
opposite of grep".
So it greps a list in reverse order?
-R (who does not see any benefit of 'perg' over grep { ! code } )
> -R (who does not see any benefit of 'perg' over grep { ! code } )
My problem with grep { ! code } is the same problem I have with if (!
expression): I've never developed a real trust in operator precedence.
Even looking at your pseudocode example, I itched to "fix" it with grep {!
(code) }.
This may be a weakness on my part, but I like computers to address my
weaknesses: I certainly spend enough time addressing theirs.
@$#@%*. Trying to do too many %#@%@ things at once. I meant 'divvy'
instead of 'seperate', not 'purge', obviously (duh). I like Angel's
general theorizing, but maybe we base it on C<for> instead of C<given>?
Any such solution must use := rather than =. I'd go as far as to say
that divvy should be illegal in a list context.
Note that if the closure is expected to return a small integer saying
which array to divvy to, then boolean operators fall out naturally
because they produce 0 and 1.
Larry
I'm not sure I understand that: we're assigning here, not binding (aren't
we?).
> Note that if the closure is expected to return a small integer saying
> which array to divvy to, then boolean operators fall out naturally
> because they produce 0 and 1.
Only if we apply a bit of magic (2 is a true value). The rule might be:
If the context is a list of arrays, then the coderef is evaluated in
integer context: to map each input value to an integer, which selects
which array to append the input-value onto.
If the size of the context is "list of 2 arrays", then the coderef is
evaluated in Boolean context, and the index determined as
c< $result ?? 1 :: 0 >.
If the context is a single array, then it is assumed to be an
array-of-arrays: and the coderef is evaluated in integer-context.
If the context is a hash, then the coderef is evaluated in scalar
context, and the result used as a hash key: the value is pushed
onto the array, in the hash, identified by the key.
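The boolean-to-index rule above can be sketched in Ruby (a hypothetical divvy; since Ruby cannot inspect the arity of an assignment's left-hand side, the bin count is passed explicitly):

```ruby
# Sketch of the proposed dispatch: a boolean result selects bin 1 (true)
# or bin 0 (false), per the "$result ?? 1 :: 0" rule; an integer result
# indexes a bin directly.
def divvy(items, bins:, &classifier)
  parts = Array.new(bins) { [] }
  items.each do |item|
    r = classifier.call(item)
    index = case r
            when true       then 1
            when false, nil then 0
            else                 r
            end
    parts[index] << item
  end
  parts
end

# Note the ordering this implies: the *false* bin comes first.
odds, evens = divvy(1..6, bins: 2) { |n| n.even? }
# odds => [1, 3, 5], evens => [2, 4, 6]

# Integer classifier, three bins:
low, mid, high = divvy([5, 25, 15], bins: 3) { |n| n < 10 ? 0 : n > 20 ? 2 : 1 }
```

The false-bin-first ordering falls straight out of treating false as 0, which is the counterintuitive (@false, @true) binding discussed later in the thread.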
One more thing: how do I tell the assignment not to clear the
LHS at the start of the operation? Can I say:
my (@a,@b) = divvy { ... } @a1;
(@a,@b) push= divvy { ... } @a2;
Dave.
I like "purge", although "except", "exclude", and "omit" all have their
charms.
For the partition function, I like "divvy", "carve", "segment" (in that order)
and almost anything other than "separate", which IIRC is one of the most
misspelled words in English.
===============================================
Mark Leighton Fisher fis...@tce.com
Thomson multimedia, Inc. Indianapolis IN
"we have tamed lightning and used it to teach sand to think"
> Only if we apply a bit of magic (2 is a true value). The rule might be:
How about if we just have two different methods: one for boolean and one
for multiple divvies:
my(@true, @false) := @array.cull{/some test/};
my (@a, @b, @c) := @array.divvy{some code}
If you want good ol' Unix flavor, call it "vrep". Compare the ed(1) /
ex(1) / vi(1) commands (where 're' stands for regular expression, of
course) :
:g/re/p
:v/re/p
What would be an idiomatic Perl 6 implementation of such a vrep function ?
I think you are correct, but only because of the psychology of
affordances: you wrote "@true, @false", not "@false, @true".
I use the same mental ordering, so I expect it would be a
common bug.
I think that C<cull> would be an abysmal name: that implies
"keep the false ones". I'm not sure that there is a synonym
for "boolean partition" though. Perhaps we need some help
from a linguist! ;)
Dave.
> If you want good ol' Unix flavor, call it "vrep". Compare the ed(1) /
> ex(1) / vi(1) commands (where 're' stands for regular expression, of
> course) :
> :g/re/p
> :v/re/p
I like it. Fits in with our Un*x heritage, and doesn't have any existing
meaning that implies things it doesn't do.
-miko
What's wrong with split()?
split { f($_) }, $iterator -or- @array.split { f($_) }
vs.
split /\Q$delim\E/, $string -or- $string.split( /\Q$delim\E/ )
BTW, since it's possible to say:
my (@even, @odd) = split { $_ % 2 }, 0 .. Inf;
I presume that split will be smart enough to be usefully lazy. So
laziness is probably a contagious property. (If the input is lazy, the
output probably will be, too.)
But what happens with side-effects, or with pathologically ordered
accesses?
That is, iterators tend to get wrapped with a lazy array, which caches
the accesses.
So if the discriminator function caches values of its own, what
happens?
E.g.,
# Side-effects
my (@even, @odd)
    = split { is_prime($_) and $last_prime = $_; $_ % 2 }, 0..Inf;
The value of $last_prime is .. ?
# Pathological access:
my (@even, @odd) = ... as above ...
print $#odd;
Does @even (which is going to be cached by the lazy array) just swamp
memory, or what?
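The caching question can be made concrete. Below is a sketch of a lazy two-way splitter in Ruby (a hypothetical lazy_split; the two streams share one source cursor, so this is a one-shot illustration rather than a robust implementation). To hand out elements of one stream, every element belonging to the other stream must be buffered along the way, which is exactly where an unconsumed @even could swamp memory:

```ruby
def lazy_split(source, &pred)
  buf  = { true => [], false => [] }   # cached elements for each stream
  enum = source.to_enum
  stream = lambda do |want|
    Enumerator.new do |y|
      loop do                          # Kernel#loop rescues StopIteration
        if buf[want].empty?
          v = enum.next
          buf[!!pred.call(v)] << v     # may buffer for the *other* stream
        else
          y << buf[want].shift
        end
      end
    end
  end
  [stream.call(true), stream.call(false)]
end

evens, odds = lazy_split(0..Float::INFINITY, &:even?)
first_evens = evens.first(3)   # => [0, 2, 4]
# Along the way, 1 and 3 were buffered for the odd stream:
first_odds  = odds.first(2)    # => [1, 3]
```

Asking only for odds would force every even seen so far into the buffer, and over an infinite source that buffer grows without bound unless the consumer eventually drains it.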
=Austin
: I like "purge", although "except", "exclude", and "omit" all have their
: charms.
: For partition function, I like "divvy", "carve", "segment" (in that order)
: and almost anything other than "separate", which IIRC is one of the most
: misspelled words in English.
May I suggest 'winnow'?
--
Aaron
And I would strongly suggest that C<divvy> isn't the right name for it,
since, apart from being an ugly slang word, "divvy" implies dividing up
equally. The built-in would actually be doing classification of the
elements of the list, so it ought to be called C<classify>.
I would expect that C<classify> would return a list of array references.
So Larry is (of course! ;-) entirely correct in pointing out that it
would require the use of := (not =). As for an error when = is used,
perhaps that ought to be handled by a general "Second and subsequent
lvalue arrays will never be assigned to" error.
The selector block/closure would, naturally, be called in C<int>
context each time, so (again, as Larry pointed out) a boolean
function would naturally classify into two arrays. Though it
might at first be a little counterintuitive to have to write:
(@false, @true) := classify { $^x > 10 } @nums;
I think it's a small price to pay to avoid tiresome special cases.
Especially since you then get your purge/vrep/antigrep for free:
(@members) := classify {$_->{'quit'}} @members;
;-)
Damian
> The selector block/closure would, naturally, be called in C<int> context
> each time, so (again, as Larry pointed out) a boolean function would
> naturally classify into two arrays. Though it might at first be a little
> counterintuitive to have to write:
OK, but I would assert that the false/true classification is going to be
the more common case, not "classify by index position", and that
furthermore there will be a lot of situations where the false/true value
may be any number, not just 1 or 0.
For example, suppose I want to separate a list of people into people who
have never donated money and those who have. Assuming that each person
object has a donations property which is an array reference, I would want
to classify them in this manner:
(@nevers, @donors) := classify {$_->{donations}} @people;
According to the C<int> model, that would give me people who have donated
zero times, and people who have donated once, and the people who have
donated more than once would be lost. Now, of course you can force the
donations into a boolean context, but, frankly, I think if we force
people to always remember to force boolean context, just to preserve the
(IMHO) unusual case of classifying by integer, we're, on balance, making
more work for the world.
Ergo, I suggest we simply have a separate command for the false/true
situation:
(@nevers, @donors) := falsetrue {$_->{donations}} @people;
(Yes, "falsetrue" is a stupid name, please replace with something better.)
Then turn donations into a boolean.
(@donors, @nevers) := classify {!$_->{donations}} @people;
I don't think there is the need to bloat the language with every special
case we can think of.
Graham.
I worry that C<classify> sounds too much like something class-related,
and would confuse people. What about C<arrange> or something? Decent
thesaurus entries for C<separate> include:
assign, classify, comb, compartmentalize, discriminate, distribute,
group, order, segregate, sift, winnow, amputate, cut, dismember,
excise, lop, disunite, divorce, estrange, part, wean, detach,
disconnect, disengage, dissociate, extract, isolate, part, steal, take,
uncouple, withdraw
Some of those might be appropriate (or just amusing). :-)
> The selector block/closure would, naturally, be called in C<int>
> context each time, so (again, as Larry pointed out) a boolean
> function would naturally classify into two arrays. Though it
How would you do something like:
(@foo,@bar,@zap) := classify { /foo/ ;; /bar/ ;; /zap/ } @source;
I was more hoping for a C<for> or C<given> derivative that would
provide a series of 'stream'-like tests, not just one test with N
answers. Something that was a shorthand for the obvious but somewhat
tedious C<given> counterpart. (If @source had an entry 'foobar', we
could debate whether that should go in one destination stream or two.)
> Especially since you then get your purge/vrep/antigrep for free:
I don't think we need a separate func either, but if we're gonna have a
purge/vrep/antigrep, can someone _please_ think of a better name for
it? "purge" clearly needs an inverse called "binge", "vrep" sounds
like, well, UNIX, and "antigrep" sounds like something I put in my car
to avoid it grepping when I start it on cold mornings.
Even just "ngrep" sounds better to me. :-|
MikeL
I still like partition (or simply C<part>). Segregate (C<seg>)
might also work.
I notice everyone still wants Int context for eval of the block:
please don't forget about hashes. Is there such a thing as
'hashkey context'?
Perl6 is much better than Perl5 for naming parameters. Could
we make the following work?
( low  => @under,
  mid  => @in_range,
  high => @over )
    = partition @input -> $v {
          $v < 10 ?? "low" :: $v > 20 ?? "high" :: "mid";
      };
Also, can I return superpositions (sorry, junctions), to provide
multiple classifications? Or would I return an array for that?
Dave.
> I worry that C<classify> sounds too much like something class-related,
> and would confuse people. What about C<arrange> or something? Decent
> thesaurus entries for C<separate> include:
>
> assign, classify, comb, compartmentalize, discriminate, distribute,
> group, order, segregate, sift, winnow, amputate, cut, dismember, excise,
> lop, disunite, divorce, estrange, part, wean, detach, disconnect,
> disengage, dissociate, extract, isolate, part, steal, take, uncouple,
> withdraw
designate?
-- Tim
'Classify' also seems wrong if some items are
thrown away. I like 'part':
(@foo,@bar) := part { ... } @source;
Headed off in another direction, having a sub
distribute its results like this reminds me of:
... -> ...
Can arrays on the rhs of a -> ever mean
something useful?
--
ralph
Or, to follow the spirit rather than the letter of Unix, how 'bout "ere"
for "Elide REgex" or "tang" for "Tog's A Negated Grep"?
/s
Gah. s/Tog/Tang/.
/s
Wouldn't that mean we had to rename grep to 'gnat'? ("Gnat's Not A Tang",
presumably, never mind rot13 and reversal...)
--
Aaron Crane * GBdirect Ltd.
http://training.gbdirect.co.uk/courses/perl/
>>I worry that C<classify> sounds too much like
>>something class-related
>
> 'Classify' also seems wrong if some items are
> thrown away. I like 'part':
>
> (@foo,@bar) := part { ... } @source;
ralph and I don't often agree, but I certainly do in this case.
I like C<part> very much as a name for this built-in. Must be
the vaguely biblical association <image of Charlton Heston
with his staff raised high above an array> ;-)
> Headed off in another direction, having a sub
> distribute its results like this reminds me of:
>
> ... -> ...
>
> Can arrays on the rhs of a -> ever mean
> something useful?
Sure. It means that the key of the pair is an array reference.
Damian
> I notice everyone still wants Int context for eval of the block:
> Please don't forget about hashes. Is there such a thing as
> 'hashkey context'?
I doubt it. Unless you count Str context.
> Perl6 is much better than Perl5 for naming parameters. Could
> we make the following work?
>
>
> ( low=>@under,
> mid=>@in_range,
> high=>@over )
> = partition @input -> $v {
> $v < 10 ?? "low" :: $v > 20 ?? "high" :: "mid";
> };
I very much doubt it. I think at that point you really want:
for @input -> $v {
push ($v < 10 ?? @under :: $v > 20 ?? @over :: @in_range), $v;
}
> Also, can I return superpositions (sorry, junctions), to provide
> multiple classifications? Or would I return an array for that?
A (dis)junction ought to work there.
Damian
> How would you do something like:
>
> (@foo,@bar,@zap) := classify { /foo/ ;; /bar/ ;; /zap/ } @source;
Since I don't understand what that's supposed to do, I probably *wouldn't*
do something like it. What effect are you trying to achieve?
Damian
That sounds horribly scary...
--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)
"If you want to propagate an outrageously evil idea, your conclusion
must be brazenly clear, but your proof unintelligible."
--Ayn Rand, explaining how today's philosophies came to be
Sorry. A shorthand for:
for @source {
    given {
        when /foo/ { push @foo, $_ }
        when /bar/ { push @bar, $_ }
        when /zap/ { push @zap, $_ }
    }
}
... that "classifies" (or "parts") @source according to the results of a
series of tests, not just one.
MikeL
>>> (@foo,@bar,@zap) := classify { /foo/ ;; /bar/ ;; /zap/ } @source;
> A shorthand ... that "classifies" (or "parts") @source according to
> the results of a series of tests, not just one.
You mean, like:
(@foo,@bar,@zap) := part { when /foo/ {0}; when /bar/ {1}; when /zap/ {2} } @source;
???
And there's always:
push (/foo/ && @foo || /bar/ && @bar || /zap/ && @zap), $_ for @source;
But perhaps there would also be a hashed form, in which each key is a test
(i.e. a rule or closure) and each value an index:
(@foo,@bar,@zap) := part { /foo/ => 0, /bar/ => 1, /zap/ => 2 }, @source;
or even an arrayed form, where the corresponding index is implicit:
(@foo,@bar,@zap) := part [/foo/, /bar/, /zap/], @source;
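The arrayed form is easy to prototype. A Ruby sketch (a hypothetical part; it assumes first-match-wins, which incidentally answers the earlier 'foobar' question by sending such an entry to its first matching bucket, and it drops elements that match nothing):

```ruby
# Each element lands in the bin of the first pattern it matches;
# elements matching no pattern are discarded (the "purge for free" case).
def part(patterns, items)
  parts = Array.new(patterns.size) { [] }
  items.each do |item|
    index = patterns.index { |pat| pat.match?(item) }
    parts[index] << item if index
  end
  parts
end

foo, bar, zap = part([/foo/, /bar/, /zap/],
                     %w[foobar bizbar zapped quux food])
# foo => ["foobar", "food"]   ("foobar" matched /foo/ first)
# bar => ["bizbar"]
# zap => ["zapped"]           ("quux" was dropped)
```

First-match-wins is only one possible semantics; a variant could push 'foobar' into every matching bucket instead.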
Damian
How about just
(@foo,@bar,@zap) := classify [ rx/foo/, rx/bar/, rx/zap/ ] @source;
and implement classify as a normal sub? Why does everything
have to be built into the first version of Perl 6?
Is there any reason classify can't be a normal sub? e.g. can
a sub return ( [], [], [] ) and have that bound to 3 array
variables? What about return @AoA when @AoA = ( [], [], [] )?
- Ken
That's kinda nifty. But admittedly, it's not to-die-for necessary, if
I'm the only one fond of it.
Ken Fox wrote:
> and implement classify as a normal sub? Why does everything
> have to be built into the first version of Perl 6?
Yeah, I agree! Oh, except when it's things _I'm_ asking for. _Those_
are always 100% necessary. :-/
(We're basically asking for everything under the sun, but I think we all
know that < 10% of it will actually get in, which is a Good Thing. :-)
But sometimes the brainstorming shakes loose something more broadly interesting.)
MikeL
P.S. As for judging the value of a proposal, I personally try to ask
the following questions:
1) Is it a simplification of a universally common but otherwise
long/tedious algorithm?
2) Is there only One Way To Do It (Correctly)?
3) Is there a name for the operation so obvious that you can, after
being first introduced to it, easily remember what it does? (like
"reverse", "split", "while", etc.)
Not that I always take my own advice. :-) Other people might have
different informal criteria. (For future teaching purposes, I'd love to
hear what they are.)
I think this makes a nice specialization of the hash approach. However, I
believe it will become cumbersome with anything other than trivial
expressions. The hash approach, in that case, would be clearer.
Tanton
> How about just
>
> (@foo,@bar,@zap) := classify [ rx/foo/, rx/bar/, rx/zap/ ] @source;
>
> and implement classify as a normal sub?
We could certainly do that. But let's call it C<part>.
Et voilà:
sub part ($classifier, *@list) {
    my &classify := convert_to_sub($classifier);
    my @parts;
    for @list -> $nextval {
        my $index = try{ classify($nextval) } // next;
        push @parts[$index], $nextval;
    }
    return @parts;
}
sub convert_to_sub ($classifier is topic) is cached {
    when Code  { return $classifier }
    when Array {
        my @classifiers = map {convert_to_sub($_)} @$classifier;
        return sub ($nextval) {
            for @classifiers.kv -> $index, &test {
                return $index if test($nextval);
            }
            return;
        }
    }
    when Hash {
        my %classifiers = map { convert_to_sub(.key) => .value } %$classifier;
        return sub ($nextval) {
            my @indices = map { defined .key()($nextval) ?? .value :: () } %classifiers;
            return @indices ?? any(@indices) :: undef;
        }
    }
    default { croak "Invalid classifier (must be closure, array, or hash)" }
}
But then the thousands of people who are apparently clamouring for this
functionality and who would have no hope of getting the above correct,
would have to pull in some module every time they wanted to partition an array.
> Why does everything have to be built into the first version of Perl 6?
Everything doesn't. Everything shouldn't be. Just the really common,
important stuff.
I have to confess though, there are *many* times I've wished for this particular
functionality as a built-in. Which is why I'm spending time on it now.
Damian
> Et voilà:
Or, for those who prefer their code sanely formatted:
sub part ($classifier, *@list) {
    my &classify := convert_to_sub($classifier);
    my @parts;
    for @list -> $nextval {
        my $index = try{ classify($nextval) } // next;
        push @parts[$index], $nextval;
    }
    return @parts;
}

sub convert_to_sub ($classifier is topic) is cached {
    when Code  { return $classifier }
    when Array {
        my @classifiers = map {convert_to_sub($_)} @$classifier;
        return sub ($nextval) {
            for @classifiers.kv -> $index, &test {
                return $index if test($nextval);
            }
            return;
        }
    }
    when Hash {
        my %classifiers = map { convert_to_sub(.key) => .value } %$classifier;
        return sub ($nextval) {
            my @indices = map { defined .key()($nextval) ?? .value :: () } %classifiers;
            return @indices ?? any(@indices) :: undef;
        }
    }
    default { croak "Invalid classifier (must be closure, array, or hash)" }
}
Damian
This may be a useful distinction: stuff which is built into the
language versus stuff which is shipped in the default libraries of the
language.
A categorise method would be just grand, and I think it should be
shipped with the default Perl 6 array classes, but Perl 6 The Core
Language wouldn't need to know about that particular method if it
didn't want to.
--
A Law of Computer Programming:
Make it possible for programmers to write in English
and you will find that programmers cannot write in English.
Presumably, to avoid run time errors, that
would need to be something like:
push (/foo/ && @foo ||
/bar/ && @bar ||
/zap/ && @zap ||
@void), $_ for @source;
> But perhaps...
>
> ( @foo, @bar, @zap) :=
> part { /foo/ => 0, /bar/ => 1, /zap/ => 2 }, @source;
Why not:
part ( @source, /foo/ => @foo, /bar/ => @bar, /zap/ => @zap );
or maybe:
@source -> /foo/ => @foo, /bar/ => @bar, /zap/ => @zap;
To end up with @foo entries being *aliases* of
entries in @source. Btw, could these be valid,
and if so, what might they do:
@source -> $foo, $bar;
@source -> @foo, @bar;
--
ralph
> A categorise method would be just grand, and I think it should be
> shipped with the default Perl 6 array classes, but Perl 6 The Core
> Language wouldn't need to know about that particular method if it
> didn't want to.
Err. Since arrays are core to Perl 6, how could their methods not be?
Of course, as long as you can call C<part> without explicitly loading
a module, it's merely a philosophical distinction as to whether
C<part> is core or not.
Damian
> Presumably, to avoid run time errors, that
> would need to be something like:
>
> push (/foo/ && @foo ||
> /bar/ && @bar ||
> /zap/ && @zap ||
> @void), $_ for @source;
True.
> Why not:
>
> part ( @source, /foo/ => @foo, /bar/ => @bar, /zap/ => @zap );
Because C<map>, C<grep>, C<reduce> etc all take the list they're
operating on as the last argument. And they do that for a very good reason:
so it's easy to build up more complex right-to-left pipelines, like:
(@foo, @bar) :=
part [/foo/, /bar/],
sort { $^b <=> $^a }
grep { $_ > 0 }
@data;
> @source -> /foo/ => @foo, /bar/ => @bar, /zap/ => @zap;
Huh???
That's the equivalent of:
@source, sub (/foo/ => @foo, /bar/ => @bar, /zap/ => @zap);
which is just a syntax error.
> To end up with @foo entries being *aliases* of
> entries in @source. Btw, could these be valid,
Err. I very much doubt it.
Damian
If we're worried about the distance between the source and destination
when there are many tests, maybe:
part { /foo/ => @foo, /bar/ => @bar, /zap/ => @zap }, @source;
Or, 'long' formatted:
part {
/foo/ => @foo,
/bar/ => @bar,
/zap/ => @zap,
}, @source;
Assuming the type system can handle that. But people will forget the
comma before C<@source>, because it looks so similar to C<map>. And
think of the { ... } as a code block, not a hashref. Pffft.
I keep thinking we're missing something here. This is just a
multi-streamed C<grep>, after all. It should be easy.
Was it ever decided what C<for> would look like with multiple streams?
Maybe we could just use the stream delimiters in the C<grep> like we do
in C<for>?
grep {
/foo/ -> @foo,
/bar/ -> @bar,
/zap/ -> @zap,
} @source;
???
MikeL
> If we're worried about the distance between the source and destination
> when there are many tests
Are we? I'm not.
> maybe:
>
> part { /foo/ => @foo, /bar/ => @bar, /zap/ => @zap }, @source;
>
> Or, 'long' formatted:
>
> part {
> /foo/ => @foo,
> /bar/ => @bar,
> /zap/ => @zap,
> }, @source;
I really dislike the use of dative arguments (i.e. those that are modified
in-place by a function).
Besides, you can already write:
push (
/foo/ ?? @foo ::
/bar/ ?? @bar ::
/baz/ ?? @baz ::
[]
), $_ for @source;
Heck, even in Perl 5 you can write:
push @{
/foo/ ? \@foo :
/bar/ ? \@bar :
/baz/ ? \@baz :
[]
}, $_ for @source;
> I keep thinking we're missing something here. This is just a
> multi-streamed C<grep>, after all. It should be easy.
Famous last words. ;-)
> Was it ever decided what C<for> would look like with multiple streams?
for zip(@x, @y, @z) -> $x, $y, $z {...}
and its operator version:
for @x ¥ @y ¥ @z -> $x, $y, $z {...}
> Maybe we could just use the stream delimiters in the C<grep> like we do
> in C<for>?
No. We gave up special stream delimiters in C<for>s,
in preference for general zippers.
Damian
> I'm not sure the meaning of the name C<part> would be obvious
> to someone who hadn't seen it before.
What, as opposed to C<grep> or C<map> or C<splice> or C<qr> or
C<flock> or C<ref> or C<fork> or C<chomp> or C<crypt> or C<getservent>
or C<ucfirst> or C<lstat> or C<vec> or...? ;-)
> I keep thinking C<sift> would be nice, or maybe
> C<discrim>. Just a thought...
C<sift> is quite good. Though I still like C<part> best.
Damian
Well, no; it's an implementation distinction too. Non-core methods
1) don't mean anything special to the compiler
2) can be implemented in C, Perl, Parrot, or whatever else we like
and 3) can be added or taken away without affecting the basic design of
the language
all of which means
4) we don't have to worry about them quite yet.
Although the concept of having a data type called an array is core to
the design of Perl 6, the precise clever methods those arrays respond to
can be added organically later, or even customized by the end-user.
Basically, I'm just saying that we don't have to put everything in at
once. Let's finish carving the statue before we decide what
shade of vermillion to paint its toenails.
--
>Almost any animal is capable of learning a stimulus/response association,
>given enough repetition.
Experimental observation suggests that this isn't true if double-clicking
is involved. - Lionel, Malcolm Ray, asr.
I usually just lurk here, but I just had to pipe in. :) I'm not sure the
meaning of the name C<part> would be obvious to someone who hadn't seen
it before. I keep thinking C<sift> would be nice, or maybe
C<discrim>. Just a thought...
- Ian.
Given the original example
(@foo,@bar,@zap) := part [ /foo/, /bar/, /zap/ ] @source;
this binds the contents of @parts to (@foo,@bar,@zap)? The
array refs in @parts are not flattened though. Is it correct
to think of flattening context as a lexical flattening? i.e.
only terms written with @ are flattened and the types of
the terms can be ignored?
BTW, if part were declared as an array method, the syntax
becomes
@source.part [ /foo/, /bar/, /zap/ ]
or
part @source: [ /foo/, /bar/, /zap/ ]
Can part be a multi-method defined in the array class
so the original example syntax can be used? (I'd prefer
the code too because the switch statement is eliminated.)
> sub convert_to_sub ($classifier is topic) is cached {
Very nice.
> for @classifiers.kv -> $index, &test {
An array::kv method? Very useful for sparse arrays, but
is this preferred for all arrays? An explicit index counter
seems simpler in this case.
> my @indices = map { defined .key()($nextval) ?? .value
> :: () } %classifiers;
That map body looks like a syntax error, but it isn't. Can I add
extra syntax like
map { defined(.key.($nextval)) ?? .value :: () }
to emphasize the fact that .key is returning a code ref?
Last, but not least, the Hash case returns a junction (most
likely of a single value). Junctions don't collapse like
superpositions, so I'm wondering what really happens.
Can you describe the evaluation? I'm really interested in how
long the junction lasts (how quickly it turns into an integer
index), and what happens with a duplicate (ambiguous?) index.
Sorry for so many questions. The code you wrote was just a
really, really good example of many Perl 6 features coming
together.
[This is out of order; Damian wrote it in another message.]
> Everything doesn't. Everything shouldn't be. Just the really common,
> important stuff.
So CGI.pm is in?
I don't think "really common, important" is a good criteria for
being in the core. IMHO it should be "language defining, awkward or
impossible to implement as a module".
Perhaps the part method can be implemented as a mix-in module that
extends array without subclassing it? AUTOLOAD can do that now
for packages. Are classes sealed or will they use AUTOLOAD too?
- Ken
>> I keep thinking C<sift> would be nice, or maybe
>> C<discrim>. Just a thought...
>
> C<sift> is quite good. Though I still like C<part> best.
Ooh, I like C<sift> best. C<part> is too easy to interpret as other
things (partition? part with? part from? part of? partner? etc.).
David
--
David Wheeler AIM: dwTheory
da...@wheeler.net ICQ: 15726394
http://david.wheeler.net/ Yahoo!: dew7e
Jabber: The...@jabber.org
> On Saturday, December 7, 2002, at 10:47 PM, Damian Conway wrote:
>
> > Ian Remmler decloaked and wrote:
> >
> > > I keep thinking C<sift> would be nice ...
> >
> > C<sift> is quite good. Though I still like C<part> best.
>
> Ooh, I like C<sift> best.
I dislike C<sift> cos it's a small typo away from C<shift>.
Smylers
> I dislike C<sift> cos it's a small typo away from C<shift>.
Yes, but I would expect it to be a compile-time error, since the
signatures are different. The same can't be said for r?index.
>> sub part ($classifier, *@list) {
>
> ....
>
>> return @parts;
>> }
>
>
> Given the original example
>
> (@foo,@bar,@zap) := part [ /foo/, /bar/, /zap/ ] @source;
>
> this binds the contents of @parts to (@foo,@bar,@zap)?
Yes.
> The array refs in @parts are not flattened though.
Correct. Each array ref is bound to the corresponding array name.
> Is it correct
> to think of flattening context as a lexical flattening? i.e.
> only terms written with @ are flattened and the types of
> the terms can be ignored?
I'm not sure I understand this question.
> BTW, if part were declared as an array method, the syntax
> becomes
>
> @source.part [ /foo/, /bar/, /zap/ ]
Nearly. The parens are not optional on this form of method call, I believe.
So that would be:
@source.part([ /foo/, /bar/, /zap/ ]);
>
> or
>
> part @source: [ /foo/, /bar/, /zap/ ]
Yes.
> Can part be a multi-method defined in the array class
Multimethods don't belong to any particular class.
Does it *need* to be a method or multimethod???
>> for @classifiers.kv -> $index, &test {
>
> An array::kv method? Very useful for sparse arrays, but
> is this preferred for all arrays? An explicit index counter
> seems simpler in this case.
Depends on your definition of simpler, I guess. Depending on what you mean by
"explicit index counter", that would have to be:
for 0..@classifiers.end ¥ @classifiers -> $index, &test {
...
}
Or (heaven forefend!):
loop (my $index=0; $index<@classifiers; $index++) {
my &test := @classifiers[$index];
...
}
I really think an C<Array::kv> method nicely meets the very common need of
iterating the indices and values of an array in parallel, with a minimum
of syntax and a maximum of maintainability.
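For comparison only (none of this Perl 6 is runnable yet), the same
index-and-value iteration idiom exists elsewhere; here is a Python
sketch in which C<enumerate> plays the role of the proposed
C<Array::kv>. The classifier functions and the helper's name are
invented for illustration:

```python
# enumerate() yields (index, value) pairs, like the proposed
# @classifiers.kv -> $index, &test idiom: no explicit counter variable.
classifiers = [
    lambda s: "foo" in s,   # hypothetical stand-ins for /foo/, /bar/ tests
    lambda s: "bar" in s,
]

def first_matching_index(nextval, classifiers):
    for index, test in enumerate(classifiers):
        if test(nextval):
            return index
    return None

print(first_matching_index("foobar", classifiers))  # 0
print(first_matching_index("barely", classifiers))  # 1
```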
>
>> my @indices = map { defined .key()($nextval) ?? .value
>> :: () } %classifiers;
>
>
> That map body looks like a syntax error, but it isn't.
> Can I add extra syntax like
>
> map { defined(.key.($nextval)) ?? .value :: () }
>
> to emphasize the fact that .key is returning a code ref?
Yes, indeed.
> Last, but not least, the Hash case returns a junction (most
> likely of a single value). Junctions don't collapse like
> superpositions, so I'm wondering what really happens.
>
> Can you describe the evaluation?
Sure. Suppose that the classifier closure returns the junction C<any(1)>.
Then, within C<part>, the C<$index> variable stores that junction (i.e. junctions
survive both a copy-on-return and an assignment). The next statement is:
push @parts[$index], $nextval;
The use of a junction as an index causes the array look-up to return a junction
of aliases to the array elements selected by the various states of the index.
So C<@parts[$index]> is a disjunction of a single alias (i.e. to C<@parts[1]>).
Pushing the next value onto that alias causes it to autovivify as an array ref
(if necessary), and then push onto that nested array.
Suppose instead that the classifier closure returns the junction C<any(0,1)>.
Then, within C<part>, the C<$index> variable stores that junction, and its use
as an index causes the array look-up to return a junction
of aliases to the array elements selected by the two states of the index.
So C<@parts[$index]> is, in this second case, a disjunction of two aliases
(i.e. to C<@parts[0]> and C<@parts[1]>). Pushing the next value onto that
disjunctive alias causes it to autovivify both elements as array refs
(if necessary), and then -- in parallel -- push the value onto each nested array.
> I'm really interested in how
> long the junction lasts (how quickly it turns into an integer
> index),
It never turns into an integer index. Using a junction as an index is the
same as passing it to the C<Array::operator:[]> method, which causes the
call to the method to be distributed over each state in the junction. So, just
as:
foo(1|2|3)
is the same as:
foo(1) | foo(2) | foo(3)
so:
@array[1|2|3]
is the same as:
@array[1] | @array[2] | @array[3]
And:
@array[1|2|3] = "str";
is the same as:
(@array[1] | @array[2] | @array[3]) = "str"
which is the same as:
(@array[1] = "str") | (@array[2] = "str") | (@array[3] = "str")
and what happens with a duplicate (ambiguous?) index.
Can't happen. As Luke has expounded, junctions are a form of set,
and have no duplicate states.
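To make the evaluation above concrete, here is a toy Python model (not
the real junction semantics -- only the set-of-states property and the
distribution of the push are modelled; all names are invented). A
junction used as an index is represented as a set of indices, and a
push through it pushes onto every selected bucket, autovivifying as
needed:

```python
# Toy model: "push @parts[$junction], $val" distributes the push over
# each state of the junction. A junction is modelled as a set, so it
# has no duplicate states.
def junction_push(parts, junction, value):
    for index in junction:          # any(1): one push; any(0,1): parallel pushes
        while len(parts) <= index:  # autovivify missing elements as arrays
            parts.append([])
        parts[index].append(value)

parts = []
junction_push(parts, {1}, "a")      # classifier returned any(1)
junction_push(parts, {0, 1}, "b")   # classifier returned any(0,1)
print(parts)  # [['b'], ['a', 'b']]
```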
> Perhaps the part method can be implemented as a mix-in module that
> extends array without subclassing it?
And I'm suggesting that C<part>ing is such sweet sorrow that everyone
will want to do it all the time. Or at least often enough that dragging
it in from a module will rapidly become a PITA. Just as it is in Perl 5
to use C<List::Util::reduce> or C<List::Util::max>.
Manipulating a core data type in commonly useful ways ought to be via
core operations (or, at worst, operations that are invisibly non-core),
so that JAPHs are encouraged to code what they mean explicitly:
$sum = reduce {$^a+$^b} @nums;
$max = max @nums;
rather than emergently:
my ($max, $sum) = (-Inf, 0);
for @nums {
$max = $_ if $max < $_;
$sum += $_;
}
> AUTOLOAD can do that now
> for packages. Are classes sealed or will they use AUTOLOAD too?
There will certainly be a mechanism akin to AUTOLOAD in Perl 6.
How it will work has yet to be decided.
Damian
> Ooh, I like C<sift> best. C<part> is too easy to interpret as other
> things (partition? part with? part from? part of? partner? etc.).
You know, that's *exactly* why I like C<part> better. ;-)
Damian
Sometimes array references behave as arrays, e.g.
push $array, 1
In flattening context array refs don't flatten, only arrays.
I'm not even sure that only arrays flatten either -- it might
be anything that begins with @. e.g.
my Point @p;
($x, $y) := @p;
If the flattening rule is "only @ symbols flatten" then it
would be lexical flattening -- we only have to look at the
text. (I'm using lexical in the same sense as lexical
variable uses it.)
> Multimethods don't belong to any particular class.
> Does it *need* to be a method or multimethod???
If C<part> is not a method or multimethod, then it acts
like a reserved word or built-in, like C<grep> or C<map>.
IMHO that's name space pollution.
I know multi-methods don't "belong" to a class. It seems
useful to develop standards on where the implementation
is found though. I would expect to find C<part> as an
auto-loaded multimethod in "perl6/6.0/auto/array/part.al"
It would actually be nice if all the C<push>, C<pop>,
etc. functions became methods, e.g.
push @array: 1;
> Depends on your definition of simpler, I guess.
I don't see anything particularly complex about this:
my $index = 0;
for @classifiers {
return $index if $_.($nextval);
++$index
}
That's understandable and it should produce simple bytecode.
If @classifiers is sparse or non-zero-based, then the .kv
method might be better.
> I really think an C<Array::kv> method nicely meets the very common need of
> iterating the indices and values of an array in parallel, with a minimum
> of syntax and a maximum of maintainability.
Yes, I agree, but it needs to construct a stream generator
which isn't particularly efficient. I was surprised to see it
in a place where the generality and elegance isn't needed.
Thanks for the explanation of the junction. I'm not sure
whether I'm more excited by the possibility to write code
using junctions or more terrified by the certainty of
debugging that code... ;)
> And I'm suggesting that C<part>ing is such sweet sorrow that everyone
> will want to do it all the time. Or at least often enough that dragging
> it in from a module will rapidly become a PITA. Just as it is in Perl 5
> to use C<List::Util::reduce> or C<List::Util::max>.
How about formalizing global namespace pollution with something
like the Usenet news group formation process? Ship Perl 6 with a
very small number of global symbols and let it grow naturally.
- Ken
[snipped]
> so it's easy to build up more complex right-to-left pipelines, like:
>
> (@foo, @bar) :=
> part [/foo/, /bar/],
> sort { $^b <=> $^a }
> grep { $_ > 0 }
> @data;
>
>
I would like perl6 to support left-to-right part/sort/grep pipelines.
Left to right syntax is generally good because it facilitates the flow
of reading.
For these pipelines, the current right to left syntax is due to the emphasis
on the operation over the data operated on, so the operator appears
first. Nevertheless with a long pipeline, data is best factored out in a
variable so having it first is not an impediment.
Tentative syntax:
... is a left-associative operator that has the same precedence as .
argexpr...listop indirop
would be equivalent to
listop indirop argexpr
example:
@data = [ very_long_data_expression ]
(@foo, @bar) := @data...grep { $_ > 0 }...sort { $^b <=> $^a }...part [/foo/, /bar/];
Also, I am not necessarily advocating that operators like :=
could be flipped to become =: with flipped operands:
@data...grep { $_ > 0 }...sort { $^b <=> $^a }...part [/foo/, /bar/] =: (@foo, @bar)
I am just advocating to examine the idea. :)
I certainly see an immediate problem with the current conventions:
=~ and ~= are two different beasts, not one beast and its flipped version.
__
stef
> Damian:
> > so it's easy to build up more complex right-to-left pipelines, like:
> >
> > (@foo, @bar) :=
> > part [/foo/, /bar/],
> > sort { $^b <=> $^a }
> > grep { $_ > 0 }
> > @data;
> >
> >
>
> I would like perl6 to support left-to-right part/sort/grep pipelines.
> Left to right syntax is generally good because it facilitates the flow
> of reading.
>
> For these pipelines, the current right to left syntax is due to the emphasis
> on the operation over the data operated on, so the operator appears
> first. Nevertheless with a long pipeline, data is best factored out in a
> variable so having it first is not an impediment.
[snip]
I was just playing with Mathematica and thinking this very same thing.
Mathematica has an operator // that applies arguments on the left to
the function on the right. I was just thinking how good that was for
clarity. To do some awful computation, and get a numeric result, you
can write:
N[awful computation]
Or:
awful computation // N
I was instantly reminded of TMTOWTDI, in a good way. Perhaps Perl
could adopt a similar mechanism? The operator in question should have
very low precedence. >> is available, I think, since bitops are
prefixed with . or whatever.
$0{statement}{expression}{additive_expression}[0] >> print;
That's rather nicer, IMHO, than:
print($0{statement}{expression}{additive_expression}[0]);
You could even give a closure:
$0{...} >> { print $^v, "\n" }
Not that anyone would. The situation is analogous to that of:
die "Can't do it" unless something;
versus
something or die "Can't do it";
It allows for moving the important stuff out to the left (Depending on
what you consider important).
@a >> grep { $_ > 0 } >> sort >> { print $^v, "\n"}
Aha! That's when you use the closure. Unix pipelines are so nice to
script with, why shouldn't Perl steal them? :)
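As a hedged illustration of what such a left-to-right pipeline would
feel like, here is a small Python sketch that overloads C<< >> >> in
the way proposed above. The C<Pipe> wrapper and the stage names are
invented; this only models the data flow, not any proposed Perl 6
semantics:

```python
# Model of the proposed ">>" pipeline: each stage is a function taking
# the whole list; "pipe >> stage" feeds the current data to the stage
# and wraps the result so further stages can be chained left to right.
class Pipe:
    def __init__(self, data):
        self.data = data
    def __rshift__(self, func):   # implements: pipe >> stage
        return Pipe(func(self.data))

positives = lambda xs: [x for x in xs if x > 0]       # grep { $_ > 0 }
descending = lambda xs: sorted(xs, reverse=True)      # sort { $^b <=> $^a }

result = (Pipe([3, -1, 2, -5]) >> positives >> descending).data
print(result)  # [3, 2]
```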
> Also, I am not necessarily advocating that operators like :=
> could be flipped to become := with flipped operands:
>
> @data...grep { $_ > 0 }...sort { $^b <=> $^a }...part [/foo/, /bar/] =: (@foo, @bar)
>
> I am just advocating to examine the idea. :)
> I certainly see an imediate problem with the current conventions:
> =~ and ~= are two different beasts, not one beast and its flipped version.
Yeah... I don't think that would work so well. There's just too many
operators that have meanings both ways.
Luke
Some time ago there was some discussion in that direction. Larry said
he'd like to put arguments before the function name, and then he began
to talk about Japanese. I was trying to push the ~~ operator for
exactly this purpose, but Larry explained that ~~ is first of all for
the purpose of returning a meaningful boolean.
I really like the idea of pipe-like syntax .
Mathematica have another operator that seems to be nice ( and not used
yet in perl ) :
@students /. sort { $^a.grade <=> $^b.grade }
/. head 5 ;
Interestingly, I proposed ~> for that purpose back then: a fusion of ~~
and -> . But ~> is ugly, I admit.
so
$x /. foo # foo( $x )
$x /. foo /. bar # bar( $x /. foo ) # bar( foo( $x ) )
maybe /. should be just infix form of given
given $x , &foo ;
given ( given $x , &foo ) , &bar ;
but then proper Unix pipe |. should probably be infix form of "for" ...
for @x , &foo ;
for ( for @x , &foo ) , &bar ;
@x |. &foo
@x |. &foo |. &bar
infix form of "if" *is* already in language .
if $x { &foo } else { &bar };
$x ?? { &foo } :: { &bar };
arcadi
>
>
> [snipped]
>
> > so it's easy to build up more complex right-to-left pipelines, like:
> >
> > (@foo, @bar) :=
> > part [/foo/, /bar/],
> > sort { $^b <=> $^a }
> > grep { $_ > 0 }
> > @data;
> >
> >
>
> I would like perl6 to support left-to-right part/sort/grep pipelines.
> Left to right syntax is generally good because it facilitates the flow
> of reading.
It is good for a rather deeper reason than just facilitating the flow of
reading. Psycholinguistic experiments show that the human brain can't
absorb the meaning of such language structures when an "unbound referent"
has not been filled. Consider the difference in comprehensibility of
I gave my friend who I saw last July in the park near the summer
cottage one afternoon when it was rainy but fairly warm my standard
talk on linguistic complexity.
vs.
I gave my standard talk on linguistic complexity to my friend who I saw
last July in the park near the summer cottage one afternoon when it was
rainy but fairly warm.
Pipelines in general--get everything done HERE, then hand it off to
something THERE, and then somewhere ELSE--are much easier for the mind to
comprehend than nested structures where the referent on which the whole
structure depends is found at the beginning or the end.
I can say
The dog bit the cat who chased the rat who stole the cheese which
spoiled in the barn which was built by the farmer who married the
teacher who....
more or less indefinitely. We should definitely have the ability to
pipeline as effectively in Perl, and that means left-to-right.
Trey
--
I'm looking for work. If you need a SAGE Level IV with 10 years Perl,
tool development, training, and architecture experience, please email me
at tr...@sage.org. I'm willing to relocate for the right opportunity.
I was wrong about the precedence: the operator should bind just looser
than list operators, and certainly looser than comma, to avoid the need
for parentheses around argexpr:
(argexpr)...listop indirop # parentheses necessary if ... binds too tightly
>
>
> example:
>
> @data = [ very_long_data_expression ]
> (@foo, @bar) := @data...grep { $_ > 0 }...sort { $^b <=> $^a }...part [/foo/, /bar/];
>
To go left to right all the way, we could have:
@data...grep { $_ >0 } ...sort...@result
Or even
@data...grep { $_ >0 } ...sort...@result...@result2 ... grep { % 2}... @result3;
--
stef
sub comparator {
when /hi/ { 0 }
when /lo/ { 1 }
default { 2 }
}
@input = qw(high low hi lo glurgl);
@out1 = part comparator @input;
@out2 = sort { comparator $^a <=> comparator $^b } @input;
Identical, aren't they? If C<sort> returned all items that compared
equal (0) grouped together, they would be identical when bound, too.
(Of course, how such a thing would be implemented, or even expressed,
is left as an exercise for the reader. :^) )
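For concreteness, a Python sketch of what the two forms produce,
modelling C<part> as bucketing by the classifier's result (the
C<part> helper and substring tests are stand-ins for the Perl 6
proposal and its regexes):

```python
# part() groups values into sublists keyed by the classifier's result;
# a sort keyed on the same classifier yields one flat, reordered list.
def comparator(s):
    if "hi" in s: return 0
    if "lo" in s: return 1
    return 2

inp = ["high", "low", "hi", "lo", "glurgl"]

def part(classify, items, nclasses):
    buckets = [[] for _ in range(nclasses)]
    for item in items:
        buckets[classify(item)].append(item)
    return buckets

out1 = part(comparator, inp, 3)
out2 = sorted(inp, key=comparator)  # stable: original order kept per class

print(out1)  # [['high', 'hi'], ['low', 'lo'], ['glurgl']]
print(out2)  # ['high', 'hi', 'low', 'lo', 'glurgl']
```

So the two are close but not identical: one is nested, one is flat.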
[ It seems that this thread has drifted off-topic. Perhaps a renaming
is in order? ]
--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)
"If you want to propagate an outrageously evil idea, your conclusion
must be brazenly clear, but your proof unintelligible."
--Ayn Rand, explaining how today's philosophies came to be
> I would like perl6 to support left-to-right part/sort/grep pipelines.
> Left to right syntax is generally good because it facilitates the flow
> of reading.
>
> [cut]
>
> Tentative syntax:
> ... is a left-associative operator that has the same precedence as .
>
> [cut]
>
> example:
>
> @data = [ very_long_data_expression ]
> (@foo, @bar) := @data...grep { $_ > 0 }...sort { $^b <=> $^a }...part [/foo/, /bar/];
I like the intent, but I'm not sure about the syntax -- nor the
statement about precedence: seems to me that the pipe operator
needs a very low precedence, not very high.
In fact, the operator behaves pretty much like a low-precedence
C<dot> operator: it passes its lhs as the first
arg to the method that follows.
The most obvious (IMHO) syntax for the operator would be
the old arrow operator (->): it's a pity it's already taken,
and could be ambiguous in this context if overloaded.
An existing convention for low-precedence versions of operators
is to use an alphabetic name (e.g. || vs or). Perhaps we
could name this operator C<pp>: it's vaguely reminiscent of the
word pipe, without subverting that identifier's existing role.
Thus we could write:
@out = @in
pp map { foo }
pp grep { bar }
pp sort { $^a <=> $^b }
Dave.
My understanding was that in Perl6, you could use pretty much anything
for a hashkey--string, number, object, whatever, and that it did not
get mashed down into a string. Did I have this wrong?
--Dks
(@foo, @bar) := @a
. grep { $_ > 0}
. sort { $^b <=> $^a }
. part [/foo/, /bar/];
Of course, that means that grep and sort and part are all methods of the Array
class, so the standard way to write them would be
grep @a: {$_ > 0};
instead of
grep {$_ > 0} @a;
Hmmmm. Odd. I'm guessing it wouldn't be possible to extend the indirect
object syntax to allow
grep {$_ > 0} @a:;
(object can go anywhere in argument list, so long as it's marked with a :. But
now I'm trying to speculate about Larry's colon, something best left to
others).
But somehow it seems like an increase in readability, especially if things were
renamed. Imagine renaming "grep" to "where" or "suchthat". And then the
antigrep can be "except".
--
Adam Lopresto (ad...@cec.wustl.edu)
http://cec.wustl.edu/~adam/
If God didn't want us to eat animals, he wouldn't have made them out of
meat!
Which is why I really wish we had a closure-based syntax similar to
sort, but I grudgingly understand the problems with that. (Heck, it's
similar to C<given>, C<for>, C<sort>, C<map>, C<grep>... that's why I
think it has to be a built-in, because we're really talking about
variations of the same basic, painfully common algorithm.
I'd suggest if we could do >> (and <<) 'piping' operators, C<part>
would become:
@data >> part /foo/, /bar/ >> @foo, @bar;
Again, though, people will keep thinking they want to write:
@data >> part { /foo/, /bar/ } >> @foo, @bar;
to look like C<map> and C<grep>.
If we had piping capable of handling splits into parallel streams we
*could* do away with C<part> altogether, and have something like:
@data >> ( /foo/ >> @foo
& /bar/ >> @bar
& /zap/ >> @zap );
Where & and | would control whether you fell through to the next test?
That way, C<part> really *is* just a multi-streamed grep!
> [ It seems that this thread has drifted off-topic. Perhaps a renaming
> is in order? ]
OK.
MikeL
> If C<part> is not a method or multimethod, then it acts like a
> reserved word or built-in, like C<grep> or C<map>. IMHO that's name
> space pollution.
Yes, it is namespace pollution, but I don't think that's a problem in
Perl.
Many other languages have functions in the same namespace as variables;
this is usually the biggest problem, since people tend to have more
variables than functions. Perl's sigils means this isn't a problem:
C<$part> or C<@part> cannot conflict with C<part>.
Functions are all in a namespace. Perl providing C<CORE::part> doesn't
preclude the existence of C<Ken_Fox::part> or whatever. Most core
functions can be overridden, so even if you don't like the built-in
behaviour you can change it, without Perl insisting on keeping the
original name for itself.
Perl 5 doesn't permit this with all built-in functions, but I'm hoping
that the improved grammar and function declaration syntax in Perl 6 will
allow many more built-in functions to be overridden.
> How about formalizing global namespace pollution with something like
> the Usenet news group formation process? Ship Perl 6 with a very
> small number of global symbols and let it grow naturally.
If the initial release of Perl 6 doesn't have commonly-required
functions then people will write their own. People will do these in
incompatible ways, ensuring that when it's determined that the language
would benefit from having a particular function built in at least some
people will have to change their code to keep it working.
People will also choose different names for their functions. If C<part>
only appears in Perl version 6.0.3, there'll already be dozens of
scripts which have a sub of that name doing something completely
different.
Adding extra functions will create critical differences between versions
of Perl with very small differences in their version number. People
will get frustrated at needing a particular point-release of Perl to run
programs[*0].
Or, alternatively, people will shy away from using those functions --
which by definition are so useful in every day programming that it's
been decided to add them to the language -- because they want their code
to be portable, thereby defeating the purpose of adding them.
Perl 6.0.0 can't be perfect, but please can we aim to be as close as
possible. Releasing a language with the caveat "but we've missed out
lots of important functions that we expect to add in the next version or
four" strikes me as a little odd.
[*0] Yes, obviously this always applies to some extent: a
point-release wouldn't be made unless it adds or changes _something_
in the language. But often these are small things, or tweaks to
edge-case behaviour.
Smylers
> is to use an alphabetic name (e.g. || vs or). Perhaps we
> could name this operator C<pp>: it's vaguely reminiscent of the
>
> @out = @in
> pp map { foo }
> pp grep { bar }
> pp sort { $^a <=> $^b }
I like the idea of an alphabetic operator name here, but I would like
to find something other than 'pp', for two reasons:
1) to me, 'pp' looks too much like one of the quote operators
2) it isn't particularly self-documenting (not that qq or grep or
whatever are, but...)
My suggestion would be one of the following:
to, after, sendto, into, thru
@out = @in to map { foo } to grep { bar } to sort { $^a <=> $^b }
@out = @in after map { foo } after grep { bar } after sort { $^a <=> $^b }
@out = @in sendto map { foo } sendto grep { bar } sendto sort { $^a <=> $^b }
@out = @in into map { foo } into grep { bar } into sort { $^a <=> $^b }
@out = @in thru map { foo } thru grep { bar } thru sort { $^a <=> $^b }
I'm not deliriously happy with any of these, but I think 'to' and
'after' are the best of the lot. Anyone else have a better
suggestion?
--Dks
> Looks to me like with a few appropriate methods, you have left-to-right
> ordering for free.
>
> (@foo, @bar) := @a
> . grep { $_ > 0}
> . sort { $^b <=> $^a }
> . part [/foo/, /bar/];
Yes, exactly.
> Of course, that means that grep and sort and part are all methods of the Array
> class, so the standard way to write them would be
>
> grep @a: {$_ > 0};
>
> instead of
>
> grep {$_ > 0} @a;
>
> Hmmmm. Odd. I'm guessing it wouldn't be possible to extend the indirect
> object syntax to allow
>
> grep {$_ > 0} @a:;
Eh, here you give a nod to backwards compatibility (not to mention to
removing extraneous symbols), I think, and define universal multimethods
taking a block and an array or an array of blocks and an array (or an
array of rules and an array, etc.).
That way, all the following would work:
(@foo, @bar) := @a.grep{foo}.sort{byBar}.part[/foo/, /bar/];
# do you need parens around the parameters?
(@foo, @bar) := part [ /foo/, /bar/ ]
sort byBar
grep {foo} @a;
(@foo, @bar) := part :(
sort :(
grep :@a {foo})
byBar)
[/foo/, /bar/];
That last is atrocious--but the fact that it is available doesn't mean
that it should be encouraged.
> (object can go anywhere in argument list, so long as it's marked with a
> :. But now I'm trying to speculate about Larry's colon, something best
> left to others).
That would allow
(@foo, @bar) := part [ /foo/, /bar/ ] :(
sort byBar :(
grep {foo} :@a));
as well as
(@foo, @bar) := part [ /foo/, /bar/ ] :(
sort :(
grep :@a {foo})
byBar);
or
(@foo, @bar) := part :(
sort {foo} :(
grep {foo} :@a))
[ /foo/, /bar/ ];
This is really awful, IMHO. I say just define the multimethod letting you
rearrange the arguments back to the old style and get on with it.
Extending colon to allow it to move willy-nilly through the argument list
may be useful in simple cases, but if nested will be incomprehensible.
This is exactly the same as natural languages, by the way--we can deal
with one or two levels of interior clause nesting, but beyond that we lose
track of what clause is playing what role. The question becomes, is
moving the indirect object around useful enough in the small
comprehensible cases that we're willing to accept the ability to write the
large incomprehensible ones?
> But somehow it seems like an increase in readability, especially if things were
> renamed. Imagine renaming "grep" to "where" or "suchthat". And then the
> antigrep can be "except".
As synonyms, or total renamings for the array-method form? I neither like
populating the language with superfluous names, nor do I like having to
remember "C<grep> is C<suchthat> when used as an array method", which
you'll have to do if you try to transform the functional style to the
method-pipeline style.
Hmm. Does operator precedence allow that?
@a . grep { $_ > 0 } . sort { $^b <=> $^a };
Or does it think the second dot is attempting to call a method of the
thing returned by the closure { $_ > 0 }, not the result of (@a.grep {
$_ > 0 })?
E.G. if 'grep' is a normal method, wouldn't the dot operator bind more
tightly to the closure { $_ > 0 } than the method name 'grep' would?
MikeL
You're taking that out of context. Ship the commonly required
functionality, but don't introduce new global symbols.
A global C<push> function:
push @array, 1
A class C<push> method:
push @array: 1
Methods work well with AUTOLOAD, so they probably don't require
C<use> statements. Anyways, I'd rather have C<use> statements than
globals. I know others disagree -- I even disagree when I'm
trying to write a one-liner on the command line.
Perl 6 is the community rewrite. One of the pillars of the
community is CPAN. Could CPAN help resolve simple library and
namespace issues? Adding C<purge> or C<part> is not a language
design issue.
- Ken
By default they're keyed by strings. You can smack a property on them
to key them by something else, though:
my %sparse is keyed(Int);
my %anything is keyed(Object); # or UNIVERSAL
Luke
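For readers coming from languages where this is the default: Python
dicts already behave like the C<is keyed(...)> opt-in described above.
Keys are arbitrary hashable values, compared as values rather than
mashed down into strings:

```python
# dict keys are compared as values, not stringified: the int 1000000
# and the string "1000000" are distinct keys.
sparse = {1_000_000: "far", 3: "near"}   # integer keys, no stringification
by_point = {(2, 5): "a point"}           # a tuple key, compared structurally

assert sparse[1_000_000] == "far"
assert by_point[(2, 5)] == "a point"
assert 1_000_000 in sparse and "1000000" not in sparse
```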
I think we're better off shooting for 6.0.0 not getting in the way in
spots we're not sure of, rather than perfection. We can fix an awful
lot later if we don't get in our own way now.
I fully expect that the keyword list will be closed off for rather a
long time, though, once 6.0.0 has been released, so perhaps energy
would be better directed in figuring out how to make something like
purge (and its syntactic cousins) work as an add-on?
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk
> I'd suggest if we could do >> (and <<) 'piping' operators
We can't have << because heredocs and <<..>>'s are already using it.
Likewise we can't have >> because of <<..>>.
See my other post on solving the left-to-right (pseudo-)problem.
Damian
> If the initial release of Perl 6 doesn't have commonly-required
> functions then people will write their own. People will do these in
> incompatible ways, ensuring that when it's determined that the language
> would benefit from having a particular function built in at least some
> people will have to change their code to keep it working.
>
> People will also choose different names for their functions. If C<part>
> only appears in Perl version 6.0.3, there'll already be dozens of
> scripts which have a sub of that name doing something completely
> different.
>
> Adding extra functions will create critical differences between versions
> of Perl with very small differences in their version number. People
> will get frustrated at needing a particular point-release of Perl to run
> programs[*0].
>
> Or, alternatively, people will shy away from using those functions --
> which by definition are so useful in every day programming that it's
> been decided to add them to the language -- because they want their code
> to be portable, thereby defeating the purpose of adding them.
>
> Perl 6.0.0 can't be perfect, but please can we aim to be as close as
> possible. Releasing a language with the caveat "but we've missed out
> lots of important functions that we expect to add in the next version or
> four" strikes me as a little odd.
Amen! I deliberately requoted all of that because it's so very right
that I wanted everyone to reread it. ;-)
I have nothing to add except my wholehearted agreement, and a reminder of
how much trouble was caused in Perl 5 by not having one form of switch
statement, and therefore ending up with 23 forms of switch statement.
Damian
> Sometimes array references behave as arrays, e.g.
>
> push $array, 1
>
> In flattening context array refs don't flatten, only arrays.
> I'm not even sure that only arrays flatten either -- it might
> be anything that begins with @. e.g.
>
> my Point @p;
> ($x, $y) := @p;
>
> If the flattening rule is "only @ symbols flatten" then it
> would be lexical flattening -- we only have to look at the
> text. (I'm using lexical in the same sense as lexical
> variable uses it.)
That would certainly make sense.
> It would actually be nice if all the C<push>, C<pop>,
> etc. functions became methods, e.g.
>
> push @array: 1;
I'm undecided on that. I can certainly see the appeal though.
>> Depends on your definition of simpler, I guess.
>
> I don't see anything particularly complex about this:
>
> my $index = 0;
> for @classifiers {
> return $index if $_.($nextval);
> ++$index
> }
Apart from the fact that it creates a gratuitous lexical outside
the scope of the C<for> block. And that it requires explicit vs
implicit incrementing of the index variable. And it won't work if
the array doesn't start at index 0, or doesn't have contiguous
indices.
Whereas:
for @classifiers.kv -> $index, $_ {
return $index if $_.($nextval);
}
suffers from none of those problems. And has half as many lines.
And encourages the coder to name the topic something more maintainable:
for @classifiers.kv -> $index, &classifier {
return $index if classifier($nextval);
}
> That's understandable and it should produce simple bytecode.
We're probably not going to convince each other.
I guess it's a religious issue. ;-)
> Yes, I agree, but it needs to construct a stream generator
> which isn't particularly efficient.
I suspect that .kv iterators will be *very* lightweight.
Precisely because they will be heavily used for this very idiom.
> I was surprised to see it
> in a place where the generality and elegance isn't needed.
IMHO there is *no* such place. ;-)
> Thanks for the explanation of the junction. I'm not sure
> whether I'm more excited by the possibility to write code
> using junctions or more terrified by the certainty of
> debugging that code... ;)
Well, I'd hope you'd be *both*!
> How about formalizing global namespace pollution with something
> like the Usenet news group formation process? Ship Perl 6 with a
> very small number of global symbols and let it grow naturally.
I'm not in favour of that. Most of the things we're having to
fix in Perl 6 are things that "grew naturally" in Perl 5.
Evolution is *greatly* overrated.
Damian
> My understanding was that in Perl6, you could use pretty much anything
> for a hashkey--string, number, object, whatever, and that it did not
> get mashed down into a string. Did I have this wrong?
Not wrong. But it's not the default. The default is Str keys only.
But I take your point, and it may well be that C<want 'hashkey'>
is a useful thing to know.
Damian
No.
@out1 has three elements, each an array reference: (['high','hi'], ['low','lo'], ['glurgl'])
@out2 has five elements, each a string: ('high', 'hi', 'low', 'lo', 'glurgl')
> "If you want to propagate an outrageously evil idea, your conclusion
> must be brazenly clear, but your proof unintelligible."
> --Ayn Rand, explaining how today's philosophies came to be
Hmmmmm. Sound more like:
--Damian Conway, explaining how Perl 6 junctions came to be
;-)
Damian
> I like the intent, but I'm not sure about the syntax -- nor the
> statement about precedence: seems to me that the pipe operator
> needs a very low precedence, not very high.
> An existing convention for low-precedence versions of operators
> is to use an alphabetic name (e.g. || vs or). Perhaps we
> could name this operator C<pp>: it's vaguely reminiscent of the
> word pipe, without subverting that identifier's existing role.
> Thus we could write:
>
> @out = @in
> pp map { foo }
> pp grep { bar }
> pp sort { $^a <=> $^b }
Perhaps, instead of a low precedence dot operator, what we need is an
operator that *appends* its left operand to the argument list of its
right operand.
I'd suggest it be called C<then>:
@out = @in
then map { foo }
then grep { bar }
then sort { $^a <=> $^b };
Of course, if you were a strict left-to-right-arian, you'd presumably write:
@in
then map { foo }
then grep { bar }
then sort { $^a <=> $^b }
then @out = ;
Damian
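For what it's worth, modern Ruby has an `Object#then` (an assumption: Ruby 2.6 or later) that passes its receiver into a block, which gives much the same left-to-right feel as the proposed C<then> operator. A comparison sketch, using Ruby's method names rather than Perl's:

```ruby
# Not Perl 6: Ruby's Object#then feeds its receiver to the block,
# chaining left to right like the proposed `then` operator.
input = [3, -1, 2, -5, 4]

out = input
  .then { |a| a.map    { |x| x * 10 } }   # like: then map  { foo }
  .then { |a| a.select { |x| x > 0  } }   # like: then grep { bar }
  .then { |a| a.sort }                    # like: then sort { ... }

p out   # => [20, 30, 40]
```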
> Looks to me like with a few appropriate methods, you have left-to-right
> ordering for free.
>
> (@foo, @bar) := @a
> . grep { $_ > 0}
> . sort { $^a <=> $^b }
> . part [/foo/, /bar/];
Yes indeed.
> Of course, that means that grep and sort and part are all methods of the Array
> class, so the standard way to write them would be
>
> grep @a: {$_ > 0};
>
> instead of
>
> grep {$_ > 0} @a;
There's no reason that C<grep> couldn't be both a method and a builtin.
> Hmmmm. Odd. I'm guessing it wouldn't be possible to extend the indirect
> object syntax to allow
>
> grep {$_ > 0} @a:;
It's *technically* possible (see Lingua::Romana::Perligata for example ;-)
It may or may not be *culturally* possible.
Damian
>> (@foo, @bar) := @a
>> . grep { $_ > 0}
>> . sort { $^a <=> $^b }
>> . part [/foo/, /bar/];
>
>
> Hmm. Does operator precedence allow that?
I don't think the method-call syntax allows it. I think methods
need their parens. So we need:
(@foo, @bar) := @a
. grep( { $_ > 0} )
. sort( { $^a <=> $^b } )
. part( [/foo/, /bar/] );
which is no huge imposition.
On the other hand, my C<then> proposal doesn't require parens at all. ;-)
Damian
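This kind of chain already works, parens and all optional, in Ruby, where `partition` plays roughly the role of `part` (a single boolean test yielding two lists, rather than a list of rules). An illustrative sketch, not Perl 6:

```ruby
# Illustrative Ruby: block-taking methods chain without parens,
# and partition splits on one boolean test into two bins.
pass, fail = [4, -2, 7, -1]
  .select    { |x| x.abs > 1 }   # like: grep { ... } (a different test)
  .sort                          # like: sort { $^a <=> $^b }
  .partition { |x| x > 0 }       # like: part, but one test, two bins

p pass   # => [4, 7]
p fail   # => [-2]
```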
*Why* do methods need their parens? If methods can be specified to possibly
take a block, such as grep and sort do, then they shouldn't need parens.
Or at least, I know a language in which this is possible... :)
--
"Don't worry about people stealing your ideas. If your ideas are any good,
you'll have to ram them down people's throats."
-- Howard Aiken
To know whether the method takes a block, you need to know how it's been
declared. In other words, the type of @a needs to be known to find grep's
declaration. In turn, grep must specify its return type in order to find
sort's declaration, and sort must specify its return type so that part's
declaration may be found.
That's all fine for the standard/builtin methods on arrays, but it's a bit
un-Perl-like to force users to specify everything so exhaustively. Of course, if they
do declare methods with all the bells and whistles, they get the benefit of
not having to use parens later on.
--
Peter Haworth p...@edison.ioppublishing.com
"Although they all look the same to me, doormats and other furnishings
probably have a strict social hierarchy. Every chair a god. Every item
of pottery from the Franklin mint, an angel. Man, I love decor."
-- Ashley Pomeroy
Well, that's what always happens on a method call.
> In turn, grep must specify its return type in order to find
> sort's declaration,
No, not at all. As I've said, you assume that all methods *can* take a block.
--
What would happen if you ran up to Hitler and mentioned Usenet?
- Kibo
At run time, yes. However, at compile time, due to Perl's dynamic nature,
you don't know how methods have been declared unless the programmer is using
the optional B&D features.
> > In turn, grep must specify its return type in order to find sort's
> > declaration,
>
> No, not at all. As I've said, you assume that all methods *can* take
> a block.
Fair enough; that simplifies things somewhat. However, you can't tell how
many arguments they take. How do you parse this without the programmer
specifying a great deal more than they're used to in Perl 5?
$foo.bar $baz,$qux
Is it
$foo.bar($baz),$qux
or
$foo.bar($baz,$qux)
or even a syntax error (though this would require bar()'s declaration to be
known at compile time):
$foo.bar() $baz,$qux
--
Peter Haworth p...@edison.ioppublishing.com
Spider Boardman: I'm having fun with it.
Dan Sugalski: Inside the [perl] tokenizer/lexer? This has got to be the
scariest thing I've heard in a long time. You are a sick, sick man.
I see no block here. I'm just talking about passing a block to a method.
You think I'm talking about a clever way of specifying a block's argument
signature. I'm not.
--
These days, if I owned the M$ division that built Lookout
and SexChange, I'd trade it for a dog, and then I'd shoot
the dog. - Mike Andrews, asr.
Actually, I was accepting your point about block arguments not needing
parens, and generalising it to other kinds of arguments.
You want this to work:
@b = @a.grep { /\S/ };
instead of/as well as this:
@b = @a.grep( { /\S/ } );
I can agree that it's much cleaner looking. However, I want to be sure that
it doesn't introduce ambiguity. If the programmer wants $c on the end, and
writes this:
@b = @a.grep { /\S/ }, $c;
how does the compiler know whether $c is an argument to grep, or another
element to be assigned to @b?
Maybe a method can either be called with a parenthesised argument list, no
arguments (without parens), or with a single paren-less block. That also
gets around the chaining issue of:
@a.grep { /\S/ }.grep { .foo };
If the block is surrounded by implicit parens, that stops it getting
parsed as:
@a.grep( { /\S/ }.grep( { .foo } ));
Anyway, my point was that methods with paren-less arguments are either
ambiguous or greedy, unless you restrict the types of arguments this
applies to. If it's just blocks, then I'm fine with it.
--
Peter Haworth p...@edison.ioppublishing.com
"The usability of a computer language is inversely proportional to the
number of theoretical axes the language designer tries to grind."
-- Larry Wall
The same way it does when it sees a normal sub?
I know, late binding and all that. But when you think about it, a lot
can be done to simulate the conditions otherwise. For example, with a
definition like this:
class Foo {
method bar($self: $baz) { ... }
}
And a call like this:
@b=$foo_obj.bar $baz, $quux;
Where we can see *at runtime* that $quux is too many arguments, we can
just append it to the end of bar()'s return value. (This would only
happen when there were no parentheses.) Similarly, with:
class Foo {
method bar($self: HASH $baz) { ... }
}
And:
%b=$foo_obj.bar { baz() };
We can call the closure and construct a hashref from the value.
--Brent Dax <bren...@cpan.org>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)
Seems to me that you just gave a really good argument for requiring
the parentheses.
-Scott
--
Jonathan Scott Duff
du...@cbi.tamucc.edu
I've always hated that methods are treated so differently from subs. It
trips me up constantly, and I'd like to see it changed. Such a thing
can be done. The cop-out may be easier--but is it the *right* way to do
this language? If we wanted an easy language to implement, we would
have stuck with something that looks and feels like C. Perl is designed
to make the *programmer*'s life easier, not the implementer's.
> *Why* do methods need their parens?
Because calls to them are not resolved until run-time and because methods
can be overloaded by signature, so we can't tell at parse time what the
parameter list of the called method will be (i.e. where it will end),
so we can't determine how to parse the arguments.
For example, consider:
$obj.method $obj.method { closure() } $arg1, $obj.method $arg2, $arg3;
Is that:
$obj.method( $obj.method( { closure() }, $arg1, $obj.method($arg2, $arg3) ) );
or:
$obj.method( $obj.method({ closure() }, $arg1), $obj.method($arg2, $arg3) );
or:
$obj.method( $obj.method({ closure() }), $arg1, $obj.method($arg2, $arg3) );
or:
$obj.method( $obj.method({ closure() }), $arg1, $obj.method($arg2), $arg3 );
???
*All* of them might be valid interpretations, and *none* of them might be known
to be valid (or even knowable) at the point where the code is parsed.
Throw multimethods into the mix and things get an order of magnitude worse.
Incidentally, the indirect object syntax will suffer from exactly the same problems
in Perl 6.
The solution in both cases is to defer the parameter/argument type checking and
the method dispatch until run-time. That works fine, but only if the compiler
can determine exactly how many arguments to pass to the method call. For that,
it needs either explicit parens, or a default rule.
Perhaps Perl 6 *will* have the default rule. Maybe that each method is passed
as many arguments as possible, working recursively outwards (and right-to-left) in
an expression. That would, for example, mean that:
$obj.method $obj.method { closure() } $arg1, $obj.method $arg2, $arg3;
would always mean:
$obj.method( $obj.method( { closure() }, $arg1, $obj.method($arg2, $arg3) ) );
That *might* be okay, except that people are going to be very annoyed if the only
possible valid interpretation was:
$obj.method( $obj.method({ closure() }, $arg1), $obj.method($arg2, $arg3) );
and they get a run-time error instead.
For that reason, even if we can solve this puzzle, it might be far kinder
just to enforce parens.
Damian
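Ruby is an existence proof of the greedy default rule Damian sketches, right down to the run-time annoyance: with parens omitted, a nested call swallows every argument to its right. A sketch using two hypothetical helpers, `wrap` and `pair`:

```ruby
# Hypothetical helpers to show Ruby's greedy, right-to-left parse
# of paren-less ("command") calls.
def wrap(x)
  [x]
end

def pair(x, y)
  [x, y]
end

begin
  pair wrap 1, 2        # greedily parsed as pair(wrap(1, 2))
rescue ArgumentError => e
  puts "runtime error: #{e.message}"   # wrap wanted only one argument
end

p pair(wrap(1), 2)      # parens force the reading that works: [[1], 2]
```

Exactly as predicted: the only interpretation that would have succeeded, `pair(wrap(1), 2)`, is not the one the greedy rule picks, and the programmer gets a run-time error instead.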
I might be weird, but when I use parens to clarify code in Perl, I
like to use the Lisp convention:
(method $object args)
Hopefully that will still work even if Perl 6 requires parens.
- Ken
Beware the hobgoblin of foolish consistency :-)
> Perl is designed
> to make the *programmer*'s life easier, not the implementer's.
Another good argument for requiring the parentheses!
I think it'd become
(method $object: args)
I confess I found myself thinking along similar lines when I read
Damian's post.
--
Piers
"It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
-- Jane Austen?
I'm just talking about passing a block to a method. You think I'm
talking about a clever way of specifying a method's argument
signature. I'm not.
--
No proper program contains an indication which as an operator-applied
occurrence identifies an operator-defining occurrence which as an indication-
applied occurrence identifies an indication-defining occurrence different
from the one identified by the given indication as an indication-applied
occurrence. - Algol 68 Report
My father in law (Charles Lindsey) says that para in the Algol 68 Report is
not his.
Richard
--
Personal Ric...@waveney.org http://www.waveney.org
Telecoms Ric...@WaveneyConsulting.com http://www.WaveneyConsulting.com
Web services Ric...@wavwebs.com http://www.wavwebs.com
Independent Telecomms Specialist, ATM expert, Web Analyst & Services
Hmmm...maybe this is a good time to bring up something that's been
bothering me for a while.
It seems like Perl 6 is moving farther and farther away from Perl 5's
(almost) typelessness. All of a sudden, we are getting into ints,
Ints, Objects, Strs, etc...more and more of the code examples that are
being posted to these lists use type declarations in method
signatures, variable declarations, and anywhere else that they might
squeeze in. It isn't clear to me if this is being done because we are
currently discussing the new types and type-safety mechanisms--all of
which are optional, and only come into play when you request them--or
if it is expected that this will be the new paradigm for Perl
programming.
So...are we intending that types and type safety will be like 'use
strict' (optional and only on request), or will they be like sigils
(mandatory, can't be turned off)? Or, perhaps, on by default but able
to be turned off?
--Dks
All subroutines with multiple signatures would have this problem,
right, even normal non-method subs?
foo $a, $b, $c, $d; # how many args?
So could a general rule be that multimethods (subs or methods with more
than one possible variant) require the parens, but subs with one known
signature do not?
> The solution in both cases is to defer the parameter/argument type
> checking and the method dispatch until run-time. That works fine, but
> only if the compiler can determine exactly how many arguments to pass
> to the method call. For that, it needs either explicit parens, or a
> default rule.
Oof. I had been (foolishly?) hoping that if argument types were known
at parse time (due to previous declarations), it would frequently be
possible to resolve the multimethod variant during compilation. Now
I'm wondering -- because of Perl's ability to add classes/methods at
runtime, for example -- if there are _any_ circumstances in which that
would be true.
How much overhead do we expect (runtime) multimethods to have? I would
guess it to be nontrivial, e.g. substantially worse than normal
methods...
MikeL
You'll note that in my code sample the hash values had no type. What
I specified is no more restrictive than Perl 5, just using something
other than a string for the hash key.
But I know what you mean as far as the rest of it. There will
certainly be no I<mandatory> typing in Perl 6 where there wasn't in
Perl 5. As far as what people will do, well, that's up to people.
They're not going to set social standards. But my guess would be that
the standards will converge on some middle ground between strong typing
and typelessness. You might see things like this:
my Int $count = 0;
while (...) { ... }
But probably nothing C++ish:
my LinkedList::iterator_type $iter = new LinkedList::iterator;
People would probably leave that one untyped, because it's such a
pain. I'm reminded of several Perl mottos simultaneously...
> So...are we intending that types and type safety will be like 'use
> strict' (optional and only on request), or will they be like sigils
> (mandatory, can't be turned off)? Or, perhaps, on by default but able
> to be turned off?
Optional by request, but not explicit request. If you type a
variable, you're asking for type checking on that variable.
Luke
I'd expect a non-trivial overhead to start, declining with time, only
paid when calling methods or subs that could be potentially
multimethod. I know a number of techniques to make the cost smaller,
and they'll get implemented over time, but it'll likely never be
free, at least not for Perl.
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk