In S06, the meaning of chaining comparison operators is defined as a
derived form:
(a < b < c) ==> (a < b) and (b < c)
With the note that "b" must be evaluated at most once. However, if
taken literally, it gives this rather weird result:
pugs> ? 2 < (0 | 3) < 4
(#t|#t)
My intuition is that it should be equivalent to this:
pugs> ? (2 < 0 < 4) | (2 < 3 < 4)
(#f|#t)
That is, the autothreading should operate on the whole comparison chain,
treating it as a large variadic function with short-circuiting semantics.
Is this perhaps "saner" than the blind rewrite suggested in the spec?
Also, consider this:
pugs> ? 1|2 => 3|4
(((1 => 3)|(1 => 4))|((2 => 3)|(2 => 4)))
Since junctions are documented to only "flatten" in boolean context,
here the pair-building arrow has been autothreaded. Is it the intended
semantic? What about the list-building semicolon?
Thanks,
/Autrijus/
pugs> ? 4 < (0 | 6) < 2
(#t|#f)
Why is it so? Because:
4 < (0 | 6) and (0 | 6) < 2
(4 < 0 | 4 < 6) and (0 | 6) < 2 # local autothreading
(#f | #t) and (0 | 6) < 2 # evaluation
#t and (0 | 6) < 2 # reduction in boolean context(!)
(0 | 6) < 2 # short circuitry
(0 < 2 | 6 < 2) # local autothreading
(#t | #f) # evaluation
Sick, eh?
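For readers following along, the two readings can be modelled outside Perl 6. The sketch below is in Python and treats the junction (0 | 6) as a plain set, which is an assumption of the sketch and not how Pugs represents junctions; the names local_result and global_result are made up for illustration.

```python
# Model the any-junction (0 | 6) as a plain set of values (assumption of this sketch).
junction = {0, 6}

# Local autothreading: each binary comparison threads over the junction on
# its own, and "and" collapses its left operand to a boolean first.
local_result = any(4 < v for v in junction) and any(v < 2 for v in junction)

# Global autothreading: the whole chain is evaluated once per junction value.
global_result = any(4 < v < 2 for v in junction)

print("local: ", local_result)
print("global:", global_result)
```

Under the local reading the chain comes out true because 6 satisfies the first comparison and 0 satisfies the second; under the global reading no single value satisfies both.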
Thanks,
/Autrijus/
Why is it allowed to do this?
> (0 | 6) < 2 # short circuitry
> (0 < 2 | 6 < 2) # local autothreading
> (#t | #f) # evaluation
>
> Sick, eh?
Yes, indeed.
This example would work just as well if the local autothreading were done
first on the right-hand side of the "and". Is there an example where this
is not the case? I can't think of one.
Nicholas Clark
(Without understanding the background to the implementation of junctions)
why are you using a short-circuiting "and"?
Surely if you take an expression that contains the junction (a|b) and
convert that to ... a ... and ... b ... then you are implying an order to
the elements of a junction? I didn't think that junctions had order - I
thought that they were sets.
Nicholas Clark
Because "and" forces boolean context to determine whether it
short-circuits or not. However, I should've made it clear that
if the left hand side evaluates to #f, it will return the junction
itself, not #f. This is true in both spec and pugs implementation.
Thanks,
/Autrijus/
It was short-circuiting the "and", collapsing the left hand side
junction into a boolean to determine whether to evaluate the right hand
side comparison. It is not short-circuiting over individual values.
So I agree. All evaluations to sets need to be done fully.
Thanks,
/Autrijus/
OK.
So the question I'm asking, which I think is orthogonal to yours, is
If junctions are sets, and so a|b is identical to b|a, then isn't it wrong
for any implementation of junctions to use any short-circuiting logic in
its implementation, because if it did, then any active data (such as tied
things with side effects) may or may not get called depending on whether a
junction happened to be stored internally with a first, or with b first?
(unless the implementation can prove to itself that nothing it's dealing with
has side effects, so short circuiting will have no effect. Of course, this is
an implementation detail, and aside from speed, must not be determinable
from outside)
Nicholas Clark
Well, because perl5's "and" is short-circuiting, and I assume perl6's
"and" is no exception...
> Surely if you take an expression that contains the junction (a|b) and
> convert that to ... a ... and ... b ... then you are implying an order to
> the elements of a junction? I didn't think that junctions had order - I
> thought that they were sets.
Sure. The question is whether the junctions should autothread over the
whole comparison chain (globally), or only to a specific binary
comparison (locally).
Thanks,
/Autrijus/
NC> If junctions are sets, and so a|b is identical to b|a, then isn't
NC> it wrong for any implementation of junctions to use any
NC> short-circuiting logic in its implementation, because if it did,
NC> then any active data (such as tied things will side effects) may
NC> or may not get called depending on whether a junction happened to
NC> be stored internally with a first, or with b first?
NC> (unless the implementation can prove to itself that nothing it's
NC> dealing with has side effects, so short circuiting will have no
NC> effect. Of course, this is an implementation detail, and aside
NC> from speed, must not be determinable from outside)
it also depends on the junction function :). i would think one() would
need to check all the elements. but i agree with you about a junction
being an unordered set. you have to treat it that way to make sense
since you can add elements to a junction (can someone clue us in with a
code example?). on the other hand we expect 'and' and friends to short
circuit and execute from left to right and that is taken advantage of in
many ways. so i say 'a or b' is not semantically the same as 'a | b' in
that there is no guarantee of order in junctions. but there is no
guarantee of evaluating all of the elements in a junction, it can short
circuit as soon as it can determine a correct boolean result (assuming a
boolean result is wanted).
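The concern about observable side effects can be made concrete with a small Python model. This is only a sketch: the junction is stood in for by a list so that the internal order is visible, and any_junction, one_junction and equals_one are hypothetical helper names, not anything from Perl 6 or Pugs.

```python
calls = []

def equals_one(v):
    # Record every invocation so the evaluation order becomes observable.
    calls.append(v)
    return v == 1

def any_junction(values, test):
    # Lazy: stops at the first value that passes.
    return any(test(v) for v in values)

def one_junction(values, test):
    # Exactly one value must pass; this version checks every element.
    return sum(1 for v in values if test(v)) == 1

any_junction([1, 2, 3], equals_one)
calls_first_order = list(calls)
calls.clear()

any_junction([3, 2, 1], equals_one)
calls_second_order = list(calls)
calls.clear()

one_junction([1, 2, 3], equals_one)
calls_one = list(calls)

print(calls_first_order, calls_second_order, calls_one)
```

The same any-style test touches one element or three depending on storage order, which is the kind of difference a tied value with side effects could expose.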
uri
> pugs> ? 4 < (0 | 6) < 2
> (#t|#f)
>
>
Here's my take on it.
Compare
my $a = (0 | 6);
say 4 < $a and $a < 2;
vs
say 4 < (0 | 6) and (0 | 6) < 2;
The difference is that in the first case the junction refers to the same
object, and the result should probably be expanded only once:
(4 < 0 and 0 < 2) | (4 < 6 and 6 < 2)
while in the second case, we have two different junctions, and each gets
threaded separately:
(4 < 0 and 0 < 2) | (4 < 6 and 6 < 2) | (4 < 0 and 6 < 2) | (4 < 6 and 0 < 2)
The first expansion gives the correct result, while the other is really
a variant on what you have. And I believe this becomes highly dangerous
if you start assigning junctions around. :)
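The two expansions can be checked mechanically. The following Python sketch assumes a junction can be modelled as a set of values; shared and separate are illustrative names only.

```python
from itertools import product

junction = {0, 6}   # stands in for (0 | 6)

# One junction used twice: both comparisons see the same value each time.
shared = any(4 < v and v < 2 for v in junction)

# Two separate junctions with equal contents: every pairing is tried.
separate = any(4 < a and b < 2 for a, b in product(junction, junction))

print("shared:  ", shared)
print("separate:", separate)
```

The pairing (6, 0) is what makes the second form succeed, and it is exactly the cross term that the first expansion never produces.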
Miro
Yup. My mathematical intuition cannot suffer that:
4 < X < 2
to be true in any circumstances -- as it violates associativity.
If one wants to violate associativity, one should presumably *not*
use the chained comparison notation!
So Pugs will evaluate that to (#f|#f), by lifting all junctions
out of a multiway comparison, and treat the comparison itself as
a single variadic operator that is evaluated as a chain individually.
This way, both associativity and junctive dimensionality hold, so
I think it's the way to go. Please correct me if you see serious
flaws with this approach.
Thanks,
/Autrijus/
Indeed. It smells like "relational" algebra, so we can confirm this
intuition with a rather familiar example:
select * from $a cross join $b cross join $c
where a < b and b < c
Looks right to me! I look forward to the SQL query construction
modules in Perl6. :)
Ashley
Feels right to me.
Larry
>Yup. My mathematic intuition cannot suffer that:
>
> 4 < X < 2
>
>to be true in any circumstances -- as it violates associativity.
>If one wants to violate associativity, one should presumably *not*
>use the chained comparison notation!
>
>So Pugs will evaluate that to (#f|#f), by lifting all junctions
>out of a multiway comparison, and treat the comparison itself as
>a single variadic operator that is evaluated as a chain individually.
>
>
>
I think this is correct, however... this is not what I meant in my
comment. Note I didn't use chained comparison anywhere.
What I meant is that for any form with two parameters (in the example, 4
< ___ and ___ < 2), apparently it's not the same whether the two
parameters refer to the same junction or to two equal (but distinct)
junctions.
Miro
Well, we see the same kind of thing with standard interval arithmetic:
(-1, 1) * (-1, 1) = (-1, 1)
(-1, 1) ** 2 = [0, 1)
The reason that junctions behave this way is because they don't
collapse. You'll note the same semantics don't arise in
Quantum::Entanglement (when you set the "try to be true" option).
But you can force a collapse like this:
my $x = 4 < $j;
if $j < 2 { say "never executed" }
I'm wondering if we should allow a method that returns a junction that is
allowed to collapse the original:
if 4 < $j.collapse and $j.collapse < 2 {
say "never executed";
}
But that's probably not a good idea, just by looking at the
implementation complexity of Quantum::Entanglement. People will just
have to learn that junctions don't obey ordering laws.
Luke
By which I mean:
my $x = 4 < $j;
if $x < 2 { say "never executed" }
Luke
>>Well, we see the same kind of thing with standard interval arithmetic:
>>
>> (-1, 1) * (-1, 1) = (-1, 1)
>> (-1, 1) ** 2 = [0, 1)
>>
>>The reason that junctions behave this way is because they don't
>>collapse. You'll note the same semantics don't arise in
>>Quantum::Entanglement (when you set the "try to be true" option).
>>
>>But you can force a collapse like this:
>>
>> my $x = 4 < $j;
>> if $j < 2 { say "never executed" }
>>
>>
>
>By which I mean:
>
> my $x = 4 < $j;
> if $x < 2 { say "never executed" }
>
>
>
Uh, I'm not sure this does what I think you wanted to say it does. ;) $x
is a boolean, unless < returns a magical object... in which case, the
magical part of $x ought to be a reference to the original $j, no?
>>I'm wondering if we should allow a method that returns a junction that is
>>allowed to collapse the original:
>>
>> if 4 < $j.collapse and $j.collapse < 2 {
>> say "never executed";
>> }
>>
>>But that's probably not a good idea, just by looking at the
>>implementation complexity of Quantum::Entanglement. People will just
>>have to learn that junctions don't obey ordering laws.
>>
>>
Well, I suspect that junctions will have to be references and just
collapse every time. Observe:
my $x = any(1, 2, 3, 4, 5);
print "SHOULD NOT RUN" if (is_prime($x) && is_even($x) && $x > 2);
This only works if $x collapses. Same for matching junctioned strings:
my $a = any (<a b c>);
print "Boo!" if $a ~ /a/ and $a ~ /b/ and $a ~ /c/;
(perhaps I meant to use ~~, I don't quite remember :) )
Either way, autocollapsing junctions is a Good Thing IMHO, and the only
remaining confusion (to go back to my initial post) is that the only
case that doesn't work is when you instantiate a junction twice as a pair
of identical literals:
print "SUCCESS, unfortunately" if (is_prime(any(1, 2, 3, 4, 5)) &&
is_even(any(1, 2, 3, 4, 5)) && any(1, 2, 3, 4, 5) > 2);
Hope I'm making sense. Been a hard day at work. ;)
Miro
What if junctions collapsed into junctions of the valid options under
some circumstances, so
my $x = any(1,2,3,4,5,6,7);
if(is_prime($x) # $x = any(2,3,5,7)
and is_even($x) # $x = any(2)
and $x > 2) # $x = any()
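The narrowing idea can be traced step by step in Python. This is a sketch under the assumption that a junction is a set and that each test keeps only the members that pass; is_prime and is_even are written out here because the thread never defines them.

```python
def is_prime(n):
    # Trial division; enough for the small values used here.
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_even(n):
    return n % 2 == 0

x = {1, 2, 3, 4, 5, 6, 7}          # stands in for any(1,2,3,4,5,6,7)

after_prime = {v for v in x if is_prime(v)}          # narrowed by is_prime
after_even = {v for v in after_prime if is_even(v)}  # narrowed again by is_even
after_gt2 = {v for v in after_even if v > 2}         # narrowed by > 2

# An empty junction means the whole condition fails.
condition = bool(after_gt2)
print(sorted(after_prime), sorted(after_even), sorted(after_gt2), condition)
```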
Matt
unrelated to the overall topic, shouldn't this be
(a < b < c) ==> (a < b) and (b < c) and (a < c)
anyway? Sorry if I missed this discussed previously.
--Brock
This is Just Wrong, IMO. How confusing is it going to be to find that
calling is_prime($x) modifies the value of $x despite it being a very
simple test operation which appears to have no side effects?
As far as I can see it, in the example, it's perfectly logical for
is_prime($x), is_even($x) and $x > 2 to all be true, because an any()
junction was used. If an all() junction was used it would be quite a
different matter of course, but I would see is_prime() called on an
any() junction as returning true the moment it finds a value inside that
junction which is prime. It doesn't need to change $x at all.
In a way, you're sort of asking 'has $x got something that has the
characteristics of a prime number?' and of course, $x has - several of
them, in fact (but the count is not important).
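The non-mutating reading described above can be sketched in Python as three independent "does any member pass?" questions. The set model and the helper definitions are assumptions of the sketch.

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_even(n):
    return n % 2 == 0

x = frozenset({1, 2, 3, 4, 5, 6, 7})   # stands in for any(1,2,3,4,5,6,7)

# Each test asks its own question of the junction; nothing is narrowed.
some_prime = any(is_prime(v) for v in x)
some_even = any(is_even(v) for v in x)
some_gt2 = any(v > 2 for v in x)

result = some_prime and some_even and some_gt2
print(result, sorted(x))
```

Here the combined condition is true even though no single member is prime, even and greater than 2, and x is left exactly as it was.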
Sometimes (frequently, in fact) I will check for preconditions at the
beginning of a function. After I have finished checking for them, I expect
to be able to do stuff assuming them without any worry of exceptions or
anything else. In these cases I am using conditionals to filter
input, which I imagine is a fairly common case...
It didn't bother me that junctions weren't ordered transitively.
(Ordering had better work transitively for ordinary numbers, but
junctions aren't numbers.) Yes, the 4<(0|6)<2 thing doesn't quite
DWIM, but maybe I should just mean something else.
Except then I started thinking about why, and I decided that it
*should* DWEveryoneM. Funny things that aren't numbers can still be
ordered intransitively, but the reason that this doesn't work the way
we want is not so much because junctions are something funny, but
because < is something funny.
That is, x<y<z does not mean "y is between x and z", which is how
most people probably read it. Instead, because of chain association,
it means "x<y and y<z". Our problem then comes from using y in two
different terms instead of a single 'between' term: If we had a
'between' operator, "y between [x,z]" would work even for junctions.
The chain-association thing puzzled me a bit, because in the back of
my mind I was still thinking of comparative operators as doing their
chainy magic by returning a value to be applied to the following term
(sort of like +) -- no 'and's involved. In 4<6<2, the 4<6 returns "6
but true", and then we're left with 6<2, which is false.
Similarly, 4<(0|6)<2 first evaluates 4<(0|6) aka 4<0 | 4<6.
Booleating a junction returns the elements that fit, in this case 6.
Then we move on to evaluate 6<2, which again is false, just as we
wanted.
<http://groups-beta.google.com/group/perl.perl6.language/browse_thread/thread/41b18e5920ab2d78/4b24a002ab4ff9c9>
Quoting from the first message in that thread:
>If a boolean operation is applied in non-boolean context to a
>junction, it will return a junction of only the values for which the
>condition evaluated true. [...]
>
>* Junctions discard C<undef>s on-sight (i.e. C<undef> can never be a
> member of a junction).
>* Comparison operators return C<undef> if they fail, and their
> left-hand side "C<but true>" if they succeed.
>* Comparison operators are right-associative.
Oops, my example above assumed < was left-associative and returned
its right-hand side. But doesn't flipping it around that way solve
this:
>Unfortunately, the last change clobbers the left-short-circuiting
>behavior of comparison operators.
or are there funny edge cases or something that I'm not seeing?
Anyway, the above-cited thread presents two versions of junctions and
comparative operators, and I prefer the Lucian interpretation to the
Damianian. It seems a bit more mathematical and a bit less
specially-cased; both good things in my book.
>But you can force a collapse like this:
> my $x = 4 < $j;
> if $x < 2 { say "never executed" }
Because then we're explicitly taking the result of the first
comparison and applying it to the second. Which is what 4<$j<2 ought
to mean anyway. I think Autrijus's solution for Pugs is to treat <
as list-associative (with a between[-like] function as the list op),
but I'm not sure how that works when you try to mix different
comparative operators.
If every boolean function returns the junctive elements that make it
true, then they can be chained indiscriminately:
is_even(is_prime(1|2|3|4|5))>2
means is_even(2|3|5)>2
means (2)>2
means false.
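A rough Python model of that chaining, assuming every boolean function hands back the set of members that made it true (passing is a hypothetical helper, and the predicates are spelled out because the thread leaves them undefined):

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_even(n):
    return n % 2 == 0

def passing(values, test):
    """Return the members of `values` for which `test` holds."""
    return {v for v in values if test(v)}

step1 = passing({1, 2, 3, 4, 5}, is_prime)     # is_prime(1|2|3|4|5)
step2 = passing(step1, is_even)                # is_even(...)
step3 = passing(step2, lambda v: v > 2)        # ... > 2

print(sorted(step1), sorted(step2), sorted(step3), bool(step3))
```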
Incidentally, is there a way to decompose a junction? Something like
a .conjunctions method that returns a list of all the conjunctive
elements (if it isn't a conjunction, then it would return a single
element which is the whole thing).
$petticoat=(1 | 2 | 3 | (4&5));
$petticoat.conjunctions; # list containing (1|2|3|(4&5))
$petticoat.disjunctions; # list containing 1, 2, 3, (4&5)
($petticoat.disjunctions)[-1].conjunctions; # list 4, 5
- David "guilty by list association" Green
Yes, it is fairly common, but I don't think it's common enough to attach
unexpected side-effects to innocent-seeming functions. If I want to
modify a junction to contain only values which satisfy a given
precondition, I'll be wanting to use something which states that explicitly.
Which reminds me that I'm not very aware of anything which can decompose
a junction once it's been created, which would be fairly necessary for
doing that sort of thing. If you can turn junctions into lists then
precondition filtering isn't bad at all. Something like
my $filtered = any($junction.list().grep({ satisfies_precondition }));
Of course, I just invented that out of nothing so I may be way off base.
I haven't found anything in any Apocalypse or Synopsis which says if you
can do things like this, but if Perl itself can pick junctions apart, we
should be able to as well.
My approach to this comes somewhat from my basis in liking Haskell a lot
and thus wanting to keep unusual side-effects to a minimum. However, if
junctions collapse and modify themselves as was in your example, what
happens when you don't want to have them collapse? Do you have to
explicitly make copies?
>> What if junctions collapsed into junctions of the valid options under
>> some circumstances, so
>>
>> my $x = any(1,2,3,4,5,6,7);
>> if(is_prime($x) # $x = any(2,3,5,7)
>> and is_even($x) # $x = any(2)
>> and $x > 2) # $x = any()
>
>
> This is Just Wrong, IMO. How confusing is it going to be to find that
> calling is_prime($x) modifies the value of $x despite it being a very
> simple test operation which appears to have no side effects?
>
> As far as I can see it, in the example, it's perfectly logical for
> is_prime($x), is_even($x) and $x > 2 to all be true, because an any()
> junction was used. If an all() junction was used it would be quite a
> different matter of course, but I would see is_prime() called on an
> any() junction as returning true the moment it finds a value inside
> that junction which is prime. It doesn't need to change $x at all.
>
> In a way, you're sort of asking 'has $x got something that has the
> characteristics of a prime number?' and of course, $x has - several of
> them, in fact (but the count is not important).
>
Well, yes, unexpected side-effects are not so great, however, in this
case they're sequestered behind junctions. In fact, the other post
suggested using implicit backtracking for this (something that can have
a real problem with *expected* side-effects). If you just think of
junctions as 'Just Works', side effects are implementation detail.
To address your idea, problem is, you generally don't know whether
you've been passed a junction (short of very specific type query), and
writing code without being able to rely on the fact that (is_prime($x)
&& !is_prime($x)) == false is Just Plain Evil. For example, something
as simple as
if (is_prime($x)) { ... }
else { ... }
may be buggy if $x is a junction. To make it work correctly, you will
want to write
if (is_prime($x)) { ... }
if (!is_prime($x)) { ... }
Evil, no? :)
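To see why the two forms differ, here is a small Python sketch. It assumes an any-junction tests true when at least one member passes and that negation threads over the members; the values 3 and 4 are picked purely for illustration.

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

x = {3, 4}   # stands in for any(3, 4); hypothetical values

# The "if" test and the negated test are asked independently of the junction.
passes_test = any(is_prime(v) for v in x)
passes_negated_test = any(not is_prime(v) for v in x)

print(passes_test, passes_negated_test)
```

Both questions come out true, so an if/else pair would run only the first branch while the two separate ifs would run both.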
Miro
---------- Forwarded message ----------
From: Thomas Yandell <thomas...@gmail.com>
Date: Thu, 10 Feb 2005 10:22:44 +0000
Subject: Re: Junctive puzzles.
To: Matthew Walton <mat...@alledora.co.uk>
> > What if junctions collapsed into junctions of the valid options under
> > some circumstances, so
> >
> > my $x = any(1,2,3,4,5,6,7);
> > if(is_prime($x) # $x = any(2,3,5,7)
> > and is_even($x) # $x = any(2)
> > and $x > 2) # $x = any()
>
> This is Just Wrong, IMO. How confusing is it going to be to find that
> calling is_prime($x) modifies the value of $x despite it being a very
> simple test operation which appears to have no side effects?
>
> As far as I can see it, in the example, it's perfectly logical for
> is_prime($x), is_even($x) and $x > 2 to all be true, because an any()
> junction was used. If an all() junction was used it would be quite a
> different matter of course, but I would see is_prime() called on an
> any() junction as returning true the moment it finds a value inside that
> junction which is prime. It doesn't need to change $x at all.
>
> In a way, you're sort of asking 'has $x got something that has the
> characteristics of a prime number?' and of course, $x has - several of
> them, in fact (but the count is not important).
>
Is it perhaps the comments that are wrong, rather than the code?
my $x = any(1,2,3,4,5,6,7);
if(is_prime($x) # expression evaluates to any(2,3,5,7)
and is_even($x) # expression evaluates to any(2, 4, 6)
# at this point the boolean expression evaluates to any(2) - is this
# the same as 2?
and $x > 2) # expression evaluates to any(3,4,5,6,7)
# so result is false
# $x is still any(1,2,3,4,5,6,7)
Is this right?
Is the following comment correct?
my $x = any(2,3,4,5) and any(4,5,6,7); # $x now contains any(4,5)
Tom
Short answer: I don't think so.
Long answer: I tend to get very lost when dealing with junctions, so
I can be completely wrong. However, watch the precedence and meanings
of the operators here -- I would think that
my $x = any(2,3,4,5) and any(4,5,6,7);
results in $x containing any(2,3,4,5), just as
my $y = 2 and 3;
results in $y containing 2 (since C<and> has lower precedence than C<=>).
Even if you fixed the =/and precedence with parens, to read
my $x = (any(2,3,4,5) and any(4,5,6,7));
then I think the result is still that $x contains any(4,5,6,7).
It gets interpreted as (from S09):
$x = any( 2 and 4, # 4
2 and 5, # 5
2 and 6, # 6
2 and 7, # 7
3 and 4, # 4
3 and 5, # 5
# etc...
5 and 6, # 6
5 and 7, # 7
);
which ultimately boils down to any(4,5,6,7).
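The pairwise expansion is easy to reproduce in Python, whose "and" also returns its right operand when the left one is true. Modelling the junctions as lists and the result as a set is an assumption of this sketch.

```python
left = [2, 3, 4, 5]     # any(2,3,4,5)
right = [4, 5, 6, 7]    # any(4,5,6,7)

# Thread "and" over every pairing; each left value is true, so every
# pairing yields its right-hand value.
threaded = {a and b for a in left for b in right}

print(sorted(threaded))
```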
Pm
>Even if you fixed the =/and precedence with parens, to read
>
> my $x = (any(2,3,4,5) and any(4,5,6,7));
>
>then I think the result is still that $x contains any(4,5,6,7).
>
>
Funny. I thought $x would contain 'true' here, since C<and> was a
boolean operator. But I could be very wrong.
The overall impression I'm getting here is that we need some syntax for
saying:
$x = any(1..1000) such_that is_prime($x);
where "such_that" acts as a form of "junctive grep". so the above might
mean the same as:
$x = any(1..1000 ==> grep(is_prime($_)));
We then can say that any junction stored in a var stays constant, until
explicitly reassigned. Just like every other kind of thing we store.
Philosophy Question:
What's the difference between a junction and an array containing all the
possible values of the junction? Other than how they are used, of
course. So, on that train of thought, would this make sense:
if $x == @x.any {...}
if $x == @x.none {...}
If this is the case, then this entire discussion collapses into how to
best convert arrays into junctions and junctions into arrays. Perl's
existing abilities to edit arrays should be more than sufficient for
editing junctions.
-- Rod Adams
> The overall impression I'm getting here is that we need some syntax for
> saying:
>
> $x = any(1..1000) such_that is_prime($x);
In standard Perl 6 that'd be:
$x = any(grep {is_prime $^x} 1..1000);
or, if you prefer your constraints postfixed:
$x = any( (1..1000).grep({is_prime $^x}) );
If you really wanted a "such that" operator you could certainly create one
yourself:
multi sub *infix:<such_that> (Junction $j, Code $constraint) {
return $j.type.new(grep $constraint, $j.values);
}
$x = any(1..1000) such_that {is_prime $^x};
Though, personally, I think a C<.where> method with an adverbial block might
be neater:
multi method Junction::where (Junction $j: *&constraint) {
return $j.type.new(grep &constraint, $j.values);
}
$x = any(1..1000).where:{is_prime $^x};
# or...
$x = where any(1..1000) {is_prime $^x};
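As a language-neutral illustration of the same idea, here is a Python sketch of a junction that remembers its kind and can be filtered into a new junction of that kind. The Junction class and its where method are hypothetical and exist only in this sketch; is_prime is written out because the thread does not define it.

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

class Junction:
    """Toy junction: a kind ("any", "all", ...) plus a set of values."""

    def __init__(self, kind, values):
        self.kind = kind
        self.values = frozenset(values)

    def where(self, constraint):
        # Build a new junction of the same kind from the members that pass;
        # the original junction is left unchanged.
        return Junction(self.kind, (v for v in self.values if constraint(v)))

candidates = Junction("any", range(1, 1001))
primes = candidates.where(is_prime)

print(primes.kind, len(primes.values))
```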
> We then can say that any junction stored in a var stays constant, until
> explicitly reassigned. Just like every other kind of thing we store.
Yep. That's exactly what we'll be saying!
> Philosophy Question:
>
> What's the difference between a junction and an array containing all the
> possible values of the junction?
Junctions have an associated boolean predicate that's preserved across
operations on the junction. Junctions also implicitly distribute across
operations, and rejunctify the results.
> So, on that train of thought, would this make sense:
>
> if $x == @x.any {...}
> if $x == @x.none {...}
Probably. It's entirely possible that, in addition to being built-in list
operators, C<all>, C<any>, C<one>, and C<none> are also multimethods on
Scalar, Array, and List.
Damian
> On Thu, Feb 10, 2005 at 10:42:34AM +0000, Thomas Yandell wrote:
>
>>Is the following comment correct?
>>
>>my $x = any(2,3,4,5) and any(4,5,6,7); # $x now contains any(4,5)
>
>
> Short answer: I don't think so.
>
> Long answer:
<decloak>
Patrick is right on the money here...as usual. (Don't you just *love* that in
the guy whose job it is to actually make this stuff work! ;-)
Damian
<recloak>
> Rod Adams wrote:
>
>> The overall impression I'm getting here is that we need some syntax
>> for saying:
>>
>> $x = any(1..1000) such_that is_prime($x);
>
>
> In standard Perl 6 that'd be:
>
> $x = any(grep {is_prime $^x} 1..1000);
>
> or, if you prefer your constraints postfixed:
>
> $x = any( (1..1000).grep({is_prime $^x}) );
Both of those seem way too brutal to me.
>
>> We then can say that any junction stored in a var stays constant,
>> until explicitly reassigned. Just like every other kind of thing we
>> store.
>
>
> Yep. That's exactly what we'll be saying!
Good.
>
>> Philosophy Question:
>>
>> What's the difference between a junction and an array containing all
>> the possible values of the junction?
>
>
> Junctions have an associated boolean predicate that's preserved across
> operations on the junction. Junctions also implicitly distribute
> across operations, and rejunctify the results.
My brain is having trouble fully grasping that. Let me attempt a paraphrase:
Junctions exist to be tested for something.
When a test is performed, the junction is evaluated in terms of that
test. A "result junction" is created, which contains only the elements
of the original junction which will pass that given test. If the result
junction is empty, the test fails.
Looking at the S09 C<< substr("camel", 0|1, 2&3) >> example explains a lot.
>
>> So, on that train of thought, would this make sense:
>>
>> if $x == @x.any {...}
>> if $x == @x.none {...}
>
>
> Probably. It's entirely possible that, in addition to being built-in
> list operators, C<all>, C<any>, C<one>, and C<none> are also
> multimethods on Scalar, Array, and List.
okay.
------
Now that I've gotten some feedback from my original message (on list and
off), and have had some time to think about it some more, I've come to
some conclusions:
Junctions are Sets. (if not, they would make more sense if they were.)
Sets are not Scalars, and should not be treated as such.
If we want Sets in Perl, we should have proper Sets.
Let's first define what a Set is:
- A Set is an unordered collection of elements in which duplicates are
ignored.
- There are really only two questions to ask of a Set: "Is X a member of
you?", and "What are all your members?"
- Typically, all members of a set are of the same data type. (I'm in no
way committed to this being part of the proposal, but it makes sense if
it is)
Sets and Lists are two different things. Lists care about order and
allow duplicates. Iterating a Set produces a List, and one can convert a
List into a Set fairly easily.
Sets and Hashes are quite similar, but in other ways different. The keys
of a Hash are a Set of type String. In addition to the String
constraint, each element of the set has an additional scalar value
associated with it. Hashes can be multidimensioned. I have no idea what
a multidimensional Set is. It may be possible to represent Sets as
lightweight Hashes if the "Strings for keys" constraint is lifted or
altered, but I see several advantages to Sets being distinct, for
reasons I'll outline below.
So I propose making Sets a first class data type right up there with
Arrays, Hashes, and Scalars. For the purposes of this posting, I will
assume that they get a sigil of #. (to get a comment, you need
whitespace after the #). I harbor no expectations that it will stay as
this, but I needed something, and didn't feel like remapping my keyboard
at the moment. Interestingly, on US keyboards, @#$% is shift-2345.
With that, we can now make existing operators do nifty things:
#a = #b + #c; # Union (Junctive $b or $c)
#a = #b - #c; # Difference ( $b and not $c)
#a = #b * #c; # Intersection ( $b and $c)
#a == #b; # Do sets contain the same values?
#a < #b; # Is #a a subset of #b? likewise >, <=, >=
$a ~~ #b; # Membership
#a += $b; # Add value to set
#a = @b; # Create a set from an array/list
#a = (1,2,3);
$ref = #{1..10}; # an anonymous Set reference
@a = #b; # One way to iterate the members.
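For comparison, most of these operations already exist on Python's built-in set type, though with different spellings. The mapping in the comments is my own reading of the proposal, not part of it.

```python
b = {1, 2, 3, 4}
c = {3, 4, 5}

union = b | c                    # proposed: #b + #c
difference = b - c               # proposed: #b - #c
intersection = b & c             # proposed: #b * #c
same = (b == c)                  # proposed: #a == #b
subset = ({3, 4} <= b)           # subset test (Python's < means proper subset)
member = 3 in b                  # proposed: $a ~~ #b
extended = b | {9}               # proposed: #a += $b
from_list = set([1, 2, 2, 3])    # proposed: #a = @b (duplicates are dropped)
as_list = sorted(from_list)      # proposed: @a = #b

print(union, difference, intersection, same, subset, member)
```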
It's probably best to define binary | and & as "Set Creation with
Union/Intersection", so we have:
#a = 1|3|7;
#a + @b == #a | @b;
We also add some methods here and there:
@a = #b.values;
#b = @a.as_set;
$a = #b.elems;
my str #a = %b.keys;
I also envision "virtual sets", which cannot be iterated, but can be
tested for membership against. These would be defined by a closure or
coderef.
#natural_numbers = { $^a == int($^a) && $^a > 0 };
#primes = &is_prime;
Set operations with virtual sets should be able to define new closures
based on the former ones:
#a = #b + #c; ==> #a = {$^a ~~ #b || $^a ~~ #c};
#a = #b * #c; ==> #a = {$^a ~~ #b && $^a ~~ #c};
#a = #b - #c; ==> #a = {$^a ~~ #b && $^a !~ #c};
#a = #b + 3; ==> #a = {$^a == 3 || $^a ~~ #b};
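The closure rewriting above can be sketched in Python by treating a virtual set as nothing more than a membership predicate. The function names union, intersection and difference are illustrative only.

```python
def union(b, c):
    return lambda x: b(x) or c(x)

def intersection(b, c):
    return lambda x: b(x) and c(x)

def difference(b, c):
    return lambda x: b(x) and not c(x)

def natural_numbers(x):
    # Membership test only; the set itself can never be enumerated.
    return x == int(x) and x > 0

def is_even(x):
    return x % 2 == 0

even_naturals = intersection(natural_numbers, is_even)
odd_naturals = difference(natural_numbers, is_even)
even_or_three = union(is_even, lambda x: x == 3)   # like #b + 3

print(even_naturals(4), odd_naturals(7), even_or_three(3))
```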
So now, some sample code:
$x = any( (1..1000).grep({is_prime $^x}) ); # Damian's example from above
vs
#x = (1..1000) * #primes;
if $x == 1|2|3 {...}
vs
if $x ~~ 1|2|3 {...}
or
if $x ~~ #{1,2,3} {...}
$x = (any(2,3,4,5) and any(4,5,6,7));
# where $x == any(2,3,4,5)
vs
#x = #{2,3,4,5} * #{4,5,6,7};
# where #x == #{4,5}
if 3 < $a < 4 {...} # where $a is junctive
# given the length of discussion here, this is a very confusing concept.
vs
if 3 < #a.max && #a.min < 4 {...}
# or whatever semantics you actually expected!
for #a -> $a {...} # if non-virtual
# might auto-thread
I'll be glad to provide other sample conversions if requested.
As for the original purpose of Junctives, back whenever they were first
presented, one can declare that the type of an object is in fact a Set
of types, which may be just one element, or more. (or even none?!?) You
would want to restrict it to a Set of Class, or some such thing, so you
don't get someone attempting to create an object of type 3.8.
Why I like Sets better than Junctions:
- No surprises. You know exactly what you're getting at any given point.
- Sets are very common computing concepts, so most programmers will
understand them immediately.
- The resulting code is much easier to read and write, IMHO.
Why they merit a new sigil:
- They are structurally and semantically different from Scalars, Arrays,
and Hashes.
- They are integral to the language design, and basic enough to make
shuffling them into a separate class dubious at best.
-- Rod Adams
Converting an array into a junction is easy, use C<any> or C<all>:
$x = any(@array);
$y = all(@array);
"Perl 6 and Parrot Essentials" says that to convert a
junction into a flat array, you use the C<.values> method. (I didn't
find an equivalent statement in the synopses/apocalypses/exegeses.)
Pm
The boolean form of C<and> is C<?&> .
C<and> is the low-precedence version of C<&&>.
Pm
---------- Forwarded message ----------
From: Thomas Yandell <thomas...@gmail.com>
Date: Fri, 11 Feb 2005 09:40:03 +0000
Subject: Re: Fwd: Junctive puzzles.
To: "Patrick R. Michaud" <pmic...@pobox.com>
If only I could just do something like:
perl6 -MData::Dumper -e 'print Dumper(any(2,3,4,5) && any(4,5,6,7))'
...then I could easily find out for myself. Until that happy day I
will have to ask you guys to clear it up for me.
Is there another operator that takes the intersection of two
junctions, such that any(2,3,4,5) *some op* any(4,5,6,7) would result
in any(4,5)?
Tom
Seems today is indeed that happy day:
% wget -m -np http://svn.openfoundry.org/pugs/
% cd svn.openfoundry.org/pugs
% perl Makefile.PL
% make
% ./pugs -e "(any(2,3,4,5) && any(4,5,6,7)).perl"
((4 | 5 | 6 | 7))
Thanks,
/Autrijus/
Thanks,
Tom
Yes. In Pugs 6.0.3 (released one minute ago), that operator is
simply called "&":
% ./pugs -e "(any(2,3,4,5) & any(4,5,6,7)).perl"
((2 | 3 | 4 | 5 | 6 | 7))
That is, the "&" builder now automagically collapses nested
junctions under it. I intend to fill in the rest of the collapse
logic tomorrow, after some feedback from the list.
The question I'd like to ask is: is this kind of collapsing desired?
If yes, how far should it go? Should it only be done when it results in
reduced dimensions (i.e. from (Junctions of Junctions) to (Junctions)),
or whenever duplicated value can be collapsed? Consider:
all( one(1, 2), one(2, 3) )
Should it be collapsed into:
any( one(1, 3), 2 )
so that "2" now only occurs once? Is it sane?
Thanks,
/Autrijus/
On Fri, Feb 11, 2005 at 01:22:51PM -0600, Patrick R. Michaud wrote:
> # return true if $x is a factor of $y
> sub is_factor (Scalar $x, Scalar $y) { $y % $x == 0 }
> [...]
> # a (somewhat inefficient?) is_prime test for $bar
> if is_factor(none(2..sqrt($bar)), $bar) { say "$bar is prime"; }
>
> Just because I'm curious, here's the the prime test spelled out for
> $bar==23, testing if 23 is a prime number:
>
> is_factor(none(2..sqrt(23)), 23)
>
> -> is_factor(none(2..4), 23)
>
> -> { 23 % none(2..4) == 0 }
> [...]
...and here's where I went awry. According to S09, the derivation
should instead be
-> is_factor(none(2..4), 23)
-> none( is_factor(2, 23), is_factor(3, 23), is_factor(4, 23) )
-> none( 0, 0, 0 )
-> true
In other words, the junction isn't passed to the function, but
instead the function is "autothreaded" to be called individually
for each value in the junction.
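A rough model of that derivation, written in Python purely as a neutral notation (it is not Perl 6). The helper name autothread_none is hypothetical, and the sketch assumes none() is true exactly when no threaded call returns true:

```python
import math

def is_factor(x, y):
    """Return True if x is a factor of y."""
    return y % x == 0

def autothread_none(func, values, *rest):
    """Call func once per junction value, then apply none() to the results."""
    results = [func(v, *rest) for v in values]
    return results, not any(results)

bar = 23
candidates = range(2, math.isqrt(bar) + 1)   # 2..4, as in none(2..sqrt(23))
results, is_prime = autothread_none(is_factor, candidates, bar)
print(results, is_prime)
```

The function body only ever sees plain scalars; the junction is rebuilt from the individual return values.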
S09 also says:
In any scalar context not expecting a junction of values, a
junction produces automatic parallelization of the algorithm.
I briefly grepped through the apocalypses/synopses and couldn't
find the answer -- how do I tell a scalar context to expect a
junction of values? In particular, is there a way for me to pass
a junction to a routine without it autothreading and without having
to bury the junction in an array or some other structure?
Pm
From "Perl 6 and Parrot Essentials":
A junction is basically just an unordered set with a logical
relation defined between its elements. Any operation on the
junction is an operation on the entire set.
"Tests" are not used to select elements from the junction, a "test"
(such as a relational op) is simply applied to the elements of the
junction(s) and returns the junction of the results. In other words,
if you think of a "test" as being a boolean operator, then applying
a boolean operator to a junction is going to return a junction of
true/false values, because boolean operators return true/false values.
For example, with the "less than or equals" (<=) relational operator,
the expression
any(2,3,4) <= 3
becomes
any( 2 <= 3, # 1 (true)
3 <= 3, # 1 (true)
4 <= 3 # 0 (false)
)
which ultimately becomes any(1,0), because <= is an operator that
returns booleans. In plain English, we're asking "Are any of the values
2, 3, or 4 less than or equal to 3?", and the answer is "yes", because
any(1,0) evaluates to true in a boolean context. Note that it does
*not* become the junction of the values that were less than or equal to 3
-- for that we would use C<grep>.
Similarly, consider
all(2,3,4) <= 3
which becomes
all( 2 <= 3, # 1 (true)
3 <= 3, # 1 (true)
4 <= 3 # 0 (false)
)
or all(1,0). Here, the English question is "Are all of the values
2, 3, and 4 less than or equal to 3?", and the answer is "no" (and
all(1,0) evaluates to false in a boolean context).
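The same two expansions can be sketched in Python (a toy model, not Perl 6; the helper name thread is made up for illustration). The comparison is applied to every value, and only the final collapse differs between any() and all():

```python
def thread(values, op):
    """Apply op to each junction value, keeping one result per value."""
    return [op(v) for v in values]

values = [2, 3, 4]
results = thread(values, lambda v: v <= 3)   # one boolean per value

any_result = any(results)    # models: any(2,3,4) <= 3
all_result = all(results)    # models: all(2,3,4) <= 3

# What the junction does NOT give you: the values that passed.
# That is the job of grep, modelled here by a filter.
selected = [v for v in values if v <= 3]
print(results, any_result, all_result, selected)
```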
Okay, so how is this useful? Here's an example (w/apologies for any
inadvertent syntax errors):
if (any(@age) >= 100) { say "There's a centenarian here!"; }
is somehow a lot nicer than
if (grep { $^x >= 100 } @age) { say "There's a centenarian here!"; }
And
if (all(@age) >= 100) { say "We're all centenarians here!"; }
is much nicer than
if (!(grep { $^x < 100 } @age)) { say "We're all centenarians here!"; }
But wait, there's more! Junctions are valuable because we can combine
them into multiple operands or function arguments. Thus,
# intersection: Are any values in @foo also in @bar?
any(@foo) == any(@bar)
# containment: Are all of the elements in @foo also in @bar?
all(@foo) == any(@bar)
# non-intersection: Are all of the elements in @foo not in @bar?
all(@foo) == none(@bar)
Here's that last one spelled out to see the effects, assuming @foo=(2,3,4)
and @bar=(5,6):
all(2,3,4) == none(5,6)
-> all( 2 == none(5,6),
3 == none(5,6),
4 == none(5,6)
)
-> all( none(2==5, 2==6),
none(3==5, 3==6),
none(4==5, 4==6)
)
-> all( none(0, 0),
none(0, 0),
none(0, 0)
)
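For anyone who wants to check the three idioms mechanically, here is a small Python model (not Perl 6). It assumes the left-hand junction is threaded first, as in the expansion above; the names collapse and compare are hypothetical:

```python
def collapse(kind, bools):
    """Reduce a list of booleans according to the junction kind."""
    n = sum(1 for b in bools if b)
    if kind == "any":
        return n >= 1
    if kind == "all":
        return n == len(bools)
    if kind == "one":
        return n == 1
    if kind == "none":
        return n == 0
    raise ValueError(kind)

def compare(left_kind, left, right_kind, right):
    """Model left_kind(left) == right_kind(right), threading the left side first."""
    return collapse(
        left_kind,
        [collapse(right_kind, [x == y for y in right]) for x in left],
    )

foo, bar = [2, 3, 4], [5, 6]
intersects = compare("any", foo, "any", bar)    # any(@foo) == any(@bar)
contained = compare("all", foo, "any", bar)     # all(@foo) == any(@bar)
disjoint = compare("all", foo, "none", bar)     # all(@foo) == none(@bar)
print(intersects, contained, disjoint)
```

With @foo=(2,3,4) and @bar=(5,6) every inner none(...) is true, so the outer all(...) is true as well.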
Of course, the value of junctions is that they work pretty much
with any operation on scalar arguments. Thus, if we define an
is_factor() function as:
# return true if $x is a factor of $y
sub is_factor (Scalar $x, Scalar $y) { $y % $x == 0 }
then we automatically get:
# are any of @foo factors of $bar?
if is_factor(any(@foo), $bar) { ... }
# are all of @foo factors of $bar?
if is_factor(all(@foo), $bar) { ... }
# is $foo a factor of any elements of @bar?
if is_factor($foo, any(@bar)) { ... }
# is $foo a factor of all elements of @bar?
if is_factor($foo, all(@bar)) { ... }
# are any elements of @foo factors of any elements of @bar?
if is_factor(any(@foo), any(@bar)) { ... }
# are all elements of @foo factors of all elements of @bar?
if is_factor(all(@foo), all(@bar)) { ... }
# a (somewhat inefficient?) is_prime test for $bar
if is_factor(none(2..sqrt($bar)), $bar) { say "$bar is prime"; }
Just because I'm curious, here's the prime test spelled out for
$bar==23, testing if 23 is a prime number:
is_factor(none(2..sqrt(23)), 23)
-> is_factor(none(2..4), 23)
-> { 23 % none(2..4) == 0 }
-> { none( 23 % 2, 23 % 3, 23 % 4 ) == 0 }
-> { none( 1, 2, 3 ) == 0 }
-> { none( 1==0, 2==0, 3==0 ) }
-> { none( 0, 0, 0) }
-> true
That is just too cool. :-)
Pm
This collapse is probably wrong. In particular,
any($a, $b) & any($b, $c)
is not the same as
any($a, $b, $c)
To see this, set $a=1, $b=0, $c=0:
any($a, $b) & any($b, $c)
-> any(1,0) & any(0,0)
-> false
any($a, $b, $c)
-> any(1, 0, 0)
-> true
> Consider:
>
> all( one(1, 2), one(2, 3) )
>
> Should it be collapsed into:
>
> any( one(1, 3), 2 )
>
> so that "2" now only occurs once? Is it sane?
No, it's wrong.
all( one(1, 2), one(2, 3) ) # false
any( one(1, 3), 2) # true
There might be some junctions that can be "collapsed", but these
aren't examples of them.
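Both counterexamples can be checked with a few lines of Python (a toy model of boolean collapse only, not of Pugs; the one() helper is written for this sketch):

```python
def one(values):
    """one(...) is true when exactly one value is true."""
    return sum(1 for v in values if v) == 1

a, b, c = 1, 0, 0
nested = all([any([a, b]), any([b, c])])     # any($a,$b) & any($b,$c)
flattened = any([a, b, c])                   # any($a,$b,$c)

original = all([one([1, 2]), one([2, 3])])   # all( one(1,2), one(2,3) )
collapsed = any([one([1, 3]), 2])            # any( one(1,3), 2 )
print(nested, flattened, original, collapsed)
```

In each pair the collapsed form has a different truth value from the original.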
Pm
>For example, with the "less than or equals" (<=) relational operator,
>the expression
>
> any(2,3,4) <= 3
>
>becomes
>
> any( 2 <= 3, # 1 (true)
> 3 <= 3, # 1 (true)
> 4 <= 3 # 0 (false)
> )
>
>which ultimately becomes any(1,0), because <= is an operator that
>returns booleans. In plain English, we're asking "Are any of the values
>2, 3, or 4 less than or equal to 3?", and the answer is "yes", because
>any(1,0) evaluates to true in a boolean context. Note that it does
>*not* becomes the junction of the values that were less than or equal to 3
>-- for that we would use C<grep>.
>
>
I would argue that this sort of relational comparison is of limited
usefulness. Invariably, the next question that will nearly always be
asked is "_Which_ values worked / didn't work?". Using my set notation,
one gets the result set, and can then easily test for empty set to see
if there was anything meeting that condition. Not mentioned before, but
I had assumed that, in boolean context, an empty set is false and a
non-empty set is true.
so, where you say:
if any(1,2,3,4) <= 3 {...}
I would have a bit more complex:
if (1,2,3,4) & {$^a <=3} {...}
however, I could also say:
if ((1,2,3,4) & {$^a <=3}).elems > 2 {...}
To ask if "more than two elements are <= 3".
One could also then do relatively simple things like:
for (1,2,3,4) & {$^a <=3} -> $a { ... }
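To make the intent of the proposed notation concrete, here is roughly what those three uses would compute, sketched in Python with an ordinary filter standing in for the set-and-closure intersection. This is only an illustration of the idea; the proposed set syntax itself is not modelled:

```python
values = (1, 2, 3, 4)

def threshold(a):
    """Plays the role of the closure {$^a <= 3}."""
    return a <= 3

# The "result set": the elements that satisfy the closure.
matching = [a for a in values if threshold(a)]

has_match = bool(matching)          # empty set is false, non-empty is true
more_than_two = len(matching) > 2   # .elems > 2

visited = []
for a in matching:                  # iterate over the elements that matched
    visited.append(a)
print(matching, has_match, more_than_two)
```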
Assuming that the {$^a <= 3} is some sort of meaningful threshold, one
would likely have defined a virtual set for it, so that becomes:
for (1,2,3,4) * #threshold -> $a { ... }
>Similarly, consider
>
> all(2,3,4) <= 3
>
>
!((2,3,4) * #threshold)
or
#{2,3,4}.max <= 3
both of which are not quite as clean as your example.
I am willing to consider replacing the somewhat sloppy .min/.max for a
.any/.all to render it:
#{2,3,4}.all <=3
But I'm not sure I'm all that happy w/ the autothreading implications of
that. Especially if you are calling a function with side effects. What
if one value causes a 'die'? Does it throw an exception, even though
other values succeeded fine? And how utterly impossible is it going to
be to debug a program that gets an inadvertent junction/set thrown in
somewhere. Like:
$x = $Value | 'Default';
instead of :
$x = $Value || 'Default';
If you have | return a set, not a junction, you will quickly get an
error down the road about using a set as a scalar, giving you somewhere
to look. As opposed to a potential constant weaving in and out of
autothreading, with junctions in half the places you were expecting scalars.
Consider:
$a = package::func(); # returns a lovely ('cat'|'mouse')
say "Splat! $a";
Do we get:
Splat! cat
Splat! mouse
Or just one of them, at apparent random?
Return a set, and the unsuspecting user gets:
Splat! SET(0xFFFFFFF)
Which, while not ideal, will happen consistently, and be easier to
deal with.
Hmm. If we declare autothreading dead, but instead have explicit
threading, and then convert all() and any() to the more orthogonal and()
and or(), we could do something like:
and(#{2,3,4}.thread <= 3)
but even here, are we not better served with the hyper operators?
and((2,3,4) »<= 3)
btw, I like and()/or() over all()/any() because it makes it very clear
we are establishing a boolean context.
>But wait, there's more! Junctions are valuable because we can combine
>them into multiple operands or function arguments. Thus,
>
> # intersection: Are any values in @foo also in @bar?
> any(@foo) == any(@bar)
>
>
#foo * #bar # and we even know which ones!
> # containment: Are all of the elements in @foo also in @bar?
> all(@foo) == any(@bar)
>
>
#foo <= #bar
> # non-intersection: Are all of the elements in @foo not in @bar?
> all(@foo) == none(@bar)
>
>
!(#foo *#bar)
>Of course, the value of junctions is that they work pretty much
>with any operation on scalar arguments. Thus, if we define an
>is_factor() function as:
>
> # return true if $x is a factor of $y
> sub is_factor (Scalar $x, Scalar $y) { $y % $x == 0 }
>
>then we automatically get:
>
> # are any of @foo factors of $bar?
> if is_factor(any(@foo), $bar) { ... }
>
> # are all of @foo factors of $bar?
> if is_factor(all(@foo), $bar) { ... }
>
> # is $foo a factor of any elements of @bar?
> if is_factor($foo, any(@bar)) { ... }
>
> # is $foo a factor of all elements of @bar?
> if is_factor($foo, all(@bar)) { ... }
>
> # are any elements of @foo factors of any elements of @bar?
> if is_factor(any(@foo), any(@bar)) { ... }
>
> # are all elements of @foo factors of all elements of @bar?
> if is_factor(all(@foo), all(@bar)) { ... }
>
> # a (somewhat inefficient?) is_prime test for $bar
> if is_factor(none(2..sqrt($bar)), $bar) { say "$bar is prime"; }
>
>
But what happens when you try to escape the boolean context? I'll
reiterate my autothreading concerns above.
And in these, you still have to do something completely different to
determine what the factors are.
Sometimes a short loop is a good thing.
btw, in my set notation, you get:
@bar * {is_factor($^a, $foo)}
-- Rod Adams
Right. Teaches me that implementing nontrivial features at 3am
just-before-sleep is probably a bad idea. :-/
Thanks for your quick feedback. :)
/Autrijus/
I stand corrected. The implementation is incorrect.
Pugs 6.0.4 has just been released (now with the "eval" primitive!);
it has cleaned up the collapsing logic thus:
- all() checks its operands to see if any of them are also all()
junctions; it then takes a union of those junctions' sets first,
then unifies it again with the set of other operands.
- same applies for any().
- one() checks its operands for duplicates; if found, it collapses
itself into an empty one() junction, thus failing all tests.
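A sketch of the three rules as described, in Python rather than in the Haskell of the actual Pugs source. The tuple representation and the names flatten and collapse_one are assumptions made for this illustration, not the real implementation:

```python
def flatten(kind, operands):
    """Rules 1 and 2: merge operands that are junctions of the same kind."""
    merged = set()
    for op in operands:
        if isinstance(op, tuple) and op[0] == kind:
            merged |= op[1]          # union with the nested junction's set
        else:
            merged.add(op)           # plain operand (or a junction of another kind)
    return (kind, frozenset(merged))

def collapse_one(operands):
    """Rule 3: any duplicate turns one() into an empty one()."""
    if len(set(operands)) < len(operands):
        return ("one", frozenset())
    return ("one", frozenset(operands))

nested_all = flatten("all", [("all", frozenset({1, 2})), 3, ("all", frozenset({2, 4}))])
mixed = flatten("all", [("any", frozenset({1, 2})), 3])
dup_one = collapse_one([2, 3, 2])
print(nested_all, mixed, dup_one)
```

Note that a nested any() inside all() is left alone; only same-kind nesting is merged.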
Is this somewhat saner? :-)
Thanks,
/Autrijus/
Well, except junctions hold more information than the simple comparisons
I've given here. For example, a junction can have a value like:
$x = ($a & $b) ^ ($c & $d)
which is true only if $a and $b are true or $c and $d are true but not
both.
> Invariably, the next question that will nearly always be
> asked is "_Which_ values worked / didn't work?".
If you're wanting to know *which* values worked, we still have C<grep> --
we don't need a special set notation for it, or to worry about junctions.
> so, where you say:
> if any(1,2,3,4) <= 3 {...}
> I would have a bit more complex:
>
> if (1,2,3,4) & {$^a <=3} {...}
if grep {$^a <= 3} (1,2,3,4) { ... }
> however, I could also say:
>
> if ((1,2,3,4) & {$^a <=3}).elems > 2 {...}
if (grep {$^a <= 3} (1,2,3,4)) > 2 { ... }
> for (1,2,3,4) & {$^a <=3} -> $a { ... }
>
> Assuming that the {$^a <= 3} is some sort of meaningful threshold, one
> would likely have defined a virtual set for it, so that becomes:
>
> for (1,2,3,4) * #threshold -> $a { ... }
Ummm, out of curiosity, what would #threshold look like here?
> [...] And how utterly impossible is it going to
> be to debug a program that gets an inadvertent junction/set thrown in
> somewhere. Like:
>
> $x = $Value | 'Default';
> instead of :
> $x = $Value || 'Default';
Hmm, this is an interesting point. I'll let others chime in here,
as I don't have a good answer (nor am I at all authoritative on junctions).
> and((2,3,4) »<= 3)
>
> btw, I like and()/or() over all()/any() because it makes it very clear
> we are establishing a boolean context.
To me the fact that we're using <= establishes that we're interested
in a boolean result; I don't need "and/or" to indicate that. Using
"and" to mean "all" doesn't quite work for me, as I somehow think of
"and" as a two-argument operation.
> [intersection]
> #foo * #bar # and we even know which ones!
>
> [containment]
> #foo <= #bar
>
> [non-intersection]
> !(#foo * #bar)
Somehow overloading C<*> to mean "intersection" just doesn't work for
me here. I'd have to think about it.
> But what happens when you try to escape the boolean context? I'll
> reiterate my autothreading concerns above.
> And in these, you still have to do something completely different to
> determine what the factors are.
...and we can still do those different things using grep and the
other list-context operations at our disposal.
Ultimately I don't think I agree with the notion that sets and lists
are so different, or that sets deserve/require their own sigil.
Certainly a list can be used to represent a set, and we can easily
define intersection/union/subset operations for lists, or else just
define a Set class and put the operations there. Getting a list
to have unique values seems easy enough
@xyz = all(@xyz).values(); # remove any duplicate elements of @xyz
so I'm not sure I see the need for a separate type.
Anyway, hopefully some others who have more experience than me on
this can chime in if appropriate.
Pm
>>$x = $Value | 'Default';
>> instead of :
>>$x = $Value || 'Default';
>
>
> Hmm, this is an interesting point. I'll let others chime in here,
> as I don't have a good answer (nor am I at all authoritative on junctions).
This is merely syntax; it doesn't really have anything to do with junctions
per se.
Besides which, both those syntaxes are *already* valid in Perl 5. And yet
people don't commonly make this mistake now. Why would they make it more
frequently when | produces a junction instead?
We see very few C<$x * $y> vs C<$x ** $y> mistakes. We see very few C<-$x> vs
C<--$x> mistakes. And when C<//> becomes a valid Perl 5 operator (in 5.10)
we're not fearfully anticipating a flood of:
$x = $Value / 'Default';
instead of :
$x = $Value // 'Default';
errors.
> Ultimately I don't think I agree with the notion that sets and lists
> are so different, or that sets deserve/require their own sigil.
Sets shouldn't have a sigil anyway, whether they're qualitatively different
from lists or not. A set is a *value* (like an integer, or a string, or a
list). A set is not a *container* (like a scalar or an array). And only
containers get sigils in Perl.
> Certainly a list can be used to represent a set, and we can easily
> define intersection/union/subset operations for lists, or else just
> define a Set class and put the operations there. Getting a list
> to have unique values seems easy enough
>
> @xyz = all(@xyz).values(); # remove any duplicate elements of @xyz
BTW, I'm pretty sure there will be built-in C<Array::uniq> and C<List::uniq>
methods in Perl 6. So that's just:
@xyz = uniq @xyz;
or better still:
@xyz.=uniq;
Damian
If you have control over that routine, argument prototypes are probably
the way to go. To wit:
pugs> sub myRand ($j) { rand }; (myRand(any(1,2))).perl
"((0.33283094755206977 | 0.815772904389485))"
pugs> sub myRand (Junction $j) { rand }; (myRand(any(1,2))).perl
"(0.9624736987665468)"
If you don't have control over the routine, maybe taking a reference
is the way to go:
pugs> sub myRand ($j) { rand }; $ref := \ (1|2); (myRand($ref)).perl
"(0.5057952976799094)"
Note that "\ (1|2)" is evaluated as "ref(any(1,2))", not "any(ref(1),ref(2))".
The reason is that currently I'm prohibiting autothreading if any of the
below is true:
* $context.isa(Bool)
* $context.isa(Junction)
* Any.isa($context)
Hence, because the "\" primitive has a prototype of:
&infix:<\> (Any $x) returns Ref
Its first argument is not autothreaded. As usual, please sanity-check
the heuristics above. :-)
Thanks,
/Autrijus/
Are there other built-in methods not found in perl5 that you are
aware of? I'd like to work out declarations and implementations
of them in one sweep, if possible. :-)
Thanks,
/Autrijus/
>Rod Adams wrote:
>
>
>>I would argue that this sort of relational comparison is of limited
>>usefulness.
>>
>>
>
>Well, except junctions hold more information than the simple comparisons
>I've given here. For example, a junction can have a value like:
>
> $x = ($a & $b) ^ ($c & $d)
>
>which is true only if $a and $b are true or $c and $d are true but not
>both.
>
>
That's why I allowed for virtual sets, defined by a closure.
>>Invariably, the next question that will nearly always be
>>asked is "_Which_ values worked / didn't work?".
>>
>>
>
>If you're wanting to know *which* values worked, we still have C<grep> --
>we don't need a special set notation for it, or to worry about junctions.
>
>
Of course we'll always have C<grep>. But this is Perl, and I want YAWTDI.
After all, another way to test membership was just added, whereas before
you pretty much just had C<grep>.
>>and((2,3,4) »<= 3)
>>
>>btw, I like and()/or() over all()/any() because it makes it very clear
>>we are establishing a boolean context.
>>
>>
>
>To me the fact that we're using <= establishes that we're interested
>in a boolean result; I don't need "and/or" to indicate that. Using
>"and" to mean "all" doesn't quite work for me, as I somehow think of
>"and" as a two-argument operation.
>
>
That's a minor quibble, and I could go either way.
>>[intersection]
>>#foo * #bar # and we even know which ones!
>>
>>[containment]
>>#foo <= #bar
>>
>>[non-intersection]
>>!(#foo * #bar)
>>
>>
>
>Somehow overloading C<*> to mean "intersection" just doesn't work for
>me here. I'd have to think about it.
>
>
I saw a reference to a flavor of Pascal that used it that way. C<x>
might be more in line with the math notation for it, but somehow I doubt
that would make you feel better.
>Ultimately I don't think I agree with the notion that sets and lists
>are so different, or that sets deserve/require their own sigil.
>
My issue is less that lists and sets are radically different. It is much
more a matter of Junctions and Scalars being radically different. Getting
me to accept that a Scalar holds several different values at once is a
hard sell. Especially when you consider duplicated side effects.
And what happens if you attempt to evaluate a junction in a non-boolean
context?
-- Rod Adams
> Patrick R. Michaud wrote:
>
>> Ultimately I don't think I agree with the notion that sets and lists
>> are so different, or that sets deserve/require their own sigil.
>
>
> Sets shouldn't have a sigil anyway, whether they're qualitatively
> different from lists or not. A set is a *value* (like an integer, or a
> string, or a list). A set is not a *container* (like an scalar or an
> array). And only containers get sigils in Perl.
Yet you're wanting to store something which holds different values (a
junction) in a scalar field. I could see holding an enumerated set in an
array, without any trouble at all. But junctions can be more than an
enumeration of elements. To steal Patrick's example from before:
$x = ($a & $b) ^ ($c & $d)
Which cannot be held in an array.
So my argument here is that none of the existing containers are suitable
for holding a set/junction.
Scalars are meant to hold a single value. Junctions can hold several.
Arrays can hold many different values, but cannot store the
interrelationship between them, as in the example above.
Hashes suffer the same problems as Arrays.
So my proposal was to create a new container, Sets, to store them in. I
included the ability to store enumerated values, as well as create more
complex logic via closures.
I was also attempting to add a bit of sanity to the semantics, by
rephrasing things into something the average programmer would be able to
parse. Given the numerous corrections to how one junction or another was
parsed, I concluded that the exact semantics were becoming too subtle.
I also find the following incredibly disturbing:
>perl6 -e "$x = 'cat'|'dog'; say $x;"
dog
cat
Getting iterated executions of a statement without explicitly iterating
it bothers me greatly. I work heavily in databases, where updating or
inserting twice with one call can be fatal to data consistency.
So, if we are not having Sets, how exactly does one tell if what they
are holding is a single value scalar, or a multi-value junction?
Can a junction hold values of completely different types, or just
different values of the same type?
If evaluation of one value of a junction causes an error, is $! now a
junction as well?
-- Rod Adams
> Are there other built-in methods not found in perl5 that you are
> aware of?
Yes.
> I'd like to work out declarations and implementations
> of them in one sweep, if possible. :-)
Hah! Dream on! I don't think we have a canonical list anywhere (except in
Larry's head). Some non-Perl-5 Perl 6 builtins that instantly come to mind
include:
zip - interleave arrays (also the C<¥> operator)
uniq - remove duplicates without reordering
reduce - standard list processing function
sum - add list values
max - maximum (may take a block to specify comparison criterion)
min - minimum (may take a block to specify comparison criterion)
kv - return interleaved keys and values from a hash or array
type - return the type metaobject for a referent
pick - select at random from a list, array, or hash
I'm sure there are many others.
Damian
FWIW, I also find it incredibly disturbing. Although I don't have
to deal with it yet in the side-effect-free FP6, I think one way
to solve this is for the "say" to return a junction of IO actions.
Normally a statement-separating semicolon "launches" IO actions
obtained by evaluating the left-side statement, then moves on to
handle the right-side statement. For example, "say" will have
this signature:
multi sub say (*@_) returns IO of Bool { ... }
The "IO of Bool" can only be launched by a toplevel sequencing,
obtained by the destructive assignment:
my $a = say("xxx"); # output "xxx\n" on screen; $a is now a Bool
say("yyy"); # output "yyy\n" on screen; retval is discarded
but not via the nondestructive binding:
my $a := say("xxx"); # nothing outputted, $a is "IO of Bool"
$a; # this runs the action.
Your example then becomes:
$x := 'cat'|'dog'; # $x is now Junction of String
say $x; # Junction of (IO of String)
the toplevel sequencing only handles "IO of ..." types, so the junction
above will not print anything. Instead it may raise a warning of "using a
Junction in a void context", or something equally ominous.
(Haskell users will notice the equivalency of toplevel sequencing with
monadic I/O).
> So, if we are not having Sets, how exactly does one tell if what they
> are holding is a single value scalar, or a multi-value junction?
$x.isa("Junction").
> Can a junction hold values of completely different types, or just
> different values of the same type?
A junction typed as "Junction of Scalar" (as is the default) can
probably hold pretty much anything. There is also "Junction of Class"
and the amazingly weird "Junction of Any".
> If evaluation of one value of a junction causes an error, is $! now a
> junction as well?
I don't have answers for this one.
Thanks,
/Autrijus/
Thinking about it, that warning should be "Using a Junction in Action
context". The "Action" name is better because "IO" is already used for
handles. I'll try to list some notable deviations here:
* A new type, "Action", that represents actions that must be
sequenced in order and may have side effects.
* The default context for a toplevel program is now "Action of List"
(or "Action of Any"). Each semicolon-separated statement in it
is evaluated in that context as well.
* Change the destructive assignment ("=") operator, so the lvalue
context (say "Scalar") may match a corresponding rvalue context
(i.e. "Action of Scalar"):
multi sub print (*@_) returns Action of Bool { ... }
$a = print(3); # $a.isa(Bool) -- launch the action
$b := print(3); # $b.isa(Action of Bool) -- not launched
* Similarly, containers of Action objects won't launch them:
# This prints nothing
@b := [print(1), print(2), print(3)];
* However, destructive assignment under "List of Action" context
launches them in a sequence:
# This prints 123
@b = [print(1), print(2), print(3)];
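The binding-versus-assignment distinction can be modelled loosely in Python, with an Action wrapping a deferred call. The class and function names are invented for this sketch, and an explicit launch() call stands in for what destructive assignment would do implicitly under the proposal:

```python
output = []   # stands in for the screen

class Action:
    """A deferred side effect: nothing happens until launch() is called."""
    def __init__(self, thunk):
        self.thunk = thunk

    def launch(self):
        return self.thunk()

def print_action(x):
    """Models 'print' returning an Action of Bool instead of printing."""
    def run():
        output.append(str(x))
        return True
    return Action(run)

# Models @b := [...]: the actions are built and stored, but not launched.
bound = [print_action(1), print_action(2), print_action(3)]
output_after_binding = list(output)

# Models @b = [...]: each action is launched in sequence.
assigned = [action.launch() for action in bound]
print(output_after_binding, output, assigned)
```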
I reckon that this treatment is fairly drastic. However, if one writes
a perl6 program under the perl5-like imperative subset (i.e. always use
destructive assignment), then all user-defined functions will be
evaluated under the Action context by default, so syntactic differences
may still be minimized.
As I'm not planning to implement it until I finish the OO parts, I'd
appreciate feedback on this idea.
Thanks,
/Autrijus/
Would that happen though? What's the signature of C<say>? I think
it's something like
multi sub *say ($stream = $*OUT: *$data) { ... }
so autothreading wouldn't happen anyway as S9 says the slurpy array/hash
aren't autothreaded. Also, for user-defined subs I'd imagine that you
could give perl a hint not to autothread it with perhaps a "is
nonthreading" trait.
> Getting iterated executions of a statement without explicitly iterating
> it bothers me greatly.
It's funny how one man's feature is another man's bother :-)
> So, if we are not having Sets, how exactly does one tell if what they
> are holding is a single value scalar, or a multi-value junction?
Using the same introspection capabilities that lets you tell if that
scalar you have is an object of some sort I'd imagine.
> Can a junction hold values of completely different types, or just
> different values of the same type?
Given perl's tendency towards permission rather than restriction, I'd
guess that you could have a junction composed of almost anything.
Consider:
any(3,"fred",@foo, { $^x*$^x})
Now, what that *means* is another story :-)
> If evaluation of one value of a junction causes an error, is $! now a
> junction as well?
How do you "evaluate one value of a junction"? I would think that the
junctive disposition of $! would depend on whether that "one value" were
a junction or not. If not, then you just get $! when that one particular
value is evaluated (like 3/any(2,0,3) would generate a junction of
any(3/2,3/0,3/3) with that 3/0 waiting to be realized (evaluated) and
once it is, then $! would hold the "divide by zero")
In my current sleep-deprived state I think that you're more likely to
get a junction of various $! values than have $! be a junction of
values (unless you're setting it explicitly)
-Scott
--
Jonathan Scott Duff
du...@pobox.com
Thanks, that's what I was looking for. Sorry I didn't catch it sooner. :-)
Pm
Right. What I mean is that
one($a, $a, $b)
should collapse into
one($b)
That is, it should delete all duplicate elements from its set. Does that
look correct?
Thanks,
/Autrijus/
Indeed. Perhaps I can refactor one() to store it with two subsets:
the "none" set and the "one" set; new elements are checked against
the "one" set; if duplicates are found, it gets moved into the "none" set.
That way the type of the junction is still one(); the .values() method
will then return two items for each element in the none() subset, and one
for each in the one() subset.
Does it make sense?
Thanks,
/Autrijus/
Oops, I missed that part. Sorry.
> Of course we'll always have C<grep>. But this is Perl, and I want YAWTDI.
> After all, another way to test membership was just added, whereas before
> you pretty much just had C<grep>.
...another way to test membership was added...?
> My issue is less that lists and sets are radically different. It is much
> more a matter of Junctions and Scalars being radically different. Getting
> me to accept that a Scalar holds several different values at once is a
> hard sell. Especially when you consider duplicated side effects.
Since Scalars can be objects that are fairly complex aggregations
that simultaneously hold (or appear to hold) multiple different
values at once, this doesn't seem like a strong argument.
> And what happens if you attempt to evaluate a junction in a non-boolean
> context?
I dunno, which context are you concerned about?
$a = any(2,3,4);
$b = ? $a; # boolean context, $b == true
$n = $a + 3; # numeric context, $n == any(5,6,7)
$s = $a ~ 'x'; # string context, $s == any('2x', '3x', '4x')
@l = ($a); # list context, @l == (any(2,3,4))
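A toy Python class behaves the same way in each of those contexts, assuming (as above) that operators simply thread over the values. AnyJunction and its concat method are made up for this sketch and are not part of any Perl 6 implementation:

```python
class AnyJunction:
    """Toy stand-in for any(...): operators thread over the values."""
    def __init__(self, *values):
        self.values = values

    def __add__(self, other):
        # numeric context: add to every value, result is again a junction
        return AnyJunction(*(v + other for v in self.values))

    def concat(self, suffix):
        # string context (Perl 6 '~'): append to every value
        return AnyJunction(*(str(v) + suffix for v in self.values))

    def __bool__(self):
        # boolean context: true if any value is true
        return any(bool(v) for v in self.values)

a = AnyJunction(2, 3, 4)
b = bool(a)          # ? $a
n = a + 3            # $a + 3
s = a.concat("x")    # $a ~ 'x'
l = [a]              # list context: a one-element list holding the junction
print(b, n.values, s.values, len(l))
```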
Pm
Depends on when it's checking its operands for duplicates, and
the type of checking being performed. For example,
$x = one(0, 0, 1);
has duplicate elements (the zeroes) but is still true.
Similarly, with
$x = one(2, 3, 2);
$x = $x % 2;
if $x { say "Exactly one odd value"; }
collapsing the one(2, 3, 2) to be one() in the first statement would
be incorrect.
Pm
No, consider
$a = 1;
$b = 2;
one($a, $a, $b) # false
one($b) # true
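The counterexamples to both proposed one() collapses (dropping duplicated elements, and collapsing to an empty one() on duplicates) can be checked with a short Python model. The helper names are hypothetical, and one() is taken to mean "exactly one value is true":

```python
def one(values):
    """one(...) is true when exactly one value is true."""
    return sum(1 for v in values if v) == 1

def drop_duplicates(values):
    """Proposed collapse: delete every element that occurs more than once."""
    return [v for v in values if values.count(v) == 1]

# one($a, $a, $b) versus one($b), with $a = 1 and $b = 2
original = one([1, 1, 2])
deduped = one(drop_duplicates([1, 1, 2]))

# one(2, 3, 2) % 2: thread first, then test
threaded = one([v % 2 for v in [2, 3, 2]])
# collapsing to an empty one() before the modulus is applied
collapsed_early = one([v % 2 for v in []])

# duplicates (the zeroes) but still exactly one true value
zeros = one([0, 0, 1])
print(original, deduped, threaded, collapsed_early, zeros)
```

In both cases the collapsed junction gives a different answer from the original, because duplicates among the values do not imply duplicates among the results.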
Pm
True, but it's always possible to find or create functions that would
have the "disturbing" effect that Rod was pointing out.
sub mysay { say $^x; }
$x = 'cat' | 'dog';
mysay $x;
Pm