Junctions, Sets, and Threading.

Rod Adams

Feb 22, 2005, 4:25:23 AM

to Perl6 Language List

Definitions
===========

set (n) : A data container that can hold many elements simultaneously.
The order of elements in a set is meaningless. No two elements in a set
may hold the same value.

junction (n) : The combination of several independent values and an
explicit boolean predicate, to form an object that can behave as a
single entity, until evaluation, when threading will occur.

thread (v) : To take a single statement or expression, and execute it
one or more times, with each iteration using a different value or
element from a junction or set that is participating in that expression.
The results of each execution will be recombined in some fashion.

boolean expression (n) : An expression evaluated in a boolean context.
Some notable boolean expressions:
1. Any of the comparison operators taken with their operands.
2. The test expression of an C< if > or C< while > statement.
3. The distant connection between a C< given > and its associated
C< when >s.
4. Type checking.

implicit threading (n) : Threading that is done on the user's behalf
with no special syntax in the code declaring when and where it is
happening. It is entirely at the implementation's judgment of the given
circumstances when threading will take place.

explicit threading (n) : Threading done on the user's behalf, at the user's request.
The form of this request may vary, but there is some syntax that tells
the implementation to thread a given section of code.

function (n) : Any subroutine capable of returning a value. In Perl
6, this includes: operators, subs, methods, multi-methods, and more.

Junctions
=========

The purpose of a junction is to allow for performing several tests at a
given time, with the testing code needing no knowledge that junctions
are present. While a junction can represent several values at the same
time, such notions as "hold" and "contain" should be avoided. They can be
very misleading, because they obscure the fact that a junction is a single
entity. As such, junctions are best thought of as Scalar. (Like other
scalars, a junction may refer to something much more complex, but that's
beside the point.) Therefore the result of a junction evaluation should
be Scalar. (We will see later that this scalar value is in fact
boolean.)

But what is a "junction evaluation"? Simply put, it is the result of
threading all values of a junction through an expression and then
recombining these results back into a scalar, via the logic of the
associated boolean predicate.
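
As a rough illustration of the evaluation just described, here is a minimal sketch in Python (used only because the Perl 6 syntax in this thread was still under design). The Junction class, the PREDICATES table, and the evaluate method are hypothetical names made up for this sketch and are not part of any Perl 6 API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

# Hypothetical model: a junction is some values plus a boolean predicate.
PREDICATES = {
    "any":  lambda results: any(results),
    "all":  lambda results: all(results),
    "one":  lambda results: sum(1 for r in results if r) == 1,
    "none": lambda results: not any(results),
}


@dataclass(frozen=True)
class Junction:
    kind: str                  # "any", "all", "one" or "none"
    values: Tuple[Any, ...]

    def evaluate(self, test: Callable[[Any], bool]) -> bool:
        # Thread: run the test once per value.
        results = [bool(test(v)) for v in self.values]
        # Recombine: collapse the results with the junction's predicate.
        return PREDICATES[self.kind](results)


x = 3
print(Junction("any", (1, 2, 3, 4)).evaluate(lambda v: x == v))
print(Junction("none", (1, 2, 3, 4)).evaluate(lambda v: x == v))
```

The caller supplies an ordinary single-value test; only evaluate() knows that several values are involved, which is the property the proposal is after.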

Note the fact that it's a _boolean_ predicate. That means that the end
result of any junction evaluation is a single boolean value. Therefore,
the only place that it makes sense to evaluate a junction is in a place
where a boolean value is expected. This brings us to our "Prime Junction
Rule":

Junctions can only be evaluated as part of a boolean expression.
Attempting to evaluate a junction anywhere else is an error.

Some comments about what this does for us.

- Threading in this case is purely implicit. This allows a user to drop
a junction into any place a test parameter is called for. Asking for
explicit threading here makes the concept useless.

- Nesting of junctions is not an issue, since each threaded evaluation
is yet another boolean expression.

- Things like C< for $j .. 10 {...} >, are illegal. They didn't make
much sense in the first place.

- In fact, most code written without junctions in mind will execute
cleanly and as expected, since the result of the threading operation
will be exactly what is expected: boolean.

- As such there should be no limitation as to where a junction can be
stored and passed. Functions will have the ability to explicitly reject
junctions as arguments by typing their parameters with something like
C< all(any(Int, Str), none(Junction)) >. Yes, that's using junctions to
reject junctions.

- There is still a limited case of "Nasty Side Effect Syndrome" that can
be caused by having an assignment, output, or other undesirable type of
function/operator as part of a boolean expression. Functions will therefore
need an "is unautothreadable" trait. All
builtin I/O and assignment functions should have this on. This
protection is shallow. If the call is not in the text of the evaluated
expression, it is not guaranteed to be tested.

- If you wanted to thread a junction in a non-boolean way, you probably
didn't want a junction. You likely either wanted an array or a set, and
some combination of hyper operators. See below.

Boolean Expressions
===================

Some common boolean expressions in context, with valid junction usage:

if $x == any(1,2,3,4) {...}

given $string {
when any(/a .* q/, 'santa') {...}
when all(&is_all_uppercase(), /^ \D+ $/) {...}
}

if is_prime(sqrt(any(25,14,32))) {...}

Note that in the last case, it's the entire boolean expression that's
threaded. The breakdown would look like:

is_prime(sqrt(any(25,14,32))) --->
any(is_prime(sqrt(25)),
is_prime(sqrt(14)),
is_prime(sqrt(32))
)

The rule for what to thread is to take the smallest enclosing boolean
expression around the junction as possible. This expansion cannot extend
past a single statement.
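
The expansion above can be sketched in Python as follows. This is only a model under stated assumptions: thread_any is a hypothetical helper standing in for the implicit threading, and is_prime is a naive test written for the example.

```python
import math


def is_prime(n: float) -> bool:
    # Naive primality test; non-integers and values below 2 are not prime.
    if n != int(n) or n < 2:
        return False
    n = int(n)
    return all(n % d for d in range(2, math.isqrt(n) + 1))


def thread_any(expr, values):
    # Evaluate the whole boolean expression once per junction value,
    # then recombine the results with the any() predicate.
    return any(expr(v) for v in values)


# Models: is_prime(sqrt(any(25, 14, 32)))
result = thread_any(lambda v: is_prime(math.sqrt(v)), (25, 14, 32))
print(result)
```

Note that the lambda wraps the entire boolean expression, not just sqrt(), mirroring the "smallest enclosing boolean expression" rule.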

Sets
====

Despite several surface similarities between a set and a junction, they
are quite different. Sets actually contain other things. Junctions hold
several values at once, and can even hold the same value more than once
at once. Sets are inherently plural in thought process. Junctions are
inherently singular. The operations one wishes to perform on a set are
radically different from those performed on a junction.

Sets are also rather different from arrays. Arrays allow any given value
to be duplicated, imply a certain order that persists, and have other such
usage issues. That's not to say that an array can't _hold_ a set. In fact I'll
be assuming that an array is used internally a bit later on.

Sets and hashes are an odd pair. Hashes are not a good basis for Sets
due to the restriction that the keys are strings. Even if this were not
the case, Sets do not need the related data value that hashes give you.
But more on hashes and sets later.

So we need a new Set class. We'll give it a constructor called
C< set() > (creative, huh?) which takes a list as input. And we'll
redefine several operators around it:

my $x = set(1..3);
my $y = set(1,3,5,7,9);
my $n = 2;

$x + $y # set(1,2,3,5,7,9)
$x + $n # set(1,2,3)
$x * $y # set(1,3)
$x - $y # set(2)
$x < $y # $x is a proper subset of $y
$x <= $y # $x is a subset of $y
$x == $y # $x is the set $y
$x ~~ $y # $x is the set $y
$n < $y # $n is an element of $y
$n ~~ $y # $n is an element of $y
set() # empty set
$x.powerset # power set of $x
$x.elems # count of elements in set $x

(If this list looks familiar, it's because I stole it from Damian and made
several edits.)

In addition, a set evaluated in a list context returns its members.
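
For readers who want to experiment with these semantics, the operators map fairly directly onto Python's built-in frozenset. This is only an analogy: Python spells union as | and intersection as &, and the variable names below are chosen for the example.

```python
x = frozenset(range(1, 4))        # set(1..3)
y = frozenset({1, 3, 5, 7, 9})    # set(1,3,5,7,9)
n = 2

union = x | y                     # $x + $y
with_element = x | {n}            # $x + $n
intersection = x & y              # $x * $y
difference = x - y                # $x - $y
proper_subset = x < y             # $x < $y
subset = x <= y                   # $x <= $y
same_set = x == y                 # $x == $y
is_member = n in y                # $n ~~ $y
size = len(x)                     # $x.elems

for name, value in [("union", union), ("intersection", intersection),
                    ("difference", difference), ("is_member", is_member)]:
    print(name, sorted(value) if isinstance(value, frozenset) else value)
```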

The proposed implementation is that of an internal sorted array. Yes,
sorted. This makes the processing time of virtually every set operation
dramatically faster. Membership is O(lg n) with a binary search; unions,
intersections, set difference and others are all O(n).
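
A minimal Python sketch of the sorted-array idea, assuming the elements are mutually comparable and each list is already sorted and free of duplicates. The function names are illustrative only.

```python
from bisect import bisect_left


def contains(sorted_items, item):
    # Binary search: O(log n) membership test on a sorted list.
    i = bisect_left(sorted_items, item)
    return i < len(sorted_items) and sorted_items[i] == item


def union(a, b):
    # Merge two sorted, duplicate-free lists in O(len(a) + len(b)).
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif b[j] < a[i]:
            out.append(b[j])
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def intersection(a, b):
    # Walk both lists in step, keeping only the shared elements.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif b[j] < a[i]:
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out


print(union([1, 2, 3], [1, 3, 5, 7, 9]))
print(intersection([1, 2, 3], [1, 3, 5, 7, 9]))
print(contains([1, 3, 5, 7, 9], 7))
```

Because both inputs are sorted, each operation needs only a single pass, which is where the linear bounds come from.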

However, since a Set can contain elements of any type, we need to consider
the sorting function. I don't care how it works, as long as the following
holds:

- Objects of the same class are grouped together.
- Objects of the same class are ordered by their traditional sort, if
one exists: Numbers increasing, Strings alphabetically, etc.
- It's reproducible. The same code running on any platform will sort any
combination of the same data the same way.
- If the object in question supports a C< .key > method, that is used for
the sort.

In addition to C< ~~ > for membership, we will borrow the {}
subscripters from hashes (not sure about the <> or «» variants). In this
form, a C< $set{$key} > reference will look for the element that matches
the place in the sort identified by $key (in other words, the element
with the matching .key() or value). If something is there, it then
returns C< .value||$_ >, assuming $_ is the element at that place.

This seemingly bizarre set of semantics gives us something very
powerful. One can make a Set of Pairs, which behaves much like a hash,
only with the ability to use any data type as the key.
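
One way to picture the Set-of-Pairs lookup is the following Python sketch. PairSet is a hypothetical class invented for this example; it assumes the keys are mutually comparable, and it covers only elements that are pairs, not the fallback for plain values.

```python
from bisect import bisect_left


class PairSet:
    """Hypothetical sorted set of (key, value) pairs with hash-style lookup."""

    def __init__(self, pairs):
        # Sort by key and keep one pair per key (a later duplicate wins).
        self._pairs = []
        for key, value in sorted(pairs, key=lambda p: p[0]):
            if self._pairs and self._pairs[-1][0] == key:
                self._pairs[-1] = (key, value)
            else:
                self._pairs.append((key, value))
        self._keys = [key for key, _ in self._pairs]

    def __getitem__(self, key):
        # Binary search for the key, then return that pair's value.
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._pairs[i][1]
        raise KeyError(key)

    def keys(self):
        # Iteration order is the sort order.
        return list(self._keys)


# Tuples as keys, to show that the keys need not be strings.
points = PairSet([((2, 5), "b"), ((1, 9), "a"), ((2, 5), "c")])
print(points[(1, 9)])
print(points.keys())
```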

We can now completely eliminate Hashes and replace them with a Set of
(String => Any) pairs. But we won't. Hashes are a special case. They do
lookups in roughly constant time, not logarithmic. Not to mention they're
just too useful and common as they are to mess with them much.

Note that we can also create a "sparse array" in a similar manner.

Sets will borrow at least the following methods from Arrays:
grep, map, elems
and these from hashes:
keys, values, each, delete, exists

Since the set is internally sorted, any iteration of the set is sorted
(by .key||$_) as well.

HyperQuotes
===========

Just to round out any corner cases that the existing hyper operators do
not cover, and to add a clearer way to do some of them, we will allow
any list surrounded with » «, called hyperquotes, to thread the
immediate function, treating each element of the list as a scalar, and
have the expression return a list of the results. Some samples:

@s = 'item' _ »@x«;

$x = 'abcdabc';
@x = split »/a/, /c/, /d/«, $x;
# (['','bcd','bc'],['ab','dab',''],['abc','abc'])

@x = func($a, »@y«);

With these and all the other hyper operators, one should be able to easily
perform the equivalent threading operations that were lost with the Prime
Junction Rule.

Since all the »'s and «'s floating around make this explicit threading,
there are no nasty surprises lying around. You never have a scalar
suddenly explode into a list, causing no end of havoc in unsuspecting
code.

These are also infinitely more useful, since you can be guaranteed that
they won't short circuit, and they are in exactly the order you
specified.
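
The split sample can be modelled in Python with a list comprehension. The hyper helper below is a hypothetical stand-in for hyperquotes, and it assumes the threaded list is the function's first argument.

```python
import re


def hyper(func, items, *rest):
    # Call func once per item, in order, with no short-circuiting,
    # and collect every result into a list.
    return [func(item, *rest) for item in items]


x = "abcdabc"
parts = hyper(re.split, ["a", "c", "d"], x)
print(parts)  # [['', 'bcd', 'bc'], ['ab', 'dab', ''], ['abc', 'abc']]
```

The result is an ordinary list in the order the patterns were given, so nothing downstream has to cope with a scalar that unexpectedly carries several values.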

So that's how I feel about things. The Prime Junction Rule was the big
revelation I had the other night. The rest of this was the logical
consequence spawned off that single thought to make it a complete idea.

I think that overall in this process I've done the following:
- Kept the real power of Junctions intact.
- Provided fairly strong protection for newbies, without sacrificing
power.
- Kept Nasty surprises to a minimum.
- Got rid of the need for "half on" features.
- Provided back any power that the Prime Rule removed through sets and
expanded hyper ops.

-- Rod Adams

Damian Conway

Feb 22, 2005, 6:13:31 AM

to perl6-l...@perl.org

Rod Adams wrote:

> The purpose of a junction is to allow for performing several tests at a
> given time, with the testing code needing no knowledge of that junctions
> are present. While a junction can represent several values at the same
> time, such notions as "hold" and "contain" should be avoided, and they
> can be very misleading, they betray the fact that a junction is a single
> entity. As such, junctions are best thought of as Scalar. (Like other
> scalars, it may reference to something much more complex, but that's
> beside the point). Therefore the result of a junction evaluation should
> be Scalar.

Everything is okay to here.

> (We will see later that this scalar value is in fact boolean).

Err...no. See below.

> But what is a "junction evaluation"? Simply put, it is the results of
> threading all values of a junction through an expression and the
> recombination of these results back into a scalar, via the logic of the
> associated boolean predicate.

Correct.

> Note the fact that it's a _boolean_ predicate. That means that the end
> result of any junction evaluation is a single boolean value.

No. For example:

# Print a list of substrings...
my $substring = substr("junctions", any(1..3), any(3..6));
say $substring.values();
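
To make the example concrete, here is a Python model of what the resulting junction would carry. It is a sketch under assumptions: substr takes an offset and a length as in Perl, and the any() result is modelled as a set, so duplicates and ordering are ignored.

```python
from itertools import product


def substr(s, start, length):
    # Perl-style substr: offset plus length.
    return s[start:start + length]


starts, lengths = range(1, 4), range(3, 7)   # any(1..3), any(3..6)

# Thread over every (start, length) combination; the distinct results
# are the values of the new junction.
values = {substr("junctions", s, n) for s, n in product(starts, lengths)}
print(sorted(values))
```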

> Therefore, the only place that it makes sense to evaluate a junction
> is in a place where a boolean value is expected. This brings us to our
> "Prime Junction Rule":
>
> Junctions can only be evaluated as part of a boolean expression.

No.

> Attempting to evaluate a junction anywhere else is an error.

No.

> Some comments about what this does for us.
>
> - Threading in this case is purely implicit. This allows a user to drop
> a junction into any place a test parameter is called for. Asking for
> explicit threading here makes the concept useless.

Correct.

> - Nesting of junctions is not an issue, since each threaded evaluation
> is yet another boolean expression.

Not necessarily, but boolean evaluation of junctions does indeed work that way.

> - Things like C< for $j .. 10 {...} >, are illegal. They didn't make
> much sense in the first place.

Probably.

> - In fact, most code written without junctions in mind will execute
> cleanly and as expected, since the result of the threading operation
> will be exactly what is expected: boolean.

It doesn't have to be boolean, but the point is basically correct.

> - As such there should be no limitation as to where a junction can be
> stored and passed. Functions will have the ability to explicitly reject
> junctions as arguments by typing their parameters with something like
> C< all(any(Int, Str), none(Junction)) >. Yes, that's using junctions to
> reject junctions.

Yes.

> - There is still a limited case of "Nasty Side Effect Syndrome" that can
> be caused by having an assignment, output, or other undesirable type of
> function/operator as part of a boolean expression. There will therefore
> need to be a "is unautothreadable" trait to be had for functions. All
> builtin I/O and assignment functions should have this on. This
> protection is shallow. If the call is not in the text of the evaluated
> expression, it is not guaranteed to be tested.

I need to think about this. I'm not sure I'm convinced this isn't covered by
none(Junction) typing on parameters.

> - If you wanted to thread a junction in a non-boolean way, you probably
> didn't want a junction. You likely either wanted an array or a set, and
> some combination of hyper operators. See below.

Not convinced of this.

> Sets
> ====
>
> Despite several surface similarities between a set and a junction, they
> are quite different. Sets actually contain other things. Junctions hold
> several values at once, and can even hold the same value more than once
> at once.

Err, no. Junctions *are* values, and those values are unique.

> Sets are inherently plural in thought process. Junctions are
> inherently singular. The operations one wishes to perform on a set are
> radically different from those performed on a junction.

True.

> Sets and hashes are an odd pair. Hashes are not a good basis for Sets
> due to the restriction that the keys are strings. Even if this is not
> the case,

It's not.

> In addition, a set evaluated in a list context returns its members.

Err...then how do you create a list of sets???

> In addition to C< ~~ > for membership, we will borrow the {}
> subscripters from hashes (not sure about the <> or «» variants).

You'd have to take them, I'm afraid. Introducing gratuitous
inconsistencies would be a Bad Idea.

> In this form, a C< $set{$key} > reference will look for the element
> that matches the place in the sort identified by $key (in other
> words, the element with the matching .key() or value). If something
> is there, it then returns C< .value||$_ >, assuming $_ is the element
> at that place.
>
> This seemingly bizarre set of semantics gives us something very
> powerful. One can make a Set of Pairs, which behaves much like a hash,
> only with the ability to use any data type as the key.

Hashes already have this ability. I don't think overloading the hash access
syntax on Sets is worth the trouble.

> Sets will borrow at least the following methods from Arrays:
> grep, map, elems
> and these from hashes:
> keys, values, each, delete, exists

Keys? Surely sets only have values???

> HyperQuotes
> ===========
>
> Just to round out any corner cases that the existing hyper operators do
> not cover, and to add a clearer way to do some of them, we will allow
> any list surrounded with » «, called hyperquotes, to thread the
> immediate function, treating each element of the list as a scalar, and
> have the expression return a list of the results. Some samples:
>
> @s = 'item' _ »@x«;

That's:

@s = 'item' »_« @x;

> $x = 'abcdabc';
> @x = split »/a/, /c/, /d/«, $x;

That's:

@x = »split« [/a/, /c/, /d/], $x;

but I can see the appeal. Though I still think it's more readable to write:

@x = values split any(/a/, /c/, /d/), $x;

> @x = func($a, »@y«);

That's:

@x = »func«($a, @y);

But, y'know, this one almost convinces me. Especially when you consider:

sub func ($i, $j, $k) {...}

@x = func($a, »@y«, @z);

> With these and all the other hyper operators, one should be able to easily
> perform the equivalent threading operation that they lost with the Prime
> Junction Rule.

I very much doubt we're going to lose it. ;-)

> These are also infinitely more useful, since you can be guaranteed that
> they won't short circuit, and they are in exactly the order you
> specified.

This is admittedly very nice.

> So that's how I feel about things. The Prime Junction Rule was the big
> revelation I had the other night. The rest of this was the logical
> consequence spawned off that single thought to make it a complete idea.
>
> I think that overall in this process I've done the following:
> - Kept the real power of Junctions intact.
> - Provided fairly strong protection for newbies, without sacrificing
> power.
> - Kept Nasty surprises to a minimum.
> - Got rid of the need for "half on" features.
> - Provided back any power that the Prime Rule removed through sets and
> expanded hyper ops.

Leaving aside the Unnecessary Restriction on junctions (a.k.a. Prime Rule ;-),
the main downside is that we would be multiplying our entities.
We'd need to be certain that the increased cognitive load is worth
the benefits.

Damian

Juerd

Feb 22, 2005, 6:30:39 AM

to Damian Conway, perl6-l...@perl.org

Damian Conway wrote 2005-02-22 22:13 (+1100):

> > @x = func($a, »@y«);
> That's:
> @x = »func«($a, @y);
> But, y'know, this one almost convinces me. Especially when you consider:
> sub func ($i, $j, $k) {...}
> @x = func($a, »@y«, @z);

Naively, I'd expect

my @a = @b = 1..3;
»foo«(@a, @b)

to result in

foo(@a[0], @b[0]),
foo(@a[1], @b[1]),
foo(@a[2], @b[2]);

but

foo(»@a«, »@b«)

with the same arrays to result in

foo(@a[0], @b[0]),
foo(@a[0], @b[1]),
foo(@a[0], @b[2]),
foo(@a[1], @b[0]),
foo(@a[1], @b[1]),
foo(@a[1], @b[2]),
foo(@a[2], @b[0]),
foo(@a[2], @b[1]),
foo(@a[2], @b[2]);

Likewise,

@foo »+« @bar

would iterate in parallel, resulting in min(+@foo, +@bar) elements,
while

»@foo« + »@bar«

would return +@foo * +@bar elements.

I'd then expect

$foo +« @bar

and

$foo + »@bar«

to be equivalent (æsthetically, the latter is more pleasing, imo).
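
The two behaviours contrasted above correspond to zip and a cross product. A small Python sketch, where foo is a placeholder function that just returns its arguments:

```python
from itertools import product


def foo(a, b):
    # Placeholder: return the argument pair so the pairing is visible.
    return (a, b)


a = b = [1, 2, 3]

# Parallel iteration, like »foo«(@a, @b): stops at the shorter list.
parallel = [foo(p, q) for p, q in zip(a, b)]

# Cross product, like foo(»@a«, »@b«): len(a) * len(b) results.
crossed = [foo(p, q) for p, q in product(a, b)]

print(parallel)
print(len(crossed))
```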

Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html

Damian Conway

Feb 22, 2005, 6:35:10 AM

to Juerd, perl6-l...@perl.org

Juerd wrote:

> Naively, I'd expect
>
> my @a = @b = 1..3;
> »foo«(@a, @b)
>
> to result in
>
> foo(@a[0], @b[0]),
> foo(@a[1], @b[1]),
> foo(@a[2], @b[2]);
>
> but
>
> foo(»@a«, »@b«)
>
> with the same arrays in
>
> foo(@a[0], @b[0]),
> foo(@a[0], @b[1]),
> foo(@a[0], @b[2]),
> foo(@a[1], @b[0]),
> foo(@a[1], @b[1]),
> foo(@a[1], @b[2]),
> foo(@a[2], @b[0]),
> foo(@a[2], @b[1]),
> foo(@a[2], @b[2]);

Yeah, that was the bit that nearly has me convinced. ;-)

Damian

Aldo Calpini

Feb 22, 2005, 9:19:24 AM

to Damian Conway, perl6-l...@perl.org

Damian Conway wrote:
> > @s = 'item' _ »@x«;
>
> That's:
>
> @s = 'item' »_« @x;

(just checking that my understanding is correct, don't want to be
nitpicking :-)

assuming that you meant to prepend the string "item" to each element of
@x, isn't that:

@s = 'item' »~« @x;

furthermore, given that the operator "hyperates" on one side only,
shouldn't this be:

@s = 'item' ~« @x;

cheers,
Aldo

Luke Palmer

Feb 22, 2005, 1:23:29 PM

to Juerd, Damian Conway, perl6-l...@perl.org

Hmm, this all makes me think of my proposal a few weeks back:

« foo(@a[$^i], @b[$^i]) »

« foo(@a[$^i], @b[$^j]) »

I've grown to believe that my proposal had some kinks in it,
particularly in the area of what kind of thing «@a» is. But I also
believe that something like it is becoming warranted.

Luke

Rod Adams

Feb 22, 2005, 1:57:24 PM

to Damian Conway, perl6-l...@perl.org

Damian Conway wrote:

> Rod Adams wrote:
>
> > The purpose of a junction is to allow for performing several tests at a
> > given time, with the testing code needing no knowledge of that junctions
> > are present. While a junction can represent several values at the same
> > time, such notions as "hold" and "contain" should be avoided, and they
> > can be very misleading, they betray the fact that a junction is a single
> > entity. As such, junctions are best thought of as Scalar. (Like other
> > scalars, it may reference to something much more complex, but that's
> > beside the point). Therefore the result of a junction evaluation should
> > be Scalar.
>
> Everything is okay to here.
>

> [snip]

>
> > Therefore, the only place that it makes sense to evaluate a junction
> > is in a place where a boolean value is expected. This brings us to our
> > "Prime Junction Rule":
> >
> > Junctions can only be evaluated as part of a boolean expression.
>
> No.
>
>
> > Attempting to evaluate a junction anywhere else is an error.
>
> No.

This is my major point of the post. In my opinion, your example of:

# Print a list of substrings...
my $substring = substr("junctions", any(1..3), any(3..6));
say $substring.values();

is a perfect example of a place where saying:

# Print a list of substrings...

my @substring = substr("junctions", »1..3«, »3..6«);
say @substring;

is far more in line with what you're doing. Not to mention you likely
wanted it to do:

my @substring = substr("junctions", »1..3«, »3..6«);
say »@substring«;

instead, putting nice "\n"s between values. My way also guarantees a
certain order to the output.

Simply put, you're ignoring the fact that junctions have a boolean
predicate. Which they do.
In this example, it didn't matter if you had used any/all/one/none.

>
> > - There is still a limited case of "Nasty Side Effect Syndrome" that can
> > be caused by having an assignment, output, or other undesirable type of
> > function/operator as part of a boolean expression. There will therefore
> > need to be a "is unautothreadable" trait to be had for functions. All
> > builtin I/O and assignment functions should have this on. This
> > protection is shallow. If the call is not in the text of the evaluated
> > expression, it is not guaranteed to be tested.
>
> I need to think about this. I'm not sure I'm convinced this isn't covered by
> none(Junction) typing on parameters.

It is not.

By my proposal:

$j = any(1,2,3);
unless say($j) { die '...' }

The say would thread, because of the boolean expression in the 'unless'.
C< say > does not get the junction as a parameter. But C< say > needs to
be marked that it's a no-no to thread over it.

> > - If you wanted to thread a junction in a non-boolean way, you probably
> > didn't want a junction. You likely either wanted an array or a set, and
> > some combination of hyper operators. See below.
>
> Not convinced of this.

I am completely convinced of this. Please express your reservations so I
can address them.

>
>
> > Sets
> > ====
> >
> > Despite several surface similarities between a set and a junction, they
> > are quite different. Sets actually contain other things. Junctions hold
> > several values at once, and can even hold the same value more than once
> > at once.
>
> Err, no. Junctions *are* values, and those values are unique.

$x = one(2,3,3);

squishing the duplicate 3 is not allowed here.

>
>
> > Sets are inherently plural in thought process. Junctions are
> > inherently singular. The operations one wishes to perform on a set are
> > radically different from those performed on a junction.
>
> True.
>
>
> > Sets and hashes are an odd pair. Hashes are not a good basis for Sets
> > due to the restriction that the keys are strings. Even if this is not
> > the case,
>
> It's not.

Changing the index type of hashes away from Str makes as much sense to
me as changing the index type of arrays to something non-integral. None.

>
> > In addition, a set evaluated in a list context returns its members.
>
> Err...then how do you create a list of sets???

grumble. Didn't think of that. I was looking for a simple way to say:

for $set {...}

without throwing all kinds of special cases around.

>
>
> > In addition to C< ~~ > for membership, we will borrow the {}
> > subscripters from hashes (not sure about the <> or «» variants).
>
> You'd have to take them, I'm afraid. Introducing gratuitous
> inconsistencies would be a Bad Idea.

That's acceptable.

>
> > In this form, a C< $set{$key} > reference will look for the element
> > that matches the place in the sort identified by $key (in other
> > words, the element with the matching .key() or value). If something
> > is there, it then returns C< .value||$_ >, assuming $_ is the element
> > at that place.
> >
> > This seemingly bizarre set of semantics gives us something very
> > powerful. One can make a Set of Pairs, which behaves much like a hash,
> > only with the ability to use any data type as the key.
>
> Hashes already have this ability. I don't think overloading the hash access
> syntax on Sets is worth the trouble.

And the same hash function would nicely handle non-strings?
And I'll have to ask: Can you also change the index of arrays to Double?

I'd prefer to keep all the optimizations that hashes have as is. Throw
all the abject weirdness elsewhere.

>
> > Sets will borrow at least the following methods from Arrays:
> > grep, map, elems
> > and these from hashes:
> > keys, values, each, delete, exists
>
> Keys? Surely sets only have values???

Sets have elements. Those elements can have .key and .value methods on
them. Pairs for example. Any element that can't .key or .value will
default to its value.

btw, this whole Sets mimic hashes thing can be ripped out of my Sets
proposal if nobody else likes it.

>
>
> > HyperQuotes
> > ===========
> >
> > Just to round out any corner cases that the existing hyper operators do
> > not cover, and to add a clearer way to do some of them, we will allow
> > any list surrounded with » «, called hyperquotes, to thread the
> > immediate function, treating each element of the list as a scalar, and
> > have the expression return a list of the results. Some samples:
> >
> > @s = 'item' _ »@x«;
>
> That's:
>
> @s = 'item' »_« @x;

TMTOWTDI is a good thing. Personally I think my way was clearer. Your
mileage may vary.

>
> > $x = 'abcdabc';
> > @x = split »/a/, /c/, /d/«, $x;
>
> That's:
>
> @x = »split« [/a/, /c/, /d/], $x;
>
> but I can see the appeal. Though I still think it's more readable to write:
>
> @x = values split any(/a/, /c/, /d/), $x;
>
>
> > @x = func($a, »@y«);
>
> That's:
>
> @x = »func«($a, @y);
>
> But, y'know, this one almost convinces me. Especially when you consider:
>
> sub func ($i, $j, $k) {...}
>
> @x = func($a, »@y«, @z);

I knew I wasn't coming up with good examples. It was 3:30am by the time
I got to this part of the post.

>
>
> > With these and all the other hyper operators, one should be able to easily
> > perform the equivalent threading operation that they lost with the Prime
> > Junction Rule.
>
> I very much doubt we're going to lose it. ;-)

I was referring to losing the threading outside of boolean expressions.

>
>
> > These are also infinitely more useful, since you can be guaranteed that
> > they won't short circuit, and they are in exactly the order you
> > specified.
>
> This is admittedly very nice.
>
>
> > So that's how I feel about things. The Prime Junction Rule was the big
> > revelation I had the other night. The rest of this was the logical
> > consequence spawned off that single thought to make it a complete idea.
> >
> > I think that overall in this process I've done the following:
> > - Kept the real power of Junctions intact.
> > - Provided fairly strong protection for newbies, without sacrificing
> > power.
> > - Kept Nasty surprises to a minimum.
> > - Got rid of the need for "half on" features.
> > - Provided back any power that the Prime Rule removed through sets and
> > expanded hyper ops.
>
> Leaving aside the Unnecessary Restriction on junctions (a.k.a. Prime Rule ;-),
> the main downside is that we would be multiplying our entities.
> We'd need to be certain that the increased cognitive load is worth
> the benefits.

Well, let me say this: Consider the Sets a separate issue. That should
drastically help in "multiplying our entities". I'm not sure why I had
convinced myself that the Set proposal was tied to the Prime Rule.

I consider the Prime Rule essential to junction sanity. Without it,
you're free to ignore that predicate, and begin a path of "junction
explosion", where one simple junction suddenly converts vast swaths of
execution into junctions, with no easy collapsing back down to what the
author had in mind.

If you're ignoring the predicate, a junction is just a list of values.
Treat it as such. A list. Hyperquotes will deal with those situations
nicely, though explicitly.

-- Rod Adams

Damian Conway

Feb 22, 2005, 5:53:36 PM

to Aldo Calpini, perl6-l...@perl.org

Aldo Calpini wrote:

> Damian Conway wrote:
>
>> > @s = 'item' _ »@x«;
>>
>> That's:
>>
>> @s = 'item' »_« @x;
>
>
> (just checking that my understanding is correct, don't want to be
> nitpicking :-)
>
> assuming that you meant to prepend the string "item" to each element of
> @x, isn't that:
>
> @s = 'item' »~« @x;

If Rod meant concatenation, yes it should be C<~>. I was just reproducing what
he wrote.

> furthermore, given that the operator "hyperates" on one side only,
> shouldn't this be:
>
> @s = 'item' ~« @x;

The rule, as I understand it, is that unaries have one hypermarker and
binaries have two.

Damian

Damian Conway

Feb 22, 2005, 6:32:20 PM

to Rod Adams, perl6-l...@perl.org

Rod Adams wrote:

> This is my major point of the post. In my opinion, your example of:
>
> # Print a list of substrings...
> my $substring = substr("junctions", any(1..3), any(3..6));
> say $substring.values();
>
> Is a perfect example of a place where saying:
>
> # Print a list of substrings...
> my @substring = substr("junctions", »1..3«, »3..6«);
> say @substring;
>
> Is far more in line with what you're doing.

But much less obvious. This is the core of my qualms. You're proposing we have
all three of:

»foo« @x, @y;
foo »@x«, »@y«;
foo «@x @y»;

all meaning very different things. That really worries me. I'd especially
prefer that we didn't overload »« to mean both "linear vector" and
"cross-product".

I certainly like the idea of having both abilities available; I just think
the syntaxes need to be better differentiated.

> Not to mention you likely wanted it to do:
>
> my @substring = substr("junctions", »1..3«, »3..6«);
> say »@substring«;
>
> instead, putting nice "\n"s between values.

No, if I'd wanted to do that, I'd have written:

say for $substring.values;

> Simply put, you're ignoring the fact that junctions have a boolean
> predicate. Which they do.

Yep. Isn't it useful that you can do that, and get more functionality out of a
single construct?

>> I need to think about this. I'm not sure I'm convinced this isn't
>> covered by none(Junction) typing on parameters.
>
>
> It is not.
>
> By my proposal:
>
> $j = any(1,2,3);
> unless say($j) { die '...' }
>
> The say would thread, because of the boolean expression in the 'unless'.
> C< say > does not get the junction as a parameter. But C< say > needs to
> be marked that it's a no-no to thread over it.

But that was precisely my point:

sub say (none(Junction) *@sayings) {...}

> I am completely convinced of this. Please express your reservations so I
> can address them.

sub get_matches {
my @in = split /<ws>/, prompt 'search for: ';
my @out = split /<ws>/, prompt 'but not: ';
return %data{any(@in) & none(@out)};
}

my $data = get_matches();

No booleans anywhere in sight.

>> Err, no. Junctions *are* values, and those values are unique.
>

> $x = one(2,3,3);
>
> squishing the duplicate 3 is not allowed here.

True. one() is special in that regard. The other junctions squish/ignore
duplicates.
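
A small Python model shows why duplicates matter for one() but not for any(). The helper names are hypothetical, and equality testing stands in for whatever comparison the junction is used in.

```python
def one_equals(values, target):
    """Model of one(...) == target: exactly one element must match."""
    return sum(1 for v in values if v == target) == 1


def any_equals(values, target):
    """Model of any(...) == target: at least one element must match."""
    return any(v == target for v in values)


# With the duplicate kept, two elements match, so one() fails.
print(one_equals([2, 3, 3], 3))  # False
# If the duplicate were squished away, the answer would change.
print(one_equals([2, 3], 3))     # True
# any() gives the same answer either way.
print(any_equals([2, 3, 3], 3), any_equals([2, 3], 3))  # True True
```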

> Changing the index type of hashes away from Str makes as much sense to
> me as changing the index type of arrays to something non-integral. None.

Then perhaps you need to think about it a little more. A hash is a mapping. In
Perl 5 that mapping is restricted to Str->Any. In Perl 6 it's normally the
same, but can be Any->Any. Which finally realizes the full power of the
concept of mappings. For example:

my %seen is shape(IO) of Bool; # %seen maps IO objects to boolean values

while get_next_input_stream() -> $in {
next if %seen{$in};
$text ~= slurp $in;
%seen{$in} = 1;
}
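
The same Any->Any idea can be sketched in Python, whose dictionaries accept any hashable object as a key. The Stream class and its slurp method are hypothetical stand-ins for real IO objects.

```python
class Stream:
    """Hypothetical stand-in for an input stream object."""

    def __init__(self, content):
        self.content = content

    def slurp(self):
        return self.content


first, second = Stream("alpha "), Stream("beta ")
queue = [first, second, first]  # the first stream is offered twice

seen = {}  # maps Stream objects themselves, not their string forms, to booleans
text = ""
for stream in queue:
    if seen.get(stream):
        continue  # already processed this stream
    text += stream.slurp()
    seen[stream] = True

print(repr(text))  # 'alpha beta '
# The keys are still usable as objects, not just as strings.
print(all(isinstance(key, Stream) for key in seen))  # True
```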

>> > In addition, a set evaluated in a list context returns its members.
>>
>> Err...then how do you create a list of sets???
>
>
> grumble. Didn't think of that. I was looking for a simple way to say:
>
> for $set {...}
>
> without throwing all kinds of special cases around.

for values $set {...}

> And the same hash function would nicely handle non-strings?
> And I'll have to ask: Can you also change the index of arrays to Double?

No, of course not. But you *can* change the index to any enumerable type:

my @weekly_rainfall is shape(Day);

@weekly_rainfall[Day::Wed] = 42; # millimetres
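
As a rough analogue in Python, an integer-backed enumeration can index an ordinary list. The Day enumeration below is an assumption made for this sketch; the thread does not define it.

```python
from enum import IntEnum


class Day(IntEnum):
    """Hypothetical enumeration of weekdays, used as array indices."""
    Mon = 0
    Tue = 1
    Wed = 2
    Thu = 3
    Fri = 4
    Sat = 5
    Sun = 6


# One slot per enumeration member.
weekly_rainfall = [0] * len(Day)
weekly_rainfall[Day.Wed] = 42  # millimetres

print(weekly_rainfall)  # [0, 0, 42, 0, 0, 0, 0]
```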

>> > Sets will borrow at least the following methods from Arrays:
>> > grep, map, elems
>> > and these from hashes:
>> > keys, values, each, delete, exists
>>
>> Keys? Surely sets only have values???
>
>
> Sets have elements. Those elements can have .key and .value methods on
> them. Pairs for example. Any element that can't .key or .value will
> default to its value.
>
> btw, this whole Sets mimic hashes thing can be ripped out of my Sets
> proposal if nobody else likes it.

I certainly don't like it. But that should be a different thread.

>> > @s = 'item' _ »@x«;
>>
>> That's:
>>
>> @s = 'item' »_« @x;
>
> TMTOWTDI is a good thing. Personally I think my way was clearer. Your
> mileage may vary.

It does. Significantly. I like the idea, but hate the syntax. I'd much rather
something more explicitly named. Like:

@s = 'item' »~« every(@x);

and:

say every(@first_name) ~ every(@surname);

>> But, y'know, this one almost convinces me. Especially when you consider:
>>
>> sub func ($i, $j, $k) {...}
>>
>> @x = func($a, »@y«, @z);

Which would again be more obvious as:

@x = func($a, every(@y), @z);

>> > With these and all the other hyper operators, one should be able to easily
>> > perform the equivalent threading operation that they lost with the
>> > Prime Junction Rule.
>>
>> I very much doubt we're going to lose it. ;-)
>
> I was referring to losing the threading outside of boolean expressions.

Me too.

> Well, let me say this: Consider the Sets a separate issue. That should
> drastically help in "multiplying our entities". I'm not sure why I had
> convinced myself that the Set proposal was tied to the Prime Rule.

Agreed. Let's leave it aside.

> I consider the Prime Rule as essential to junction sanity.

I consider it an unnecessary and unhelpful restriction on the utility of
junctions.

> Without it, you're free to ignore that predicate

Yep.

> If you're ignoring the predicate, a junction is just a list of values.

No. You can ignore the predicate because you don't need it *yet*.

> Treat it as such. A list. Hyperquotes will deal with those situations
> nicely, though explicitly.

No they don't. See my C<get_matches> example above.

And I'm still very concerned that using hyperquote to parallelize functions
and data *in different ways* will make the autothreading syntax harder to
understand and differentiate.

Damian

PS: I'll be away for the rest of the week, but will happily continue to
argue when I return. ;-)

Uri Guttman

Feb 22, 2005, 7:20:09 PM

to Damian Conway, Rod Adams, perl6-l...@perl.org

>>>>> "DC" == Damian Conway <dam...@conway.org> writes:

DC> my %seen is shape(IO) of Bool; # %seen maps IO objects to boolean values

DC> while get_next_input_stream() -> $in {
DC> next if %seen{$in};
DC> $text ~= slurp $in;
DC> %seen{$in} = 1;
DC> }

but that is doable in perl5 as well. $in would stringify to a unique key
and you can test for it. better is if you used those keys as io handles
and that is where perl5 loses. say you had to process a bunch of handles
(say sockets which aren't going to be closed) and wanted to pass the
ones that still need processing. then you would do something like this
(perl5ish syntax which won't really work):

my %handles = map { open( $_, '<' ) or die "foo" => 1 } @files ;

later on some code could clear a handle's flag but leave the file open
and you can find the handles you want easily.

process_handles( grep( $handles{$_}, keys %handles ) ) ;

my current workaround for this problem is to have a hash that maps a ref
to itself! the key is a stringified ref and the value is the real
ref. having the key be any value (but internally stringified for
hashing) is very needed.

>>> But, y'know, this one almost convinces me. Especially when you consider:
>>>
>>> sub func ($i, $j, $k) {...}
>>>
>>> @x = func($a, »@y«, @z);

DC> Which would again be more obvious as:

DC> @x = func($a, every(@y), @z);

i agree. i like the names there as it reads better.

DC> PS: I'll be away for the rest of the week, but will happily continue to
DC> argue when I return. ;-)

so where do you go for an argument when you are away?

uri

--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org

Rod Adams

Feb 23, 2005, 12:18:10 AM

to Damian Conway, perl6-l...@perl.org

Damian Conway wrote:

> Rod Adams wrote:
>
>> This is my major point of the post. In my opinion, your example of:
>>
>> # Print a list of substrings...
>> my $substring = substr("junctions", any(1..3), any(3..6));
>> say $substring.values();
>>
>> Is a perfect example of a place where saying:
>>
>> # Print a list of substrings...
>> my @substring = substr("junctions", »1..3«, »3..6«);
>> say @substring;
>>
>> Is far more in line with what you're doing.
>
>
> But much less obvious. This is the core of my qualms. You're proposing
> we have all three of:
>
> »foo« @x, @y;
> foo »@x«, »@y«;
> foo «@x @y»;
>
> all meaning very different things. That really worries me. I'd
> especially prefer that we didn't overload »« to mean both "linear
> vector" and "cross-product".
>
> I certainly like the idea of having both abilities available; I just
> think the syntaxes need to be better differentiated.

I'm not pinning everything on the syntax being » «. I just hadn't heard
the objection to it before so was running with it. If it takes changing
them to something else, I'll go with that. Though I was hoping for
something shorter than C<every()>. I also like the use of operators
here, since they are making a substantial change to the entire
statement. All of any/all/one/none/every look too much like ordinary
function calls for my tastes. But we do seem to be mighty short on
unambiguous punctuation these days.

However, I would have picked the « » as qw// to be the loser in this
regard. Then the »'s and «'s all mean "repetition is happening". I
personally never found C< $x{qw/dog cat foo/} > to be a bother, and I
use the equivalent in Perl5 very frequently. _If_ that is dropped, then
you can have « » mean "linear vector" and » « mean "cross product". But
I won't press this idea.

As for "obvious", I was processing two lists of parameters, and got back
a 2-D array, where I could easily match up each input to result.
You returned something of the form any(any(),any(),any()), where there
is no way to match which result was from a given input.
(FWIW, we both get garbage for output and/or errors. You're printing raw
junctions, I'm printing arrayrefs.)

I would wager that most people attempting this sort of thing would find
a 2-D array to be a much more obvious result than a nested junction.

>
>> Simply put, you're ignoring the fact that junctions have a boolean
>> predicate. Which they do.
>
>
> Yep. Isn't it useful that you can do that, and get more functionality
> out of a single construct?

Powerful? Yes.

Scary? Very. I start with something that looks like a scalar, and all
the code it touches turns all my other variables into juxtapositions.
Kind of like taint checking.

Useful? No more so than hyperquotes in the same place.

I find it more useful to get even more functionality out of the list
construct.

>
>
>>> I need to think about this. I'm not sure I'm convinced this isn't
>>> covered by none(Junction) typing on parameters.
>>
>>
>>
>> It is not.
>>
>> By my proposal:
>>
>> $j = any(1,2,3);
>> unless say($j) { die '...' }
>>
>> The say would thread, because of the boolean expression in the
>> 'unless'. C< say > does not get the junction as a parameter. But C<
>> say > needs to be marked that it's a no-no to thread over it.
>
>
> But that was precisely my point:
>
> sub say (none(Junction) *@sayings) {...}

In my way, C< say > is never given the chance to accept or decline $j as
an argument. The above would be handled as:

unless any(say(1), say(2), say(3)) {...}
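
In Python terms, the expansion I have in mind looks roughly like the sketch below. The say function here is a hypothetical stand-in that prints its argument and reports success.

```python
def say(value):
    """Hypothetical stand-in for say: print one scalar and report success."""
    print(value)
    return True


values = [1, 2, 3]  # the members of any(1,2,3)

# The call is threaded first: say runs once per member and only ever
# receives a plain scalar. The list is built eagerly, so the side effect
# happens for every member before the predicate is applied.
results = [say(v) for v in values]

# Only then does the junction's predicate recombine the results.
if not any(results):
    raise RuntimeError("...")
```

This prints three lines, which is the side effect that makes threading over C< say > a questionable idea.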

All C< say > sees is scalars, one at a time. I was looking for a way to
mark C< say > so that the implementation knows that threading over it
(note: "over" not "through") is a bad idea.

>
>> I am completely convinced of this. Please express your reservations
>> so I can address them.
>
>
> sub get_matches {
> my @in = split /<ws>/, prompt 'search for: ';
> my @out = split /<ws>/, prompt 'but not: ';
> return %data{any(@in) & none(@out)};
> }
>
>
> my $data = get_matches();
>
> No booleans anywhere in sight.

So with example data:

@in = qw/foo bar baz/;
@out = qw/baz fiz bop/;
%data = (foo => 5, bar => 6, baz => 7, fiz => 5, bop => 8);

you return:

all(any(5,6,7), none(7,5,8))

Which is just 6. If I were a user of this function, I would want foo's 5
as well. And I'd probably prefer to have had them as a list, not a junction.
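
To check that arithmetic, here is a small Python sketch of how I read the result. It models the junction's eventual test as "equals one of the any() values and none of the none() values", which is an assumption about how the returned junction would later be used.

```python
data = {"foo": 5, "bar": 6, "baz": 7, "fiz": 5, "bop": 8}
wanted = ["foo", "bar", "baz"]
unwanted = ["baz", "fiz", "bop"]

in_values = [data[key] for key in wanted]     # [5, 6, 7]
out_values = [data[key] for key in unwanted]  # [7, 5, 8]


def matches(value):
    """Model of value == all(any(in_values), none(out_values))."""
    in_any = any(value == v for v in in_values)
    in_none = not any(value == v for v in out_values)
    return in_any and in_none


survivors = sorted({v for v in data.values() if matches(v)})
print(survivors)  # [6]
```

foo's 5 is rejected only because fiz happens to map to 5 as well. The keys that produced the values are gone by the time the predicates are applied.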

By delaying the application of your predicates, you have voided whatever
it was that made those values have those predicates associated with
them. Logic would say that each step of processing of the values in the
junction without the application of the predicate would more often than
not move you further and further away from the correct meaning of the
predicate in terms of those values.

This is why I want to stop "junction propagation". The more you feed
code junctions that create other junctions, the more obscure your final
result is. The predicate is there to explain how to recombine the
results of threading together to form a single result. Attempting to
evaluate the values without evaluating the predicate as well seems
incredibly error prone to me. By forcing the predicate to be evaluated,
we generate no new junctions... just results.

So perhaps a better rephrasing of the Prime Rule would be: Every legal
junction evaluation must include the evaluation of its predicate.

I would be in favor of adding non-boolean predicates. For instance, some
Numeric predicates could be: min, max, sum, mode, median, mean, stdev.
String predicates could be: min, max, longest, shortest. Basically any
function which can take several results and merge them into a single value
again can be a predicate.
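
A rough Python sketch of what I mean follows. The Reduction class and its thread method are hypothetical names; the only idea being shown is that the values travel with a combining function, and that function is always applied once threading is done.

```python
import statistics


class Reduction:
    """Hypothetical junction-like object: values plus a combining predicate."""

    def __init__(self, combine, values):
        self.combine = combine
        self.values = list(values)

    def thread(self, func):
        """Apply func to each value, then merge the results with the predicate."""
        return self.combine([func(v) for v in self.values])


# Boolean predicate, as with today's junctions.
print(Reduction(any, [1, 2, 3]).thread(lambda v: v > 2))  # True
# Numeric predicates merge the threaded results into a single number.
print(Reduction(max, [1, 2, 3]).thread(lambda v: v * v))  # 9
print(Reduction(statistics.mean, [1, 2, 3]).thread(lambda v: v * 10))  # 20
```

Because thread always finishes by calling the predicate, no new junction escapes: the caller gets a plain result.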

BTW, this example is what sets are good for:

sub get_matches {
my @in = split /<ws>/, prompt 'search for: ';
my @out = split /<ws>/, prompt 'but not: ';

return %data{set(@in) - set(@out)};
}
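
For comparison, the set version is easy to model in Python with the same example data as above. Passing the two word lists as parameters instead of prompting for them, and returning a dictionary so the surviving keys stay visible, are choices made only for this sketch.

```python
data = {"foo": 5, "bar": 6, "baz": 7, "fiz": 5, "bop": 8}


def get_matches(wanted, unwanted):
    """Look up every wanted key that is not also an unwanted key."""
    keys = set(wanted) - set(unwanted)
    # sorted() only makes the output order predictable; sets are unordered.
    return {key: data[key] for key in sorted(keys)}


result = get_matches(["foo", "bar", "baz"], ["baz", "fiz", "bop"])
print(result)  # {'bar': 6, 'foo': 5}
```

The subtraction happens on the keys, before any lookup, so foo's 5 survives even though fiz also maps to 5.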

>
>> I consider the Prime Rule as essential to junction sanity.
>
>
> I consider it an unnecessary and unhelpful restriction on the utility
> of junctions.

It's a better safeguard for the unaware than anything else I've seen.

>
>
>> Without it, you're free to ignore that predicate
>
>
> Yep.
>
>
>> If you're ignoring the predicate, a junction is just a list of values.
>
>
> No. You can ignore the predicate because you don't need it *yet*.

Then don't affix a predicate to that group of values *yet*.

In my mind it's quite clear that by binding the values to a predicate,
you are saying they belong together. Therefore, to evaluate the
junction, you have to evaluate its entirety, not some intermediate
results. If you're not ready to commit to the predicate yet, don't bind
to it. Once you're bound, you should be stuck unless you do an explicit
C< $j.values() > or the like.

>
>> Treat it as such. A list. Hyperquotes will deal with those situations
>> nicely, though explicitly.
>
>
> No they don't. See my C<get_matches> example above.

Indeed, while admittedly my hyperquotes do not provide an elegant
solution there, your junctions appear not to provide an elegant working
solution there, either.

The solution to the C<get_matches> problem is to use either sets or
grep. And since grep performs boolean tests, feel free to use the
junctions listed in there.

>
> And I'm still very concerned that using hyperquote to parallelize
> functions and data *in different ways* will make the autothreading
> syntax harder to understand and differentiate.

I'm confident in the ability of this list (in whole or in part) to come
up with a decent syntax once it's agreed to do so.

>
> PS: I'll be away for the rest of the week, but will happily continue
> to argue when I return. ;-)

Best of luck with whatever draws you away. And thanks for the heads up,
so I knew why you stopped responding. :-)

-- Rod Adams
