What does the reduce metaoperator do with an empty list?
my @a;
[+] @a; # 0? exception?
[*] @a; # 1? exception?
[<] @a; # false?
[||] @a; # false?
[&&] @a; # true?
Also if it magically supplies some correct like the above, how does it
know what that value is?
Thanks,
Matt
--
"Computer Science is merely the post-Turing Decline of Formal Systems Theory."
-???
>All~
>
>What does the reduce metaoperator do with an empty list?
>
>my @a;
>[+] @a; # 0? exception?
>[*] @a; # 1? exception?
>[<] @a; # false?
>[||] @a; # false?
>[&&] @a; # true?
>
>Also if it magically supplies some correct like the above, how does it
>know what that value is?
>
>
My general thoughts has been that:
[op] @list
behaves something like:
eval join(op, @list)
so feeding it an empty list would return undef, regardless of op.
Similarly, if @list is just one element, it returns that element.
-- Rod Adams
The usual definition of reduce in most languages that support it, is
that reduce over the empty list produces the Identity value for the
operation. So for the above ops the answers are: 0, 1, depends, false,
true. For chained ops like '<' it depends on whether x<y<z return x or
z on being true. if x then -inf else if z then +inf (don't ask if it
returns y). Note that some ops (like '%') don't have an identity value
and therefore [%] over the empty list is the equivalent to a divide by 0
and probably throws an exception or at least returns undef. Now this is
a problem for user defined operations, so reduce of a user defined op
over the empty list is either an exception, return undef or we need a
trait that can be specified for an infix op that specifies what to
return for reduce over the empty list. The compiler shouldn't bother to
check if what you specified is really the Identity value for op, but I'd
consider it a bug if it isn't.
/me puts on his lambda hat
In Haskell, there is a distinction between foldl and foldl1 (similar
remarks apply to foldr/foldr1[1]):
The former (foldl) requires you to give an explicit 'left unit'[2],
which is implicitly added to the left of the list being reduced. This
means that folding an empty list will just give you the unit.
i.e.
foldl (+) 0 [a,b,c] = ((0+a)+b)+c
foldl (+) 0 [] = 0
The latter (foldl1) doesn't use a unit value, but this means that you
can't fold empty lists.
i.e.
foldl1 (+) [a,b,c] = (a+b)+c
foldl1 (+) [] = ***error***
/me puts camel hat back on
This suggests that []-reducing an empty list should probably (IMO) do
one of the following:
* Fail, since it's an ill-defined operation without an explicit unit
* Try to find a suitable unit value (see below), and fail if it doesn't find one
You /could/ try to do something 'sensible' and return 0 or undef, but
this seems likely to result in more confusion.
Another alternative is to give the user the option of specifying such
a unit when using the reduction meta-operator, but this seems to work
against the whole point of [+] (which is brevity). If you want to
specify your own unit, use '&reduce'.
> my @a;
> [+] @a; # 0? exception?
> [*] @a; # 1? exception?
> [<] @a; # false?
> [||] @a; # false?
> [&&] @a; # true?
>
> Also if it magically supplies some correct like the above, how does it
> know what that value is?
>
Perhaps the operator could have some kind of 'unit' trait? (Or perhaps
'left_unit' and 'right_unit'?)
Stuart
[1] Just remember that unlike foldr/foldl, which are explicitly
right/left associative, [+] is 'DWIM-associative', reflecting the
associativity of the underlying operator.
[2] e.g. 0 is the left (and right) unit of + because 0 + x == x (and x + 0 == 0)
1) undef (which may or may not contain an exception), or
2) some unit/identity value that is a trait of the operator,
depending on whether or not people think (2) is actually a good idea.
The usual none(@Larry) disclaimer applies, of course...
Stuart
I would think that the Principle of Least Surprise points to (1),
given that the standard explanation of the [+]@x is eval join( '+', @x
) ...
Rob
Well the only case where it probably really matters is [+] where you
really want the result to be 0. Of course +undef == 0, so maybe
returning undef might be okay. I'm thinking about the case:
[+] grep &some_condition, @a
where you really want the total to be 0, even if the result of the grep
is empty.
A case can also be made for (assuming @a = ();) that
[*}@a == 1
[~]@a eq '' (also covered by ~undef)
[?&]@a ~~ true
[?|]@a ~~ false (also covered by ?undef)
[?^]@a ~~ false (also covered by ?undef)
[+&]@a == MAXINT (whatever that is)
[+|]@a == 0 (also covered by +undef)
[+^]@a == 0 (also covered by +undef)
chained ops are wierd
[<]@a ~~ false
[>]@a ~~ false
[<=]@a ~~ true
[>=]@a ~~ true
Other ops have theoritical values that I don't know if we can handle:
[~&]@a should be an infinitely long bitstring of 1's
[~|]@a should be an infinitely long bitstring of 0's
Again, given that that the really important case [+] is covered by
+undef == 0, maybe just always returning undef is good enough.
$rod == none(@Larry), and therefore just because I think it should be
that way, doesn't mean it is that way.
But the "eval join" way of looking at it does seem to be consistent with
what I've seen discussed previously, and would provide a useful way to
remember the effects of edge cases.
-- Rod Adams
Wow, that's actually pretty elegant, and it has the benefit of
explicitly TELLING the reader what you intended to do.
Another option (depending on your situation) is to use 'err' to
replace the resultant 'undef' with a value of your choosing, i.e.
[*] @coefficients err 1;
[+] @scores err 0;
[&&] @preconditions err 1;
etc. (assuming I got the precedence right)
I think all these rather nice workarounds, combined with the hairiness
and complexity of trying to come up with default answers, make a
really strong case for undef/fail being the right choice here.
Stuart
> All~
>
> What does the reduce metaoperator do with an empty list?
Interesting. Mathematically an empty sum is zero and an empty product is
one. Maybe each operator {c,s}hould have an associated method returning
its neutral element for [] to use it on empty lists, so that it would
probably return undef on empty lists.
Just my 2 Eurocents,
Michele
--
It was part of the dissatisfaction thing. I never claimed I was a
nice person.
- David Kastrup in comp.text.tex, "Re: verbatiminput double spacing"
>> 1) undef (which may or may not contain an exception), or
>> 2) some unit/identity value that is a trait of the operator,
>>
>> depending on whether or not people think (2) is actually a good idea.
>
> I would think that the Principle of Least Surprise points to (1),
I don't think so. I, for one, would expect [+]; to return zero and [*]; to
return 1. But that's only because I {trust,expect} perl(6) to be smart
enough to DWIM.
Michele
--
\renewcommand\labelitemi{\textcolor{yellow}{\textbullet}}
would change the colour of the first-level bullet, and improve the
document so's i can't see the bullets.
- Robin Fairbairns in comp.text.tex
Here is the last answer from Ken Iverson, who invented reduce in
the 1950s, and died recently.
file:///usr/share/j504/system/extras/help/dictionary/intro28.htm
Identity Functions and Neutral
The monads 0&+ and 1&* are identity functions, and 0 and 1 are
said to be identity elements or neutrals of the dyads + and *
respectively. Insertion on an empty list yields the neutral of
the dyad inserted. For example:
+/ i.0 +/'' +/0{. 2 3 5
0 0 0
*/i.0 */'' */0{. 2 3 5
1 1 1
> my @a;
> [+] @a; # 0? exception?
> [*] @a; # 1? exception?
> [<] @a; # false?
> [||] @a; # false?
> [&&] @a; # true?
>
> Also if it magically supplies some correct like the above, how
> does it know what that value is?
>
> Thanks,
> Matt
The page
file:///usr/share/j504/system/extras/help/dictionary/d420.htm
gives examples, unfortunately not easily readable ones.
"If y has no items (that is, 0=#y), the result of u/y is the
neutral or identity element of the function u. A neutral of a
function u is a value e such that x u e ↔ x or e u x ↔ x, for
every x in the domain (or some significant sub-domain such as
boolean) of u . This definition of insertion over an argument
having zero items extends partitioning identities of the form
u/y ↔ (u/k{.y) u (u/k}.y) to the cases k e. 0,#y .
"The identity function of u is a function ifu such that ifu y ↔
u/y if 0=#y ."
[The following table is greatly simplified by listing identity
elements rather than identity functions. Some are only left
identities, and some only right identities.]
Identity element For
0 < > + - +. ~: | (2 4 5 6 b.)
1 = <: >: * % *. %: ^ ! (1 9 11 13 b.)
_ <.
__ >.
'' ,
[and a few more that I will not explain here]
Glossary
J Description
+. or
~: objects are identical
| remainder, defined so that 0|N is N
b. Boolean functions from table
<: less than or equal, restricted to Booleans here
(1<:0 is 0, 1<:1 is 1)
>: greater than or equal, restricted to Booleans here
* times, restricted to Booleans here
% divide
*. and
%: root
^ exponential
! combinations
<. minimum
>. maximum
'' empty vector, list of length 0
, catenate, join lists
_ infinity
__ negative infinity
So (_ <. N) is N, as is (__ >. N).
All of these functions are defined in detail but quite tersely in
the J Dictionary, indexed on the page
file:///usr/share/j504/system/extras/help/dictionary/vocabul.htm
For examuple, the Boolean function b. is defined on the page
file:///usr/share/j504/system/extras/help/dictionary/dbdotn.htm
--
Edward Cherlin
Generalist & activist--Linux, languages, literacy and more
"A knot! Oh, do let me help to undo it!"
--Alice in Wonderland
http://cherlin.blogspot.com
I'd say the principle of least surprise points to (1); in the sense that
$sum = [+] @x; would Just Work, etc.
I also have a vague sense that the 'identity' value for an operator might
also be useful in other places in the compiler (enabling optimizations,
etc). Providing it as a trait means that these 'other things' could work
even with user-defined operators. (And leaving the trait undefined gives
you the behavior (1), if that's what you want.)
--scott
Albanian LICOZY shotgun CABOUNCE plastique Sigint Justice fissionable
LITEMPO KGB KUCAGE LIONIZER ESCOBILLA North Korea CLOWER genetic NRA
( http://cscott.net/ )
Thanks for bringing in a little history to the discussion. Those links
are all local to your system; do you have internet reachable versions of them?
Cheers,
Sam.
http://www.jsoftware.com/books/help/dictionary/intro28.htm
http://www.jsoftware.com/books/help/dictionary/d420.htm
and so on. Front page is at
http://www.jsoftware.com/books/help/dictionary/title.htm . I still haven't
figured out what "J" is though.
sure. Maybe the identity values could be supplied by making the reduce operator
for them a curried version of reduce, where reduce requires a list with at least
one element (or DIES :))
eg, something similar to; (sorry for the psuedo-perl6, corrections welcome :))
sub prefix:<[+]> (*@args) ::= &reduce.assuming(func => &infix:<+>, first => 0);
sub prefix:<[*]> (*@args) ::= &reduce.assuming(func => &infix:<*>, first => 1);
This would mean that unless the operator specifically defines a curried reduce
version of itself, then the [] version of it on an empty list will be a hard run-time
error.
Then again, maybe an identity trait is more elegant and covers the varied ways
that multiple operators combine inside reduce..
> You /could/ try to do something 'sensible' and return 0 or undef, but
> this seems likely to result in more confusion.
Personally I think returning an undef for this kind of situation would be as wrong
as returning undef for 0/0.
Sam.
Randal> For example, if I wanted the identity hash (where all values are 1,
Randal> but keys are original list elements), I could do:
Randal> my %hash = @somelist.inject({}, { $^a{$^b} = 1; $^a });
And yes, I know I can spell this as:
my %hash = ({}, @somelist).reduce({ $^a{$^b} = 1; $^a });
But it puts things all in the wrong place for me. :)
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<mer...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Mark> The usual definition of reduce in most languages that support it, is
Mark> that reduce over the empty list produces the Identity value for the
Mark> operation.
In Smalltalk, the equivalent of "reduce" is "inject:into:", so a
"sum" reduce looks like:
sum := aList inject: 0 into: [:previous :this | previous + this]
Now the advantage here is that if aList is empty, we get back the inject
value. Thus, the behavior is always well-defined.
The Perl reduce operator treats the first element of the list as the
"inject" value above. However, if the first element is missing,
the most Perlish thing I can think of is having it return undef,
because it's like you've specified an undef inject value.
I'd also argue that we could provide .inject_into, to make Smalltalkers
happy to always spell out the initial value and codeblock, instead
of relying on the first element of the list for the initial value.
For example, if I wanted the identity hash (where all values are 1,
but keys are original list elements), I could do:
my %hash = @somelist.inject({}, { $^a{$^b} = 1; $^a });
That'd be Way Cool. Once you get your head around inject, you never
want to go back to reduce. :)
I think we should provide built-in operators with an attribute
called "identity". Reduce, when given an empty list, would
check if the operator has a defined identity attribute. If so,
it is returned as the result of the reduction. If the opereator
has no identity attribute, reduce throws an exception for an
empty list.
Is there a built-in operator that doesn't have a meaningful
identity value? I first thought of exponentiation, but it has
an identity value of 1 - you just have to realize that since
it is a right associative operator, the identity has to be
applied from the right.
I suspect that if people ever get into writing code that works
on operators instead of data, there would be additional uses
found for the identity attribute (and there may be additional
operator attributes that make sense there too, although none
come immediately to mind). MJD will soon have to start working
on the second edition of Higher Order Perl.
--
> Is there a built-in operator that doesn't have a meaningful
> identity value? I first thought of exponentiation, but it has
> an identity value of 1 - you just have to realize that since
> it is a right associative operator, the identity has to be
> applied from the right.
Well the identity of % is +inf (also right side only). The identities
for ~| and ~^ are infinitely long bitstrings of 0's, while that for ~&
is a similarly long bitstring of 1's. The chained comparison ops are
weird as depending of which why you define the associativity (and thus
which side's value you return when true) you get either a left side only
or right side only Identity. E.g. if X<Y is left associative and
returns Y when true then it has a left side identity of -inf, etc. But
as I'm not sure if the chained ops are actually going to be defined
defined in terms of the associativity of the binary op (the way false
results propagates through messes things up), so that argument may not work.
http://www.jsoftware.com/books/help/dictionary/intro28.htm
Sorry. It's exactly the same material the provide with their
software, so I got confused.
> [snip]
>
> Thanks for bringing in a little history to the discussion.
> Those links are all local to your system; do you have internet
> reachable versions of them?
>
> Cheers,
> Sam.
--
If as usual the definition of a right identity value e is that a op e = a for all a,
then only +inf works. Besdies you example should have been;
$n % any (($n+1)..Inf), $n % $n = 0.
> > E.g. if X<Y is left associative and returns Y when true then ...
>
> Sorry, is it the case that $x = $y < $z might put something else
> but 0 or 1 into $x depending on the order relation between $y and $z?
Which is one reason why I siad that it might not make sense to define the chaining ops in terms of the associtivity of the binary ops, But as we are interested in what [<] over the empty list shoud return , the identity (left or right) of '<' is unimportant as I think that should return false as there is nothing to be less then anything else. Note that defaulting to undef therefore works in that case.
--
Mark Biggar
ma...@biggar.org
mark.a...@comcast.net
mbi...@paypal.com
On the contrary a mathematician would say that the empty list is
monotonically increasing (vacuously) and the answer should be true.
Certainly.
> I first thought of exponentiation, but it has
> an identity value of 1 - you just have to realize that since
> it is a right associative operator, the identity has to be
> applied from the right.
>
> I suspect that if people ever get into writing code that works
> on operators instead of data, there would be additional uses
> found for the identity attribute (and there may be additional
> operator attributes that make sense there too, although none
> come immediately to mind).
APL and J programmers have lots of examples.
It's an enhanced APL without the funny characters.
Yes, a number of operators have an inverse on Boolean 0 and 1
only.
> This would give an "axiomatic" type system:
>
> class Num does Group[Num,+,0] {...}
> class Num does Field[Num,+,0,*,1] {...}
> class Str does Monoid[Str,~,''] {...}
>
> class Complex does Field[Array[2] of Num,+,[0,0],*,[1,0]]
> {...}
>
> class 3DVector does VectorSpace[Array[3] of Num,+,[0,0,0]]
> {...}
>
> And it provides valuable information to the optimizer.
Signed integer division and remainder are defined by the relation:
A = (A/B)*B + (A rem B)
where (A rem B) has the sign of A and an absolute value less than the absolute value of B. Signed integer division satisfies the identity:
(-A)/B = -(A/B) = A/(-B)
It does have a right side identity of +INF.
> HaloO Mark,
>
> please don't regard the following as obtrusive.
>
> you wrote:
> > If as usual the definition of a right identity value e is that a op e = a for
> all a,
> > then only +inf works. Besdies you example should have been;
>
> Or actually $n % any( abs($n)+1 .. Inf ) to really exclude 0
> from the junction.
>
> > $n % any (($n+1)..Inf), $n % $n = 0.
>
> That depends on the definition of % and the sign of $n.
> With the euclidean definition 0 <= ($n % $N == $n % -$N) < abs($N)
> and for $n < 0 there's no identity at all. The identity element
> has to be an element of the set, which +Inf isn't. It's a type.
>
> BTW, is % defined as truncation in Perl6?
> That would be a bit unfortunate. Simple but not well thought out.
> --
> TSa (Thomas Sandlaß)
>
The identity operand is -inf for < and <=, and +inf for >
and >=. A chained relation < (>, <=, >=) is then taken to
mean monotonically increasing (decreasing, non-decreasing,
non-increasing), and an empty list, like a one element list,
is always in order.
--
I think % should behave as Perl 5 has it, especially insofar as it
coerces to integer. That doesn't mean we can't have "mod" and/or
"rem" variants that have other semantics. Infix operators are pretty
much in their own namespace anyway, so they won't clobber sub names.
Larry
But it seems to me that the simplest "correct" behaviour for reductions is:
2+ args: interpolate specified operator
1 arg: return that arg
0 args: fail (i.e. thrown or unthrown exception depending on use fatal)
Then, as Brad previously reminded us all, to cover the possibility of empty
arg lists, you just prepend the desired identity value:
$sum = [+] 0, @values;
$prod = [*] 1, @values;
$prob = [*] 0, @probs;
or else, as Stuart observed, postpend them as a default:
$sum = ([+] @values err 0);
$prod = ([*] @values err 1);
$prob = ([*] @probs err 0);
Damian
> 0 args: fail (i.e. thrown or unthrown exception depending on use
> fatal)
...
>
> $sum = ([+] @values err 0);
> $prod = ([*] @values err 1);
> $prob = ([*] @probs err 0);
Just wanted to check, if I've said "use fatal": will that "err 0" DWIM,
or will the fatalness win? Would I need to write
> Damian Conway wrote:
>
>> 0 args: fail (i.e. thrown or unthrown exception depending on use
>> fatal)
>
> ...
>
>>
>> $sum = ([+] @values err 0);
>> $prod = ([*] @values err 1);
>> $prob = ([*] @probs err 0);
>
>
> Just wanted to check, if I've said "use fatal": will that "err 0" DWIM,
That depends what you mean. ;-)
Under "use fatal" you'll get an exception if you don't provide any args.
> or will the fatalness win?
The fatalness will win.
> Would I need to write
Yes. :-)
And what you'd need to write would be:
$sum = (try{ [+] @values } err 0);
Damian
The "err ..." idiom seems too useful to have it break in this case.
Afterall, the purpose of "err 0" is to tell the stupid computer that I
know what to do with the empty-array scenario.
Feels like fine grained control over fatalness is needed (and not just
per-function). For example "use fatal :void_context_only" would be nice,
but wouldn't help in this case. This needs "use fatal :void_or_assign".
What it probably needs is: "use fatal :untested"
That is, die unless the failed result is in a boolean or "definedean" context.
This might even be a more reasonably dwimmy default, with 'use fatal :always'
being required to get "always throw the exception".
Damian
Seems to me that C<err> should have an implied C<try{...}> on it's lhs.
Unless we spell that C<error>.
-- Rod Adams
Following this logic, does join(" ", @foo) with +@foo being 0 fail too?
I dislike this, and would prefer [op] with no elements to simply return
whatever () returns: an empty list in list context, undef in scalar
context.
Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html
> -----Original Message-----
> From: Damian Conway [mailto:dam...@conway.org]
> Sent: Tuesday, May 31, 2005 11:14 PM
> To: perl6-l...@perl.org
> Subject: Re: reduce metaoperator on an empty list
>
> Juerd asked:
>
>
> >> 2+ args: interpolate specified operator
> >> 1 arg: return that arg
> >> 0 args: fail (i.e. thrown or unthrown exception depending on use
> fatal)
> >
> > Following this logic, does join(" ", @foo) with +@foo being 0 fail too?
>
> No. It returns empty string. You could think of C<join> as being
> implemented:
>
> sub join (Str $sep, *@list) { reduce { $^a ~ $sep ~ $^b } "", @list }
>
> Just as C<sum> is probably implemented:
>
> sub sum (*@list) { [+] 0, @list }
>
If this were the case, then
join '~', 'a', 'b', 'c'
would equal '~a~b~c' instead of 'a~b~c'
Joe Gottman
>> 2+ args: interpolate specified operator
>> 1 arg: return that arg
>> 0 args: fail (i.e. thrown or unthrown exception depending on use fatal)
>
> Following this logic, does join(" ", @foo) with +@foo being 0 fail too?
No. It returns empty string. You could think of C<join> as being implemented:
sub join (Str $sep, *@list) { reduce { $^a ~ $sep ~ $^b } "", @list }
Just as C<sum> is probably implemented:
sub sum (*@list) { [+] 0, @list }
> I dislike this, and would prefer [op] with no elements to simply return
> whatever () returns: an empty list in list context, undef in scalar
> context.
I'd have assumed that "empty list in list context, undef in scalar
context" *is* what C<fail> returns when C<use fatal> isn't in effect.
Damian
>>No. It returns empty string. You could think of C<join> as being
>>implemented:
>>
>> sub join (Str $sep, *@list) { reduce { $^a ~ $sep ~ $^b } "", @list }
>
> If this were the case, then
> join '~', 'a', 'b', 'c'
> would equal '~a~b~c' instead of 'a~b~c'
Good point. Thanks. Make that:
sub join (Str $sep, *@list) {
reduce { $^a ~ $sep ~ $^b } @list || ""
}
Presuming (of course) that my earlier plea that || should preserve context
across both operands is granted. ;-)
Damian
Also, in mathematical terms, many operators have no identity. Also,
if we suppose that every operator does have an identity (a silly thing
to do), then we can't just return that. It doesn't even make sense
for operators like < whose return type is different from its argument
types (but reduce itself doesn't make a lot of sense for such
operators unless they're list-associative).
For something like:
$ordered = [<] @array;
If @array is empty, is $ordered supposed to be true or false? It
certainly shouldn't be anything but those two, because < is a boolean
operator.
Coming back to our identities issue, consider:
$mods = [%] @array;
If @array is empty, what is $mods supposed to be? There's no
reasonable integer that it could possibly be.
You have to either supply an initial value or refactor your logic not
to allow an empty @array (as in the first case). If you want it some
other way, there are far too many special cases we have to work with,
some of which are just mathematically impossible.
I think `fail`ing is the best bet.
Luke
Another broken symmetry. Ah, Perl 5's full of them, and I see that Perl 6
will be no exception. Excuse me while I find my cynic hat . . ah, here it
is.
I'm still in the camp of those wanting each operator to know its own identity
value (perhaps in terms of a trait). The identity of multiplication (say) is
always 1, after all, and it doesn't change depending on when you do
multiplication in your own code. In mathematical terms, it's a property of
the operator, not of how it's used.
You are going to see empty lists more often than you think in expressions like
$product = [*] @array;
and having to write that as
$product = [*] 1, @array;
just to protect against a common case doesn't exactly flaunt Perl's DWIMmery
to me. I *have* to write 1 there, or otherwise the reduce meta-operator
isn't calculating the product when there are items in @array. This hardly
makes sense from a Huffman perspective. Someone please convince me
otherwise.
Multiplication's an easy example, in any case, and any programmer will know
the identity is 1. But look at the confusion here on this list among people
over the identities for <, >, ** and other beasts. I've already forgotten
some of them.
(On another note: I had a thought the other day: [op] produces one thing from
a list, and »op« produces a list from a list. When I squint with hindsight,
it makes more sense to me for »op« to be the one that converges a list to a
single point, and [op] to do listy things to lists. I'm not seriously
suggesting a switch - there's probably parsing issues with the thought - I'm
just sayin', is all.)
--
Debbie Pickett
http://www.csse.monash.edu.au/~debbiep
deb...@csse.monash.edu.au
> You are going to see empty lists more often than you think in expressions like
> $product = [*] @array;
> and having to write that as
> $product = [*] 1, @array;
> just to protect against a common case doesn't exactly flaunt Perl's DWIMmery
> to me. I *have* to write 1 there, or otherwise the reduce meta-operator
> isn't calculating the product when there are items in @array. This hardly
> makes sense from a Huffman perspective. Someone please convince me
> otherwise.
The problem is that writing 1 there is still wrong in the "no arguments" case.
The product of zero numbers cannot possibly be one in any common sense
interpretation. (And, yes, I'm perfectly well aware of the mathematical
interpretations in which it does make sense...that's not the point.)
What you want is:
$product = ([*] @values err 0);
Or:
$factorial = ([*] 1..$n err 1);
So what you want is not an identity value as default (which isn't even
possible for many operators, as Luke pointed out), but a predictable failure
value as default, so you can intercept that failure and choose your own
outcome in the edge case.
Damian
> $ordered = [<] @array;
>
> If @array is empty, is $ordered supposed to be true or false? It
> certainly shouldn't be anything but those two, because < is a boolean
> operator.
I read that (mathematically) as "for all i, for all j such that j-i=1,
a_i<a_j", which is certainly satisfied by an empty list.
Michele
--
> Why should I read the fucking manual? I know how to fuck!
In fact the problem is that the fucking manual only gives you
theoretical knowledge which is useless in practice ;)
- Giuseppe "Oblomov" Bilotta in a bunch of usenet groups.
Yep, it sure is. Now tell Perl to read it that way for any operator.
Luke
> You have to either supply an initial value or refactor your logic not
> to allow an empty @array (as in the first case). If you want it some
> other way, there are far too many special cases we have to work with,
> some of which are just mathematically impossible.
>
> I think `fail`ing is the best bet.
I haven't read the whole thread, so I'm sorry, if it was before. I think
that a fail is a good idea. Or maybe it should be a warning, and the
return value should be undef.
I mean:
[<]@empty === undef < undef;
[*]@empty === undef * undef;
etc.
Bye,
Andras
>> I read that (mathematically) as "for all i, for all j such that j-i=1,
>> a_i<a_j", which is certainly satisfied by an empty list.
>
> Yep, it sure is. Now tell Perl to read it that way for any operator.
Should _I_?!? ;-)
I wonder what a logic-oriented programming language a' la prolog would say
in this case. Now, Perl6 is supposed to support similar features too -
maybe the right(TM) answer could be along that way...
Michele
--
> primordinarily concerned with providing ...
Neat "word"!
- Donald Arseneau in comp.text.tex
> For something like:
>
> $ordered = [<] @array;
>
> If @array is empty, is $ordered supposed to be true or false? It
> certainly shouldn't be anything but those two, because < is a boolean
> operator.
I have no problem with 3-state logic systems (true, false, undef) if
this is what is required to let me choose the corner-case behavior.
Damian previously wrote:
> 2+ args: interpolate specified operator
> 1 arg: return that arg
> 0 args: fail (thrown or unthrown exception depending on use fatal)
The 1-arg case doesn't seem to work right with the [<] operator:
? [<] 1
? [<] 0
If the [<] is taken to mean "ordered", then it doesn't seem right that
these two tests would give different results. In this case, I need to
special case both the 0-arg and 1-arg scenarios. We either need to hard
code these special-cases into perl (belch), or we need to make it easy
to code both special-cases inline in the code, or we need a better
general-case rule.
One approach might be to reverse the direction of the definitions. That
is, instead of defining the binary form and then autogeneralizing in
terms of "join", we might define operators in terms of thier reduction
behavior, and then autospecialize to the binary case. Of course, that
still doesn't help for Damian's "product Vs factorial" example for the
0-arg case.
Or we take the well trodden road of ignoring mathematical correctness
and simply state "this is what perl does: take it or leave it".
This is asking "Is @array ordered?" In the case of a 0-element or
1-element array, the answer is "It is not disordered", which means
$ordered is true.
$ordered = ! [!<] @array;
Rob
You haven't convinced me, but rather than flog a dead horse, I'll just suggest
that we both reserve the right to say "I told you so" when there are several
years' worth of Perl 6 code out there, and we see how common our respective
examples are. Meanwhile, it's safer to take the "fail" course of action,
because it can be relaxed later on; the converse couldn't be true.
What you and Luke *have* convinced me of is that the one-element list rule
isn't right either; for operators like <, it still makes sense to fail in
some soft way on one-element lists:
$ordered = ([<] @array) // 1; # Zero- and one-element arrays get the 1.
I'm starting to agree that there have to be different rules for associative*
and non-associative reduction.
* Left and right association might need to be different too in terms of the
order they eat list elements.
--
Debbie Pickett
debbiep-list-...@futzle.com
Sure it is: it's the number you have before you've multiplied any
numbers into it. It's the center of the world, for multiplication. If
you started anywhere else you wouldn't end up at the correct product.
If you started at zero you'd never get off the dime.
>
> What you want is:
>
> $product = ([*] @values err 0);
Not what I'd ever want...
>
> Or:
>
> $factorial = ([*] 1..$n err 1);
>
> So what you want is not an identity value as default (which isn't even
> possible for many operators, as Luke pointed out), but a predictable
> failure value as default, so you can intercept that failure and choose
> your own outcome in the edge case.
>
> Damian
This is why I would rather the o -> [o] circumfixion left [o] an infix,
not prefix operator. I would rather be explicit about my identity:
$product = 1 [*] @array;
$factorial = 1 [*] 1..$n;
$text = "" [~] @lines;
$balance = $balance [-] @debits;
or
$balance [-]= @debits;
which last suggests how you don't always want to start at the identity,
and how [-] makes sense in this context.
Unfortunately it doesn't handle the case of join $sep, @strings:
"" [{$^a ~ $sep ~ $^b}] @strings
(yes, I know that's not going to pass lexical analysis) since, as was
pointed out, you get an extra $sep at the front.
yours,
Roger Hale
The "err" operator bind only to the point on the instruction it is
attached to, ie it's not a shortcut for eval(), right?
I'm just seeing some edge cases here for custom defined operators; you
don't want exceptions getting silently converted to default values...
Sam.
Hmm. Not all operators *have* an identity.
You'd have to write, in that case;
@array[0] [ƒ] @array[1..@array.end]
Which isn't great IMHO.
Sam.
So, what's wrong with not providing an identity? It's not like we
cannot have a module that says "[*] will be its own operator that
defaults to 1, [+] will default to 0", and so forth. In fact, I can
easily see a module like that being very useful.
Remember - the language DOES NOT have to provide the sinecure for all
situations. Discussions like this only emphasize that fact. Give it a
module and let it go.
In fact, if there's two different and competing implementations for
the reduce meta-operator, then maybe it's better to provide a
reduce-noident and reduce-ident and let people consciously choose
which one they want to use ...
Rob
But I would like to stress that this is an *optional* property.
We will only define an identity value for those operators for
which it is obvious what an "accumulator" should be initialized to.
(This is probably also the set of operators for which op= doesn't warn
when the left side is undefined, so "identval" is doing double duty.
In Perl 5 we hardwired that, so it'd be nice to generalize the concept
in Perl 6.)
In fact, by the accumulator argument, maybe it should be called
"initvalue" rather than "identvalue".
Larry
< and > still don't make sense as reduce operators. Observe the table:
# of args | Return (type)
0 | -Inf
1 | Num (the argument)
2 | bool
... | bool
There is a pretty fundamental difference between X,X -> X operators
and X,X -> Y operators wrt reduce. Most languages don't even allow
the latter kind, but I think it would be cool if we did.
I'm not really sure what you mean by "it can mask real exceptions".
If the operator is defined to require 1 or more (or in the case of
nonhomogeneous operators, 2 or more) arguments, then that is a real
exception. "err" just fails to be a realistic fallback mode.
In any case, I'm fairly satisfied with your conclusion (but I wasn't
when I began writing this reply :-). I just think we have to give
nonhomogeneous operators like < some special treatment. So, from our
somewhat lexical definition of reduce, I don't think an identity input
is what we're looking for. We really want an identity output.
Binary operators really need two such outputs: one for zero arguments
and one for one argument. The one argument output can be inferred
from the zero argument output (but not vice versa). So, we could
define two properties:
is nullaryvalue(-Inf)
is unaryfunction({ 1 })
And, of course, if nullaryvalue is not defined, then we die (or fail)
when zero arguments are given. unaryfunction will default to {
&op(nullaryvalue, 1) }, or { $_ } if nullaryvalue was not defined,
unless the output type is incompatible with the input type (in which
case we have to die even with one argument). We could also define the
unaryfunction automatically if a default value is given for one of the
arguments of a binary operator.
But this all ceases to be a problem for list-associative operators,
which presumably take a list of arguments rather than just two.
That reminds me, how are <, >, etc. defined anyway? How can we tell
them to be list-associative with each other?
Luke
Yeah, I keep confusing them with min and max.
: That reminds me, how are <, >, etc. defined anyway? How can we tell
: them to be list-associative with each other?
Because they're all of that specific precedence level, which is defined
to work that way by fiat, I suspect. (Other operators have to be
explicitly declared as list associative, and even then are only list
associative with themselves, not with all other operators in their
precedence level.) I suppose it could be construed as some kind of
default property on the actual precedence level, but it's not clear
that that would be a useful generalization.
Larry
Okay, I was referring more to the implementation. How do we tell apart:
3 < 4 <= 5 == 5
From
3 lt 4 >= 5 != 5
?
Luke
Let's look at the type of one of the many `reduce' variants in Haskell;
foldr1 :: (a -> a -> a) -> [a] -> a
This is the Perl6ish;
sub reduce( ::Code{( ::(someType), ::(someType) ) returns ::(someType)} $func,
Array of ::(someType) ) returns ::(someType);
ie, the function $func supplied must take and return arguments of a single
type. So you have come to the same conclusion as the FP folk :-).
Here I'm using ::(foo) as a coined syntax for a parametric type to the
definition. `someType' need not be defined anywhere else, but must be
the same within the application of a definition.
IMHO we still need to make some movements towards a specification for how
this sort of thing is specified...
> I just think we have to give
> nonhomogeneous operators like < some special treatment. So, from our
> somewhat lexical definition of reduce, I don't think an identity input
> is what we're looking for. We really want an identity output.
When I last looked, in pugs, functions that return "bool" types are
currently setup to return one or the other argument when used with reduce;
like there is a similar;
multi
sub reduce( ::{Code( ::(someType) $a, ::(someType) $b ) returns bool} $func,
Array of ::(someType) ) returns ::(someType);
This gives meaning to operators like [>], which would return the maximum
value from a list.
Sam.
So
$bool = [<] @empty_array; # is false (-Inf < -Inf)
$bool = [<=] @empty_array; # is true (-Inf <= -Inf)
Which would make some sort of sense - in an empty array there's no right
element that's bigger than it's left neighbour ...
And if the case [<] @empty_array should return true it's easy to use ?? ::.
Just my €0.02.
Regards,
Phil
> Okay, I've made up my mind. The "err" option is not tenable because
> it can cloak real exceptions, and having multiple versions of reduce is
> simply multiplying entities without adding much power. So let's allow
> an optional "identvalue" trait on operators. If it's there, reduce
> can use it. If it's not, reduce returns failure on 0 args.
So, just to clarify, for the expression:
[op] @list
if C<op> has an C<:identval> trait then the result is:
* &op.identval() if @list == 0
* @list[0] if @list == 1
* @list[0] op @list[1] op...op @list[-1] if @list > 1
Otherwise the result is:
* fail if @list == 0
* @list[0] if @list == 1
* @list[0] op @list[1] op...op @list[-1] if @list > 1
Correct?
> But I would like to stress that this is an *optional* property.
> We will only define an identity value for those operators for
> which it is obvious what an "accumulator" should be initialized to.
> (This is probably also the set of operators for which op= doesn't warn
> when the left side is undefined, so "identval" is doing double duty.
So, to clarify again, if $var is undefined, then the assignment:
$var op= $value;
is equivalent to:
$var = (&op.does(identval) ?? &op.identval() :: undef) op $value;
Correct?
> In Perl 5 we hardwired that, so it'd be nice to generalize the concept
> in Perl 6.)
Indeed.
> In fact, by the accumulator argument, maybe it should be called
> "initvalue" rather than "identvalue".
I'd be very leery of calling it "initvalue", since people are apt to confuse it
with real initialization. I'd be inclined to call it "identityval" and simply
say that an undefined variable that is op-assigned defaults to the identity
value of the operator (if any).
Damian
I don't think there's a double dispatch there. I think &op just knows
it can default its left argument to an existing attribute value if it
(somehow) knows it's part of an assignment op. There's not a lot of
payback in getting all abstract here, as far as I can see.
Larry
As long as the actual arguments aren't allowed to be lazy/thunky/iteratey,
they can just be evaluated pairwise like normal. That does imply some
kind of temporary storage for
3 < =$IN < 5
so that it only reads =$IN once.
Larry
No need to wait. There is a ton of APL and J code to inspect.
Having predefined identity elements for reductions on empty
arrays is widely exploited.
--
Edward Cherlin
Generalist & activist--Linux, languages, literacy and more
"A knot! Oh, do let me help to undo it!"
--Alice in Wonderland
http://cherlin.blogspot.com
The table of Iverson's chosen identities in J is at
http://www.jsoftware.com/books/help/dictionary/d420.htm
He uses _ for +Inf and __ for -Inf and does cover < and >.
> We can go as far as having -Inf on [<] and +Inf on [>],
> but maybe that means we have to define values that are
> positive and negative infinity on strings so we can get [gt]
> and [lt] indentity values. Those might be useful in any event.
My Second System alarm is going off. What is the result of
(-Inf)~(+Inf) or whatever the syntax would be? Or indeed
(-Inf)~$String for any string? I know what it means to say that
+Inf is the identity for a numeric min function, and -Inf for
max, but an infinite string value baffles me. It would have to
satisfy -Inf le "", right? But with no values in between?
When considering what extensions to make in APL and J, Iverson
always asked what identities could be preserved at the
boundaries. Thus if C is identical to A,B (APL catenation) with
neither A nor B empty, then f/C is identical to (f/A)f(f/B). So
using identity elements, we can extend this identity to some
functions, and to arrays within their domains.
What identities did you have in mind for string infinities? What
sort of reduction can you do on "Hello, "~(+Inf)~"world"? How
does it print? What good is it at all?
I agree with that from the APL, J, Haskell, and FP directions and
several more.
That means that we have to straighten out the functions that can
return either a Boolean or an item of the argument type.
Comparison functions < > <= >= = != should return only Booleans,
IMHO, and the same for the string functions lt and gt and the
rest. It also means that we need primitive functions (operators)
like max and min that only return one of the arguments, and that
can also be used with a reduction operator (metaoperator).
I'm not sure but Perl6 could do better or at least trickier ;)
Let's assume that < > <= >= when chained return an accumulated
boolean and the least or greatest value where the condition was
true. E.g.
0 < 2 < 3 returns 0 but true
1 < 2 < 1 returns 1 but false
4 < 5 < 2 returns 2 but false
Then the reduce versions [<] and [<=] naturally come out as min
and strict min respectively.
Is it correct that [min] won't parse unless min is declared
as an infix op, which looks a bit strange?
if 3 min 4 { ... }
--
TSa (Thomas Sandlaß)
The natural method of implementation would imply that the
final is returned:
0 < 2 < 3 returns 3 but true
1 < 2 < 1 returns 1 but false
4 < 5 < 2 returns 2 but false
The application of each stage of the chain has to remember
the right hand value (for the next stage of the comparison)
as well as the accumulated boolean result. When the boolean
result is true, that has < and <= returning the max, and > and
>= returning the min - the opposite of what you asked above.
When the numbers are not in the desired order, it would be
nice to shirtcircuit and not continue on with the meaningless
comparisons as soon as one fails - which means that the max
or min value could not be known.
Whatever is chosen, though, still has to make sense for other
chained comparisons:
$v != $w < $x > $z == $z
cannot sensibly return either the max or the min (which would
it choose?).
I'd be inclined to have the result be val but true/false
where val is the right hand operand of the final comparison
actually tested. When a consistant set of operators is used
(a mixture of <, <=, and ==; or a mixture of >, >=, and ==)
- then a true boolean result also provides the max (or min
respectively) value, while a false boolean result provides
the value of the first element that was out of order.
--
> Let's assume that op is overloaded for two completely unrelated types
> A and B, which are both defining their respective identity elements
> but !(A.identval =:= B.identval). How should the &op multi method object
> pick the correct one *without* looking at $value's type?
Your mistake is in thinking that the identity trait is on the operand type. It
isn't; it's on the operator itself.
> Or is the
> intent to enforce a unique identity value for each operator like 0 for +
> and 1 for *?
Almost. Remember that overloaded operators are also distinguishable by their
operand types, so you can specify separate identity values for C<&infix:<+>(Num,
Num)> and C<&infix:<+>(Matrix, Matrix)>.
> There's actually a second problem. Will the &op.does(identval) condition
> be true or false if the &op multi contains some targets with and some
> without an identval?
The op will have already been selected by the MMD mechanism before that question
is asked.
> Finally I don't understand how the knowledge about a pending assignment
> eases the choice problem for the multi. Note that the choice of
> assignment operator depends on the return value of the operator and
> the type of which the lhs is undef.
The MMD mechanism sorts out which op is required, either by looking at the
static type of the lhs variable, or by treating the undef as a coercion
(Manhattan distance = 1)
Damian
>>You haven't convinced me, but rather than flog a dead horse,
>>I'll just suggest that we both reserve the right to say "I
>>told you so" when there are several years' worth of Perl 6
>>code out there, and we see how common our respective examples
>>are.
>
> No need to wait. There is a ton of APL and J code to inspect.
> Having predefined identity elements for reductions on empty
> arrays is widely exploited.
I would be more convinced by this body of experience if I felt that the wide
exploitation of implicit identity elements actually helped improve the
predictability and maintainability of programs written in those languages,
neither of which is renowned for its accessibility to ordinary programmers. ;-)
Damian
An interesting idea, but seems a bit heavy..
> Is it correct that [min] won't parse unless min is declared
> as an infix op, which looks a bit strange?
> if 3 min 4 { ... }
Sure. Again there is a Haskell example to heed closely here; for instance, the function:
divMod :: (Integral a) => a -> a -> (a, a)
Can be written as either;
divMod 42 9
or:
42 `divMod` 9
The reverse direction is ();
(+) :: (Num a) => a -> a -> a
(+) 7 5
7 + 5
Sam.
That's how it was done in APL. In fact, that's how every dyadic
(two-argument) function was done in APL. It looks strange at
first, but the syntax is simpler.
"I eat my peas with honey.
I've done it all my life.
It makes them taste real funny,
But it keeps them on the knife."
> The natural method of implementation would imply that the
> final is returned:
>
> 0 < 2 < 3 returns 3 but true
>
> 1 < 2 < 1 returns 1 but false
>
> 4 < 5 < 2 returns 2 but false
>
> The application of each stage of the chain has to remember
> the right hand value (for the next stage of the comparison)
> as well as the accumulated boolean result. When the boolean
> result is true, that has < and <= returning the max, and > and
>
> >= returning the min - the opposite of what you asked above.
In J one can do (<)/ 0 1 1, evaluated as
0<(1<1)
0<0
0
Then, since 0<0 is 0, and 0<1 is 1, 0 is a left identity of <
(restricted to Booleans), and </'' is defined to be 0.
I don't suppose anybody here cares, but it turns out that Boolean
scans have a variety of uses, such as running parity, locating
transitions, and Gray Code to binary conversion, and Boolean
identity elements have their uses within these schemes.