On Fri, Mar 23, 2012 at 03:03:09PM +1300, Martin D Kealey wrote:
> On Thu, 22 Mar 2012, Carl Mäsak wrote:
> > Jonathan Lang (>>), Daniel (>):
> > >> 1, 2, 4 ... 100 # same as 1,2,4,8,16,32,64
> > > That last one doesn't work on Rakudo :-(
> > And it never will. Note that 100 is not a power of 2, and that the goal
> > needs to match exactly.
> Hmmm, so it's likely that most times you get a Num rather than an Int or
> Rat, those won't stop either?
> 1, 7 / 6.0 ... 2
> 1, sqrt(2), 2 ... 8
The expression 7/6.0 produces a Rat, so the first sequence properly stops at 2.
On Rakudo on my system, sqrt(2) indeed produces a Num,
but since floating point arithmetic doesn't result in
sqrt(2) / 1 == 2 / sqrt(2), no geometric sequence is deduced and the sequence fails with "unable to deduce sequence".
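The two claims above can be checked directly. This is a minimal sketch, assuming a reasonably recent Rakudo; the variable names are only illustrative. It prints the types involved and then compares the two ratios that a geometric deduction would need to find equal.

```raku
# 7/6.0 divides an Int by a Rat literal, so the result stays an exact Rat.
say (7/6.0).^name;      # Rat

# sqrt returns a floating-point Num.
say sqrt(2).^name;      # Num

# The two ratios a geometric deduction has to find equal:
my $r1 = sqrt(2) / 1;
my $r2 = 2 / sqrt(2);
say $r1 == $r2;         # False on IEEE 754 doubles
say abs($r1 - $r2);     # the tiny difference that defeats the deduction
```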
> Question: do we support
> 1, 2i, -4 ... 256
I think this ought to work, but for some reason Rakudo on my system
hangs whenever I try it. The following does work in Rakudo:
> On Rakudo on my system, sqrt(2) indeed produces a Num,
> but since floating point arithmetic doesn't result in
> sqrt(2) / 1 == 2 / sqrt(2), no geometric sequence is deduced
> and the sequence fails with "unable to deduce sequence".
Although, arguably, that might be considered a bug.
Not that sqrt(2) / 1 should == 2 / sqrt(2) of course, but that, when
deducing a sequence, we know we're comparing quotients, so we ought to
allow for the inevitable loss of significant digits within the two
preliminary division ops, and therefore compare the results with a
suitably larger epsilon.
That would not only be computationally more justifiable,
I suspect it might also produce more "least surprise". ;-)
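A minimal sketch of what such a tolerant comparison might look like, written as ordinary user code rather than as anything Rakudo actually does. The name ratios-agree and the default relative tolerance of 1e-12 are assumptions made purely for illustration.

```raku
# Hypothetical helper: do three terms look geometric, allowing for
# rounding error in the two divisions?
sub ratios-agree($a, $b, $c, :$eps = 1e-12) {
    my $r1 = $b / $a;
    my $r2 = $c / $b;
    # Relative comparison, so the tolerance scales with the ratios.
    abs($r1 - $r2) <= $eps * max(abs($r1), abs($r2));
}

say ratios-agree(1, sqrt(2), 2);   # True: the ratios differ only by rounding
say ratios-agree(1, 2, 5);         # False: 2 and 2.5 are genuinely different
```

The choice of tolerance is exactly the contentious part: too small and the deduction still fails, too large and unrelated lists start to look geometric.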
>> On Rakudo on my system, sqrt(2) indeed produces a Num,
>> but since floating point arithmetic doesn't result in
>> sqrt(2) / 1 == 2 / sqrt(2), no geometric sequence is deduced
>> and the sequence fails with "unable to deduce sequence".
> Although, arguably, that might be considered a bug.
> Not that sqrt(2) / 1 should == 2 / sqrt(2) of course, but that, when
> deducing a sequence, we know we're comparing quotients, so we ought to
> allow for the inevitable loss of significant digits within the two
> preliminary division ops, and therefore compare the results with a
> suitably larger epsilon.
> That would not only be computationally more justifiable,
> I suspect it might also produce more "least surprise". ;-)
But unless we twist smartmatching semantics for that purpose, it means
we cannot do the same fuzziness for the endpoint, condemning people to
write infinite loops instead of failing fast.
So I'm firmly against such magic. All the previous iterations of the
sequence operator had some additional degrees of magic, and we've come
to regret all of them. This discussion makes me think that maybe
deducing geometric sequences is too much magic as well.
> On Fri, Mar 23, 2012 at 03:03:09PM +1300, Martin D Kealey wrote:
>> Question: do we support
>> 1, 2i, -4 ... 256
> I think this ought to work, but for some reason Rakudo on my system
> hangs whenever I try it.
The problem was that infix:<!=> hung with Complex numbers, because it
was defined for each numeric type, but the Complex candidate was
missing. Thus the most general candidate called .Numeric on its
arguments, re-dispatched, and looped infinitely.
Fixed in 2012.03-3-g4a247b1, and tested in S32-num/complex.t.
This also fixes the sequence 1, 2i, -4 ... 256.
> But unless we twist smartmatching semantics for that purpose,
No!
Please, no.
;-)
> it means we cannot do the same fuzziness for the endpoint,
Except that we will be encouraging people to use: * >= $END
as their standard endpoint pattern, which will provide
most of the necessary fuzz.
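For reference, a small sketch of that endpoint pattern, assuming a reasonably recent Rakudo. A closure endpoint stops the sequence at the first element for which it returns true, and the ...^ form leaves that element out.

```raku
# An endpoint that the sequence hits exactly: 64 is a power of 2.
say (1, 2, 4 ... 64);                    # (1 2 4 8 16 32 64)

# A closure endpoint stops at the first element that satisfies it,
# and ... includes that element.
my @inclusive = 1, 2, 4 ... * >= 100;
say @inclusive;                          # [1 2 4 8 16 32 64 128]

# ...^ excludes the element that satisfied the endpoint.
my @exclusive = 1, 2, 4 ...^ * >= 100;
say @exclusive;                          # [1 2 4 8 16 32 64]
```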
> So I'm firmly against such magic.
But that's the point: it's not magic. It's correct numerical computation
under the limitations of floating-point arithmetic. As discussed in
every numerical computing textbook for the last half-century.
Note that this problem occurs for *arithmetic* deductions as well.
If either number is floating point, some loss of precision in the
difference between them is almost inevitable, especially if the two
numbers are very close.
For example:
1, 1.000000000001, 1.000000000002 ... *
won't deduce a correct arithmetic sequence either (on most hardware).
(In fact that example would be slightly more likely to imply a geometric
sequence, since a division operation loses fewer significant digits than
a subtraction when the arguments are both so close to 1.0.)
> This discussion makes me think that maybe
> deducing geometric sequences is too much magic as well.
Geometric sequence inference is fine on Ints and Rats.
But, if we can't perform inferences involving Nums in a sound numerical
way (and it may well be that we can't, without taking a noticeable
performance hit), then I think that we would be better off limiting the
deduction of *both* arithmetic and geometric sequences to starting lists
that contain only Ints and Rats.
Note that Rakudo also doesn't require a binding operation for the array... assignment of detectably infinite lists (indicated here by the final Whatever term) is supported.
> Actually, that one works fine in both niecza and rakudo, since those are Rats.
Oh, that's good to hear.
It doesn't change my underlying argument however. Any operations
performed on genuine floats are going to lose precision, and if we're
using such operations to infer relationships (such as equality or
sequence) then we ought to take the loss of precision we're causing
into account when deciding outcomes.
Which seems to mean either (a) making it a compile-time error to request
sequence inferences on floats, or (b) comparing the differences and
quotients within the sequence inference with a larger epsilon (or using
interval arithmetic).
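The difference between the Rat and the genuine-float case can be sketched as follows, assuming a reasonably recent Rakudo on IEEE 754 hardware. The e0 suffix is what turns a decimal literal into a Num; the array names are illustrative.

```raku
# Plain decimal literals are Rats, so the two differences are exactly equal.
my @rat = 1, 1.000000000001, 1.000000000002;
say @rat[1].^name;                               # Rat
say @rat[1] - @rat[0] == @rat[2] - @rat[1];      # True

# The same values written as Nums are each rounded to the nearest double,
# and the two differences no longer agree.
my @num = 1e0, 1.000000000001e0, 1.000000000002e0;
say @num[1].^name;                               # Num
say @num[1] - @num[0] == @num[2] - @num[1];      # False with IEEE 754 doubles
```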
>> Actually, that one works fine in both niecza and rakudo, since those are Rats.
> Oh, that's good to hear.
> It doesn't change my underlying argument however. Any operations
> performed on genuine floats are going to lose precision, and if we're
> using such operations to infer relationships (such as equality or
> sequence) then we ought to take the loss of precision we're causing
> into account when deciding outcomes.
> Which seems to mean either (a) making it a compile-time error to request
> sequence inferences on floats, or (b) comparing the differences and
> quotients within the sequence inference with a larger epsilon (or using
> interval arithmetic).
> Damian
IMHO: if we're going to take loss of precision into account, we should do so explicitly. I'm a bit rusty, so forgive me if I misuse the terminology: if a number has an epsilon, the epsilon should be attached to it as a trait so that it can be accessed by the program. This allows all sorts of things, like "close enough" smart-matching and error propagation. The main question is if Perl should assign a minimum epsilon to all floats by default, or if this should be an "all's fair if you predeclare" type of thing.
On Sat, Mar 24, 2012 at 06:16:58PM -0700, Jonathan Lang wrote:
> IMHO: if we're going to take loss of precision into account, we should do so explicitly. I'm a bit rusty, so forgive me if I misuse the terminology: if a number has an epsilon, the epsilon should be attached to it as a trait so that it can be accessed by the program. This allows all sorts of things, like "close enough" smart-matching and error propagation. The main question is if Perl should assign a minimum epsilon to all floats by default, or if this should be an "all's fair if you predeclare" type of thing.
Speaking as an implementor, I think this is quite unlikely. Some of my
reasons:
1. Interval arithmetic (this is the correct technical term) is very
inefficient on some popular architectures.
2. After a long computation which is numerically stable *but which Perl
cannot _prove_ the stability of*, error bounds will be very large, and
if smart-matching automatically takes them into account, it will tend
to result in extremely surprising false positives.
3. Adding hidden state to numbers that is not visible by default will make
debugging harder. Not hiding the state will make output much noisier.
Larry is free to override this of course. Also, interval arithmetic ought
to be possible as a module.
On Mar 24, 2012, at 6:36 PM, Stefan O'Rear <stefa...@cox.net> wrote:
> On Sat, Mar 24, 2012 at 06:16:58PM -0700, Jonathan Lang wrote:
>> IMHO: if we're going to take loss of precision into account, we should do so explicitly. I'm a bit rusty, so forgive me if I misuse the terminology: if a number has an epsilon, the epsilon should be attached to it as a trait so that it can be accessed by the program. This allows all sorts of things, like "close enough" smart-matching and error propagation. The main question is if Perl should assign a minimum epsilon to all floats by default, or if this should be an "all's fair if you predeclare" type of thing.
> Speaking as an implementor, I think this is quite unlikely. Some of my
> reasons:
> 1. Interval arithmetic (this is the correct technical term) is very
> inefficient on some popular architectures.
> 2. After a long computation which is numerically stable *but which Perl
> cannot _prove_ the stability of*, error bounds will be very large, and
> if smart-matching automatically takes them into account, it will tend
> to result in extremely surprising false positives.
I'll concede that these two (especially the second one) are sufficient arguments against the "default epsilon" idea, rendering my disagreement on the third point largely moot. That said:
> 3. Adding hidden state to numbers that is not visible by default will make
> debugging harder. Not hiding the state will make output much noisier.
That's why I was suggesting making epsilon available as a trait (e.g., something like "5 but error(.01)"): if you need it, it would be easy to look it up; if you don't need it, it would be easy to ignore it.
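A rough sketch of how such a trait could be prototyped today with a role mixin. The role name WithError and its error attribute are hypothetical, and nothing here propagates the bound through arithmetic; it only shows that the extra state can ride along without changing how the value behaves.

```raku
# Hypothetical role: attaches an error bound to an otherwise ordinary value.
role WithError {
    has $.error;
}

# With a single public attribute, the value in parentheses initializes it.
my $x = 5 but WithError(0.01);

say $x + 1;          # 6: the value still behaves like an ordinary Int
say $x.error;        # 0.01: the bound is there if you ask for it
```

Real error propagation would need arithmetic operators that combine the bounds, which is the part that would belong in an interval-arithmetic module.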
> Larry is free to override this of course. Also, interval arithmetic ought
> to be possible as a module.
Absolutely, it would be possible as a module. And with your second point above, it ought to be a module, to be explicitly applied when the programmer wants to use interval arithmetic.
On 2012-March-23, at 12:01 am, Damian Conway wrote:
> [...] we ought to allow for the inevitable loss of significant digits within the two preliminary division ops, and therefore compare the results with a suitably larger epsilon.
> That would not only be computationally more justifiable, I suspect it might also produce more "least surprise". ;-)
I think that comparisons for floating-point values should take some kind of 'significance' adverb and complain if it's missing. Having to be explicit makes for the least surprise of all.
Probably with something like 'use epsilon :within(0.0002)' as a way to declare the fuzziness for a given scope if you have a lot of comparisons. And of course you could use (the equivalent of) 'use epsilon :within(0)' to say, "I know what I'm doing, just give me straight what I ask for and I'll take the consequences."
Alternatively, maybe have float-comparisons give an error or warning, and introduce an "approximation operator": π == ~22/7 :within($epsilon). (Except "~" is already taken!)
[I was going to suggest that as a way to handle stopping points in a sequence: 1, 3, 5 ... ~10, but that still wouldn't work without treating the Num::Approx values as a special case, which defeats the purpose. Though with a postfix "up from" operator, you could say: 1, 3, 5 ... 10^.]
On 2012-March-21, at 6:38 pm, Daniel Carrera wrote:
> The idea of smart-matching a function just doesn't quite fit with my brain. I can memorize the fact that smart-matching 7 and &foo means evaluating foo(7) and seeing if the value is true, but I can't say I "understand" it.
Maybe it just needs a better name. "Match" implies that two (or more) things are being compared against each other, and that's how smart-matching started out, but it's been generalised beyond that. The underlying .ACCEPTS method suggests "acceptance"... but that's too broad (a function can "accept" args without returning true). "Agreement" fits, in the sense of "that [food] agrees with me", but I think it suggests equality a bit too strongly. "Accordance"? "Conformance"? "Validation"? That seems a good match (ahem) for the concept: ~~ checks whether some value is "valid" (or "desired"?) according to certain criteria. The obvious way to validate some value against a simple string or number is to compare them; or against a pattern, to see if the value matches; but given a function, you check the value by passing it to the function and seeing whether it says yea or nay.
I'm not sure "validation" or "validity" is the best name, but it conforms better to what smart-"matching" does. Or "conformance".... Hm. But terminology that sets up the appropriate expectations is a good thing.
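The "validation" reading can be seen by putting the different kinds of right-hand side next to each other. A minimal sketch, assuming a reasonably recent Rakudo; is-odd is a made-up example function.

```raku
sub is-odd($n) { $n % 2 == 1 }

say 7 ~~ 7;             # True: against a number, compare
say so 'abc' ~~ /b/;    # True: against a pattern, see whether it matches
say 7 ~~ Int;           # True: against a type, check membership
say 7 ~~ &is-odd;       # True: against a function, call it and ask
say 8 ~~ &is-odd;       # False
```

In each case the right-hand side decides what "acceptable" means, which is why the underlying method is called .ACCEPTS.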
>> it means we cannot do the same fuzziness for the endpoint,
> Except that we will be encouraging people to use: * >= $END
> as their standard endpoint pattern, which will provide
> most of the necessary fuzz.
and which will still surprise those people who are surprised
by floating point inaccuracies.
>> This discussion makes me think that maybe
>> deducing geometric sequences is too much magic as well.
> Geometric sequence inference is fine on Ints and Rats.
> But, if we can't perform inferences involving Nums in a sound numerical
> way (and it may well be that we can't, without taking a noticeable
> performance hit), then I think that we would be better off limiting the
> deduction of *both* arithmetic and geometric sequences to starting lists
> that contain only Ints and Rats.
Floating point numbers *can* represent a huge number of commonly used
values without errors, and you can do error-free arithmetic operations
on many of them. Excluding Nums from automatic deduction feels like an
unnecessary pessimization or stigmatization, especially if you consider
that writing a number like 0.001 in your program gives a Rat by default
not a Num.
Most of the time you only get a Num in Perl 6 if you consciously decide
to write one, in which case you should also be well aware of the
limitations of FP math.
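A short sketch of that distinction, assuming a reasonably recent Rakudo on IEEE 754 hardware. Decimal literals are Rats unless an exponent is written, and Num arithmetic is exact whenever the values involved happen to be representable in binary.

```raku
say (0.001).^name;              # Rat: a plain decimal literal
say (1e-3).^name;               # Num: the exponent makes it floating point

say 0.1 + 0.2 == 0.3;           # True: exact Rat arithmetic
say 0.1e0 + 0.2e0 == 0.3e0;     # False: none of these is exact in binary
say 0.5e0 + 0.25e0 == 0.75e0;   # True: binary fractions are exact as Nums
```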
At least in #perl6 I've never seen anybody try to write an auto-deduced
sequence, and fail because of floating-point errors.
> On 2012-March-23, at 12:01 am, Damian Conway wrote:
>> [...] we ought to allow for the inevitable loss of significant digits within the two preliminary division ops, and therefore compare the results with a suitably larger epsilon.
>> That would not only be computationally more justifiable, I suspect it might also produce more "least surprise". ;-)
> I think that comparisons for floating-point values should take some kind of 'significance' adverb and complain if it's missing. Having to be explicit makes for the least surprise of all.
Note that neither 22/7 nor 0.002 are floating-point values.
I don't know if the majority of the perl6-language posters have realized
it yet, but both Perl 6 and its implementations are quite mature
these days. Mature enough that such proposals should be prototyped as
modules, and thoroughly tested on lots of existing code before being taken
into consideration for
Niecza supports operator adverbs, and supports them on user-defined
operators, so there's nothing to stop you from trying it.
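As a starting point for such a prototype, here is a minimal sketch of a user-defined fuzzy comparison. The operator name approx and the default tolerance of 1e-9 are assumptions chosen only for illustration, and the tolerance is passed here with the plain function-call form rather than as an adverb.

```raku
# Hypothetical operator: absolute-difference comparison with a named tolerance.
sub infix:<approx>($a, $b, :$within = 1e-9) {
    abs($a - $b) <= $within
}

# Parentheses are needed because approx gets the same precedence as +.
say (0.1e0 + 0.2e0) approx 0.3e0;                 # True

say pi approx 22/7;                               # False at the default tolerance
say infix:<approx>(pi, 22/7, :within(0.002));     # True with a looser one
```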
> At least in #perl6 I've never seen anybody try to write an auto-deduced
> sequence, and fail because of floating-point errors.
Except for Martin's 1, sqrt(2), 2...8
But, yes, the widespread use of Rats rather than Nums
means only the edgiest of edge-cases fails. And as you get
an explicit Failure when it does happen, at least people will
know when the numerical computations don't work as hoped.
> I don't know if the majority of the perl6-language posters have realized
> it yet, but both Perl 6 and its implementations are quite mature
> these days. Mature enough that such proposals should be prototyped as
> modules, and thoroughly tested on lots of existing code before being taken
> into consideration for
... inclusion into the spec.
Sometimes I do finish my sentences with several hours' delay, sorry for that.
I also like "agreement", "conformance"... In a situation like this, I
reach for a thesaurus, which is very useful when looking for just the
right name for a variable or method, or the right way to describe a
concept. Here's a grab bag to start with:
Fit, correspond, congruity, harmonize seem like other good
descriptions for the concept. Fit is especially good due to its
brevity, and congruence is good due to the use of ~~ as the smartmatch
aka congruence/fitness/agreement/harmonizing/correspondence/conformance
operator.
On Sun, Mar 25, 2012 at 12:35 AM, David Green <david.gr...@telus.net> wrote:
> On 2012-March-21, at 6:38 pm, Daniel Carrera wrote:
>> The idea of smart-matching a function just doesn't quite fit with my brain. I can memorize the fact that smart-matching 7 and &foo means evaluating foo(7) and seeing if the value is true, but I can't say I "understand" it.
> Maybe it just needs a better name. "Match" implies that two (or more) things are being compared against each other, and that's how smart-matching started out, but it's been generalised beyond that. The underlying .ACCEPTS method suggests "acceptance"... but that's too broad (a function can "accept" args without returning true). "Agreement" fits, in the sense of "that [food] agrees with me", but I think it suggests equality a bit too strongly. "Accordance"? "Conformance"? "Validation"? That seems a good match (ahem) for the concept: ~~ checks whether some value is "valid" (or "desired"?) according to certain criteria. The obvious way to validate some value against a simple string or number is to compare them; or against a pattern, to see if the value matches; but given a function, you check the value by passing it to the function and seeing whether it says yea or nay.
> I'm not sure "validation" or "validity" is the best name, but it conforms better to what smart-"matching" does. Or "conformance".... Hm. But terminology that sets up the appropriate expectations is a good thing.