Mark Dickinson wrote:
> On Thu, Mar 13, 2008 at 4:20 AM, Imri Goldberg <lorg...@gmail.com> wrote:
>
> My suggestion is to do either of the following:
> 1. Change floating point == to behave like a valid floating point
> comparison. That means using precision and some error measure.
> 2. Change floating point == to raise an exception, with an error
> string
> suggesting using precision comparison, or the decimal module.
>
>
> I don't much like either of these; I think option 1 would cause
> a lot of confusion and difficulty---it changes a conceptually
> simple operation into something more complicated.
>
> As for option 2., I'd agree that there are situations where having
> a warning (not an exception) for floating-point equality (and
> inequality) tests might be helpful; but that warning should be
> off by default, or at least easily turned off.
As I said earlier, I'd like static checkers (like Python-Lint) to catch
this sort of case, whatever the decision may be.
>
> Some Fortran compilers have such a (compile-time) warning,
> I believe. But Fortran's users are much more likely to be
> writing the sort of code that cares about this.
>
>
> Since this change is not backwards compatible, I suggest it be added
> only to Python 3.
>
>
> It's already too late for Python 3.0.
Still, I believe it is worth discussing.
>
>
> 3. Programmers will still need the regular ==:
> Maybe, and even then, only for very rare cases. For these, a special
> function/method might be used, which could be named floating_exact_eq.
>
>
> I disagree with the 'very rare' here. I've seen, and written, code like:
>
>     if a == 0.0:
>         # deal with exceptional case
>     else:
>         b = c/a
>         ...
>
> or similarly, a test (a==b) before doing a division by a-b. That
> one's kind of dodgy, by the way: a != b doesn't always guarantee
> that a-b is nonzero, though you're okay if you're on an IEEE 754
> platform and a and b are both finite numbers.
While checking against a == 0.0 (and other similar conditions) before
dividing will indeed protect from outright division by zero, it will
amplify any error already present in the computation. I guess it would be
better to check instead for 'a is small', for an appropriate value of
'small'.
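For illustration, a sketch of the kind of check I mean (the threshold here is arbitrary; the right value of 'small' depends entirely on the computation at hand):

small = 1e-12   # placeholder tolerance, not a universally correct value
if abs(a) < small:
    pass        # treat a as effectively zero; handle the exceptional case here
else:
    b = c / a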
>
> Or what if you wanted to generate random numbers in the open interval
> (0.0, 1.0). random.random gives you numbers in [0.0, 1.0), so a
> careful programmer might well write:
>
>     while True:
>         x = random.random()
>         if x != 0.0:
>             break
>
> (A less fussy programmer might just say that the chance
> of getting 0.0 is about 1 in 2**53, so it's never going to happen...)
>
> Other thoughts:
>
> - what should x == x do?
If suggestion no. 1 is accepted, always return True. If no. 2 is
accepted, raise an exception.
Checking x==x is as meaningful as checking x==y.
> - what should
>
> 1.0 in set([0.0, 1.0, 2.0])
>
> and
>
> 3.0 in set([0.0, 1.0, 2.0])
>
> do?
>
Actually, one of the reasons I thought about this subject in the first
place was dict lookup for floating point numbers. It seems to me that
it's something you just shouldn't do.
As for your examples, I believe these two should both raise an
exception. This is even worse than a normal comparison - here you are
checking against the hash of a floating point number. So if you do that
with the current implementation, there's a good chance you'll get
unexpected results. If you do that under the implementation of
suggestion 1, you'll have a hard time making set work.
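For example, a quick session showing the kind of surprise I mean (exact reprs may differ slightly across platforms):

>>> 0.1 + 0.2
0.30000000000000004
>>> 0.1 + 0.2 in set([0.3])
False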
> Mark
Cheers,
Imri.
-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------
Mark Dickinson wrote:
> (with apologies for the random extra level of quoting in the below...)
>
>
> On Thu, Mar 13, 2008 at 11:09 AM, Imri Goldberg <lorg...@gmail.com> wrote:
>
> As I said earlier, I'd like static checkers (like Python-Lint) to catch
> this sort of case, whatever the decision may be.
>
>
> Hmm. Isn't that tricky? How does the static checker decide
> whether the objects being compared are floats? I guess one could
> be content with catching some cases where the operands to ==
> are clearly floats... Wouldn't you have to have run-time warnings
> to be really sure of catching all the cases?
>
Yes. Writing a static checker for Python is tricky in any case. For the
sake of this discussion, it might be useful to refer to some 'ideal'
static checker. This will allow us to better define the desired
behavior.
>
> > It's already too late for Python 3.0.
> Still, I believe it is worth discussing.
>
>
>
> Sure. I didn't mean that to come out in quite the dismissive way it
> did :).
> Apologies. Maybe a PEP aimed at Python 4.0 is in order. If you're open
> to the idea of just having some way to enable warnings, it could be
> much sooner.
>
I think that generating a warning (by default?) is a strong enough
change in the right direction, so we should add that as another option.
(Was also suggested in a comment on my blog.)
>
> While checking against a == 0.0 (and other similar conditions) before
> dividing will indeed protect from outright division by zero, it will
> amplify any error already present in the computation. I guess it would be
> better to check instead for 'a is small', for an appropriate value of
> 'small'.
>
>
> Still, a check for 0.0 is good enough in some cases: if a is tiny, the
> large intermediate values may appear and then disappear happily
> before giving a sensible final result. These are usually the sort
> of cases where just having division by 0.0 return an infinity
> would have "just worked" too (making the whole "if" redundant), but
> that's not (currently!) an option in Python.
>
> It's a truism that floating-point equality tests should be avoided, but
> it's just not true that floating-point equality testing is *always* wrong,
> and I don't think that Python should make it so.
>
Alright, that's why in my original suggestion I proposed a function for
'old-style' comparison.
It still seems to me that in most cases you are better off doing
something other than using the current ==.
A point I'm not sure about, though, is what happens to the other comparison
operators, namely <=, <, >, >=. If they retain their original meaning, then
<= and >= become at least a bit inconsistent.
I'll be glad to hear more opinions about this.
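To make the consistency question concrete, here is a rough sketch of what I mean (fuzzy_eq and the tolerance values are made up, purely for illustration):

def fuzzy_eq(a, b, rel_tol=1e-9, abs_tol=0.0):
    # approximate equality using relative and absolute tolerances
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

def fuzzy_le(a, b, rel_tol=1e-9, abs_tol=0.0):
    # if == becomes fuzzy, <= presumably means "less than, or fuzzily equal"
    return a < b or fuzzy_eq(a, b, rel_tol, abs_tol)

# The inconsistency: fuzzy_le(1.0 + 1e-12, 1.0) is True even though
# 1.0 + 1e-12 is strictly greater than 1.0 in the ordinary sense.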
> Actually, one of the reasons I thought about this subject in the first
> place was dict lookup for floating point numbers. It seems to me that
> it's something you just shouldn't do.
>
>
> So your proposal would presumably include making
>
> x in dict
>
> and
>
> x not in dict
>
> errors for any float x, regardless of the contents of the dictionary
> (or list, or set, or frozenset, or...) dict?
>
> What would you do about Decimals? A Decimal is just another
> floating point format (albeit base 10 instead of base 2); so
> presumably all these warnings/errors should apply equally
> to Decimal instances? If not, why not?
>
This last note gave me pause. I still need to think more about this, but
here are my thoughts so far:
1. Decimal's behavior might be considered even more inconsistent - the
precision applies to arithmetical operations, but not to comparisons
(a short session after this list illustrates the difference).
2. As a result, it seems to me that decimal's behavior might also be
changed.
It needn't be the same change as regular floating point though - decimal
behavior might follow suggestion 1, while regular floating points might
follow suggestion 2. (I see no point in it being the other way around
though.)
3. Usage in containers depending on __hash__ should change according to
how == behaves for decimals. If == raises a warning/exception, so
should "x in {..}". If == is changed to work according to precision
for decimals, then usage in containers will be (very) problematic,
because of context changes. (Consider what happens when the precision
is changed.)
4. Right now, I would avoid using decimal or regular floating points in
such containers. The results are just not predictable enough. Using the
'ideal static-checker' mentioned above, I'd say that any such use should
result in a warning.
In any case, there might be a place for a way to do floating point
comparisons in a 'standard' manner.
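To illustrate point 1 above, a quick session with the decimal module (output formatting may vary slightly between versions):

>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 3
>>> print Decimal("1.0000") + Decimal("0")    # arithmetic is rounded to the context precision
1.00
>>> Decimal("1.0000") == Decimal("1.0001")    # comparison ignores the context precision
False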
> I'm not trying to be negative here---as Aahz says, this is an
> interesting idea; I'm just trying to understand exactly how
> things might work.
>
> Mark
Sure, so do I.
Also, if you have <= and >= then you can cheat by
doing 'x <= y and x >= y'. :-)
--
Greg
That's part of what I meant.
There's also the problem that if x>y, then you want x!=y. This means
that there are implications for all comparison operators.
This makes changing == behavior to an epsilon comparison more involved.
I still think it is feasible, but will require much more consideration.
In any case, emitting a warning for == is still 'cheap', and the
original arguments stand.
-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------
Imri Goldberg wrote:
> This makes changing == behavior to an epsilon comparison more involved.
> I still think it is feasible, but will require much more consideration.
Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__,
__nsim__, __ltsim__, and __gtsim__ slots.
I'm not at all sure how serious I am right now. It's late, and I have
fuzzy recollections of how those kinds of things might have been nice in
some past numerical code.
And then =~ and !~ could be defined for strings and do regular
expression matching! Woo! More operators! With pronouns!
Neil
> On Thu, Mar 13, 2008 at 6:18 PM, Imri Goldberg <lorg...@gmail.com> wrote:
>
> This makes changing == behavior to an epsilon comparison more involved.
> I still think it is feasible, but will require much more consideration.
>
>
> Okay, now I am going to be negative. :-)
>
> I really think that there's essentially zero chance of == and != ever
> changing
> to 'fuzzy' comparisons in Python. I don't want to discourage you from
> working
> out possible details as an academic exercise, or perhaps with some other
> (Python-like?) language in mind, but I just don't see it ever
> happening in Python.
> Maybe I'm wrong, in which case I hope other python people will tell me so,
> but I think pursuing this is, in the end, going to be a waste of time.
>
Alright, I agree it's a good idea to drop the proposal to change
floating point == into an epsilon compare.
What about issuing a warning though?
Consider the following course of action. It is the one with the least
changes:
== for regular floating point numbers now issues a warning, but still
works. This warning might be turned off. All other operators are left
unchanged.
Do you think this should be dropped as well?
Just for my own code, I think I'd like this behavior. I still consider
floating point == a potential bug, and this helps me catch it, in the
absence of the 'ideal static checker'.
> Containers would be affected in peculiar ways. I think people would be
> really surprised to find that 1.0+2e-16 *was* an element of the set {1.0},
> or that 1.0 and 1.0+2e-16 weren't allowed to be different keys in a dict.
> And how on earth do you check for set or dict membership under the
> hood?
>
I think that right now containers behave in peculiar ways when used with
FP numbers.
Take set, for example - you might as well just use a list instead.
When you consider dict, doing d[x] might not return the result you
actually want.
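A small example of the dict case (same underlying issue as with set):

>>> d = {0.3: "expected value"}
>>> d.get(0.1 + 0.2, "not found")
'not found'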
> I don't know of any other language that has successfully done this, even
> though I've seen the idea floated many times for different languages.
> That doesn't mean much, since I only know a small handful of the many
> hundreds (thousands?) of languages out there. If you know a
> counterexample, I'd be interested to hear it.
>
> Mark
Don't know of a good counterexample. I agree that before changing the
behavior of == to fuzzy comparison, you'll want experience with that
kind of change.
Cheers,
Imri
Imri Goldberg wrote:
> Alright, I agree it's a good idea to drop the proposal to change
> floating point == into an epsilon compare.
> What about issuing a warning though?
> Consider the following course of action. It is the one with the least
> changes:
> == for regular floating point numbers now issues a warning, but still
> works. This warning might be turned off. All other operators are left
> unchanged.
> Do you think this should be dropped as well?
Though, clearly, that's what DeprecationWarning should immediately be
renamed to :-).
Bill
There are two possible approaches:
1. Python users have to know that the representation of floats has some
problems.
2. Python users must not have to care about the internal float representation.
Solution 2 is not good, because someday somebody will complain that
computer calculations are not accurate (some scientist who was not
willing to learn how a computer stores floats).
It is better to choose solution 1 -- beginners will have to accept that a
computer is not able to store every real number, because floats are
stored as binary numbers. Maybe the operator "==" for floats should be
deprecated, and people should use something like "!~" or "=~", and they
should be able to set the precision for float numbers?
> Don't forget a !~ b, a <~ b, and a >~ b, and the associated __sim__,
> __nsim__, __ltsim__, and __gtsim__ slots.
I think that all of these are a bad idea. In my experience,
when comparing with a tolerance, you need to think carefully
about what the appropriate tolerance is for each and every
comparison. Having a global default tolerance would just
lead people to write sloppy and unreliable numerical code.
--
Greg
> == for regular floating point numbers now issues a warning, but still
> works. This warning might be turned off.
I think I would find it annoying to have to disable a warning
whenever I legitimately wanted to do a floating ==.
Also, having a global warning/no warning setting for the
whole program isn't really right -- whether a floating == is
legitimate is something that needs to be decided on a
case-by-case basis.
--
Greg
Given the discussion here, and some more reading on my part, it seems to
me that there isn't much chance of me convincing anyone to raise an
exception on FP ==. I'm not too sure that it's the right move anyway.
While I'll probably avoid FP == in my code, it seems to me that there
are some cases where it is useful (even given the inaccuracy of the results).
Regarding adding warnings to pychecker/pylint, I think it's a good idea.
Probably for another mailing list though :).
I also considered the subject of runtime warnings.
Adding the relevant warnings to any static checker could be really hard
work, while warning at runtime could be a lot easier. Therefore, it
seems worthwhile to consider this option. I hadn't used the
warnings module before, so I read its documentation now (also the PEP)
and played with it a little.
First, if a warning is generated for floating point ==, it can be turned
off globally, or on a line-by-line basis.
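A sketch of how that could look with the existing warnings machinery (FloatEqualityWarning is hypothetical -- no such category exists today):

import warnings

class FloatEqualityWarning(UserWarning):
    # hypothetical category that a float == warning might use
    pass

# turn the warning off globally...
warnings.filterwarnings("ignore", category=FloatEqualityWarning)

# ...or only where it is raised, on a particular line of a particular module
warnings.filterwarnings("ignore", category=FloatEqualityWarning,
                        module="mymodule", lineno=42)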
Second, regarding Mark's comment on SmellyCodeWarning: I thought about
it a bit, and it doesn't seem like a joke to me. gcc has a -Wall mode, and
so does Python. Why not use it in this situation? (i.e. have some warnings
that are not displayed by default.)
I think it would be interesting to consider more cases of
'SmellyCodeWarning' in general, and to add them under some warning
category. If there's a need for a use case, we've already got the first
one - floating point comparisons.
Cheers,
Imri.
-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------
Mark Dickinson wrote:
> I really think that there's essentially zero chance of == and != ever
> changing to 'fuzzy' comparisons in Python.
They sort of already did -- you can define __eq__ and __ne__ on your
own class in bizarre and inconsistent ways. [Though I think you can't
easily override that (x is y) ==> (x==y).]
You can even do this with your own float-alike class.
What you're really asking for is that the float class take advantage of this.
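For instance, a minimal sketch of such a float-alike with a fuzzy __eq__ (the fixed tolerance is for illustration only, and hashing/containers would still need thought, as discussed above):

class FuzzyFloat(float):
    def __eq__(self, other):
        # approximate equality with an arbitrary absolute tolerance
        return abs(float(self) - float(other)) < 1e-9
    def __ne__(self, other):
        return not self.__eq__(other)

# FuzzyFloat(0.1 + 0.2) == 0.3 is True, while (0.1 + 0.2) == 0.3 is False.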
> I don't know of any other language that has successfully done this, ...
Changing an existing class requires that the class be "open". That is
the default in languages like smalltalk or ruby. It is even the
default for python classes -- but it is certainly not the default for
"python" classes that are actually coded in C -- which includes
floats.
-jJ
> Alright, I agree it's a good idea to drop the proposal to changing
> floating point == into an epsilon compare.
> What about issuing a warning though?
> Consider the following course of action. It is the one with the least
> changes:
> == for regular floating point numbers now issues a warning, but still
> works. This warning might be turned off. All other operators are left
> unchanged.
If you change ==, you should really change !=, and probably the other
comparisons as well.
I suspect what you really want is a warning on any usage of a floating
point. And I'm only half-joking. Comparison (or arithmetic) with
other floats adds error. Comparison (or arithmetic) with ints is
*usually* a bug (unless one of the operands is a constant that someone
was too lazy to write correctly).
-jJ
Why not? I get this with Python 2.5.1:
>>> from decimal import *
>>> Decimal.__eq__ = lambda x, y: False
>>> x = Decimal(2)
>>> x == x
False
>>> x is x
True
>>>
Or am I misunderstanding your meaning?
<unnecessary pedantry> Of course, even for floats it's not true
that x is y implies x == y:
>>> x = float('nan')
>>> x is x
True
>>> x == x
False
</unnecessary pedantry>
> Changing an existing class requires that the class be "open". That is
> the default in languages like smalltalk or ruby. It is even the
> default for python classes -- but it is certainly not the default for
> "python" classes that are actually coded in C -- which includes
> floats.
You mean like:
>>> float.__eq__ = lambda x, y: False
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'float'
? Presumably there are good reasons for this restriction
(performance? convenience? lack of round tuits?), but
I've no idea what they are. I can't say that I've ever felt a
need to do anything like this.
Mark
That depends on what you regard as "correct". Python
generally permits a duck-typed approach to numbers
wherein using integers as a subset of floats is
considered legitimate, and not lazy at all.
--
Greg