Fernando Rodríguez <f...@wanadoo.es> writes:
> I just noticed this:
> CL-USER 60 > (round 5.1)
> 5
> 0.09999999999999964
> Is this normal? O:-)
Floating-point arithmetic isn't exact. Conversion between number systems with various bases isn't exact either. So yes, this is normal. You can always use rationals.
(round 51/10) => 5, 1/10
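A small sketch of the difference, assuming an implementation with IEEE double-floats. The d0 suffix forces a double regardless of *read-default-float-format*, and RATIONAL is used only to expose the exact value the float literal denotes:

```lisp
;; With an exact rational, ROUND returns an exact remainder.
(multiple-value-list (round 51/10))     ; => (5 1/10)

;; RATIONAL gives the exact value of the float, which is not 51/10.
;; Assumes IEEE double-floats.
(= (rational 5.1d0) 51/10)              ; => NIL
(< (rational 5.1d0) 51/10)              ; => T, the float lies slightly below 51/10
```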
-- "Das Auto hat keine Zukunft. Ich setze aufs Pferd." Wilhelm II. (1859-1941)
wb...@gmx.net (Wolfhard =?iso-8859-1?q?Bu=DF?=) writes: > Fernando Rodríguez <f...@wanadoo.es> writes:
> > I just noticed this:
> > CL-USER 60 > (round 5.1)
> > 5
> > 0.09999999999999964
> > Is this normal? O:-)
> Floating-point arithmetic isn't exact. Conversion between number systems > with various bases isn't exact either. So yes, this is normal. > You can always use rationals.
> (round 51/10) => 5, 1/10
I'm not a floating point expert here, and it's easy to get this kind of thing wrong, but I _believe_ that the case Fernando cites is neither a case of floating point arithmetic not being exact (though it's true, it's often not exact), nor a case of conversion being inexact (which I'm sure you knew). In addition to the two problems you cite, floating point _notation_ does not match floating point internal _representation_ exactly either, so I think the real underlying problem in this case is that 5.1 is getting represented as a binary fraction which is already an approximation even before you start.
ROUND in this case is separating the 5 from the .1 but I don't think there is a loss of information in that particular operation; I think it's "exact" insofar as the inputs were exact, because there is no increase in magnitude that would force a loss of precision. (I could be wrong on this, so someone who's more familiar with the internals please correct me.)
My point here is that if you think 5.1 is an exact number just because it prints nicely, you already have a misconception about what a floating point number is _even before_ you start doing math. (In a sense, I think the reason it looks more exact than it is is that the binary to decimal conversion done by the printer conspires in some cases to hide the representational trick that's going on.)
> > Floating-point arithmetic isn't exact. Conversion between number systems > > with various bases isn't exact either. So yes, this is normal. > > You can always use rationals.
> > (round 51/10) => 5, 1/10
> I'm not a floating point expert here, and it's easy to get this kind
> of thing wrong, but I _believe_ that the case Fernando cites is neither
> a case of floating point arithmetic not being exact (though it's true,
> it's often not exact), nor a case of conversion being inexact (which
> I'm sure you knew). In addition to the two problems you cite,
> floating point _notation_ does not match floating point internal
> _representation_ exactly either, so I think the real underlying
> problem in this case is that 5.1 is getting represented as a binary
> fraction which is already an approximation even before you start.
> ROUND in this case is separating the 5 from the .1 but I don't think
> there is a loss of information in that particular operation; I think
> it's "exact" insofar as the inputs were exact, because there is no
> increase in magnitude that would force a loss of precision.
You're right. The conversion I mentioned is not the issue in this case. The conversion between internal and external representation and vice versa is.
* Kent M Pitman | ROUND in this case is separating the 5 from the .1 but I don't think | there is a loss of information in that particular operation; I think | it's "exact" insofar as the inputs were exact, because there is no | increase in magnitude that would force a loss of precision.
Well, actually, there is. If 5.1 and 0.1 both use n bits of precision, (- 5.1 5.0) actually ends up using n-3 bits of precision for the 0.1 return value, effectively replacing the three least significant bits with zeros. Since 0.1 has a bit pattern of repeating groups of 1100, losing the three least significant bits must lead to a value different from 0.1.
/// -- In a fight against something, the fight has value, victory has none. In a fight for something, the fight is a loss, victory merely relief.
Erik Naggum <e...@naggum.net> writes: > * Kent M Pitman > | ROUND in this case is separating the 5 from the .1 but I don't think > | there is a loss of information in that particular operation; I think > | it's "exact" insofar as the inputs were exact, because there is no > | increase in magnitude that would force a loss of precision.
> Well, actually, there is. If 5.1 and 0.1 both use n bits of precision, > (- 5.1 5.0) actually ends up using n-3 bits of precision for the 0.1 > return value, effectively replacing the three least significant bits with > zeros. Since 0.1 has a bit pattern of repeating groups of 1100, losing > the three least significant bits must lead to a value different from 0.1.
Yes, that's so. But I thought I was already taking this into account.
I guess it's really a philosophical question, not a technical one.
For example, there must be some decimally expressible rational whose exact representation is the same as the exact representation of 5.1. If that (numerically equal) number were rounded, the missing binary digits when you do the "shift" are correctly supplied as zero [well, if you're rounding down; I'm more confident that what I'm saying is true of truncate than round]. Consequently, the operation of truncation is [to the limits of the original representation] exact. What is in question is the "well" of numbers that have as their stable point the particular binary digits. That is, my understanding is that the number line says
....................   <- actual real number line
\   /\   /\   /\   /
 \ /  \ /  \ /  \ /
  v    v    v    v
  .    .    .    .     <- binary number line (probably not as evenly spaced
                          as shown, not that it matters other than to just
                          make it painful in practice)
The "wells" I am talking about are the v's above, that take a range of rationals (or reals) and map them into a particular set of binary numbers. When unpacked to decimal, there is one particular representative decimal (among the reals) that "names" the binary point, but the operation is not on that decimal, it's on the binary. [Incidentally, my understanding of the problem of printing decimals is that you have to be careful to choose a decimal that will re-read in the same well, so you don't get drift by read/print.]
But at the binary level, the operations are exact if you do not lose digits in the computations. The problem is that when you do the operation in binary and then map back to the representative decimal you don't always get the result of doing the operation directly on the decimal number "symbolically". The real question becomes, though, whether the operation is in error (that is, whether the binary exists to support the decimal) or whether the mapping is in error, since in principle there are other input numbers that could have been typed.
This is very similar to the question of whether \Foo and foo and FOO and |FOO| are the same. Of course they are, and as a direct consequence, the system can't tell on behalf of which symbol a string-concatenation operation is done. It would be false, though, to say that the operation of concatenating the string designator foo and the string designator bar being "FOOBAR" and not "foobar" is an error. It's true that "foobar" is the string designated by the concatenation of the string "foo" and the string "bar", but that again illustrates that some of these operations are information-losing and we have to live with them.
Anyone wondering why I say I don't know much about floats will perhaps now see better what I'm getting at--these are issues that you can acquire a great deal of _trivia_ about, but developing good _intuitions_ about them is harder because the operations for moving around in the space are not uniquely reversible, and so the action of shifting representation for convenience of expression (usually the earmark of intelligent behavior, IMO) is thwarted by the fact that there are few shifts of representation (i.e., informally, few "intelligent ways to rethink things") that are not information losing. That fact, in the aggregate, is the basis for why intuitions break down... intuitions assume you can gloss details that you can't gloss. I wish I had taken some group theory. I'm sure there is a simpler way to say some of this in group theory. Anyone still in school who has the chance to take some should do so... I have always regretted not having it at my disposal terminologically.
> * Kent M Pitman > | ROUND in this case is separating the 5 from the .1 but I don't think > | there is a loss of information in that particular operation; I think > | it's "exact" insofar as the inputs were exact, because there is no > | increase in magnitude that would force a loss of precision.
> Well, actually, there is. If 5.1 and 0.1 both use n bits of precision, > (- 5.1 5.0) actually ends up using n-3 bits of precision for the 0.1 > return value, effectively replacing the three least significant bits with > zeros. Since 0.1 has a bit pattern of repeating groups of 1100, losing > the three least significant bits must lead to a value different from 0.1.
I threw this together today after not finding anything to print out a float in a base other than decimal. It's more for exploration of floating point in different bases than anything else. I was aiming to write something that produced something more along the lines of -1.110110*10^1010 without really ramping up the error in the representation, perhaps over the weekend.
* "Geoff Summerhayes"
| I threw this together today after not finding anything to print out a
| float in a base other than decimal. It's more for exploration of
| floating point in different bases than anything else. I was aiming to
| write something that produced something more along the lines of
| -1.110110*10^1010 without really ramping up the error in the
| representation, perhaps over the weekend.
|
| (defun print-float-base(float &optional (base 10) (stream *standard-output*))
|   (let ((*print-base* base)
|         (*print-radix* t))
|     (multiple-value-bind (signif expon sign)
|         (integer-decode-float float)
|       (format stream "~&#.(* ~A (float ~A) (expt ~A ~A))~%"
|               sign signif (float-radix float) expon))))
scale-float is a more efficient way of changing the exponent. Also remember to use a (unit) float of the same type as the argument float for the float call, or you end up just playing with single-floats. Quoted from the standard on the integer-decode-float page:
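A minimal sketch of that advice: reassemble a float from the parts returned by integer-decode-float, using scale-float for the exponent and passing the original float as the prototype argument to float so the result keeps the same format. The name rebuild-float is made up for this illustration.

```lisp
;; REBUILD-FLOAT is a hypothetical helper, not part of the posted code.
(defun rebuild-float (x)
  "Reassemble X from its integer significand, exponent and sign."
  (multiple-value-bind (signif expon sign) (integer-decode-float x)
    ;; (float signif x) yields a float of the same format as X, so a
    ;; double-float argument is not silently narrowed to single-float.
    ;; SCALE-FLOAT then applies the power of the radix directly.
    (* sign (scale-float (float signif x) expon))))

(= (rebuild-float 5.1d0) 5.1d0)    ; => T
(= (rebuild-float -0.1d0) -0.1d0)  ; => T
```

Because the significand fits in the float's precision, both steps are exact and the original value comes back unchanged.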
Remember that all floats are really rational numbers in disguise. The denominator must be a power of two, and the numerator must be in the range 4503599627370496 to 9007199254740991 (assuming double precision normalized).
So the closest float to the decimal number 5.1 is
5742089524897382 / 1125899906842624
This is actually slightly less than 51/10, but we don't have the luxury of picking the denominator in a floating point number, and of the neighbouring candidate numerators (5742089524897381, 5742089524897382 and 5742089524897383), the middle one is the closest.
(To see this, multiply 1125899906842624 by 51 and divide by 10)
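The same check can be sketched at the REPL. This assumes IEEE double-floats with a 53-bit significand; the integer arithmetic itself is exact in any conforming implementation:

```lisp
;; 51/10 scaled by the denominator 2^50: the exact quotient is
;; 5742089524897382.4, so the nearest integer numerator is ...382.
(multiple-value-list (round (* 51 (expt 2 50)) 10))
;; => (5742089524897382 4)

;; The float that the reader builds for 5.1d0 uses exactly that numerator.
(multiple-value-list (integer-decode-float 5.1d0))
;; => (5742089524897382 -50 1)
```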
Now we take the floor. We divide 5742089524897382 by 1125899906842624, getting 5 with a remainder of
112589990684262 / 1125899906842624
The denominator is still a power of two, but the numerator is well below the minimum of 4503599627370496. This won't do, but it is easily corrected. We multiply the numerator and denominator by 2 until the numerator *is* in the right range. Six doublings (a factor of 64) give
7205759403792768 / 72057594037927936
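As a rough illustration, integer-decode-float shows the remainder after this renormalization. The sketch assumes IEEE double-floats and uses floor as in the walk-through:

```lisp
;; Split 5.1d0 into integer part and float remainder, then decode the
;; remainder into integer significand and binary exponent.
(multiple-value-bind (quotient remainder) (floor 5.1d0)
  (multiple-value-bind (signif expon) (integer-decode-float remainder)
    (list quotient signif expon)))
;; => (5 7205759403792768 -56)

;; The renormalized fraction is the same number as the raw remainder.
(= (/ 7205759403792768 (expt 2 56))
   (/ 112589990684262 (expt 2 50)))     ; => T
```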
Now we want to print this fraction as a decimal equivalent. There is a trick here. The decimal expansion may not be of finite length, so we'll have to be prepared to truncate it. But we also want it to be the case that when the decimal expansion is read back in, the same bit pattern will be constructed by the reader.
1 / 72057594037927936 is about .0000000000000000139, so we'll have to print out about 17 digits (unless we're lucky and the last few are zero). As it turns out, the number 0.09999999999999964 is the shortest decimal number that, when read in, becomes 7205759403792768 / 72057594037927936.
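A sketch of the read-back property, assuming IEEE double-floats and a reader that rounds decimal input to the nearest float (common in practice, but treat it as an assumption):

```lisp
(let ((*read-default-float-format* 'double-float)
      (remainder (- 5.1d0 5.0d0)))
  (list
   ;; The printed form reads back as the very same float.
   (= remainder (read-from-string "0.09999999999999964"))
   ;; Dropping one more digit lands on a different float.
   (= remainder (read-from-string "0.0999999999999996"))
   ;; And the "expected" 0.1 is a different float again.
   (= remainder (read-from-string "0.1"))))
;; => (T NIL NIL)
```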
> I don't think there is a loss of information in that particular > operation; I think it's "exact" insofar as the inputs were > exact, because there is no increase in magnitude that would > force a loss of precision.
Kent is correct, there was no information lost by the round operation or the printing of the result.
> * Kent M Pitman > | ROUND in this case is separating the 5 from the .1 but I don't think > | there is a loss of information in that particular operation; I think > | it's "exact" insofar as the inputs were exact, because there is no > | increase in magnitude that would force a loss of precision.
> Well, actually, there is. If 5.1 and 0.1 both use n bits of precision, > (- 5.1 5.0) actually ends up using n-3 bits of precision for the 0.1 > return value, effectively replacing the three least significant bits with > zeros. Since 0.1 has a bit pattern of repeating groups of 1100, losing > the three least significant bits must lead to a value different from 0.1.
There isn't any precision lost in the subtraction.
Both 5.1 and 5.0, when represented in floating point, start with a bit pattern of 101000..... When you subtract them, the six most significant bits are zero. The float is renormalized by multiplying the numerator and denominator by 2 (i.e. shifting the mantissa left and decrementing the exponent) until the most significant bit is a one.
Where you lose precision is in the construction of 5.1 as a floating point number. 5.1 is represented internally as
5742089524897382 / 1125899906842624
and 5.0 is represented as
5629499534213120 / 1125899906842624
Subtracting these gives 112589990684262 / 1125899906842624, and multiplying the numerator and denominator by 64 gives this
7205759403792768 / 72057594037927936
Now the decimal .1 is represented internally as
7205759403792794 / 72057594037927936
which is a bit different from the result of the subtraction, but the real problem is that the floating point representation for 5.1 is slightly smaller than 51/10, so the answer comes out slightly smaller than 1/10.
The `cancellation' of the six most significant bits in the subtraction does not introduce any more error than had already been introduced by using an approximation to 5.1.
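One way to check this claim, sketched under the assumption of IEEE double-floats: compare the float subtraction with the same subtraction carried out on exact rationals, then measure the distance to 0.1d0 in units of 2^-56.

```lisp
(let ((difference (- 5.1d0 5.0d0)))
  (list
   ;; The float subtraction agrees with exact rational subtraction,
   ;; so no rounding took place in it.
   (= (rational difference)
      (- (rational 5.1d0) (rational 5.0d0)))
   ;; Distance from the float 0.1d0, as a count of steps of 2^-56.
   (* (- (rational 0.1d0) (rational difference))
      (expt 2 56))))
;; => (T 26)
```

The 26 is the difference between the two numerators 7205759403792794 and 7205759403792768.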
| That fact, in the aggregate, is the basis for why intuitions break | down... intuitions assume you can gloss details that you can't | gloss.
Isn't this one of the things education is about: Building a better intuition? (I think I first heard this stated by a physicist.) Not that your improved intuition can ever replace the hard work of figuring out the details, but every new insight obtained should improve your intuition so that even stuff that was previously counterintuitive is now obvious.
| I wish I had taken some group theory. I'm sure there | is a simpler way to say some of this in group theory.
Group theory is useful for lots of things, but I can't see much of an application for it in reasoning about floating point arithmetic. In fact, one of the reasons this is so difficult is precisely that the floating point numbers do not form a group, much less a ring or field, as we would have liked them to do. In short, there is just no way to understand floating point arithmetic without getting down in the nitty-gritty and getting your hands dirty. There is (AFAIK) no simple and elegant mathematical theory in which you can express these things.
The best I can come up with, terminologywise, to your diagram is this: The set F of all floating point numbers of a certain kind is a finite set of dyadically rational numbers (that is, rational numbers whose denominators are powers of two). Along with F, we assume given a mapping p: I->F, where I is an interval of real numbers, roughly (but by no means exactly) extending from most-negative-whatever-float to most-positive-whatever-float (but at least, I contains all of F). Moreover, this mapping is a projection onto F in the sense that p(f)=f whenever f belongs to F. But the mapping is many-to-one. The inverse image of any f in F is an interval which I will write here simply as [f] (for those familiar with TeX, p^{-1}[f] would be a more conventional notation). All these intervals create a partition of the interval I of approximately representable real numbers. An arithmetic operation, such as +, is then in principle carried out within F as follows: Replace the exact sum a+b by p(a+b), if a+b is within I. (If not, we are facing an overflow, so we adjoin an artificial entity Inf to assign to the "sum".)
Any rounding rules, such as truncate or round-to-even, are built into the function p.
In the given example, 5.1 does not belong to F but 5 does. So one really ends up subtracting 5 from p(5.1). That the result is different from p(0.1) is just a special case of the general observation that p(a)-p(b) is in general not equal to p(a-b), and similarly with the other operations. However, when you rightly observe that there is no loss of precision in this special case, what you are really saying is that p(5.1)-5 already belongs to F, so in this case the final step of projecting the result back into F is not necessary.
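Whether the final projection step is needed can be tested by comparing a float sum with the exact rational sum of the same two floats. The helper name exact-sum-p is invented for this sketch, the 0.1 + 0.2 case is an extra example not taken from the thread, and IEEE double-floats are assumed:

```lisp
;; EXACT-SUM-P is a hypothetical helper for illustration only.
(defun exact-sum-p (a b)
  "True if the float sum of A and B equals their exact rational sum,
that is, if the exact sum already belongs to F and needs no rounding."
  (= (rational (+ a b))
     (+ (rational a) (rational b))))

(exact-sum-p 5.1d0 -5.0d0)   ; => T,   p(5.1) - 5 is already in F
(exact-sum-p 0.1d0 0.2d0)    ; => NIL, this sum has to be rounded
(= (- 5.1d0 5.0d0) 0.1d0)    ; => NIL, p(a) - p(b) differs from p(a - b)
```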
So why wasn't floating point arithmetic designed so that the (obviously very desirable) property p(x@y)=p(x)@p(y) holds for all the four arithmetical operations (@ in {+ - * /})? The unfortunate answer is that this is mathematically impossible to achieve.
I don't know if this way of seeing things really helps - to some of the more mathematically oriented abstract thinkers it may, while to others it's merely stating the obvious in a form of gobble-de-gook. To the latter, my apologies. (You should have stopped reading before you got to this point, though.)
> scale-float is a more efficient way of changing the exponent. Also > remember to use a (unit) float of the same type as the argument float for > the float call, or you end up just playing with single-floats. Quoted > from the standard on the integer-decode-float page:
> I tend to use the function rational to decode floating point numbers.
I avoided using scale-float because I wanted to get around the default base eventually. Here's the revised function, first working draft, I'm afraid it's not very pretty:
------------------------------
(defun print-float-base (float &optional (base 10) (stream *standard-output*))
  "Print an approximation of a floating-point number in the specified base
to a stream. The representation is not readable and has the format
-M.MMMMM..M*BASE^EXPONENT
All components are printed in the base."
  (let ((fraction (rational float))
        (power 0)
        (*print-base* base)
        ;; estimate length of significant digits in output
        (digits (ceiling (log (expt (float-radix float) (float-digits float))
                              base))))
    ;; normalize float: scale down until the integer part is a single digit
    ;; (compare the absolute value so negative numbers are normalized too)
    (do ((integer (truncate fraction) (truncate fraction)))
        ((< (abs integer) base))
      (incf power)
      (setf fraction (/ fraction base)))
    ;; scale up until the integer part is non-zero (zero is left alone)
    (do ((integer (truncate fraction) (truncate fraction)))
        ((or (zerop float) (not (zerop integer))))
      (decf power)
      (setf fraction (* fraction base)))
    ;; print mantissa
    (multiple-value-bind (integer remainder) (truncate fraction)
      (let ((*print-radix* (not (= 10 base))))
        (format stream "~A." integer))
      (if (not (zerop remainder))
          (progn
            (setf remainder (abs remainder))
            ;; emit one digit per step until the fraction is exhausted
            ;; or the estimated digit count is reached
            (do ((x 0 (1+ x)))
                ((or (zerop remainder) (> x digits)))
              (multiple-value-setq (integer remainder)
                (truncate (* remainder base)))
              (let ((*print-radix* nil))
                (format stream "~A" integer))))
          (format stream "0"))
      ;; print base and exponent
      ;; the printed representation of base is always
      ;; `10' in that base
      (let ((*print-radix* nil))
        (format stream "*10^~A" power)))))