In article <juhbj3$pd$1...@online.de>,
hel...@astro.multiCLOTHESvax.de (Phillip Helbig---undress to
Alright, with this in mind, let's go back to the original post:
> I have some code in DOUBLE PRECISION which, however, has real input and
> output values. The input values are converted to internal variables via
> DBLE and the results are converted to the output values via REAL.
Ok, here you are talking about subroutine arguments. Presumably the
actual and dummy arguments match and are both declared REAL.
> Suppose an input value X is some number which has no exact
> floating-point representation, e.g. 2.63.
This simply is not possible. Every value that a REAL actual argument
can hold has an exact floating point representation; a constant such
as 2.63 was already rounded to the nearest representable value when
it was stored. That value is associated (by address, or by copying
its value, or whatever) with the dummy argument of the subroutine,
so every value of that dummy argument also has an exact floating
point representation.
> What should I expect the value
> DBLE(X) to be, tested by PRINT* during the code? Should I expect
> 2.63000000000000000 or 2.63<"random" digits>?
First, I would say that DBLE() is an intrinsic function that should
never (well, almost never) be used in new code. If you are trying
to understand how some existing code works, then you certainly need
to understand it. But for new code, you should almost certainly be
using the REAL() intrinsic to do these kinds of precision
conversions between different KINDs of REAL values. It might take a
few more keystrokes, but it makes it clear what kind of conversion
is expected by the programmer.
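
As a minimal sketch of the two spellings (the program name and the
kind parameters sp and dp are my own choices for illustration, not
anything from the original code), both conversions below should give
the same double precision value; only the second states the target
kind explicitly:

```fortran
program widen_kinds
  implicit none
  ! sp and dp are illustrative names for the default REAL and
  ! DOUBLE PRECISION kinds.
  integer, parameter :: sp = kind(1.0e0)
  integer, parameter :: dp = kind(1.0d0)
  real(sp) :: x
  real(dp) :: y_old, y_new

  x = 2.63_sp
  y_old = dble(x)            ! legacy spelling: target kind is implicit
  y_new = real(x, kind=dp)   ! target kind is stated explicitly

  ! Both conversions are expected to yield the same value.
  print *, 'same value: ', y_old == y_new
end program widen_kinds
```

The REAL() form also keeps working unchanged if dp is later
redefined, for example with SELECTED_REAL_KIND.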
Now, having said that, the conversion from a shorter precision to a
longer precision is done by appending binary zero digits to the
mantissa of the shorter number. This means that the arithmetic value
of the floating point number does not change. However, you may see
different digits when the two are printed; this is sometimes due to
errors in the binary to decimal conversion routines (the standard C
library was a common cause of these errors for many years in the
1980s), and in other cases it is a question of how the values are
rounded as the decimal digits are computed. But the important thing
is that the floating point value is not changed at all, not even in
the last bit.
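
Here is a small sketch of that claim, assuming IEEE single and double
precision (sp, dp and the program name are illustrative). On such a
machine the single nearest to 2.63 is exactly
2.63000011444091796875, and widening it gives that same value, not
the double nearest to 2.63. How many of those digits a given runtime
actually prints may vary:

```fortran
program widen_exact
  implicit none
  integer, parameter :: sp = kind(1.0e0), dp = kind(1.0d0)
  real(sp) :: x
  real(dp) :: y

  x = 2.63_sp             ! stored as the nearest single precision value
  y = real(x, kind=dp)    ! widening: same value, more trailing zero bits

  ! Narrowing the widened value again recovers x bit for bit.
  print *, 'round trip exact: ', real(y, kind=sp) == x

  ! y is the stored single value, not the double nearest to 2.63.
  print *, 'equals 2.63_dp:   ', y == 2.63_dp

  ! The same generous format applied to both values.
  print '(A,F24.20)', 'x = ', x
  print '(A,F24.20)', 'y = ', y
end program widen_exact
```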
Going the other way, from longer precision to shorter precision, is
a different matter of course. Here there are issues related to
rounding and truncation of nonzero bits.
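
A short sketch of the narrowing direction, again assuming IEEE
single and double precision and using illustrative names: once a
double has been rounded to single, converting it back does not
restore the discarded bits.

```fortran
program narrow_rounds
  implicit none
  integer, parameter :: sp = kind(1.0e0), dp = kind(1.0d0)
  real(dp) :: d, back
  real(sp) :: s

  d = 2.63_dp               ! nearest double to 2.63
  s = real(d, kind=sp)      ! narrowing: low-order bits are rounded away
  back = real(s, kind=dp)   ! widening is exact, but the bits are gone

  print *, 'value preserved: ', back == d
  print '(A,ES10.3)', 'difference:      ', back - d
end program narrow_rounds
```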
Finally, you mention "random" digits. There should be some nonzero
decimal digits printed out in most cases due simply to the binary to
decimal conversion. But there is not an infinite number of them;
they always terminate after some number of decimal digits. You can
see this just by computing the decimal values for the binary digits.
   binary    decimal
   .1        .5
   .01       .25
   .001      .125
   .0001     .0625
As you can see, there are as many decimal digits as there are binary
digits, but no more. The allowed floating point values are just
sums of some subset of those values, so each floating point value
has an exact decimal value, and the number of decimal digits
required to represent that value is no more than the position of the
last nonzero binary digit. That means that up to 24 decimal digits
are required to represent a single precision binary mantissa exactly
and up to 53 decimal digits are required for a double precision
binary mantissa (note that most modern floating point
representations include a hidden bit, if you are wondering where the
extra bit comes from).
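
The digit count can be checked with integer arithmetic alone, since
2**(-k) = 5**k / 10**k: the decimal expansion of the k-th binary
place is just 5**k written with k digits. A sketch (program and
variable names are my own) that prints the first twelve rows of such
a table:

```fortran
program binary_fractions
  implicit none
  integer, parameter :: i64 = selected_int_kind(18)
  integer :: k
  character(len=16) :: fmt
  character(len=32) :: line

  ! 2**(-k) = 5**k / 10**k, so its decimal digits are the integer
  ! 5**k padded with leading zeros to exactly k digits.
  do k = 1, 12
     write (fmt, '(A,I0,A)') '(A,I0.', k, ')'   ! e.g. (A,I0.4)
     write (line, fmt) '.', 5_i64**k
     print '(A)', trim(line)
  end do
end program binary_fractions
```

Every row has exactly k digits after the decimal point, and a sum of
such rows can never need more digits than its longest term.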
You almost never see that many decimal digits printed out. Why not?
It is because in order to reproduce a given floating point number,
only enough decimal digits are required so that the ***rounded***
value is correct. That requires 9 decimal digits for single
precision and 17 decimal digits for double precision.
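
A sketch of that round trip through a character buffer, assuming
IEEE arithmetic and a runtime whose formatted I/O is correctly
rounded (the names are illustrative). ES16.8E2 writes 1 + 8 = 9
significant digits and ES25.16E3 writes 1 + 16 = 17:

```fortran
program round_trip_digits
  implicit none
  integer, parameter :: sp = kind(1.0e0), dp = kind(1.0d0)
  real(sp) :: x, x_back
  real(dp) :: y, y_back
  character(len=40) :: buf
  logical :: sp_ok, dp_ok

  x = 2.63_sp
  write (buf, '(ES16.8E2)') x      ! 9 significant decimal digits
  read  (buf, *) x_back
  sp_ok = (x_back == x)

  y = 2.63_dp
  write (buf, '(ES25.16E3)') y     ! 17 significant decimal digits
  read  (buf, *) y_back
  dp_ok = (y_back == y)

  print *, 'single survives 9 digits:  ', sp_ok
  print *, 'double survives 17 digits: ', dp_ok
end program round_trip_digits
```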
However, when working on old codes (usually f77 and older), you can
also see "random" digits for another reason. These old codes often
involved aliasing of variables, or reusing some memory location to
store the bits for variables of various types. This aliasing can be
done with EQUIVALENCE, within common blocks, or through actual/dummy
argument association, and with various compiler extensions, also
through pointer association or direct bit manipulation. So if you
have a double precision variable that resides in a memory location
that is used for another purpose elsewhere in the program, then some
or all of the bits in that variable might be modified in some
nonobvious way. In this case, the low-order bits might have been
changed, and when the variable is printed later this also results in
"random" decimal digits.
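
To make the aliasing case concrete, here is a sketch using
EQUIVALENCE. It assumes a 32-bit default integer and a 64-bit double
precision, and it reads the integer view after defining the real
one, which the standard does not bless but which is exactly what
such old codes relied on. All names are illustrative:

```fortran
program alias_low_bits
  implicit none
  double precision :: d
  integer :: w(2), lo
  equivalence (d, w)    ! d and w share the same storage

  ! Assumption: default integer is 32 bits, double precision is 64.
  ! For 1.0d0 the low-order word is all zero bits, which identifies
  ! the element of w that holds it regardless of byte order.
  d = 1.0d0
  lo = merge(1, 2, w(1) == 0)

  d = 2.63d0
  print '(A,F22.18)', 'before: ', d

  ! Some other part of the program "reuses" the low 6 bits.
  w(lo) = ior(w(lo), 63)
  print '(A,F22.18)', 'after:  ', d
end program alias_low_bits
```

The leading digits of the two printed values agree; only the
trailing ones differ, which is what such "random" digits look like
in practice.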
I remember one example of this with a 60-bit CDC code. A sparse
array was being stored. Only the nonzero elements were stored, and
the lowest 6 bits were used to store, not the bits of the floating
point value, but rather a skip-count for the index of the next
stored value. In some cases, those low-order bits were masked out
before any floating point operations were done on those values or
before they were printed, but in other cases they were just left
there because the programmer "knew" that they would not make any
significant difference in the final result. That masking operation
was done by equivalencing the 60-bit word to a Hollerith string, and
using some character operations on the 10th character in the string
(CDCs had a 6-bit character set). These 60-bit numbers were single
precision, so this is only tangentially related to the main subject
of this thread, but it is an example of "random" decimal digits that
I remember.
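
I no longer have that code, so the following is only a modern
analogue of the idea, not the CDC original: it uses a 64-bit IEEE
double, TRANSFER instead of Hollerith equivalencing, and a made-up
skip count of 37. All names are hypothetical:

```fortran
program skip_count_bits
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64), parameter :: mask6 = 63_int64   ! the 6 lowest bits
  real(real64) :: v, packed, cleaned
  integer(int64) :: bits
  integer :: skip

  v = 2.63_real64
  skip = 37              ! hypothetical distance to the next stored index

  ! Pack: overwrite the 6 lowest mantissa bits with the skip count.
  bits = transfer(v, bits)
  bits = ior(iand(bits, not(mask6)), int(skip, int64))
  packed = transfer(bits, packed)

  ! Unpack: read the count back, then clear those bits before use.
  skip = int(iand(transfer(packed, bits), mask6))
  cleaned = transfer(iand(transfer(packed, bits), not(mask6)), cleaned)

  print '(A,I0)', 'skip count: ', skip
  print '(A,F22.18)', 'packed:  ', packed
  print '(A,F22.18)', 'cleaned: ', cleaned
end program skip_count_bits
```

Printing packed without masking shows the stray trailing digits;
cleaned differs from the original value by at most 63 units in the
last place.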
$.02 -Ron Shepard