Wolfgang Kilian <kil...@invalid.com> wrote:
(snip)
> For current 64bit hardware one might naively expect a 64bit REAL type as
> default ... so why not do it the other way round: if there is good
> reason for compressed storage, one can declare and work with arrays of a
> KIND that is shorter than the default, right?
Seems to me that it was something of an accident that C ended up
with double as its default floating-point type.
First, C was meant as a systems programming language, for writing
things like operating systems and compilers. It included floating
point, implemented in software on many systems, but assumed that it
would be a small part of the actual computation. (That is, K&R C.)
K&R C supplied only double constants; all floating-point expressions
were evaluated in double and converted to float only where necessary.
By the time the ANSI C standard was finished in 1989, the 8087 and
80287 were available, and it was not so unusual to do everything in
double.
Even so, ANSI C added single-precision constants (with a trailing f)
and, given a prototype, the ability to pass float arguments to
functions without promotion. (K&R converted float to double before
passing it to a function, and back again if needed.)
> However, Fortran programs are supposed to be portable in space and time.
> In the x86 world, 32bit has been a de-facto standard for the default
> REAL, and I don't like relying on compiler options to change such
> things. So I don't see how to avoid a syntax like 1._dp everywhere in
> production code.
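As an aside for anyone not following the rest of the thread, the
kind-parameter idiom in question looks roughly like this; dp is just
a conventional name and the 15-digit request is my choice here, not
anything from the standard:

   module kinds
     implicit none
     ! dp is only a convention; any kind with ~15 digits would do.
     integer, parameter :: dp = selected_real_kind(15, 307)
   end module kinds

   program demo
     use kinds
     implicit none
     real(dp) :: x
     ! Without the _dp suffixes the constants below would be default
     ! (usually 32-bit) real, and the division would happen there.
     x = 1.0_dp / 3.0_dp
     print *, x
   end program demo

Leave the suffixes off and the division happens in default real
first and only then gets converted, which is exactly the annoyance
being described.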
In the case of Fortran, DOUBLE PRECISION didn't come until Fortran
II, and I believe it was done in software. (Single-precision floating
point in vacuum-tube hardware already seems pretty amazing to me.)
As far as I know, Fortran started the E notation for constants.
I am more sure that it started the D notation, as no other language
that I know of uses D. By then it was already necessary to stay
compatible with older programs, so E had to keep its old meaning.
(From a reliable source, it was Fortran that originated
multi-character variable names. Mathematicians still seem not to have
figured that one out.)
Even a small matrix can have a large condition number, but large
condition numbers definitely come more easily with larger matrices.
Much work has been done over the years on numerical algorithms to
reduce the precision loss in matrix calculations; we use LU
decomposition instead of explicit matrix inversion, for example. But
as problems get larger, there is a natural need for more precision in
intermediate values. I suppose a double-precision default could have
been adopted when the Fortran 66 standard was written, but it seems
to me that they pretty much wrote what already existed into the
standard, maybe a little less.
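Just as a toy sketch of how fast the precision goes (made-up numbers,
nothing to do with any real code): solve the same nearly singular
2x2 system, whose exact solution is x = y = 1, by Cramer's rule in
default real and in double.

   program illcond
     use iso_fortran_env, only: real32, real64
     implicit none
     real(real32) :: a32(2,2), b32(2), x32(2), d32
     real(real64) :: a64(2,2), b64(2), x64(2), d64

     ! Made-up nearly singular matrix; exact solution of A x = b is (1, 1).
     a64 = reshape([1.0_real64, 1.0_real64, &
                    1.0_real64, 1.0001_real64], [2, 2])
     b64 = [2.0_real64, 2.0001_real64]
     a32 = real(a64, real32)
     b32 = real(b64, real32)

     ! Cramer's rule; the tiny determinant magnifies the rounding error.
     d64 = a64(1,1)*a64(2,2) - a64(1,2)*a64(2,1)
     x64(1) = (b64(1)*a64(2,2) - a64(1,2)*b64(2)) / d64
     x64(2) = (a64(1,1)*b64(2) - b64(1)*a64(2,1)) / d64
     d32 = a32(1,1)*a32(2,2) - a32(1,2)*a32(2,1)
     x32(1) = (b32(1)*a32(2,2) - a32(1,2)*b32(2)) / d32
     x32(2) = (a32(1,1)*b32(2) - b32(1)*a32(2,1)) / d32

     print *, 'single: ', x32
     print *, 'double: ', x64
   end program illcond

On typical IEEE hardware the single-precision answer is already off
around the third digit; LU with pivoting does better than Cramer, but
the condition number still sets the limit.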
I might have written this here too many times, but there is
a feature on the IBM 7030/Stretch that seems not to have been
used much since. One can select (maybe from a front-panel switch)
whether to shift in zeros or ones on post-normalization; that is,
whether to round down or up. The idea was that a calculation would
be done both ways, and differences would be due to precision loss.
Now, there is one obvious problem, in that subtracting two such
numbers can cancel out the effect. But overall it seems a good
idea that hasn't been used much.
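The nearest modern equivalent of that switch is the IEEE rounding
mode, so the run-it-both-ways idea can at least be sketched with the
standard ieee_arithmetic module (assuming a compiler that honors the
runtime rounding mode for ordinary arithmetic, which not all do at
higher optimization levels):

   program both_ways
     use ieee_arithmetic
     implicit none
     real :: s_down, s_up
     integer :: i

     ! Same naive sum, once rounding every operation down, once up.
     ! Whether ordinary arithmetic honors the runtime mode is
     ! compiler- and option-dependent.
     call ieee_set_rounding_mode(ieee_down)
     s_down = 0.0
     do i = 1, 1000000
       s_down = s_down + 0.1
     end do

     call ieee_set_rounding_mode(ieee_up)
     s_up = 0.0
     do i = 1, 1000000
       s_up = s_up + 0.1
     end do

     call ieee_set_rounding_mode(ieee_nearest)
     ! The spread between the two results brackets the accumulated
     ! rounding error, with the same caveat about cancellation.
     print *, 'rounded down:', s_down
     print *, 'rounded up:  ', s_up
     print *, 'spread:      ', s_up - s_down
   end program both_ways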
Otherwise, IBM, starting with the 360/85 and continuing through all
S/370 and later machines, has had hardware support for extended
precision (128-bit) floating point, except for divide. (DXR didn't
appear in hardware until much later.) VAX has H-float, a 128-bit
floating type, but not in hardware on many processors. (I believe it
was extra-cost microcode for some, and standard on the bottom-end
11/730.)
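These days the portable way to ask for a 128-bit kind, hardware or
not, is something like the following (assuming the compiler provides
real128 at all; on x86 it is usually done in software, much as divide
was before DXR):

   program quad
     use iso_fortran_env, only: real64, real128
     implicit none
     ! real128 may be software-emulated or absent, depending on the compiler.
     real(real128) :: x

     x = sqrt(2.0_real128)
     print *, 'decimal digits, real64: ', precision(1.0_real64)
     print *, 'decimal digits, real128:', precision(x)
     print *, 'sqrt(2) in 128-bit:     ', x
   end program quad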
But yes, a default of "double precision", considering the popularity
of the IEEE standard forms, does seem reasonable now.
-- glen