The best way to get quad precision

Peng Yu

unread,

Jun 16, 2015, 11:41:53 AM6/16/15

to

Hi,

I use `kind(1.d0)` to get the correct number of kind for double precision. But I don't know what number in quad precision 1.d0 corresponds to.

program main
implicit none
integer, parameter :: dp = kind(1.d0)
real(dp) :: x = 1._dp

print *, x
end program

I've seen people use selected_real_kind() to get quad precision. In this sense, should I use `integer, parameter :: dp = selected_real_kind (15)` for double precision instead of using `integer, parameter :: dp = kind(1.d0)`?

program main
implicit none
integer, parameter :: qp = selected_real_kind (32)

real(qp) :: x = 1._qp

print *, x
end program

Regards,
Peng

Steve Lionel

unread,

Jun 16, 2015, 11:44:13 AM6/16/15

to

On 6/16/2015 11:41 AM, Peng Yu wrote:

> I use `kind(1.d0)` to get the correct number of kind for double precision. But I don't know what number in quad precision 1.d0 corresponds to.

There is no standard syntax to do this.

> I've seen people use selected_real_kind() to get quad precision. In this sense, should I use `integer, parameter :: dp = selected_real_kind (15)` for double precision instead of using `integer, parameter :: dp = kind(1.d0)`?

Yes, that's the way to do it. You could use selected_real_kind(30) to
get quad precision, if the compiler supports it.

--
Steve Lionel
Developer Products Division
Intel Corporation
Merrimack, NH

For email address, replace "invalid" with "com"

User communities for Intel Software Development Products
http://software.intel.com/en-us/forums/
Intel Software Development Products Support
http://software.intel.com/sites/support/
My Fortran blog
http://www.intel.com/software/drfortran

Refer to http://software.intel.com/en-us/articles/optimization-notice
for more information regarding performance and optimization choices in
Intel software products.

Peng Yu

unread,

Jun 16, 2015, 11:56:28 AM6/16/15

to

On Tuesday, June 16, 2015 at 10:44:13 AM UTC-5, Steve Lionel wrote:
> On 6/16/2015 11:41 AM, Peng Yu wrote:
>
> > I use `kind(1.d0)` to get the correct number of kind for double precision. But I don't know what number in quad precision 1.d0 corresponds to.
>
> There is no standard syntax to do this.
>
> > I've seen people use selected_real_kind() to get quad precision. In this sense, should I use `integer, parameter :: dp = selected_real_kind (15)` for double precision instead of using `integer, parameter :: dp = kind(1.d0)`?
>
> Yes, that's the way to do it. You could use selected_real_kind(30) to
> get quad precision, if the compiler supports it.

See the following if there is no suffix to a floating point number, the compiler consider it as REAL(4) as shown in by the warnings. However, the values stored in the variables x* seems to indicate that non suffixed floating point numbers are actually of the precision of REAL(10).

Is it a bug in gfortran? What it actually should be according to FORTRAN standard?

$ cat main.f90
program main
real(4):: x4 = 3.141592653589793238462643383279502884197169399
real(8):: x8 = 3.141592653589793238462643383279502884197169399
real(10):: x10 = 3.141592653589793238462643383279502884197169399
real(16):: x16 = 3.141592653589793238462643383279502884197169399
print *, x4
print *, x8
print *, x10
print *, x16
end program
$ gfortran -Wconversion-extra -o main.exe main.f90
main.f90:3.16:

real(8):: x8 = 3.141592653589793238462643383279502884197169399
1
Warning: Conversion from REAL(4) to REAL(8) at (1)
main.f90:4.18:

real(10):: x10 = 3.141592653589793238462643383279502884197169399
1
Warning: Conversion from REAL(4) to REAL(10) at (1)
main.f90:5.18:

real(16):: x16 = 3.141592653589793238462643383279502884197169399
1
Warning: Conversion from REAL(4) to REAL(16) at (1)
$ ./main.exe
3.14159274
3.1415927410125732
3.14159274101257324219
3.14159274101257324218750000000000000

michael...@compuserve.com

unread,

Jun 16, 2015, 12:17:52 PM6/16/15

to

On Tuesday, 16 June 2015 17:41:53 UTC+2, Peng Yu wrote:
> integer, parameter :: qp = selected_real_kind (32)

You can also use a more general form, as shown in this module that I found in a book:

module numeric_kinds
! named constants for 4, 2, and 1 byte integers:
integer, parameter :: &
i4b = selected_int_kind(9), &
i2b = selected_int_kind(4), &
i1b = selected_int_kind(2)
! and for single, double and quadruple precision reals:
integer, parameter :: &
sp = kind(1.0), &
dp = selected_real_kind(2*precision(1.0_sp)), &
qp = selected_real_kind(2*precision(1.0_dp))
end module numeric_kinds

Regards,

Mike Metcalf

Wolfgang Kilian

unread,

Jun 16, 2015, 12:21:09 PM6/16/15

to

The compiler is free to store an intermediate value with higher
precision than required by the data type. In this particular case, my
wild guess is that it is just more convenient for the compiler to call
an extended-precision routine for reading and assign it directly to the
LHS, instead of literally following the rule that the RHS is first
evaluated and then assigned. If you do calculations with this number
before assigning it to the LHS, it is likely (but also not required)
that you will lose the extra precision.

So, I don't think that this is a compiler bug.

-- Wolfgang

Gordon Sande

unread,

Jun 16, 2015, 12:34:15 PM6/16/15

to

On 2015-06-16 15:56:27 +0000, Peng Yu said:

> On Tuesday, June 16, 2015 at 10:44:13 AM UTC-5, Steve Lionel wrote:
>> On 6/16/2015 11:41 AM, Peng Yu wrote:
>>
>>> I use `kind(1.d0)` to get the correct number of kind for double
>>> precision. But I don't know what number in quad precision 1.d0
>>> corresponds to.
>>
>> There is no standard syntax to do this.
>>
>>> I've seen people use selected_real_kind() to get quad precision. In
>>> this sense, should I use `integer, parameter :: dp = selected_real_kind
>>> (15)` for double precision instead of using `integer, parameter :: dp =
>>> kind(1.d0)`?
>>
>> Yes, that's the way to do it. You could use selected_real_kind(30) to>
>> get quad precision, if the compiler supports it.
>
> See the following if there is no suffix to a floating point number, the
> compiler consider it as REAL(4) as shown in by the warnings. However,
> the values stored in the variables x* seems to indicate that non
> suffixed floating point numbers are actually of the precision of
> REAL(10).
>
> Is it a bug in gfortran? What it actually should be according to
> FORTRAN standard?
>

> $ cat main.f90program main

> real(4):: x4 = 3.141592653589793238462643383279502884197169399
> real(8):: x8 = 3.141592653589793238462643383279502884197169399
> real(10):: x10 = 3.141592653589793238462643383279502884197169399
> real(16):: x16 = 3.141592653589793238462643383279502884197169399
> print *, x4
> print *, x8
> print *, x10
> print *, x16
> end program
> $ gfortran -Wconversion-extra -o main.exe main.f90
> main.f90:3.16:
>
> real(8):: x8 = 3.141592653589793238462643383279502884197169399
> 1
> Warning: Conversion from REAL(4) to REAL(8) at (1)
> main.f90:4.18:
>
> real(10):: x10 = 3.141592653589793238462643383279502884197169399
> 1
> Warning: Conversion from REAL(4) to REAL(10) at (1)
> main.f90:5.18:
>
> real(16):: x16 = 3.141592653589793238462643383279502884197169399
> 1
> Warning: Conversion from REAL(4) to REAL(16) at (1)
> $ ./main.exe 3.14159274 3.1415927410125732
> 3.14159274101257324219 3.14159274101257324218750000000000000

Using "print *, xx" will only get the compiler default print format for
whatever type xx has. To look at the full precision value you would need to
write all the values out using a longer explicit format. The numbers presented
would seem to be varying display precisions of the same bit configuration.

The question is what you get with a format like e30.25 for your several
variables.

mec...@gmail.com

unread,

Jun 16, 2015, 12:36:20 PM6/16/15

to

On Tuesday, June 16, 2015 at 10:56:28 AM UTC-5, Peng Yu wrote:
> See the following if there is no suffix to a floating point number, the compiler consider it as REAL(4) as shown in by the warnings. However, the values stored in the variables x* seems to indicate that non suffixed floating point numbers are actually of the precision of REAL(10).

No, all of the stored values are correct only to 24 bits (about seven decimal digits). The printed values are all in error, starting with the digit "7". The number of digits printed with a PRINT statement is the number of digits needed in the decimal representation to retain the full precision of the value in the internal form (IEEE binary). In your program, all the values stored in internal form have 24 bit precision.

> Is it a bug in gfortran? What it actually should be according to FORTRAN standard?

Instead of making unfounded speculations and asking people to confirm them, why don't you read some Fortran manuals, perhaps a copy of some version of the Fortran standard, and try a few different compilers on your programs?

glen herrmannsfeldt

unread,

Jun 16, 2015, 3:25:03 PM6/16/15

to

Peng Yu <peng...@gmail.com> wrote:

(snip)

> See the following if there is no suffix to a floating point number,
> the compiler consider it as REAL(4) as shown in by the warnings.
> However, the values stored in the variables x* seems to indicate
> that non suffixed floating point numbers are actually of the
> precision of REAL(10).

> Is it a bug in gfortran? What it actually should be according
> to FORTRAN standard?

Hmm. The idea of temporary real (80 bit) format in the x87 is
to keep intermediate values with extra precision during
calculations. For one exaple, that allows summing a bunch of
numbers with less worry about precision loss due to cancelation.

Once you do that, should constants also have extra precision?

What about constants that result from evaluating constant expressions
at compile time?

The effects of extra precision in expression evaluation have been
discussed before, but maybe not for constants.

-- glen

James Van Buskirk

unread,

Jun 16, 2015, 3:57:25 PM6/16/15

to

wrote in message
news:edd4003d-7459-46c3...@googlegroups.com...

There is a problem in that not all compilers have quad precision and
so if qp is used as a kind number, on such a compiler it will be
negative and cause an error. This can be worked around by creating
a candidate kind number and using MERGE to select between the
candidate and its backup if necessary. As an additional complication,
MERGE wasn't allowed in initialization expressions in f95, so it
proved necessary to use a more complicated expression with SIGN
to replace it.

D:\gfortran\clf\kinds>type kinds4.f90
module numeric_kinds
! named constants for 8, 4, 2, and 1 byte integers:
integer, parameter :: &
i8b = selected_int_kind(18), &

i4b = selected_int_kind(9), &
i2b = selected_int_kind(4), &
i1b = selected_int_kind(2)
! and for single, double and quadruple precision reals:
integer, parameter :: &

sp = kind([REAL::]), &
dp = kind([DOUBLE PRECISION::])
integer, parameter, private :: &
ep_we_hope = selected_real_kind(18,4931)
integer, parameter :: &
ep = (1+sign(1,ep_we_hope))/2*ep_we_hope+ &
(1-sign(1,ep_we_hope))/2*dp
integer, parameter, private :: &
qp_we_hope = selected_real_kind(2*precision(1.0_dp))
integer, parameter :: &
qp = merge(qp_we_hope,ep,qp_we_hope >= 0)
integer, parameter, private :: &
i16b_we_hope = selected_int_kind(38)
integer, parameter :: &
i16b = merge(i16b_we_hope,i8b,i16b_we_hope >= 0)
end module numeric_kinds

program test
use numeric_kinds
implicit none
integer(i16b) ii16b
integer(i8b) ii8b
integer(i4b) ii4b
integer(i2b) ii2b
integer(i1b) ii1b
real(sp) rsp
real(dp) rdp
real(ep) rep
real(qp) rqp
write(*,'(*(g0))') 'range(ii1b) = ',range(ii1b)
write(*,'(*(g0))') 'range(ii2b) = ',range(ii2b)
write(*,'(*(g0))') 'range(ii4b) = ',range(ii4b)
write(*,'(*(g0))') 'range(ii8b) = ',range(ii8b)
write(*,'(*(g0))') 'range(ii16b) = ',range(ii16b)
write(*,'(*(g0))') 'precision(rsp) = ',precision(rsp)
write(*,'(*(g0))') 'range(rsp) = ',range(rsp)
write(*,'(*(g0))') 'precision(rdp) = ',precision(rdp)
write(*,'(*(g0))') 'range(rdp) = ',range(rdp)
write(*,'(*(g0))') 'precision(rep) = ',precision(rep)
write(*,'(*(g0))') 'range(rep) = ',range(rep)
write(*,'(*(g0))') 'precision(rqp) = ',precision(rqp)
write(*,'(*(g0))') 'range(rqp) = ',range(rqp)
end program test

D:\gfortran\clf\kinds>gfortran kinds4.f90 -okinds4

D:\gfortran\clf\kinds>kinds4
range(ii1b) = 2
range(ii2b) = 4
range(ii4b) = 9
range(ii8b) = 18
range(ii16b) = 38
precision(rsp) = 6
range(rsp) = 37
precision(rdp) = 15
range(rdp) = 307
precision(rep) = 18
range(rep) = 4931
precision(rqp) = 33
range(rqp) = 4931

But with ifort not all of these kinds are available, but using the
kind numbers from the preceding module is still OK:

D:\gfortran\clf\kinds>ifort /nologo kinds4.f90

D:\gfortran\clf\kinds>kinds4
range(ii1b) = 2
range(ii2b) = 4
range(ii4b) = 9
range(ii8b) = 18
range(ii16b) = 18
precision(rsp) = 6
range(rsp) = 37
precision(rdp) = 15
range(rdp) = 307
precision(rep) = 33
range(rep) = 4931
precision(rqp) = 33
range(rqp) = 4931

Some duplication, but no compile-time errors.

e p chandler

unread,

Jun 16, 2015, 4:46:33 PM6/16/15

to

Your constants on the right hand side of the assignment statements are evaluated as they are listed without regard for the type of variable to which they are being assigned. Without a _<kind> suffix, they are default real type.

---- e

Ron Shepard

unread,

Jun 16, 2015, 9:19:33 PM6/16/15

to

On 6/16/15 10:56 AM, Peng Yu wrote:
> However, the values stored in the variables x* seems to indicate that non suffixed floating point numbers are actually of the precision of REAL(10).

I do not understand what you mean by this? The printed numbers all have
the same error, about 1.e-7, which is the real(4) precision on your
machine. You would see this if you wrote out precision(x4),
precision(x8), etc. What you see printed is various numbers of digits
of that truncated real(4) value.

Look at the number of decimal digits that are required to print various
binary values 2^(-n):

1 .5
2 .25
3 .125
4 .0625
5 .03125
6 .015625
7 .0078125
8 .00390625
...
23 1.1920928955078125e-7

Notice that the nth row of that table requires n decimal digits to
represent exactly. Your real(16) result has error of 1e-7, and it has 23
nonzero digits after the decimal point. Those are consistent with
real(4) accuracy and with that real(4) value being printed out to full
decimal accuracy. It is 23 because your value is between 2 and 4.

$.02 -Ron Shepard

JB

unread,

Jun 17, 2015, 4:23:18 AM6/17/15

to

On 2015-06-16, Peng Yu <peng...@gmail.com> wrote:
> See the following if there is no suffix to a floating point number, the compiler consider it as REAL(4) as shown in by the warnings. However, the values stored in the variables x* seems to indicate that non suffixed floating point numbers are actually of the precision of REAL(10).
>
> Is it a bug in gfortran? What it actually should be according to FORTRAN standard?

No, you're being tripped up by the binary->decimal
conversion. Consider the following modification of your program (this
works on gfortran/x86_64, no guarantees for other targets):

program main
real(4):: x4 = 3.141592653589793238462643383279502884197169399
real(8):: x8 = 3.141592653589793238462643383279502884197169399
real(10):: x10 = 3.141592653589793238462643383279502884197169399
real(16):: x16 = 3.141592653589793238462643383279502884197169399

write(*, '(g0,T53,B0)') x4, x4 ! 8 bits exponent
write(*, '(g0,T50,B0)') x8, x8 ! 11 bits exp
write(*, '(g0,T45,B0)') x10, x10 ! 15 bits exp + 1 bit for integral part
write(*, '(g0,T46,B0)') x16, x16 ! 15 bits exp
end program

Output:

3.14159274 1000000010010010000111111011011
3.1415927410125732 100000000001001001000011111101101100000000000000000000000000000
3.14159274101257324219 1000000000000001100100100001111110110110000000000000000000000000000000000000000
3.14159274101257324218750000000000000 1000000000000001001001000011111101101100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

I have selected the tab shifts so that the significands line up. You
can see that the kind 8/10/16 numbers have a significand where the
first part is identical with the real(4) number, and then padded with
zeros until the end.

So what this means is that GFortran converts the literal values in the
source code to a real of default kind, then on assignment the RHS is
converted to the kind on the LHS. Exactly like it should according to
the standard.

--
JB

Dan Nagle

unread,

Jun 17, 2015, 10:25:59 AM6/17/15

to

Hi,

On 2015-06-16 15:44:11 +0000, Steve Lionel said:

> On 6/16/2015 11:41 AM, Peng Yu wrote:
>
>> I use `kind(1.d0)` to get the correct number of kind for double
>> precision. But I don't know what number in quad precision 1.d0
>> corresponds to.
>
> There is no standard syntax to do this.

I think one could argue that real128 in the iso_fortran_env
constitutes a standard way of requesting quad precision.

Of course, one must make the "usual" assumption
regarding numeric storage units versus machine word size.

--
Cheers!

Dan Nagle

Wolfgang Kilian

unread,

Jun 17, 2015, 10:40:31 AM6/17/15

to

We just got beaten by that one. gfortran thinks that real128 should be
extended precision (80bit). Literally, this may be correct since it
uses 128bit storage in RAM.

===
use, intrinsic :: iso_fortran_env
print *, 4*atan(1._8)
print *, 4*atan(1._real128)
print *, 4*atan(1._16)
end
===
3.1415926535897931
3.14159265358979323851
3.14159265358979323846264338327950280

-- Wolfgang

--
E-mail: firstnameini...@domain.de
Domain: yahoo

FX

unread,

Jun 17, 2015, 11:12:46 AM6/17/15

to

> We just got beaten by that one. gfortran thinks that real128 should be
> extended precision (80bit). Literally, this may be correct since it
> uses 128bit storage in RAM.

Yes, the standard does say "a REAL type whose storage size expressed in
bits is 128". And "If the processor supports more than one kind of that
size, it is processor dependent which kind value is provided".

gfortran chooses to return the smaller kind value with the storage size
requested, which is why you're getting extended precision instead of
quadruple precision. (I had to check, but it turns out I was the one who
wrote that code!)

On gfortran, the kind value is dictated by the precision: kind value is
equal the precision divided by 8, rounding up [don't ask why!].

--
FX

FortranFan

unread,

Jun 17, 2015, 11:16:18 AM6/17/15

to

On Wednesday, June 17, 2015 at 10:40:31 AM UTC-4, Wolfgang Kilian wrote:

> ..
> We just got beaten by that one. ..

Yes, it makes sense you got "beaten" instead of being bit because:

-- begin code --
use, intrinsic :: iso_fortran_env, only : r8 => real64, r16 => real128

integer, parameter:: dp = selected_real_kind(15,307)
integer, parameter:: qp = selected_real_kind(33,4931)

print *, " double: 4.0_dp*atan(1.0_dp) = ", 4.0_dp*atan(1.0_dp)
print *, " double: 4.0_r8*atan(1.0_r8) = ", 4.0_r8*atan(1.0_r8)
print *, " quad: 4.0_qp*atan(1.0_qp) = ", 4.0_qp*atan(1.0_qp)
print *, " quad: 4.0_r16*atan(1.0_r16) = ", 4.0_r16*atan(1.0_r16)

stop

end program
-- end code --

Upon execution,
----
double: 4.0_dp*atan(1.0_dp) = 3.1415926535897931
double: 4.0_r8*atan(1.0_r8) = 3.1415926535897931
quad: 4.0_qp*atan(1.0_qp) = 3.14159265358979323846264338327950280

quad: 4.0_r16*atan(1.0_r16) = 3.14159265358979323846264338327950280

----

Wolfgang Kilian

unread,

Jun 17, 2015, 11:48:03 AM6/17/15

to

You don't state, which compiler was that?

-- Wolfgang

Thomas Koenig

unread,

Jun 17, 2015, 11:53:38 AM6/17/15

to

Peng Yu <peng...@gmail.com> schrieb:

> $ cat main.f90
> program main
> real(4):: x4 = 3.141592653589793238462643383279502884197169399
> real(8):: x8 = 3.141592653589793238462643383279502884197169399
> real(10):: x10 = 3.141592653589793238462643383279502884197169399
> real(16):: x16 = 3.141592653589793238462643383279502884197169399
> print *, x4
> print *, x8
> print *, x10
> print *, x16
> end program

> $ gfortran -Wconversion-extra -o main.exe main.f90

The most recent gfortran now issues the warning

main.f90:2:64:

real(4):: x4 = 3.141592653589793238462643383279502884197169399
1
Warning: Non-significant digits in »REAL(4)« number at (1), maybe incorrect KIND [-Wconversion-extra]

which should make this issue a little bit clearer.

kargl

unread,

Jun 17, 2015, 12:17:44 PM6/17/15

to

Fortunately, you can use real_kinds.

use, intrinsic :: iso_fortran_env
implicit none
print *, real_kinds
print *, real32, real64, real128
end

troutmask:sgk[217] gfc6 -o z a.f90 && ./z
4 8 10 16
4 8 10

FX has already indicated that he made the choice for real128.
One reason why might want real128 to be REAL(10), which
is Intel's 80-bit extended format, is that most computations
can be done in hardware. REAL(16) is a software floating
point implementation.

And, yes, I realize that individuals who have computational
backgrounds will automatically think real128 should match
up with IEEE-754 binary128.

--
steve

FortranFan

unread,

Jun 17, 2015, 12:44:25 PM6/17/15

to

Sorry - compiler is gfortran: GCC version 6.0.0 20150607 (experimental)

Tim Prince

unread,

Jun 17, 2015, 12:58:00 PM6/17/15

to

On 6/17/2015 11:16 AM, FortranFan wrote:
> use, intrinsic :: iso_fortran_env, only : r8 => real64, r16 => real128
>
> integer, parameter:: dp = selected_real_kind(15,307)
> integer, parameter:: qp = selected_real_kind(33,4931)
>
> print *, " double: 4.0_dp*atan(1.0_dp) = ", 4.0_dp*atan(1.0_dp)
> print *, " double: 4.0_r8*atan(1.0_r8) = ", 4.0_r8*atan(1.0_r8)
> print *, " quad: 4.0_qp*atan(1.0_qp) = ", 4.0_qp*atan(1.0_qp)
> print *, " quad: 4.0_r16*atan(1.0_r16) = ", 4.0_r16*atan(1.0_r16)
>
> stop
>
> end program

GNU Fortran (GCC) version 6.0.0 20150606 (experimental)
(x86_64-unknown-cygwin)
$ ./a

double: 4.0_dp*atan(1.0_dp) = 3.1415926535897931
double: 4.0_r8*atan(1.0_r8) = 3.1415926535897931
quad: 4.0_qp*atan(1.0_qp) = 3.14159265358979323846264338327950280

quad: 4.0_r16*atan(1.0_r16) = 3.14159265358979323851

Wolfgang Kilian

unread,

Jun 17, 2015, 1:32:14 PM6/17/15

to

Speed is an issue, but if you find that double is insufficient for some
numerics, you typically want to directly switch to quadruple and get a
reliable result - once this is established, you may check if extended
would also do.

>
> And, yes, I realize that individuals who have computational
> backgrounds will automatically think real128 should match
> up with IEEE-754 binary128.
>

Precisely. With the days of EQUIVALENCE long gone, I don't think in
terms of memory space when 'quadruple' is mentioned. And it's easy to
confuse ISO (_fortran_env) with IEEE, as you say.

We realized this some time ago and turned to a more elaborate
configuration where the user can choose among the available kinds.

-- Wolfgang

glen herrmannsfeldt

unread,

Jun 17, 2015, 2:46:54 PM6/17/15

to

kargl <s...@removetroutmask.apl.washington.edu> wrote:
> Wolfgang Kilian wrote:
(snip)

>> We just got beaten by that one. gfortran thinks that real128 should be
>> extended precision (80bit). Literally, this may be correct since it
>> uses 128bit storage in RAM.

(snip)

> use, intrinsic :: iso_fortran_env
> implicit none
> print *, real_kinds
> print *, real32, real64, real128
> end

> troutmask:sgk[217] gfc6 -o z a.f90 && ./z
> 4 8 10 16
> 4 8 10

> FX has already indicated that he made the choice for real128.
> One reason why might want real128 to be REAL(10), which
> is Intel's 80-bit extended format, is that most computations
> can be done in hardware. REAL(16) is a software floating
> point implementation.

There is hardware around that can do 128 bit floating point,
but it doesn't seem to be very common.

> And, yes, I realize that individuals who have computational
> backgrounds will automatically think real128 should match
> up with IEEE-754 binary128.

Or the decimal128 format? Though I believe that counts as double
precision, with decimal64 is single and decimal32 as half precision.

-- glen

FX

unread,

Jun 17, 2015, 4:29:47 PM6/17/15

to

> Sorry - compiler is gfortran: GCC version 6.0.0 20150607 (experimental)

On a platform where extended precision is not supported, but quad prec
is, apparently. So I'd guess powerpc (AIX or OS X)?

--
FX

FX

unread,

Jun 17, 2015, 4:35:51 PM6/17/15

to

Regarding this:

>> FX has already indicated that he made the choice for real128. One
>> reason why might want real128 to be REAL(10), which is Intel's 80-bit
>> extended format, is that most computations can be done in hardware.
>> REAL(16) is a software floating point implementation.

and that:

> Speed is an issue, but if you find that double is insufficient for some
> numerics, you typically want to directly switch to quadruple and get a
> reliable result - once this is established, you may check if extended
> would also do.

> [...]

> Precisely. With the days of EQUIVALENCE long gone, I don't think in
> terms of memory space when 'quadruple' is mentioned. And it's easy to
> confuse ISO (_fortran_env) with IEEE, as you say.

gfortran exists on multiple platforms, and so design choices made (where
the standard gives us some leeway) have to be viable on more than just
the common Intel hardware. This question can be argued both ways (as you
have done above), and unfortunately the standard's definition of REAL128
is not very useful in practice.

I believe the choice made, which is to choose the "smallest" (in terms of
precision) real kind of those fitting the requirements is, at the very
least, not entirely illogical and can be consistently applied through all
our supported platforms.

--
FX

JB

unread,

Jun 17, 2015, 4:38:57 PM6/17/15

to

Or maybe plain old 32-bit x86, where extended precision variables are
padded to 12 bytes rather than 16 as on x86-64?

(Personally, I think this choice of real128 is a bit confusing, but I
guess it's water under the bridge by now...)

--
JB

FX

unread,

Jun 17, 2015, 4:46:51 PM6/17/15

to

> Or maybe plain old 32-bit x86, where extended precision variables are
> padded to 12 bytes rather than 16 as on x86-64?

True, 32-bit x86 on Linux does that... I haven't use that in long time
now :)

--
FX

Gary Scott

unread,

Jun 17, 2015, 8:41:45 PM6/17/15

to

I would think it totally unacceptable...real128 should imply the most
likely definition (e.g. 128 bits, not 80 bits). Add a new non-standard
definition realxxx for 80 bits if you want to support that...argh...jus
argh...

kargl

unread,

Jun 17, 2015, 9:18:10 PM6/17/15

to

glen herrmannsfeldt wrote:

> kargl <s...@removetroutmask.apl.washington.edu> wrote:
>> Wolfgang Kilian wrote:
> (snip)
>>> We just got beaten by that one. gfortran thinks that real128 should be
>>> extended precision (80bit). Literally, this may be correct since it
>>> uses 128bit storage in RAM.
> (snip)
>
>> use, intrinsic :: iso_fortran_env
>> implicit none
>> print *, real_kinds
>> print *, real32, real64, real128
>> end
>
>> troutmask:sgk[217] gfc6 -o z a.f90 && ./z
>> 4 8 10 16
>> 4 8 10
>
>> FX has already indicated that he made the choice for real128.
>> One reason why might want real128 to be REAL(10), which
>> is Intel's 80-bit extended format, is that most computations
>> can be done in hardware. REAL(16) is a software floating
>> point implementation.
>
> There is hardware around that can do 128 bit floating point,
> but it doesn't seem to be very common.

Yes, I know as I'm the person that has been writing FreeBSD's
sparc64 libm long double functions.

glen herrmannsfeldt

unread,

Jun 17, 2015, 9:55:11 PM6/17/15

to

kargl <s...@removetroutmask.apl.washington.edu> wrote:

(snip, I wrote)

>> There is hardware around that can do 128 bit floating point,
>> but it doesn't seem to be very common.

> Yes, I know as I'm the person that has been writing FreeBSD's
> sparc64 libm long double functions.

Oh, which Sparc processors have it? I know it is defined, but
I thought most did it in software emulation.
(I have the SPARC V9 book.)

Otherwise, IBM S/370 and successors, and VAX H-float (optional,
but standard on the 11/730) have been around for years.

-- glen

Wolfgang Kilian

unread,

Jun 18, 2015, 3:15:16 AM6/18/15

to

The design choices are clearly acceptable, no problem with that.

That said, it's somewhat odd that the IEEE_* and ISO_FORTRAN_ENV modules
are part of the standard, but apparently not 'synchronized' with each
other - that is, the latter defines named kinds which are not identical
to IEEE kinds, while the former defines IEEE features but no IEEE kinds.

Probably I'm missing something, or the issue is more complicated, but
what about binary64 etc. as additional standardized named kind
parameters in either of those intrinsic modules?

-- Wolfgang

kargl

unread,

Jun 18, 2015, 11:50:51 AM6/18/15

to

glen herrmannsfeldt wrote:

> kargl <s...@removetroutmask.apl.washington.edu> wrote:
>
> (snip, I wrote)
>>> There is hardware around that can do 128 bit floating point,
>>> but it doesn't seem to be very common.
>
>> Yes, I know as I'm the person that has been writing FreeBSD's
>> sparc64 libm long double functions.
>
> Oh, which Sparc processors have it? I know it is defined, but
> I thought most did it in software emulation.
> (I have the SPARC V9 book.)
>

Sorry, Glen, reading comprehension problem on my part. ;-)
I missed that you specifically stated hardware. On sparc64,
I'm implying things like lgammal, erfl, etc. These are definitely
software implementation. I have looked to see what happens
at a lower level (e.g., +. -, /, *, etc.).

--
steve

robert....@oracle.com

unread,

Jun 18, 2015, 8:55:11 PM6/18/15

to

On Wednesday, June 17, 2015 at 6:55:11 PM UTC-7, glen herrmannsfeldt wrote:
>
> Oh, which Sparc processors have it? I know it is defined, but
> I thought most did it in software emulation.
> (I have the SPARC V9 book.)

The SPARC architecture has included instructions for 128-bit floating-point operations since at least version 8 of the architecture specification. A compliant implementation of the SPARC architecture is required to support the 128-bit floating-point instructions in a combination of hardware and software. To date, that support has been almost entirely in software. One advantage of using the 128-bit floating-point instructions is that floating-point exceptions are handled the same regardless of the operand size.

glen herrmannsfeldt

unread,

Jun 19, 2015, 8:49:31 AM6/19/15

to

kargl <s...@removetroutmask.apl.washington.edu> wrote:
(snip, I wrote)
>>>> There is hardware around that can do 128 bit floating point,
>>>> but it doesn't seem to be very common.

>>> Yes, I know as I'm the person that has been writing FreeBSD's
>>> sparc64 libm long double functions.

(and also wrote)

>> Oh, which Sparc processors have it? I know it is defined, but
>> I thought most did it in software emulation.
>> (I have the SPARC V9 book.)

> Sorry, Glen, reading comprehension problem on my part. ;-)
> I missed that you specifically stated hardware. On sparc64,
> I'm implying things like lgammal, erfl, etc. These are definitely
> software implementation. I have looked to see what happens
> at a lower level (e.g., +. -, /, *, etc.).

IBM added it with the 360/85 and continued with S/370.

Well, according to IBM calculations, if the hardware supported add,
subtract, and multiply, then software divide was enough. That divide
didn't occur so often. But then in ESA/390, in the 1990's, they
added extended precision divide.

I am not so sure which VAX models had hardware (microcode) optional
support for H-float.

I suppose processors are getting faster, but for number crunching you
want hardware support before using it generally.

-- glen

Tim Prince

unread,

Jun 19, 2015, 9:15:43 AM6/19/15

to

On 6/19/2015 8:49 AM, glen herrmannsfeldt wrote:

>
> I am not so sure which VAX models had hardware (microcode) optional
> support for H-float.
>

I remember VAX/750 as the first with microcode H-float. I think it had
the same precision and nearly the range of the later IEEE 128-bit format
(higher than the IBM quad). I'm surprised to see how little on-line
literature is available on VAX.

> I suppose processors are getting faster, but for number crunching you
> want hardware support before using it generally.
>

Particularly in view of the lack of simd hardware for quad precision.

Compilers, e.g. gfortran, routinely use implicit higher precision data
types for compile-time constant propagation, with gmp/mpfr supporting it
very well.
Many people have assumed erroneously that the existence of 128- 256- and
512-bit data formats implied hardware support for equivalent large data
types. gnu compilers even support equivalent big integers, but without
the arithmetic operators.

JB

unread,

Jun 22, 2015, 3:45:30 AM6/22/15

to

On 2015-06-19, Tim Prince <tpr...@computer.org> wrote:
> Compilers, e.g. gfortran, routinely use implicit higher precision data
> types for compile-time constant propagation, with gmp/mpfr supporting it
> very well.

As a side-note, GFortran doesn't actually do this. MPFR is used for
compile-time evaluation of floating point expressions, yes, but it's
done in the same precision has the source datatype implies.

--
JB

FX

unread,

Jun 22, 2015, 3:49:49 AM6/22/15

to

>> Compilers, e.g. gfortran, routinely use implicit higher precision data
>> types for compile-time constant propagation
>

> As a side-note, GFortran doesn't actually do this.

If some compilers do this (user higher precision for intermediate
calculations), I think they're a minority.

--
FX

FX

unread,

Jun 22, 2015, 3:57:52 AM6/22/15

to

For example, none of the compilers I have at hand would print "0" as the
output of this short program:

integer, parameter :: n = 3
print *, (n + 1) * (1 / real(n)) - 1 - 1 / real(n)
end

--
FX

Tim Prince

unread,

Jun 22, 2015, 6:20:45 AM6/22/15

to

I see that my description of what gfortran does is open to
misunderstanding. As FX said, the constants are rounded to the declared
data type before they appear in the program. It's not an extended
precision such as might have appeared in some compilers for x87, or even
what ifort gives when -fpconstant is set for consistency with the long
time ago Microsoft Fortran. They are calculated by mpfr with sufficient
precision internally to expect a correctly rounded answer, rather than
trying to guess what may happen with whatever math library may be linked
by the time an executable is built.
I did have a misconception with respect to the last example FX gave above.

robin....@gmail.com

unread,

Jun 22, 2015, 9:18:08 AM6/22/15

to

Why should they?

Steve Lionel

unread,

Jun 22, 2015, 10:06:37 AM6/22/15

to

On 6/19/2015 9:15 AM, Tim Prince wrote:
> On 6/19/2015 8:49 AM, glen herrmannsfeldt wrote:
>
>> >
>> >I am not so sure which VAX models had hardware (microcode) optional
>> >support for H-float.
>> >
> I remember VAX/750 as the first with microcode H-float. I think it had
> the same precision and nearly the range of the later IEEE 128-bit format
> (higher than the IBM quad). I'm surprised to see how little on-line
> literature is available on VAX.

The first was actually the 11/780, for which "G and H float" was an
option added in 1979 or so. I was present at the Smithsonian
Astrophysical Observatory computer lab when their 780 was the first to
be retrofitted with this option. It included new firmware (loaded from
an RX01 8-inch floppy disk via the embedded PDP-11) and wire-wrap changes.

The 11/750 had a firmware-only option, but it was REALLY SLOW. The
11/730 was the first VAX to have any kind of "real" hardware support for
quad precision as it was an FPGA machine. For many years, the 730 was
the fastest H-float VAX even though it was one of the slowest overall. I
think the VAX 8600 "Venus" implemented G and H float in hardware. No
other models did.

Similar to what Bob Corbett described for Sun machines, the VAX
architecture was extended with new instructions for G and H float, and
the OS loaded an emulator (mainly written by Stan Rabinowitz, with
integration into VMS by yours truly) so that if the processor didn't
recognize the instructions they would be emulated in software.

H-float and IEEE-quad were pretty much identical in precision in range,
the main difference being in how the exponent was biased and the lack of
denorms or infinities. (Similarly, G-float and IEEE double - IEEE
floating point didn't yet exist.) At the time, IBM's quad precision was
implemented using two doubles with shifted fractions, yielding greater
precision but not range.

--
Steve Lionel
Developer Products Division
Intel Corporation
Merrimack, NH

For email address, replace "invalid" with "com"

User communities for Intel Software Development Products
http://software.intel.com/en-us/forums/
Intel Software Development Products Support
http://software.intel.com/sites/support/
My Fortran blog
http://www.intel.com/software/drfortran

Refer to http://software.intel.com/en-us/articles/optimization-notice
for more information regarding performance and optimization choices in
Intel software products.

Tim Prince

unread,

Jun 22, 2015, 12:33:52 PM6/22/15

to

Thanks for filling in the missing bits of history.
I suppose it was relatively rare to retrofit the 11/780, so it may not
be surprising that we never encountered one with H-float facility.
As far as I know, the IBM360 and SGI/MIPS quad precision as effectively
a structure of 2 doubles persisted through their history, yielding
106-107 bits mantissa precision in the SGI case, with lower range than
plain double (due to premature underflow).

James Van Buskirk

unread,

Jun 22, 2015, 3:39:45 PM6/22/15

to

"Tim Prince" wrote in message news:cuq5rr...@mid.individual.net...

mpfr clearly can produce good results even when the math library
yields an invalid answer. One of my favorites is:

D:\gfortran\clf\coshtest>type coshtest.f90
program coshtest
implicit none
integer, parameter :: dp = kind([double precision::])
real(dp), parameter :: x = log(huge(1.0_dp))+log(8.0_dp)/2
real(dp), parameter :: y = atan(1.0_dp)
complex(dp), parameter :: test = cosh(cmplx(nearest(x,-1.0),y,dp))
complex(dp) z
write(*,*) test
z = cmplx(nearest(x,-1.0),y,dp)
write(*,*) cosh(z)
end program coshtest

D:\gfortran\clf\coshtest>gfortran coshtest.f90 -ocoshtest

D:\gfortran\clf\coshtest>coshtest
( 1.7976931348621251E+308, 1.7976931348621249E+308)
( Infinity, Infinity)

D:\gfortran\clf\coshtest>ifort /nologo coshtest.f90
coshtest.f90(6): error #6362: The data types of the argument(s) are invalid.
[
COSH]
complex(dp), parameter :: test = cosh(cmplx(nearest(x,-1.0),y,dp))
-----------------------------------------^
coshtest.f90(10): error #6362: The data types of the argument(s) are
invalid.
[COSH]
write(*,*) cosh(z)
-------------------^
compilation aborted for coshtest.f90 (code 1)

Dang. Is that a new feature of f2008 or something? OK, so we'll try it
with
cos(z) instead:

D:\gfortran\clf\coshtest>type costest.f90
program costest
implicit none
integer, parameter :: dp = kind([double precision::])
real(dp), parameter :: x = log(huge(1.0_dp))+log(8.0_dp)/2
real(dp), parameter :: y = atan(1.0_dp)
complex(dp), parameter :: test = cos(cmplx(y,nearest(x,-1.0),dp))
complex(dp) z
write(*,*) test
z = cmplx(y,nearest(x,-1.0),dp)
write(*,*) cos(z)
end program costest

D:\gfortran\clf\coshtest>ifort /nologo costest.f90

D:\gfortran\clf\coshtest>costest
(1.797693134862125E+308,-1.797693134862125E+308)
(1.797693134862125E+308,-1.797693134862125E+308)

D:\gfortran\clf\coshtest>gfortran costest.f90 -ocostest

D:\gfortran\clf\coshtest>costest
( 1.7976931348621251E+308, -1.7976931348621249E+308)
( Infinity, -Infinity)

So in this somewhat pathological case ifort's math library is doing
a better job than gfortran's.

glen herrmannsfeldt

unread,

Jun 22, 2015, 3:54:08 PM6/22/15

to

Steve Lionel <steve....@intel.invalid> wrote:

(snip)

> H-float and IEEE-quad were pretty much identical in precision in range,
> the main difference being in how the exponent was biased and the lack of
> denorms or infinities. (Similarly, G-float and IEEE double - IEEE
> floating point didn't yet exist.) At the time, IBM's quad precision was
> implemented using two doubles with shifted fractions, yielding greater
> precision but not range.

IBM HFP (hexadecimal floating point), for short, long, and extended
precision has a 7 bit base 16 exponent, allowing about 1e-78 to 1e+75.
For extended precision, the lower half doesn't use the bits where the
exponent would go in a double precision value, which simplifies the
software emulation. So, 112 bits, or 28 hex digits, of significand.

Sometime during ESA/390 years BFP (binary floating point) was added,
again with short, long, and extended (32, 64 and 128 bit formats),
and with more exponent bits for the wider formats.

More recently, decimal floating point was added, with 64 bit and
128 bit formats, following the IEEE 754-2008 standard. As with
binary, more exponent bits for the larger format.

-- glen

kargl

unread,

Jun 22, 2015, 4:05:15 PM6/22/15

to

James Van Buskirk wrote:

>
> mpfr clearly can produce good results even when the math library
> yields an invalid answer. One of my favorites is:
>
> D:\gfortran\clf\coshtest>type coshtest.f90
> program coshtest
> implicit none
> integer, parameter :: dp = kind([double precision::])
> real(dp), parameter :: x = log(huge(1.0_dp))+log(8.0_dp)/2
> real(dp), parameter :: y = atan(1.0_dp)
> complex(dp), parameter :: test = cosh(cmplx(nearest(x,-1.0),y,dp))
> complex(dp) z
> write(*,*) test
> z = cmplx(nearest(x,-1.0),y,dp)
> write(*,*) cosh(z)
> end program coshtest
>
> D:\gfortran\clf\coshtest>gfortran coshtest.f90 -ocoshtest
>
> D:\gfortran\clf\coshtest>coshtest
> ( 1.7976931348621251E+308, 1.7976931348621249E+308)
> ( Infinity, Infinity)

It depends on the math library for the operating system. For
gfortran on x86_64 FreeBSD, I get

% gfc6 -o z -O jvb.f90
% ./z
( 1.7976931348621251E+308, 1.7976931348621249E+308)
( 1.7976931348621251E+308, 1.7976931348621249E+308)

> So in this somewhat pathological case ifort's math library is doing
> a better job than gfortran's.

HTH

--
steve

Jos Bergervoet

unread,

Jun 22, 2015, 5:03:51 PM6/22/15

to

Well, even Mathematica knows this! (Skipping PL/I for a moment)

Simplify[ (n+1)*(1/Re[n]) -1 -1/Re[n], Element[n, Integers]]

Out[1]= 0

It probably comes down to what transformational rules the
compiler is allowed to make.. Order of evaluation is not
enough here, and using the distributive property is not
allowed perhaps in Fortran?

--
Jos