
Let the ALLOCATE statement check if variables are already ALLOCATED


Beliavsky

Jun 23, 2019, 10:34:00 AM
It would be nice if the ALLOCATE statement could be extended to take an optional argument so that instead of writing, as I often do, something like

if (allocated(x)) deallocate(x)
if (allocated(y)) deallocate(y)
allocate(x(n),y(n))

I could write

allocate(x(n),y(n),check_already_allocated=.true.)

If the ALLOCATE statement allocates many variables, this extension could save a lot of code and increase clarity.

Gary Scott

Jun 23, 2019, 10:57:26 AM
Meaning?

allocate(x(n),y(n),reallocate_if_already_allocated=.true.)

Beliavsky

Jun 23, 2019, 11:29:54 AM
Yes, the proposed syntax would do what the first code snippet does.

steve kargl

Jun 23, 2019, 12:37:57 PM
Modern Fortran supports a feature known as allocation on assignment.

x = .... ! something with shape n
... ! Many statements later.
x = .... ! something with shape m /= n

PS: yes, I know one still needs to (re)ALLOCATE before reading input, i.e.,

if (allocated(x)) deallocate(x)
allocate(x(n))
read(fd,*) (x(i), i=1, n)
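
For illustration, a minimal self-contained sketch of both behaviors (an added example, not part of the original post):

program realloc_demo
  implicit none
  real, allocatable :: x(:)

  x = [1.0, 2.0, 3.0]   ! allocation on assignment: x is allocated with size 3
  x = [x, 4.0, 5.0]     ! reallocated automatically to size 5
  print *, size(x)

  ! READ is not an assignment, so the explicit (re)allocation is still needed:
  if (allocated(x)) deallocate(x)
  allocate(x(2))
  ! read(fd,*) x        ! would fill the freshly allocated array
end program realloc_demo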


--
steve

Richard Weed

Jun 23, 2019, 1:07:48 PM
I see this as one of those really good ideas that would be a great help to users but will never get implemented, or even considered, by the standards folks because it doesn't mesh with what some members of the committee think Fortran should be. I would implement it with a simple "reallocate" logical, i.e.,

allocate(x(n), y(n), reallocate=.TRUE.)

Obviously, you would want this to cover the case where either x or y is already allocated.

I know you can do the same thing with (re)allocate on assignment but I am still not convinced that it works correctly on all compilers without either a severe performance hit or triggering an unneeded deep copy etc. I've encountered issues with it on more than one recent compiler and tend to avoid using it if I can.

Another extension might be to allow multiple allocations to be enclosed in parens and allow either source or typed allocation for specific entities in the list

allocate(x(n), (y(n), SOURCE=10.0), (integer: r(10)), reallocate=.TRUE.)
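
In the meantime, the effect of a "reallocate" keyword can be wrapped in a small helper; a minimal sketch (the module and the name "realloc" are only illustrative, and a separate specific is needed for each type/kind/rank):

module realloc_mod
  implicit none
contains
  subroutine realloc(x, n)
    ! (Re)allocate x to size n, discarding any previous contents.
    real, allocatable, intent(out) :: x(:)
    integer, intent(in) :: n
    ! INTENT(OUT) already deallocates x on entry, so a plain ALLOCATE suffices.
    allocate(x(n))
  end subroutine realloc
end module realloc_mod

! usage:
!   call realloc(x, n)
!   call realloc(y, n)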

FortranFan

Jun 23, 2019, 4:24:48 PM
As I stated in another thread, it appears what interested users can do is make a case for enhancements such as this in the standard and put together a proposal paper for submission to one or more contacts listed at the WG5 Fortran website. GitHub is a good platform for users to group together and collaborate to capture their needs, collect use cases, and summarize *what* is of interest for an enhancement or addition to the language. The standards committee can then best work out the response: either no, or, if yes, the *how* and *when*!

Richard Weed

Jun 23, 2019, 5:45:54 PM
While one would hope the committee would be amenable to ideas from outside the committee, I was told by a member of the committee that I worked with a couple of times several years ago that getting anything considered, much less approved, would be difficult. Your suggestion about using GitHub as a platform for users to get together is a good one and has been proposed by others, but I haven't seen anyone actually set up such a site. My idea along those lines is a "shadow government" where users would do what you suggest: take the current interpretation document available on the J3 website and submit the changes they would like to see, with specific changes to the standard text along with examples and use cases. Basically, let users generate their own version of the standard and contrast that with what the committee is developing. I think we would end up with more changes that are of use to a wider community of users (such as Mr. Beliavsky's suggested change to ALLOCATE) than what is currently being discussed by the committee (see document N2165 at http://wg5-fortran.org). While I applauded Steve Lionel's attempt to solicit user community input, I see very little of what users submitted to Steve's poll from earlier this year, other than templates/generics, in that document. I would suggest to Steve that before the committee makes a final decision on what should be in the next standard revision he do a second poll that allows users to comment on the contents of N2165 and subsequent documents. Even better, stand up a users-only site where users can submit ideas directly to the committee.

Steve Lionel

Jun 23, 2019, 9:10:36 PM
On 6/23/2019 5:45 PM, Richard Weed wrote:
> While one would hope the committee would be amenable to ideas from outside the committee, I was told by a member of the committee that I worked with a couple of times several years ago that getting anything considered, much less approved, would be difficult.
...
> I would suggest to Steve that before the committee makes a final decision on what should be in the next standard revision he do a second poll that allows users to comment on the contents of N2165 and subsequent documents. Even better, stand up a users-only site where users can submit ideas directly to the committee.

The planning for the next revision consisted PRIMARILY of listening to
users and adding features users requested. The survey was open for many
months, and I promoted the survey and the results more than once. At
some point we have to nail down the work plan and get to development.
Constant one-plusing is what made Fortran 2018 take so long. There were
so many good ideas from the survey we couldn't possibly take them all
this time.

My goal for this next revision is to get it done in five years or less.
That means we're going to close the gates earlier than was done in the
past, and any new proposals considered had better be spectacular (and
not previously considered) to justify reopening the work plan. Features
that don't actually add functionality, but are just another way of doing
something not many programs do, won't pass that test.

It is my goal, as WG5 Convenor, to make the standards process as open
and transparent as I can. But it is also my goal to get revisions out in
a reasonable time. I welcome any and all participation and have actively
recruited new J3 members from industry. If anyone tries to tell you that
J3 or WG5 doesn't accept user input, don't believe them. But we need
more than just suggestions, we need people to put in the work to develop
the features in the standard. It's hard work and sometimes great ideas
become impractical when we start thinking of how they play with the
existing language.

Everyone is welcome to attend J3 meetings, which are typically held in
Las Vegas in February and October. A joint J3/WG5 meeting is usually in
June but can move around (it's August this year), and alternates between
North America and other member countries. Dates are on the J3 web site
(https://j3-fortran.org)

The US committee has already prepared its list of desired features for
the next revision based on the survey and other input from users. I
expect to see other active countries (UK, Japan, Germany, Canada)
deliver their lists at the August meeting in Tokyo.

If someone wants to start an effort to collect ideas, that's fine with
me. We'll be glad to look at that for the revision after the one in
progress. But I'd be happier to see interested users showing up at
meetings and helping us develop the concepts and write the words.

--
Steve Lionel
Retired Intel Fortran developer/support
Email: firstname at firstnamelastname dot com
Twitter: @DoctorFortran
LinkedIn: https://www.linkedin.com/in/stevelionel
Blog: http://intel.com/software/DrFortran

robin....@gmail.com

Jun 24, 2019, 2:59:39 AM
On Monday, June 24, 2019 at 12:34:00 AM UTC+10, Beliavsky wrote:
> It would be nice if the ALLOCATE statement could be extended to take an optional argument so that instead of writing, as I often do, something like
>
> if (allocated(x)) deallocate(x)
> if (allocated(y)) deallocate(y)
> allocate(x(n),y(n))
>
> I could write
>
> allocate(x(n),y(n),check_already_allocated=.true.)

This is already more typing than IF (ALLOCATED(X)) DEALLOCATE (X)

> If the ALLOCATE statement allocates many variables, this extension could save a lot of code and increase clarity.

It might be better if ALLOCATE checked (and deallocated, if necessary)
prior to ALLOCATE-ing.

edmondo.g...@gmail.com

Jun 24, 2019, 5:40:48 AM
Most of the time I use allocatables as local variables or as dummy arguments with intent out. So I can be certain that they are not allocated and I almost never deallocate them explicitly.

But in the few remaining cases I think your proposal is sensible.
When I can, I use reallocation on assignment, but I can't always use it.

Beliavsky

Jun 24, 2019, 6:38:36 AM
On Monday, June 24, 2019 at 2:59:39 AM UTC-4, robin...@gmail.com wrote:

> It might be better if ALLOCATE checked (and deallocate, if necessary)
> prior to ALLOCATE-ing.

I would like that, but I am sure that the standards committee has had some reason (performance?) for not mandating this. I think this has been discussed in c.l.f. before.

Wolfgang Kilian

Jun 24, 2019, 9:32:59 AM
I wonder whether the absence of that option is a feature -- not a bug?

In the present standard, there are various scenarios where the
allocation check is not needed:
(1) If the arrays x,y are local (unsaved) allocatables of some procedure,
they start each invocation unallocated.
(2) If the arrays x,y are dummy arguments of some procedure, they will
be deallocated on entry if they are tagged as INTENT(OUT).
(3) If allocation on assignment is used, deallocation is automatic if
(and only if) necessary.
(4) Clean-up procedures are not needed for local allocatable
(sub)objects, since deallocation is automatic at the end of a procedure
(see the sketch after this list). Such code was ubiquitous when array
pointers were the de-facto standard for dynamic memory, before the
'allocatable TR'.
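
For example, a minimal sketch of cases (1) and (4), added for illustration:

subroutine work(n)
  integer, intent(in) :: n
  real, allocatable :: tmp(:)   ! local allocatable, no SAVE attribute
  allocate(tmp(n))              ! no check needed: tmp starts unallocated
  ! ... use tmp ...
end subroutine work             ! tmp is deallocated automatically here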

On the other hand, if an array that I want to use turns out to be
already allocated with wrong shape, this is often a symptom of some
error in my program logic. In most situations, I expect the allocation
status to be well-defined and appropriate to the problem.

This doesn't exclude situations where the deallocate check is perfectly
valid. However, thinking of it - I'd actually prefer to have this extra
line

if (allocated (x)) deallocate (x)

sticking out in my code. In the best of all worlds, with a comment that
tells the reader why the check may be required.

-- Wolfgang

--
E-mail: firstnameini...@domain.de
Domain: yahoo

edmondo.g...@gmail.com

Jun 24, 2019, 10:14:16 AM
This is why, in the original proposal, it would only be enabled by a keyword.
That means that in all other situations an already allocated variable would still trigger an error.
It may even be slightly more efficient, as it should be equivalent to:

if (allocated(x)) then
   if (size(x) /= n) then
      deallocate(x)
      allocate(x(n))
   endif
else
   allocate(x(n))
endif

The compiler may, of course, optimize away a deallocation and reallocation of the same size anyway.


FortranFan

Jun 24, 2019, 3:39:13 PM
On Sunday, June 23, 2019 at 9:10:36 PM UTC-4, Steve Lionel wrote:

> ..
> The planning for the next revision consisted PRIMARILY of listening to
> users and adding features users requested. The survey was open for many
> months, and I promoted the survey and the results more than once. At
> some point we have to nail down the work plan and get to development.
> Constant one-plusing is what made Fortran 2018 take so long. There were
> so many good ideas from the survey we couldn't possibly take them all
> this time.
>
> My goal for this next revision is to get it done in five years or less.
> That means we're going to close the gates earlier than was done in the
> past, and any new proposals considered had better be spectacular (and
> not previously considered) to justify reopening the work plan. Features
> that don't actually add functionality, but are just another way of doing
> something not many programs do, won't pass that test.
> ..


I wish WG5 could also figure out and adopt *more* of the modern options for online collaboration, at least for the aspects of language development that fall into the category the "What is New in Fortran 2018" document at the WG5 website calls "Features that address deficiencies and discrepancies". In my mind, what is suggested in the original post for the ALLOCATE statement falls under this bracket.

The scope and effort needed for feature enhancements such as these, the "minor" ones per Modern Fortran Explained, is individually quite limited, though it can become burdensome if approached in the traditional manner of a subcommittee serially processing a burgeoning number of such requests.

Offloading a lot of such effort, especially the grunt work that is otherwise constant, to a more modern development model which also involves crowd-sourcing and online collaboration platforms and which often garners 24x7x365 user engagement, can really help Fortran with parallelized and semi-automated advancement. What is mostly required is enumeration and enunciation by Fortran (sub)committees of a basic set of language semantics (rules) and (other) requirements and constraints which need to be kept in mind and the "crowd" can then iron out a lot of wrinkles in, and even reject, its own ideas. The standard (sub)committee(s) would then review, refine, and redirect development and hopefully reduce its own burden along the way.

My bottom-line message: at least with "Features that address deficiencies and discrepancies" in Fortran, the standard body would do well to consider alternate options that allow the introduction of MORE as well as SPEEDIER refinements in the language. Otherwise, the trope of citing time and resource constraints, constraints tied to *traditional* development processes that are as burdensome as procedural programming itself, as a reason to hold back new Fortran improvements appears a disservice to its practitioners.

Richard Weed

Jun 24, 2019, 9:37:04 PM
Thank you, FortranFan, for putting into better words what I guess I was asking for, i.e., some way for us users with many years of Fortran development experience (about 45 years and a hundred-thousand-plus lines of code in my case) to share our knowledge and expertise with the committee without having to use our own personal time and money or beg our employers to support a trip to Las Vegas or elsewhere. If I told my boss I wanted him to pay for a trip to Las Vegas to participate in a Fortran standards committee meeting, he would kick me out of his office. As FortranFan suggests, the only way for wider participation in the standard development process to happen is a new development model that leverages the internet and other modern communication platforms.

As to Steve's comments regarding trying to limit the scope of work for the next standard and hoping to get something done in five years, I can only point to N2165 and ask why consider anything OTHER than what the majority of people who responded to Steve's survey clearly indicated was the most important feature lacking in Fortran, namely something like the STL and/or templates in general. If I were in Steve's or Dan Nagle's position, I would throw out most of what is in N2165 and start over with just generics plus corrections and interpretations to the current standard, and I would set a maximum of two years to get it done.

As to generics, I would adopt the following approach. First I would create something like the STL, where containers and the procedures that use them would either exist at the level of intrinsic types or (probably a better approach) be brought in when needed via an ISO_FORTRAN_STL module. I would start with an STL-like facility because modern solution algorithms in the areas I work in (finite elements and computational fluid dynamics) rely heavily on accessing unstructured data (i.e., data that doesn't map easily to an array or a tensor-product mesh). For this you need lists, trees, queues, stacks, dictionaries, etc., and a range of sorting/searching algorithms. If you survey modern FEM and CFD C++ codes you will see heavy use of the STL. I think you can make a case that the STL is a major reason for new code development in these two disciplines moving to C++. Programmers get the classic ADTs for free and don't have to waste time writing their own versions. I would only implement full-blown user-defined templates after I had something like the STL in place. Unfortunately, I get the feeling that the committee will proceed with their view of generics without fully understanding exactly how they are used in real-world codes, relying only on some abstract computer-science notion of what generics should be.


Just my 2 cents.

Juan Domínguez

Jun 26, 2019, 3:17:38 AM
Yes, I find it useful, but I also think that these features don't necessarily have to be implemented in the language. I think that having a standard or de-facto library for this and many other features would be the correct approach.

The problem is that to have a good "standard" library we need generics or templates.

Other languages have followed that path and programming in Python without the standard library or in C++ without the STL is unthinkable these days.

Michael Siehl

Jun 26, 2019, 4:18:17 PM
I am not against such but have a personal and general comment regarding any future Fortran language extensions:

Depending on the chosen runtime (sequential or coarray), Fortran comprises two very distinct programming languages: with the coarray runtime in use, traditional statements can have different or extended uses. The ALLOCATE statement, for example, can be used not only to allocate (local) memory but also to establish or repair data transfer channels among images (through allocatable coarrays), or to (newly) establish execution segment ordering for a coarray (not explicitly stated by the standard, though, I think).
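
For readers less familiar with that second role, a minimal sketch (an added illustration; it needs a coarray-enabled compiler):

program coarray_alloc_demo
  implicit none
  real, allocatable :: buf(:)[:]
  ! Allocating an allocatable coarray is collective: every image executes it
  ! with the same bounds, and the statement acts as an image synchronization.
  allocate(buf(100)[*])
  buf = real(this_image())
  sync all
  if (this_image() == 1) print *, 'image 1 sees', buf(1)[num_images()]
  deallocate(buf)   ! likewise collective
end program coarray_alloc_demo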

Only a few months ago I learned that we can use F03 OOP for the development of sophisticated parallel algorithms. Inheritance works; runtime polymorphism should also work. More research is required on this; one of my current focuses is strategies for implementing F03 classes as distributed objects for doing general-purpose parallel programming. I can't tell yet if this will result in a somewhat different use of OOP for parallel programming.

Thus, whenever someone wants to extend or change something in the Fortran language, she/he needs to consider the possible consequences for both of the runtimes. This can be difficult because we do not yet have enough understanding of the possibilities and consequences of the coarray runtime.

To me, it would seem useful to distinguish between the different languages resulting from the different runtimes by explicitly using the name Coarray Fortran, because the name Fortran 2018 alone does not distinguish between the two runtimes.

Cheers

Richard Weed

Jun 26, 2019, 8:38:59 PM
On Wednesday, June 26, 2019 at 3:18:17 PM UTC-5, Michael Siehl wrote:

> Depending on the chosen runtime (sequential or coarray), Fortran comprises two very distinct programming languages: With the coarray runtime in use, traditional statements can have different or extended uses. The ALLOCATE statement, for one example, can be used to not only allocate (local) memory but also to establish or repair data transfer channels among images (through allocatable coarrays) or to (newly) establish execution segment ordering for a coarray (not explicitely stated by the standard though, I think).

Interesting. I never really thought about sequential and coarray as being different runtimes, but you are exactly right. In light of that, I wonder if a better approach would have been to have dedicated allocate and deallocate statements (co_allocate and co_deallocate, maybe) for co-objects. I would think there might be some possible optimizations if the code for sequential and coarray allocations were separated, if for no other reason than that the sequential version could have been kept pristine and the coarray version could then build on it. Obviously it's too late for that now, but I would be interested to learn from Steve or others who were around when the coarray facility was being developed whether that was ever discussed.

Gary Scott

Jun 26, 2019, 9:27:36 PM
I sure hope nobody tries to hold back language progress due to coarray
compatibility or lack thereof :(

Wolfgang Kilian

Jun 28, 2019, 3:56:32 AM
On 25.06.2019 03:37, Richard Weed wrote:
>
> As to generics, I would adopt the following approach. First I would create something like the STL where containers and procedures that use them would either exist at the level of intrinsic types or (probably a better approach) be brought in when needed via a ISO_FORTRAN_STL module. I would start with an STL like facility because modern solutions algorithms in areas I work in (Finite Elements and Computational Fluid Dynamics) rely heavily on accessing unstructured data (ie doesn't map easily to an array or a tensor product mesh). For this you need lists, trees, queues, stacks, dictionaries etc and a range of sorting/searching algorithms. If you do a survey of modern FEM and CFD C++ codes you will see heavy use of the STL. I think you can make a case that the STL is a major reason for new code development in these two disciplines moving to C++. Programmers get the classic ADTs for free and don't have to waste time writing there own versions. I would only implement full blown user defined templates after I something like the STL in place. Unfortunately, I get the feeling that the committee will proceed with their view of generics without fully understanding exactly how they are used in real world codes and only rely on some abstract computer science notion of what generics should be.
>
>
> Just my 2 cents.
>

Agreed with this message; STL functionality is central to the
acceptance of a language. However: without a thorough understanding
and unambiguous definition of the role and semantics of generics in the
Fortran context, I also fear a similar conceptual disaster and
debugging nightmare as the C++ templates have been, despite their
undeniable value for practical programming.

My personal view:

(1) clarify first whether generics are introduced at the coarse-grained
module level (proposals exist) or at a fine-grained level
(types/procedures/TBP, as done previously for parameterization by KIND)

(2) guarantee that generics work seamlessly together with
object-oriented design (abstract entities and extension) and with
parallel programming (coarrays).

(3) define a way of specifying the interface of a generic construct that
is convenient for the user *and* makes generics strictly type-safe, in
Fortran's terms.

(4) verify that execution speed is not affected.

I assume that all this is already on the agenda.

Then, specifications for a Fortran "STL" could be given. Include a
bullet-proof generics specification in the standard, but provide a first
STL specification as a separate TR?

Richard Weed

Jun 28, 2019, 9:51:24 AM
On Friday, June 28, 2019 at 2:56:32 AM UTC-5, Wolfgang Kilian wrote:

> My personal view:
>
> (1) clarify first whether generics are introduced at the coarse-grained
> module level (proposals exist) or at a fine-grained level
> (types/procedures/TBP, as done previously for parameterization by KIND)

Something else that always puzzled me was why the standards committee did not mandate that KIND parameters be unique. Most compilers I've dealt with over the
years since they were introduced return the same values for 4-byte and 8-byte reals and integers (i.e., 4 for both INT32 and REAL32, 8 for both INT64 and REAL64). I presume this was done to placate the lazy programmers who insist on
writing REAL(8) etc., and probably reflects a policy of not dictating to the developers how to implement the standard. I see this as a defect in the standard. Why?
Because if KINDs were unique it would open the way to making parameterized types much more useful. The use of the TYPE statement to declare entities of intrinsic type was introduced in Fortran 2008. Unfortunately, the standard decided to require users to give the explicit type and kind in a rather clunky manner, i.e.,

Type(REAL(REAL64)) :: areal

If KINDS were unique, we could just write

Type(KIND=REAL64) :: areal

Now PDTs could be much more useful as generic containers because we could write

Integer, PARAMETER :: wp=SELECTED_REAL_KIND(p=13,r=307)
Integer, PARAMETER :: ip = SELECTED_INT_KIND(9)

Type list_node_t(lkind)

Integer, kind :: lkind

Type(KIND=lkind) :: val
Type(list_node_t(lkind)), Pointer :: next_p
Type(list_node_t(lkind)), Pointer :: prev_p

End Type

We now have a true generic container for a list node (at least for intrinsic
types, although extension to user types should be doable). If you need a
list of 64-bit reals or 64-bit ints, you get it by just typing

Type(list_node_t(wp)) :: real64Node
Type(list_node_t(ip)) :: int64Node

Without having to create two different instances of a PDT to hold the two separate types. This is my idea of what "generic" means, but you can't do it with today's standard because (IMHO) the committee gave too much leeway to the developers when KINDs entered the language with Fortran 90. On the surface you would think this would be easy to rectify. My approach would be to just add an increment to the current values returned by KIND, SELECTED_xxx_KIND, and the INT64/REAL64 etc. parameters in ISO_FORTRAN_ENV, i.e., 100 for the integers, 200 for the reals, etc., to make them unique, and modify the parser to appease the folks who still insist on doing REAL(8) etc.
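
For contrast, a minimal sketch (an added illustration) of what the current standard forces instead: one PDT per intrinsic type, with only the kind parameterized (and PDT support still varies between compilers):

module list_nodes
  use iso_fortran_env, only: int64, real64
  implicit none

  type :: real_node_t(rkind)
    integer, kind :: rkind = real64
    real(rkind) :: val
    type(real_node_t(rkind)), pointer :: next_p => null()
  end type real_node_t

  type :: int_node_t(ikind)
    integer, kind :: ikind = int64
    integer(ikind) :: val
    type(int_node_t(ikind)), pointer :: next_p => null()
  end type int_node_t
end module list_nodes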

I think this relatively straightforward change to the standard would be a first step toward implementing an STL-like facility and might reduce the need for a full-blown template facility. But again, looking at the current working papers on the J3 website and WG5 N2165, I have no hope of this ever being implemented because, IMHO, it just makes too much sense.

RW

steve kargl

Jun 28, 2019, 10:23:30 AM
Richard Weed wrote:

> On Friday, June 28, 2019 at 2:56:32 AM UTC-5, Wolfgang Kilian wrote:
>
>> My personal view:
>>
>> (1) clarify first whether generics are introduced at the coarse-grained
>> module level (proposals exist) or at a fine-grained level
>> (types/procedures/TBP, as done previously for parameterization by KIND)
>
> Something else that always puzzled me was why the standards committee did not mandate that KIND parameters be unique. Most compilers I've delt with over the
> years since they were introduced return the same values for 4 byte and 8 byte reals and integers (ie 4 for both INT32 and REAL32, 8 for both INT64 and REAL64). I presume this was done to placate the lazy programmers who insist on
> writing REAL(8) etc. and probably a policy of not dictating to the developers how to implement the standard. I see this as a defect in the standard. Why?
> Because if KINDS are unique it opens the way for making parameterized types much more useful. The use of the TYPE statement to define intrinsic values was introduced in Fortran 2008. Unfortunately, the standard decided to require users to define the explicit type and kind in a rather klunky manner, ie.

The kind type parameter is unique. That is, kind(1.e0) /= kind(1.d0).

I am aware of at least 2 processors that have/allow kind(1.e0) = 1 and kind(1.d0) = 2.

--
steve

Richard Weed

Jun 28, 2019, 11:11:05 AM
That is not what I said, if you had read my post more closely. I was saying that the KIND parameters for an INT64 and a REAL64 are both 8. I know
that the KIND parameters for different precisions of the same type are unique, and that on some compilers (NAG is the one I'm familiar with) they are not 4 or 8.

steve kargl

Jun 28, 2019, 11:39:29 AM
Richard Weed wrote:

> On Friday, June 28, 2019 at 9:23:30 AM UTC-5, steve kargl wrote:
>> Richard Weed wrote:
>>
>> > On Friday, June 28, 2019 at 2:56:32 AM UTC-5, Wolfgang Kilian wrote:
>> >
>> >> My personal view:
>> >>
>> >> (1) clarify first whether generics are introduced at the coarse-grained
>> >> module level (proposals exist) or at a fine-grained level
>> >> (types/procedures/TBP, as done previously for parameterization by KIND)
>> >
>> > Something else that always puzzled me was why the standards committee did not mandate that KIND parameters be unique.

You wrote the above sentence.

>>
>> The kind type parameter is unique. That is, kind(1.e0) /= kind(1.d0).
>>
>> I am aware of at least 2 processors that have/allow kind(1.e0) = 1 and kind(1.d0) = 2.
>>
>
> That is not what I said if you would have read my post more closely.

Perhaps you should have been more articulate in expressing your idea. The kind type
parameters are unique for a given type. A kind type parameter is not a type itself. How
is J3 going to assign unique numbers/names to each possible type that a processor could
support? On some targets, gfortran supports REAL(4) and REAL(8); on some it
supports REAL(4), REAL(8), and REAL(16); and on others REAL(4), REAL(8), REAL(10),
and REAL(16). The Fortran standard allows for a radix other than 2. How is J3 going to
specify names for possible radix-10 REAL types?

I also know of an individual who gave gfortran a -fprecision=XXX option that allowed
one to specify a REAL type with XXX precision. How is J3 going to codify this possibility?

The kind type parameter system that J3 enshrined in the standard provides flexibility
without artificially limiting implementations.

--
steve



Richard Weed

Jun 28, 2019, 3:35:17 PM
The committee does not have to mandate what the values are, just that they are unique across all types. They can state that the kind value for any integer type cannot be the same as that of any real type, or any logical type, etc. Nothing in that statement obligates the developer to use a specific value for a given type. The gist of what I was saying is that, as currently implemented, you can't modify the current facility for using TYPE to declare an intrinsic type to allow something like type(KIND=INT64) or type(KIND=REAL64) and have the compiler know what type you are requesting, because both have values of 8 on most compilers. Even if you can't specify a canned INT64 or REAL64 type parameter for either arbitrary-precision or non-radix-2 types, you could make SELECTED_REAL_KIND, SELECTED_INT_KIND, etc. return unique values. And in no way is what I'm proposing "artificially limiting implementations"; if anything, it significantly increases the usability of KIND parameters for defining generic containers etc.

FortranFan

Jun 28, 2019, 3:56:46 PM
On Friday, June 28, 2019 at 9:51:24 AM UTC-4, Richard Weed wrote:

> .. the standard decided to require users to define the explicit type and kind in a rather klunky manner, ie.
>
> Type(REAL(REAL64)) :: areal
>
> If KINDS were unique, we could just write
>
> Type(KIND=REAL64) :: areal
> ..

A statement such as "Type(KIND=REAL64) :: areal" would make no sense in a lot of contexts (especially in all the type, kind, rank (TKR) ones) in the current standard; one can imagine it would be an absolute non-starter.

Dick Hendrickson

Jun 28, 2019, 4:50:03 PM
On 6/28/19 8:51 AM, Richard Weed wrote:
> On Friday, June 28, 2019 at 2:56:32 AM UTC-5, Wolfgang Kilian wrote:
>
>> My personal view:
>>
>> (1) clarify first whether generics are introduced at the coarse-grained
>> module level (proposals exist) or at a fine-grained level
>> (types/procedures/TBP, as done previously for parameterization by KIND)
>
> Something else that always puzzled me was why the standards committee did not mandate that KIND parameters be unique.

There's probably no good reason, in the mathematical proof sense.

At the time KIND was introduced REAL*4, INTEGER*4, etc., were widely
used to specify type and precision. The standard required that default
integer be the same storage size as default real (to make common and
equivalence work). It seemed reasonable to expect/allow the kind value
to match the storage size. Telling people that REAL(KIND=4) and
INTEGER(KIND=5) are the replacements for the *4 syntax is an awkward
sell. ;)

The *4 syntax was non-standard, but all the compilers supported it and
J3 tries not to introduce things that conflict with existing practice.

IEEE had not yet taken over the world and there was talk of hardware
that would support different floating point formats (like DEC) and (more
unlikely) different integer formats or character sets. Perhaps the
selection would be a command line option which makes it easy try
different FP formats. Making rules about KIND values for potential
machines seemed hard.

Not a proof, but that's some of the discussion I remember.

Dick Hendrickson

robin....@gmail.com

Jun 28, 2019, 10:17:44 PM
It is unique in the case of the REALs (and must be, or the system falls apart),
but that is not what he was saying.
He said that the kind value of default integer is often the same as the
kind value of default real, etc.

> I am aware of at least 2 processors that have/allow kind(1.e0) = 1 and kind(1.d0) = 2.

As they MUST.

robin....@gmail.com

Jun 28, 2019, 10:23:07 PM
On Saturday, June 29, 2019 at 1:39:29 AM UTC+10, steve kargl wrote:
> Richard Weed wrote:
>
> > On Friday, June 28, 2019 at 9:23:30 AM UTC-5, steve kargl wrote:
> >> Richard Weed wrote:
> >>
> >> > On Friday, June 28, 2019 at 2:56:32 AM UTC-5, Wolfgang Kilian wrote:
> >> >
> >> >> My personal view:
> >> >>
> >> >> (1) clarify first whether generics are introduced at the coarse-grained
> >> >> module level (proposals exist) or at a fine-grained level
> >> >> (types/procedures/TBP, as done previously for parameterization by KIND)
> >> >
> >> > Something else that always puzzled me was why the standards committee did not mandate that KIND parameters be unique.
>
> You wrote the above sentence.
>
> >>
> >> The kind type parameter is unique. That is, kind(1.e0) /= kind(1.d0).
> >>
> >> I am aware of at least 2 processors that have/allow kind(1.e0) = 1 and kind(1.d0) = 2.
> >>
> >
> > That is not what I said if you would have read my post more closely.
>
> Perhaps, you should have been more articulate in expressing your idea. The kind type
> parameters are unique for a given type.

True, but that is not enough.

> A kind type parameter is not a type itself. How
> is J3 going to assign unique numbers/names to each possible type that a processor could
> support?

All the standard needs to do is to specify that every kind number
must be unique, regardless of whether it is INTEGER, REAL, or LOGICAL.

> On some targets, gfortran supports REAL(4) and REAL(8), and on some it
> supports REAL(4), REAL(8), and REAL(16), and on others REAL(4), REAL(8), REAL(10),
> and REAL(16). The Fortran standard allows for radix other than 2. How is J3 going to
> specify names for possible radix 10 REAL types.

Kinds are not specified by names; they are specified by numbers.

> I also know of an individual who gave gfortran a -fprecision=XXX option that allowed
> one to specify a REAL type with XXX precision. How is J3 going to codify this possibility?
>
> The kind type parameter system that J3 enshrined in the standard provides flexibility
> without artificially limiting implementations.

No, it leads to ambiguity and therefore programming error.

robin....@gmail.com

Jun 28, 2019, 10:55:32 PM
On Saturday, June 29, 2019 at 6:50:03 AM UTC+10, Dick Hendrickson wrote:
> On 6/28/19 8:51 AM, Richard Weed wrote:
> > On Friday, June 28, 2019 at 2:56:32 AM UTC-5, Wolfgang Kilian wrote:
> >
> >> My personal view:
> >>
> >> (1) clarify first whether generics are introduced at the coarse-grained
> >> module level (proposals exist) or at a fine-grained level
> >> (types/procedures/TBP, as done previously for parameterization by KIND)
> >
> > Something else that always puzzled me was why the standards committee did not mandate that KIND parameters be unique.
>
> There's probably no good reason, in the mathematical proof sense.
>
> At the time KIND was introduced REAL*4, INTEGER*4, etc., were widely
> used to specify type and precision. The standard required that default
> integer be the same storage size as default real (to make common and
> equivalence work). It seemed reasonable to expect/allow the kind value
> to match the storage size. Telling people that REAL(KIND=4)

Specifying things like REAL(KIND=4) is not portable.
Better is REAL (KIND=KIND(1.0E0)) etc.

> and
> INTEGER(KIND=5) are the replacements for the *4 syntax is an awkward
> sell. ;)
>
> The *4 syntax was non-standard,

Still is.

And meaningless for word machines and machines with more than
one 4-byte REAL, multiple 8-byte REALs, etc.

> but all the compilers supported it

No they didn't.

> and
> J3 tries not to introduce things that conflict with existing practice.
>
> IEEE had not yet taken over the world and there was talk of hardware
> that would support different floating point formats (like DEC) and (more
> unlikely) different integer formats

From the 1960s one of the most widely-used machines
supported 16-bit and 32-bit integers.
I refer, of course, to the IBM 360 and the look-alikes.
There were machines that supported only 16-bit integers.
Some models of the look-alikes supported only 16-bit integers.

Descendants of the System 360 now support 8, 16, 32, and 64-bit integers.

ga...@u.washington.edu

Jul 9, 2019, 12:38:47 AM
(someone wrote)

> There were machines that supported only 16-bit integers.

There were, and maybe still are.

The standard always required default INTEGER and REAL to be the
same size, though. Some compilers that I remember default to
the non-standard different size, but had a compiler option for
matching sizes. I believe that some would allocate four bytes,
but only use two.

The IBM 36 bit machines store 15 bit (plus sign) integers
in 36 bit words.

As for KIND values, I do agree that it would have been nice for the
kinds of different types to be disjoint. There was a thread not so
long ago where someone misunderstood KIND values in this way,
writing maybe:

REAL (KIND(1)) ...


Dominik Gronkiewicz

Oct 5, 2019, 6:47:46 PM
I believe that for now, allowing generic kinds and ranks (rather than types) would solve 90% of the cases that generics are used for (at least in the number-crunching industry). (Also, I know that assumed rank has become a part of C interop, but I'm not talking about that.) It's fairly easy to implement too. And it would also complete the "parameterized derived type" feature, which is currently quite useless since you still have to write separate subroutines for each kind (ugh).

I also think that many of the generic-programming staples such as linked lists, vectors and trees, as well as utilities (swap, sort, etc.), should become part of the language's standard library, not in the C++ STL sense but as an intrinsic part of the language. The STL is written in C++ because that makes sense in C++. A Fortran STL cannot be written in current Fortran in any sensible way, and will not be unless *true* generic programming (as in C++) is introduced, which, as many have stated, is extremely complicated in practice and might actually kill the language (or will likely never be complete). On the other hand, the current standard library is extremely poor (I would even call it a joke). Extending it would be very easy and would be unlikely to conflict with the already existing paradigm. Sure, there are the 10% of cases that would actually benefit from unlimited C++-like generic programming, but for those maybe Fortran is not the best choice.

integer, linked_list :: a
! initialize by array
a = [1, 2, 3]
a = [a, 4, 5] ! append element
a = a + 4 ! add 4 to each element
a(2) = -1 ! change one element (slow!)
each (b => a, index = i)
  print *, i, b
end

The reason I'm arguing for a built-in solution is also that Fortran doesn't have a package ecosystem, so, as somebody said above, you can't really build an application on random code on GitHub that appears and disappears. I'm not counting old, obscure, and obsolete F77 numerical subroutine libraries; we are serious here, aren't we?

Sadly, it's true that most Fortran "features" and updates are actually in the interests of a narrow group of customers and do not actually progress the language. If the Fortran Committee did cover travel and accommodation costs, I think at least a few people here would be dedicated and qualified enough to actually become members and contribute. But is that possibility really open, or is it just presented to quiet the people who protest the current state of stagnation?

Dominik

JCampbell

Oct 6, 2019, 3:05:10 AM
On Tuesday, July 9, 2019 at 2:38:47 PM UTC+10, ga...@u.washington.edu wrote:

> The standard always required default INTEGER and REAL to be the
> same size, though. Some compilers that I remember default to
> the non-standard different size, but had a compiler option for
> matching sizes. I believe that some would allocate four bytes,
> but only use two.

I find this requirement for "default INTEGER and REAL to be the same size" does not work well for transitioning to 64-bit, where a default 8-byte integer can have some benefits. The use of SIZE and LOC shows how this gets messy for 64-bit: the default result of the Fortran intrinsic SIZE is only 4-byte, while the extension intrinsic LOC is generally 8-byte.

I would have expected that, by default, SIZE should give the correct answer, rather than spitting the dummy and saying "you have not also provided the correct KIND, so I will give the wrong answer". As a programmer, I find this just stupid.
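
A small sketch of the SIZE issue being described (an added illustration):

program size_kind_demo
  use, intrinsic :: iso_fortran_env, only: int8, int64
  implicit none
  integer(int8), allocatable :: big(:)
  integer(int64) :: n
  integer :: stat
  allocate(big(3_int64*10_int64**9), stat=stat)  ! ~3e9 elements; may fail on small machines
  if (stat == 0) then
    n = size(big, kind=int64)  ! KIND= is required: a default-kind result cannot
                               ! represent a count above huge(1) = 2**31 - 1
    print *, n
  end if
end program size_kind_demo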

Having developed software using multiple compilers that have different KIND values, I find the use of the non-standard *byte syntax to be a more robust approach, although it is not to everyone's taste. Defining numerical constants (especially integers, e.g., 11_8) with a kind suffix is non-portable. Again for 64-bit, these 8-byte integer constants are not robust in their usage.

I am not sure why integer and real kind values should be different for the same byte size. NAG appear to think this has some benefits, although I would expect that when using NAG this would be more an annoyance than a benefit. What is the benefit?

JCampbell

Oct 6, 2019, 3:26:53 AM
I do not know of any computers that support Fortran 90+ and don't support IEEE 754.
For this reason I suggest that KIND should be marked as an obsolescent feature, as there is no longer any need for this syntax.

We did it with FORALL; why not KIND?

robin....@gmail.com

Oct 6, 2019, 7:09:12 AM
On Sunday, October 6, 2019 at 9:47:46 AM UTC+11, Dominik Gronkiewicz wrote:
> I believe that for now allowing generic kinds and ranks (rather than types) would solve 90% of the cases that generics are used for (at least in number grinding industry). (Also I know that assumed rank (?) has become a part of C interop but I'm not talking about that.) It's fairly easy to implement too. And it would also complete "parametrized derived type" feature which is currently quite useless as you still have to write separate subroutines for each kind (ugh).
>
> I also think that many of the generic programming needs such as linked lists, vectors and trees, as well as utilities (swap, sort etc) should become part of the language standard library, but not in C++ STL sense, but as intrinsic part of the language. STL is written in C++ because it makes sense in C++. Fortran STL cannot be written in current Fortran in any sensible way, and will not be unless *true* generic programming (as in C++) is introduced which, as many have stated, is extremely complicated in practice and might actually kill the language (or it will likely never be complete). On the other hand, currently the standard library is extremely poor (I would even call it a joke). Its extension would be very easy to add to the language and would unlikely conflict with already existing paradigm. Sure, there are 10% cases that actually would benefit from unlimited C++-like generic programming, but for those maybe Fortran is not the best choice.
>
> integer, linked_list :: a
> ! initialize by array
> a = [1, 2, 3]
> a = [a, 4, 5] ! append element
> a = a + 4 ! increase each by 1
> a(2) = -1 ! change one element (slow!)
> each (b => a, index = i)
> print *, i, b
> end
>
> The reason why I'm arguing for the built-in solution is also because Fortran doesn't have a package ecosystem, so as somebody said above, you can't really build an application based on random codes on github that appear and disappear. I'm not counting old obscure and obsolete F77 numerical subroutine libraries, we are serious here aren't we.

Try Numerical Recipes in Fortran 90.

Ron Shepard

Oct 6, 2019, 12:01:43 PM
On 10/6/19 2:26 AM, JCampbell wrote:
> I do not know of any computers that support Fortran 90+ that don't support IEEE754.
> For this reason I suggest that KIND should be marked as an Obsolescent feature, as there is no longer any need for this syntax.

I think that KIND is one of the best features of fortran compared to
other programming languages. It allows the precision of an entire
program, millions of lines of code, to be changed by modifying a single
statement, and it allows very precise control of the precision of
intermediates within expressions.
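
The usual idiom behind that claim, as a brief added sketch (the module and the name wp are only illustrative):

module working_precision
  use, intrinsic :: iso_fortran_env, only: real32, real64
  implicit none
  ! Change this single line and every routine that uses wp changes precision.
  integer, parameter :: wp = real64
end module working_precision

! elsewhere:
!   use working_precision, only: wp
!   real(wp) :: a
!   a = 1.0_wp / 3.0_wp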

> I am not sure why integer and real kind values should be different for
> the same byte size.

An obvious answer is that there may be multiple KINDs of both integer
and real variables that occupy the same number of bytes. This idea could
also be extended to logical variables with some benefit to interlanguage
programming.

* IEEE defines binary and decimal formats that occupy the same number of
bits.

* VAX hardware supported two different 64-bit real formats (although
this predated f90).

* gfortran already supports two different real kinds (10 and 16) that
occupy 128 bits.

* Many programmers prefer a KIND system in which the values are unique
for each combination of type and format. This is already implemented as
an option in some fortran compilers. That preference conflicts directly
with the above.
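
One can at least inspect what a given compiler actually provides; a tiny added sketch:

program show_kinds
  use, intrinsic :: iso_fortran_env, only: real_kinds, integer_kinds, logical_kinds
  implicit none
  print *, 'real kinds:    ', real_kinds
  print *, 'integer kinds: ', integer_kinds
  print *, 'logical kinds: ', logical_kinds
end program show_kinds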

$.02 -Ron Shepard

Dominik Gronkiewicz

Oct 6, 2019, 1:46:43 PM
On Sunday, October 6, 2019 at 1:09:12 PM UTC+2, robin...@gmail.com wrote:
> Try Numerical Recipes in Fortan 90.

Thanks! How do I enable this repository in Fortran Packaging System? You get my point? ;)

Beliavsky

Oct 6, 2019, 2:17:16 PM
A large fraction of new open source Fortran projects are hosted on GitHub. Those projects provide instructions for compilation, often providing make files. What would a Fortran packaging system do in addition to this? Do C and C++ have packaging systems?

FortranFan

Oct 6, 2019, 3:07:20 PM
On Sunday, October 6, 2019 at 3:26:53 AM UTC-4, JCampbell wrote:

> I do not know of any computers that support Fortran 90+ that don't support IEEE754.
> For this reason I suggest that KIND should be marked as an Obsolescent feature, as there is no longer any need for this syntax.
> ..

With machine-learning and AI applications and algorithms adopting 16-bit floating-point types, and with possible further development of other number formats aimed at higher-precision computing using a smaller number of bits, the KIND facility in Fortran can really be a boon if used well. And it's here to stay; it permeates almost all the semantics and syntax of modern Fortran.

With near-universal support of IEEE 754 floating-point arithmetic with current processors, what Fortran can do for its persevering practitioners is to offer SHORT-HAND notation to more easily consume the IEEE 754 number format in codes.

For heaven's sake, the Fortran standard committee would do well to introduce INTRINSIC *ALIASes* of, say, FP32, FP64, and FP128 that map to the corresponding 32-bit, 64-bit, and 128-bit formats (i.e., "real(KIND=ieee_selected_real_kind(P=.., R=.., RADIX=..))") that are listed in the IEEE 754 standard in great detail and which have been supported by all the processors for nearly 30 years now. This is so that a young, bright middle-schooler can readily start doing 'formula translation' in Fortran with IEEE 754 formats like so:

fp64, parameter :: Rgas = ..
fp64 :: T, P, density
..
density = P/Rgas/T

instead of the godforsaken and ill-communicated 'canon':

integer, parameter :: RK = ieee_selected_real_kind( p=..)
real(kind=RK), parameter :: Rgas = ..
real(kind=RK) :: T, P, density
..
density = P/Rgas/T
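
A rough approximation that works today, as an added sketch (the fp32/fp64 names are only illustrative, and REAL32/REAL64 promise a storage size rather than the IEEE format itself):

module fp_kinds
  use, intrinsic :: iso_fortran_env, only: fp32 => real32, fp64 => real64
  implicit none
end module fp_kinds

program density_calc
  use fp_kinds, only: fp64
  implicit none
  real(fp64), parameter :: Rgas = 8.314462618_fp64
  real(fp64) :: T, P, density
  T = 300.0_fp64
  P = 101325.0_fp64
  density = P/Rgas/T
  print *, density
end program density_calc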

robin....@gmail.com

Oct 6, 2019, 6:12:30 PM
On Monday, October 7, 2019 at 6:07:20 AM UTC+11, FortranFan wrote:
> On Sunday, October 6, 2019 at 3:26:53 AM UTC-4, JCampbell wrote:
>
> > I do not know of any computers that support Fortran 90+ that don't support IEEE754.
> > For this reason I suggest that KIND should be marked as an Obsolescent feature, as there is no longer any need for this syntax.
> > ..
>
> With applications and algorithms coming up toward machine learning and AI with 16-bit floating-point types and possible further development of other number formats toward higher precision computing using smaller number of bits, the KIND facility in Fortran can really be a boon if used well. And it's here to stay, it permeates almost all the semantics and syntax in modern Fortran.
>
> With near-universal support of IEEE 754 floating-point arithmetic with current processors, what Fortran can do for its persevering practitioners is to offer SHORT-HAND notation to more easily consume the IEEE 754 number format in codes.
>
> For heaven's sake, the Fortran standard committee would do well to introduce INTRINSIC *ALIASes* of, say, FP32, FP64, FP128 that map to the corresponding 32-bit, 64-bit, and 128-bit formats (i.e., "real(KIND=ieee_selected_real_kind(P=..,R=.., RADIX=..)")

And 80-bit?

> that are listed in the IEEE 754 standard in great detail and which have been supported by all the processors for nearly 30 years now.

128 bit is "relatively" recent.

ga...@u.washington.edu

Oct 6, 2019, 6:49:03 PM
On Sunday, October 6, 2019 at 3:12:30 PM UTC-7, robin...@gmail.com wrote:

(snip, someone wrote)

> > that are listed in the IEEE 754 standard in great detail and
> > which have been supported by all the processors for nearly 30 years now.

> 128 bit is "relatively" recent.

IEEE 754 allowed for extended precision, which Intel implemented
as 80 bits, from the beginning. The specific 128 bit formats
(binary and decimal) are more recent.

But 128 bit floating point goes back at least to the IBM 360/85
in about 1968, and is standard for floating point in S/370
and successors.

spectrum

Oct 7, 2019, 1:20:29 AM
On Sunday, October 6, 2019 at 7:47:46 AM UTC+9, Dominik Gronkiewicz wrote:
(...)
> The reason why I'm arguing for the built-in solution is also because Fortran doesn't have a package ecosystem, so as somebody said above, you can't really build an application based on random codes on github that appear and disappear. I'm not counting old obscure and obsolete F77 numerical subroutine libraries, we are serious here aren't we.

Just to encourage discussion: is it possible for you (or other people) to create
a new thread about modern "ecosystem(s)"? I think it would already be useful even if
the thread simply gathers information on the ecosystems used by recent
languages (to consider how they differ from just using GitHub or more conventional
repositories).

robin....@gmail.com

Oct 7, 2019, 3:13:50 AM
On Monday, October 7, 2019 at 9:49:03 AM UTC+11, ga...@u.washington.edu wrote:
> On Sunday, October 6, 2019 at 3:12:30 PM UTC-7, robin...@gmail.com wrote:
>
> (snip, someone wrote)
>
> > > that are listed in the IEEE 754 standard in great detail and
> > > which have been supported by all the processors for nearly 30 years now.
>
> > 128 bit is "relatively" recent.
>
> IEEE 754 allowed for extended precision, which Intel implemented
> as 80 bits, from the beginning. The specific 128 bit formats
> (binary and decimal) are more recent.

Well after 1990. IEEE 128 bits were not generally available then
and certainly not "supported by all the processors for nearly 30 years now"
as FortranFan claimed.

The decimal format is even more recent, about 5 years.

> But 128 bit floating point goes back at least to the IBM 360/85
> in about 1968,

Not IEEE. It's S/360 hexadecimal float.

ga...@u.washington.edu

Oct 7, 2019, 5:01:47 AM
On Monday, October 7, 2019 at 12:13:50 AM UTC-7, robin...@gmail.com wrote:

(snip, I wrote)

> > But 128 bit floating point goes back at least to the IBM 360/85
> > in about 1968,

> Not IEEE. It's S/360 hexadecimal float.

That is why it doesn't say IEEE above.

VAX has H-float, standard, as far as I know, only in
the 11/730, low end model.

RISC-V has it, but I don't know if the existing chips
implement it. For an FPGA implementation, you could add it.

Otherwise, no-one else wants to bother to implement
the hardware for it.

JCampbell

Oct 7, 2019, 7:15:08 AM
On Monday, October 7, 2019 at 3:01:43 AM UTC+11, Ron Shepard wrote:
> On 10/6/19 2:26 AM, JCampbell wrote:
> > I do not know of any computers that support Fortran 90+ that don't support IEEE754.
> > For this reason I suggest that KIND should be marked as an Obsolescent feature, as there is no longer any need for this syntax.
>
> I think that KIND is one of the best features of fortran compared to
> other programming languages. It allows the precision of an entire
> program, millions of lines of code, to be changed by modifying a single
> statement, and it allows very precise control of the precision of
> intermediates within expressions.

Ron, Thanks for your comments, although I do disagree.

Rather than the “best”, I would consider the idea of being able to change the precision of "millions of lines of code" as the worst of outcomes. This ignores that the algorithms in this code have been developed for a particular precision. The suitability of the algorithm or approach is dependent on that precision being used. Change the precision and the algorithm is probably no longer suitable or "tuned".
For my applied analysis, there is only one practical option: hardware implemented 64-bit. (I could certainly be interested in hardware implemented 128-bit vector instructions, although I wonder what it would deliver.)
Back in 1970's when there was significant variation in the precisions on offer, changing the hardware not only required change to the type declarations but also required much care with numeric parameters used in the code.

You also state that KIND "allows very precise control of the precision of intermediates within expressions". I don't think this is the case. It is a dangerous illusion.
For Selected_Real_Kind, while there are two parameters "p" and "r", the reality is that there are typically only two hardware-supported precisions available, while other options come with a significant performance penalty (128-bit) or an implementation penalty (e.g., gfortran's 80-bit on a 64-bit OS).

If you review how most coders use Selected_Real_Kind, the values of p and r are not chosen based on the precision required by the algorithm, but on the precision identified for the required practical option, hence the typical coding of:

integer, parameter :: rp = Selected_Real_Kind ( 6, 37 )
integer, parameter :: dp = Selected_Real_Kind ( 15, 307 )

(Interestingly, I have seen examples of integer, parameter :: rp = Selected_Real_Kind ( 7, 37 ) by a well-respected Fortran user.)

I would like to select a higher-precision accumulator for dot_product, but with the performance required of vector instructions, this is not practical. While Selected_Real_Kind implies many options, the practical reality is very different.
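
What such an accumulator would look like if one did pay the cost, as an added sketch (it assumes the compiler offers REAL128):

pure function dot_hp(a, b) result(s)
  use, intrinsic :: iso_fortran_env, only: real64, real128
  real(real64), intent(in) :: a(:), b(:)
  real(real64) :: s
  real(real128) :: acc   ! higher-precision accumulator
  integer :: i
  acc = 0.0_real128
  do i = 1, min(size(a), size(b))
    acc = acc + real(a(i), real128)*real(b(i), real128)
  end do
  s = real(acc, real64)
end function dot_hp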

IEEE 754 has been around since the 1980s, and most non-conforming examples being quoted are no longer in use, certainly not as preferred equipment.

FORALL is a good example of a coding approach that has not kept up with recent changes in multi-threaded programming. KIND is going the same way, as it is a clumsy approach to supporting the available precisions.

spectrum

Oct 7, 2019, 7:44:19 AM
Hmm, sorry about a bit misleading post above. By this sentence,

> is it possible for you (or other people) to create a new thread about modern
"ecosystem(s)"? ... (snip)

I just meant to ask whether someone would be interested in opening such a thread
(as a placeholder to gather/add more info), without mixing topics here.
Indeed, many languages have been using (or trying to use) ecosystems (incl. C++)
to create a network of libraries/packages, with a common installation protocol
and dependency control (on other libraries).

# Btw, Numerical Recipes is proprietary, and netlib contains a lot of legacy codes
that do not compile or work properly (for many reasons, as I have experienced several
times).

ga...@u.washington.edu

unread,
Oct 7, 2019, 10:25:58 AM10/7/19
to
On Monday, October 7, 2019 at 4:15:08 AM UTC-7, JCampbell wrote:

(snip)

> Ron, Thanks for your comments, although I do disagree.

> Rather than the “best”, I would consider the idea of being able to
> change the precision of "millions of lines of code" as the worst
> of outcomes. This ignores that the algorithms in this code have
> been developed for a particular precision. The suitability of the
> algorithm or approach is dependent on that precision being used.

This is true for some algorithms.

But also, even when it isn't, it is unusual to know the exact
precision needed for a specific problem, and especially not
at compile time.

> Change the precision and the algorithm is probably no longer
> suitable or "tuned".

Theoretically, one can use the appropriate inquiry functions
to automatically tune for the available precision. Yes, it often
isn't easy to do that.
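
As a rough illustration of the kind of inquiry-function tuning meant here (a minimal sketch; the module name, the kind choice and the factor of 10 are assumptions made for the example, not anyone's production code), a convergence test can be derived from the working kind instead of being hard-coded:

module tuned_tol
   implicit none
   integer, parameter :: wp = selected_real_kind(15, 307)
contains
   logical function converged(old, new)
      real(wp), intent(in) :: old, new
      real(wp) :: tol
      ! Derive the tolerance from the precision of the kind actually
      ! in use, so the test adapts if wp is ever changed.
      tol = 10.0_wp * epsilon(1.0_wp)
      converged = abs(new - old) <= tol * max(abs(old), abs(new))
   end function converged
end module tuned_tol

If wp is later changed to a quad-precision kind, the tolerance tightens automatically.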

> For my applied analysis, there is only one practical option: hardware
> implemented 64-bit. (I could certainly be interested in hardware
> implemented 128-bit vector instructions, although I wonder what it
> would deliver.)

In a large variety of problems, one needs much more precision for
intermediate data than for final results. Often double precision
is used to obtain the needed single precision results. This is true
for many matrix algorithms, where partial cancellation in intermediate
results reduces precision. Also, the needed intermediate precision
increases (slowly) with matrix size.

> Back in 1970's when there was significant variation in the
> precisions on offer, changing the hardware not only required
> change to the type declarations but also required much care
> with numeric parameters used in the code.

I do remember some programs with sets of constants at the beginning,
where you were supposed to uncomment the ones for your machine.

> You also state that KIND "allows very precise control of the
> precision of intermediates within expressions". I don't think
> this is the case. It is a dangerous illusion.

One of the ideas that came along with IEEE 754 is increased
precision for intermediate values. The 8087 implements this with
the 80 bit temporary real format. The idea was that all calculations
would be done with this extra precision. Specifically, the original
idea was a virtual stack, where the processor would keep track of
spilling to memory. The 8087 virtual stack never worked, and as far
as I know, wasn't fixed for later x87 processors. The result is that
you don't know which parts are done in extra precision. Even more,
with optimizers, values that you thought were stored in memory,
might be kept in registers with extra precision.

> For Selected_Real_Kind, while there are two parameters “p” and “r”,
> the reality is there are typically only 2 hardware supported
> precisions available, while other options come with a significant
> performance (128 bit) or implementation penalty (eg: gFortran's
> 80 bit on 64-bit OS).

128 bit floating point goes back at least to the IBM 360/85 and
all S/370 models, in IBM's hexadecimal floating point format.

When IBM added IEEE floating point somewhere in the ESA/390
years, they included 128 bit formats.

Even more, in the newer IEEE 754-2008 with decimal floating point,
the 64 bit format is considered single precision, and 128 is
considered double precision.

Otherwise, VAX has H-float, though as a microcode option on many
models, and as well as I know, it was rarely ordered.

RISC-V has it as an option, though I don't know what is available
to buy. If RISC-V catches on, we might see more of it.

Ron Shepard

unread,
Oct 7, 2019, 12:30:16 PM10/7/19
to
On 10/7/19 6:15 AM, JCampbell wrote:
> On Monday, October 7, 2019 at 3:01:43 AM UTC+11, Ron Shepard wrote:
>> On 10/6/19 2:26 AM, JCampbell wrote:
>>> I do not know of any computers that support Fortran 90+ that don't support IEEE754.
>>> For this reason I suggest that KIND should be marked as an Obsolescent feature, as there is no longer any need for this syntax.
>>
>> I think that KIND is one of the best features of fortran compared to
>> other programming languages. It allows the precision of an entire
>> program, millions of lines of code, to be changed by modifying a single
>> statement, and it allows very precise control of the precision of
>> intermediates within expressions.
>
> Ron, Thanks for your comments, although I do disagree.
>
> Rather than the “best”, I would consider the idea of being able to change the precision of "millions of lines of code" as the worst of outcomes. This ignores that the algorithms in this code have been developed for a particular precision. The suitability of the algorithm or approach is dependent on that precision being used. Change the precision and the algorithm is probably no longer suitable or "tuned".

This is a numerical analysis issue, not a language issue. The use of
KINDs allows the programmer to write robust portable code, but it alone
does not address all aspects of the numerical analysis. Furthermore,
when combined with explicit interfaces (modules, contained procedures,
etc.), fortran does catch many types of mixed precision problems at
compile time through TKR matching.
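
A sketch of the sort of mismatch that gets caught (the module and procedure names here are invented for illustration, and the program is intentionally written to be rejected by the compiler):

module solver_mod
   implicit none
   integer, parameter :: sp = selected_real_kind(6, 37)
   integer, parameter :: dp = selected_real_kind(15, 307)
contains
   subroutine step(x)
      real(dp), intent(inout) :: x
      x = 0.5_dp * x
   end subroutine step
end module solver_mod

program demo
   use solver_mod
   implicit none
   real(sp) :: y
   y = 1.0_sp
   call step(y)   ! compile-time error: kind mismatch against the
                  ! explicit interface, rather than a silent wrong call
end program demo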

> For my applied analysis, there is only one practical option: hardware implemented 64-bit. (I could certainly be interested in hardware implemented 128-bit vector instructions, although I wonder what it would deliver.)
> Back in 1970's when there was significant variation in the precisions on offer, changing the hardware not only required change to the type declarations but also required much care with numeric parameters used in the code.

If you don't do mixed precision programming, if all you do is hardware
implemented 64-bit, then the use of fortran KINDS is not really an issue
is it?

> You also state that KIND "allows very precise control of the precision of intermediates within expressions". I don't think this is the case. It is a dangerous illusion.
> For Selected_Real_Kind, while there are two parameters “p” and “r”, the reality is there are typically only 2 hardware supported precisions available, while other options come with a significant performance (128 bit) or implementation penalty (eg: gFortran's 80 bit on 64-bit OS).

Yes, fortran requires only two supported precisions, and even then, it
is not too picky about the details. There would be downsides if the
language required more, just as there would be downsides to it
prohibiting more. The KINDs approach seems to be a nice, practical, open
ended, compromise.

>
> If you review most presented coders use of Selected_Real_Kind, the values of p and r are not chosen based on the precision required of the algorithm, but on the precision identified for the required practical option, hence the typical coding of:
>
> integer, parameter :: rp = Selected_Real_Kind ( 6, 37 )
> integer, parameter :: dp = Selected_Real_Kind ( 15, 307 )

Yes, of course, we all do that in certain programs when that is all that
is required.

> I would like to select a higher precision accumulator for dot_product, but with the performance required of vector instructions, this is not a practicality. While Selected_Real_Kind implies many options, the practical reality is very different.

I agree about this issue. At present, in order to write a higher
precision dot product, you must do something like:

real(ep) :: se, xe, ye
...
se = 0.0_ep
do i = 1, N
   xe = x(i)
   ye = y(i)
   se = se + xe * ye
enddo

This requires that the optimizer does a lot for you that you cannot
specify directly in the language. What you might really want is to
replace that last statement with something like

se = se + dprod(x(i),y(i),kind=ep)

with the appropriately generalized version of the dprod() intrinsic. For
some reason, this has not been done. Or, to go a step further, perhaps
something like

se = dot_product(x,y,kind=ep)

if you want a higher-level implementation of the operation.

In any case, the KIND facility of fortran is not the limitation, as the
above hypothetical expressions show.


> FORALL is a good example of a coding approach that has not kept up with the recent changes in multi-thread programming. KIND is going the same way, as it is a clumsy approach to support of available precisions.

FORALL was never what programmers wanted or needed in the first place in
the 1980s. There were already numerous implementations of parallel do
loops using compiler directives that were available to show what was
needed, but the standards committee ignored all that and did something
else entirely.

We can just agree to disagree about the flexibility and utility of KIND.

$.02 -Ron Shepard

Thomas Koenig

unread,
Oct 7, 2019, 2:10:05 PM10/7/19
to
Ron Shepard <nos...@nowhere.org> schrieb:

> I agree about this issue. At present, in order to write a higher
> precision dot product, you must do something like:
>
> real(ep) :: se, xe, ye
> ...
> se = 0.0_ep
> do i = 1, N
> xe = x(i)
> ye = y(i)
> se = se + xe * ye
> enddo

Or, shorter

se = 0.0_ep
do i = 1, n
   se = se + real(x(i), kind=ep) * real(y(i), kind=ep)
end do

or

se = sum(real(x,kind=ep) * real(y,kind=ep))

> This requires that the optimizer does a lot for you that you cannot
> specify directly in the language.

I'm not quite sure what you mean. Could you elaborate?

Ron Shepard

unread,
Oct 7, 2019, 9:02:40 PM10/7/19
to
In the case of the vector expression, you don't really want to convert
the entire vectors from one precision to another, that would be wasteful
of memory, bandwidth, and machine cycles. In my original do-loop
example, you don't really want to even store the converted values in
memory, you just want them stored in registers and operated on in the
normal way. And there is memory alignment and strip mining that is
supposed to occur for the sake of efficiency. This is usual optimization
level stuff that you cannot specify in the high-level language, but you
more or less expect the compiler to do for you nonetheless.

However, in the case of dprod() code, that allows something a little
different. When floating point multiplication occurs, there are four
separate terms that are added together, consisting of the high-high,
high-low, low-high, and low-low bits. The last term is usually just to
round correctly. dprod() was originally designed to allow all four of
those terms to be accumulated into the appropriate double precision
value. So in a sense, there is no extra effort involved over a
single-precision product, but you get the correct extended result
anyway. I guess that a clever compiler might be able to recognize this
in the above expressions, and return the correct result without doing
the exponent and mantissa conversions and by using the four pieces of
the single-precision product rather than working with the four
double-precision product terms as is usually required. But if you look
at what the high-level code is telling the compiler to do, and look at
what you want to actually happen, that is a bit of a leap. That is not
the normal kind of optimization transformation.
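
For what it's worth, the four-partial-product idea can be written out in software as the classic Veltkamp/Dekker error-free product. The sketch below is only an illustration of those high/low terms, under the stated assumptions; it is not what any compiler's dprod() actually does:

subroutine two_prod(a, b, p, e)
   ! Error-free transformation: p + e == a*b exactly, assuming IEEE
   ! single precision, round-to-nearest, no excess precision in the
   ! intermediates, and no overflow in the scaled values.
   implicit none
   real, intent(in)  :: a, b
   real, intent(out) :: p, e
   real, parameter :: split = 4097.0   ! 2**12 + 1 for a 24-bit significand
   real :: a_hi, a_lo, b_hi, b_lo, ca, cb
   ca = split * a
   a_hi = ca - (ca - a)
   a_lo = a - a_hi
   cb = split * b
   b_hi = cb - (cb - b)
   b_lo = b - b_hi
   p = a * b
   ! The four partial products: high-high, high-low, low-high, low-low.
   e = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
end subroutine two_prod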

That is why I think the dprod() intrinsic (or that functionality with
some other name) should have been generalized to allow more than just
the default real kind as arguments and the default double precision kind
as the result.

$.02 -Ron Shepard

ga...@u.washington.edu

unread,
Oct 7, 2019, 9:14:52 PM10/7/19
to
On Monday, October 7, 2019 at 9:30:16 AM UTC-7, Ron Shepard wrote:

(snip)

> This is a numerical analysis issue, not a language issue. The use of
> KINDs allows the programmer to write robust portable code, but it alone
> does not address all aspects of the numerical analysis. Furthermore,
> when combined with explicit interfaces (modules, contained procedures,
> etc.), fortran does catch many types of mixed precision problems at
> compile time through TKR matching.

Well, yes.

But just about always, one doesn't know the needed precision.

There are a small number of cases where one writes a program
to solve one specific problem, never to be used again.

But most often, one wants to be somewhat general, and so it
might be reused. This is true for both fixed and floating
point.

One might, for example, write a program to do numerical
integration. Integration is convenient in that it most
often doesn't lose precision, but consider the need to
loop over array elements. How big of a variable do you
need for the loop index, and loop limits?

In the 16 bit minicomputer days, and then again in the 16
bit microcomputer days, it was usual to use 16 bit integers.
(Even if they were stored in 32 bits.) Now most have gone
to 32 bits for default integer, but with larger memories
that might not be enough. But it is also inconvenient most
of the time to code everything with 64 bit integers for
all the loops.
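
Where a 64-bit index is genuinely needed, the KIND route at least keeps the choice explicit. A throwaway sketch (the request for 12 decimal digits is just an assumption that yields a 64-bit integer kind on common compilers):

program big_index
   implicit none
   integer, parameter :: ik = selected_int_kind(12)
   integer(ik) :: i, n
   n = 3000000000_ik   ! larger than huge(1) for a 32-bit default integer
   do i = 1_ik, n
      if (mod(i, 1000000000_ik) == 0_ik) print *, i
   end do
end program big_index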

Back to floating point. Many algorithms will need more
precision for larger problems. As above, this size of problems
increases with the size of computer memories.

Numerical integration mostly doesn't need more precision as
problem size increases, but many matrix based algorithms do.

KIND isn't so bad, but it isn't especially convenient when
there is no way to know what precision you need for actual data.

> > For my applied analysis, there is only one practical option:
> > hardware implemented 64-bit. (I could certainly be interested
> >in hardware implemented 128-bit vector instructions, although I
> > wonder what it would deliver.)

Hardware 128 bit floating point is coming along slowly.

Faster computers, allowing for larger problems, and with them
problems that need more precision, are here.

robin....@gmail.com

unread,
Oct 7, 2019, 11:32:46 PM10/7/19
to
On Monday, October 7, 2019 at 8:01:47 PM UTC+11, ga...@u.washington.edu wrote:
> On Monday, October 7, 2019 at 12:13:50 AM UTC-7, robin...@gmail.com wrote:
>
> (snip, I wrote)
>
> > > But 128 bit floating point goes back at least to the IBM 360/85
> > > in about 1968,
>
> > Not IEEE. It's S/360 hexadecimal float.
>
> That is why it doesn't say IEEE above.

By your not saying that explicitly, some readers may think
you were speaking about IEEE, so I put that right.

Thomas Koenig

unread,
Oct 8, 2019, 4:43:37 PM10/8/19
to
Ron Shepard <nos...@nowhere.org> schrieb:
Sure.

> In my original do-loop
> example, you don't really want to even store the converted values in
> memory, you just want them stored in registers and operated on in the
> normal way.

And that is what is extremely likely to happen, unless your compiler is
_seriously_ behind the times. I might worry about g77, but then again
g77 is certainly not equipped to handle modern architectures. g95 is
already based on gcc 4.1, which uses static single assignment.

[...]

> However, in the case of dprod() code, that allows something a little
> different. When floating point multiplication occurs, there are four
> separate terms that are added together, consisting of the high-high,
> high-low, low-high, and low-low bits. The last term is usually just to
> round correctly. dprod() was originally designed to allow all four of
> those terms to be accumulated into the appropriate double precision
> value.

What you describe seems to apply to a machine which has single precision
hardware, but has to emulate double precision in software. Otherwise,
I would suspect that loading two single precision values into double
precision registers, and then performing the multiplication and
summation in double precision, would be the natural way to do this.

> So in a sense, there is no extra effort involved over a
> single-precision product, but you get the correct extended result
> anyway.

What you are describing would need four single precision
multiplications plus a few additions, to realize one multiplication
in double precision. Does not sound fast to me...

>I guess that a clever compiler might be able to recognize this
> in the above expressions, and return the correct result without doing
> the exponent and mantissa conversions and by using the four pieces of
> the single-precision product rather than working with the four
> double-precision product terms as is usually required.

Like I said, I don't think it is needed if you just load
single precision variables into double precision registers.

I have yet to see a floating point format where a*b, where a and
b are single precision, cannot be represented exactly by a double
precision number.
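
(A quick way to check that on a given target, as a throwaway sketch: the widened product is exact whenever twice the digits of the narrow kind is at most the digits of the wide kind.)

program widen_check
   implicit none
   ! For IEEE single/double this prints 48 <= 53, so a widened
   ! single*single product is always exact.
   print *, 2*digits(1.0), ' <= ', digits(1.0d0)
end program widen_check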

robin....@gmail.com

unread,
Oct 8, 2019, 8:42:26 PM10/8/19
to
Perhaps you can tell us when DPROD was originally designed.

For the IBM 360 (1966), single-precision hardware multiplication produced
a double precision product directly. The low-order 8 bits were zero.
So, I expect that if you wanted a double precision result from
single precision operands, you used DPROD.
That would be much faster than converting each single-precision operand
to double-precision, and then doing a double-precision multiplication.
It would also require fewer instructions.

> So in a sense, there is no extra effort involved over a
> single-precision product, but you get the correct extended result
> anyway. I guess that a clever compiler might be able to recognize this
> in the above expressions, and return the correct result without doing
> the exponent and mantissa conversions and by using the four pieces of
> the single-precision product rather than working with the four
> double-precision product terms as is usually required.

Again, for the IBM 360 and successors, double-precision products are
directly produced by the hardware.

robin....@gmail.com

unread,
Oct 8, 2019, 8:53:12 PM10/8/19
to
On Wednesday, October 9, 2019 at 7:43:37 AM UTC+11, Thomas Koenig wrote:
> Ron Shepard <nos...@nowhere.org> schrieb:
> > On 10/7/19 1:10 PM, Thomas Koenig wrote:
> >> Ron Shepard <nos...@nowhere.org> schrieb:

> > In my original do-loop
> > example, you don't really want to even store the converted values in
> > memory, you just want them stored in registers and operated on in the
> > normal way.
>
> And that is what is extremely likely to happen, unless your compiler is
> _seriously_ behind the times. I might worry about g77, but then again
> g77 is certainly not equipped to handle modern architectures. g95 is
> already based on gcc 4.1, which uses single static assigment.
>
> [...]
>
> > However, in the case of dprod() code, that allows something a little
> > different. When floating point multiplication occurs, there are four
> > separate terms that are added together, consisting of the high-high,
> > high-low, low-high, and low-low bits. The last term is usually just to
> > round correctly. dprod() was originally designed to allow all four of
> > those terms to be accumulated into the appropriate double precision
> > value.
>
> What you describe seems to apply to a machine which has single precision
> hardware, but has to emulate double precision in software. Otherwise,
> I would suspect that loading two single precision value into double
> precision registes, and then performing the multiplication and
> summation in double precision, would be the natural way to do this.

Please see my earlier post re single precision multiplication
for the IBM 360, which directly produced a double precision result.

On the S/360, loading a single-precision value into a floating-point
register does not clear the low 32 bits of the register.
It is first necessary to clear the register (by subtracting it from itself)
and then load the single-precision value into the register.

That obviously wastes time and instructions compared with direct
single-precision multiplication.

ga...@u.washington.edu

unread,
Oct 9, 2019, 1:41:23 AM10/9/19
to
On Tuesday, October 8, 2019 at 5:53:12 PM UTC-7, robin...@gmail.com wrote:

(snip)

> On the S/360, loading a single-precision value into a floating-point
> register does not clear the low 32 bits of the register.
> It is first necessary to clear the register (by subtracting it from itself)
> and then loading the single-precision value into the register.

> That obviously wastes time and instructions compared with direct
> single-precision multiplication.

Some years ago, I had just this question here. The OS/360 compilers,
when using a single precision product where a double precision product
is needed, do just that. As written in the standard, single precision
multiply generates a single precision product.

On the other hand, the 8087 is designed, following IEEE 754
suggestions, to generate an extended precision product.
To get a proper single precision product, values are stored
and refetched.

The decision from that discussion, maybe more than 10 years
ago, was that extra precision is allowed.

More often, though, I need a double sized product from integer
multiply. Most processors generate a double sized product, and
most high-level languages don't make it easy to get one.
There is no IDPROD function! (Also useful is integer divide
with double length dividend, which also isn't usual for
high-level languages.)

In any case, about 40 years later, ESA/390 has a MDER
instruction to multiply and generate a single precision
product.

Thomas Koenig

unread,
Oct 9, 2019, 3:25:35 PM10/9/19
to
ga...@u.washington.edu <ga...@u.washington.edu> schrieb:

> More often, though, I need a double sized product from integer
> multiply. Most processors generate a double sized product, and
> most high-level languages don't make it easy to get one.

By now you should be able to trust your compiler on this one:

$ cat prod.f90
function iprod (a, b)
  integer, value :: a, b
  integer (kind=8) :: iprod
  iprod = int(a,kind=8) * int(b,kind=8)
end function iprod
$ gfortran -O3 -S prod.f90
$ cat prod.s
        .file   "prod.f90"
        .text
        .p2align 4
        .globl  iprod_
        .type   iprod_, @function
iprod_:
.LFB0:
        .cfi_startproc
        movslq  %edi, %rax
        movslq  %esi, %rdi
        imulq   %rdi, %rax
        ret
        .cfi_endproc
.LFE0:
        .size   iprod_, .-iprod_
        .ident  "GCC: (GNU) 10.0.0 20191003 (experimental)"
        .section        .note.GNU-stack,"",@progbits

So, the function has four instructions: load %rax from the first
argument in %edi (a widening load), load %rdi from the second argument
in %esi (same story), and multiply the two 64-bit registers containing
a 32-bit value each to yield the result in %rax, which is also where
the return value is stored. And, of course, there's the return
statement.

In 32-bit mode, you get, because of the different argument passing
convention:

        movl    8(%esp), %eax
        imull   4(%esp)
        ret

Again, the compiler handles this just fine.

JCampbell

unread,
Oct 9, 2019, 8:44:49 PM10/9/19
to
On Wednesday, October 9, 2019 at 4:41:23 PM UTC+11, ga...@u.washington.edu wrote:
>
> On the other hand, the 8087 is designed, following IEEE 754
> suggestions, to generate an extended precision product.
> To get a proper single precision product, values are stored
> and refetched.
>
> The decision from that discussion, maybe more than 10 years
> ago, was that extra precision is allowed.
>

"extra precision is allowed" Something to think about !!

I would like to better understand the evolution of 8087 extended precision and why it was apparently dropped by Intel when SSE/AVX SIMD was introduced.

The idea of an extended precision accumulator (eg for dot_product) is appealing although I don't know how practical it really was. If only 1 calculation is performed the benefit is easy to identify, but for my use in finite element analysis (FEA) when there may be many (millions) of dot_product calculations combined, I am not sure of the benefit.
When dropping 80-bit accumulators and adopting SSE or AVX in 64-bit, the answers changed, but it is difficult to identify how significant the loss of precision or change in results really was. (there are lots of other influences on accuracy in larger models)
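
One software stand-in for a wider accumulator, at least for comparison runs, is compensated (Kahan) summation of the products. The sketch below is just the textbook form (an illustration, not a tuned routine), and aggressive optimization flags that reassociate the additions can silently remove the compensation:

function dot_comp(x, y) result(s)
   implicit none
   integer, parameter :: wp = kind(1.0d0)
   real(wp), intent(in) :: x(:), y(:)
   real(wp) :: s, c, t, term
   integer :: i
   s = 0.0_wp
   c = 0.0_wp
   do i = 1, size(x)
      ! c carries the low-order part lost by the previous addition
      term = x(i) * y(i) - c
      t = s + term
      c = (t - s) - term
      s = t
   end do
end function dot_comp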

gFortran's implementation of 8087 real*10 looks awful, by using 16-byte storage. Another 64-bit compiler I use does not provide 10 byte reals, so I am unsure of the performance penalty for adopting higher precision.

At the moment there is Intel hardware support for 4-byte and 8-byte SIMD.
I am not aware of any move to support 10-byte, 12-byte or 16-byte SIMD performance or if this would have a lot of benefit. (I hope it would)

With memory storage capacity increasing, solving large systems of equations using larger number formats is not difficult; e.g., when changing a 32 GB real*8 array to a 64 GB real*16 array, the barrier is not memory but performance.

My limited experience in FEA has shown an insignificant decline in numeric accuracy (error = f - [K].x) when using larger systems of equations. This suggests to me that the demand for higher precision may not be a major influence.

Are there any reports that say whether this development of higher precision SIMD is needed?

ga...@u.washington.edu

unread,
Oct 9, 2019, 10:36:11 PM10/9/19
to
On Wednesday, October 9, 2019 at 5:44:49 PM UTC-7, JCampbell wrote:

(snip, I wrote)
> > The decision from that discussion, maybe more than 10 years
> > ago, was that extra precision is allowed.

> "extra precision is allowed" Something to think about !!

> I would like to better understand the 8087 extended precision
> evolution and why it was apparently dropped by Intel when
> SSE/AVX simd was introduced.

I am not so sure. One possibility is that extended precision
(temporary real, as intel calls it) never worked as well as it
should have.

The 8087 was supposed to have a virtual stack, where registers
would spill to memory on stack overflow, and back on stack
underflow. It wasn't until after the chip was built that it
was found not to work. It was not possible to write the
interrupt routine to move data as appropriate.

The result of this, is that sometimes calculations have
the extra precision, and sometimes they don't, and you don't
usually know which ones. This complicates numerical analysis
of algorithms using it.

I suspect, though, that for SSE, having power-of-two widths
makes things easier, and so they built it that way.


> The idea of an extended precision accumulator (eg for dot_product)
> is appealing although I don't know how practical it really was.
> If only 1 calculation is performed the benefit is easy to identify,
> but for my use in finite element analysis (FEA) when there may
> be many (millions) of dot_product calculations combined,
> I am not sure of the benefit.

I have worked with square mesh PDE solvers, and know a little about
what happens with them. Solving the mesh doesn't tend to lose
bits, as it is mostly averaging nearby values. Computing first
and especially second derivatives is fairly sensitive to errors
in the mesh solution, but again extra precision doesn't help much.

As well as I know, it is matrix algorithms, even matrix multiply
(which is, as you note, a lot of dot products), that can in some cases
result in precision loss. Worst is matrix inversion, but
fortunately there are usually better ways to do such problems.

Reminds me of a feature of the IBM 7030 (stretch) that, as far
as I know, hasn't been implemented in later processors.
One can select whether post normalization shifts in zeros or
ones. One can then run a program both ways, and check for
differences.


> When dropping 80-bit accumulators and adopting SSE or AVX
> in 64-bit, the answers changed, but it is difficult to
> identify how significant the loss of precision or change in
> results really was. (there are lots of other influences on
> accuracy in larger models)

(snip)

steve kargl

unread,
Oct 9, 2019, 10:37:58 PM10/9/19
to
JCampbell wrote:

>
> gFortran's implementation of 8087 real*10 looks awful, by using 16-byte storage. Another 64-bit
> compiler I use does not provide 10 byte reals, so I am unsure of the performance penalty for
> adopting higher precision.

Apparently, you've never heard about memory alignment. Reading the
documentation might help.

'-m96bit-long-double'
'-m128bit-long-double'
These switches control the size of 'long double' type. The x86-32
application binary interface specifies the size to be 96 bits, so
'-m96bit-long-double' is the default in 32-bit mode.

Modern architectures (Pentium and newer) prefer 'long double' to be
aligned to an 8- or 16-byte boundary. In arrays or structures
conforming to the ABI, this is not possible. So specifying
'-m128bit-long-double' aligns 'long double' to a 16-byte boundary
by padding the 'long double' with an additional 32-bit zero.

In the x86-64 compiler, '-m128bit-long-double' is the default
choice as its ABI specifies that 'long double' is aligned on
16-byte boundary.

Notice that neither of these options enable any extra precision
over the x87 standard of 80 bits for a 'long double'.

Given the 30 year history of GCC, I suspect the 16-byte storage for an
Intel 80-bit extended precision entity has been well-thought out.
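
One can see both numbers from Fortran. A throwaway sketch (it assumes the compiler provides an extended-precision kind at all, e.g. gfortran's kind 10 on x86):

program x87_storage
   implicit none
   ! Ask for more precision than IEEE double; on x86 gfortran this is
   ! typically the 80-bit x87 format.
   integer, parameter :: xp = selected_real_kind(18)
   real(xp) :: x
   x = 1.0_xp
   print *, 'significand bits:', digits(x)            ! 64 for x87 extended
   print *, 'storage bytes   :', storage_size(x) / 8  ! often 16, due to alignment padding
end program x87_storage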

--
steve


JCampbell

unread,
Oct 10, 2019, 10:51:06 PM10/10/19
to
Steve,

I have heard about memory alignment and have tried to code to minimise misalignment, although not very successfully. (I never produced a test that clearly identified the problem and solution.)
It is more an excuse than a reason: if you look at the performance penalties for non-aligned arrays with AVX (doing a dot product on arrays that are out of alignment by 4 or 8 bytes rather than 16), there have been significant performance improvements on i7 processors over the last 8 years in what are claimed to be non-alignment delays, certainly for practical (mixed) computation.
This issue should be (and has been) addressed in the processor at the memory <> cache interface, not in Fortran coding. I think my tests have shown this has happened.

Back in the '80s we had an 80-bit hardware accumulator.
Now, for Intel i-series processors, the best SIMD hardware accumulator is 64-bit.
I have not seen any discussion about why this has happened, or whether a higher precision accumulator is justified.

I also don't see any requests for higher precision hardware performance.
I am asking whether there is much support for 10-, 12-, or more likely 16-byte SIMD, or whether that is also not needed.
Memory capacity is making this a feasible option.

I see this as more important than KIND's way of addressing precision.

James Van Buskirk

unread,
Oct 11, 2019, 12:18:55 AM10/11/19
to
"JCampbell" wrote in message
news:279634bb-3531-441c...@googlegroups.com...

> Back in the 80's we had a 80-bit hardware accumulator.
> Now for Intel i series processors, the best SIMD hardware
> accumulator is 64-bit.
> I have not seen any discussion about why this has happened
> or if a higher precision accumulator is justified ?

Having the floating point size the same as the integer size makes sense
(see TRANSFER). Then it's a logical step to go to 4-wide accumulators
with FMA instructions that can be issued to 2 pipelines, resulting in
16 FLOPs per clock cycle.

robin....@gmail.com

unread,
Oct 11, 2019, 11:27:47 AM10/11/19
to
On Thursday, October 10, 2019 at 1:36:11 PM UTC+11, ga...@u.washington.edu wrote:
> On Wednesday, October 9, 2019 at 5:44:49 PM UTC-7, JCampbell wrote:
>
> (snip, I wrote)
> > > The decision from that discussion, maybe more than 10 years
> > > ago, was that extra precision is allowed.
>
> > "extra precision is allowed" Something to think about !!
>
> > I would like to better understand the 8087 extended precision
> > evolution and why it was apparently dropped by Intel when
> > SSE/AVX simd was introduced.
>
> I am not so sure. One possibility is that extended precision
> (temporary real, as intel calls it) never worked as well as it
> should have.

The co-processor received wide use.
I doubt that that was the reason.
There was a flaw in one version of the chip that gave wrong results
on division.

> The 8087 was supposed to have a virtual stack, where registers
> would spill to memory on stack overflow, and back on stack
> underflow. It wasn't until after the chip was built that it
> was found not to work.

There were many versions of the chip. If there were such a
shortcoming, it would have been rectified by the next version.

ga...@u.washington.edu

unread,
Oct 11, 2019, 7:19:49 PM10/11/19
to
On Friday, October 11, 2019 at 8:27:47 AM UTC-7, robin...@gmail.com wrote:

(snip, I wrote)

> > The 8087 was supposed to have a virtual stack, where registers
> > would spill to memory on stack overflow, and back on stack
> > underflow. It wasn't until after the chip was built that it
> > was found not to work.

> There were many versions of the chip. If there were such a
> shortcoming, it would have been rectified by the next version.

That is what I thought, too, but as well as I know, it never
happened.

Well, the 80287 is much of the 8087 logic, but with a different
(and synchronous) bus interface. It runs with a separate clock
from the CPU clock. By the 80387, I believe much was redesigned,
such that it could have been fixed.

The problem has to do with the inability to restore the stack to
the state that it should be in at the time, but that is as close
as I know it.