On 3/16/21 6:30 AM, JCampbell wrote:
> I also have no experience with climate modeling, but have been using OpenMP to improve computing performance in structural finite element analysis.
> My reason for this is that it is a simpler approach involving a single executable, without the complexity of distributed memory or multiple processors.[...]
It is a common misconception that shared-memory programming is "simpler"
than distributed-memory programming. Which model seems simpler is largely
a matter of perspective. For a wide class of problems, single-program
multiple-data (SPMD) distributed-memory programming is the simplest and
most straightforward approach, especially when trying to scale beyond the
small 8-to-16-processor range.
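For concreteness, here is roughly what that SPMD style looks like with
MPI: every rank runs the same program on its own block of data and
communicates explicitly. This is only a minimal sketch (it assumes the
mpi_f08 module; the names and array sizes are arbitrary):

   program spmd_demo
      use mpi_f08
      implicit none
      integer :: me, np, n
      real, allocatable :: x(:)
      real :: local_sum, total
      call MPI_Init()
      call MPI_Comm_rank(MPI_COMM_WORLD, me)
      call MPI_Comm_size(MPI_COMM_WORLD, np)
      n = 1000                    ! each rank owns and works on its own block
      allocate(x(n))
      x = real(me + 1)
      local_sum = sum(x)          ! purely local computation
      call MPI_Allreduce(local_sum, total, 1, MPI_REAL, MPI_SUM, MPI_COMM_WORLD)
      if (me == 0) print *, 'global sum over', np, 'ranks =', total
      call MPI_Finalize()
   end program spmd_demo
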
Message-passing libraries were developed in the late 1980s, when parallel
computing first became popular. Their main advantage then was that no
compiler or language modifications were required, and thus they
were portable to a wide range of underlying hardware. MPI was initially
a collection of the features of these early libraries. It has been around
for almost 30 years now, which is why application areas such as climate
modeling, fluid dynamics, and quantum chemistry adopted it early and have
continued to use it. Higher-level libraries, such as ScaLAPACK and the
Global Arrays library, have been developed on top of MPI.
At that time it was not uncommon to see SPMD programs running on
shared-memory machines that outperformed the compiler-driven
shared-memory programs. The SPMD model bypasses many performance
bottlenecks associated with memory bandwidth and cache coherence. And of
course, the shared-memory programs of that era were tied to specific
compilers on specific hardware, whereas the MPI SPMD programs were
portable across a much wider range of compiler and hardware combinations.
In the late 1980s and early 1990s there was a wide variety of compilers,
operating systems, network interconnects, and CPU architectures in use.
OpenMP is an attempt to address those portability issues for the
shared-memory model, but it is still a shared-memory model, so it is
inherently limited in how far it can scale.
I view coarrays as another way to write SPMD programs, one that fits
within the same application domain as MPI and Global Arrays. I know there are
shared-memory implementations of coarrays, but I have not used those. I
first used coarray fortran in 2004. At that time it was just a Cray
compiler feature, so it was not portable. Now it is part of the fortran
language standard, which makes it more portable in some ways, but less
so in others. The problem, I think, is that MPI itself is pretty easy to
use in most applications, and it works with mixed-language fortran-C
applications, so coarrays have made only slow progress over the last
decade. But at the same time, MPI has not kept up with modern fortran
practices (scope hierarchy, allocatable arrays, pointers, etc.), so that
makes coarray programming more appealing.
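To make the comparison concrete, here is a rough sketch of the same kind
of SPMD reduction written with coarrays; it is only illustrative, but it
shows how naturally coarrays coexist with ordinary allocatable arrays and
other modern fortran features:

   program caf_demo
      implicit none
      integer :: me, np, i
      real, allocatable :: x(:)
      real :: local_sum
      real :: total[*]            ! coarray: one copy of total on every image
      me = this_image()
      np = num_images()
      allocate(x(1000))           ! ordinary allocatable, local to each image
      x = real(me)
      local_sum = sum(x)          ! purely local computation
      total = local_sum
      sync all                    ! make every image's partial sum visible
      if (me == 1) then
         do i = 2, np
            total = total + total[i]   ! image 1 pulls the remote partial sums
         end do
         print *, 'global sum =', total
      end if
   end program caf_demo

(In fortran 2018 the explicit loop can be replaced by the co_sum
collective, which is closer in spirit to MPI_Allreduce.)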
I personally would have preferred that MPI keep up with the fortran
language over the last 20 years, rather than introducing coarrays as a
new programming layer into the language. What we have now is kind of the
worst possible situation for fortran: for parallel scaling we need an
MPI-based distributed-memory approach, but for maximum performance on the
shared-memory multicore nodes we also need some combination of coarrays
and OpenMP. That is a difficult programming model. Maybe coarray
implementations can somehow grow at both ends and bridge that gap,
allowing million-node SPMD programs and efficient use of shared-memory
nodes.
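To illustrate the kind of layering I mean: MPI across the nodes, OpenMP
(or coarrays) across the cores within each node. A hypothetical toy
sketch, assuming an MPI library that supports at least
MPI_THREAD_FUNNELED:

   program hybrid_demo
      use mpi_f08
      implicit none
      integer :: me, provided, i, n
      real, allocatable :: x(:)
      real :: local_sum, total
      ! MPI handles the distributed-memory level (one or a few ranks per node)
      call MPI_Init_thread(MPI_THREAD_FUNNELED, provided)
      call MPI_Comm_rank(MPI_COMM_WORLD, me)
      n = 1000000
      allocate(x(n))
      x = 1.0
      ! OpenMP handles the shared-memory cores within each rank
      local_sum = 0.0
      !$omp parallel do reduction(+:local_sum)
      do i = 1, n
         local_sum = local_sum + x(i)
      end do
      !$omp end parallel do
      ! MPI combines the per-rank results across the whole machine
      call MPI_Allreduce(local_sum, total, 1, MPI_REAL, MPI_SUM, MPI_COMM_WORLD)
      if (me == 0) print *, 'global sum =', total
      call MPI_Finalize()
   end program hybrid_demo
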
$.02 -Ron Shepard