
Intel Fortran compiler options


Beliavsky

Apr 14, 2017, 2:02:43 PM
For compiling with Intel Fortran on Windows, emphasizing correctness over speed, I can use the options

-nologo -stand:f08 -traceback -check:all -check:bounds -check:uninit -debug:all -warn:all,nodec,interfaces -Qfp-stack-check -fpe:0 -gen-interfaces -qinit:snan

What other options do people suggest?

FortranFan

Apr 14, 2017, 4:17:41 PM
The most important option for me is -standard-semantics:
https://software.intel.com/en-us/fortran-compiler-18.0-developer-guide-and-reference-standard-semantics

Plus I mostly use -stand:f15 (or at times just -stand) for diagnostics relative to the standard.

You should also try the Intel Fortran forum for more informed opinions on such questions:
https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows

Stefano Zaghi

Apr 15, 2017, 2:07:36 AM
Dear Beliavsky,

when I compile in "paranoiac debug mode" I use

-O0 -debug all -check all -warn all -extend-source 132 -traceback -gen-interfaces -fpe-all=0 -fp-stack-check -fstack-protector-all -ftrapuv -no-ftz -std08 -standard-semantics

This is my first test for all programs; then comes the actual TDD :-)

My best regards.

Beliavsky

Apr 28, 2017, 5:14:32 PM
On Friday, April 14, 2017 at 2:02:43 PM UTC-4, Beliavsky wrote:
I have found that to avoid crashes in my program, which uses lots of allocatable arrays, either -heap-arrays or -F1234567890 is necessary (for the -F option, the number after it can be reduced by a factor of 100, and the program still runs). I wonder which option to use as the default when compiling. The program seems to run at about the same speed in either case.

From ifort /?

/F<n> set the stack reserve amount specified to the linker

/heap-arrays[:n]
temporary arrays of minimum size n (in kilobytes) are allocated in
heap memory rather than on the stack. If n is not specified,
all temporary arrays are allocated in heap memory.

FortranFan

Apr 28, 2017, 5:19:01 PM
On Friday, April 28, 2017 at 5:14:32 PM UTC-4, Beliavsky wrote:

> ..
>
> I have found that to avoid crashes in my program, which uses lots of allocatable arrays, either -heap-arrays or -F1234567890 is necessary .. I wonder which option to use as the default when compiling. ..


You should consult with Intel Fortran support, at least inquire on their forum.

I use /heap-arrays option by default.

Tim Prince

Apr 28, 2017, 6:08:37 PM
The more confusing factor is the difference in the way ifort treats :n
in comparison to the almost analogous gfortran option. In ifort, the :n
is potentially useful only for arrays of size known at compile time;
otherwise, :n is better omitted. The general idea of using stack for
small frequently allocated temporaries and heap for large infrequently
allocated ones is valid if it can be accomplished.
Steve Lionel consistently recommended /heap-arrays in case of doubt.
Since you seem to be interested mainly in Windows, most of the
situations where I would not second that recommendation are excluded.
The question is less consequential when you can restrict the temporary
array creation to regions of serial execution, of course creating and
destroying temporaries outside inner loops (a desirable tactic). When
this is not possible, it can affect the trade-off between the number of
MPI processes and the number of OpenMP threads, probably requiring the
stack reserve to be set to the maximum while OMP_STACKSIZE is set to the
minimum working value. For example, when I worked on NASA Overflow for
MIC KNC, the optimum was to use stack-private
arrays with ulimit -s unlimited, OMP_STACKSIZE=9M, 6 MPI processes per
CPU with OMP_NUM_THREADS=30. Larger NUM_THREADS would have required
more than "unlimited" stack; larger NP could require more than 16GB RAM.
With the importance of OpenMP, the prevalence of bugs associated with
heap-arrays under OpenMP seems to be much reduced from what it was at
one time.
Current Intel CPUs, unlike MIC KNC, don't usually require more than 1
thread per core for typical Fortran jobs, and stack reserve should not
be as critical a resource.

Ron Shepard

Apr 29, 2017, 12:41:38 PM
Okay, so here we are, yet again, with another thread about memory
allocation resource problems in fortran. We have had these threads since
the mid 90s when f90 compilers began to appear. The problem, which we
have all known about here for the last 20 years, is that it is not
possible within the language to query the operating system to ask about
available and allocated memory. Programmers need that information in
order to allocate resources efficiently, to make runtime algorithm
choices, and in many cases we need that information even to allow the
program to run, not just to run efficiently.

There are still legacy fortran codes that, today right now as I type
this, allocate a block of memory at the beginning of execution of the
code and perform their own memory management within that heap/stack.
That is the way we did memory allocation in the 70s and 80s with f77.
When you do memory allocation that way, you always have available the
amount of memory left and the amount of memory that has been allocated.
For nontrivial programs, that information is critical in almost every
programming environment but even more so with parallel computing where
it is all too easy to exhaust cache, to exhaust stack space, to exhaust
physical memory, or even to exhaust swap space.

I have posted about this here before, since the 90s even when the
problems became apparent with the current automatic/allocatable
programming model. The arguments against making this memory information
available to the programmer, in a standard way so that it is portable in
syntax and, when possible, functionality, were not valid then and they
have not been valid in the 20 years since then. 20 years is a long time
in the computing world. Isn't it time for this glaring and obvious
deficiency in the language to be corrected?

This deficiency is the main reason there is still active development
being done with f77 style coding standards after all this time. This
needs to be fixed. If this issue is not addressed, and if I am still
programming in 10 or 20 years, this will still be a problem within the
language. It is not going away, it will not just magically solve itself.
The existence of this thread, 20 years after this problem became
apparent, is evidence of the critical and fundamental importance of the
ability to query the system for this information.

$.02 -Ron Shepard

Beliavsky

Apr 29, 2017, 1:30:59 PM
I have thought about this much less than you, but why can't people just use ALLOCATABLE arrays and the compiler equivalent of the -heap-array option? When arrays are ALLOCATEd, they can check the STAT.
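
For what it's worth, the kind of check I mean is sketched below; ERRMSG= needs an F2003-level compiler, and the array size is only illustrative.

program alloc_check
implicit none
double precision, allocatable :: work(:)
integer :: istat
character(len=256) :: emsg

emsg = ' '
allocate (work(200000000), stat=istat, errmsg=emsg)   ! roughly 1.6 GB
if (istat /= 0) then
   print *, 'allocation failed: ', trim(emsg)
   ! fall back to a smaller block size, an out-of-core method, etc.
   stop
end if
work = 0.0d0
print *, 'allocated ', size(work), ' elements'
deallocate (work)
end program alloc_check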

Gary Scott

Apr 29, 2017, 1:51:16 PM
I think the problem is mostly unexpectedly exceeding tiny default stack
sizes. It was never obvious to me that array operations would cause a
stack overflow but they seem to, at least in older IVF versions. I
think that is improved in recent versions.

herrman...@gmail.com

Apr 29, 2017, 4:04:56 PM
On Saturday, April 29, 2017 at 9:41:38 AM UTC-7, Ron Shepard wrote:
> Okay, so here we are, yet again, with another thread about memory
> allocation resource problems in fortran. We have had these threads since
> the mid 90s when f90 compilers began to appear. The problem, which we
> have all known about here for the last 20 years, is that it is not
> possible within the language to query the operating system to ask about
> available and allocated memory. Programmers need that information in
> order to allocate resources efficiently, to make runtime algorithm
> choices, and in many cases we need that information even to allow the
> program to run, not just to run efficiently.

One problem is that operating systems don't always know how
much is available to a program. Many systems allocate memory
from a global pool, such that another program could use some at
any time. If you ask, and then immediately allocate, it could
still fail.

> There are still legacy fortran codes that, today right now as I type
> this, allocate a block of memory at the beginning of execution of the
> code and perform their own memory management within that heap/stack.
> That is the way we did memory allocation in the 70s and 80s with f77.
> When you do memory allocation that way, you always have available the
> amount of memory left and the amount of memory that has been allocated.
> For nontrivial programs, that information is critical in almost every
> programming environment but even more so with parallel computing where
> it is all to easy to exhaust cache, to exhaust stack space, to exhaust
> physical memory, or even to exhaust swap space.

In the days of single task systems, where only one program ran at
a time, and had all the memory (not used by other parts of the
system) available, that made some sense. Also, the early multitask
systems, with OS/360 as an example, allocated a fixed size region
to each job step, unavailable to others.

But systems like Unix and Windows allocate globally, from a pool
of virtual memory. Knowing how much virtual memory is available
really doesn't help much. (Even more fun, I have known programs
that wouldn't run, computing the available memory in 32 bits,
finding the value negative, and so less than needed.)

> I have posted about this here before, since the 90s even when the
> problems became apparent with the current automatic/allocatable
> programming model. The arguments against making this memory information
> available to the programmer, in a standard way so that it is portable in
> syntax and, when possible, functionality, where not valid then and they
> have not been valid in the 20 years since then. 20 years is a long time
> in the computing world. Isn't it time for this glaring and obvious
> deficiency in the language to be corrected?

The real reason for the current problem is more related to C.
C89 allows for automatic arrays with constant bounds.
Allocating with variable length is done using malloc(), which
returns a pointer to the allocated space. It is not required by the
language, but the tradition is to also allocate large arrays of
known constant size with malloc(). The result is that C programs
usually do not need a large stack.

I remember the early MS linker, that defaulted to 4K bytes
for stack. You could change it with a command option, but you
had to remember to do that. I was then using it with OS/2 1.2,
which had a reasonable virtual memory system. I could ask for
really large stack sizes, and all would be fine.

Note that the x86 stack grows downward. You have to start
at some high enough value and, depending on that value, can't
easily change it later. That can be large and virtual on modern
systems, but you do have to choose it large enough.

> This deficiency is the main reason there is still active development
> being done with f77 style coding standards after all this time. This
> needs to be fixed. If this issue is not addressed, and if I am still
> programming in 10 or 20 years, this will still be a problem within the
> language. It is not going away, it will not just magically solve itself.
> The existence of this thread, 20 years after this problem became
> apparent, is evidence of the critical and fundamental importance of the
> ability to query the system for this information.

The OS/360 memory allocation call, GETMAIN, has a feature that seems
to be rare in other systems. One can ask for a variable amount,
giving a minimum and maximum. This ability continued to virtual
storage in later systems, and is still in current z/OS.
This allows programs to ask for the minimum needed, but also run
more efficiently when more is available.


Ron Shepard

Apr 30, 2017, 3:28:22 AM
On 4/29/17 12:30 PM, Beliavsky wrote:
> I have thought about this much less than you, but why can't people just use ALLOCATABLE arrays and the compiler equivalent of the -heap-array option? When arrays are ALLOCATEd, they can check the STAT.

Using allocatable rather than automatic arrays only solves part of the
problem. What programmers need is the ability to ask how much memory has
been allocated at any time, and how much is still available at any
time. Then block sizes can be chosen for various algorithms, or
algorithms that trade between memory and operation counts can be tuned,
or decisions can be made between using shared and distributed memory,
and so on. If a queuing system enforces memory limits (and most of them
seem to), then the programmer needs to be able to ask how much available
memory is left, otherwise the job might abort altogether. It is correct
that with different types of memory (local cache, shared cache, local
RAM, shared RAM, swap space on disk, etc.) one might need to query
different memory types separately rather than just a single total
number, but that seems like a detail that could be solved in one of
several reasonable ways.

One of the difficulties is that in addition to automatic and allocatable
arrays, which the programmer can control to some extent, there are also
array operations, copy-in/copy-out arguments, and so on that the
compiler does under the covers. The memory used in this way still counts
toward whatever resource limits there are. The programmer only has
indirect control over these, e.g. through compiler optimization levels
or compiler options. This is another reason why query options are necessary.
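
To make that concrete, here is a hypothetical sketch; whether a temporary is actually created, and where it lives, is entirely up to the compiler and to options such as /heap-arrays:

program hidden_temp
implicit none
real :: x(2000000)
x = 1.0
! The non-contiguous section x(1::2) is typically copied by the
! compiler into a hidden temporary (copy-in/copy-out) so that the
! explicit-shape dummy sees contiguous storage; that temporary can
! land on the stack or on the heap, and no ALLOCATE appears anywhere
! in the source.
call scale_it(x(1::2), size(x(1::2)))
print *, x(1), x(2)
end program hidden_temp

subroutine scale_it(a, n)
implicit none
integer n
real a(n)
a = 2.0*a
end subroutine scale_it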

Using STAT in the allocate() statement can only tell if that particular
allocation succeeded. If it fails, it might well be too late to
backtrack and redo previous steps to make room for that allocation. What
is needed is the ability to query ahead of time too, not just when the
program is at the limit of the resource.

$.02 -Ron Shepard

herrman...@gmail.com

Apr 30, 2017, 4:18:29 AM
On Sunday, April 30, 2017 at 12:28:22 AM UTC-7, Ron Shepard wrote:
> On 4/29/17 12:30 PM, Beliavsky wrote:
> > I have thought about this much less than you, but why can't
> > people just use ALLOCATABLE arrays and the compiler equivalent
> > of the -heap-array option? When arrays are ALLOCATEd, they can
> > check the STAT.

> Using allocatable rather than automatic arrays only solves part of the
> problem. What programmers need is the ability to ask how much memory has
> been allocated at any time, and how much is still available at any
> time. Then block sizes can be chosen for various algorithms, or
> algorithms that trade between memory and operation counts can be tuned,
> or decisions can be made between using shared and distributed memory,
> and so on.

As I noted above, on many systems the amount of virtual memory
available changes depending on what other programs are doing.

But even more, the questions you ask now are more related to real
memory. If you want to loop through an array many times, as many
iterating algorithms do, you want it all in real memory.

As above, some systems allocate virtual memory from a global
pool, where others have a limit per task. (Or it might depend
on which one you hit first.) The exact allocation of real memory
is pretty system dependent, and depends in complicated ways on what
other programs, or other parts of the OS, are doing.


> If a queuing system enforces memory limits (and most of them
> seem to), then the programmer needs to be able to ask how much available
> memory is left, otherwise the job might abort altogether. It is correct
> that with different types of memory (local cache, shared cache, local
> RAM, shared RAM, swap space on disk, etc.) one might need to query
> different memory types separately rather than just a single total
> number, but that seems like a detail that could be solved in one of
> several reasonable ways.

Some years ago, I wrote a program to attempt to determine the
available virtual memory using a loop to allocate successively
smaller by half blocks. With some questionable assumptions relating
to fragmentation, it could be close. C's malloc() is pretty simple,
in allocating a block, with usually a small overhead, and returning
a pointer. Fortran's ALLOCATE might have more overhead to keep
track of what it is doing, but you could still try.

I did manage to crash a Unix system some time ago, allocating
as much memory as I could in a loop, then keeping it allocated
for some time.


> One of the difficulties is that in addition to automatic and allocatable
> arrays, which the programmer can control to some extent, there are also
> array operations, copy-in/copy-out arguments, and so on that the
> compiler does under the covers. The memory used in this way still counts
> toward whatever resource limits there are. The programmer only has
> indirect control over these, e.g. through compiler optimization levels
> or compiler options. This is another reason why query options are necessary.

It would be nice for compilers to tell you which statements are
allocating temporary arrays, so one could plan accordingly.
Many, it seems, don't.

> Using STAT in the allocate() statement can only tell if that particular
> allocation succeeded. If it fails, it might well be too late to
> backtrack and redo previous steps to make room for that allocation. What
> is needed is the ability to query ahead of time too, not just when the
> program is at the limit of the resource.

Yes, but you can ALLOCATE in a loop to determine if enough is
available, then DEALLOCATE for later use.
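
Something along these lines, for example; just a sketch, and on systems that overcommit a successful ALLOCATE does not guarantee the pages are really there:

program probe_alloc
implicit none
integer, parameter :: i8 = selected_int_kind(18)
integer(i8) :: trial, largest
integer :: istat
character(len=1), allocatable :: block(:)    ! one byte per element

largest = 0
trial = 2_i8**40                             ! start at 1 TiB and halve
do while (trial >= 2_i8**20)                 ! give up below 1 MiB
   allocate (block(trial), stat=istat)
   if (istat == 0) then
      largest = trial
      deallocate (block)                     ! free it again for later use
      exit
   end if
   trial = trial/2
end do
print *, 'largest single block obtained (bytes):', largest
end program probe_alloc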

herrman...@gmail.com

Apr 30, 2017, 4:39:52 AM

I just tried:

#include <stdio.h>
#include <stdlib.h>
int main() {
    size_t i, j, k = 0;                /* k must be initialized */
    /* find the largest power of two that fits in a size_t */
    for (i = 1, j = 2; j > 0; i *= 2, j *= 2) ;
    /* sum the sizes of the successively smaller blocks malloc grants */
    for ( ; i > 0; i /= 2) if (malloc(i - 16)) k += i;
    printf("%zu bytes\n", k);
    return 0;
}

on OS X and it returned

140728898420720

That is way more than the size of my swap disk, so I suspect
it isn't a reliable way to measure available memory.

Beliavsky

Apr 30, 2017, 8:31:44 AM
On Sunday, April 30, 2017 at 3:28:22 AM UTC-4, Ron Shepard wrote:
> On 4/29/17 12:30 PM, Beliavsky wrote:
> > I have thought about this much less than you, but why can't people just use ALLOCATABLE arrays and the compiler equivalent of the -heap-array option? When arrays are ALLOCATEd, they can check the STAT.
>
> Using allocatable rather than automatic arrays only solves part of the
> problem. What programmers need is the ability to ask how much memory has
> been allocated at any time, and how much is still available at any
> time.

Does C++ provide for this? When I Google "c++ query memory usage" I see http://stackoverflow.com/questions/63166/how-to-determine-cpu-and-memory-consumption-from-inside-a-process , where the answer to the question "How to determine CPU and memory consumption from inside a process?" involves C code that depends on the operating system. Given the interoperability with C of Fortran 2003+, I think one could call the C code from Fortran.
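
On Linux, for instance, one could even skip the C layer and read /proc/meminfo directly; a rough sketch (the MemAvailable field assumes a reasonably recent kernel, and the unit number is arbitrary):

program meminfo
implicit none
integer, parameter :: i8 = selected_int_kind(18)
character(len=256) :: line
integer(i8) :: kib
integer :: ios

open (unit=15, file='/proc/meminfo', status='old', action='read', iostat=ios)
if (ios /= 0) stop 'no /proc/meminfo on this system'
do
   read (15, '(a)', iostat=ios) line
   if (ios /= 0) exit
   if (line(1:13) == 'MemAvailable:') then
      read (line(14:), *) kib                ! value is reported in kB
      print *, 'MemAvailable (kB): ', kib
   end if
end do
close (15)
end program meminfo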

Richard Maine

Apr 30, 2017, 9:56:58 AM
Ron Shepard <nos...@nowhere.org> wrote:

> Using allocatable rather than automatic arrays only solves part of the
> problem. What programmers need is the ability to ask how much memory has
> been allocated at any time, and how much is still available at any
> time. Then block sizes can be chosen for various algorithms, or
> algorithms that trade between memory and operation counts can be tuned,
> or decisions can be made between using shared and distributed memory,
> and so on...

But those are issues that come up for relatively sophisticated
programmers. Mention has been made in this thread of the many times that
people post here asking about memory-related problems. I think you'll
find that if you examine most of those posts, they are from relatively
unsophisticated users - ones quite unlikely to be helped by the
existence of intrinsics to inquire about such things. I don't think the
standard is the place to fix this.

The two largest sorts of problems I see from novice users are

1. Not understanding even the order of magnitude of the memory some
really huge allocation requires. This often points to being a bit hazy
about the concept of memory requirements at all. That one mostly
requires education rather than mods to the standard.

2. Running out of the very small default stack space limit of some
implementations. I think this problem is better addressed to vendors.
The combination of having a small default stack limit plus putting lots
of stuff on the stack by default seems poor. Yes, I know there are
options to change these things, but in my opinion, the default options
ought to be such that normal expectations are met. Otherwise, you'll
continue to have people asking here because they won't know how to
change the default options. I'll leave it to others as to whether the
default stack size is *GREATLY* increased or whether defaults don't put
arrays on the stack. But in my opinion, one or the other of those should
be done. That's a vendor issue - not a standards one.

--
Richard Maine
email: last name at domain . net
domain: summer-triangle

FortranFan

Apr 30, 2017, 11:21:16 AM
On Saturday, April 29, 2017 at 12:41:38 PM UTC-4, Ron Shepard wrote:

> ..
> The existence of this thread, 20 years after this problem became
> apparent, is evidence of the critical and fundamental importance of the
> ability to query the system for this information.
> ..

@Ron Shepard,

Independent of this thread, you should yourself try to put together a proposal for the Fortran standards committee with your thoughts and ideas, say in the form of one of those technical specification (TS) documents.

My hunch is the exercise itself will make you realize what you have in mind is not even feasible, let alone likely to be addressed by the standards committee.

By the way, don't you think this subthread is related to a Windows OS specific aspect (https://msdn.microsoft.com/en-us/library/8cxs58a6(v=vs.100).aspx) on *stack versus heap* allocations as followed by a specific Fortran compiler vendor (Intel), the details of which are hardly the realm of a language standard such as Fortran?

Ron Shepard

Apr 30, 2017, 1:05:49 PM
On 4/30/17 3:18 AM, herrman...@gmail.com wrote:
> On Sunday, April 30, 2017 at 12:28:22 AM UTC-7, Ron Shepard wrote:
>> On 4/29/17 12:30 PM, Beliavsky wrote:
[...]
> As above, some systems allocate virtual memory from a global
> pool, where others have a limit per task. (Or it might depend
> on which one you hit first.) The exact allocation of real memory
> is pretty system dependent, and depends in complicated ways on what
> other programs, or other parts of the OS, are doing.

Yes, but this is the argument in favor of making the information
available to the programmer, not for hiding it.

>> If a queuing system enforces memory limits (and most of them
>> seem to), then the programmer needs to be able to ask how much available
>> memory is left, otherwise the job might abort altogether. It is correct
>> that with different types of memory (local cache, shared cache, local
>> RAM, shared RAM, swap space on disk, etc.) one might need to query
>> different memory types separately rather than just a single total
>> number, but that seems like a detail that could be solved in one of
>> several reasonable ways.
>
> Some years ago, I wrote a program to attempt to determine the
> available virtual memory using a loop to allocate successively
> smaller by half blocks. With some questionable assumptions relating
> to fragmentation, it could be close. C's malloc() is pretty simple,
> in allocating a block, with usually a small overhead, and returning
> a pointer. Fortran's ALLOCATE might have more overhead to keep
> track of what it is doing, but you could still try.

But the underlying allocation routines already have the information just
sitting there. When programmers manage their own heap/stack by using
memory from a fixed array, the amount allocated and the amount left are
always readily available. It would seem to be a rather simple request
for this information to be made available to the programmer using some
kind of general standardized syntax.

> I did manage to crash a Unix system some time ago, allocating
> as much memory as I could in a loop, then keeping it allocated
> for some time.

You can also hang a system by exhausting swap space and in any number
of other ways. I think what I'm advocating is a way for the programmer
to query the system so that these situations can be avoided. Maybe not
every single one of them can be avoided, just some of them.

>> One of the difficulties is that in addition to automatic and allocatable
>> arrays, which the programmer can control to some extent, there are also
>> array operations, copy-in/copy-out arguments, and so on that the
>> compiler does under the covers. The memory used in this way still counts
>> toward whatever resource limits there are. The programmer only has
>> indirect control over these, e.g. through compiler optimization levels
>> or compiler options. This is another reason why query options are necessary.
>
> It would be nice for compilers to tell you which statements are
> allocating temporary arrays, so one could plan accordingly.
> Many, it seems, don't.

I think these allocations also sometimes depend on the runtime
environment, the length of the arrays, the amount of stack space that is
available at any time, the number of threads at that moment, and so on.
The compiler is already using the information that I'm requesting, I'm
mostly advocating that this information be made available to the
programmer.

>> Using STAT in the allocate() statement can only tell if that particular
>> allocation succeeded. If it fails, it might well be too late to
>> backtrack and redo previous steps to make room for that allocation. What
>> is needed is the ability to query ahead of time too, not just when the
>> program is at the limit of the resource.
>
> Yes, but you can ALLOCATE in a loop to determine if enough is
> available, then DEALLOCATE for later use.

Yes, that is a potentially expensive and inefficient way to determine a
value that is already just sitting there. Also, as others have pointed
out in the past, some systems return a success status during the
allocate, but they later fail when the memory is actually accessed. The
actual memory page allocations are deferred. Or the trash collection
operations, actually returning storage back to the memory pool for
future allocations, might be deferred and performed asynchronously from
the deallocate statements. The ability to query is even more important
for these types of systems.

Also, there can be competition among different user processes, or even
between the OS and the user, for resources, and that complicates the
query process. But there are also systems where a single task is in
control, either directly or indirectly, of the resource allocations.
That is the way supercomputers work, and the way they are expected to
work for the foreseeable future. Another complication is that user
libraries, say numerical libraries or message passing libraries or
distributed memory libraries on parallel computers, also allocate
resources indirectly. The programmer might not have direct control, or
even knowledge, of these allocations, so the ability to query afterward
to see the amount of memory is important in these situations too.
Another possible solution is to allow tasks to reserve memory. Once
reserved, it is guaranteed to be available to that task for future
allocation/deallocation steps, free from competition from other users or
other OS or library tasks. This is the "sandbox" or "virtual machine"
idea. But that task's own OS/library/compiler allocations still would
come from that reserved block, so the ability to query the status of
that sandbox is still necessary.

$.02 -Ron Shepard

Ron Shepard

Apr 30, 2017, 1:09:28 PM
I'm guessing that it returns the amount of virtual memory. That might be
one of the useful memory resources to query about, but not the only one.
Swap space, physical memory, queue limits, sandbox limits, etc. are some
other ones that might be important.

$.02 -Ron Shepard

Ron Shepard

Apr 30, 2017, 1:25:22 PM
On 4/30/17 8:56 AM, Richard Maine wrote:
> I think you'll
> find that if you examine most of those posts, they are from relatively
> unsophisticated users - ones quite unlikely to be helped by the
> existence of intrinsics to inquire about such things.

Yes, there are different levels of sophistication. I would hope that
this could be implemented in such a way that unsophisticated users are
not burdened by the ability to query, while more sophisticated ones
could use it to their advantage.

> I don't think the
> standard is the place to fix this.

I think this is a good point, and this is probably one of the main
reasons why it has not been addressed in the last 20-25 years since the
problem became apparent. It cannot be fixed just within fortran, and it
cannot be fixed just within the OS (e.g. as part of POSIX). In some
cases, the necessary functionality might require fundamental changes to
the way memory is managed by the OS. But it probably requires
cooperation between the compiler and the OS to provide this
functionality to the programmer.

The bottom line is that programs can and do crash due to memory resource
constraints, and many (if not all) of those failures could be avoided if
it were possible to query the environment ahead of time to determine the
usage and the availability of resources.

$.02 -Ron Shepard

Ron Shepard

Apr 30, 2017, 2:11:48 PM
On 4/30/17 10:21 AM, FortranFan wrote:
> By the way, don't you think this subthread is related to a Windows OS specific aspect (https://msdn.microsoft.com/en-us/library/8cxs58a6(v=vs.100).aspx) on *stack versus heap* allocations as followed by a specific Fortran compiler vendor (Intel), the details of which are hardly the realm of a language standard such as Fortran?

Back in the 70s and 80s when I started programming as part of my job,
there was a single computer vendor that would supply the hardware, the
OS, the compiler, and the math libraries to the customer. This was the
typical situation. If there was a problem, you as a customer went to
that single vendor. It didn't matter exactly where the problem actually
was, in the compiler or the OS or somewhere else, you had just that
single vendor who was responsible for fixing the problem.

That is not the current situation. Now there are different vendors
supplying and maintaining those different components. When an issue like
this comes up, one that requires interaction between the compiler and
the OS for example, then that requires different vendors to cooperate.
Part of that cooperation is through adherence to standards, say the
Fortran language standard and the POSIX OS standard. There are
advantages to having multiple vendors involved, but there are also
disadvantages. Your statement reflects one of those disadvantages. In
this particular case of querying the system for resource availability,
it is very easy to simply say that it is not the realm of the language
compiler, and it is easy to say that it is not the realm of the OS, and
therefore no vendor is responsible for fixing the problem.

That means that it is up to changing the language and OS standards to
provide this capability. Otherwise, how would those things possibly get
implemented? As I pointed out before, this problem has been known,
recognized, and discussed for at least 20 years, probably even longer,
and it has not yet been addressed. 20 years is a long long time with
computers. Given the landscape described above, it may never be addressed.

$.02 -Ron Shepard

campbel...@gmail.com

Apr 30, 2017, 9:49:53 PM
Ron,

In Windows, you can get the available memory and the available virtual memory. Silverfrost FTN95 provides an interface to an API routine that does this.
The following is an example of it's use.

subroutine report_memory_usage (string)
  character string*(*)
!
  integer, parameter :: knd = 4

  type MEMORYSTATUSEX
    sequence
    integer dwLength
    integer dwMemoryLoad
    integer(knd) ullTotalPhys
    integer(knd) ullAvailPhys
    integer(knd) ullTotalPageFile
    integer(knd) ullAvailPageFile
    integer(knd) ullTotalVirtual
    integer(knd) ullAvailVirtual
    integer(knd) ullAvailExtendedVirtual
  end type

  stdcall GlobalMemoryStatusEx 'GlobalMemoryStatusEx' (REF):logical
  type(MEMORYSTATUSEX) :: mdata
!
  integer(knd) :: lastAvailPhys = 0

  mdata%dwLength = 64

  if (GlobalMemoryStatusEx(mdata)) then

    write (117,10) mdata%ullAvailPhys, (lastAvailPhys-mdata%ullAvailPhys), string

    lastAvailPhys = mdata%ullAvailPhys

!   print *, "Percentage of physical memory in use ", mdata%dwMemoryLoad
!   print 10, "Amount of actual physical memory ", gb(mdata%ullTotalPhys)
!   print 10, "Amount of physical memory available ", gb(mdata%ullAvailPhys)
!   print 10, "Committed memory limit ", gb(mdata%ullTotalPageFile)
!   print 10, "Amount of memory current process can commit ", gb(mdata%ullAvailPageFile)
!   print 10, "Size of virtual address space ", gb(mdata%ullTotalVirtual)
!   print 10, "Amount of unreserved/uncommitted memory ", gb(mdata%ullAvailVirtual)
! 10 format(1x,a,f0.3)

  else
    print *, "Report Memory Failed ", string
  end if

10 format (B'---,---,---,--#',B'---,---,---,--#',2x,a)

end subroutine report_memory_usage

Confusing available physical memory with available virtual memory is not a very subtle mistake!

There is another problem with the way local and automatic arrays are allocated on the stack. This should never have been the case. All such arrays above a small size (say 1 KB) should be treated like ALLOCATE and go on the "heap" (not sure why it is called that). It is ridiculous that programs fail with stack overflow when the stack provided is 3 MB in size. The stack is not the appropriate place for these arrays. How many novice programmers have hit this message when they should not have had to manage this problem? It is amazing that after 20 years this is still a problem and compiler developers have chosen not to fix it.

I would also draw your attention to the way Silverfrost has implemented COMMON with their new 64-bit compiler. COMMON addresses are allocated at run time, so the 2 GB limit does not apply. Why haven't other compiler vendors done this? It is possible to provide a coding environment that is a little bit friendlier.

Well worth the $.02 Ron

campbel...@gmail.com

Apr 30, 2017, 9:58:52 PM
I should have pointed out that, given Silverfrost's strange KIND values,
"integer, parameter :: knd = 4" is equivalent to INTEGER*8.

herrman...@gmail.com

May 1, 2017, 12:49:22 AM
On Sunday, April 30, 2017 at 10:05:49 AM UTC-7, Ron Shepard wrote:

(snip, I wrote)

> > As above, some systems allocate virtual memory from a global
> > pool, where others have a limit per task. (Or it might depend
> > on which one you hit first.) The exact allocation of real memory
> > is pretty system dependent, and depends in complicated ways on what
> > other programs, or other parts of the OS, are doing.

> Yes, but this is the argument in favor of making the information
> available to the programmer, not for hiding it.

I think I earlier proved that this isn't true.

I wrote a program that allocated 140TB of memory.

Maybe it isn't obvious, but my computer doesn't have that much.

Many systems now use lazy allocation, where they allow one to allocate
more than is available, on the assumption that it won't all be used.
(Like airlines overbook, assuming that not everyone will show.)

(snip)

> > Some years ago, I wrote a program to attempt to determine the
> > available virtual memory using a loop to allocate successively
> > smaller by half blocks. With some questionable assumptions relating
> > to fragmentation, it could be close. C's malloc() is pretty simple,
> > in allocating a block, with usually a small overhead, and returning
> > a pointer. Fortran's ALLOCATE might have more overhead to keep
> > track of what it is doing, but you could still try.

> But the underlying allocation routines already have the information just
> sitting there. When programmers manage their own heap/stack by using
> memory from a fixed array, the amount allocated and the amount left are
> always readily available. It would seem to be a rather simple request
> for this information to be made available to the programmer using some
> kind of general standardized syntax.

Systems can use complicated schemes for keeping track of
memory, such that the answer isn't readily available. Fragmentation makes
it more complicated. Even if you know the total, you don't know how
big the largest available block is.

> > I did manage to crash a Unix system some time ago, allocating
> > as much memory as I could in a loop, then keeping it allocated
> > for some time.

> You can also hang a system by exhausting swap space and in many other
> numbers of ways. I think what I'm advocating is a way for the programmer
> to query the system so that these situations can be avoided. Maybe not
> every single one of them can be avoided, just some of them.

You can hang it, but in this case, after hanging for too long, it
actually crashed. It seems the system needed memory for something
important and couldn't get it.

(snip)

> > It would be nice for compilers to tell you which statements are
> > allocating temporary arrays, so one could plan accordingly.
> > Many, it seems, don't.

> I think these allocations also sometimes depend on the runtime
> environment, the length of the arrays, the amount of stack space that is
> available at any time, the number of threads at that moment, and so on.
> The compiler is already using the information that I'm requesting, I'm
> mostly advocating that this information be made available to the
> programmer.

I just meant a note from the compiler that a certain statement
does, or might, use a temporary.

> >> Using STAT in the allocate() statement can only tell if that particular
> >> allocation succeeded. If it fails, it might well be too late to
> >> backtrack and redo previous steps to make room for that allocation. What
> >> is needed is the ability to query ahead of time too, not just when the
> >> program is at the limit of the resource.

> > Yes, but you can ALLOCATE in a loop to determine if enough is
> > available, then DEALLOCATE for later use.

> Yes, that is a potentially expensive and inefficient way to determine a
> value that is already just sitting there. Also, as others have pointed
> out in the past, some systems return a success status during the
> allocate, but they later fail when the memory is actually accessed.

I haven't kept up with what they do, but assuming that the
information is available, it should be fast enough to do the
allocate. But as above, it might not use it anyway.

One popular way is a linked list of unallocated blocks, with the links
at the beginning of the block. This is convenient, but doesn't work
so well with virtual memory. It requires a lot of paging to follow
the list. I believe that there are currently ways to improve this,
but I don't know them.

> The
> actual memory page allocations are deferred. Or the trash collection
> operations, actually returning storage back to the memory pool for
> future allocations, might be deferred and performed asynchronously from
> the deallocate statements. The ability to query is even more important
> for these types of systems.

More important, but also less possible. If it is done asynchronously,
it likely isn't available (yet).

> Also, there can be competitions among different user processes, or even
> between the OS and the user, for resources, and that complicates the
> query process. But there are also systems where a single task is in
> control, either directly or indirectly, of the resource allocations.
> That is the way supercomputers work, and the way they are expected to
> work for the foreseeable future. Another complication is that user
> libraries, say numerical libraries or message passing libraries or
> distributed memory libraries on parallel computers, also allocate
> resources indirectly. The programmer might not have direct control, or
> even knowledge, of these allocations, so the ability to query afterward
> to see the amount of memory is important in these situations too.

If someone wants a non-standard extension to allow it, there is
nothing to stop them. But enough systems don't supply the information,
or don't have it accurately, that it doesn't belong in the standard.

FortranFan

May 1, 2017, 11:05:42 AM
On Sunday, April 30, 2017 at 2:11:48 PM UTC-4, Ron Shepard wrote:

> ..
>
> That means that it is up to changing the language .. to
> provide this capability. .. this problem has been known,
> recognized, and discussed for at least 20 years, probably even longer,
> .. Given the landscape described above, it may never be addressed.
>
> ..


@Ron Shepard,

It's clear you think this is important enough to be addressed in the Fortran language standard, but you do not indicate how that is even possible. The crux of your problem is that most people seem to think otherwise, both in terms of importance and feasibility. So it's entirely up to you to clarify the picture. My own rather simple-minded understanding of a few OS environments (Windows, Linux, Mac, etc.) suggests there is no way to get a reasonably useful picture of the available memory during runtime that has any sense of commonality across these OS environments. You seem to imply the processor already has the information available via the current ALLOCATE functionality, but it's unclear if that's accurate.

I think the situation comes back to my earlier point: put together a proposal-type information packet with your ideas in which, at least as a thought exercise, you work through the implementation on a couple of systems such as Mac and Linux. It's the least one can expect from someone with your extensive experience. Short of that, posting something along the lines of "Okay, so here we are, yet again, .. about memory allocation resource problems in Fortran .." in threads that appear quite unrelated to the broader and deeper aspect you are bringing up will come across as a distraction to most readers.


Terence

May 1, 2017, 6:55:32 PM
I think Ron Shepard implied the real problem by just talking about it.

Programmers should NOT have to know and worry about ANY aspect of the
computer the program is running on (other than that the compiler for that
architecture was used).

If I program an algorithm in Fortran I expect the program to run.
And pretty much if I program and compile in F77 I know it will.

Everything else needed should be "behind the curtain" of the Wizard.

Terence


Stefano Zaghi

May 1, 2017, 11:29:41 PM
Dear Ron,

I have to admit that I am not able to follow you in all the details; my ignorance is a limit.

You have alluded to the fact that the big source of memory management issues was added in the Fortran 90 standard. This seems to imply that Fortran 77 and 66 had better memory management. Can you elaborate on what changed in the Fortran 90 standard?

This surprises me: Fortran has great backward compatibility, and with a modern compiler I am able to compile very old Fortran programs, thus I think I can still access all the Fortran 77 memory-related features. What I think has really changed are OS/hardware-related features rather than any Fortran "degradation"; indeed, the language looks greatly improved since the Fortran 77 days.

Dear Terence,

On Tuesday, May 2, 2017 at 00:55:32 UTC+2, Terence wrote:
> I think Ron Shepard implied the real problem by just talking about it.

I do not like "implicit" typing, please adopt "implicit none" here.

> Programmers should NOT have to know and worry about ANY aspect of the
> computer the program is running on (other than that the compiler for that
> architecture was used).

I agree, in most cases I really do not want to know the low-level aspects: the compiler is better than I am at handling such aspects.

> If I program an algorithm in Fortran I expect the program to run.

This is the hope, but Fortran is not magic: if I am a bad programmer, I write a bad program.

> And pretty much if I program and compile in F77 I know it will.

This is a silly sentence: I see a lot of F77 spaghetti programs full of bugs that do not run as expected. Your programs run well because you are a great programmer, not because F77 has some "magic" features.

These days, claiming that F77 is superior to the "modern" standards (90+) is false and anachronistic, and in general should be rejected without real proof: newbies should not be misled into such a false assumption.

If you think F77 is better than F90+, please provide reproducible examples proving your claim.

> Everything else needed should be "behind the curtain" of the Wizard.

I agree, I hope so, in general. Sometimes, it is not possible.

My best regards.

FortranFan

May 1, 2017, 11:33:48 PM
On Monday, May 1, 2017 at 6:55:32 PM UTC-4, Terence wrote:

> ..
>
> Programmers should NOT have to know and worry about ANY aspect of the
> computer the program is running on ..
> And pretty much if I program and compile in F77 I know it will.
>
> Everything else needed should be "behind the curtain" of the Wizard.
> ..

To paraphrase Ron Shepard (or to quote Ronald?!), there you go again! Take a moment to watch this video please, it should sound familiar:
https://www.youtube.com/watch?v=Wi9y5-Vo61w

Note OP didn't provide any details, but only stated "I have found that to avoid crashes in my program, .., either -heap-arrays or -F1234567890 is necessary .."

Since Windows OS is implied, chances are the situation encountered is no different from what's shown below with a FORTRAN 77 program that computes the sum of x(i), i = 1 to N. An adjustable array (aka automatic array) comes into play in a subprogram (subroutine); said array gets allocated on the stack dynamically when the subprogram is invoked, and that 'crashes' the program when it is compiled and linked with 'default' options, but the program runs and gives results when the /STACK option (https://msdn.microsoft.com/en-us/library/8cxs58a6.aspx) is specified with the linker command.

-- begin FORTRAN 77 example --
C     stovfl.for
      SUBROUTINE STOVFL(N)
      IMPLICIT NONE
      INTEGER N
      INTEGER I
      DOUBLE PRECISION X(N)
      DOUBLE PRECISION SUMX
      SUMX = 0D0
      DO 10 I=1,N
         X(I)=DBLE(I)
         SUMX=SUMX+X(I)
   10 CONTINUE
      WRITE(6,11) SUMX
   11 FORMAT("SUMX=",1PG22.15)
      RETURN
      END SUBROUTINE

C     main.for
      IMPLICIT NONE
      INTEGER NBIG
      PARAMETER(NBIG=130000)
      CALL STOVFL(NBIG)
      STOP
      END
-- end FORTRAN 77 example --

-- begin compile, link, and execution sequence --

C:\Fortran\MythBuster\c92\sor>f77 /c stovfl.for
Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update A)
Copyright 2001 Compaq Computer Corp. All rights reserved.

stovfl.for

C:\Fortran\MythBuster\c92\sor>f77 /c main.for
Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update A)
Copyright 2001 Compaq Computer Corp. All rights reserved.

main.for

C:\Fortran\MythBuster\c92\sor>link /out:stovfl.exe main.obj stovfl.obj
Microsoft (R) Incremental Linker Version 6.00.8447
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.


C:\Fortran\MythBuster\c92\sor>stovfl.exe

forrtl: severe (170): Program Exception - stack overflow
Image PC Routine Line Source
stovfl.exe 00401081 Unknown Unknown Unknown

C:\Fortran\MythBuster\c92\sor>link /stack:2000000 /out:stovfl.exe main.obj stovfl.obj
Microsoft (R) Incremental Linker Version 6.00.8447
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.


C:\Fortran\MythBuster\c92\sor>stovfl.exe
SUMX= 8450065000.00000

C:\Fortran\MythBuster\c92\sor>
-- end sequence --

Note that whether it is a FORTRAN 77 program that has to push the stack pointer to make more room for an adjustable/automatic array (X in the stovfl.for example above) than the OS allows by default, or a 'modern Fortran' program with ALLOCATABLE arrays for which a compiler implementation places them on the stack by default and runs into OS-related stack limitations, the current situation is the same: users need to take some time to understand system-specific details and limits in conjunction with the requirements of their code, and then either change the code (e.g., use a COMMON block for the X array in the example above) or use system-specific instructions (such as the /STACK linker option shown above) to overcome the crashes; the latter is similar to what the OP was asking about with the /heap-arrays or /F compiler options.

Statements such as "If I program an algorithm in Fortran I expect the program to run. And pretty much if I program and compile in F77 I know it will. " in the context of this subthread are then entirely meaningless.

And the statement, "I think Ron Shepard implied the real problem by just talking about it" means diddly squat. Nothing actionable has been implied up until now.

Far simpler and clearer communication is needed to make readers understand what exactly the changes to the Fortran standard being bandied about are, and how they would help in the context of programs crashing on a particular OS due to stack size limitations, with memory being allocated on the stack either by implementation default or because of how the code is written. This holds regardless of whether the code is in Fortran 90 (or a later standard revision) and 'uses lots of allocatable arrays', or is FORTRAN 77 with adjustable (aka automatic) arrays in subprograms such as the example above.

There is too much opacity on display here, those with experience should do better than to indulge in empty statements about FORTRAN 77.

FortranFan

May 2, 2017, 12:13:40 AM
On Monday, May 1, 2017 at 11:33:48 PM UTC-4, FortranFan wrote:

> .. what's shown below with a FORTRAN 77 program ..
>
> -- begin FORTRAN 77 example --
> C stovfl.for
> SUBROUTINE STOVFL(N)
> IMPLICIT NONE
> ..


A correction: I only remembered after I made the post that FORTRAN 77 did not support 'IMPLICIT NONE'; thus the above example does not conform to the FORTRAN 77 standard, strictly speaking. But this aspect involving implicit typing should not have any effect on the rest of the post.

Stefano Zaghi

May 2, 2017, 12:45:14 AM
Dear FortranFan,

Please forgive me, but I am a bit confused, as I said upthread to Ron.

On Tuesday, May 2, 2017 at 05:33:48 UTC+2, FortranFan wrote:
> ... regardless of whether the code is in Fortran 90 (or a later standard revision) and 'which uses lots of allocatable arrays' or using FORTRAN 77 with adjustable (aka automatic) arrays in subprograms such as the example above.

This sentence of yours makes me doubt what I assumed I had learned.

I was sure that, at least on my x86_64 GNU/Linux systems, if I define allocatable arrays (local to units) they are allocated on the heap, whereas automatic arrays go on the stack.

Is my understanding the common one (at least on x86_64 GNU/Linux architectures)?

Indeed, I was surprised by Ron's point precisely because of this (possibly wrong) understanding of mine: because ALLOCATABLE is an F90+ feature, it represents a good memory-handling improvement with respect to the F77-and-earlier standards. Allocatables can help minimize memory consumption by allocating only the necessary memory (avoiding the compile-time oversizing needed in F77 and earlier), and they provide an "indication" that the allocated memory goes on the heap and not on the stack, thus avoiding stack-related issues (if my understanding of the allocatable-heap correlation is valid).
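
For concreteness, this is the distinction I have in mind; just a sketch, since where each array really ends up is decided by the compiler and by options such as -heap-arrays:

program stack_vs_heap
implicit none
call two_kinds(1000000)
contains
   subroutine two_kinds(n)
      integer, intent(in) :: n
      real :: auto_work(n)                ! automatic: usually on the stack, and
                                          ! with large n this is exactly what can
                                          ! blow a small default stack limit
      real, allocatable :: heap_work(:)   ! allocatable: usually on the heap

      allocate (heap_work(n))
      auto_work = 1.0
      heap_work = 2.0
      print *, sum(auto_work) + sum(heap_work)
      ! heap_work is deallocated automatically on return (F95+ semantics)
   end subroutine two_kinds
end program stack_vs_heap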

My best regards.

herrman...@gmail.com

May 2, 2017, 1:38:25 AM
All true, but like any tool, the closer you are to its limits, the
more you need to know how it works, and work around the limits.

It seems to me that many inventions were made before they should
have been. The Wright brothers started flying before most of
aerodynamics was even slightly understood. Radio transmission
before amplifiers or tuned circuits. And finally, computing
hardware before digital electronics.

But back to computing.

For any computing problem, you need to have some idea that the
memory requirements, and especially time requirement, are within
the capabilities of available equipment. If your problem takes
thousands of years to solve on available equipment, there isn't
much point in starting the run.

Virtual storage has been a great help, but also has some new
requirements. For it to work, you need to be sure that most
of the time you are using code and data in real storage.
Often that requires more understanding of the problem and
the computer than one might wish.

Ron Shepard

May 2, 2017, 4:04:48 AM
On 5/1/17 10:29 PM, Stefano Zaghi wrote:
> You have alluded to the fact that the big source of memory management issues was added in the Fortran 90 standard. This seems to imply that Fortran 77 and 66 had better memory management. Can you elaborate on what changed in the Fortran 90 standard?

I am not exactly sure what you are asking here. F77 itself did not
provide any runtime memory allocation. Everything in the standard itself
could be implemented with static arrays. There were no allocatable
arrays or automatic arrays in f77.

However, starting in about the mid 80s, most compilers did allow runtime
memory allocation through subroutine calls. Cray pointers were an
example of how this could be combined with compiler extensions, but
there were also straightforward ways of using malloc() that required no
compiler syntax extensions at all.

But that was not what I was talking about. I thought I explained it
pretty well before, but I'll do so again. In f77, the typical hack was
to dimension a large array at the beginning of the program, and then to
use that array as a stack/heap to allocate work arrays from it. Using
sequence association, you can compute offsets within the big work array,
pass array locations as actual arguments to subroutines, and then the
subroutines themselves can look like normal code. The main driver
routines are a big mess, because of all the offset calculations being
done, but the subroutines themselves look alright, they look like normal
fortran code.

There are all kinds of problems with this approach. Mistakes in
computing the offsets are difficult to detect. Garbage collection is
possible, but not necessarily straightforward when the workspace is
treated as a heap. Using the workspace as a stack is simpler, but still
presents problems for debugging, and of course that places more burden
on the programmer to push/pop the arrays in the right order to avoid
wasted space. If you are allocating arrays of different types (integer,
real, double precision, logical), then there were complications
associated with aliasing actual arguments and dummy arguments of
different types. For example, there could be alignment problems. e.g. if
your hardware required double precision values to start on even-word or
8-byte addresses. But with this approach the amount of memory in use and
available was always sitting right there, no more than a memory fetch
and a single integer subtraction away.
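
A stripped-down sketch of what I mean, written in free form for readability (real codes of that era do this in fixed form, with type juggling and EQUIVALENCE tricks on top):

program workspace_demo
implicit none
integer, parameter :: lwork = 1000000
double precision :: work(lwork)      ! the one big array, sized at build time
integer :: itop, ia, ib, n
external solve                       ! old-style implicit interface

n = 500
itop = 1
ia = itop;  itop = itop + n*n        ! "allocate" an n x n matrix
ib = itop;  itop = itop + n          ! "allocate" a vector of length n
if (itop - 1 > lwork) stop 'workspace exhausted'

! the amount used and the amount left are always one subtraction away:
print *, 'used:', itop - 1, '  left:', lwork - (itop - 1)

call solve(work(ia), work(ib), n)
itop = ia                            ! "free" both by popping the stack
end program workspace_demo

subroutine solve(a, b, n)
implicit none
integer n
double precision a(n,n), b(n)        ! sequence-associated slices of work(:)
a = 0.0d0
b = 1.0d0
end subroutine solve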

So f90 allocatable and automatic arrays are better than this approach in
every possible way. Every possible way *EXCEPT* the ability to query for
the amount of memory used/left at any time. But that one little detail
is actually very important in many situations. I mentioned several of
those in my previous post. And it is that one missing piece of
functionality that has resulted in many current fortran programs still
using the old-fashioned f77 style manual heap/stack management approach,
with all of its numerous problems.

I am one of the programmers for the COLUMBUS electronic structure code,
and large parts of it work this way. The last I looked, the GAUSSIAN and
GAMES electronic structure codes also worked this way. I'm pretty sure
that parts of MOLPRO and MOLCAS also work this way. I'm also pretty sure
that none of these codes would still be doing this if it weren't for
that one missing piece of functionality.

$.02 -Ron Shepard

Ron Shepard

May 2, 2017, 4:08:42 AM
On 5/1/17 10:33 PM, FortranFan wrote:
> -- begin FORTRAN 77 example --
> C stovfl.for
> SUBROUTINE STOVFL(N)
> IMPLICIT NONE
> INTEGER N
> INTEGER I
> DOUBLE PRECISION X(N)
>[...]

I see that you already mentioned that IMPLICIT NONE was not f77.
Automatic arrays were also not allowed in f77.

$.02 -Ron Shepard


Stefano Zaghi

unread,
May 2, 2017, 4:23:21 AM5/2/17
to
Dear Ron, thank you very much for your kind reply.

Il giorno martedì 2 maggio 2017 10:04:48 UTC+2, Ron Shepard ha scritto:
> I am not exactly sure what you are asking here. F77 itself did not
> provide any runtime memory allocation. Everything in the standard itself
> could be implemented with static arrays. There were no allocatable
> arrays or automatic arrays in f77.

This is my understanding, but because I know very little about F77, I was not sure. Simply, your claim seemed to allude to the fact that F90 introduced issues that are not present in F77. Reading your last reply, I now understand that you mean the F90 memory handling improvements are good, but not complete.

> But that was not what I was talking about. I thought I explained it
> pretty well before, but I'll do so again.

I am sorry for bothering, I had missed this. Thank you for your kindness.

> In f77, the typical hack was
> to dimension a large array at the beginning of the program, and then to
> use that array as a stack/heap to allocate work arrays from it. Using
> sequence association, you can compute offsets within the big work array,
> pass array locations as actual arguments to subroutines, and then the
> subroutines themselves can look like normal code. The main driver
> routines are a big mess, because of all the offset calculations being
> done, but the subroutines themselves look alright, they look like normal
> fortran code.

I have read programs designed this way: one of my mentors did such a hack for a very complex CFD code. It was a nightmare...

> There are all kinds of problems with this approach. Mistakes in
> computing the offsets are difficult to detect. Garbage collection is
> possible, but not necessarily straightforward when the workspace is
> treated as a heap. Using the workspace as a stack is simpler, but still
> presents problems for debugging, and of course that places more burden
> on the programmer to push/pop the arrays in the right order to avoid
> wasted space. If you are allocating arrays of different types (integer,
> real, double precision, logical), then there were complications
> associated with aliasing actual arguments and dummy arguments of
> different types. For example, there could be alignment problems. e.g. if
> your hardware required double precision values to start on even-word or
> 8-byte addresses. But with this approach the amount of memory in use and
> available was always sitting right there, no more than a memory fetch
> and a single integer subtraction away.

Aside from the issues you mentioned, this is not viable in some scenarios: my applications, for example, involve Adaptive Mesh Refinement, where memory consumption varies at runtime in an unpredictable way. Exploiting static allocation is not a good approach at all.

> So f90 allocatable and automatic arrays are better than this approach is
> every possible way. Every possible way *EXCEPT* the ability to query for
> the amount of memory used/left at any time. But that one little detail
> is actually very important in many situations. I mentioned several of
> those in my previous post.

OK, now it is much clearer: what really seems to be the issue is the ability to query for available memory at runtime. Is such a query possible in other languages (or is it too closely tied to the architecture)? It could be good input for the standard committee.

> And it is that one missing piece of
> functionality that has resulted in many current fortran programs still
> using the old-fashioned f77 style manual heap/stack management approach,
> with all of its numerous problems.
> I am one of the programmers for the COLUMBUS electronic structure code,
> and large parts of it work this way. The last I looked, the GAUSSIAN and
> GAMES electronic structure codes also worked this way. I'm pretty sure
> that parts of MOLPRO and MOLCAS also work this way. I'm also pretty sure
> that none of these codes would still be doing this if it weren't for
> that one missing piece of functionality.

This looks a little odd to me: such low-level memory management is still possible in "modern" Fortran, so why claim that the "good old days" were better? If you need to do manual heap/stack handling you can do it with any Fortran standard, but only F90+ allows me to exploit allocatables...

My best regards.

mecej4

unread,
May 2, 2017, 7:54:15 AM5/2/17
to
That sounds right. That number is a bit short of 2^47, probably equal
to the size of memory that can be addressed minus some reserved area.


-- mecej4
mec...@clothes.com
Disrobe and don Gmail to reply.

Richard Maine

unread,
May 2, 2017, 7:59:27 AM5/2/17
to
Indeed, there was no form of dynamic allocation at all in f77. This was
deliberate. The avoidance of dynamic allocation went so far as to
include some arcane restrictions whose sole purpose was to avoid the
necessity of dynamic allocation. See in particular the restrictions on
character concatenation. Those are downright puzzling if one doesn't
realize that they are aimed at dynamic allocation.

The closest thing to dynamic allocation was adjustable dimensions in
dummy arrays. That sort of looks like dynamic allocation, but it is
actually only dynamic selection of a portion of a statically allocated
actual argument.

mecej4

unread,
May 2, 2017, 8:14:38 AM5/2/17
to
Perhaps an analogy will help; let's forget about Fortran 77 and earlier
for now.

You have a bank account. You can write checks (cheques) but you cannot
inquire about your balance. You can make deposits but you don't know
when your account will be credited for those deposits. When you write a
check that bounces, your account balance will be reduced to zero.

That is how it is with Fortran 90, etc., and in most other languages, as
well. You are not given the information that you could use to control
your check writing and in general manage YOUR money. That state of
affairs is what Ron Shepard is writing about.

Stefano Zaghi

unread,
May 2, 2017, 8:28:35 AM5/2/17
to
Dear mecj4,

Il giorno martedì 2 maggio 2017 14:14:38 UTC+2, mecej4 ha scritto:
> Perhaps an analogy will help; let's forget about Fortran 77 and earlier
> for now.
>
> You have a bank account. You can write checks (cheques) but you cannot
> inquire about your balance. You can make deposits but you don't know
> when your account will be credited for those deposits. When you write a
> check that bounces, your account balance will be reduced to zero.
>
> That is how it is with Fortran 90, etc., and in most other languages, as
> well. You are not given the information that you could use to control
> your check writing and in general manage YOUR money. That state of
> affairs is what Ron Shepard is writing about.

Thank you for your help. It was clear from Ron's previous answer (now restated by yours) which Fortran shortcoming Ron is worried about. I was driven in the wrong direction by the claim that this does not happen in F77 and earlier (misunderstanding Ron's words, and Terence's sentence about F77's supposed superiority added even more confusion): if the trick for having "the memory balance" is static allocation (oversizing "a priori" and handling "memory references" manually), that is still possible in F90+.

Like others, I think that such aspects (e.g. the memory "balance") are strictly tied to the architecture the program runs on, so it is difficult for the Fortran committee to take them into account. Are queries of this kind available in C?

My best regards.

Gary Scott

unread,
May 2, 2017, 9:15:45 AM5/2/17
to
On 4/30/2017 1:11 PM, Ron Shepard wrote:

> That means that it is up to changing the language and OS standards to
> provide this capability. Otherwise, how would those things possibly get
> implemented? As I pointed out before, this problem has been known,
> recognized, and discussed for at least 20 years, probably even longer,
> and it has not yet been addressed. 20 years is a long long time with
> computers. Given the landscape described above, it may never be addressed.

Of course there are always OS facilities available:

GlobalMemoryStatusEx function:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366589(v=vs.85).aspx

GetPhysicallyInstalledSystemMemory function:
https://msdn.microsoft.com/en-us/library/windows/desktop/cc300158(v=vs.85).aspx

However, I don't think either of those will provide enough information
to guarantee the success of a subsequent allocation. I don't think
that's possible.

>
> $.02 -Ron Shepard

FortranFan

unread,
May 2, 2017, 9:34:11 AM5/2/17
to
On Tuesday, May 2, 2017 at 12:45:14 AM UTC-4, Stefano Zaghi wrote:

> ..
>
> I was sure that, at least for my x86_64 GNU/Linux systems, if I define allocatable (local to units) arrays these are allocated on the heap, whereas automatic arrays go on the stack.
>
> ..

Stefano,

Please see this in connection with Intel Fortran compiler:
https://software.intel.com/en-us/fortran-compiler-18.0-developer-guide-and-reference-heap-arrays

and note the following:
"Default
-no-heap-arrays or /heap-arrays-

The compiler puts automatic arrays and temporary arrays in the stack storage area."

Then note that on Windows, the default stack size is rather small at 1 MB, as indicated by Microsoft:
https://msdn.microsoft.com/en-us/library/8cxs58a6.aspx

So it is the combination of two aspects, 1) the small default stack on the Windows OS and 2) the Intel Fortran implementation placing data on the stack, that leads to the program crashes reported by the OP, which can then be avoided if the /heap-arrays or /F option is specified to the compiler.

By the way, if I'm not mistaken, gfortran even on Windows OS does as you indicate i.e., it will place allocatable arrays on the heap and therefore, the issue with OP's programs crashing will likely not occur with gfortran+Windows OS combination using default compiler options.

I feel Ron Shepard is trying to project from this system-specific issue involving Intel Fortran and Windows OS on to something bigger and that does not look right in the absence of further information.

herrman...@gmail.com

unread,
May 2, 2017, 9:36:57 AM5/2/17
to
On Tuesday, May 2, 2017 at 1:08:42 AM UTC-7, Ron Shepard wrote:

(snip)

> Automatic arrays were also not allowed in f77.

I believe that this is not true.

Variables without the SAVE attribute can be either static
or automatic. On the other hand, just like C89, you can't
dimension automatic arrays with variables.

The distinction between static and automatic is important
when you have recursion, but Fortran 77 doesn't have it.

Stefano Zaghi

unread,
May 2, 2017, 10:06:00 AM5/2/17
to
Dear FortranFan,

Il giorno martedì 2 maggio 2017 15:34:11 UTC+2, FortranFan ha scritto:
> Stefano,
>
> Please see this in connection with Intel Fortran compiler:
> https://software.intel.com/en-us/fortran-compiler-18.0-developer-guide-and-reference-heap-arrays
>
> and note the following:
> "Default
> -no-heap-arrays or /heap-arrays-
>
> The compiler puts automatic arrays and temporary arrays in the stack storage area."
>
> Then note that on Windows, the default stack size is rather small at 1 MB, as indicated by Microsoft:
> https://msdn.microsoft.com/en-us/library/8cxs58a6.aspx
>
> So it is the combination of the two aspects 1) small default stack on Windows OS and 2) Intel Fortran implementation placing data on the stack that leads to program crashes by OP which can then be avoided if /heap-arrays or /F option is specified to the compiler.
>
> By the way, if I'm not mistaken, gfortran even on Windows OS does as you indicate i.e., it will place allocatable arrays on the heap and therefore, the issue with OP's programs crashing will likely not occur with gfortran+Windows OS combination using default compiler options.

Thank you very much for the insight; I was not aware of the common behavior on MS Windows and, in particular, that it differs from the GNU/Linux one. If I ever have to use such systems I will have to remember this!

My best regards.

Beliavsky

unread,
May 2, 2017, 11:07:44 AM5/2/17
to
On Tuesday, May 2, 2017 at 9:34:11 AM UTC-4, FortranFan wrote:
> On Tuesday, May 2, 2017 at 12:45:14 AM UTC-4, Stefano Zaghi wrote:
>
> > ..
> >
> > I was sure that, at least for my x86_64 GNU/Linux systems, if I define allocatable (local to units) arrays these are allocated on the heap, whereas automatic arrays go on the stack.
> >
> > ..
>
> Stefano,
>
> Please see this in connection with Intel Fortran compiler:
> https://software.intel.com/en-us/fortran-compiler-18.0-developer-guide-and-reference-heap-arrays
>
> and note the following:
> "Default
> -no-heap-arrays or /heap-arrays-
>
> The compiler puts automatic arrays and temporary arrays in the stack storage area."
>
> Then note that on Windows, the default stack size is rather small at 1 MB, as indicated by Microsoft:
> https://msdn.microsoft.com/en-us/library/8cxs58a6.aspx
>
> So it is the combination of the two aspects 1) small default stack on Windows OS and 2) Intel Fortran implementation placing data on the stack that leads to program crashes by OP which can then be avoided if /heap-arrays or /F option is specified to the compiler.
>
> By the way, if I'm not mistaken, gfortran even on Windows OS does as you indicate i.e., it will place allocatable arrays on the heap and therefore, the issue with OP's programs crashing will likely not occur with gfortran+Windows OS combination using default compiler options.
>

Actually, the program that was giving me trouble with Intel Fortran before I discovered the /F and /heap-arrays options is freezing the whole computer when I use gfortran GNU Fortran (GCC) 7.0.1 20170122 (experimental) on Windows 10. The program runs fine with gfortran 4.8.4 on Ubuntu 14.04.3 and also with g95 on Windows. I don't think the gfortran developers use Windows, so I don't know if they could reproduce my problem if I sent them the code. I think my Windows gfortran was installed from Equation.com.

Ron Shepard

unread,
May 2, 2017, 11:55:49 AM5/2/17
to
On 5/2/17 3:23 AM, Stefano Zaghi wrote:
>> And it is that one missing piece of
>> functionality that has resulted in many current fortran programs still
>> using the old-fashioned f77 style manual heap/stack management approach,
>> with all of its numerous problems.
>> I am one of the programmers for the COLUMBUS electronic structure code,
>> and large parts of it work this way. The last I looked, the GAUSSIAN and
>> GAMES electronic structure codes also worked this way. I'm pretty sure
>> that parts of MOLPRO and MOLCAS also work this way. I'm also pretty sure
>> that none of these codes would still be doing this if it weren't for
>> that one missing piece of functionality.
> This looks a little odd to me: this so low-level memory managing is still possible in "modern" Fortran, why claiming that the "old good days" are better? If you need to do a manual heap/stack handling you can do with any Fortran standard, but only F90+ allows me to exploit allocatables...

Yes, that is exactly the problem that I'm pointing out. Current codes,
using f90+ compilers, are still using the old fashioned f77 style memory
management (allocate a big workspace and divide it up) with all of its
problems and, perhaps even more importantly, giving up many of the
advantages of the f90 functionality with memory management.

$.02 -Ron Shepard

Tim Prince

unread,
May 2, 2017, 12:02:03 PM5/2/17
to
automatic arrays were a fairly widely used and well tested extension in
f77; presumably that is the reason for their appearance in the same form
in f90.
Likewise, implicit none was an even more widely used extension.
"not allowed" seems an overstatement, given that there was no
requirement to forbid or diagnose extensions.

Tim Prince

unread,
May 2, 2017, 12:06:41 PM5/2/17
to
I assume Ron meant what came into f90 as an automatic array and was not
referring to the implementation of SAVE vs. non-SAVE.
There were a few f77 compilers (e.g. Prime) which defaulted to
supporting recursion but no portable way for an application to determine
whether that was in effect.

Ron Shepard

unread,
May 2, 2017, 12:11:17 PM5/2/17
to
On 5/2/17 8:34 AM, FortranFan wrote:
> I feel Ron Shepard is trying to project from this system-specific issue involving Intel Fortran and Windows OS on to something bigger and that does not look right in the absence of further information.

Yes, the issue I'm discussing is larger than this particular problem.
But if there were a standard way to query the system to ask about such
memory resources, these questions would not arise, or at least the
answer would be obvious.

My main concern is not answering beginner programming questions. My
concerns also include the most knowledgeable and sophisticated programmers
using the most sophisticated supercomputing hardware that is available
now, and over the next decades, and their jobs crash somewhere in the
middle because they exhaust some memory limit, and yet they have no
ability to query the system to avoid those crashes. And it includes all
levels of programmers and hardware in between those extremes. It really
is a universal problem, it has been the elephant in the room for the
last 20+ years.

$.02 -Ron Shepard

Ron Shepard

unread,
May 2, 2017, 12:42:24 PM5/2/17
to
It should have been clear what I was talking about in my original post,
which included an example of a declaration of an automatic array.

My statement had nothing to do with SAVE, which by the way was in the
f77 standard, or recursion, which was not allowed in f77. And yes, I do
remember that IMPLICIT NONE and automatic arrays and recursion were
allowed by some compilers as extensions to f77, but if you were
concerned about portability, then you would not have used any of those
features in publicly distributed codes. I can even still remember the
compiler messages, helpfully telling me that such arrays had to be dummy
arguments and not local arrays, or that my statement function was out of
order or missing the "=". I also remember using IMPLICIT CHARACTER(A-Z)
as a workaround to mimic some of the functionality of IMPLICIT NONE.
That declaration would sometimes catch undeclared variables.
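
For anyone who has not seen that trick, a small hypothetical example of
why it sometimes works:

      SUBROUTINE DEMO(N, TOTAL)
C     Any undeclared name is implicitly typed CHARACTER, so a numeric
C     assignment to a misspelled variable becomes a compile-time type
C     error instead of silently creating a new variable.
      IMPLICIT CHARACTER(A-Z)
      INTEGER N, I
      DOUBLE PRECISION TOTAL
      TOTAL = 0D0
      DO 10 I = 1, N
         TOTAL = TOTAL + DBLE(I)
   10 CONTINUE
C     TOTL is undeclared, hence implicitly CHARACTER, and most
C     compilers reject this assignment.
      TOTL = TOTAL + 1D0
      END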

$.02 -Ron Shepard


FortranFan

unread,
May 2, 2017, 1:55:09 PM5/2/17
to
On Tuesday, May 2, 2017 at 11:07:44 AM UTC-4, Beliavsky wrote:

> ..
>
> Actually, the program that was giving me trouble with Intel Fortran .. is freezing the whole computer when I use gfortran GNU Fortran (GCC) 7.0.1 20170122 (experimental) on Windows 10. The program runs fine with gfortran 4.8.4 on ubuntu 14.04.3 and also with g95 on Windows. ..


@Beliavsky,

If you can post your actual code somewhere online, e.g. GitHub, or if you can create a minimal working example (MWE) out of it that illustrates the same issue(s) you are facing with gfortran on Windows and post the MWE here, someone (quite possibly you yourself, especially if you embark on the MWE task, which is always the ideal scenario) will definitely be able to determine the root cause of the problem and also develop possible options for resolving it.

FortranFan

unread,
May 2, 2017, 3:06:25 PM5/2/17
to
On Tuesday, May 2, 2017 at 12:11:17 PM UTC-4, Ron Shepard wrote:

> ..
>
> My main concern is not answering beginner programming questions. My
> concerns also include the most knowlegable and sophisticated programmers
> using the most sophisticated supercomputing hardware that is available
> now, and over the next decades, and their jobs crash somewhere in the
> middle because they exhaust some memory limit, and yet they have no
> ability to query the system to avoid those crashes. .. It really
> is a universal problem, it has been the elephant in the room for the
> last 20+ years. ..


@Ron Shepard,

Seems to me there are a few issues here:

1) There is a problem with the terminology you are using which causes confusion e.g., you seem to be using terms such as "allocation" or "allocated" and heap/stack very loosely as in, "There are still legacy fortran codes that .. allocate a block of memory at the beginning of execution of the code and perform their own memory management within that heap/stack. That is the way we did memory allocation in the 70s and 80s with f77. When you do memory allocation that way, you always have available the amount of memory left and the amount of memory that has been allocated". Now I only have bookish knowledge on FORTRAN 77 and prior revisions, but it appears legacy codes were DECLARING 'large' globs of data either statically like in a main program or globally like in a COMMON block. Declaring arrays is not the same as memory allocation, at least in terms of how it is understood today. Also, there is no reference to words such as heap or stack in the ANSI FORTRAN 77 standard; in the legacy codes I've seen, there is no FORTRAN-specific management of memory within heap/stack. So whatever was done in legacy codes must have been outside the purview of the standard. So that was Ok then, why should the standard today get involved with anything related to heap or the stack?

2) Many of your comments on this, within this thread and previously, seem to suggest the legacy approach was a 'good' thing and that the facilities introduced in the standard beginning with Fortran 90 with the ALLOCATE statement, etc. have made things worse than in the good old times. It comes across that way at least with your posts in this thread that went, "Okay, so here we are, yet again, with another thread about memory allocation resource problems .." If you indeed meant to convey such an impression, then, for one, I do not agree with you and, for two, you have not provided any convincing data to support your argument.

3) But now you write, "Current codes, using f90+ compilers, are still using the old fashioned f77 style memory management (allocate a big workspace and divide it up) with all of its problems and, perhaps even more importantly, giving up many of the advantages of the f90 functionality with memory management." Again the word 'allocate' instead of 'declare'. Then no evidence to support your point about "current codes". Consider this thread: what indication do you have that OP is "still using the old fashioned f77 style memory management"?

4) You seem to work a lot with supercomputers, having done so for a number of years. Presumably you have had and continue to have dealings with Cray Inc. Now Cray appears to have a big influence on the standard and is usually one of the first to get a new standard revision implemented. So keeping the Cray environment in mind (which I know nothing of), what would it take to introduce into the Fortran standard a means to inquire about available memory? If what you have in mind can be done on Cray, that'll be a place to start; if not, it's likely a lost cause. Can you provide information on what's possible from your supercomputer experience that other readers here as well as the standards committee can learn from?

5) If you know what's possible on Cray and other supercomputer environments, then how 'portable' is the approach on, say, Linux and Mac? Can you report on that here with some details? As mentioned at Microsoft development site (MSDN) and elsewhere, there are Windows APIs that can provide partial information on 'available memory' but there are no equivalent options to those on Linux. And what's accessible on Linux today isn't feasible on Windows. So what can be done in the standard that will be consistent and portable across all possible systems, now and in the future? You don't seem willing to address this aspect.

6) Take the silly example I posted involving FORTRAN 77+extensions: if a developer wants to avoid program exceptions with it and the language indeed provided what you say it should, can you show with 'pseudo code' what a developer would do and how that would prevent issues any better than today? Here's the program crash example again for your reference:

-- begin FORTRAN 77+extensions example --
C stovfl.for
      SUBROUTINE STOVFL(N)
      IMPLICIT NONE
      INTEGER N
      INTEGER I
      DOUBLE PRECISION X(N)
      DOUBLE PRECISION SUMX
      SUMX = 0D0
      DO 10 I=1,N
         X(N)=DBLE(I)
         SUMX=SUMX+X(N)
   10 CONTINUE
      WRITE(6,11) SUMX
   11 FORMAT("SUMX=",1PG22.15)
      RETURN
      END SUBROUTINE
C main.for
      IMPLICIT NONE
      INTEGER NBIG
      PARAMETER(NBIG=130000)
      CALL STOVFL(NBIG)
      STOP
      END
-- end example --

-- begin compile, link, and execution sequence --
C:\Fortran\MythBuster\c92\sor>f77 /c stovfl.for
Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update A)
Copyright 2001 Compaq Computer Corp. All rights reserved.

stovfl.for

C:\Fortran\MythBuster\c92\sor>f77 /c main.for
Compaq Visual Fortran Optimizing Compiler Version 6.6 (Update A)
Copyright 2001 Compaq Computer Corp. All rights reserved.

main.for

C:\Fortran\MythBuster\c92\sor>link /out:stovfl.exe main.obj stovfl.obj
Microsoft (R) Incremental Linker Version 6.00.8447
Copyright (C) Microsoft Corp 1992-1998. All rights reserved.


C:\Fortran\MythBuster\c92\sor>stovfl.exe

forrtl: severe (170): Program Exception - stack overflow
Image PC Routine Line Source
stovfl.exe 00401081 Unknown Unknown Unknown

C:\Fortran\MythBuster\c92\sor>
-- end sequence --

jfh

unread,
May 2, 2017, 7:36:45 PM5/2/17
to
An exception: some of us have several compilers available on one computer and need to know what numeric data types they allow and what their properties are. That's why I wrote kinds.f90 and kinds03.f90 to find out. They are in these sections of http://homepages.ecs.vuw.ac.nz/~harper/fortranstuff.shtml

. . . . Printing properties of the various real and integer kinds
. . . . . . without the ISO or IEEE intrinsic modules, kinds.f90: here
. . . . . . with those Fortran 2003 intrinsic modules, kinds03.f90: here

For example some compilers have a precision 18 real data type, others don't.
If you ask for selected_real_kind(18) on one of the latter you may well get precision 33 and find your program running very slowly.

Some compilers have a range 38 integer data type, others don't. And some don't complain when your integers overflow.

Ron Shepard

unread,
May 2, 2017, 9:47:50 PM5/2/17
to
On 5/2/17 2:06 PM, FortranFan wrote:
> @Ron Shepard,
>
> Seems to me there are a few issues here:
>
> 1) There is a problem with the terminology you are using which causes
> confusion e.g., you seem to be using terms such as "allocation" or
> "allocated" and heap/stack very loosely as in, "There are still
> legacy fortran codes that .. allocate a block of memory at the
> beginning of execution of the code and perform their own memory
> management within that heap/stack. That is the way we did memory
> allocation in the 70s and 80s with f77. When you do memory allocation
> that way, you always have available the amount of memory left and the
> amount of memory that has been allocated". Now I only have bookish
> knowledge on FORTRAN 77 and prior revisions, but it appears legacy
> codes were DECLARING 'large' globs of data either statically like in
> a main program or globally like in a COMMON block. Declaring arrays
> is not the same as memory allocation, at least in terms of how it is
> understood today.

Yes, in the 70's and earlier such work arrays were declared as local
arrays in the main program or they were declared in blank common. With
these approaches, the main program typically had to be recompiled with
the appropriate array declaration just prior to execution. But by the
80's most compilers supported various kinds of runtime memory allocation.
I mentioned Cray pointers and interfaces to malloc previously. With
these approaches, the work array could be allocated at runtime, so the
program did not need to be recompiled.

> Also, there is no reference to words such as heap
> or stack in the ANSI FORTRAN 77 standard; in the legacy codes I've
> seen, there is no FORTRAN-specific management of memory within
> heap/stack. So whatever was done in legacy codes must have been
> outside the purview of the standard. So that was Ok then, why should
> the standard today get involved with anything related to heap or the
> stack?

It was the fortran program itself that managed the heap/stack, not the
compiler as it is now. The programmer would compute integer indexes into
the workspace array, and pass those array elements as actual arguments
to the subroutines.

>
> 2) Many of your comments on this, within this thread and previously,
> seem to suggest the legacy approach was a 'good' thing and that the
> facilities introduced in the standard beginning with Fortran 90 with
> the ALLOCATE statement, etc. have made things worse than in the good
> old times. [...]

No, I've pretty much argued the opposite. It is this single critical
feature, the ability to query the system, that is missing with the f90
allocate/automatic approach.

>
> 3) But now you write, "Current codes, using f90+ compilers, are
> still using the old fashioned f77 style memory management (allocate a
> big workspace and divide it up) with all of its problems and, perhaps
> even more importantly, giving up many of the advantages of the f90
> functionality with memory management."

Ah, so you did see me mention the many advantages of the f90 approach. I
wonder why you even wrote your point 2) above?


> Again the word 'allocate'
> instead of 'declare'.

It was done both ways. I also explained this in my original post.


> Then no evidence to support your point about
> "current codes". Consider this thread: what indication do you have
> that OP is "still using the old fashioned f77 style memory
> management"?

In this thread, if the capability to query the system was available, the
OP could have used that to determine how much stack space was available.
He could then have determined how to change his code to stay within
those limits, for example, by changing automatic arrays to allocatable
arrays.
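
As a rough illustration of that last suggestion (not code from this
thread), the automatic array in the earlier STOVFL example could be
turned into an allocatable array, which moves the storage off the stack
and at least makes the failure testable via STAT=:

! Sketch only: allocatable instead of automatic storage.
subroutine stovfl_alloc(n)
   implicit none
   integer, intent(in) :: n
   integer :: i, stat
   double precision, allocatable :: x(:)
   double precision :: sumx
   allocate (x(n), stat=stat)
   if (stat /= 0) then
      write (*,*) 'allocation of x failed, n =', n
      return
   end if
   sumx = 0d0
   do i = 1, n
      x(i) = dble(i)
      sumx = sumx + x(i)
   end do
   write (*,'("SUMX=",1PG22.15)') sumx
end subroutine stovfl_alloc

(Later posts note that STAT= is not a reliable guard on systems that
overcommit memory.)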

> 4) You seem to work a lot with supercomputers, having done so for a
> number of years. Presumably you have had and continue to have
> dealings with Cray Inc. [...]

I have in the past, starting in the late 70s, but I have not used a cray
in about a decade.

> 5) If you know what's possible on Cray and other supercomputer
> environments, then how 'portable' is the approach on, say, Linux and
> Mac? [...]

As I said before, I think this is something that needs cooperation from
both the compiler and the OS.


> 6) Take the silly example I posted involving FORTRAN 77+extensions:
> if a developer wants to avoid program exceptions with it but the
> language indeed provided what you say it should, can you show with
> 'pseudo code' what a developer would do and how will that would
> prevent issues in any better manner than today? Here's the program
> crash example again for your reference:
>
> -- begin FORTRAN 77+extensions example --
> C stovfl.for
> SUBROUTINE STOVFL(N)
> IMPLICIT NONE
> INTEGER N
> INTEGER I

integer used, left

> DOUBLE PRECISION X(N)
> DOUBLE PRECISION SUMX
> SUMX = 0D0
> DO 10 I=1,N
> X(N)=DBLE(I)
> SUMX=SUMX+X(N)
> 10 CONTINUE
> WRITE(6,11) SUMX
> 11 FORMAT("SUMX=",1PG22.15)
> RETURN
> END SUBROUTINE
> C main.for
> IMPLICIT NONE
> INTEGER NBIG
> PARAMETER(NBIG=130000)

call mem_query('double precision', 'automatic', used, left)
if (NBIG > left) then
write(*,*) 'not enough memory. NBIG=', NBIG, ' left=', left
stop
endif

> CALL STOVFL(NBIG)
> STOP
> END
> -- end example --

There are, of course, other ways to specify the data type. The above is
just an example to show the functionality.

The above would just show what the problem is, it would not solve
anything as it is. However, the next step would be to use the values
returned to change the code. One possibility might be to switch to
allocatable rather than automatic arrays. Another might be to block the
code.

call mem_query('double precision', 'automatic', used, left)
size = min(left,NBIG)
CALL STOVFL(size,NBIG)

This would prevent the memory crash while still using the automatic
array, but of course it requires STOVFL() to loop over the blocks until
all NBIG elements have been processed. That would be trivial to do in
your example code, but actual programs might require substantial effort,
changing entire algorithms, or perhaps trading computational effort for
the lower memory requirements.
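
A hedged sketch of what that blocked version might look like
(hypothetical code matching the two-argument call above):

C     Process NTOT elements in blocks of at most NBLK, so the
C     automatic array never exceeds the size the query said was safe.
      SUBROUTINE STOVFL(NBLK, NTOT)
      INTEGER NBLK, NTOT
      INTEGER I, M, DONE
      DOUBLE PRECISION X(NBLK)
      DOUBLE PRECISION SUMX
      SUMX = 0D0
      DONE = 0
   20 CONTINUE
      M = MIN(NBLK, NTOT-DONE)
      DO 10 I = 1, M
         X(I) = DBLE(DONE+I)
         SUMX = SUMX + X(I)
   10 CONTINUE
      DONE = DONE + M
      IF (DONE .LT. NTOT) GO TO 20
      WRITE(6,11) SUMX
   11 FORMAT("SUMX=",1PG22.15)
      END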

$.02 -Ron Shepard

campbel...@gmail.com

unread,
May 3, 2017, 8:19:04 AM5/3/17
to
Ron,

Are you suggesting that a routine "call mem_query('double precision', 'automatic', used, left)" be provided or do you have it already ?

I think the Windows API GlobalMemoryStatusEx(mdata) already provides most of the functionality you are asking for.

Also, Windows programs do not typically crash when you exceed the available physical memory; they start to utilise paging/virtual memory, although very/incredibly slowly. It can appear that the program and PC have locked up.

Beliavsky,

I wonder if this is the case with your example reported above. You should run task manager to see what is happening. It may simply be transitioning to virtual memory. I would also recommend trying equation.com's gFortran 6.3.0 rather than gfortran GNU Fortran (GCC) 7.0.1 20170122 (experimental). Certainly use task manager and monitor memory usage and page faults.

My gripe, which no one appears to agree with, is that local, automatic or temporary arrays should not be allocated on the (small) stack by the compiler, but go to the heap where there is greater capacity. Stack overflow is a very unhelpful response.

Gary Scott

unread,
May 3, 2017, 9:02:17 AM5/3/17
to
If they can fit on the stack I've no problem, but I'm unsure whether
it's possible to first try the stack and, if it won't fit, fall back to
the heap (haven't explored).

Richard Maine

unread,
May 3, 2017, 9:33:55 AM5/3/17
to
<campbel...@gmail.com> wrote:

> My gripe, which no one appears to agree with, is that local, automatic or
> temporary arrays should not be allocated on the (small) stack by the
> compiler, but go to the heap where there is greater capacity.

I've expressed agreement with that. But I just said it once, which
likely got lost in all the other posts on the topic.

I'd be happy with either of two options (or I suppose other more
esoteric ones). Either don't put arrays on the stack or make the stack
decently large. I don't really care which is done, but whatever it is
needs to be by default - not via some option that people won't discover
without careful investigation. Simple programs that use modest amounts
of memory (by today's standards) should not fail with cryptic messages
using default compiler settings.

dpb

unread,
May 3, 2017, 2:21:29 PM5/3/17
to
On 05/03/2017 7:19 AM, campbel...@gmail.com wrote:
> On Wednesday, May 3, 2017 at 11:47:50 AM UTC+10, Ron Shepard wrote:
...

>> As I said before, I think this is something that needs cooperation from
>> both the compiler and the OS.
>>
...

>>> C main.for
>>> IMPLICIT NONE
>>> INTEGER NBIG
>>> PARAMETER(NBIG=130000)
>>
>> call mem_query('double precision', 'automatic', used, left)
>> if (NBIG> left) then
>> write(*,*) 'not enough memory. NBIG=', NBIG, ' left=', left
>> stop
>> endif
>>
>>> CALL STOVFL(NBIG)
>>> STOP
>>> END
>>> -- end example --
...

>> The above would just show what the problem is, it would not solve
>> anything as it is. However, the next step would be to use the values
>> returned to change the code. One possibility might be to switch to
>> allocatable rather than automatic arrays. Another might be to block the
>> code.
>>
>> call mem_query('double precision', 'automatic', used, left)
>> size = min(left,NBIG)
>> CALL STOVFL(size,NBIG)
>>
>> This would prevent the memory crash while still using the automatic
>> array, but of course it requires STOVFL() to loop over the blocks until
>> all NBIG elements have been processed. That would be trivial to do in
>> your example code, but actual programs might require substantial effort,
>> changing entire algorithms, or perhaps trading computational effort for
>> the lower memory requirements.
...

> I think Windows API GlobalMemoryStatusEx(mdata) already provides
> most of the functionality you are asking form.

The system call provides the overall totals of physical installed memory
and free memory, but that alone isn't sufficient to solve the problem
Ron is trying to address.

Just because there are M free bytes available overall, it does _NOT_
follow that one can ALLOCATE an array of length M/8 doubles. Fortran
must have _contiguous_ memory and the OS numbers are only totals of
everything that is not allocated by some process. Fragmentation of
memory can prevent one from being able to allocate an additional array
quite a lot smaller than what one might think possible given the free
memory.

> Also, windows programs do not typically crash when you exceed the
available physical memory, but start to utilise paging/virtual memory,
although very/incredibly slowly. It can appear that the program and PC
have locked up.
...

From a practical standpoint I'd argue there's no difference between the
two, and if one were several days into a run at the time, that's most
annoying at best, particularly if one knows that, had the program known,
there were other things it _could_ have done to prevent the situation.

--

herrman...@gmail.com

unread,
May 3, 2017, 5:24:00 PM5/3/17
to
On Wednesday, May 3, 2017 at 11:21:29 AM UTC-7, dpb wrote:
> On 05/03/2017 7:19 AM, campbelljohnd...@gmail.com wrote:

(snip)

> > I think Windows API GlobalMemoryStatusEx(mdata) already provides
> > most of the functionality you are asking form.

> The system call provides the overall totals of physical installed memory
> and free memory, but that alone isn't sufficient to solve the problem
> Ron is trying to address.

> Just because there are M free bytes available overall, it does _NOT_
> follow that one can ALLOCATE an array of length M/8 doubles. Fortran
> must have _contiguous_ memory and the OS numbers are only totals of
> everything that is not allocated by some process. Fragmentation of
> memory can prevent one from being able to allocate an additional array
> of quite a lot smaller than what one might thing possible given the free
> memory.

In the days before VS, fragmentation of real memory was a problem.
Now with VS, you mostly need to worry about fragmenting that, and
with lazy allocation, even that should rarely be a problem.

As noted previously, my system with a 1TB disk and 8GB real memory
allows allocating over 140TB of virtual memory.
If I really wanted to, I could put a large disk array on, and
have a large fraction of that as actual virtual memory.

> > Also, windows programs do not typically crash when you exceed the
> available physical memory, but start to utilise paging/virtual memory,
> although very/incredibly slowly. It can appear that the program and PC
> have locked up.

> From a practical standpoint I'd argue there's no difference between the
> two, and if one were several days into a run at the time, that's most
> annoying at best, particularly if one knew that if had known, there were
> other things the program _could_ have done to prevent the situation.

Well, there is one difference in that, in many cases, once you see
what it is doing, you can kill some other programs to free up some memory.
But yes, once you use enough real memory that there is nothing else
to free up, then you are stuck. And only someone who really knows
the way a program uses real memory will have a chance to fix that.

I have programs that make successive passes through a large array.
If the system can't keep it in real memory, then it pretty much swaps the
whole thing out to disk, and back in, on each pass through.


William Clodius

unread,
May 3, 2017, 11:34:48 PM5/3/17
to
Ron Shepard <nos...@nowhere.org> wrote:

> <snip>
>
> So f90 allocatable and automatic arrays are better than this approach is
> every possible way. Every possible way *EXCEPT* the ability to query for
> the amount of memory used/left at any time. But that one little detail
> is actually very important in many situations. I mentioned several of
> those in my previous post. And it is that one missing piece of
> functionality that has resulted in many current fortran programs still
> using the old-fashioned f77 style manual heap/stack management approach,
> with all of its numerous problems.
> <snip>
How do you define "the amount of memory used/left at any time"?

Quoting another post of yours
"I'm guessing that it returns the amount of virtual memory. That might
be one of the useful memory resources to query about, but not the only
one. Swap space, physical memory, queue limits, sandbox limits, etc. are
some other ones that might be important."

So you are aware of some of the complexities. The system does not have
one strict number either for memory used or for memory left. Virtual
memory is usually well defined, but only by making it so large that it
is a limit only if the user makes an egregious error in his memory
requests. Other numbers are soft. Swap space is affected by file
creation and deletion, particularly by other applications. Physical RAM
is a soft upper bound on the efficient memory useage, but from that you
need to subtract a (probably poorly defined) upper bound on the minimum
memory the OS needs for safe operation and perhaps memory for other
programs running on the systrem. Buffered I/O can potentially affect
terabytes of memory for short periods. Sandbox limits address some of
these problems, but are not available for many systems.

campbel...@gmail.com

unread,
May 4, 2017, 9:08:24 AM5/4/17
to
You claim "my system with a 1TB disk and 8GB real memory allows allocating over 140TB of virtual memory." I can't replicate this.

On Windows 7, you can't allocate an array larger than the limit of the paging file allocation. Either you have a very large paging allocation, or you may have had a problem with the array size definition.

The following program runs with gFortran 6.2.0 on Windows 7 and demonstrates:

When allocating an array, the array size must be smaller than the available paging space. This allocation does not affect the physical memory allocation until the array is used (initialised)

When the new array size exceeds the available physical memory, then it is swapped to pagefile.sys, which appears as a system hang or crash. I have not bothered to wait for this to clear. ( We have short memories of the performance of old VAX/Pr1mos which ran from the paging file. )


Running the program and task manager appears to confirm the results.
Hopefully running the program will demonstrate these outcomes.

The results identify 3 memory sizes:
1) physical memory which is provided from a pool when arrays are used.
2) paging memory on pagefile.sys which defines a limit when allocating.
3) Virtual memory, or virtual address space, of 8 terabytes (2^43 bytes)

There is often confusion between successfully allocating an array (stat=0) and using the array when physical memory pages are allocated. If available physical memory is exceeded, everything appears to hang.

! Program to allocate a lot of memory to test allocate
use ISO_FORTRAN_ENV

integer*8 :: one_gig = 1024*1024*(1024/8) ! size of 1 gb array
integer*8 :: ipos
integer*8, allocatable :: array(:)
integer*4 :: igig, stat_a, stat_d
integer*4 :: max_gb_page = 32
integer*4 :: max_gb_phys = 6
!
write (*,*) 'Compiler Version :',Compiler_Version ()
write (*,*) 'Compiler Options :',Compiler_Options ()
!
! first try to allocate large arrays and see where ALLOCATE changes memory availability
do igig = 1,max_gb_page
ipos = igig * one_gig
allocate ( array(ipos), stat=stat_a )
call report_memory_usage (' allocate only')
if (stat_a /= 0) exit
deallocate ( array, stat=stat_d )
write (*,*) igig,' stat=',stat_a, stat_d, ipos
end do
write (*,*) igig,' stat=',stat_a, stat_d, ' FAIL'
!
! now try to allocate and use the large arrays and see where memory availability changes
do igig = 1,max_gb_phys
ipos = igig * one_gig
allocate ( array(ipos), stat=stat_a )
call report_memory_usage (' first allocate')
if (stat_a /= 0) exit
call use_array ( array, ipos )
call report_memory_usage (' then initialise')
deallocate ( array, stat=stat_d )
write (*,*) igig,' stat=',stat_a, stat_d, ipos
end do
write (*,*) igig,' stat=',stat_a, stat_d, ' FAIL'
!
end

subroutine use_array ( array, ipos )
integer*8 :: ipos, i
integer*8 :: array(ipos)
!
do i = 1,ipos
array(i) = ipos-i*2
end do
write (*,*) 'array initialised', ipos
end subroutine use_array

subroutine report_memory_usage (string)
use ISO_C_BINDING

integer, parameter :: knd = 8 ! 8 byte integer

interface
function GlobalMemoryStatusEx(tick8) bind(C, name="GlobalMemoryStatusEx")
use ISO_C_BINDING
!GCC$ ATTRIBUTES STDCALL :: GlobalMemoryStatusEx
logical(C_BOOL) GlobalMemoryStatusEx
integer(C_LONG_LONG) tick8(8)
end function GlobalMemoryStatusEx
end interface
!
type MEMORYSTATUSEX
sequence
integer dwLength
integer dwMemoryLoad
integer(knd) ullTotalPhys
integer(knd) ullAvailPhys
integer(knd) ullTotalPageFile
integer(knd) ullAvailPageFile
integer(knd) ullTotalVirtual
integer(knd) ullAvailVirtual
integer(knd) ullAvailExtendedVirtual
end type
type (MEMORYSTATUSEX) :: mdata

character string*(*)
!
integer*4 :: tick4(16)
integer*8 :: tick8(8)
equivalence (tick4,tick8)

integer(knd) :: lastAvailPhys = 0

real*8 gb
external gb

! mdata%dwLength = 64
tick4 = 0
tick4(1) = 64

if ( GlobalMemoryStatusEx (tick8) ) then
mdata%dwMemoryLoad = tick4(2)
mdata%ullTotalPhys = tick8(2)
mdata%ullAvailPhys = tick8(3)
mdata%ullTotalPageFile = tick8(4)
mdata%ullAvailPageFile = tick8(5)
mdata%ullTotalVirtual = tick8(6)
mdata%ullAvailVirtual = tick8(7)
mdata%ullAvailExtendedVirtual = tick8(8)

print *, " "
print *, " Memory Report ",string
print *, "Percentage of physical memory in use ", mdata%dwMemoryLoad, " %"
print 11, "Amount of actual physical memory ", gb(mdata%ullTotalPhys)
print 11, "Amount of physical memory available ", gb(mdata%ullAvailPhys)
print 11, "PageFile/Committed memory limit ", gb(mdata%ullTotalPageFile)
print 11, "Amount of memory current process can commit ", gb(mdata%ullAvailPageFile)
print 11, "Size of virtual address space ", gb(mdata%ullTotalVirtual)
print 11, "Amount of unreserved/uncommitted memory ", gb(mdata%ullAvailVirtual)
11 format(1x,a,f0.3,' gb')

else
print*,"Report Memory Failed ", string
end if

end subroutine report_memory_usage

real*8 function gb ( bytes )
integer*8 :: bytes
real*8 :: one_gb = 1024.*1024.*1024. ! size of 1 gb

gb = dble (bytes) / one_gb

end function gb

( I have adapted the program to gFortran from Silverfrost FTN95, so hopefully I have not introduced any errors. )

FortranFan

unread,
May 4, 2017, 11:49:50 AM5/4/17
to
On Tuesday, May 2, 2017 at 9:47:50 PM UTC-4, Ron Shepard wrote:

> ..
>
> As I said before, I think this is something that needs cooperation from
> both the compiler and the OS.
>
> ..

@Ron Shepard,

It is good to see you now agree on the role of the OS though you are still talking about cooperation which seems a misplaced notion. Until and unless all the compute environments out there start providing *relevant* information in a consistent manner which is not trivial by any means, what is a Fortran processor to do? For example, Microsoft offers no way to check for available stack on Windows which is what the mem_query subprogram would need to do in the example you marked up. Sure there are jerry-rigged functions out there to query the stack but none that are thread-safe and that is no way for a standard to proceed. And on Windows, the available memory information for allocations on the heap is of little to no practical use for a running program that wants to manage memory in order to complete its computing tasks reliably.

Why would Windows under Microsoft care to cooperate with a Fortran compiler?

JB

unread,
May 4, 2017, 1:50:49 PM5/4/17
to
On 2017-05-04, campbel...@gmail.com <campbel...@gmail.com> wrote:
> You claim "my system with a 1TB disk and 8GB real memory allows allocating over 140TB of virtual memory." I can't replicate this.
>
> On Windows 7, you can't allocate an array larger than the limit of the paging file allocation.

Linux allows memory to be overcommitted by default. There are a lot of
applications that allocate a lot of virtual memory but actually use
very little of it, though there are also downsides to it, of course.

There is a knob you can turn if you'd prefer otherwise.



--
JB

JB

unread,
May 4, 2017, 1:56:16 PM5/4/17
to
On 2017-05-02, Stefano Zaghi <stefan...@gmail.com> wrote:
> As others, I think that such aspects (e.g. memory "balance") are strictly related to the running architecture, thus it is difficult for Fortran committee to take them into account. Are such a kind of queries (e.g. memory "balance") available in C?

No, as in the ISO C standard specifies no such thing. There are
usually various OS-specific things which can be used to give hints
about the memory state of the system. Others have mentioned a few
Windows API's; on Linux you can e.g. check /proc/self/status and/or
the cgroup memory filesystem if that is in use (typically used on
clusters/supercomputers to limit the amount of memory jobs can use).
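
As a hedged, Linux-specific sketch (and VmSize/VmRSS are only two of
the many numbers one could look at), a program can read its own entry:

! Print this process's VmSize and VmRSS lines from /proc/self/status.
program self_status
   implicit none
   character(len=256) :: line
   integer :: u, ios
   open (newunit=u, file='/proc/self/status', status='old', &
         action='read', iostat=ios)
   if (ios /= 0) stop 'cannot open /proc/self/status'
   do
      read (u, '(a)', iostat=ios) line
      if (ios /= 0) exit
      if (line(1:6) == 'VmSize' .or. line(1:5) == 'VmRSS') &
         write (*,'(a)') trim(line)
   end do
   close (u)
end program self_status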


--
JB

JB

unread,
May 4, 2017, 2:10:00 PM5/4/17
to
On 2017-05-02, herrman...@gmail.com <herrman...@gmail.com> wrote:
> It seems to me that many inventions were made before they should
> have been. The Wright brothers started flying before most of
> aerodynamics was even slightly understood.

As an aside, one of the factors that led to the Wright brothers
succeeding where their competitors failed is that they actually did
advance the state of aerodynamics: they built their own wind tunnel
where they tested different airfoil shapes, and were thus able to
build much better wings and propellers.

I guess they understood squat about all the math and physics required
for a theoretical approach to fluid dynamics.

--
JB

JB

unread,
May 4, 2017, 2:16:59 PM5/4/17
to
On 2017-05-02, Ron Shepard <nos...@nowhere.org> wrote:
> So f90 allocatable and automatic arrays are better than this approach is
> every possible way. Every possible way *EXCEPT* the ability to query for
> the amount of memory used/left at any time. But that one little detail
> is actually very important in many situations. I mentioned several of
> those in my previous post. And it is that one missing piece of
> functionality that has resulted in many current fortran programs still
> using the old-fashioned f77 style manual heap/stack management approach,
> with all of its numerous problems.

If you think that's all that's required, why don't you create a global
parameter

integer(int64), parameter :: arbitrary_vm_limit = some_value

and a variable

integer(int64) :: currently_allocated = 0

and then every time you execute an ALLOCATE statement you increment
currently_allocated by the number of bytes allocated, and when you
call DEALLOCATE you decrement it. Then if currently_allocated exceeds
arbitrary_vm_limit you can abort with a friendly error message
directing the user to increase some_value and recompile the
application.

Of course, these measure just the virtual memory and not the actual
physical memory used, but it's no different from the F77 way with a
large static array.
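
A minimal sketch of that bookkeeping (module and routine names are
invented here):

module mem_budget
   use iso_fortran_env, only: int64
   implicit none
   ! Arbitrary self-imposed limit; recompile to change it.
   integer(int64), parameter :: arbitrary_vm_limit = 8_int64 * 1024_int64**3
   integer(int64) :: currently_allocated = 0
contains
   subroutine note_alloc(nbytes)
      integer(int64), intent(in) :: nbytes
      currently_allocated = currently_allocated + nbytes
      if (currently_allocated > arbitrary_vm_limit) then
         write (*,*) 'memory budget exceeded:', currently_allocated, 'bytes'
         error stop
      end if
   end subroutine note_alloc
   subroutine note_dealloc(nbytes)
      integer(int64), intent(in) :: nbytes
      currently_allocated = currently_allocated - nbytes
   end subroutine note_dealloc
end module mem_budget

After a successful allocate(x(n)), the caller would do something like
call note_alloc(int(n,int64)*(storage_size(x)/8)), with the matching
note_dealloc just before the deallocate.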


--
JB

JB

unread,
May 4, 2017, 2:23:36 PM5/4/17
to
On 2017-04-29, Beliavsky <beli...@aol.com> wrote:
> When arrays are ALLOCATEd, they can check the STAT.

That's often much less useful than one might hope. On GFortran, STAT
/= 0 essentially means that the C malloc() returned NULL. However, on
Linux this essentially never happens as the OS overcommits virtual
memory. Then later on when you try to use all of that memory the OS
realizes that, oops, I don't actually have that much memory available,
and subsequently kills some victim process (the infamous OOM killer),
usually the process that tried to use a lot of real memory.

--
JB

herrman...@gmail.com

unread,
May 4, 2017, 2:34:00 PM5/4/17
to
On Thursday, May 4, 2017 at 6:08:24 AM UTC-7, campbel...@gmail.com wrote:

(snip)

> You claim "my system with a 1TB disk and 8GB real memory allows
> allocating over 140TB of virtual memory." I can't replicate this.

It is called lazy allocation.

Linux has been doing it for some time now, mine is on OS X 10.10.5

The system allocates virtual address space, and then points it all
to a single page containing all zeros, and marks that pages as
read only. Then, when you attempt to write to a page, the system
allocates an actual page for you, zeros it, and then retries the
write on the new page.

The actual operation is a little more complicated, as even
allocating enough page tables for 140TB would be too big.

In the 24 to 32 bit virtual storage days, with IBM S/370
and DEC/VAX as examples, two levels of tables are used.
IBM calls them segment tables and page tables, VAX calls
them both page tables, but in different parts of address
space. That allows for swapping of page tables.

For 64 bit systems, it is usual to have five levels of
tables, in theory, but there are ways to simplify things
until a full 64 bit address space is needed. Some systems
use 48 bits of actual address space, with half at the top,
and half at the bottom, of the theoretical 64 bit space.

Once all five levels have been set up, it can do copy
on write to create actual tables when a page is accessed
and actually needed.

If you have a C compiler, even if you don't know C, copy
and paste this program, compile, and run it:

#include <stdio.h>
#include <stdlib.h>

int main() {
    size_t i, j, k = 0;  /* k must start at zero for the sum below */
    for(i=1, j=2; j>0; i *=2, j*=2) ;
    for( ; i>0; i /= 2) if(malloc(i-16)) k += i;
    printf("%zu bytes\n", k);
}


and see what it prints out.

You might also note if it runs fast or slow.


jfh

unread,
May 4, 2017, 8:32:35 PM5/4/17
to
On my x86_64 system using Linux it ran very fast and printed 17184064576 bytes.
That number is about 1.000244*2**34.

Wolfgang Kilian

unread,
May 5, 2017, 2:28:27 AM5/5/17
to
On 04.05.2017 20:33, herrman...@gmail.com wrote:
> On Thursday, May 4, 2017 at 6:08:24 AM UTC-7, campbel...@gmail.com wrote:
>
> (snip)
>
>> You claim "my system with a 1TB disk and 8GB real memory allows
>> allocating over 140TB of virtual memory." I can't replicate this.
>
> It is called lazy allocation.
>
> Linux has been doing it for some time now, mine is on OS X 10.10.5
>
> The system allocates virtual address space, and then points it all
> to a single page containing all zeros, and marks that pages as
> read only. Then, when you attempt to write to a page, the system
> allocates an actual page for you, zeros it, and then retries the
> write on the new page.

I could imagine that this applies even to standard 'good old F77'
programs, running on Linux. If the compiler runtime library relies on
the OS for its memory needs, you might be able to declare a huge array
in the program and even write to sections of it without problems, but
run into an out-of-memory error when you try to access some new page
later. You would have to write data to the whole array before you can
be sure that it is actually available.
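
A rough sketch of that "touch everything" idea (a 4 KiB page size is
assumed; on a system that overcommits, any failure shows up here rather
than at the ALLOCATE):

program touch_pages
   implicit none
   integer, parameter :: page_bytes = 4096
   integer(8), parameter :: n = 100000000_8   ! ~800 MB of real(8)
   integer(8) :: i, stride
   real(8), allocatable :: a(:)
   integer :: stat
   allocate (a(n), stat=stat)
   if (stat /= 0) stop 'allocate failed'
   stride = page_bytes / 8                    ! array elements per page
   do i = 1, n, stride
      a(i) = 0.0d0                            ! force the page to be backed
   end do
   a(n) = 0.0d0
   write (*,*) 'whole array is now backed by real pages (or swap)'
end program touch_pages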

This depends on how static arrays are handled by the compiler/runtime.
But I don't see a contradiction with the F77 standard, maybe I'm
overlooking something.

In the context of the current thread, to my understanding: a query about
the available memory is meaningless, unless you have detailed control
over the operating system itself. This is the price to pay for the
benefits of a modern OS.

-- Wolfgang

--
E-mail: firstnameini...@domain.de
Domain: yahoo

campbel...@gmail.com

unread,
May 5, 2017, 3:09:53 AM5/5/17
to
On Friday, May 5, 2017 at 4:28:27 PM UTC+10, Wolfgang Kilian wrote:
>
> In the context of the current thread, to my understanding: a query about
> the available memory is meaningless, unless you have detailed control
> over the operating system itself. This is the price to pay for the
> benefits of a modern OS.
>
Following on with Ron Shepard's memory management discussion.

The benefit I see for a query about the available memory is knowing how much physical memory is still available, and then not exceeding this limit, as exceeding (or nearing) this capacity has a significant performance penalty. There is also the issue of other memory usage, especially disk caching. E.g., having an idea of available physical memory can affect the blocking strategy for an equation solver. (Other parameters, such as cache size, are also important.)

Unfortunately, as a Fortran programmer, I have a very limited control over the operating system, but have to cope best with what is provided.

I can see why available memory was not included in any standard, but it has always been a significant issue for using ALLOCATE or unlabelled COMMON or any paging memory OS. SYSTEM_CLOCK is in the standard.

Thomas Jahns

unread,
May 5, 2017, 7:24:05 AM5/5/17
to
On 05/04/2017 08:20 PM, JB wrote:
> That's often much less useful than one might hope. On GFortran, STAT
> /= 0 essentially means that the C malloc() returned NULL. However, on
> Linux this essentially never happens as the OS overcommits virtual
> memory. Then later on when you try to use all of that memory the OS
> realizes that, oops, I don't actually have that much memory available,
> and subsequently kills some victim process (the infamous OOM killer),
> usually the process that tried to use a lot of real memory.

Many mistake this for a language problem, when overcommitment of memory is
a system setting entirely independent of the language used to write the
running programs. If you want deterministic memory availability, simply set

/proc/sys/vm/overcommit_memory

to another value as explained in e.g.

man 5 proc
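
A quick, untested sketch (Linux only; the meanings 0/1/2 are described
in that man page) of checking the current setting from a Fortran
program:

program show_overcommit
  implicit none
  integer :: u, mode, ios
  ! 0 = heuristic overcommit (the default), 1 = always overcommit,
  ! 2 = strict accounting; see "man 5 proc" for the details
  open(newunit=u, file='/proc/sys/vm/overcommit_memory', &
       status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot read /proc/sys/vm/overcommit_memory'
  read(u, *) mode
  close(u)
  print *, 'vm.overcommit_memory =', mode
end program show_overcommit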

Perhaps someone can add the corresponding setting for Windows.

Thomas

JB

unread,
May 5, 2017, 8:44:02 AM5/5/17
to
On 2017-05-05, Wolfgang Kilian <kil...@invalid.com> wrote:
> On 04.05.2017 20:33, herrman...@gmail.com wrote:
>> It is called lazy allocation.
>>
>> Linux has been doing it for some time now, mine is on OS X 10.10.5
>>
>> The system allocates virtual address space, and then points it all
>> to a single page containing all zeros, and marks that page as
>> read only. Then, when you attempt to write to a page, the system
>> allocates an actual page for you, zeros it, and then retries the
>> write on the new page.
>
> I could imagine that this applies even to standard 'good old F77'
> programs, running on Linux.

Yes.

> If the compiler runtime library relies on
> the OS for its memory needs, you might be able to declare a huge array
> in the program and even write to sections of it without problems, but
> run into an out-of-memory error when you try to access some new page
> later. You would have to write data to the whole array before you can
> be sure that it is actually available.

Yes.

> This depends on how static arrays are handled by the compiler/runtime.

Typically the program loader/dynamic linker or whatever you want to
call it, takes care of allocating memory for static data before
invoking the main program.

From the perspective of the OS itself, static/stack/heap is all the
same in the end, just virtual memory that is demand allocated to
actual physical memory pages when needed.

> In the context of the current thread, to my understanding: a query about
> the available memory is meaningless, unless you have detailed control
> over the operating system itself. This is the price to pay for the
> benefits of a modern OS.

Indeed.


--
JB

Ron Shepard

unread,
May 5, 2017, 12:30:39 PM5/5/17
to
This only accounts for the arrays explicitly allocated by the
programmer. It does not account for automatic arrays or
compiler-generated intermediate arrays, and those arrays also count
against any used/remaining totals. That is, your job can still crash
by exceeding some memory limit, or it can freeze by exhausting some
memory resource.
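
As an illustration (just a hypothetical sketch), a subroutine like this
uses memory that never appears in any ALLOCATE statement:

subroutine scale_copy(n, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in)  :: x(n)
  real, intent(out) :: y(n)
  real :: work(n)   ! automatic array: storage is created by the compiler
  work = 2.0 * x
  y = work
end subroutine scale_copy

Whether work(n) lands on the stack or the heap depends on the compiler
and its options, and neither shows up in the programmer's own
bookkeeping.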

That is why the query is something that must be implemented in the
language; it is not just a matter of the programmer being careful and
keeping track of things. In my previous posts, I also mentioned that
there may be libraries used by the programmer (say linear algebra
libraries that need additional workspace, or graphical libraries that
allocate and manipulate graphical objects, or communications libraries
that allocate buffers in a parallel environment) that allocate memory
outside of the programmer's direct control.

Many (perhaps all) supercomputers use a batch runtime system. This is one
way that large computers attempt to maximize use of their expensive
resources. These systems have several batch queues, some for quick test
jobs, some for large-node small-memory jobs, some for small-node
large-memory jobs, and some for large-node large-memory jobs, and there
might be several versions based on runtime limits. When you submit the
job, you specify the limits of your job (time, nodes, memory), and you
are placed in the appropriate queue. If your job does not
actually run within those constraints, then your job is aborted. With
the PBS system, for example, you specify things like

#PBS -l walltime=10:00:00
#PBS -l nodes=2000:ppn=12
#PBS -l mem=2000GB

This declares that your job will finish within 10 hours, use 2000 nodes
with 12 processors per node, and use 1 GB per node, or 2 TB of memory
in total.

Your job can query these limits at runtime by examining environment
variables, and there are usually straightforward ways for your program
to query the running execution time and the number of nodes that your
job has allocated. This capability is built into the Fortran language,
so it can be done in a portable way. But an important missing piece is
the ability to query for current memory usage.
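
For instance, a small sketch of what is already portable (the exact
environment variable names are batch-system specific; PBS_NODEFILE is
only assumed here for illustration):

program query_batch_env
  implicit none
  character(len=256) :: nodefile
  integer :: length, status
  real :: t0, t1

  ! GET_ENVIRONMENT_VARIABLE is a standard intrinsic, so batch limits
  ! exported as environment variables can be read portably
  call get_environment_variable('PBS_NODEFILE', nodefile, length, status)
  if (status == 0) then
    print *, 'node list is in: ', trim(nodefile)
  else
    print *, 'PBS_NODEFILE not set (not running under PBS?)'
  end if

  ! elapsed/CPU time can likewise be queried with standard intrinsics ...
  call cpu_time(t0)
  ! ... do some work ...
  call cpu_time(t1)
  print *, 'cpu time used so far:', t1 - t0, 'seconds'

  ! ... but there is no standard intrinsic to ask how much memory is
  ! in use or still available
end program query_batch_env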

$.02 -Ron Shepard

William Clodius

unread,
May 5, 2017, 6:21:19 PM5/5/17
to

Thomas Koenig

unread,
May 6, 2017, 11:29:16 AM5/6/17
to
Ron Shepard <nos...@nowhere.org> wrote:

[Memory Check]

> That is why the query is something that must be implemented into the
> language, it is not just a matter of the programmer being careful and
> keeping track of things. In my previous posts, I also mentioned that
> there may be libraries used by the programmer (say linear algebra
> libraries that need additional workspace, or graphical libraries that
> allocate and manipulate graphical objects, or communications libraries
> that allocate buffers in a parallel environment) that allocate memory
> outside of the programmer's direct control.

It is also outside the programmer's control if somebody else starts
up a memory-hungry application. This is also outside the control
of the compiler writer, or the language definition.

If you need to control memory, you are very, very deep in
OS-specific territory. Making this part of the language is not
likely to succeed. It would probably be best to make this into
a library with very many #ifdefs ...

I know what I would do on Linux. I'd read in and parse
/proc/self/status, which, among other things, contains the lines

VmPeak: 5964 kB
VmSize: 5964 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 680 kB
VmRSS: 680 kB
RssAnon: 72 kB
RssFile: 608 kB
RssShmem: 0 kB
VmData: 324 kB
VmStk: 136 kB
VmExe: 32 kB
VmLib: 1800 kB
VmPTE: 32 kB
VmPMD: 12 kB
VmSwap: 0 kB

from which I could get the memory use of my own executable.
I could also read in and parse /proc/meminfo, which gives
information like

MemTotal: 4044868 kB
MemFree: 202028 kB
MemAvailable: 1596976 kB
Buffers: 29368 kB
Cached: 1624104 kB
SwapCached: 44 kB

from which it would be possible to make a choice about, for example,
which buffer size to allocate.
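
Just as a rough, untested sketch (Linux only), pulling MemAvailable
out of /proc/meminfo from a Fortran program could look like:

program mem_available
  implicit none
  character(len=256) :: line
  integer :: u, ios
  integer(8) :: kb

  open(newunit=u, file='/proc/meminfo', status='old', action='read', &
       iostat=ios)
  if (ios /= 0) stop 'cannot open /proc/meminfo'
  do
    read(u, '(a)', iostat=ios) line
    if (ios /= 0) exit
    if (index(line, 'MemAvailable:') == 1) then
      ! the line looks like "MemAvailable:    1596976 kB"
      read(line(14:), *) kb
      print *, 'MemAvailable =', kb, ' kB'
      exit
    end if
  end do
  close(u)
end program mem_available

The same kind of parsing would work for /proc/self/status, with the
usual caveat that the numbers are a Linux-specific snapshot.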

Now, let's say your computer runs AIX (not uncommon in HPC, I think...).
There you do... well, it seems there is a vmgetinfo syscall, which
lets you read all kinds of things. But again, I don't use AIX much,
so I might get that one wrong.

On NetBSD, there seems to be a /proc/meminfo file which gives
overall memory statistics. Dunno how to get to the
individual process memory, but if top can do it, so could
a Fortran program :-)

And so on...

> Many (perhaps all) supercomputers use a batch runtime system. This is one
> way that large computers attempt to maximize use of their expensive
> resources.

If you're writing for a particular supercomputer, you can write
this library for that system. That is no big deal.

Move to the next one, add another #ifdef.