system_clock and processor-dependent behavior

steve kargl

Jan 8, 2021, 12:54:48 PM

Could a kind soul run the following program with the Intel and NAG compilers?
I know the behavior under gfortran. I am curious to see the processor-dependent
behavior implemented by other compilers.

program foo

use iso_fortran_env, only : int32, int64, real64

implicit none

integer(int32) rate4, t1, t2
integer(int64) rate8, u1, u2
!
! Both Intel and gfortran should give the same timings as
! COUNT and COUNT_RATE variables have the same kind type
! parameter. This is documented by both gfortran and Intel.
!
print '(A)', 'Behavior documented to give valid timings with gfortran and Intel'
call system_clock(count_rate=rate4)
print '(A,I0)', 'rate4 = ', rate4
call system_clock(count_rate=rate8)
print '(A,I0)', 'rate8 = ', rate8

call system_clock(count=t1)
call loop
call system_clock(count=t2)
print '(A,F0.3,A)', 'time = ', real(t2-t1,real64)/rate4, ' seconds (int32/int32)'

call system_clock(count=u1)
call loop
call system_clock(count=u2)
print '(A,F0.3,A)', 'time = ', real(u2-u1,real64)/rate8, ' seconds (int64/int64)'
!
! gfortran gives an invalid time with the following, but gfortran
! also documents the behavior. What does Intel do?
!
print *
print '(A)', 'Behavior documented to give invalid timings with gfortran'
call system_clock(count_rate=rate8)
print '(A,I0)', 'rate8 = ', rate8
call system_clock(count=t1)
call loop
call system_clock(count=t2)
print '(A,F0.3,A)', 'time = ', real(t2-t1,real64)/rate8, ' seconds (int32/int64)'

call system_clock(count_rate=rate4)
print '(A,I0)', 'rate4 = ', rate4
call system_clock(count=u1)
call loop
call system_clock(count=u2)
print '(A,F0.3,A)', 'time = ', real(u2-u1,real64)/rate4, ' seconds (int64/int32)'

contains

subroutine loop
implicit none
real(real64) dx, x, y
integer i
integer, parameter :: n = 100000001
dx = 1._real64 / (n - 1)
do i = 1, n
x = 1000 + (i - 1) * dx
y = cos(x)
x = sin(x)
if (x * y > 2) stop 1
end do
end subroutine loop

end program foo

--
steve

Steve Lionel

Jan 8, 2021, 8:53:32 PM

On 1/8/2021 12:54 PM, steve kargl wrote:
> Could a kind soul run the following program with Intel and NAG compilers?
> I know the behavior under gfortran. I curious to see the processor-dependent
> behavior implemented by other compilers.

Intel 2021.1.2

Behavior documented to give valid timings with gfortran and Intel

rate4 = 10000
rate8 = 1000000
time = 1.023 seconds (int32/int32)
time = 1.043 seconds (int64/int64)

Behavior documented to give invalid timings with gfortran

rate8 = 1000000
time = .010 seconds (int32/int64)
rate4 = 10000
time = 110.200 seconds (int64/int32)

NAG 7.0.7036

Behavior documented to give valid timings with gfortran and Intel

rate4 = 10000000
rate8 = 10000000
time = 3.805 seconds (int32/int32)
time = 3.655 seconds (int64/int64)

Behavior documented to give invalid timings with gfortran

rate8 = 10000000
time = 3.611 seconds (int32/int64)
rate4 = 10000000
time = 3.627 seconds (int64/int32)

Why would you do such a thing? It makes no sense to me to call
SYSTEM_CLOCK with one integer kind and divide the difference by the
COUNT_RATE of a different kind. It "works" with NAG because the count
rate is the same for both kinds.
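
For anyone following the arithmetic, the odd Intel figures above are what you get when a count taken on one time scale is divided by the rate of another. A minimal sketch of that scaling (the module and function names are made up for illustration and are not part of any compiler):

```fortran
module mismatch_demo
   use iso_fortran_env, only : int64, real64
   implicit none
contains
   ! Time that gets reported when the counts tick at count_rate_of_count
   ! per second but the difference is divided by count_rate_used instead.
   pure function apparent_seconds(true_seconds, count_rate_of_count, count_rate_used) result(t)
      real(real64), intent(in) :: true_seconds
      integer(int64), intent(in) :: count_rate_of_count, count_rate_used
      real(real64) :: t
      t = true_seconds * real(count_rate_of_count, real64) / real(count_rate_used, real64)
   end function apparent_seconds
end module mismatch_demo
```

With Intel's rates, a run of about 1.1 s counted at 1000000 ticks/s and divided by 10000 comes out near 110 s, and a run of about 1 s counted at 10000 ticks/s and divided by 1000000 comes out near 0.01 s. Those match the int64/int32 and int32/int64 lines in the Intel output.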

--
Steve Lionel
ISO/IEC JTC1/SC22/WG5 (Fortran) Convenor
Retired Intel Fortran developer/support
Email: firstname at firstnamelastname dot com
Twitter: @DoctorFortran
LinkedIn: https://www.linkedin.com/in/stevelionel
Blog: https://stevelionel.com/drfortran
WG5: https://wg5-fortran.org

steve kargl

Jan 8, 2021, 9:29:32 PM

Steve Lionel wrote:

> On 1/8/2021 12:54 PM, steve kargl wrote:
>> Could a kind soul run the following program with Intel and NAG compilers?
>> I know the behavior under gfortran. I curious to see the processor-dependent
>> behavior implemented by other compilers.
>
> Intel 2021.1.2
>
> Behavior documented to give valid timings with gfortran and Intel
> rate4 = 10000
> rate8 = 1000000
> time = 1.023 seconds (int32/int32)
> time = 1.043 seconds (int64/int64)
>
> Behavior documented to give invalid timings with gfortran
> rate8 = 1000000
> time = .010 seconds (int32/int64)
> rate4 = 10000
> time = 110.200 seconds (int64/int32)

Thanks, Steve. These are the results I was expecting based on Intel's
on-line documentation. I do not have Intel Fortran installed on
my development systems (which run FreeBSD). Just finished (as
in a minute ago) installing on my Win10 laptop. Now, I need to learn
how to work in a Windows world.

> NAG 7.0.7036
>
> Behavior documented to give valid timings with gfortran and Intel
> rate4 = 10000000
> rate8 = 10000000
> time = 3.805 seconds (int32/int32)
> time = 3.655 seconds (int64/int64)
>
> Behavior documented to give invalid timings with gfortran
> rate8 = 10000000
> time = 3.611 seconds (int32/int64)
> rate4 = 10000000
> time = 3.627 seconds (int64/int32)

This is also nice to see. Between gfortran, Intel, and NAG we
can definitely see the processor-dependent behavior called
out in the Fortran standard.

> Why would you do such a thing? It makes no sense to me to call
> SYSTEM_CLOCK with one integer kind and divide the difference by the
> COUNT_RATE of a different kind. It "works" with NAG because the count
> rate is the same for both kinds.

I have spent the last 1.5 days trying to convince a person who submitted
a gfortran bug report that he was seeing processor-dependent behavior.
gfortran documents its choices in its manual. The bug reporter appealed to
Intel Fortran as the correct implementation. He suggested that the
count rate is set by the precision of the first argument, which Intel does
not document. The entertaining thread can be found here:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98577

--
steve

steve kargl

Jan 8, 2021, 11:07:30 PM

steve kargl wrote:

> Steve Lionel wrote:
>
>> Why would you do such a thing? It makes no sense to me to call
>> SYSTEM_CLOCK with one integer kind and divide the difference by the
>> COUNT_RATE of a different kind. It "works" with NAG because the count
>> rate is the same for both kinds.
>
> I have spent the last 1.5 days trying to convince a person, who submitted
> a gfortran bug report, that he was seeing processor-dependent behavior.
> gfortran documents its choices in its manual. Bug reporter appealed to
> Intel Fortran as the correct implementation. He suggested that the
> count rate is set by the precision of the first argument, which Intel does
> not document. The entertaining thead can be found here:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98577
>

Well, I finally figured out the bug reporter's problem. The following code snippet
is sufficient to illustrate it:

integer(4) r4
integer(8) c1, c2
call system_clock(c1, r4) ! Note positional arguments.
....
call system_clock( c2)

For gfortran, the count rate for the first call to system_clock() is
determined from min(kind(c1), kind(r4)) = 4, and happens to
use a millisecond time scale (r4 = 1000, c1 increments in milliseconds).
For the second call to system_clock(), kind(c2) = 8, and a nanosecond
time scale is used. Since c1 and c2 both count against a reference
(e.g., the Unix epoch), c2 >> c1. A very wrong elapsed time results from
(c2-c1)/real(r4).

According to the person who reported the bug, Intel Fortran chooses
the count rate based on the precision of the first argument (i.e., the kind
type parameter). This is a perfectly fine choice. What the bug
reporter may be missing is that Fortran allows keywords, so

call system_clock(count_rate=r4, count=c1) ! keyword
....
call system_clock( c2)

will give an equally bad elapsed time estimate.
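
For contrast, here is a minimal sketch of the pattern that sidesteps the problem with both gfortran and Intel: every SYSTEM_CLOCK argument has the same integer kind, so COUNT and COUNT_RATE refer to the same time scale. The module and procedure names are hypothetical, and the -1 return value for "no clock" is my own convention, not something the standard defines.

```fortran
module clock64
   use iso_fortran_env, only : int64, real64
   implicit none
   private
   public :: ticks, seconds_between
contains
   ! Current count, always requested as a 64-bit integer.
   function ticks() result(c)
      integer(int64) :: c
      call system_clock(count=c)
   end function ticks

   ! Convert two counts obtained from ticks() to seconds, using a
   ! COUNT_RATE of the same kind as the counts.
   function seconds_between(c1, c2) result(t)
      integer(int64), intent(in) :: c1, c2
      real(real64) :: t
      integer(int64) :: rate
      call system_clock(count_rate=rate)
      if (rate > 0) then
         t = real(c2 - c1, real64) / real(rate, real64)
      else
         t = -1.0_real64   ! COUNT_RATE is zero when there is no clock
      end if
   end function seconds_between
end module clock64
```

A caller would save c1 = ticks() before the work and c2 = ticks() after it, then print seconds_between(c1, c2). The sketch ignores counter wraparound, which is a reasonable simplification only when COUNT_MAX for the chosen kind is large.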

--
steve

gah4

Jan 8, 2021, 11:11:03 PM

On Friday, January 8, 2021 at 6:29:32 PM UTC-8, steve kargl wrote:

(snip)

> This is also nice to see. Between gfortran, intel, and nag we
> can definitely see the processor-dependent behavior called
> out the Fortran standard.

There is an old story:

Q: What's the difference between a bug and a feature?

A: A feature is documented.

Message has been deleted

steve kargl

Jan 9, 2021, 12:15:11 AM

مهدي شينون wrote:

>
> Someone is trying to hide the truth here.
> This is the original bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98577
>
> The bug was acquiring count and count_rate with the same call where Intel Fortran is consistent and gives the expected results.

No one is hiding the truth. I reported the PR number in
a follow up here. The code I asked to have run here, I wrote.
I wanted to see what the results were because you
refused to run the code when I asked. In fact, I asked
twice.

There is no bug in gfortran. There is nothing to fix. You are
seeing processor-dependent behavior. You seem to be unwilling
to read the documentation that comes with the compiler,
and after reading the Intel Fortran documentation for
system_clock(), it seems you have not read that documentation
either.

The Intel Fortran documentation at

https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/a-to-z-reference/s-1/system-clock.html

dated 4 Dec 2020 contains the following highlighted sentence:

"All integer arguments used must have the same integer kind parameter."

Your code with actual args of count=int64 and count_rate=int32
does not meet that requirement.

If you actually use gfortran, you certainly have an odd way of
saying "Thank You" to someone who has given to you for free
hundreds of bug fixes and several standard conforming features.

--
steve

Steve Lionel

Jan 9, 2021, 9:31:48 AM

On 1/8/2021 11:07 PM, steve kargl wrote:
> According to the person, who reported the bug, Intel Fortran chooses
> the count rate based on the precision of the first argument (ie., the kind
> type parameter). This is a perfectly fine choice.

It's even weirder than that. Intel doesn't enforce the documented (not
in the standard) restriction "All integer arguments used must have the
same integer kind parameter." If COUNT is specified, ifort uses the kind
of COUNT to determine the COUNT_RATE, otherwise it uses the kind of
COUNT_RATE. This is NOT what the documentation says!

The standard gives no help here - it seems to assume that there is only
one COUNT_RATE (and one COUNT_MAX) for a given processor. I'd be happier
if there were words along the lines of "... counts per second in calls
to SYSTEM_CLOCK where the integer kind of COUNT is the same as the
integer kind of COUNT_RATE." (Needs some wordsmithing as this is still
somewhat ambiguous, but you should get the point.)

I'm not sure this rises to the level of an interpretation request, but I
do think it could be clarified for the next revision. I'll bring this up
to J3. I'll also gripe to Intel about the behavior not matching the
documentation.

FortranFan

Jan 9, 2021, 11:00:46 AM

On Saturday, January 9, 2021 at 9:31:48 AM UTC-5, Steve Lionel wrote:

> ..
> The standard .. seems to assume that there is only
> one COUNT_RATE (and one COUNT_MAX) for a given processor. ..

Indeed.

I kinda wish for a new intrinsic, say CPU_COUNT, whose arguments are all REAL scalars constrained to be of the same kind. The counterpart of the SYSTEM_CLOCK example in the current standard would then read like so:

--- example ---
If the processor clock is a 24-hour clock that registers time at approximately 18.20648193 ticks per second, at 11:30 A.M. the reference

CALL CPU_COUNT (COUNT = C, COUNT_RATE = R, COUNT_MAX = M, KIND=K) defines

C = (11.0_k*3600.0_k + 30.0_k*60.0_k) * 18.20648193_k
R = 18.20648193_k
M = .. ! same approach as C

Steve Lionel

Jan 9, 2021, 12:08:16 PM

On 1/9/2021 9:31 AM, Steve Lionel wrote:
> It's even weirder than that. Intel doesn't enforce the documented (not
> in the standard) restriction "All integer arguments used must have the
> same integer kind parameter." If COUNT is specified, ifort uses the kind
> of COUNT to determine the COUNT_RATE, otherwise it uses the kind of
> COUNT_RATE. This is NOT what the documentation says!

But wait, there's more! I had forgotten that the standard also allows
COUNT_RATE (but not COUNT_MAX) to be REAL. The Intel documentation is
completely silent about how that affects the value returned, but
experimentation shows that it uses the rate for the same-size integer
kind (except that if you pass a 128-bit real, you get the value for a
64-bit integer.)

Oh, my aching head!

steve kargl

Jan 9, 2021, 2:31:41 PM

Steve Lionel wrote:

> On 1/8/2021 11:07 PM, steve kargl wrote:
>> According to the person, who reported the bug, Intel Fortran chooses
>> the count rate based on the precision of the first argument (ie., the kind
>> type parameter). This is a perfectly fine choice.
>
> It's even weirder than that. Intel doesn't enforce the documented (not
> in the standard) restriction "All integer arguments used must have the
> same integer kind parameter." If COUNT is specified, ifort uses the kind
> of COUNT to determine the COUNT_RATE, otherwise it uses the kind of
> COUNT_RATE. This is NOT what the documentation says!
>
> The standard gives no help here - it seems to assume that there is only
> one COUNT_RATE (and one COUNT_MAX) for a given processor. I'd be happier
> if there were words along the lines of "... counts per second in calls
> to SYSTEM_CLOCK where the integer kind of COUNT is the same as the
> integer kind of COUNT_RATE." (Needs some wordsmithing as this is still
> somewhat ambiguous, but you should get the point.)
>
> I'm not sure this rises to the level of an interpretation request, but I
> do think it could be clarified for the next revision. I'll bring this up
> to J3. I'll also gripe to Intel about the behavior not matching the
> documentation.

As you likely know, Fortran 95 required all arguments to be default
integer kind. Fortran 2003 made SYSTEM_CLOCK() generic and added
the real type for COUNT_RATE. It is unclear to me why a real COUNT_RATE
was needed, because at least on the systems I've used, the clock is
discrete. The problem with placing a requirement on integer kind for
COUNT_RATE is that it does not apply to the real type. Another issue
arises in that all three dummy arguments are optional, and as pointed
out, Intel Fortran makes a time-scale choice based on COUNT_RATE
if COUNT is not present.

BTW, one of the things I wanted to check with the code I posted was
the possibility that Intel Fortran cached the COUNT_RATE type
(i.e., similar to the seed(s) of the PRNG). That is, by default assume
10000, and if COUNT_RATE is passed into SYSTEM_CLOCK() then it
possibly is reset to a new value (e.g., 10000000 for INTEGER(8)).
This then would mean

integer c1, c2
integer(8) rate
call system_clock(count_rate=rate) ! Retrieve rate and set time scale.
do j = 1, N
call system_clock(count=c1) ! Count in time scale ticks
do i = 1, N
....
end do
call system_clock(count=c2) ! Count in time scale ticks
print *, (c2 - c1) / real(rate)
end do

would give the desired elapsed timing regardless of kind mismatch.

--
steve

Ron Shepard

Jan 9, 2021, 3:06:01 PM

On 1/8/21 11:15 PM, steve kargl wrote:
> If you actually use gfortran, you certainly have an odd way of
> saying "Thank You" to someone who has given to you for free
> hundreds of bug fixes and several standard conforming features.

Well, I thank all of the gfortran developers, including you Steve.

Nonetheless, this does seem to be a real issue, and one that I think the
standard needs to address. In addition to mixed-kind arguments in
system_clock(), and mixed type arguments with integer and real
arguments, there is also the fact that system_clock can be called with
just single arguments.

call system_clock( count_rate=XXX )
call system_clock( count=YYY )
call system_clock( count_max=ZZZ )

In this case, there is no "first argument" available to set the
count_rate appropriately. The only information available is the type and
kind of XXX. The subroutine cannot know what kinds of arguments will be
used in the subsequent calls. Both the gfortran and intel compilers
share some KIND values between integer and real types, but the fortran
standard does not require that, and even if it did, that would be
insufficient to resolve this issue since there are usually real kinds
with no corresponding integer kinds and also integer kinds with no
corresponding real kind. And even if there are shared kind values, they
are basically meaningless anyway as far as the system_clock arguments
are concerned.

There should be nothing wrong with using an integer 64-bit count, but
using a 32-bit real count_rate. Remember it is only differences of those
64-bit values that are important, so if a sequence of calls returns
counts that share the high-order bits, they are unimportant anyway, it
is only the low-order bits that matter. I guess the same argument can be
used to allow 32-bit integer count_rates too, but I can't think of why
one would want to do that.

Should the standard be changed so that the count_rate argument is only
allowed when one of the other arguments is also present? It seems like
that would be sufficient to resolve this particular issue. The integer
kind of the count and/or the count_max arguments could be used to select
the appropriate count rate, regardless of the type and kind of that
argument. I think the requirement that the count and count_max arguments
must have the same kinds is already covered by the standard, it seems to
be just the count_rate argument that is the problem.

If an integer count_rate has insufficient precision to hold the return
value, then that could be determined at compile time, not run time.
Maybe the compiler should be required to report that. That is not an
issue for real count_rate arguments, any kind should work.

$.02 -Ron Shepard

Steve Lionel

Jan 9, 2021, 4:08:17 PM

On 1/9/2021 2:31 PM, steve kargl wrote:
> BTW, one of things I wanted to check with the code I posted was
> the possibility that Intel Fortran cached the COUNT_RATE type
> (i.e., similar to the seed(s)of the PRNG). That is, by default assume
> 10000, and if COUNT_RATE is passed into SYSTEM_CLOCK() then it
> possibly is reset to a new value (e.g., 10000000 for INTEGER(8)).

COUNT_RATE is an INTENT(OUT) argument (as are the other arguments), so
it is not "passed in". I suppose an implementation could "remember" the
type/kind of the arguments from an earlier call, but I can't imagine
anyone actually doing that and there's no reason one can't have more
than one time measurement going on at the same time.

steve kargl

Jan 9, 2021, 4:38:10 PM

Ron Shepard wrote:

> On 1/8/21 11:15 PM, steve kargl wrote:
>> If you actually use gfortran, you certainly have an odd way of
>> saying "Thank You" to someone who has given to you for free
>> hundreds of bug fixes and several standard conforming features.
>
> Well, I thank all of the gfortran developers, including you Steve.
>
> Nonetheless, this does seem to be a real issue, and one that I think the
> standard needs to address.

Yes, I think J3 should revisit the requirements on SYSTEM_CLOCK().
No, I am not going to engage J3. J3 is likely to point to CPU_TIME()
as a means to get elapsed time.

>In addition to mixed-kind arguments in
> system_clock(), and mixed type arguments with integer and real
> arguments, there is also the fact that system_clock can be called with
> just single arguments.
>
> call system_clock( count_rate=XXX )
> call system_clock( count=YYY )
> call system_clock( count_max=ZZZ )
>
> In this case, there is no "first argument" available to set the
> count_rate appropriately. The only information available is the type and
> kind of XXX. The subroutine cannot know what kinds of arguments will be
> used in the subsequent calls. Both the gfortran and intel compilers
> share some KIND values between integer and real types, but the fortran
> standard does not require that, and even if it did, that would be
> insufficient to resolve this issue since there are usually real kinds
> with no corresponding integer kinds and also integer kinds with no
> corresponding real kind. And even if there are shared kind values, they
> are basically meaningless anyway as far as the system_clock arguments
> are concerned.

As with KIND values, J3 (or perhaps I should write the Fortran standard)
does not provide implementation details. For processor-dependent
things, a processor must make a choice. Intel Fortran and gfortran make
different choices. In this case, it seems no choice is perfect.

It's too late to remove, but I personally think it was a mistake to
allow COUNT_RATE to have a REAL type. It seems someone on J3 got
tired of the explicit conversion required to get an elapsed
time (i.e., t = (count2 - count1) / count_rate is integer division if
count_rate has INTEGER type).
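
To spell out the integer-division trap, here is a small sketch with made-up counts. The module and function names are hypothetical; nothing here calls SYSTEM_CLOCK, it only shows the two ways of forming the quotient.

```fortran
module elapsed_demo
   use iso_fortran_env, only : int64, real64
   implicit none
contains
   ! All-integer expression: the fractional part of the time is lost.
   pure function elapsed_truncated(c1, c2, rate) result(t)
      integer(int64), intent(in) :: c1, c2, rate
      integer(int64) :: t
      t = (c2 - c1) / rate
   end function elapsed_truncated

   ! Convert to real before dividing to keep the fraction.
   pure function elapsed_seconds(c1, c2, rate) result(t)
      integer(int64), intent(in) :: c1, c2, rate
      real(real64) :: t
      t = real(c2 - c1, real64) / real(rate, real64)
   end function elapsed_seconds
end module elapsed_demo
```

For counts of 0 and 1500 at 1000 ticks per second, elapsed_truncated gives 1 while elapsed_seconds gives 1.5. A REAL COUNT_RATE makes the second form happen implicitly.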

> There should be nothing wrong with using an integer 64-bit count, but
> using a 32-bit real count_rate. Remember is is only differences of those
> 64-bit values that are important, so if a sequence of calls returns
> counts that share the high-order bits, they are unimportant anyway, it
> is only the low-order bits that matter. I guess the same argument can be
> used to allow 32-bit integer count_rates too, but I can't think of why
> one would want to do that.

Processor-dependent behavior requires a choice. There is certainly nothing
wrong with mixed-mode arithmetic. But a programmer needs to be
aware of the processor-dependent behavior and write the code to take
that behavior into account. For gfortran, the COUNT_RATE may be a
value for a millisecond time scale and COUNT may be on a nanosecond time scale.
So, one needs to do the necessary freshman-level unit conversion.

> Should the standard be changed so that the count_rate argument is only
> allowed when one of the other arguments is also present?

Personally, I think J3 should add SYSTEM_COUNT_INIT(NUMBER, RATES, SET).
NUMBER is an intent(out) INTEGER number of available rates for the counter.
For gfortran, NUMBER=2. RATES is an intent(out) REAL array of the available
rates. For gfortran, RATES = [1.e3, 1.e9]. SET is an intent(in) INTEGER used
to select the desired rate, and it must satisfy SET <= NUMBER. If NUMBER = 0
(i.e., no available clocks), RATES is a zero-sized array.

The companion subroutine is then SYSTEM_COUNTER(COUNT) where COUNT
is clock ticks relative to a reference clock in the time scale chosen by
SYSTEM_COUNT_INIT(SET=xxx). If SYSTEM_COUNT_INIT() had not been called
prior to calling SYSTEM_COUNTER, then there are 2 possibilities that J3 could
require: (1) COUNT is set to -1; or (2) a default time scale of SET = 1 is
used. If NUMBER = 0 (i.e., no clocks), then COUNT = 0 always.

Finally, in my timer module (doesn't everyone have a timer module?), I use
CPU_TIME(). SYSTEM_CLOCK() is not used.

--
steve

steve kargl

Jan 9, 2021, 5:00:45 PM

Steve Lionel wrote:

> On 1/9/2021 2:31 PM, steve kargl wrote:
>> BTW, one of things I wanted to check with the code I posted was
>> the possibility that Intel Fortran cached the COUNT_RATE type
>> (i.e., similar to the seed(s)of the PRNG). That is, by default assume
>> 10000, and if COUNT_RATE is passed into SYSTEM_CLOCK() then it
>> possibly is reset to a new value (e.g., 10000000 for INTEGER(8)).
>
> COUNT_RATE is an INTENT(OUT) argument (as are the other arguments), so
> it is not "passed in". I suppose an implementation could "remember" the
> type/kind of the arguments from an earlier call, but I can't imagine
> anyone actually doing that and there's no reason one can't have more
> than one time measurement going on at the same time.
>

Good point about the loose use of "passed in". So, yes, I meant setting
the COUNT_RATE based on the kind type of the effective argument.

I had not thought about possibly timing two different sections of code
with different time scales. I have a timer module, which simply uses
CPU_TIME(). 'call tic' records an execution time, and 'call toc'
grabs the current execution time and reports the elapsed time since 'tic'
was called. It is not set up to have more than one reference 'tic'
time.
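
For readers without such a module, a rough sketch of the idea follows. This is not my actual module: the names and layout are invented for illustration, and toc is written here as a function that returns the elapsed CPU seconds rather than a subroutine that prints them.

```fortran
module tictoc
   use iso_fortran_env, only : real64
   implicit none
   private
   public :: tic, toc
   ! Single reference time; negative means tic has not been called
   ! or CPU_TIME is not available.
   real(real64), save :: t_start = -1.0_real64
contains
   ! Record the current CPU time as the reference point.
   subroutine tic()
      call cpu_time(t_start)
   end subroutine tic

   ! CPU seconds since the last tic, or -1 if no valid reference exists.
   function toc() result(dt)
      real(real64) :: dt
      real(real64) :: t_now
      call cpu_time(t_now)
      if (t_start < 0.0_real64 .or. t_now < 0.0_real64) then
         dt = -1.0_real64
      else
         dt = t_now - t_start
      end if
   end function toc
end module tictoc
```

Because there is only one saved reference time, nested or overlapping timings are not supported, which is the limitation described above.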

--
steve

Ron Shepard

Jan 10, 2021, 5:23:32 AM

On 1/9/21 3:38 PM, steve kargl wrote:
> Finally, in my timer module (doesn't everyone have a timer module?), use
> CPU_TIME(). SYSTEM_CLOCK() is not used.

I have used systems where cpu_time() was only accurate to 0.01 seconds
(a unix time slice), while system_clock() was accurate to microseconds
or nanoseconds. So cpu_time was alright for rough timings of entire
program runs, but if you wanted to fine-tune performance of some smaller
section of code, then system_clock() was the way to go.

$.02 -Ron Shepard

Message has been deleted

gah4

Jan 10, 2021, 6:29:24 AM

On Sunday, January 10, 2021 at 3:12:26 AM UTC-8, مهدي شينون wrote:
> CPU_TIME() doesn't give the right values with parallel (OpenMP) regions.

Reminds me that some years ago I had a dual processor Win 2000 system
with the usual tape backup program. It would keep track of elapsed time,
and at the end report the total elapsed time. Except that the time was
half the actual time. It seems that since it was dual processor, it
divided the time by two.

In the case of OpenMP, you have to define what you mean by CPU time.

Does each image keep track of its own time, or is it the total of all?

steve kargl

Jan 10, 2021, 12:34:02 PM

Well (to me), anything that completes execution in less than 0.01 seconds
is uninteresting. I worry about things that take tens of minutes or hours or
days to complete. gfortran's CPU_TIME tends to use libc's getrusage() on
systems that have it. On my systems, getrusage() has provided microsecond
resolution for a very long time. Getting a usable clock on MinGW and Cygwin
seems to be much more challenging. See gcc/libgfortran/intrinsics/time_1.h.

At least with gfortran and systems with a libc clock_gettime(), there is a difference
between CPU_TIME and SYSTEM_CLOCK. CPU_TIME is the total user+system
execution time of the process. SYSTEM_CLOCK is the total elapsed time between
two points in time (i.e., this includes the idle time on a multitasking/multi-user
system when your process is waiting for the CPU).

--
steve

steve kargl

Jan 10, 2021, 12:47:10 PM

مهدي شينون wrote:

> CPU_TIME() doesn't give the right values with parallel (OpenMP) regions.

I suppose that it depends on how smart one is, and whether one
reads the documentation provided by the processor to learn about
processor-dependent behavior.

--
steve

FortranFan

Jan 10, 2021, 1:44:02 PM

On Sunday, January 10, 2021 at 6:12:26 AM UTC-5, مهدي شينون wrote:

> CPU_TIME() doesn't give the right values with parallel (OpenMP) regions.

Yes, there can be *implementation issues* with CPU_TIME, and some are present today. For example, with Intel Fortran the intrinsic CPU_TIME can yield outliers in code performance evaluation data, in which case SYSTEM_CLOCK with 64-bit integer KINDs for its arguments becomes a useful alternative. See the link below for an example:
https://community.intel.com/t5/Intel-Fortran-Compiler/Why-does-Version-19-change-how-cube-root-is-calculated/m-p/1180670/highlight/true#M148651

JCampbell

Jan 11, 2021, 4:20:09 AM

It is important to consider the precision of the available timers and not just their reported ticks.
"Count_Rate" is merely a scale factor giving the number of reported ticks per second; it does not guarantee that the timer is precise to the nearest tick.
This is more obvious with CPU_TIME, where the elapsed cpu_time is typically updated only 64 times per second.
In gfortran on Windows, the reported clock rate varies with kind and also with the processor and possibly the O/S.
I report the precision actually achieved with the following code. Hopefully it may be of interest.

subroutine report_timer_precision
!
! USE OMP_Variables
!
! checks the precision of the following timers in use
! CPU_Time (CPU) ! Fortran intrinsic
! System_Clock ! Fortran intrinsic
! omp_get_wtime (wtime) ! OMP Library
! QueryPerformance_tick ! WINAPI library
! QueryPerformance_sec ! my wrapper
!
! precision is minimum time between change of value
!
integer*4 k,n
real*8 timer_precision, timer_seconds, next_seconds, dt
integer*8 clock_precision, clock_tick, next_tick
!
real*8, external :: omp_get_wtick
real*8, external :: omp_get_wtime
integer*8, external :: system_clock_rate
integer*8, external :: system_clock_tick
integer*8, external :: QueryPerformance_rate
integer*8, external :: QueryPerformance_tick
real*8, external :: QueryPerformance_sec
!
call report_text ( '>> call report_timer_precision',0)
!
write ( *,2000)
write (98,2000)
2000 format (/ &
' Timer Precision Report'/ &
' This routine checks the accuracy of the timers available'/ &
' the accuracy is the minimum time for a change of reported value'/ &
' this accuracy differs from the resolution of tick values reported'/ &
' the number of call cycles between change of value is also reported'/ )
!
! test precision of omp_get_wtime
timer_precision = omp_get_wtick () ; timer_precision = 1./timer_precision
timer_seconds = omp_get_wtime ()
k = -2 ; n = 0
do
n = n+1
next_seconds = omp_get_wtime () ! call to timer
if ( next_seconds == timer_seconds ) cycle
k = k+1
if (k > 0) exit
timer_seconds = next_seconds
n = 0
end do
dt = next_seconds-timer_seconds ! minimum time between ticks
write ( *,11) 'omp_get_wtime rate = ', timer_precision,' ticks per second'
write ( *,11) 'omp_get_wtime acc = ', dt,' seconds : ', n,' cycles'
write (98,11) 'omp_get_wtime rate = ', timer_precision,' ticks per second'
write (98,11) 'omp_get_wtime acc = ', dt,' seconds : ', n,' cycles'
11 format ( 1x,a,es12.3,a,i0,a)
12 format ( 1x,a,i12,a,i0,a)
!
! compare to precision of SYSTEM_CLOCK
clock_precision = system_clock_rate ()
clock_tick = system_clock_tick ()
k = -2 ; n = 0
do
n = n+1
next_tick = system_clock_tick () ! call to timer
if ( next_tick == clock_tick ) cycle
k = k+1
if (k > 0) exit
clock_tick = next_tick
n = 0
end do
dt = dble(next_tick-clock_tick)/dble(clock_precision) ! minimum time between ticks
write ( *,12) 'system_clock_rate = ', clock_precision,' ticks per second'
write ( *,11) 'system_clock_tick acc = ', dt,' seconds : ', n,' cycles'
write (98,12) 'system_clock_rate = ', clock_precision,' ticks per second'
write (98,11) 'system_clock_tick acc = ', dt,' seconds : ', n,' cycles'
!
! compare to precision of QueryPerformance
clock_precision = QueryPerformance_rate ()
clock_tick = QueryPerformance_tick ()
k = -2 ; n = 0
do
n = n+1
next_tick = QueryPerformance_tick () ! call to timer
if ( next_tick == clock_tick ) cycle
k = k+1
if (k > 0) exit
clock_tick = next_tick
n = 0
end do
dt = dble(next_tick-clock_tick)/dble(clock_precision) ! minimum time between ticks
write ( *,12) 'QueryPerformance_rate =', clock_precision,' ticks per second'
write ( *,11) 'QueryPerformance_tick =', dt,' seconds : ', n,' cycles'
write (98,12) 'QueryPerformance_rate =', clock_precision,' ticks per second'
write (98,11) 'QueryPerformance_tick =', dt,' seconds : ', n,' cycles'
!
! compare to precision of QueryPerformance_sec
timer_seconds = QueryPerformance_sec ()
k = -2 ; n = 0
do
n = n+1
next_seconds = QueryPerformance_sec () ! call to timer
if ( next_seconds == timer_seconds ) cycle
k = k+1
if (k > 0) exit
timer_seconds = next_seconds
n = 0
end do
dt = next_seconds-timer_seconds ! minimum time between ticks
write ( *,11) 'QueryPerformance_sec =', dt,' seconds : ', n,' cycles'
write (98,11) 'QueryPerformance_sec =', dt,' seconds : ', n,' cycles'
!
! test precision of CPU_Time
call CPU_Time (timer_seconds)
k = -2 ; n = 0
do
n = n+1
call CPU_Time (next_seconds) ! call to timer
if ( next_seconds == timer_seconds ) cycle
k = k+1
if (k > 0) exit
timer_seconds = next_seconds
n = 0
end do
dt = next_seconds-timer_seconds ! minimum time between ticks
write ( *,11) 'CPU_Time accuracy =', dt,' seconds : ', n,' cycles'
write (98,11) 'CPU_Time accuracy =', dt,' seconds : ', n,' cycles'
!
write ( *,13) 'NOTE : precision is minimum time between change of value'
write (98,13) 'NOTE : precision is minimum time between change of value'
13 format (1x,a/1x)
!
end subroutine report_timer_precision

JCampbell

Jan 11, 2021, 4:47:22 AM

I should have included the following for SYSTEM_CLOCK usage, along with access to the Windows QueryPerformance routines.

!==== Query Perform ===========================================================

integer*8 function QueryPerformance_tick ()
use ISO_C_BINDING

interface
function QUERYPERFORMANCECOUNTER(tick) bind(C, name="QueryPerformanceCounter")
use ISO_C_BINDING
!GCC$ ATTRIBUTES STDCALL :: QUERYPERFORMANCECOUNTER
logical(C_BOOL) QUERYPERFORMANCECOUNTER
integer(C_LONG_LONG) tick
end function QUERYPERFORMANCECOUNTER
end interface
!
integer(C_LONG_LONG) :: tick
logical(C_BOOL) :: ll
!
ll = QUERYPERFORMANCECOUNTER (tick)
QueryPerformance_tick = tick
end function QueryPerformance_tick

integer*8 function QueryPerformance_rate ()
use ISO_C_BINDING

interface
function QUERYPERFORMANCEFREQUENCY(tick_rate) bind(C, name="QueryPerformanceFrequency")
use ISO_C_BINDING
!GCC$ ATTRIBUTES STDCALL :: QUERYPERFORMANCEFREQUENCY
logical(C_BOOL) QUERYPERFORMANCEFREQUENCY
integer(C_LONG_LONG) tick_rate
end function QUERYPERFORMANCEFREQUENCY
end interface
!
logical(C_BOOL) :: ll
integer(C_LONG_LONG) :: tick_rate = -1
!
if ( tick_rate < 0 ) then
ll = QUERYPERFORMANCEFREQUENCY (tick_rate)
write (*,*) 'QueryPerformance', tick_rate,' ticks per second'
end if
QueryPerformance_rate = tick_rate
end function QueryPerformance_rate

!==== System Clock ============================================================

integer*8 function system_clock_tick () ! System_Clock
!
integer*8 :: count
intrinsic system_clock
!
call system_clock (count)
!
system_clock_tick = count
!
end function system_clock_tick

integer*8 function system_clock_rate () ! System_Clock
!
integer*8 :: count_start, count_max
integer*8 :: count_rate = -1
intrinsic system_clock
!
if ( count_rate < 0) then
call system_clock (count_start, count_rate, count_max)
write (*,*) 'System_Clock', count_rate,' ticks per second'
end if
!
system_clock_rate = count_rate
end function system_clock_rate

steve kargl

Jan 11, 2021, 5:21:04 PM

JCampbell wrote:

> It is important to consider the precision of the available timers
> and not just their reported ticks. "Count_Rate" is merely a scaler
> for the number of reported ticks per second, but does not guarantee
> that the precision of the timer is to the nearest tick.

Agree. Although it would be silly to have a count_rate value
of a higher precision (e.g., microseconds) if the count value is
incremented on a slower scale (e.g., milliseconds).

> This is more obvious with CPU_TIME, where the elapsed cpu_time
> is typically updated only 64 times per second.

With gfortran, CPU_TIME relies on the clock facilities of the underlying
operating system. CPU_TIME on FreeBSD has microsecond resolution.
I don't do windows, so cannot say anything about that OS.

I'll note that CPU_TIME (again on FreeBSD and likely any Unix-like OS)
reports the user+system execution time of the program. This is the total
time that the program spends using the CPU. OTOH, SYSTEM_CLOCK
provides the number of ticks relative to some reference time (e.g., the
Unix epoch, Jan 1, 1970, on FreeBSD). The elapsed time, reported by
taking the difference of two calls to SYSTEM_CLOCK, is the total elapsed
time, which on a multitasking/multi-user system includes the idle time
while your program waits for its turn on the CPU.

--
steve

James Van Buskirk

Jan 11, 2021, 5:42:14 PM

"JCampbell" wrote in message
news:b82e6fa6-5bb8-497c...@googlegroups.com...

> logical(C_BOOL) QUERYPERFORMANCECOUNTER

Oh God no...

Gary Scott

Jan 11, 2021, 5:42:53 PM

At one time:

"The default timer resolution on Windows is 15.6 ms – a timer interrupt
64 times a second. When programs increase the timer frequency they
increase power consumption and harm battery life."

Hopefully this has improved, but I haven't checked recently. I've been
using MKL timing facilities, which are wrappers around higher-resolution
Windows timers.

steve kargl

Jan 11, 2021, 6:33:00 PM

To back up the above assertion see the code below.

% gfortran -o z -O2 -pipe a.f90
% ./z
cputime(real32): 12.9883 sec
cputime(real64): 12.9883 sec
system_clock(int32): 12.9890 sec
system_clock(int64): 12.9895 sec
% ./z
^Z
Suspended
% fg
./z
cputime(real32): 13.1366 sec
cputime(real64): 13.1365 sec
system_clock(int32): 46.8470 sec
system_clock(int64): 46.8471 sec

So, yeah, one needs to know what one is measuring.

program foo

use iso_fortran_env, only : int32, int64, real32, real64

implicit none

real(real32) a1, a2
real(real64) b1, b2
integer(int32) r4, t1, t2
integer(int64) r8, u1, u2
integer, parameter :: n = 10**8

call system_clock(count_rate=r4)
call system_clock(count_rate=r8)

call system_clock(count=t1)
call cpu_time(a1)
call cpu_time(b1)
call system_clock(count=u1)
call loop(n)
call system_clock(count=u2)
call cpu_time(b2)
call cpu_time(a2)
call system_clock(count=t2)

write(*,'(A,F0.4,A)') ' cputime(real32): ', a2 - a1, ' sec'
write(*,'(A,F0.4,A)') ' cputime(real64): ', b2 - b1, ' sec'
write(*,'(A,F0.4,A)') 'system_clock(int32): ', (t2 - t1) / real(r4), ' sec'
write(*,'(A,F0.4,A)') 'system_clock(int64): ', (u2 - u1) / real(r8), ' sec'

contains

subroutine loop(n)
integer, intent(in) :: n
real(real64) dx, x, y
integer i, j
dx = 1._real64 / (n - 1)
do i = 1, n
x = 1000 + (i - 1) * dx
y = cos(x)
x = sin(x)
if (x * y > 2) stop 1
end do
end subroutine loop

end program foo

Ron Shepard

Jan 11, 2021, 6:36:02 PM

On 1/11/21 4:21 PM, steve kargl wrote:
> JCampbell wrote:
>
>> It is important to consider the precision of the available timers
>> and not just their reported ticks. "Count_Rate" is merely a scaler
>> for the number of reported ticks per second, but does not guarantee
>> that the precision of the timer is to the nearest tick.
>
> Agree. Although it would be silly to have a count_rate value
> of a higher precision (e.g., microseconds) if the count value is
> incremented on a slower scale (e.g., milliseconds).

Access to the high resolution clock might be limited to slower query
rates, say OS time slices.

>> This is more obvious with CPU_TIME, where the elapsed cpu_time
>> is typically updated only 64 times per second.
>
> With gfortran, CPU_TIME relies on the clock facilities of the underlying
> operating system. CPU_TIME on FreeBSD has microsecond resolution.
> I don't do windows, so cannot say anything about that OS.
>
> I'll note that CPU_TIME (again on FreeBSD and likely any Unix-like OS)
> reports the user+system execution time of the program. This is the total
> time that the program spends using the CPU. OTOH, SYSTEM_CLOCK
> provides the number of ticks relative to some reference time (e.g., the
> Unix epoch, Jan 1, 1970, on FreeBSD).

I think this epoch stuff is related to other times. The system_clock
timer sometimes wraps around every few minutes, particularly in the past
when the timer counts occurred in a 32-bit register and the count rates
were microsecond to nanosecond ranges. There is no way a timer like that
could measure ticks since 1970.
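
The wraparound described above can be handled explicitly when the counter has a small range. The following is a minimal sketch, not code from this thread: the module name `wrap_ticks_sketch` and the function `elapsed_ticks` are hypothetical, and it assumes the counter wrapped at most once between the two readings (the standard's model is that COUNT resets to zero after reaching COUNT_MAX). For scale, a 32-bit COUNT_MAX of 2147483647 at 10**6 ticks per second is exhausted in roughly 36 minutes.

```fortran
module wrap_ticks_sketch
  use iso_fortran_env, only : int32
  implicit none
contains
  ! Ticks elapsed from reading t1 to reading t2.
  ! Assumption: the counter wrapped at most once between the readings.
  pure function elapsed_ticks(t1, t2, count_max) result(dt)
    integer(int32), intent(in) :: t1, t2, count_max
    integer(int32) :: dt
    if (t2 >= t1) then
      ! No wrap: plain difference.
      dt = t2 - t1
    else
      ! One wrap: ticks from t1 up to COUNT_MAX, one tick for the
      ! reset to zero, then t2 further ticks. Ordered to avoid overflow.
      dt = (count_max - t1) + t2 + 1
    end if
  end function elapsed_ticks
end module wrap_ticks_sketch
```

In practice `count_max` would be obtained from `call system_clock(count_max=...)` with the same integer kind as the two counts.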

> The elapsed time, reported by
> taking the difference of two calls to SYSTEM_CLOCK, is the total elapsed
> time, which on a multitasking/multi-user system includes the idle time
> while your program waits for its turn on the CPU.

There has always been some ambiguity in timers, especially on multiuser,
multithreaded, timesharing systems. On parallel machines, a process
could be swapped out on one node and swapped back in on another. I
remember when the clock rate really was the cpu clock rate, which of
course might vary from node to node in a parallel machine. Then cpu
hardware started changing the clock rates to control temperatures and
energy consumption, or to have higher cpu burst rates to improve
performance, and all this got even harder.

$.02 -Ron Shepard

JCampbell

Jan 12, 2021, 9:11:23 PM

I agree it is silly, but there are examples of gfortran and ifort with this characteristic.
I have always considered the Windows gfortran and ifort implementations for kind(4) "silly", although perhaps a stronger criticism is warranted.

The following example has been compiled using gfortran on Windows:
use ISO_FORTRAN_ENV

integer*4 :: i4_rate, i4_tick, i4_last
integer*4 :: n, n_last, k

write (*,*) 'Version :',compiler_version ()
k = -1
call system_clock (i4_last, i4_rate)
do n = 1, huge(n)
call system_clock (i4_tick)
if ( i4_tick == i4_last ) cycle
if ( k > 1 ) exit
i4_last = i4_tick
n_last = n
k = k+1
end do

dt = real(i4_tick-i4_last)/real(i4_rate)
write (*,*) 'timer rate =',i4_rate,' ticks/second'
write (*,*) 'timer change =',i4_tick-i4_last,' ticks'
write (*,*) 'timer calls =',n-n_last
write (*,*) 'timer precision =',dt,' seconds'
end

Version :GCC version 9.2.0
timer rate = 1000 ticks/second
timer change = 15 ticks
timer calls = 2187575
timer precision = 1.49999997E-02 seconds

changing to "integer*8 :: i4_rate, i4_tick, i4_last" produces:
Version :GCC version 9.2.0
timer rate = 2728261 ticks/second
timer change = 1 ticks
timer calls = 13
timer precision = 3.66533840E-07 seconds

Another interesting (possibly Windows-specific) feature is the relationship of clock_rate to the processor clock. For kind(8), often clock_rate = processor clock / 1024, although this is not always the case. This suggests that for kind(8), system_clock is based on the RDTSC timer, while for kind(4), system_clock is based on GetTickCount.
All Windows implementations of SYSTEM_CLOCK should be traced back to RDTSC, although RATE can be more difficult to recover. It would have been good if RATE were related to the processor clock (although variable overclocking is another complication).

I also find criticism of the inaccuracy of timers in the multi-thread case to be overstated. An inaccurate result is much better than the non-result that kind(4) timers provide.

Comparing the results for gFortran's SYSTEM_CLOCK across different O/S provides an interesting insight into timer requirements.

gah4

Jan 13, 2021, 2:27:42 AM

On Tuesday, January 12, 2021 at 6:11:23 PM UTC-8, JCampbell wrote:

(snip)

> All windows implementations of SYSTEM_CLOCK should be traced back to RDTSC,
> although RATE can be more difficult to recover. It would have been good if RATE was
> related to the processor clock (although variable overclocking is another complication)

I am not sure specifically about overclocking, but it seems that there are a variety of
variable clocking systems now in use. Some might slow down the clock on battery
power, or when the battery is getting lower.

For code optimization, clock cycle counts should be independent of actual
clock rate.

Thomas Koenig

Jan 13, 2021, 3:39:28 AM

gah4 <ga...@u.washington.edu> wrote:

> On Tuesday, January 12, 2021 at 6:11:23 PM UTC-8, JCampbell wrote:
>
> (snip)
>
>> All windows implementations of SYSTEM_CLOCK should be traced back to RDTSC,
>> although RATE can be more difficult to recover. It would have been good if RATE was
>> related to the processor clock (although variable overclocking is another complication)
>
> I am not sure specifically about overclocking, but it seems that there are a variety of
> variable clocking systems now in use. Some might slow down the clock on battery
> power, or when the battery is getting lower.

Or when the CPU is not being used heavily, or when...

Also, I am not always sure about what overclocking achieves. The time
it takes for the CPU gates to finish some operation does not depend on
clock cycles. If you overclock, you may just miss the window for
an operation, and it may take a cycle more...

I am not sure how modern processors deal with this situation.

> For code optimization, clock cycle counts should be independent of actual
> clock rate.

Memory also comes into play - lower clock counts can mean (at constant
memory access times) fewer cycles for memory access.

Optimization for modern CPUs is complex, it has both real and
imaginary parts.

gah4

Jan 13, 2021, 8:46:33 AM

On Wednesday, January 13, 2021 at 12:39:28 AM UTC-8, Thomas Koenig wrote:

(snip, I wrote)

> > I am not sure specifically about overclocking, but it seems that there are a variety of
> > variable clocking systems now in use. Some might slow down the clock on battery
> > power, or when the battery is getting lower.
> Or when the CPU is not being used heavily, or when...

> Also, I am not always sure about what overclocking achieves. The time
> it takes for the CPU gates to finish some operation does not depend on
> clock cycles. If you overclock, you may just miss the window for
> an operation, and it may take a cycle more...

I assume we are not talking about the people who overclock, ignoring
recommendations from the manufacturer.

As far as I know, though things change pretty fast, some CPUs are
heat limited. They can't run as fast as the gates allow and stay cool enough,
and/or keep battery use low. So, they can run faster for a short time,
and then let things cool back down again. That helps if the extra need
is only for a short time.

> I am not sure how modern processors deal with this situation.
> > For code optimization, clock cycle counts should be independent of actual
> > clock rate.

> Memory also comes into play - lower clock counts can mean (at constant
> memory access times) fewer cycles for memory access.

If cache works right, it should not depend on memory speed at all.
I believe cache is synchronous with the processor clock, so will follow
any speed changes. The S in SDRAM means that it is synchronous to
some clock. I am not sure if that is the same as the processor clock.
Avoiding passing data across a clock boundary makes things a lot
easier, but I haven't looked into those details recently.

> Optimization for modern CPUs is complex, it has both real and
> imaginary parts.

Yes!

gah4

Jan 13, 2021, 9:06:16 AM

(someone wrote)

> > CPU_TIME() doesn't give the right values with parallel (OpenMP) regions.

(I wrote)

> In the case of OpenMP, you have to define what you mean by CPU time.

It seems that Fortran 2018 says:

"Whether an image has no clock, has a single clock of its own, or shares a clock with
another image, is processor dependent."

It seems to me that one should be careful using SYSTEM_CLOCK with OpenMP.

gah4

Jan 13, 2021, 9:40:35 AM

On Friday, January 8, 2021 at 9:15:11 PM UTC-8, steve kargl wrote:

(snip)
> The Intel Fortran documentation at

> https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/a-to-z-reference/s-1/system-clock.html

> dated 4 dec 2020 contains the the following hightlighted sentence:

>
> "All integer arguments used must have the same integer kind parameter."

> You code with actual args of count=int64 and count_rate=int32
> does not meet that requirement.

Does it check that, and report an error if it isn't satisfied?

It seems that 18-007r1 doesn't say anything about KIND.

> If you actually use gfortran, you certainly have an odd way of
> saying "Thank You" to someone who has given to you for free
> hundreds of bug fixes and several standard conforming features.

OK, first I will say thanks, even if the previous poster won't.

Also, any comments I make are not meant to criticize gfortran or
you, but are so that we can all learn to use it better. (And if I make suggestions
that seem to require someone to do some work, they should know that
is not what is meant.)

18-007r1 does say "a clock", which tends to imply one, though maybe
not convincingly. It does give the option for one clock per image in the
case of more than one image.

Also COUNT_RATE is allowed to be REAL, and in the example is 18.20648193,
definitely not integer. That is, they specifically allow for clocks that don't have
an integer number of ticks per second.

I suppose I still believe that requiring the KIND of COUNT and COUNT_RATE
to be the same is reasonable, but it does get interesting when they are
a different KIND. Especially as there is no requirement for any correspondence
between the KIND values for INTEGER and REAL.

I believe, then, that if the values are KIND dependent, then different arguments to the
same call should have the same KIND, or return an error. That doesn't stop someone
from mixing values between calls. I am not so sure about that case.

Note even more that 18.20648193 has more digits than a 32 bit REAL can hold.
In this case, then, it seems to make sense to have a 32 bit INTEGER type for COUNT,
and 64 bit REAL type for COUNT_RATE.

Also, and with either INTEGER or REAL COUNT_RATE, one can convert while dividing:

CALL SYSTEM_CLOCK(COUNT=C, COUNT_RATE=R)
ms= C*1000/R
or
ms=C/(R/1000)

depending on how one wants to treat rounding and/or overflow,
especially for INTEGER R.
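
The trade-off between the two orderings can be made concrete for an INTEGER rate. The following is a minimal sketch under stated assumptions: the module name `ticks_to_ms_sketch` and both function names are hypothetical, 64-bit integers are used for the count and rate, and the divide-first form is only meaningful when the rate is at least 1000 ticks per second.

```fortran
module ticks_to_ms_sketch
  use iso_fortran_env, only : int64
  implicit none
contains
  ! ms = C*1000/R: truncates only once, but C*1000 can overflow
  ! when C is within a factor of 1000 of huge(C).
  pure function ms_multiply_first(c, r) result(ms)
    integer(int64), intent(in) :: c, r
    integer(int64) :: ms
    ms = c * 1000_int64 / r
  end function ms_multiply_first

  ! ms = C/(R/1000): cannot overflow, but R/1000 is truncated first,
  ! so the result drifts when R is not a multiple of 1000.
  ! Assumption: r >= 1000, otherwise R/1000 is zero.
  pure function ms_divide_first(c, r) result(ms)
    integer(int64), intent(in) :: c, r
    integer(int64) :: ms
    ms = c / (r / 1000_int64)
  end function ms_divide_first
end module ticks_to_ms_sketch
```

With the rate of 2728261 ticks per second reported earlier in this thread, 100 seconds' worth of ticks (272826100) gives 100000 ms when multiplying first but 100009 ms when dividing first, because R/1000 truncates to 2728.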

Very interesting!

steve kargl

Jan 13, 2021, 12:39:23 PM

Not sure what your point is/was. Both gfortran and ifort must
make a choice for processor-dependent behavior. gfortran and
ifort document those choices. It is up to the programmer to read
the documentation and act on it accordingly. I don't use Windows
for numerical work, so perhaps gfortran does not conform to
its own documentation there. However, it has chosen to use millisecond
resolution for default integer kind and nanosecond resolution
(if available from the operating system) for integer kinds with a
larger decimal range on at least Linux and FreeBSD.
The gory details of how gfortran chooses timing facilities can
be found in the files time_1.h, cpu_time.h, and system_clock.c
at https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libgfortran/intrinsics.
Given that none of the individuals who have contributed code
to gfortran over the last decade+ use Windows, I suspect they
would be thrilled to receive a patch that fixes gfortran on Windows
to meet your needs.

--
steve

steve kargl

Jan 13, 2021, 1:09:46 PM

gah4 wrote:

> On Friday, January 8, 2021 at 9:15:11 PM UTC-8, steve kargl wrote:
>
> (snip)
>> The Intel Fortran documentation at
>
>> https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/a-to-z-reference/s-1/system-clock.html
>
>> dated 4 dec 2020 contains the the following hightlighted sentence:
>>
>> "All integer arguments used must have the same integer kind parameter."
>
>> You code with actual args of count=int64 and count_rate=int32
>> does not meet that requirement.
>
> Does it check that, and report an error if it isn't satisfied?

It can't. All dummy arguments of system_clock are optional.
An individual could grab COUNT_RATE in routine FOO, and
for example convert to delta_t = real(1.)/count_rate. Then
delta_t is passed into routine BAR and BAH and BAM ...
where timings are done. It is up to the programmer, based
on the documented processor-dependent behavior, to
correctly call SYSTEM_CLOCK with an appropriately typed
COUNT.

Apparently, this is rocket science.

> It seems that 18-007r1 doesn't say anything about KIND.

Fortran 95 required COUNT, COUNT_RATE, and COUNT_MAX
to be default integer kind. Fortran 2003 removed that
requirement, making it generic and adding the real type
for COUNT_RATE.

--
steve

jfh

Jan 13, 2021, 7:24:00 PM

On Thursday, January 14, 2021 at 7:09:46 AM UTC+13, steve kargl wrote:
> gah4 wrote:
>
> > On Friday, January 8, 2021 at 9:15:11 PM UTC-8, steve kargl wrote:
> >
> > (snip)
> >> The Intel Fortran documentation at
> >
> >> https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/a-to-z-reference/s-1/system-clock.html
> >
> >> dated 4 dec 2020 contains the the following hightlighted sentence:
> >>
> >> "All integer arguments used must have the same integer kind parameter."
> >
> >> You code with actual args of count=int64 and count_rate=int32
> >> does not meet that requirement.
> >
> > Does it check that, and report an error if it isn't satisfied?
> It can't. All dummy arguments of system_clock are optional.
> An individual could grab COUNT_RATE in routine FOO, and
> for example convert to delta_t = real(1.)/count_rate. Then
> delta_t is passed into routine BAR and BAH and BAM ...
> where timings are done. It is up to the programmer, based
> on the documented processor-dependent behavior, to
> correctly call SYSTEM_CLOCK with an appropriately typed
> COUNT.
>
> Apparently, this is rocket science.

Is rocket science still a branch of the engineering profession whose products often fail catastrophically before doing what they were intended to?

Gary Scott

Jan 13, 2021, 9:00:19 PM

Things often get handed off to "coders" to translate/transcribe, which
doesn't always go as planned. At one time, "engineers" were forbidden to
code. They were forced to perform this handoff.

gah4

Jan 14, 2021, 12:00:09 AM

On Wednesday, January 13, 2021 at 10:09:46 AM UTC-8, steve kargl wrote:

(snip)

> >> "All integer arguments used must have the same integer kind parameter."

> >> You code with actual args of count=int64 and count_rate=int32
> >> does not meet that requirement.

> > Does it check that, and report an error if it isn't satisfied?

> It can't. All dummy arguments of system_clock are optional.
> An individual could grab COUNT_RATE in routine FOO, and
> for example convert to delta_t = real(1.)/count_rate. Then
> delta_t is passed into routine BAR and BAH and BAM ...
> where timings are done. It is up to the programmer, based
> on the documented processor-dependent behavior, to
> correctly call SYSTEM_CLOCK with an appropriately typed
> COUNT.

I meant for the case that more than one were in the same call.

I suppose it could have a SAVEd variable remembering which KIND
it had previously been called with, but I wasn't suggesting that.

> Apparently, this is rocket science.
> > It seems that 18-007r1 doesn't say anything about KIND.

> Fortran 95 required COUNT, COUNT_RATE, and COUNT_MAX
> to be default integer kind. Fortran 2003 removed that
> requirement, making it generic and adding the real type
> for COUNT_RATE.

At first I thought that the REAL COUNT_RATE was for convenience
in dividing, but it seems that it is to allow non-integer counts
per second.

Is there any place in the standard where an INTEGER kind depends on
a REAL kind, or vice versa?

I notice, for example, that the REAL function with an INTEGER argument
of any KIND returns default REAL. It might have made sense to return
an appropriately larger KIND for larger argument KIND. (It does in
the case of COMPLEX arguments.)
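
The behavior of REAL with an integer argument can be checked directly. The following is a minimal sketch, not code from this thread: the module name `real_kind_sketch` and its function names are hypothetical. It compares the kind of `real(i)` for a 64-bit integer against default real, and shows the explicit KIND argument that is needed to get a wider result.

```fortran
module real_kind_sketch
  use iso_fortran_env, only : int64, real64
  implicit none
contains
  ! True when REAL(i), with no KIND argument, yields default real
  ! even though i is a 64-bit integer.
  pure logical function real_of_int64_is_default_kind() result(same)
    same = kind(real(1_int64)) == kind(1.0)
  end function real_of_int64_is_default_kind

  ! Conversion with an explicit KIND argument, which keeps more of
  ! the integer's digits than default real usually can.
  pure function to_real64(i) result(x)
    integer(int64), intent(in) :: i
    real(real64) :: x
    x = real(i, real64)
  end function to_real64
end module real_kind_sketch
```

This matters for timing code: dividing a 64-bit tick difference by `real(rate)` silently works in default real unless a KIND is given, as in `real(rate, real64)`.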

As far as I know, there is no requirement in the standard for any
connection between INTEGER and REAL kinds. Many systems
now have 32 and 64 bits for each, but there is no reason that they
need to do that, other than what is currently common for hardware.

steve kargl

Jan 14, 2021, 12:31:16 AM

gah4 wrote:

> On Wednesday, January 13, 2021 at 10:09:46 AM UTC-8, steve kargl wrote:
>
> (snip)
>> >> "All integer arguments used must have the same integer kind parameter."
>
>> >> You code with actual args of count=int64 and count_rate=int32
>> >> does not meet that requirement.
>
>> > Does it check that, and report an error if it isn't satisfied?
>
>> It can't. All dummy arguments of system_clock are optional.
>> An individual could grab COUNT_RATE in routine FOO, and
>> for example convert to delta_t = real(1.)/count_rate. Then
>> delta_t is passed into routine BAR and BAH and BAM ...
>> where timings are done. It is up to the programmer, based
>> on the documented processor-dependent behavior, to
>> correctly call SYSTEM_CLOCK with an appropriately typed
>> COUNT.
>
> I meant for the case that more than one were in the same call.
>
> I suppose it could have a SAVEd variable remembering which KIND
> it had previously been called with, but I wasn't suggesting that.

gfortran chooses the time scale based on the least kind type parameter.

call system_clock(count=cnt, count_rate=rate)

min(kind(cnt), kind(rate)) = 4 is millisecond time scale.
min(kind(cnt), kind(rate)) = 8 is nanosecond time scale.

if min() < 4, then no clock.
Fortunately, kind(integer) == kind(real) for gfortran so the above holds.
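
One way to stay within the documented behavior is to route every SYSTEM_CLOCK call through a single module that uses one integer kind throughout, so that COUNT and COUNT_RATE can never be mixed. The following is a minimal sketch, assuming a 64-bit integer kind is available; the module name `wall_timer_sketch` and its function names are hypothetical and not part of gfortran or any library.

```fortran
module wall_timer_sketch
  use iso_fortran_env, only : int64, real64
  implicit none
  private
  public :: wall_ticks, wall_rate, seconds_between
contains
  ! Current tick count, always requested with a 64-bit COUNT.
  function wall_ticks() result(t)
    integer(int64) :: t
    call system_clock(count=t)
  end function wall_ticks

  ! Ticks per second, requested with the same 64-bit kind.
  ! A value of zero means the processor has no clock.
  function wall_rate() result(rate)
    integer(int64) :: rate
    call system_clock(count_rate=rate)
  end function wall_rate

  ! Elapsed seconds between two readings taken with wall_ticks.
  ! Returns -1 when no clock is available.
  function seconds_between(t1, t2) result(s)
    integer(int64), intent(in) :: t1, t2
    real(real64) :: s
    integer(int64) :: rate
    rate = wall_rate()
    if (rate > 0) then
      s = real(t2 - t1, real64) / real(rate, real64)
    else
      s = -1.0_real64
    end if
  end function seconds_between
end module wall_timer_sketch
```

A caller would take `t1 = wall_ticks()` before the work and `t2 = wall_ticks()` after it, then print `seconds_between(t1, t2)`.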

>> Apparently, this is rocket science.
>> > It seems that 18-007r1 doesn't say anything about KIND.
>
>> Fortran 95 required COUNT, COUNT_RATE, and COUNT_MAX
>> to be default integer kind. Fortran 2003 removed that
>> requirement, making it generic and adding the real type
>> for COUNT_RATE.
>
> At first I thought that the REAL COUNT_RATE was for convenience
> in dividing, but it seems that it is to allow non-integer counts
> per second.

Well, COUNT is incremented by 1 for each tick. It is an integer.
So, yes, a real COUNT_RATE is for convenience (or, as with
the example in the Fortran standard, for a very poor clock; 18.206
ticks per second seems awfully coarse by today's standards).

> Is there any place in the standard where an INTEGER kind depends on
> a REAL kind, or vice versa?
>
> I notice, for example, that the REAL function with an INTEGER argument
> of any KIND returns default REAL. It might have made sense to return
> an appropriately larger KIND for larger argument KIND. (It does in
> the case of COMPLEX arguments.)

Well, that is how the Fortran standard defined the REAL intrinsic.

> As far as I know, there is no requirement in the standard for any
> connection between INTEGER and REAL kinds. Many systems
> now have 32 and 64 bits for each, but there is no reason that they
> need to do that, other than what is currently common for hardware.

A default integer and a default real each occupy 1 numeric storage unit.
A double precision real occupies 2 numeric storage units. That's
the only requirement that I can think of between integer and real
(without putting in the effort to read the Fortran standard for you).
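
The storage-unit relationship can be observed with the STORAGE_SIZE intrinsic, which reports sizes in bits. The following is a minimal sketch, not code from this thread: the module name `storage_unit_sketch` and its function names are hypothetical, and compiler options that change the default kinds may alter what is reported.

```fortran
module storage_unit_sketch
  implicit none
contains
  ! Default integer and default real occupy the same storage.
  pure logical function default_kinds_share_storage() result(ok)
    ok = storage_size(0) == storage_size(0.0)
  end function default_kinds_share_storage

  ! Double precision occupies twice the storage of default real.
  pure logical function double_is_two_units() result(ok)
    ok = storage_size(0.0d0) == 2 * storage_size(0.0)
  end function double_is_two_units
end module storage_unit_sketch
```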

--
steve