DO CONCURRENT puzzle


pmk

Apr 27, 2020, 3:25:35 PM
Here's a Fortran trivia question for you: what can one say about
the behavior of the following program?

subroutine foo(a,b,c,ix,iy,n)
  integer, intent(in) :: n, ix(*), iy(*)
  real, intent(inout) :: a(*), b(*)
  real, intent(in) :: c(*)
  do concurrent (j=1:n)
    b(ix(j)) = c(j)
    a(j) = b(iy(j))
  end do
end subroutine

program main
  real :: a(2), b(1) = [1.0], c(2) = [2.0, 3.0]
  integer :: ix(2) = [1, 1], iy(2) = [1, 1]
  call foo(a, b, c, ix, iy, 2)
  print *, sum(a)
end program

Your options are:

1) The program is not conformant with Fortran 2018.
2) The program is conformant with Fortran 2018 and must print 2.0.
3) The program is conformant with Fortran 2018 and must print 4.0.
4) The program is conformant with Fortran 2018 and must print 5.0.
5) The program is conformant with Fortran 2018 and must print 6.0.
6) The program is conformant with Fortran 2018 but its output
depends on the compiler, hardware, number of CPUs, and race
conditions at runtime.
7) "I wrote this in PL/I in 1967."
8) None of the above.

Anton Shterenlikht

Apr 28, 2020, 1:37:56 PM
pmk <pkla...@nvidia.com> writes:

>Here's a Fortran trivia question for you: what can one say about
>the behavior of the following program?

> subroutine foo(a,b,c,ix,iy,n)
> integer, intent(in) :: n, ix(*), iy(*)
> real, intent(inout) :: a(*), b(*)
> real, intent(in) :: c(*)
> do concurrent (j=1:n)
> b(ix(j)) = c(j)
> a(j) = b(iy(j))
> end do
> end subroutine

> program main
> real :: a(2), b(1) = [1.0], c(2) = [2.0, 3.0]
> integer :: ix(2) = [1, 1], iy(2) = [1, 1]
> call foo(a, b, c, ix, iy, 2)
> print *, sum(a)
> end program

>Your options are:

>1) The program is not conformant with Fortran 2018.

I think it violates
11.1.7.5 Additional semantics for DO CONCURRENT construct
para 4 bullet 1:


4. If a variable has unspecified locality,
- if it is referenced in an iteration
it shall either be previously defined during that iteration,
or shall not be defined or become undefined during any other iteration;


In the code b(1) is defined and referenced in both iterations.

Anton

Dick Hendrickson

Apr 28, 2020, 1:52:24 PM
Since you’re asking, there must be something hard or tricky about this. ;)

I'd really like to be the first to say 7, but I don't do PL/I; so I’ll
go with 1.

Plugging in the values for ix and iy

  do concurrent (j=1:n)
    b(ix(j)) = c(j)
    a(j) = b(iy(j))
  end do

the loop becomes

  do concurrent (j=1:n)
    b(1) = c(j)
    a(j) = b(1)
  end do

b(1) is defined and referenced on both iterations.

The top part of page 185 in 007-19r1 says, in a couple of places:
“If [a thingo] is defined or becomes undefined during any iteration, it
shall not be referenced, defined, or become undefined during any other
iteration”

19.6.5 (1) says an assignment statement causes the variable to become
defined, so b(1) is multiply defined and therefore, not conformant.
It'd be fine if N were <=1.

I have a feeling I missed something here, possibly in what a "thingo"
is, that's changed over the years.

Dick Hendrickson

pkla...@nvidia.com

Apr 28, 2020, 2:37:34 PM
On Tuesday, April 28, 2020 at 10:37:56 AM UTC-7, Anton Shterenlikht wrote:
> I think it violates
> 11.1.7.5 Additional semantics for DO CONCURRENT construct
> para 4 bullet 1:
>
>
> 4. If a variable has unspecified locality,
> - if it is referenced in an iteration
> it shall either be previously defined during that iteration,
> or shall not be defined or become undefined during any other iteration;
>
>
> In the code b(1) is defined and referenced in both iterations.
>
> Anton

Yes, b(1) is defined, and then referenced later in each iteration; so the requirement "it shall EITHER be previously defined during that iteration, OR ..." has been satisfied, yes?
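
A minimal sketch of why the "previously defined during that iteration" clause matters. It is not part of the original program: it swaps the two statements of the loop body so that b(1) is referenced before it is defined, and it uses ordinary DO loops (an assumption made purely for illustration) to run the two iterations in both serial orders. With the reference first, the result depends on the order.

```fortran
program reference_before_define
  implicit none
  real, parameter :: c(2) = [2.0, 3.0]
  real :: a(2), b(1)
  integer :: j

  ! Swapped loop body, iterations in ascending order.
  b(1) = 1.0
  do j = 1, 2
    a(j) = b(1)    ! reference first: reads whatever was left in b(1)
    b(1) = c(j)    ! definition comes too late for this iteration
  end do
  print *, 'ascending: ', sum(a)     ! 1.0 + 2.0 = 3.0
  if (sum(a) /= 3.0) error stop 'unexpected ascending result'

  ! Same swapped body, iterations in descending order.
  b(1) = 1.0
  do j = 2, 1, -1
    a(j) = b(1)
    b(1) = c(j)
  end do
  print *, 'descending:', sum(a)     ! 3.0 + 1.0 = 4.0
  if (sum(a) /= 4.0) error stop 'unexpected descending result'
end program reference_before_define
```

In the original statement order each iteration defines b(1) before reading it, so no iteration ever observes a value stored by another one.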

FortranFan

Apr 28, 2020, 3:52:11 PM
On Tuesday, April 28, 2020 at 1:52:24 PM UTC-4, Dick Hendrickson wrote:

> ..
> Since you’re asking, there must be something hard or tricky about this. ;)
>
> I'd really like to be the first to say 7, but I don't do PL/I; so I’ll
> go with 1.
> ..

7 is one of the behaviors of this forum for sure!

For the program shown in the original post, particularly with 'ix' and 'iy' as defined, I would pick 4. The array 'a' will end up being the same as 'c' per a conforming processor following the call to foo, if I understood the standard correctly. Thus sum(a) = sum(c) = 5.0.
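
The reasoning above can be traced by hand. The sketch below is an illustration only: it replaces DO CONCURRENT with ordinary DO loops (an assumption, so that the iteration order can be chosen explicitly) and runs the original loop body in ascending and then descending order with the values from the original post. Both orders leave a equal to c.

```fortran
program any_serial_order
  implicit none
  real, parameter :: c(2) = [2.0, 3.0]
  integer, parameter :: ix(2) = [1, 1], iy(2) = [1, 1]
  real :: a(2), b(1)
  integer :: j

  ! Ascending order: j = 1, then j = 2.
  a = 0.0
  b(1) = 1.0
  do j = 1, 2
    b(ix(j)) = c(j)      ! b(1) takes the value c(j) ...
    a(j) = b(iy(j))      ! ... and a(j) reads it back in the same iteration
  end do
  print *, 'ascending: ', a, ' sum =', sum(a)
  if (any(a /= c)) error stop 'ascending: a differs from c'

  ! Descending order: j = 2, then j = 1.
  a = 0.0
  b(1) = 1.0
  do j = 2, 1, -1
    b(ix(j)) = c(j)
    a(j) = b(iy(j))
  end do
  print *, 'descending:', a, ' sum =', sum(a)
  if (any(a /= c)) error stop 'descending: a differs from c'
end program any_serial_order
```

Each iteration copies c(j) through b(1) into a(j), so a ends as [2.0, 3.0] and sum(a) is 5.0 in either order.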

pkla...@nvidia.com

Apr 28, 2020, 4:51:31 PM
On Tuesday, April 28, 2020 at 12:52:11 PM UTC-7, FortranFan wrote:
> For the program shown in the original post, particularly with 'ix' and 'iy' as defined, I would pick 4. The array 'a' will end up being the same as 'c' per a conforming processor following the call to foo, if I understood the standard correctly. Thus sum(a) = sum(c) = 5.0.

We have a winner! Now we can move on to the first follow-up question. Which of the following is a true statement?

1) The requirements placed on the program(mer) for a DO CONCURRENT loop in a conforming program by the Fortran 2018 standard are sufficient to permit the iterations of the loop to safely execute in parallel.
2) The requirements placed on the program(mer) for a DO CONCURRENT loop in a conforming program by the Fortran 2018 standard are sufficient to permit the iterations of the loop to safely execute in any arbitrary serial order, but not necessarily in parallel.

Ian Harvey

Apr 30, 2020, 5:41:41 AM
It is 2. This has been discussed here a few times previously (and I
think I've seen it on the J3 list too) - DO CONCURRENT does not mean
what it says - it should have been called DO UNORDERED or some such.
Hence the stuff added in F2018, to help the compiler figure out whether
it can do things concurrently or not.
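
For readers who have not met them, the F2018 additions are presumably the locality specifiers (LOCAL, LOCAL_INIT, SHARED, DEFAULT(NONE)). A minimal sketch follows; the program name, the temporary t, and the arithmetic are made up for illustration and are unrelated to the puzzle. It needs a compiler that implements F2018 locality specifiers, which some older compiler releases do not.

```fortran
program locality_sketch
  implicit none
  integer, parameter :: n = 4
  real :: a(n), c(n), t
  integer :: j

  c = [1.0, 2.0, 3.0, 4.0]

  ! LOCAL(t): every iteration gets its own t, undefined on entry.
  ! SHARED(a, c): a and c are the same objects in every iteration.
  do concurrent (j = 1:n) local(t) shared(a, c)
    t = 2.0*c(j)       ! defined before use within the iteration
    a(j) = t + 1.0     ! each iteration writes a distinct element of a
  end do

  print *, a           ! 3.0 5.0 7.0 9.0
  if (any(a /= [3.0, 5.0, 7.0, 9.0])) error stop 'unexpected result'
end program locality_sketch
```

The specifiers tell the processor how a named variable is shared between iterations; they say nothing about which elements of a shared array an iteration touches.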


Ron Shepard

Apr 30, 2020, 10:41:37 AM
I think this is correct, although reading the standard on this topic is
very tedious. However, if the array j(:) were declared local to the DO
CONCURRENT construct, then it could be executed both concurrently (in
parallel) and unordered (either serially or in parallel). So it seems
that all that is missing for parallel execution are some additional
constraints on the programmer; sufficient syntax is already there.

$.02 -Ron Shepard

ga...@u.washington.edu

May 1, 2020, 11:29:04 PM
On Monday, April 27, 2020 at 12:25:35 PM UTC-7, pmk wrote:
> Here's a Fortran trivia question for you: what can one say about
> the behavior of the following program?

(snip)

> 7) "I wrote this in PL/I in 1967."

Just a note on one not so obvious difference between Fortran
and PL/I regarding array expressions.

Fortran gives the value as if (if it doesn't actually) the whole
right side is evaluated before changing the left side. Consider:

A = A + A(5)

Fortran guarantees NOT to use the changed value of A(5).

PL/I, on the other hand, designed when memory was small, or otherwise,
guarantees to use the changed value. That is, as if (if not actually)
it is done element by element in memory order.

There are plenty of times when PL/I will generate temporary arrays,
though, and you do have to be careful sometimes.
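
A small Fortran sketch of the difference described above. The array size and values are arbitrary choices for illustration. The array assignment shows the Fortran rule; the explicit DO loop mimics the element-by-element, in-order evaluation attributed to PL/I here, and does not claim to reproduce any particular PL/I compiler.

```fortran
program array_assign_semantics
  implicit none
  real :: a(6), p(6)
  integer :: i

  a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
  p = a

  ! Fortran array assignment: the right side is evaluated as if
  ! completely before any element of a changes, so a(5) is 5.0 throughout.
  a = a + a(5)

  ! Element by element in memory order: p(5) is updated to 10.0
  ! before p(6) is computed, so p(6) sees the changed value.
  do i = 1, 6
    p(i) = p(i) + p(5)
  end do

  print '(a, 6f6.1)', 'array assignment: ', a   ! 6 7 8 9 10 11
  print '(a, 6f6.1)', 'in-order loop:    ', p   ! 6 7 8 9 10 16
  if (any(a /= [6.0, 7.0, 8.0, 9.0, 10.0, 11.0])) error stop 'array case'
  if (any(p /= [6.0, 7.0, 8.0, 9.0, 10.0, 16.0])) error stop 'loop case'
end program array_assign_semantics
```

Only the last element differs, because it is the only one computed after element 5 has been overwritten.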

Also, what does Fortran do with DO CONCURRENT and VOLATILE arrays?








robin....@gmail.com

May 2, 2020, 8:32:31 AM
On Saturday, May 2, 2020 at 1:29:04 PM UTC+10, ga...@u.washington.edu wrote:
> On Monday, April 27, 2020 at 12:25:35 PM UTC-7, pmk wrote:
> > Here's a Fortran trivia question for you: what can one say about
> > the behavior of the following program?
>
> (snip)
>
> > 7) "I wrote this in PL/I in 1967."
>
> Just a note on one not so obvious difference between Fortran
> and PL/I regarding array expressions.
>
> Fortran gives the value as if (if it doesn't actually) the whole
> right side is evaluated before changing the left side. Consider:
>
> A = A + A(5)

That's because an array assignment is treated as a loop in which
the elements are processed in order, thus:

  DO I = 1 TO N;
    A(I) = A(I) + A(5);
  END;

> Fortran guarantees NOT to use the changed value of A(5).
>
> PL/I, on the other hand, designed when memory was small, or otherwise,
> guarantees to use the changed value. That is, as if (if not actually)
> it is done element by element in memory order.
>
> There are plenty of times when PL/I will generate temporary arrays,
> though, and you do have to be careful sometimes.

When an argument is passed to a dummy argument which has a
different type from the argument.

ga...@u.washington.edu

May 2, 2020, 6:28:38 PM
On Saturday, May 2, 2020 at 5:32:31 AM UTC-7, robin...@gmail.com wrote:

(snip, I wrote)

> > Just a note on one not so obvious difference between Fortran
> > and PL/I regarding array expressions.

> > Fortran gives the value as if (if it doesn't actually) the whole
> > right side is evaluated before changing the left side. Consider:

> > A = A + A(5)

> That's because an array assignment is treated as a loop in which
> the elements are processed in order, thus:

> DO I = 1 TO N;
> A(I) = A(I) + A(5);
> END;

That is the Fortran example, which does NOT treat it that way.


> > Fortran guarantees NOT to use the changed value of A(5).

> > PL/I, on the other hand, designed when memory was small, or otherwise,
> > guarantees to use the changed value. That is, as if (if not actually)
> > it is done element by element in memory order.

Yes, PL/I treats it like a loop, but even more, requires it to
be done that way. They could have left it undefined, such that
future parallel processors could do it without a loop.

Fortran, on the other hand, requires a temporary unless the compiler
can figure out that nothing changes at the wrong time.

Even more, FORALL, which seems to have been designed for parallel
processing, still requires the whole right side to be evaluated
before changing the left side. That could be nice for vector
processors with really big vector registers, but not especially
convenient for usual sized registers. So, again, might require
a temporary array, depending on what the compiler can figure out.
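
A short sketch of the FORALL point, with made-up values. The FORALL statement takes every right-side value from the old contents of the array, while the equivalent-looking DO loop propagates the first element. FORALL is obsolescent in Fortran 2018, so a compiler may warn about it.

```fortran
program forall_semantics
  implicit none
  integer :: a(5), d(5), i

  a = [1, 2, 3, 4, 5]
  d = a

  ! FORALL: all right-side values are evaluated before any assignment,
  ! so this is a shift by one element.
  forall (i = 2:5) a(i) = a(i-1)

  ! Ordinary DO loop: each iteration sees the element stored by the
  ! previous one, so the first element is smeared across the array.
  do i = 2, 5
    d(i) = d(i-1)
  end do

  print *, 'forall:', a     ! 1 1 2 3 4
  print *, 'do    :', d     ! 1 1 1 1 1
  if (any(a /= [1, 1, 2, 3, 4])) error stop 'forall case'
  if (any(d /= [1, 1, 1, 1, 1])) error stop 'do case'
end program forall_semantics
```

When the compiler cannot prove that the reads and writes do not overlap, the FORALL form is where a temporary copy may be needed.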

robin....@gmail.com

May 3, 2020, 1:09:26 AM
On Sunday, May 3, 2020 at 8:28:38 AM UTC+10, ga...@u.washington.edu wrote:
> On Saturday, May 2, 2020 at 5:32:31 AM UTC-7, r......@gmail.com wrote:
>
> (snip, I wrote)
>
> > > Just a note on one not so obvious difference between Fortran
> > > and PL/I regarding array expressions.
>
> > > Fortran gives the value as if (if it doesn't actually) the whole
> > > right side is evaluated before changing the left side. Consider:
>
> > > A = A + A(5)
>
> > That's because an array assignment is treated as a loop in which
> > the elements are processed in order, thus:
>
> > DO I = 1 TO N;
> > A(I) = A(I) + A(5);
> > END;
>
> That is the Fortran example, which does NOT treat it that way.

No, that's the PL/I example. That's how PL/I does it.
Note that no temporary array is created.

> > > Fortran guarantees NOT to use the changed value of A(5).
>
> > > PL/I, on the other hand, designed when memory was small, or otherwise,
> > > guarantees to use the changed value. That is, as if (if not actually)
> > > it is done element by element in memory order.
>
> Yes, PL/I treats it like a loop, but even more, requires it to
> be done that way. They could have left it undefined, such that
> future parallel processors could do it without a loop.

There's nothing to stop PL/I array operations from being done
by parallel processors.

That particular case [A = A + A(5); ] would probably have to be done
by looping over all the elements.
It's not the sort of thing one would write in an array operation.

BTW, A = RANDOM(); when A is an array, generates an array filled
with random numbers, because the assignment is carried out in a loop.
The random function returns one pseudo-random number at each call.

> Fortran, on the other hand, requires a temporary

Not particularly efficient when the elements need to be copied
one at a time to the target array. On non-parallel hardware,
essentially, there are two loops -- one to evaluate an expression
and to store the result in a temporary array, and the other
to copy the elements from the temporary array to the target array.

pkla...@nvidia.com

May 7, 2020, 1:35:30 PM
On Thursday, April 30, 2020 at 2:41:41 AM UTC-7, Ian Harvey wrote:
> It is 2. This has been discussed here a few times previously (and I
> think I've seen it on the J3 list too) - DO CONCURRENT does not mean
> what it says - it should have been called DO UNORDERED or some such.
> Hence the stuff added in F2018, to help the compiler figure out whether
> it can do things concurrently or not.

In the original example, how would the locality specifiers of Fortran 2018's
DO CONCURRENT construct convey to a processor that the assignment to the
array B modifies the same element that is referenced later in each iteration
of the loop, so that its value in each iteration can be forwarded and allow
the iterations of the loop to be safely executed in parallel?

And suppose that the values used as indices in the assignment to the array B were
instead pairwise distinct, and distinct from all of the values of the indices
used in the later reference to B. How would the locality specifiers of
Fortran 2018's DO CONCURRENT construct convey to a processor that iterations
of the loop may be safely executed in parallel due to a lack of data flow
from the assignment to the reference?

JCampbell

May 10, 2020, 10:12:35 AM
Does it really matter if it is conformant with Fortran 2018?
It is partly 6 so 8: a data race condition and so not acceptable.

I don't think many clients would consider a program that is conformant with Fortran 2018, but produces an incorrect/uncertain answer, to be an acceptable deliverable. If you push it, i.e. suggest it is conformant to the standard, I'm sure you wouldn't do much more work.

Phillip Helbig (undress to reply)

May 10, 2020, 11:28:19 AM
In article <38489492-4e56-40e8...@googlegroups.com>,
JCampbell <campbel...@gmail.com> writes:

> I don't think many clients would consider a program that is conformant
> with Fortran 2018, but produces an incorrect/uncertain answer, to be an
> acceptable deliverable.

Standard conformance and quality of implementation are two different
things.

I have written my own compiler for VMS:

$ WRITE SYS$OUTPUT "%FORT-F-TCMPLX, program too complex for processor"

I have also ported it to unix:

# echo "program too complex for processor"

Are there non-trivial examples of CODE, as opposed to COMPILERS, which
are standard-conformant but don't do what many expect?

Ron Shepard

May 10, 2020, 12:03:07 PM
Yes, the question is whether f2018 conformance is sufficient to ensure
correct parallel execution.

> It is partly 6 so 8: a data race condition and so not acceptable.

What race condition do you think applies? The same values are assigned
to the shared array on all passes through the loop, so as far as race
conditions, it doesn't matter which value is used.

Of course, this is a contrived example, where seemingly nonconforming
code just happens to be conforming (or not, depending on details of the
standard) because of the specific values chosen for the input arrays. So
the code looks like it is nonconforming, and it looks like there is a
race condition, but given the values of the arguments, those objections
may not apply to the specific example.

$.02 -Ron Shepard

Thomas Koenig

May 10, 2020, 1:27:03 PM
Phillip Helbig (undress to reply) <hel...@asclothestro.multivax.de> schrieb:

> I have written my own compiler for VMS:
>
> $ WRITE SYS$OUTPUT "%FORT-F-TCMPLX, program too complex for processor"
>
> I have also ported it to unix:
>
> # echo "program too complex for processor"

It should be

#! /bin/sh
echo "program too complex for processor" 1>&2
false

or some variant thereof, because

- The shell to use should be indicated

- Error messages should go to standard error

- There should be a return code indicating failure

So, there's bad quality of implementation, and then there's bad
quality of implementation :-)

pkla...@nvidia.com

May 11, 2020, 12:49:31 PM
On Sunday, May 10, 2020 at 7:12:35 AM UTC-7, JCampbell wrote:
> Does it really matter if it is conformant with Fortran 2018?
> It is partly 6 so 8: a data race condition and so not acceptable.
>
> I don't think many clients would consider a program that is conformant with Fortran 2018, but produces an incorrect/uncertain answer, to be an acceptable deliverable. If you push it, ie suggest it is conformant to the standard, I'm sure you wouldn't do much more work.

The program conforms with Fortran 2018.