
Coarray and the meaning of "variable is defined"


Ian Harvey
Jul 22, 2013, 10:55:31 PM
A question out of gross ignorance...

For non-atomic things, F2008 says "if a variable is defined on an image
in a segment, it shall not be referenced, defined or become undefined in
a segment on another image unless the segments are ordered."

Is "variable" considered at the highest level of the object - subobject
"tree", or the lowest level? For example - can you define element i of
an array that is a coarray on one image and element i+1 on another image
without requiring segment ordering?

For example, if the following was a segment and it was unordered across
images, would it be conforming?

INTEGER :: coarray(3)[*]

IF (THIS_IMAGE() <= SIZE(coarray)) &
coarray(THIS_IMAGE())[1] = 1


(The fact that there are further restrictions on allocatable subobjects
makes me think the answer is "Yes".)

Ta,

IanH

Richard Maine
Jul 22, 2013, 11:44:20 PM
Ian Harvey <ian_h...@bigpond.com> wrote:

> A question out of gross ignorance...
>
> For non-atomic things, F2008 says "if a variable is defined on an image
> in a segment, it shall not be referenced, defined or become undefined in
> a segment on another image unless the segments are ordered."
>
> Is "variable" considered at the highest level of the object - subobject
> "tree", or the lowest level? For example - can you define element i of
> an array that is a coarray on one image and element i+1 on another image
> without requiring segment ordering?

I really know nothing of note about coarray stuff, but I do claim to be
able to read standard-speak, so based on that alone...

It would apply to all "levels". If X is an array, then doing

x(1) = 1

defines the variable x(1), but it *ALSO* (partially) defines the
variable x. So the restriction would apply to both x(1) and x. It would
seem that the restriction on x(1) is redundant, but that's ok. I'd still
think that both restrictions apply.

Whether that's what was intended for the coarray stuff, I couldn't tell
you. But that's how I read the words as you cited them. If that's not
what was intended, then those words probably shouldn't say it that way.

--
Richard Maine
email: last name at domain . net
domain: summer-triangle

Ian Harvey
Jul 23, 2013, 1:17:37 AM
Thanks for the response.  I can think of practical implementation
reasons that might make that restriction intentional.  Oh well.

Further thinking about this (which isn't really related to coarrays,
apart from the fact that I may have completely misread the requirement
that follows that one quoted above...) ...

If you have a type:

TYPE my_type
INTEGER, ALLOCATABLE :: comp
END TYPE my_type

and then, perhaps as a local unsaved, nonpointer, nonallocatable
variable in a procedure...

TYPE(my_type) :: a

is the object `a` defined?

I always thought it was, because the value of an object of derived type
includes the allocation status of its components, etc.  Now, reading
closely, I think I see that allocatable components are considered to be
default initialized, and therefore this is covered in the list of things
that cause variables to become defined (number 24 in F2008 - "Invocation
of a procedure that contains an unsaved nonpointer nonallocatable local
variable causes all nonpointer default-initialized subcomponents of the
object to become defined") - so that's all good.

If you then...

ALLOCATE(a%comp)

then does `a` become undefined?

I think it does, because one of its subobjects has become undefined, and
this is perhaps (?) covered in the list of things that cause variables
to become undefined (number 12 in F2008 - "Successful execution of an
ALLOCATE statement with no SOURCE= specifier causes a subcomponent of an
allocated object to become undefined if default initialization has not
been specified for that subcomponent" - but perhaps we are stretching
the friendship in reading "subcomponent" as also covering "the whole thing").

But then if you...

DEALLOCATE(a%comp)

then does `a` become defined again?

I think that's certainly what is intended; but I don't see this
counter-intuitive "deallocation makes something defined" listed.

I do see "when an allocatable entity is deallocated it becomes
undefined", i.e. a%comp becomes undefined, which sort of implies that
`a` becomes undefined, since a%comp is a subobject of `a`.


Anyway, the reason for that ramble was that the next requirement in
F2008 was:

"if the allocation of an allocatable subobject of a coarray ... is
changed on an image in a segment, that subobject shall not be referenced
or defined in a segment on another image unless the segments are ordered"

and I was trying to relate a change in allocation of a subobject to "a
variable is defined" from the previous requirement - there might be
redundancy there.


I conclude I should stick to single image programs until someone else
works it all out.

Ian Harvey
Jul 23, 2013, 8:38:04 AM
On 2013-07-23 3:17 PM, Ian Harvey wrote:
> On 2013-07-23 1:44 PM, Richard Maine wrote:
>> Ian Harvey <ian_h...@bigpond.com> wrote:
>>
>>> A question out of gross ignorance...
>>>
>>> For non-atomic things, F2008 says "if a variable is defined on an image
>>> in a segment, it shall not be referenced, defined or become undefined in
>>> a segment on another image unless the segments are ordered."
>>>
>>> Is "variable" considered at the highest level of the object - subobject
>>> "tree", or the lowest level? For example - can you define element i of
>>> an array that is a coarray on one image and element i+1 on another image
>>> without requiring segment ordering?
>>
>> I really know nothing of note about coarray stuff, but I do claim to be
>> able to read standard-speak, so based on that alone...
>>
>> It would apply to all "levels". If X is an array, then doing
>>
>> x(1) = 1
>>
>> defines the variable x(1), but it *ALSO* (partially) defines the
>> variable x. So the restriction would apply to both x(1) and x. It would
>> seem that the restriction on x(1) is redundant, but that's ok. I'd still
>> think that both restrictions apply.
>>
>> Whether that's what was intended for the coarray stuff, I couldn't tell
>> you. But that's how I read the words as you cited them. If that's not
>> what was intended, then those words probably shouldn't say it that way.

Ok... now reading N1824 from the NAG site, "Coarrays in the next Fortran
Standard" by John Reid, I wonder what was intended.  On page 17, in
the context of a discussion of the restriction above, it says...

It follows that for code in a segment, the compiler is free to
use almost all its normal optimization techniques as if only
one image were present. In particular, code motion optimizations
may be applied, provided calls of atomic subroutines are not
involved. For an example of an optimization that is not
available, consider the code

integer (kind=short) x(8)[*]
:
! Computation that references and alters x(1:7)

Because another image might define x(8) in a segment that is
unordered with respect to this one, the compiler must not
effectively make this replacement:

integer (kind=short) x(8)[*], temp(8)
:
temp(1:8) = x(1:8) ! Faster than temp(1:7) = x(1:7)
! Computation that references and alters temp(1:7)
x(1:8) = temp(1:8)

This implies to me that the author had the expectation that you could
merrily play away with unique subobjects across images.

That would seem to be a bit tricky for processors, in terms of the need
to track modifications to subobjects of a coarray on an image by all
images and then splice those changes back in - what if x were
CHARACTER(100000), and during execution of a segment we modified
characters at positions 2, 3, 5, 7, 11, 13, 17, ... on one image and
characters at positions 1, 4, 9, 16, 25, ... on another unordered image?


(While on the topic of coarrays - if you want to use coarrays with
polymorphic components you appear to be completely and utterly sunk. If
my understanding is correct that's a pretty major feature omission! Is
my understanding correct? If so, has anyone got ideas on how to work
around this?)


michael...@compuserve.com
Jul 23, 2013, 9:27:51 AM
Ian,
John Reid's words are to be found in their latest form in Section 19.13.1 of "Modern Fortran Explained", where some restrictions on what is permitted in a segment that is unordered with respect to another segement are listed. If you don't have that I can send it to you by e-mail.

Regards,

Mike Metcalf

Ian Harvey
Jul 23, 2013, 4:14:42 PM
I've got it (an edition published in 2011), but the only salient difference
is that the example of an optimisation that cannot be performed is
elided from the book. The lead-in text is identical.

Is that elision because the example in N1824 was not valid given the
words in the standard, or because the editor wanted to save on printing
costs?

michael...@compuserve.com
Jul 24, 2013, 4:04:15 AM
Definitely not the latter. I'll investigate but it may take time.

Regards,

Mike Metcalf

michael...@compuserve.com
Jul 26, 2013, 2:13:20 AM
On Tuesday, July 23, 2013 10:14:42 PM UTC+2, Ian Harvey wrote:
Ian,
The reasons why that example was dropped are now obscure, but it does remain valid.

Regards,

Mike Metcalf

rbader
Oct 3, 2013, 11:03:33 AM
I agree wrt validity, and so does John Reid, who I just talked to about this.

For an array coarray, definition of disjoint array subobjects from different images in unordered segments is a very relevant programming pattern (halo updates), so not being able to use this would significantly impair the usability of the feature.

Disjoint substring updates may be quite difficult to implement, though, and probably are not really useful anyway.

Regards

Reinhold

Ian Harvey
Oct 3, 2013, 4:33:57 PM
...
>
> I agree wrt validity, and so does John Reid, who I just talked to
> about this.
>
> For an array coarray, definition of disjoint array subobjects from
> different images in unordered segments is a very relevant programming
> pattern (halo updates), so not being able to use this would
> significantly impair the usability of the feature.
>
> Disjoint substring updates may be quite difficult to implement,
> though, and probably are not really useful anyway.

Given that, for some [all?] character kinds, the internal representation of
a character array and a string practically has to be the same, is there
any real difference?

Terence
Oct 4, 2013, 1:02:14 AM
Ian Harvey quotes:-
"Given that, for some [all?] character kinds, the internal representation of
a character array and a string practically has to be the same, is there
any real difference?"

A character array is historically a linearly adjacent series of 8-bit fields
running from left to right in addresses of storage units of the appropriate
width.
This can be easily (and has been) adapted to 16-bit characters.

A string has always been defined as a length-prefixed character array (again
adaptable as mentioned).
But that length prefix before strings causes a subtle difference in
processing capabilities.


Richard Maine
Oct 4, 2013, 1:28:29 AM
I wonder who this is that has allegedly "always" defined a string like
that. Maybe in some particular narrow field, but not in the world in
general. In particular...

You might have forgotten that this is comp.lang.fortran. Or perhaps you
are thinking about some non-standard feature in some pre-f77 compiler.
Fortran strings in f77 and later pretty much never have a length prefix;
it's hard to do that and make things work like the standard says they
have to.

The internal representation of a character string and an array of
characters essentially has to be the same in Fortran 77 and later (which
is to say, any version of Fortran that has character type in the
standard at all). That's because, for a start, storage association
between character strings and arrays is allowed. Awfully hard to make
all the required cases of that work if the representation isn't the
same. I know of no compiler that has ever tried such a thing. It would
also be hard to make standard Fortran substrings work with an
implementation like that.

If I recall correctly, the term "string" in C has a special meaning if
one gets picky, but that doesn't involve a length prefix. So that
doesn't seem likely to be what you were thinking of.

In standard Fortran, there are differences in the things you can do with
a character string versus an array of characters, but those differences
do not relate to any difference in internal representation.

Ron Shepard
Oct 4, 2013, 12:56:54 PM
In article <1la6r6f.1cx9z911q98v6N%nos...@see.signature>,
nos...@see.signature (Richard Maine) wrote:

> Terence <tbwr...@bigpond.net.au> wrote:
>
> > Ian Harvey quotes:-
> > "Given, for some [all?] character kinds, the internal representation of
> > an character array and a string practically has to be the same, is there
> > any real difference?"
> >
> > A character array is historically a linearly adjacent series of 8 bit fields
> > running from left to right in addresses of storage units of the appropriate
> > width.
> > This can be easily (and has been) adapted to 16-bit characters.

By "historically" I think you mean f77. In f90 and later, arrays
need not be contiguous memory locations, and with various slices and
strides, the memory locations need not be increasing. Also, there
are fortran compilers running on word-addressable machines where
several characters are stored within a word.

> >
> > A string has always been defined as a length-prefixed character array (again
> > adaptable as mentioned).
> > But that length prefix before strings causes a subtle difference in
> > processing capabilities..

I remember that Pascal strings worked this way, but fortran strings
do not. As Richard explains, the substring and argument passing
mechanisms are constrained in various ways so that this
representation would be inefficient and problematic in various ways.

When character strings are passed as arguments, many compilers pass
two separate entities, the string location and the string length.
Other compilers pack these two values into a descriptor, and then
pass that descriptor as a single entity. VAX fortran is an example
of this latter approach. None of this is ever exposed to the
programmer unless you are trying to interoperate with another
language, or with system APIs, hardware devices, or things like that.

>
> I wonder who this is that has allegedly "always" defined a string like
> that. Maybe in some particular narrow field, but not in the world in
> general. In particular...
>
> You might have forgotten that this is comp.lang.fortran. Or perhaps you
> are thinking about some non-standard feature in some pre-f77 compiler.
> Fortran strings in f77 and later pretty much never have a length prefix;
> it's hard to do that and make things work like the standard says they
> have to.
>
> The internal representation of a character string and an array of
> characters essentially has to be the same in Fortran 77 and later (which
> is to say, any version of Fortran that has character type in the
> standard at all). That's because, for a start, storage association
> between character strings and arrays is allowed. Awfully hard to make
> all the required cases of that work if the representation isn't the
> same. I know of no compiler that has ever tried such a thing. It would
> also be hard to make standard Fortran substrings work with an
> implementation like that.
>
> If I recall correctly, the term "string" in C has a special meaning if
> one gets picky, but that doesn't involve a length prefix. So that
> doesn't seem likely to be what you were thinking of.
>
> In standard Fortran, there are differences in the things you can do with
> a character string versus an array of characters, but those differences
> do not relate to any difference in internal representation.

In those cases where a string and an array are required to be
interchangeable, e.g. through argument association, the
compiler is required to create intermediate copies or whatever is
necessary to conform to the standard behavior. The programmer is
not required to do this. However, this affects the efficiency of
some types of operations, so the programmer needs to understand the
consequences.

$.02 -Ron Shepard

Gordon Sande
Oct 4, 2013, 2:25:03 PM
C has two idioms. One is a null terminator and the other is a separate
count. Thus two sets of procedures, the "str…" family and the "strn…"
family.
For a discussion (actually a rant!) on Unicode see the current
<http://www.theregister.co.uk/2013/10/04/verity_stob_unicode/>
and the contained references for the joys of variable length characters.
It seems the first attempt at a standard had some problems and there
have been several iterations.






Terence
Oct 4, 2013, 6:01:44 PM
Ron Shepard said:-
"Also, there are fortran compilers running on word-addressable machines
where
several characters are stored within a word."

Of course. I was a Fortran specialist inside IBM and used many of these
referred-to machines.
Also I have more recently specialised in conversion of (full-of-tricks)
legacy Fortran (to F77/F90/95 compilable compatibility) from other machines
such as VAX and DEC.

I think my use of 'historically' was fairly clear; it obviously didn't mean
now, now that the logical path of Fortran - maintaining it as a leading,
easily-teachable and flexible programming language - has been derailed by
unwise directions in the extensions of the 'standards'.

We 'old buffers' often exchanged the comment that 'in the old days' anything
that was simple, easy-to-use, and efficient as a commercial application was
soon destroyed by the commercial need to change applications in order to
maintain sales.
Understandable, but unwise to allow Fortran to become so complicated instead
of taking the 'C' approach.

Even Richard seems to agree something happened to F90 that could have been
different, and so some positive advantages lost.




Richard Maine
Oct 4, 2013, 6:57:30 PM
Terence <tbwr...@bigpond.net.au> wrote:

> Even Richard seems to agree something happened to F90 that could have been
> different, and so some positive advantages lost.

I can't quite tell exactly what position is being attributed to me here,
but I doubt that I agree with whatever it is.

In the most literal of readings, yes, I agree that everything in this
world is imperfect and could be better. That certainly includes f90; I
have indeed mentioned several specific things that I would have done
differently (some that I didn't like at the time, and others only in
retrospect). But to acknowledge that f90 is less than absolute
perfection seems like such a motherhood-and-apple-pie statement as to be
void of pertinent content.

If that's really all that was meant, then there's essentially no content
to argue with.

I have to wonder whether Terence was trying to imply something with more
content. If so, I can't really tell what it is, but I'd give odds that I
don't agree. From [elided] context, it almost sounds like an attempt to
imply that I agree that f90 is too complex.

I'd not have bothered to reply again at all. I've concluded that Terence
and I disagree on some pretty basic things, and I'm largely content to
just let that disagreement stand. My former reply in this thread was to
correct what seemed to me to be a misstatement of a matter of fact ("a
string has always been defined...").

I'll not argue with Terence's opinions, but I am fairly attentive when
positions get attributed to me, even when I'm not entirely sure what
those alleged positions are - perhaps particularly when I find the
alleged positions unclear.

Ron Shepard
Oct 4, 2013, 9:08:45 PM
In article <l2ndnd$a6a$1...@dont-email.me>,
"Terence" <tbwr...@bigpond.net.au> wrote:

> Also I have more recently specialised in conversion of (full-of-tricks)
> legacy Fortran (to F77/F90/95 compilable compatibility) from other machines
> such as VAX and DEC.
[...]
> Understandable, but unwise to allow Fortran to become so complicated instead
> of taking the 'C' approach.

It is ironic that you try to make these two points in the same post.
The "full-of-tricks" fortran code was written because the language
was not powerful enough to do what needed to be done.

And it was not tame enough to write and maintain large program
systems.

Contrast that to C, especially when it comes to character strings.
Almost every one of the thousands of bugs that are found in various
operating systems and utility programs every year are based on
overrunning string buffers within C and purposely corrupting or
reading from memory in some way. So no, fortran does not need to be
more like C in this respect.

Of course, f90 alone did not fix all of these problems, but it was a
step in the right direction in many ways. What many (and maybe you)
think of as "complicated" in the language now (compared to f77 and
earlier) is actually what makes code easier to write, debug, and
maintain.

$.02 -Ron Shepard

William Clodius
Oct 4, 2013, 9:45:06 PM
Terence <tbwr...@bigpond.net.au> wrote:

> <snip>
> We 'old buffers' often exchanged the comment that 'in the old days' anything
> that was simple, easy-to-use, and efficient as a commercial application was
> soon destroyed by the commercial need to change applications in order to
> maintain sales.
> Understandable, but unwise to allow Fortran to become so complicated instead
> of taking the 'C' approach.

Note that C is steadily being replaced by more complicated descendants,
e.g. C++, Java, C#, etc.

> <snip>

Thomas Koenig
Oct 5, 2013, 4:12:52 PM
Gordon Sande <Gordon...@gmail.com> schrieb:

> C has two idioms. One is a null terminator and the other is an separate
> count. Thus two sets of procedures, the "str…" family and the "strn…"
> family.

Slight correction: The strn* functions take an upper bound
on the string length; the strings they operate on are still
null-terminated.

For non-null-terminated data (not really strings), the mem* functions
are used.

Terence
Oct 5, 2013, 11:52:36 PM
I recall Richard berating me for my prior suggestion that the 'best'
direction for Fortran extension went wrong around or after the F95 version;
he seemed to have responded that if there was any merit at all in my
suggestion I should have said or implied 'after F90'.

Of course, my recent posting was paraphrasing and interpreting my memories,
which is why I thought Richard had 'some' reservations about standards
directions in that time frame.

No, I don't find F95 too complex for me; just unnecessary.
All I wanted beyond F77 was:
required definition of variables,
the appearance of flat memory for array sizes, and
some alternatives to computed GOTOs and variations on DO loops.
I got it with my F90/95 compiler, where I use the above features.

Sorry if Richard had other intentions.

About 'full-of-tricks' source programs from VAX and DEC applications:
both these manufacturers differed in hardware from IBM in ways which
required odd processing using Fortran, and calls to unique hardware
functions, for example to determine sign, supply a negative zero, and store
characters as float arrays with 'unusual' bit counts.
Working for IBM, and considering IBM the founder/supporter of Fortran, I
always felt the IBM way to be the 'normal' way.





invalid
Oct 6, 2013, 6:53:59 AM
On 2013-10-04, Terence <tbwr...@bigpond.net.au> wrote:

> Understandable, but unwise to allow Fortran to become so complicated instead
> of taking the 'C' approach.

What's the 'C' approach? Obfuscation? Most surprise? Smoke and mirrors?

The F2008 standard is 621 pages long.

The C99 standard was 552 pages long. The C11 standard is 701 pages long.

Is this a basis for comparison?

There's no excuse for a language like C to need a 701 page standard. There's
an awful lot of complexity, baggage, and dirty tricks in C for it to need
701 pages.

David Thompson
Oct 12, 2013, 5:39:13 PM
On Fri, 4 Oct 2013 15:25:03 -0300, Gordon Sande
<Gordon...@gmail.com> wrote:

> On 2013-10-04 16:56:54 +0000, Ron Shepard said:
>
> > In article <1la6r6f.1cx9z911q98v6N%nos...@see.signature>,
> > nos...@see.signature (Richard Maine) wrote:
> >
> >> Terence <tbwr...@bigpond.net.au> wrote:
<snip>
> >>> A string has always been defined as a length-prefixed character array (again
> >>> adaptable as mentioned).
> >>> But that length prefix before strings causes a subtle difference in
> >>> processing capabilities..
> >
> > I remember that Pascal strings worked this way, but fortran strings
> > do not. As Richard explains, the substring and argument passing
> > mechanisms are constrained in various ways so that this
> > representation would be inefficient and problematic in various ways.
>
To be exact, *varying* strings in *extended* Pascal were length-prefixed.
Basic Pascal strings actually consisted of just an array of char,
except declared as 'packed', which on some systems saved storage.
But most other languages then understood an array of char okay, so it was
Pascal's varying strings that were noted as needing special handling.

PL/I also (and earlier) had both fixed and varying length-prefixed strings.
(And it had them for *bit* strings as well.) Ada a little later had fixed
padded, varying length-prefixed, AND varying length+pointer.

> C has two idioms. One is a null terminator and the other is an separate
> count. Thus two sets of procedures, the "str…" family and the "strn…"
> family.
>
Not really. Null-terminated is clearly preferred, and the str*
routines are pretty complete; other things like fopen fgets fputs also
use null-terminated. (Since C95, there are also null-terminated
'wide' strings using wcs* and fgetws etc, usually Unicode.)

There are only two strn* routines, strncpy and strncat, which use BOTH
null terminator and separate limit, but strncpy is NOT just a limited
equivalent to strcpy as one might expect; read the spec carefully, or
browse comp.lang.c where this has been re-discussed every few months
for at least the past decade. It is certainly possible to handle
strings as pointer+count in C if you write code to do so or find a
library that does, but it's not standard and it's not very widespread.

<snip rest>

And to Ron's elsethread: several years ago it was true that most
(known) security vulnerabilities were buffer overruns and especially
string overruns in C. That's changed. It took a while, but most
programmers finally learned not to do that and tools were developed to
catch many of them automatically. There are still a few overruns but
in recent years most of the 'low level' bugs on CERT-US are
use-after-free (including multiple-free) and use of uninitialized
(attacker-influenced) memory. And a substantial fraction are 'high
level' things like HTML injection or scripting, SQL injection or
permission cheats, HTTP abuse, and such.

glen herrmannsfeldt
Oct 12, 2013, 8:27:21 PM
David Thompson <dave.th...@verizon.net> wrote:

(snip)

> And to Ron's elsethread: several years ago it was true that most
> (known) security vulnerabilities were buffer overruns and especially
> string overruns in C. That's changed. It took a while, but most
> programmers finally learned not to do that and tools were developed to
> catch many of them automatically.

And the NX bit that prevents execution of parts of memory that
aren't supposed to have instructions in them.

-- glen