Interoperability with empty C structure?

James Van Buskirk

Nov 9, 2010, 9:18:27 PM

Here is a C program with an empty struct:

C:\gfortran\james\api>type junk.c
#include <stdio.h>

int main()
{
    struct junk
    {
        // empty!
    };
    struct junk x;
    printf("Hello, world!\n");
    return 0;
}

C:\gfortran\james\api>gcc junk.c -ojunk

C:\gfortran\james\api>junk
Hello, world!

But when I try to construct an interoperable type I get:

C:\gfortran\james\api>type stuff.f90
module stuff
   implicit none
   type, bind(C) :: junk
      ! Empty!
   end type junk
end module stuff

C:\gfortran\james\api>gfortran -c stuff.f90
stuff.f90:3.24:

type, bind(C) :: junk
1
Error: Derived type 'junk' at (1) is empty

So what should I do?

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end

glen herrmannsfeldt

Nov 9, 2010, 10:22:02 PM

James Van Buskirk <not_...@comcast.net> wrote:
> Here is a C program with an empty struct:

(snip)

> struct junk
> {
> // empty!
> };

I didn't know that was legal, though I never thought about trying it.
As far as I remember, you can't dimension arrays [0] in C.
(snip)

> But when I try to construct an interoperable type I get:

(snip)

> Error: Derived type 'junk' at (1) is empty

> So what should I do?

Complain to the compiler vendor?

-- glen

Gib Bogle

Nov 9, 2010, 10:25:36 PM

...

> C:\gfortran\james\api>gfortran -c stuff.f90
> stuff.f90:3.24:
>
> type, bind(C) :: junk
> 1
> Error: Derived type 'junk' at (1) is empty
>
> So what should I do?
>

Put something in the struct?

Richard Maine

Nov 9, 2010, 11:36:51 PM

Gib Bogle <g.b...@auckland.no.spam.ac.nz> wrote:

Shouldn't be needed. I'd probably go with the "bitch at the compiler
vendor" option, at least as a guess. Might possibly be that C doesn't
allow it so that maybe it can't be in a BIND(C) type; I'm not sure of
that part. But f2003 does add Fortran derived types with no components.
(Such a thing had some uses previously, but becomes more important with
the inheritance features of f2003).

Of course, per my usual bitch about dealing with compilers that just
claim selected f2003 features instead of claiming f2003 conformance, it
could just be that derived types with no components are one of the f2003
features that didn't get added. When you are cherry picking features, it
is easy to miss things like that as they sometimes don't make "feature
lists". You are much more likely to get them if you actually look
through the whole standard instead of cherry picking.

This feature is new to f2003.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain

fj

Nov 9, 2010, 11:43:00 PM

glen herrmannsfeldt wrote:

> James Van Buskirk <not_...@comcast.net> wrote:
>> Here is a C program with an empty struct:
> (snip)
>
>> struct junk
>> {
>> // empty!
>> };
>
> I didn't know that was legal, though I never thought about trying it.
> As far as I remember, you can't dimension arrays [0] in C.

This is legal in C but not in FORTRAN-95 !

> (snip)
>
>> But when I try to construct an interoperable type I get:
> (snip)
>
>> Error: Derived type 'junk' at (1) is empty

This is the message from FORTRAN, not from C

>
>> So what should I do?
>
> Complain to the compiler vendor?

Empty derived types are a FORTRAN-2003 feature

>
> -- glen

--
François Jacq

James Van Buskirk

Nov 10, 2010, 1:16:12 AM

"Richard Maine" <nos...@see.signature> wrote in message
news:1jrpid9.ykt57f33bq9yN%nos...@see.signature...

> Gib Bogle <g.b...@auckland.no.spam.ac.nz> wrote:

>> > So what should I do?

>> Put something in the struct?

> Shouldn't be needed. I'd probably go with the "bitch at the compiler
> vendor" option, at least as a guess. Might possibly be that C doesn't
> allow it so that maybe it can't be in a BIND(C) type; I'm not sure of
> that part. But f2003 does add Fortran derived types with no components.
> (Such a thing had some uses previously, but becomes more important with
> the inheritance features of f2003).

In n1124.pdf, section 6.7.2.1, it says:

"If the struct-declaration-list contains no named members, the
behavior is undefined."

So I don't know whether or not that means C allows it, but the
companion processor (we assume gcc to be the companion to gfortran)
let it slide in my initial C program which compiled, linked, and
ran.

> Of course, per my usual bitch about dealing with compilers that just
> claim selected f2003 features instead of claiming f2003 conformance, it
> could just be that derived types with no components are one of the f2003
> features that didn't get added. When you are cherry picking features, it
> is easy to miss things like that as they sometimes don't make "feature
> lists". You are much more likely to get them if you actually look
> through the whole standard instead of cherry picking.

> This feature is new to f2003.

gfortran seems quite specific in disliking empty structures with the
BIND attribute:

C:\gfortran\james\api>type stuff3.f90
module stuff
   implicit none
   type, bind(C) :: junk1
      ! Empty!
   end type junk1
   type junk2
      ! Empty!
   end type junk2
   type junk3
      sequence
      ! Empty!
   end type junk3
end module stuff

C:\gfortran\james\api>gfortran -c stuff3.f90
stuff3.f90:3.25:

type, bind(C) :: junk1
1
Error: Derived type 'junk1' at (1) is empty

It passes a structure with the SEQUENCE or no attribute, so there
seems to be a reason it's picking on the type junk1 above and not
the others.

steve

Nov 10, 2010, 1:33:30 AM

Looks like gfortran found a bug in your C code.

n1256.pdf

6.2.5: p 35

-- A structure type describes a sequentially allocated nonempty set
of member objects (and, in certain circumstances, an incomplete
array), each of which has an optionally specified name and
possibly distinct type.

--
steve

steve

Nov 10, 2010, 2:00:30 AM

On Nov 9, 7:22 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> James Van Buskirk <not_va...@comcast.net> wrote:
> > Here is a C program with an empty struct:

>
> (snip)
>
> > struct junk
> > {
> > // empty!
> > };
>
> I didn't know that was legal, though I never thought about trying it.

It's not.

--
steve

James Van Buskirk

Nov 10, 2010, 2:34:08 AM

"steve" <kar...@comcast.net> wrote in message
news:03597172-53d2-40fc...@n10g2000prj.googlegroups.com...

> n1256.pdf

> 6.2.5: p 35

> -- A structure type describes a sequentially allocated nonempty set
> of member objects (and, in certain circumstances, an incomplete
> array), each of which has an optionally specified name and
> possibly distinct type.

But then why does it say in section 6.7.2.1 of n1124.pdf that

"If the struct-declaration-list contains no named members, the behavior
is undefined" ? It seems to me that if the set has to be nonempty
then the behavior would be defined as being an error or something,
not simply undefined. Does that mean that gcc is wrong to accept
the syntax? I have a much harder time interpreting the C standard
than the Fortran standard for some reason. Where does it say in the
Fortran standard itself that a structure with the BIND attribute
must not be empty?

steve

Nov 10, 2010, 11:58:44 AM

On Nov 9, 11:34 pm, "James Van Buskirk" <not_va...@comcast.net> wrote:
> "steve" <kar...@comcast.net> wrote in message
>
> news:03597172-53d2-40fc...@n10g2000prj.googlegroups.com...
>
> > n1256.pdf
> > 6.2.5: p 35
> > -- A structure type describes a sequentially allocated nonempty set
> > of member objects (and, in certain circumstances, an incomplete
> > array), each of which has an optionally specified name and
> > possibly distinct type.
>
> But then why does it say in section 6.7.2.1 of n1124.pdf that
> "If the struct-declaration-list contains no named members, the behavior
> is undefined" ?

I believe it comes down to the definitions of 'named member'
vs 'unnamed member'. Unfortunately, I cannot find a clear definition
for either term in n1124.pdf.

> It seems to me that if the set has to be nonempty
> then the behavior would be defined as being an error or something,
> not simply undefined. Does that mean that gcc is wrong to accept
> the syntax?

Yes, I believe that is correct. Note, however, I've been known
to misread both the Fortran and C standards. :-) Note**2, that
if you use -Wall with gcc, you get a warning about an empty
struct. Perhaps the gcc C developers are allowing an empty
struct as an extension.

> I have a much harder time interpreting the C standard
> than the Fortran standard for some reason. Where does it say in the
> Fortran standard itself that a structure with the BIND attribute
> must not be empty?

Given Richard's posts, I went looking through the Fortran 2003
standard.
I think your Fortran code is conforming and gfortran should not emit
an error. The passage for Fortran 2003 that I believe allows binding
is from 15.2

The following subclauses define the conditions under which
a Fortran entity is interoperable. If a Fortran entity is
interoperable, an equivalent entity may be defined by means
of C and the Fortran entity is said to be interoperable with
the C entity. There does not have to be such an interoperating
C entity.

That last sentence seems to be a catch-all for allowing a valid
Fortran 2003 entity to have a bind(c) attribute without necessarily
having a corresponding valid C entity.

--
steve

steve

Nov 10, 2010, 12:03:02 PM

On Nov 9, 10:16 pm, "James Van Buskirk" <not_va...@comcast.net> wrote:
>
> C:\gfortran\james\api>gfortran -c stuff3.f90
> stuff3.f90:3.25:
>
> type, bind(C) :: junk1
> 1
> Error: Derived type 'junk1' at (1) is empty
>
> It passes a structure with the SEQUENCE or no attribute, so there
> seems to be a reason it's picking on the type junk1 above and not
> the others.

This appears to date back to when C binding was added to gfortran.
You appear to be the first person to try to bind an empty derived
type to an empty struct. In particular, the code is

gcc/fortran/symbol.c (verify_bind_c_derived_type)

  curr_comp = derived_sym->components;

  /* TODO: is this really an error? */
  if (curr_comp == NULL)
    {
      gfc_error ("Derived type '%s' at %L is empty",
                 derived_sym->name, &(derived_sym->declared_at));
      return FAILURE;
    }

--
steve

steve

Nov 10, 2010, 12:20:36 PM

On Nov 10, 8:58 am, steve <kar...@comcast.net> wrote:
> On Nov 9, 11:34 pm, "James Van Buskirk" <not_va...@comcast.net> wrote:
>
> > "steve" <kar...@comcast.net> wrote in message
>
> >news:03597172-53d2-40fc...@n10g2000prj.googlegroups.com...
>
> > > n1256.pdf
> > > 6.2.5: p 35
> > > -- A structure type describes a sequentially allocated nonempty set
> > > of member objects (and, in certain circumstances, an incomplete
> > > array), each of which has an optionally specified name and
> > > possibly distinct type.
>
> > But then why does it say in section 6.7.2.1 of n1124.pdf that
> > "If the struct-declaration-list contains no named members, the behavior
> > is undefined" ?
>
> I believe it comes down to the definitions of 'named member'
> vs 'unnamed member'. Unfortunately, I cannot find a clear definition
> for either term in n1124.pdf.

Here's a discussion of the behavior.

http://stackoverflow.com/questions/1626446/what-is-the-size-of-an-empty-struct-in-c

--
steve

nm...@cam.ac.uk

Nov 10, 2010, 12:23:51 PM

In article <54d44789-3418-4fec...@b19g2000prj.googlegroups.com>,

steve <kar...@comcast.net> wrote:
>On Nov 9, 11:34 pm, "James Van Buskirk" <not_va...@comcast.net> wrote:
>>
>> > n1256.pdf
>> > 6.2.5: p 35
>> > -- A structure type describes a sequentially allocated nonempty set
>> > of member objects (and, in certain circumstances, an incomplete
>> > array), each of which has an optionally specified name and
>> > possibly distinct type.
>>
>> But then why does it say in section 6.7.2.1 of n1124.pdf that
>> "If the struct-declaration-list contains no named members, the behavior
>> is undefined" ?
>
>I believe it comes down to the definitions of 'named member'
>vs 'unnamed member'. Unfortunately, I cannot find a clear definition
>for either term in n1124.pdf.

Eh? You probably can't, but the meaning is clear. Unnamed members
are the bit-fields without a member name.

Now, I can't answer the question of why, but the usual reason for
such aberrations is that there were two vendors, one of which had
a compiler that accepted them and one of which rejected them, and
neither of which wanted to have to change. So the behaviour was
made undefined (no diagnostic needed) rather than a constraint.

>> It seems to me that if the set has to be nonempty
>> then the behavior would be defined as being an error or something,
>> not simply undefined. Does that mean that gcc is wrong to accept
>> the syntax?
>
>Yes, I believe that is correct. Note, however, I've been known
>to misread both the Fortran and C standards. :-) Note**2, that
>if you use -Wall with gcc, you get a warning about an empty
>struct. Perhaps the gcc C developers are allowing an empty
>struct as an extension.

If there are no members at all, then there is a breach of the
syntax rule, and a compiler is required to emit a diagnostic.
It is NOT required to reject the program - this is C, not Ada.

>> I have a much harder time interpreting the C standard
>> than the Fortran standard for some reason. Where does it say in the
>> Fortran standard itself that a structure with the BIND attribute
>> must not be empty?

That fails to surprise me. So do I, and I know the C one much better.

>Given Richard's posts, I went looking through the Fortran 2003
>standard.
>I think your Fortran code is conforming and gfortran should not emit
>an error. The passage for Fortran 2003 that I believe allows binding
>is from 15.2
>
> The following subclauses define the conditions under which
> a Fortran entity is interoperable. If a Fortran entity is
> interoperable, an equivalent entity may be defined by means
> of C and the Fortran entity is said to be interoperable with
> the C entity. There does not have to be such an interoperating
> C entity.
>
>That last sentence seems to be a catch-all for allowing a valid
>Fortran 2003 entity to have a bind(c) attribute without necessarily
>having a corresponding valid C entity.

That's how I read it, too.

Regards,
Nick Maclaren.

James Van Buskirk

Nov 10, 2010, 12:30:03 PM

"steve" <kar...@comcast.net> wrote in message

news:06b57a6b-bc06-48f5...@n32g2000prc.googlegroups.com...

> You appear to be the first person to try to bind an empty derived
> type to an empty struct.

Actually it happened without trying. gfortran rejected all of the
components of my derived type for syntax errors and then printed
the error message as a compound error.

> /* TODO: is this really an error? */
> if (curr_comp == NULL)
> {
> gfc_error ("Derived type '%s' at %L is empty",
> derived_sym->name, &(derived_sym->declared_at));
> return FAILURE;
> }

If it turns out that you eventually conclude that the condition
is definitely an error, perhaps you could insert some verbiage
in there about the derived type having the BIND attribute. The
message as it stands looks like it's saying that it is not
allowed for any derived type to be empty and I remember that
the gfortran team addressed that issue years ago.

James Van Buskirk

Nov 10, 2010, 12:46:26 PM

"steve" <kar...@comcast.net> wrote in message

news:54d44789-3418-4fec...@b19g2000prj.googlegroups.com...

> I think your Fortran code is conforming and gfortran should not emit
> an error. The passage for Fortran 2003 that I believe allows binding
> is from 15.2

> The following subclauses define the conditions under which
> a Fortran entity is interoperable. If a Fortran entity is
> interoperable, an equivalent entity may be defined by means
> of C and the Fortran entity is said to be interoperable with
> the C entity. There does not have to be such an interoperating
> C entity.

> That last sentence seems to be a catch-all for allowing a valid
> Fortran 2003 entity to have a bind(c) attribute without necessarily
> having a corresponding valid C entity.

But does it mean no such valid C entity in the C language or just
in the C parts of the program that the Fortran code is interoperating
with? Maybe I am trying too hard to see how one could read that
sentence either way, though. Thanks for taking the time to probe
this question.

steve

Nov 10, 2010, 1:28:27 PM

I tried to downgrade the error to warning, which seems sensible.
Unfortunately, the simple change of gfc_error() to gfc_warning()
yields

laptop:kargl[235] gfc4x -c po.f90
po.f90:3.24:

type, bind(C) :: junk
1
Warning: Derived type 'junk' at (1) is empty
f951: internal compiler error: Segmentation fault: 11
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

Guess I'll open a PR.

--
steve

nm...@cam.ac.uk

Nov 10, 2010, 1:31:58 PM

In article <ibelpm$r5u$1...@news.eternal-september.org>,

James Van Buskirk <not_...@comcast.net> wrote:

>"steve" <kar...@comcast.net> wrote in message
>news:54d44789-3418-4fec...@b19g2000prj.googlegroups.com...
>

>> That last sentence seems to be a catch-all for allowing a valid
>> Fortran 2003 entity to have a bind(c) attribute without necessarily
>> having a corresponding valid C entity.
>
>But does it mean no such valid C entity in the C language or just
>in the C parts of the program that the Fortran code is interoperating
>with? Maybe I am trying too hard to see how one could read that
>sentence either way, though. Thanks for taking the time to probe
>this question.

I think that you are trying too hard. In both C and Fortran, that
is not really a meaningful question for an interoperable type.

Regards,
Nick Maclaren.

glen herrmannsfeldt

Nov 10, 2010, 1:48:59 PM

James Van Buskirk <not_...@comcast.net> wrote:

(snip)

>> That last sentence seems to be a catch-all for allowing a valid
>> Fortran 2003 entity to have a bind(c) attribute without necessarily
>> having a corresponding valid C entity.

> But does it mean no such valid C entity in the C language or just
> in the C parts of the program that the Fortran code is interoperating
> with? Maybe I am trying too hard to see how one could read that
> sentence either way, though. Thanks for taking the time to probe
> this question.

As I believe Richard previously mentioned regarding functions
and BIND(C), even without a valid C implementation it needs to
interoperate with other Fortran BIND(C) structures.

It would seem that should be possible, even without a C compiler.

Note also that it is common in C to determine the size of
an array with the expression sizeof(array)/sizeof(*array), which
assumes that the sizeof(array element) is not zero.

-- glen

steve

Nov 10, 2010, 3:17:08 PM

James, would you consider the following a satisfactory warning?

laptop:kargl[255] gfc4x -c po.f90
po.f90:3.24:

type, bind(C) :: junk
1
Warning: Derived type 'junk' with BIND(C) attribute at (1) is empty,
and may be inaccessible by the C companion processor

--
steve

Richard Maine

Nov 10, 2010, 3:46:20 PM

James Van Buskirk <not_...@comcast.net> wrote:

> "steve" <kar...@comcast.net> wrote in message
> news:54d44789-3418-4fec...@b19g2000prj.googlegroups.com...
>
> > I think your Fortran code is conforming and gfortran should not emit
> > an error. The passage for Fortran 2003 that I believe allows binding
> > is from 15.2
>
> > The following subclauses define the conditions under which
> > a Fortran entity is interoperable. If a Fortran entity is
> > interoperable, an equivalent entity may be defined by means
> > of C and the Fortran entity is said to be interoperable with
> > the C entity. There does not have to be such an interoperating
> > C entity.
>
> > That last sentence seems to be a catch-all for allowing a valid
> > Fortran 2003 entity to have a bind(c) attribute without necessarily
> > having a corresponding valid C entity.
>
> But does it mean no such valid C entity in the C language or just
> in the C parts of the program that the Fortran code is interoperating
> with?

I think the latter (in the C parts of the program).

I interpret that as saying that there does not have to actually be a
specific interoperating entity in the program. It is sort of like having
a COMMON block with only one instance. COMMON blocks are for
communication between different scopes (they can have other uses, but
those are more or less happenstance; the basic purpose is for
communication). But you are allowed to have an instance of a COMMON block
even though it doesn't actually communicate with any other instance.
Likewise, the words above are saying (at least to me) that you can have
an interoperable Fortran entity even though there is no C code in sight
that it actually interoperates with.

Basically, it emphasizes the "may" in the previous sentence, which said
that "an equivalent entity may be defined". Just because there may be
one doesn't mean that there has to be one. I think it is redundant, as
that's already what "may" means, but someone probably wanted to
emphasize it. Sometimes attempts to emphasize things actually end up
confusing them because people assume that there must be some new content
being conveyed rather than just redundant restatement.

So if the C language did not actually allow an interoperating entity
(and I don't pretend to know C well enough to say anything about that),
I think that a compiler would be justified in bitching. Such a bitch
would not be required in any case. A compiler could certainly allow the
form as an extension with no diagnostic (or with an extension warning).
In fact, that would probably make a lot of sense if it were targeting a
C compiler that also allowed the form, even if the C standard did not.

James Van Buskirk

Nov 10, 2010, 3:46:36 PM

"steve" <kar...@comcast.net> wrote in message

news:71af94e0-28d7-448d...@p20g2000prf.googlegroups.com...

> James, would you consider the following a satisfactory warning.
> laptop:kargl[255] gfc4x -c po.f90
> po.f90:3.24:

> type, bind(C) :: junk
> 1
> Warning: Derived type 'junk' with BIND(C) attribute at (1) is empty,
> and may be inaccessible by the C companion processor

Beyond satisfactory. Excellent clarity and terseness.

nm...@cam.ac.uk

Nov 11, 2010, 3:50:19 AM

In article <1jrqqrr.78seitx8a2qN%nos...@see.signature>,

Richard Maine <nos...@see.signature> wrote:
>James Van Buskirk <not_...@comcast.net> wrote:
>> "steve" <kar...@comcast.net> wrote in message
>> news:54d44789-3418-4fec...@b19g2000prj.googlegroups.com...
>>
>> > I think your Fortran code is conforming and gfortran should not emit
>> > an error. The passage for Fortran 2003 that I believe allows binding
>> > is from 15.2
>>
>> > The following subclauses define the conditions under which
>> > a Fortran entity is interoperable. If a Fortran entity is
>> > interoperable, an equivalent entity may be defined by means
>> > of C and the Fortran entity is said to be interoperable with
>> > the C entity. There does not have to be such an interoperating
>> > C entity.
>>
>> > That last sentence seems to be a catch-all for allowing a valid
>> > Fortran 2003 entity to have a bind(c) attribute without necessarily
>> > having a corresponding valid C entity.
>>
>> But does it mean no such valid C entity in the C language or just
>> in the C parts of the program that the Fortran code is interoperating
>> with?
>
>I think the latter (in the C parts of the program).
>
>I interpret that as saying that there does not have to actually be a
>specific interoperating entity in the program. It is sort of like having

>a COMMON block with only one instance. ...

>Likewise, the words above are saying (at least to me) that you can have
>an interoperable Fortran entity even though there is no C code in sight
>that it actually interoperates with.

Grrk. The point here is that we are talking about a TYPE and not an
OBJECT - COMMON blocks are thoroughly confusing, because they are
betwixt and between. And the concept of 'existence' of a type (as
distinct from a declaration of that type) in both C and Fortran is
murky.

>Basically, it emphasizes the "may" in the previous sentence, which said
>that "an equivalent entity may be defined". Just because there may be
>one doesn't mean that there has to be one. I think it is redundant, as
>that's already what "may" means, but someone probably wanted to
>emphasize it. Sometimes attempts to emphasize things actually end up
>confusing them because people assume that there must be some new content
>being conveyed rather than just redundant restatement.

Hence the rules against repetitive specification.

>So if the C language did not actually allow an interoperating entity
>(and I don't pretend to know C well enough to say anything about that),
>I think that a compiler would be justified in bitching. Such a bitch
>would not be required in any case. A compiler could certainly allow the
>form as an extension with no diagnostic (or with an extension warning).
>In fact, that would probably make a lot of sense if it were targeting a
>C compiler that also allowed the form, even if the C standard did not.

I do know that language well enough, and should have got involved
when the C interoperability section was designed. I regret to say
that it is even more of a mess than it need be (and, because of the
nature of C, a mess was unavoidable). I can't give a clear example
other than an empty structure of where Fortran's rules for derived
types are clearly incompatible with C's, but can give a couple of
ones where they are unclearly incompatible :-(

Firstly, traditional C does not support large arrays on the stack
or in structures, and not always as static, either - you were and
are expected to use malloc for such things. I can't tell you how
many compilers have low limits, but there are almost certainly
still cases where this prevents interoperability. Certainly, you
are out of your mind if you put another entry in a BIND(C) derived
type (or COMMON block) following a large array - most Fortran
compilers will at least generate correct code, but I wouldn't
bet on all C ones doing it. And similar remarks apply to the
relative performance.

Secondly, C's rules on alignment are FAR less well-defined and more
foully complicated than most C 'experts' realise. For example, it
is not commonly known, but the alignment of an array within a
structure can depend on the array's size - I believe that this is
the reason for the bizarre wording in C99 6.7.2.1 paragraph 16:

"the offset of the array shall remain that of the flexible array
member, even if this would differ from that of the replacement
array."

Now, put THAT in your pipe and smoke it, Fortran! :-)

So it is certainly possible for there to be perfectly good BIND(C)
derived types that don't interoperate with the companion processor,
even if many others do. Now, whether you claim that it should not
be regarded as a conforming companion processor, or whether you
regard this as an unfortunate processor-dependent restriction, is
a matter for debate ....

Regards,
Nick Maclaren.

glen herrmannsfeldt

Nov 11, 2010, 5:56:14 AM

nm...@cam.ac.uk wrote:
(snip)

> Grrk. The point here is that we are talking about a TYPE and not an
> OBJECT - COMMON blocks are thoroughly confusing, because they are
> betwixt and between. And the concept of 'existence' of a type (as
> distinct from a declaration of that type) in both C and Fortran is
> murky.

Are COMMON blocks of length zero allowed yet?

(snip)

> Hence the rules against repetitive specification.

(snip)

> I do know that language well enough, and should have got involved
> when the C interoperability section was designed. I regret to say
> that it is even more of a mess than it need be (and, because of the
> nature of C, a mess was unavoidable). I can't give a clear example
> other than an empty structure of where Fortran's rules for derived
> types are clearly incompatible with C's, but can give a couple of
> ones where they are unclearly incompatible :-(

> Firstly, traditional C does not support large arrays on the stack
> or in structures, and not always as static, either - you were and
> are expected to use malloc for such things. I can't tell you how
> many compilers have low limits, but there are almost certainly
> still cases where this prevents interoperability. Certainly, you
> are out of your mind if you put another entry in a BIND(C) derived
> type (or COMMON block) following a large array - most Fortran
> compilers will at least generate correct code, but I wouldn't
> bet on all C ones doing it. And similar remarks apply to the
> relative performance.

I remember a problem with static arrays in Alpha/OSF1.
With 64 bit addressing, one should expect to be able to
create large arrays without much of a problem, so I was
certainly surprised to see it fail at 100K bytes.
I didn't try to track down the source of the problem, though.
Maybe there is a special addressing mode for static data
that uses 16 bit addresses.

Even so, I don't blame C, but the implementation and/or hardware.

I believe that there is a suggestion somewhere in C89 that
int has at least 16 bits, and that arrays should have at
least 32K elements (or maybe it is bytes). Does Fortran
even require that?

There has to be a limit somewhere. (The machine address space
if nothing else comes first.) Sometimes the limits are surprising,
but that is one problem with real machines instead of ideal ones.

> Secondly, C's rules on alignment are FAR less well-defined and more
> foully complicated than most C 'experts' realise. For example, it
> is not commonly known, but the alignment of an array within a
> structure can depend on the array's size - I believe that this is
> the reason for the bizarre wording in C99 6.7.2.1 paragraph 16:

> "the offset of the array shall remain that of the flexible array
> member, even if this would differ from that of the replacement
> array."

> Now, put THAT in your pipe and smoke it, Fortran! :-)

That may be, but I still find the idea of Fortran non-sequence
types changing the order of members to be even more strange.

> So it is certainly possible for there to be perfectly good BIND(C)
> derived types that don't interoperate with the companion processor,
> even if many others do. Now, whether you claim that it should not
> be regarded as a conforming companion processor, or whether you
> regard this as an unfortunate processor-dependent restriction, is
> a matter for debate ....

-- glen

nm...@cam.ac.uk

Nov 11, 2010, 6:19:09 AM

In article <ibgi4e$63j$1...@news.eternal-september.org>,

glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
>Are COMMON blocks of length zero allowed yet?

I hope not!

>I remember a problem with static arrays in Alpha/OSF1.
>With 64 bit addressing, one should expect to be able to
>create large arrays without much of a problem, so I was
>certainly surprised to see it fail at 100K bytes.
>I didn't try to track down the source of the problem, though.
>Maybe there is a special addressing mode for static data
>that uses 16 bit addresses.

Probably 64KB, actually. I could explain the most common reasons,
of which that is not one - and CERTAINLY wasn't on the Alpha!
It was almost certainly the linker format.

>Even so, I don't blame C, but the implementation and/or hardware.

There are a lot of implicit assumptions in almost all languages,
such as that Fortran expects reasonable numerical properties of
the floating-point operations. And one of those in traditional C
was that large objects were NOT allocated statically or on the
stack, and structures were small.

>I believe that there is a suggestion somewhere in C89 that
>int has at least 16 bits, and that arrays should have at
>least 32K elements (or maybe it is bytes). Does Fortran
>even require that?
>
>Even so, I don't blame C, but the implementation and/or hardware.

It's more than a suggestion, and it's bytes. However, it doesn't
make it clear whether the compiler must be able to handle ANY type
declaration of up to 65535 or merely at least one of that size!
It certainly used to be common that there were restrictions of that
magnitude for initialised static objects, automatic objects and
structure types, but not arrays as such.
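
For reference, a minimal sketch of the minimum limits in question, assuming a hosted implementation. C89 lists 32767 bytes in an object among its translation limits (C99 raised that to 65535), and only requires that at least one program containing at least one instance of each limit be accepted. The helper names here are made up for the illustration.

```c
#include <limits.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if this implementation meets the
   minimum magnitudes the C standard requires for int and size_t. */
int meets_minimum_limits(void)
{
    return INT_MAX >= 32767 && SIZE_MAX >= 65535u;
}

/* One object of 32767 bytes: the C89 minimum for "bytes in an object"
   on a hosted implementation. */
static char buffer[32767];

size_t buffer_size(void)
{
    return sizeof buffer;
}
```

Passing these checks says nothing about whether a second object of that size, or an automatic one, will also be accepted, which is the ambiguity noted above.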

Fortran doesn't have a miscellaneous collection of limits, which
is one reason it does better - seriously.

>There has to be a limit somewhere. (The machine address space
>if nothing else comes first.) Sometimes the limits are surprising,
>but that is one problem with real machines instead of ideal ones.

And you fell foul of one. There was nothing wrong with the
implementation or hardware. Sorry, but that's the situation.

>That may be, but I still find the idea of Fortran non-sequence
>types changing the order of members to be even more strange.

That's almost certainly because you are unfamiliar with strongly
typed languages - or even PL/I :-) There is no more reason that
the order of elements in a structure should be the lexical one
than there is for the order of declared objects in a procedure.

I don't find the idea that alignment can depend on size strange;
I find that it conflicts badly with the rest of the C language.
In strongly typed languages (including modern Fortran), it's
perfectly natural.

Regards,
Nick Maclaren.

glen herrmannsfeldt

Nov 11, 2010, 6:32:33 AM

nm...@cam.ac.uk wrote:
> In article <ibgi4e$63j$1...@news.eternal-september.org>,
(snip, I wrote)

>>Even so, I don't blame C, but the implementation and/or hardware.

> There are a lot of implicit assumptions in almost all languages,
> such as that Fortran expects reasonable numerical properties of
> the floating-point operations. And one of those in traditional C
> was that large objects were NOT allocated statically or on the
> stack, and structures were small.

Yes, and those make sense 99% of the time. But on a machine
with 16GB RAM, a 100K static array seemed small to me...

(snip)

> It's more than a suggestion, and it's bytes. However, it doesn't
> make it clear whether the compiler must be able to handle ANY type
> declaration of up to 65535 or merely at least one of that size!
> It certainly used to be common that there were restrictions of that
> magnitude for initialised static objects, automatic objects and
> structure types, but not arrays as such.

And then there is Java, where subscripts are defined to be int,
and int is defined to be 32 bits signed. One could have a large
array of bool, even on a not-so-large machine and reach the limit.

> Fortran doesn't have a miscellaneous collection of limits, which
> is one reason it does better - seriously.

>>There has to be a limit somewhere. (The machine address space
>>if nothing else comes first.) Sometimes the limits are surprising,
>>but that is one problem with real machines instead of ideal ones.

> And you fell foul of one. There was nothing wrong with the
> implementation or hardware. Sorry, but that's the situation.

Last I knew, the Java class file still has some 64K limits, though
I haven't heard of so many problems caused by them. Programs
are still getting bigger, though.

-- glen

nm...@cam.ac.uk

Nov 11, 2010, 6:47:15 AM

In article <ibgk8h$d8q$1...@news.eternal-september.org>,

glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
>>>Even so, I don't blame C, but the implementation and/or hardware.
>
>> There are a lot of implicit assumptions in almost all languages,
>> such as that Fortran expects reasonable numerical properties of
>> the floating-point operations. And one of those in traditional C
>> was that large objects were NOT allocated statically or on the
>> stack, and structures were small.
>
>Yes, and those make sense 99% of the time. But on a machine
>with 16GB RAM, a 100K static array seemed small to me...

It's not the static nature that's the problem, but the need to
initialise it. Fortran has the concept of uninitialised static
data; C doesn't.

Regards,
Nick Maclaren.

glen herrmannsfeldt

Nov 11, 2010, 10:40:42 AM

nm...@cam.ac.uk wrote:
(snip regarding small static data space on some systems)

>>Yes, and those make sense 99% of the time. But on a machine
>>with 16GB RAM, a 100K static array seemed small to me...

> It's not the static nature that's the problem, but the need to
> initialise it.

Yes, but people writing link file formats should know that by now,
especially for 64 bit addressing space hosts.

Well, that was about 10 years ago, when 16G was still a lot.

> Fortran has the concept of uninitialised static data; C doesn't.

But with Fortran compilers using C compiler back ends, I
expect that they get the initialized static data included.

-- glen

nm...@cam.ac.uk

Nov 11, 2010, 10:53:07 AM

In article <ibh2pq$jhj$1...@news.eternal-september.org>,

glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
>>>Yes, and those make sense 99% of the time. But on a machine
>>>with 16GB RAM, a 100K static array seemed small to me...
>
>> It's not the static nature that's the problem, but the need to
>> initialise it.
>
>Yes, but people writing link file formats should know that by now,
>especially for 64 bit addressing space hosts.

Eh? Since nobody with half a clue actually attempts to initialise
massive arrays, why should they provide a completely separate
mechanism to support that? I have implemented suitable mechanisms,
incidentally, and the changes are far more fundamental and pervasive
than you apparently think.

In particular, you need to change from placing the initialised
data in the executable in the form it will be loaded to calling an
initialisation procedure immediately following program loading.
And that's GREAT fun when you have mixed language programs and no
control of which one gets control on entry. It can be done, but
it's not easy and usually needs extensive work to the linker and
loader as well as the compilers.

>Well, that was about 10 years ago, when 16G was still a lot.

Precisely. Now consider a C program with the following:

static double fred[4096][4096][4096];

Do you REALLY want a 512 GB executable?
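
To check the arithmetic: 4096^3 elements is 2^36, and at 8 bytes per double (an assumption, though a near-universal one) that comes to 2^39 bytes, i.e. 512 GiB. A small sketch, with a made-up function name:

```c
#include <stdint.h>

/* Bytes needed for double fred[n][n][n], computed in 64-bit arithmetic
   so the product cannot wrap on a platform with a 32-bit size_t. */
uint64_t cube_array_bytes(uint64_t n)
{
    return n * n * n * (uint64_t)sizeof(double);
}
```

With n = 4096 and 8-byte doubles this returns 549755813888, and shifting that right by 30 bits gives 512.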

>> Fortran has the concept of uninitialised static data; C doesn't.
>
>But with Fortran compilers using C compiler back ends, I
>expect that they get the initialized static data included.

Yes, but they DON'T initialise all static data by default! The
Fortran equivalent of the above code doesn't have a problem.

Regards,
Nick Maclaren.

Dan Nagle

Nov 11, 2010, 12:38:39 PM

Hello,

On 2010-11-11 06:19:09 -0500, nm...@cam.ac.uk said:

> In article <ibgi4e$63j$1...@news.eternal-september.org>,
> glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>>
>> Are COMMON blocks of length zero allowed yet?
>
> I hope not!

According to R568, a common statement must have a <common-block-object-list>.

According to R101, an <xyz-list> has at least one <xyz>.

So there must be at least one <common-block-object>.

By R569, a <common-block-object> must be a variable name.

Hmm. I don't see a prohibition on zero-sized arrays or characters.

Since common is archaic, and a zero-sized thingo is only possible
with the modern stuff, this may be an oversight.

I'm not motivated to pursue this further.
If you want to do so, go ahead. :-)

--
Cheers!

Dan Nagle

nm...@cam.ac.uk

Nov 11, 2010, 12:49:34 PM

In article <ibh9mu$6ji$1...@news.eternal-september.org>,

Dan Nagle <dann...@verizon.net> wrote:
>
>>> Are COMMON blocks of length zero allowed yet?
>>
>> I hope not!
>

>Hmm. I don't see a prohibition on zero-sized arrays or characters.
>
>Since common is archaic, and a zero-sized thingo is only possible
>with the modern stuff, this may be an oversight.
>
>I'm not motivated to pursue this further.
>If you want to do so, go ahead. :-)

Quite. I hoped that it had been forbidden, but am disinclined to
waste time on it. Anyone who puts zero-sized thingies in a COMMON
block needs his head examining.

I doubt that it would get as far as causing C to have hysterics;
the linker would probably have hysterics first!

Regards,
Nick Maclaren.

Richard Maine

Nov 11, 2010, 1:09:57 PM

<nm...@cam.ac.uk> wrote:

> Anyone who puts zero-sized thingies in a COMMON
> block needs his head examining.

I find it irksome that, if I recall correctly, making zero-sized things
in common "work" was a major justification for what I regard as the
completely counter-intuitive rules on association of zero-sized things.

Roughly, all zero-sized things of the same type and kind are considered
to be associated with each other. Or maybe I have it backwards and they
are all considered to be not associated. To me, both are equally
counter-intuitive, so I have trouble remembering which counter-intuitive
answer is correct.

My intuition would say that, all the below being pointer arrays of the
same type and kind, if one did

allocate(x(0))
allocate(y(0))

then x and y would not be associated, but that if one did

allocate(x(0))
y => x

x and y would be associated with each other. One of those is wrong by
the current rules. I think it is the latter one that is wrong because
that's the one that is strangest to me - that I can point y at the same
target as x, but they are still not associated.

I've had people try to explain how their intuition views this, but mine
never matches. Their explanation tends to involve looking at the
individual elements of the array and noting that there aren't any
elements to be associated. I figure that x=>y ought to make x and y
associated without needing to ask about details of x and y.

However, what I regard as intuitive here apparently doesn't work well
with zero-sized arrays in common. As I recall, that was the argument
that killed having the standard specify what I wanted. I might not have
been able to convince a majority anyway, but the issue of brokenness in
common made it pretty much a nonstarter.

Steven Correll

Nov 11, 2010, 2:50:49 PM

On Nov 11, 8:53 am, n...@cam.ac.uk wrote:
> In article <ibh2pq$jh...@news.eternal-september.org>,

> glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> >> It's not the static nature that's the problem, but the need to
> >> initialise it.
>
> >Yes, but people writing link file formats should know that by now,
> >especially for 64 bit addressing space hosts.
>
> Eh? Since nobody with half a clue actually attempts to initialise
> massive arrays, why should they provide a completely separate
> mechanism to support that? I have implemented suitable mechanisms,
> incidentally, and the changes are far more fundamental and pervasive
> than you apparently think.

[snip]

> Precisely. Now consider a C program with the following:
>
> static double fred[4096][4096][4096];
>
> Do you REALLY want a 512 GB executable?

I doubt anybody is surprised by the following (on MacOS X; "wc -c"
gives the file size in bytes):

$ cat bigc.c

static double fred[4096][4096][4096];

void foo() {
static double fred2[4096][4096][4096];
}
$ gcc -c bigc.c
$ wc -c bigc.o
684 bigc.o

but if anybody is surprised, see the Wikipedia entry for ".bss".

For security, an operating system must generally initialize memory
before giving it to a process. I speculate that the reasoning of the
original C designers was, "Why not take advantage of that and
stipulate that static variables will have a known initial value?" At
the time, the rejoinder might have been, "Because that may be
difficult to accomplish with some linkers on some operating systems."
But, as Glen says, the C language has been popular for such a long
time that OSes and linkers have made their peace with C.

Languages make different tradeoffs between safety and speed. With
regard to static data, C has opted for safety (whether or not you
think that zero is a good choice, it's a safer choice than "undefined"
because the program will behave repeatably.) Fortran has opted for
speed in case zero-initialization is not free with a particular OS/
linker. Java, as observed in this forum recently, has made a tradeoff
even further toward safety and away from speed, by requiring that
programs be written so that the defined-ness of even non-static
variables can be proved by static analysis. None of these is a "wrong"
tradeoff, though to me the C choice seems to have hit a "sweet spot"
relative to Fortran or Java, since the benefit is less than Java's,
but the cost in today's popular environments is near zero.

And to address the "eat-your-spinach" point of view: yes, programmers
ought never to access undefined variables--but we know they will
sometimes err. And yes, programmers who are not infallible should use
tools separate from the language to check for such errors--but we know
that they will sometimes not bother. So the question is really whether
to spend on safety instead of speed some of the ever-increasing
computer power the hardware gives us. In some cases, "yes" is a
sensible answer.

glen herrmannsfeldt

Nov 11, 2010, 3:51:03 PM

nm...@cam.ac.uk wrote:
(snip, I wrote)

>>Yes, but people writing link file formats should know that by now,
>>especially for 64 bit addressing space hosts.

> Eh? Since nobody with half a clue actually attempts to initialise
> massive arrays, why should they provide a completely separate
> mechanism to support that? I have implemented suitable mechanisms,
> incidentally, and the changes are far more fundamental and pervasive
> than you apparently think.

It seems that others have figured out how to initialize to zero
without writing all the zeros into the file.

I do remember when I was first learning Fortran initializing some
large (at the time) arrays to zero in DATA statements, and then,
not knowing about this, punching out object decks. Many cards were
used for those zeros.

> In particular, you need to change from placing the initialised
> data in the executable in the form it will be loaded to calling an
> initialisation procedure immediately following program loading.
> And that's GREAT fun when you have mixed language programs and no
> control of which one gets control on entry. It can be done, but
> it's not easy and usually needs extensive work to the linker and
> loader as well as the compilers.

>>Well, that was about 10 years ago, when 16G was still a lot.

> Precisely. Now consider a C program with the following:

> static double fred[4096][4096][4096];

Well, that is a little bigger than my computer will do, but with

static double fred[600][600][600];

I get a 4977 byte executable, from a 932 byte object file,
the same size as for any smaller array dimensions.
(Scientific Linux 5.3, IA32, with the included C compiler.)

> Do you REALLY want a 512 GB executable?

It seems that some systems figure that out.

>>> Fortran has the concept of uninitialised static data; C doesn't.

>>But with Fortran compilers using C compiler back ends, I
>>expect that they get the initialized static data included.

> Yes, but they DON'T initialise all static data by default! The
> Fortran equivalent of the above code doesn't have a problem.

real*8 fred(600,600,600)
data fred/216000000*0./
print *,fred(100,100,100)
end

generates a nice, small, 5961 byte executable. (The Fortran library
must be bigger than the C library.)

gfortran 4.1.2 on the same system as above.

but make a tiny change and fill the array with 1.D0, then ....

gfortran: Internal error: File size limit exceeded (program f951)
Please submit a full bug report.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.

I wonder if I should submit a report.

-- glen

glen herrmannsfeldt

Nov 11, 2010, 3:58:56 PM

Steven Correll <steven....@gmail.com> wrote:
(snip)

> but if anybody is surprised, see the Wikipedia entry for ".bss".

> For security, an operating system must generally initialize memory
> before giving it to a process. I speculate that the reasoning of the
> original C designers was, "Why not take advantage of that and
> stipulate that static variables will have a known initial value?" At
> the time, the rejoinder might have been, "Because that may be
> difficult to accomplish with some linkers on some operating systems."
> But, as Glen says, the C language has been popular for such a long
> time that OSes and linkers have made their peace with C.

Other systems that I used at the time C was designed didn't
zero for safety reasons. It might have been an option on the
systems when extra security was needed. Also, disk blocks weren't
always cleared on file creation. I do remember on OS/360, having
the compile fail, link in a separate job step reading garbage
from disk.

> Languages make different tradeoffs between safety and speed. With
> regard to static data, C has opted for safety (whether or not you
> think that zero is a good choice, it's a safer choice than "undefined"
> because the program will behave repeatably.) Fortran has opted for
> speed in case zero-initialization is not free with a particular OS/
> linker. Java, as observed in this forum recently, has made a tradeoff
> even further toward safety and away from speed, by requiring that
> programs be written so that the defined-ness of even non-static
> variables can be proved by static analysis. None of these is a "wrong"
> tradeoff, though to me the C choice seems to have hit a "sweet spot"
> relative to Fortran or Java, since the benefit is less than Java's,
> but the cost in today's popular environments is near zero.

It seems to me about time for Fortran to adopt zeroing of static data.
It isn't, as far as I know, incompatible with previous versions.

-- glen

nm...@cam.ac.uk

Nov 11, 2010, 4:23:33 PM

In article <89c13011-fad8-45aa...@35g2000prt.googlegroups.com>,

Steven Correll <steven....@gmail.com> wrote:
>
>> Precisely. Now consider a C program with the following:
>>
>> static double fred[4096][4096][4096];
>>
>> Do you REALLY want a 512 GB executable?
>
>I doubt anybody is surprised by the following (on MacOS X; "wc -c"
>gives the file size in bytes):

Nor am I. However, Unix is not the only system in existence, and
ISO C89/C90 was intended to run on every important system of the
day. That's getting less true, but is still formally the case.
I don't know how many support a .bss equivalent for C support;
I know that not many did in 1990, and it caused a lot of debate
in WG14.

Back in the very early days, of course, .bss was severely limited
in size (64 KB, if I recall), and it was in those days that the
usage paradigms of C became established.

>For security, an operating system must generally initialize memory
>before giving it to a process. I speculate that the reasoning of the
>original C designers was, "Why not take advantage of that and
>stipulate that static variables will have a known initial value?"

Yer whaa? Security of that nature was NOT one of the objectives
of Unix in 1970! It was designed as a computer scientist's
workbench, and is riddled with covert channels - and, because the
authors were aware of Titan, the clear text of passwords never got
near user code.

>Languages make different tradeoffs between safety and speed. With
>regard to static data, C has opted for safety (whether or not you
>think that zero is a good choice, it's a safer choice than "undefined"

>because the program will behave repeatably.) ...

That is a common error, because it is NOT safer - initialising to
bad data is a MUCH better idea, as extensive experience has shown,
and initialising non-repeatably is as good, but makes debugging a
bit too hard for script kiddies.

It is also NOT true that C has chosen that path, because neither the
stack nor the 'heap' are initialised - nor ever were!
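
A minimal sketch of the heap side of that, with hypothetical helper names: calloc() hands back storage with all bits zero, while malloc() storage has indeterminate contents and has to be filled before it is read. That all-bits-zero reads back as 0.0 is an assumption about the floating-point format; it holds for IEEE 754.

```c
#include <stdint.h>
#include <stdlib.h>

/* Zero-filled allocation: calloc sets all bits of the block to zero. */
double *alloc_zeroed(size_t n)
{
    return calloc(n, sizeof(double));
}

/* malloc leaves the contents indeterminate, so fill before any read. */
double *alloc_filled(size_t n, double value)
{
    double *p;

    if (n > SIZE_MAX / sizeof *p)   /* guard the size computation */
        return NULL;
    p = malloc(n * sizeof *p);
    if (p != NULL) {
        for (size_t i = 0; i < n; i++)
            p[i] = value;
    }
    return p;
}
```

The caller owns the returned block in both cases and releases it with free().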

And, nowadays, programs don't get a repeatable image, anyway,
because of operating system factors. Those days are long gone.
However, traditions remain long after the reason for them has
gone ....

Regards,
Nick Maclaren.

nm...@cam.ac.uk

Nov 11, 2010, 5:27:33 PM

In article <ibhmsl$mvr$1...@gosset.csi.cam.ac.uk>, <nm...@cam.ac.uk> wrote:
>In article <89c13011-fad8-45aa...@35g2000prt.googlegroups.com>,
>Steven Correll <steven....@gmail.com> wrote:
>
>>For security, an operating system must generally initialize memory
>>before giving it to a process. I speculate that the reasoning of the
>>original C designers was, "Why not take advantage of that and
>>stipulate that static variables will have a known initial value?"
>
>Yer whaa? Security of that nature was NOT one of the objectives
>of Unix in 1970! It was designed as a computer scientist's
>workbench, and is riddled with covert channels - and, because the
>authors were aware of Titan, the clear text of passwords never got
>near user code.

My mistake. I misunderstood what you said, and made a reply that
was true in the sense that I understood but false in the sense that
most people will have understood. It's not worth explaining. In
the sense that you almost certainly mean, you are right, of course.

My apologies.

Regards,
Nick Maclaren.

steve

Nov 11, 2010, 6:35:12 PM

On Nov 10, 12:46 pm, "James Van Buskirk" <not_va...@comcast.net>
wrote:

> "steve" <kar...@comcast.net> wrote in message
>
> news:71af94e0-28d7-448d...@p20g2000prf.googlegroups.com...
>
> > James, would you consider the following a satisfactory warning.
> > laptop:kargl[255] gfc4x -c po.f90
> > po.f90:3.24:
> > type, bind(C) :: junk
>
> 1
>
> > Warning: Derived type 'junk' with BIND(C) attribute at (1) is empty,
> > and may be inaccessible by the C companion processor
>
> Beyond satisfactory. Excellent clarity and terseness.

The patch was committed.

Committed revision 166633.

--
steve

Colin Watters

Nov 12, 2010, 2:23:17 AM

"glen herrmannsfeldt" <g...@ugcs.caltech.edu> wrote in message >

> real*8 fred(600,600,600)
> data fred/216000000*0./
> print *,fred(100,100,100)
> end
>
> generates a nice, small, 5961 byte executable. (The Fortran library
> must be bigger than the C library.)
>
> gfortran 4.1.2 on the same system as above.
>
> but make a tiny change and fill the array with 1.D0, then ....
>
> gfortran: Internal error: File size limit exceeded (program f951)
> Please submit a full bug report.
> See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
>
> I wonder if I should submit a report.
>
> -- glen

What happens if you use

data fred / 1.d0, 215999999*0./

?

--
Qolin

Email: my qname at domain dot com
Domain: qomputing

glen herrmannsfeldt

Nov 12, 2010, 3:00:50 AM

Colin Watters <bo...@qomputing.com> wrote:
(I wrote)

>> real*8 fred(600,600,600)
>> data fred/216000000*0./
>> print *,fred(100,100,100)
>> end

>> generates a nice, small, 5961 byte executable. (The Fortran library
>> must be bigger than the C library.)

>> gfortran 4.1.2 on the same system as above.

>> but make a tiny change and fill the array with 1.D0, then ....

>> gfortran: Internal error: File size limit exceeded (program f951)
>> Please submit a full bug report.
>> See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.

(snip)

> What happens if you use

> data fred / 1.d0, 215999999*0./

Still the File size limit error.

-- glen

Colin Watters

Nov 12, 2010, 2:09:34 PM

"glen herrmannsfeldt" <g...@ugcs.caltech.edu> wrote in message

news:ibis7h$4au$1...@news.eternal-september.org...

Yes as I suspected.

Some years ago I found a Windows DLL file in our software delivery suite
had swollen from 2 MB to about 250. I traced the cause to a colleague's
change to a data statement, pretty much like that one. I fixed it by
breaking out the variables that required non-zero initialization onto a
separate, differently-named common block.
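
The same split can be sketched in C, on the assumption of a toolchain with a .bss-style section: the big array gets no initializer, and only the handful of values that need nonzero starting values sit in their own small initialized object. The names and sizes below are invented for the example.

```c
#include <stddef.h>

#define BIG 4096   /* kept small for the sketch; the real arrays were far larger */

/* Large work array with no initializer. C zero-initializes static
   storage, and toolchains with a .bss-style section usually record only
   its size, not its contents, in the object file. */
static double work[BIG];

/* The few values that really need nonzero starting values live in their
   own small object, so only these bytes are stored as initialized data. */
static double coeffs[3] = { 1.0, 2.5, 10.0 };

double work_sum(void)
{
    double s = 0.0;
    for (size_t i = 0; i < BIG; i++)
        s += work[i];
    return s;
}

double coeff(size_t i)
{
    return coeffs[i];
}
```

Putting a single nonzero value into the initializer of the big array instead would typically force the whole array into the initialized-data section, which is the growth described above.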

Ron Shepard

Nov 13, 2010, 12:11:24 AM

In article <ibk3dd$snk$1...@news.eternal-september.org>,
"Colin Watters" <bo...@qomputing.com> wrote:

As a practical matter, there is seldom any real reason why large
arrays need to be initialized to a constant with data statements (or
on the declaration line). It is just as easy, just as clear, and
just as simple to initialize them in the program with an executable
statement

array = 0.0

or

array = 1.0

or whatever is necessary. This way, you don't need to rely on some
clever memory management quirk of the OS to avoid having a large
executable file. Sometimes you need to keep track of whether the
array has been initialized, to make sure it was done and to avoid
doing it multiple times, so that adds a little extra logic, but
still, it is simple and straightforward code.
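
The "little extra logic" might look like the following C sketch (the Fortran version would use a SAVEd logical in the same way). All names are hypothetical, and the flag is not thread-safe as written.

```c
#include <stddef.h>

#define TABLE_SIZE 1000

static double table[TABLE_SIZE];
static int table_ready = 0;   /* set once the fill below has run */
static int fill_count = 0;    /* demonstration only: how often the fill ran */

/* Fill the table at run time, the first time it is needed. */
static void ensure_table(void)
{
    if (table_ready)
        return;
    for (size_t i = 0; i < TABLE_SIZE; i++)
        table[i] = 1.0;
    table_ready = 1;
    fill_count++;
}

double table_get(size_t i)
{
    ensure_table();
    return table[i];
}

int table_fill_count(void)
{
    return fill_count;
}
```

Every access goes through table_get(), so the fill happens exactly once no matter which caller arrives first.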

$.02 -Ron Shepard

JB

Nov 13, 2010, 3:47:51 AM

On 2010-11-11, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:

> Steven Correll <steven....@gmail.com> wrote:
>> Languages make different tradeoffs between safety and speed. With
>> regard to static data, C has opted for safety (whether or not you
>> think that zero is a good choice, it's a safer choice than "undefined"
>> because the program will behave repeatably.) Fortran has opted for
>> speed in case zero-initialization is not free with a particular OS/
>> linker. Java, as observed in this forum recently, has made a tradeoff
>> even further toward safety and away from speed, by requiring that
>> programs be written so that the defined-ness of even non-static
>> variables can be proved by static analysis. None of these is a "wrong"
>> tradeoff, though to me the C choice seems to have hit a "sweet spot"
>> relative to Fortran or Java, since the benefit is less than Java's,
>> but the cost in today's popular environments is near zero.
>
> It seems to me about time for Fortran to adopt zeroing of static data.
> It isn't, as far as I know, incompatible with previous versions.

If you're going to do this, you might as well go all the way and
require that all variables start zero initialized, and become zeroed
when they today become undefined (e.g. intent(out) arguments on
procedure entry). It would have a performance impact, but OTOH the
dead store elimination pass in an optimizing compiler should be able
to fix most of that.

--
JB

Gary L. Scott

Nov 13, 2010, 11:55:49 AM

Sometimes you actually WANT the "undefined" memory state (the volatile
state). As long as that is accommodated...

Richard Maine

Nov 13, 2010, 12:25:30 PM

JB <f...@bar.invalid> wrote:

> If you're going to do this, you might as well go all the way and
> require that all variables start zero initialized, and become zeroed
> when they today become undefined (e.g. intent(out) arguments on
> procedure entry). It would have a performance impact, but OTOH the
> dead store elimination pass in an optimizing compiler should be able
> to fix most of that.

Setting to zero instead of undefined is impractical. I suspect you don't
appreciate all the ways that a variable can become undefined. Your
example of intent(out) is one of the most trivial and easy to handle.
Others are not so easy.

Furthermore, in some cases, the whole reason that the standard specifies
something to be undefined is to allow for multiple implementation
strategies, which could have different results. It would be worse than
pointless to specify zero initialization in such cases in that it might
invalidate all of the natural implementations. That would be positively
counterproductive; one would be better to specify one of the obvious
results than to specify something like a zero that would not result from any
of them. Some of these cases even reflect hardware differences; so one
is going to force what amounts to emulation of a new Fortran hardware
spec?

Plus, of course, even if one did all that, it would have the
counterproductive effect of killing some debugging aids. It is reasonably
common for an undefined variable to occur as a result of something that
is purely and simply a program bug, and not one as simple as mistakenly
assuming zero initialization. It is a nice feature for compilers to be
able to catch at least some such bugs. There are compilers that go to a
lot of trouble to try to do so. I will avoid mentioning the name of a
language that I dislike because, among other things, some kinds of buggy
code have a tendency to be valid code from the language perspective; it
just doesn't do anything like what the writer intended.

That's all without even getting into the fact that "zero" is not a
defined concept in Fortran for some types. That's a problem for zero
initialization as well. Trying to define it for all types would be going
down the path of specifying hardware representations.

For example, I guess you (well, Glen) just forced some equivalent of
the C null character on all character kinds - not just the C interop
one; well, I suppose we probably won't see CDC display code again
anyway. And you also at least partially specified the representation of
logicals; at least James would be happy at that. :-) Then there are
pointers. Lastly, we have derived types with private components, where a
large part of the point is to hide internal representation, but you just
unhid some of it.

In short, lots of luck. Well, no, that's not honest of me. I don't wish
any such proposal luck at all. Fortunately, I don't think it would have
enough chance that my wish would be relevant.

Wolfgang Kilian

Nov 13, 2010, 12:29:36 PM

I'd rather opt for NaN default initialization, if any. Some compilers
have this as an option, quite useful. Unfortunately, there is no
integer NaN. So undefined is just fine, at least you have a chance to
detect errors when the variables contain garbage. More difficult with
zero initialization.
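
A small C sketch of why NaN filling helps, assuming an implementation with quiet NaNs (the NAN macro from <math.h>). The function names are made up. Any element that was never assigned poisons the result instead of silently contributing zero.

```c
#include <math.h>
#include <stddef.h>

/* Mark every element as "not yet set" by storing a quiet NaN. */
void poison_with_nan(double *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = NAN;
}

/* Plain sum; a NaN anywhere in the input propagates to the result. */
double sum_array(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

If a caller poisons a four-element array and then assigns only three of the elements, sum_array() returns NaN and isnan() flags it; with zero filling the same mistake would produce a plausible but wrong number.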

-- Wolfgang

--
E-mail: firstnameini...@domain.de
Domain: yahoo

JB

Nov 13, 2010, 3:27:53 PM

There are languages with memory safety in the general case, but with
some special constructs allowing access to raw memory, e.g. for
accessing device registers and such.

--
JB

JB

Nov 13, 2010, 4:04:09 PM

On 2010-11-13, Richard Maine <nos...@see.signature> wrote:
> JB <f...@bar.invalid> wrote:
>
>> If you're going to do this, you might as well go all the way and
>> require that all variables start zero initialized, and become zeroed
>> when they today become undefined (e.g. intent(out) arguments on
>> procedure entry). It would have a performance impact, but OTOH the
>> dead store elimination pass in an optimizing compiler should be able
>> to fix most of that.
>
> Setting to zero instead of undefined is impractical.

Maybe so. My main point, however, was that if one were to attempt such
a thing, I'd rather see a thorough effort than some half-hearted "let's
do it only for the cases where we can make use of the .bss section on
some platforms".

> I suspect you don't
> appreciate all the ways that a variable can become undefined.

Yes. I recall in some previous discussion you mentioned there are 14
ways for a variable to become undefined in F2003; off the top of my
head I can come up with slightly less than half of that, so I'm
certainly missing some, and probably some that I'm not aware of at all
(I have to admit I'm mostly ignorant about the OOP stuff in F2003, for
instance).

> Furthermore, in some cases, the whole reason that the standard specifies
> something to be undefined is to allow for multiple implementation
> strategies, which could have different results. It would be worse than
> pointless to specify zero initialization in such cases in that it might
> invalidate all of the natural implementations.

Sure. Then again, given that computers are getting faster but humans
aren't, I think there would be value in features that reduce errors
(say, by reducing the scope of undefined behavior) even at the cost of
performance. Within reason, of course.

Whether zeroing variables would accomplish any of that is of course
another question. A better approach, I guess, would be the previously
mentioned Java behavior where the compiler is required to statically
prove that undefined variables are not accessed; OTOH I don't think it
would be easy to retrofit anything like that into Fortran without
breaking lots of stuff. Perhaps some statement in the spirit of
IMPLICIT NONE to specify that a procedure/module doesn't leave output
arguments/global variables/etc. in an undefined state, does not call
other procedures with undefined actual arguments and so forth?

> That's all without even getting into the fact that "zero" is not a
> defined concept in Fortran for some types. That's a problem for zero
> initialization as well. Trying to define it for all types would be going
> down the path of specifying hardware representations.

I was assuming that the zeroing Glen mentioned would not literally
imply writing zero bits into all the space occupied by the variable in
question, but rather that the standard would specify what is the
initial value of all primitive types (say, 0, 0.0, .FALSE., " " or
ASCII NULL for characters, NULL() for pointers) and that it's up to
the processor to implement that as it sees fit.

--
JB

Colin Watters

Nov 13, 2010, 6:10:25 PM

"Ron Shepard" <ron-s...@NOSPAM.comcast.net> wrote in message
news:ron-shepard-4998...@news60.forteinc.com...

...You clearly don't do much Windows DLL programming do you? Your comments
apply quite well to a single .exe, but not to my situation.

The code I develop has a number of large utility DLLs that hold saved
arrays and data structures. It's MUCH easier to initialize everything with
(the equivalent of) a block data routine, compared to planting calls to an
initialization routine in the numerous routines (about 50) that may wind up
being called first by the main application executable (of which there are
about 6). Such an initialization routine must itself ensure it doesn't
initialize more than once, ... which needs a saved variable, initialized
with a data statement or equivalent.

Even with this aside, I disagree that initialization with executable
statements is "just as clear" as other methods. OK I grant you a data
statement is entirely comparable with an assignment statement. But with
variables and arrays in modules, F90's "Default initialization" feature
allows declaration, and initialization, on a single line of code, which
IMHO is much clearer than separate declaration and initialization.

Ron Shepard

Nov 16, 2010, 6:36:32 PM

In article <ibn5t3$4tv$1...@news.eternal-september.org>,
"Colin Watters" <bo...@qomputing.com> wrote:

> > As a practical matter, there is seldom any real reason why large
> > arrays need to be initialized to a constant with data statements (or
> > on the declaration line). It is just as easy, just as clear, and
> > just as simple to initialize them in the program with an executable
> > statement
> >
> > array = 0.0
> >
> > or
> >
> > array = 1.0
> >
> > or whatever is necessary. This way, you don't need to rely on some
> > clever memory management quirk of the OS to avoid having a large
> > executable file. Sometimes you need to keep track of whether the
> > array has been initialized, to make sure it was done and to avoid
> > doing it multiple times, so that adds a little extra logic, but
> > still, it is simple and straightforward code.
> >
> > $.02 -Ron Shepard
>
> ...You clearly don't do much Windows DLL programming do you?

No I don't, and I probably never will.

> Your comments
> apply quite well to a single .exe, but not to my situation.

I do not follow your arguments here. Does your situation involve
multiple .exe files that share memory or something?

>
> The code I develop has a number of large utility DLLs that hold saved
> arrays and data structures. It's MUCH easier to initialize everything with
> (the equivalent of) a block data routine, compared to planting calls to an
> initialization routine in the numerous routines (about 50) that may wind up
> being called first by the main application executable (of which there are
> about 6).

I do not understand this sentence. I do not think you are talking about
shared memory or anything, just a regular library routine, right?

> Such an initialization routine must itself ensure it doesn't
> initialize more than once, ... which needs a saved variable, initialized
> with a data statement or equivalent.

Yes, that is what I said above. It isn't clear if you are agreeing with
me or disagreeing.

In the library routine (which you are calling a DLL, but I don't see
anything special in your comments), you need some simple and
straightforward code like:

...
logical, save :: first_time = .true.
...

if ( first_time ) then
   first_time = .false.
   ... ! allocate large arrays here and assign any necessary values
endif

As I said above, this is simple and easy to understand, and there is no
reliance on tricks in the loader or special quirks in the OS to avoid
having a large exe file. It does not matter if you have a single
library routine or 50 library routines, it is still a simple, clear, and
straightforward test that needs to be done.
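
The same guard can be sketched in C for a library with more than one
entry point. This is only an illustration of the pattern described
above, not code from anyone's library: the names util_sum, util_scale,
ensure_initialized and work are hypothetical, and the array size of
1000 is arbitrary.

```c
#include <stdlib.h>

/* Hypothetical library state; all names here are illustrative. */
static int first_time = 1;
static double *work = NULL;
static size_t work_n = 0;

/* One-time setup shared by every entry point.
   Not thread-safe: this sketch assumes single-threaded callers. */
static void ensure_initialized(void)
{
    if (first_time) {
        first_time = 0;
        work_n = 1000;
        work = malloc(work_n * sizeof *work);
        if (work == NULL)
            abort();
        /* Assign values at run time instead of storing them in the file. */
        for (size_t i = 0; i < work_n; i++)
            work[i] = 1.0;
    }
}

/* Entry point 1: may be the first routine the application calls. */
double util_sum(void)
{
    ensure_initialized();
    double s = 0.0;
    for (size_t i = 0; i < work_n; i++)
        s += work[i];
    return s;
}

/* Entry point 2: may equally well be called first. */
void util_scale(double factor)
{
    ensure_initialized();
    for (size_t i = 0; i < work_n; i++)
        work[i] *= factor;
}
```

Every entry point starts with the same one-line call, so the array is
allocated and filled exactly once no matter which routine runs first.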

>
> Even with this aside, I disagree that initialization with executable
> statements is "just as clear" as other methods. OK I grant you a data
> statement is entirely comparable with an assignment statement. But with
> variables and arrays in modules, F90's "Default initialization" feature
> allows declaration, and initialization, on a single line of code, which
> IMHO is much clearer than separate declaration and initialization.

Yes, but we are talking about how to avoid storing large static arrays
in the executable file. That is where data statements (and
Fortran initialization statements) normally put these things. So to
avoid that problem, just initialize any such large arrays with
executable statements.

Problem solved, simple as that.

$.02 -Ron Shepard

Richard Maine

Nov 16, 2010, 7:03:23 PM

Colin Watters <bo...@qomputing.com> wrote:

> But with
> variables and arrays in modules, F90's "Default initialization" feature
> allows declaration, and initialization, on a single line of code, which
> IMHO is much clearer than separate declaration and initialization.

I consider "default initialization" to be confusingly named. Yes, it
sort of provides a default for initialization (thus, obviously, the
name). But it does other things as well. In particular, default
initialization happens at different times from when initialization (of
the nondefault variety) happens and can apply to entities such as dummy
arguments that don't even allow initialization (again nondefault). I
think that makes them difficult to compare.

Note also that default initialization is inherently part of a type
definition. As such, the same default initialization applies to *ALL*
variables of the type. There isn't even a syntax to say that a
particular variable should get some different default initialization. As
soon as you bring up a particular variable, you are then doing
non-default initialization, which happens only once and which can't
apply to some variables. There is no way to say that I want this
particular variable to have this particular initialization, different
from other variables of the same type, but I want it to happen when
default initialization would happen. Perhaps the closest you could come
would be to mess with type inheritance and declare a new extended type
just for that variable. I'm old enough to recall reading some of Rube
Goldberg's cartoons in magazines (Popular Mechanics, maybe?); if that
seems like a non-sequitur, don't worry about it. :-)

The business about default initialization being part of a type
definition is why you can't specify default initialization for intrinsic
types (a feature some have asked for). An intrinsic type is already
defined. If you want to have the user redefine some aspect of it, then
it would no longer be the intrinsic type (and you'd have to deal with
how to either avoid or deal with the "same" type maybe having different
definitions in different scopes - yukk.)

The comment about intrinsic types seems more than a minor detail as the
intrinsic types are sometimes the ones that people most want to see such
a default initialization for.

Colin Watters

Nov 17, 2010, 5:14:00 PM

"Ron Shepard" <ron-s...@NOSPAM.comcast.net> wrote in message

news:ron-shepard-BCB1...@news60.forteinc.com...

...Ok then let me try to expand a little.

Yes a DLL is very similar to a library, and like a library it can be
linked-to by a number of .exes. The library/dll I am talking about is used
by about 6 .exes (but not concurrently, and there is no sharing of data
between .exes). It has about 200 fortran routines in it, which all share
some global data that used to be in common blocks, but now thankfully is in
a module and is allocated dynamically. The situation I described occurred
when the data was in common, about 6 years ago.

Being a library, it has many routines that can be called from whatever .exe
it is running with/linked to. And there are a large number of routines that
could be called first by the .exe, and each of these would then need to
call the initialization routine you are postulating.

Yes, this could be done, but it would need about 60 calls to it, in order
to ensure it was always called when execution started off. So someone will
have to visit all 60 of those routines and plant initialization calls.
Tedious... but do-able.

This however becomes a maintenance headache. What happens when someone adds
a new routine that could be the first one called by the .exe? Well, he has
to ensure he calls the initialization routine. And if he forgets, then
global data doesn't get initialized.

So the things I don't like are (1) it needs about 60 call statements to
make it happen, and (2) it is error-prone. All round I prefer data
statements and/or default initialization.

The obvious alternative to this is to require the calling .exe to make the
initialization call, as the first thing it does when using the library.
Yes, again this is do-able, but ... maybe it's just me, I don't like this
either.

Incidentally the fix I described was not a trick with the loader, nor did
it rely on quirks of the OS. I think it would have fixed the problem, had
it occurred, on any compiler/linker/os that we might have been using.

Ian Harvey

Nov 17, 2010, 5:43:55 PM

On 18/11/2010 9:14 AM, Colin Watters wrote:
...

> ...Ok then let me try to expand a little.
>
> Yes a DLL is very similar to a library, and like a library it can be
> linked-to by a number of .exes. The library/dll I am talking about is used
> by about 6 .exes (but not concurrently, and there is no sharing of data
> between .exes). It has about 200 fortran routines in it, which all share
> some global data that used to be in common blocks, but now thankfully is in
> a module and is allocated dynamically. The situation I described occurred
> when the data was in common, about 6 years ago.
>
> Being a library, it has many routines that can be called from whatever .exe
> it is running with/linked to. And there are a large number of routines that
> could be called first by the .exe, and each of these would then need to
> call the initialization routine you are postulating.
>
> Yes, this could be done, but it would need about 60 calls to it, in order
> to ensure it was always called when execution started off. So someone will
> have to visit all 60 of those routines and plant initialization calls.
> Tedious... but do-able.

Or one call in DllMain, though that is platform specific (but then so
are EXEs and DLLs to begin with) and there are some significant
limitations.

Colin Watters

Nov 19, 2010, 6:11:45 PM

"Ian Harvey" <ian_h...@bigpond.com> wrote in message
news:DPYEo.2546$MF5....@viwinnwfe02.internal.bigpond.com...

Yes I (pardon, we) have started to use DllMain, though I still prefer
compile-time initializations where possible. Is there an equivalent on
Linux for shared objects? We use Linux too, though at present it's just one
.exe, statically linked to everything else as libraries.

JB

Nov 19, 2010, 6:57:30 PM

I'm not sure what DllMain is, but if it's some kind of code that runs
automatically when a shared library is loaded, then yes, such
functionality is available on Linux as well. See the .ctors and .dtors
sections in the ELF spec (or the older .init and .fini). At least with
gcc it's possible to mark functions as constructors/destructors with
__attribute__((constructor)) and __attribute__((destructor)). AFAIK
attributes are not supported by the Fortran frontend, so you'll have
to cobble this together with C or C++.
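
A minimal sketch of the gcc attributes mentioned above, written as the
C side of a library. The names lib_init, lib_fini, lib_ready, lib_table
and the two accessor functions are made up for illustration; only the
two attributes themselves are gcc features.

```c
/* Hypothetical library state; the names are illustrative only. */
static int lib_ready = 0;
static double lib_table[4];

/* With gcc, this runs when the shared object is loaded, or before
   main() if the code is linked statically into the executable. */
__attribute__((constructor))
static void lib_init(void)
{
    for (int i = 0; i < 4; i++)
        lib_table[i] = 1.0;
    lib_ready = 1;
}

/* Runs when the shared object is unloaded or at normal program exit. */
__attribute__((destructor))
static void lib_fini(void)
{
    lib_ready = 0;
}

/* Any entry point may be called first; none has to call lib_init(). */
int lib_is_ready(void)
{
    return lib_ready;
}

double lib_get(int i)
{
    return lib_table[i];
}
```

Built into a shared object (for example with gcc -shared -fPIC), the
setup happens at load time, so no entry point needs its own
initialization call.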

--
JB