Declaration statement

Chris Torek Mar 10, 2004 11:48 AM
Posted in group: comp.lang.c

    char *s;
    char t[];

as declarations appearing at file and block scope...

In article <news:c2mte5$au4$>
Xiangliang Meng <> writes:
>Could you explain a little  more on why they are OK in a file scope, but the
>second is invalid in blocks? What is the difference for the second between
>in the file scope and in the block scope?

At the most important level, the reason is just "because the C
Standards say so".  The declaration of "t" above is invalid in
block scope because the rules of the C standard prohibit it; it is
valid in file scope -- and means the same as "char t[1]" if no
other declaration overrides it -- because the rules of the C
Standards say so.
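
As a minimal sketch of that difference (the helper names below are hypothetical, not from the original question), the file-scope declaration compiles, while the block-scope one has to stay commented out.  Some compilers warn that the array is "assumed to have one element"; that is a warning, not an error.

```c
#include <assert.h>

/* File scope: accepted.  This is a tentative definition of an array
   whose size is not yet known. */
char t[];

/* Hypothetical helper: reads element 0.  Indexing works even though
   the size of t is still unknown at this point in the file. */
char first_element_of_t(void)
{
    /* Block scope: the same declaration would be rejected, so it is
       left commented out.
    char u[];
    */
    return t[0];
}

/* Hypothetical self-check: the lone element starts out as zero. */
void check_t(void)
{
    assert(first_element_of_t() == 0);
}
```

Nothing later in the file completes the type of t, so it ends up with exactly one zero-valued element.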

Fundamentally, there is no particular reason it *has* to work this
way, and in the days before the first C standard (ANSI X3.159-1989,
also known as "C89"), different C compilers had different rules.
If you wrote "char t[];" at file scope, some compilers gave you an
error message and refused to compile your code.  If the people who
wrote the C89 standard had so chosen, they could have made this an
error.  Instead, though, they decided to adopt a series of rules
that give useful results in various cases where pre-C89 compilers
sometimes fell short.

In particular, the C89 folks invented (or borrowed) the concept of
a "tentative definition" of a variable.  You can "tentatively"
define a variable, which has the useful side effect of telling the
C compiler that it is going to exist sooner or later, so that you
can use the name of the variable.  Eventually you also define the
variable "for real", possibly giving it an initial value.
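
A small sketch of the idea, using a made-up variable "counter" and a made-up helper "read_counter" (neither is from the original discussion): the tentative definition makes the name usable, and the "for real" definition further down supplies the value.

```c
#include <assert.h>

int counter;             /* tentative definition: no initializer */

/* The name is already usable here, before the "real" definition. */
int read_counter(void)
{
    return counter;
}

int counter = 42;        /* actual definition, with an initial value */

/* Hypothetical self-check: the later initializer is the one that counts. */
void check_counter(void)
{
    assert(read_counter() == 42);
}
```

Both lines refer to the same object; the second one simply settles what the first one left open.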

Consider the example of a circular queue data structure:

    struct queue {
        struct queue *forw, *back;
        some_type data;
    };

Queues like this often have a "dummy" head node and then zero or
more data nodes.  Suppose you want to provide an initial single
data node.  How can you define two variables that will point to
each other?

If both variables are to have external linkage (i.e., are "global
variables" in the sense people usually mean by the phrase "global
variables"), you can, even in implementations predating the 1989
C standard, write this code:

    extern struct queue initialnode;

    struct queue head = {
        &initialnode, &initialnode
        /* no actual data (dummy head) */
    };

    struct queue initialnode = {
        &head, &head,
        INITIAL_DATA
    };

The first of these lines announces to the C compiler: "there is a
variable named initialnode, of type struct queue".  This announcement
is required in order to provide &initialnode as the initializer
for head.forw and head.back.
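
Here is roughly what the external-linkage version looks like as something a compiler will accept.  The typedef for some_type, the value of INITIAL_DATA, and the queue_length() helper are assumptions made up for this sketch; only the struct layout and the two variables follow the fragments above.

```c
#include <assert.h>

typedef int some_type;       /* assumption: stand-in payload type */
#define INITIAL_DATA 42      /* assumption: stand-in initial value */

struct queue {
    struct queue *forw, *back;
    some_type data;
};

extern struct queue initialnode;   /* announcement only, no storage yet */

struct queue head = {
    &initialnode, &initialnode
    /* no actual data (dummy head); data is zero-initialized */
};

struct queue initialnode = {
    &head, &head,
    INITIAL_DATA
};

/* Hypothetical helper: counts data nodes by walking forward from the
   dummy head until the walk arrives back at it. */
int queue_length(const struct queue *h)
{
    int n = 0;
    const struct queue *p;

    for (p = h->forw; p != h; p = p->forw)
        n++;
    return n;
}

/* Hypothetical self-check of the circular links. */
void check_queue(void)
{
    assert(queue_length(&head) == 1);
    assert(head.forw->data == INITIAL_DATA);
}
```

Without the "extern" line, the initializer for head would name a variable the compiler has not yet heard of.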

But what if, on the other hand, both of these variables are to be
declared "static" (i.e., have internal linkage)?  You must write:

    static struct queue initialnode;

("static" instead of "extern").  Many pre-C89 C compilers took
this as the *definition* of the variable initialnode, meaning
the same thing as:

    static struct queue initialnode = { 0, 0, 0 };

If you follow this with a later:

    static struct queue initialnode = {
        &head, &head,
        INITIAL_DATA
    };

such a compiler reports an error, since it sees the variable defined
twice.  You cannot write "extern struct queue initialnode; ...
static struct queue initialnode = { ... };" either, as this is also
an error.

The C89 solution to this problem is "tentative definitions", in
which you simply write the definition with no initializer.  A
tentative definition is, in effect, "recorded" at the point of the
definition and "remembered" until a corresponding actual (non-tentative)
definition, or the end of the translation unit:

    static struct queue initialnode; /* tentative */
    static struct queue head = { &initialnode, &initialnode };
    static struct queue initialnode = /* actual */
        { &head, &head, INITIAL_DATA };

    char t[]; /* also tentative */

At the end of the translation unit, all the outstanding tentative
definitions become actual definitions, initialized with "= { 0 }".
This means that the declaration for t eventually becomes:

    char t[] = { 0 };

which gives t a single element (so that sizeof(t) is 1*sizeof(char),
which is just 1).
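
Putting the pieces together, here is a sketch that a C89-or-later compiler should accept.  The some_type typedef, the INITIAL_DATA value, the extra variable "spare", and the three helper functions are all assumptions added for illustration.  Note that t is written in its final "= { 0 }" form: with only the tentative "char t[];" the type is still incomplete inside the translation unit, so sizeof could not be applied to it there.

```c
#include <assert.h>
#include <stddef.h>

typedef int some_type;       /* assumption: stand-in payload type */
#define INITIAL_DATA 7       /* assumption: stand-in initial value */

struct queue {
    struct queue *forw, *back;
    some_type data;
};

static struct queue initialnode;                  /* tentative */
static struct queue head = { &initialnode, &initialnode };
static struct queue initialnode =                 /* actual */
    { &head, &head, INITIAL_DATA };

static struct queue spare;   /* tentative and never completed, so it
                                is treated as if written "= { 0 }" */

char t[] = { 0 };            /* what a lone "char t[];" turns into */

/* Hypothetical helpers that expose the results. */
size_t size_of_t(void)
{
    return sizeof t;
}

int nodes_point_at_each_other(void)
{
    return head.forw == &initialnode && head.back == &initialnode
        && initialnode.forw == &head && initialnode.back == &head;
}

int spare_is_zeroed(void)
{
    return spare.forw == NULL && spare.back == NULL && spare.data == 0;
}

/* Hypothetical self-check. */
void check_all(void)
{
    assert(size_of_t() == 1);
    assert(nodes_point_at_each_other());
    assert(spare_is_zeroed());
}
```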
-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: forget about it
Reading email is like searching for food in the garbage, thanks to spammers.