multiple include files

Doug Schmidt

6 Nov 1989, 20:27:48

[The following is actually a posting on behalf of Jim Roskind
<leafusa!io!bronx!j...@uunet.uu.net>: Doug]

----------------------------------------
The posted responses to this topic have actually discussed two
distinct problems.

1) How to achieve the notable improvement in compilation
performance in the presence of multiple (repeated) inclusion of
individual header files.

2) How to construct ANY header files in the face of apparent
CIRCULAR dependencies between header files!

Interestingly, some folks recognize 1, but claim "there is no
significant performance problem". Other folks recognize 2, but claim
that circular dependencies can be avoided. I will give examples of
both problems, and offer an explanation of why some posters never see
the problems. The problems are really separate, and hence I will
discuss them as such.

Copyright 1989, James Roskind. Permission to reproduce in full at no
charge is granted, provided this copyright notice is intact and
applicable in the reproduction.

Contents:

PERFORMANCE PROBLEMS WITH MULTIPLE INCLUSIONS
Statement of Problem with Example
When is the Performance Problem Significant
Conclusion
Additional Comment

RESOLVING CIRCULAR DEPENDENCIES IN INCLUDE FILES
Example of Circular References
A Hack Solution to the Example
General Solution for Resolving Circular Dependencies in Include Files
Compilation Time Performance Of General Solution
Extending the General Solution to Accommodate Inheritance
Philosophizing

PERFORMANCE PROBLEMS WITH MULTIPLE INCLUSIONS

Statement of Problem with Example

The problem is typically present when all header files include all
the files that contain prerequisite declarations/definitions. For
example, any header file that makes use of the type "FILE *" will
typically include "stdio.h" before proceeding. If a source file then
includes several header files, it may unwittingly be (indirectly)
including several copies of "stdio.h". Even though there may be code
of the form:

#ifndef FILENAME
#define FILENAME
/* body of include */
#endif

around the bulk of the include file (to at least hopefully minimize
the amount of post-preprocessing parsing and processing), most
compilers will be forced to open and scan the entire file during each
inclusion. The performance penalty is then associated with the
opening, reading, and tokenizing (much of the file), and (at a high
preprocessing level) parsing (and discarding) the contents of the
file.
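
As a minimal sketch of why the wrapper is needed at all, the fragment
below expands two inclusions of a hypothetical "point.h" by hand into
one translation unit. The names POINT_H, Point, and point_sum are
illustrative assumptions and do not come from the postings.

```c
/* ---- first inclusion of the hypothetical "point.h" ---- */
#ifndef POINT_H
#define POINT_H
typedef struct Point
{
    int x;
    int y;
} Point;
#endif

/* ---- second inclusion: POINT_H is already defined, so the body is
   discarded, but the preprocessor still has to read every line of it
   to find the matching #endif ---- */
#ifndef POINT_H
#define POINT_H
typedef struct Point
{
    int x;
    int y;
} Point;
#endif

/* Client code sees exactly one definition of Point. */
static int point_sum(Point p)
{
    return p.x + p.y;
}
```

Without the wrapper, the second copy would redefine struct Point and
the compilation would fail. With it, the text compiles, and the only
remaining cost is the re-reading described in this section.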

When is the Performance Problem Significant

At least one statistical evaluation (posted by Ken Yap) purported
(probably quite correctly) to show that no significant performance
hit was taken by the repeated inclusions (on a specific platform).
The reason why some other posters were very concerned is probably
based on the file buffering (aka: disk cache) capabilities of their
system. If a system has minimal disk caching, then repeated scans of
a file actually require a disk seek (maybe even several) to access
and re-read an entire include file. For instance, under vanilla
PC-DOS, such disk accesses can almost dominate a compile time even
WITHOUT repeated inclusions! In contrast, if a system has a
significant disk cache, then repeated opening and reading of a file
may be extremely inexpensive!

The moral is clearly shouted by many users: reducing repeated
inclusion will notably reduce compile times on SEVERAL platforms.
Hence, this is a REAL problem, although not necessarily for your
system.

Conclusion

The comments on this problem were VERY NICELY summarized by David
Detlefs (posted 25 October on comp.lang.c++). Although I could not
find Nagle's original posting, the fragments found in the MANY
responses indicated that he presented a scheme to AUTOMATICALLY
reduce wasteful multiple inclusions. Specifically, he must have
suggested that a preprocessor could recognize when an include
file was of the form:

/* nothing here */
#ifndef identifier
/* lots of stuff */
#endif
/* and nothing here */

By noting this fact, it would be possible to deduce in advance (of a
repeated inclusion) whether it was worth even trying to include the
file again when another #include directive was encountered! (I
apologize in advance if I deduced only an understatement of Nagle's
suggestion.) Since the above code construct is a VERY common
implementation method, this "preprocessor optimization" could save a
great deal of preprocessing time when repeated inclusion IS a
performance problem.
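
The detection step might look roughly like the sketch below. It is a
reconstruction from the description above, not Nagle's code, and the
name find_guard is hypothetical. It is deliberately conservative: only
blank lines are accepted outside the wrapper, and comments,
continuation lines, and CR LF line endings are not handled, so a file
with a leading comment is simply reported as "not of this form".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Skip spaces and tabs, but not newlines. */
static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* If the line starting at p is the directive "#name", return a pointer
   just past the directive name; otherwise return NULL. */
static const char *match_directive(const char *p, const char *name)
{
    size_t n = strlen(name);

    p = skip_blanks(p);
    if (*p != '#')
        return NULL;
    p = skip_blanks(p + 1);
    if (strncmp(p, name, n) != 0)
        return NULL;
    if (isalnum((unsigned char)p[n]) || p[n] == '_')
        return NULL;                /* e.g. "ifdef" when asked for "if" */
    return p + n;
}

/* Hypothetical helper: return 1 and copy the guard macro name into guard
   if text is wrapped entirely in "#ifndef NAME ... #endif"; else return 0.
   Simplifying assumptions: only blank lines may appear outside the
   wrapper, and comments, continuation lines, and CR LF are not handled. */
int find_guard(const char *text, char *guard, size_t guard_size)
{
    int depth = 0;      /* nesting depth of conditionals */
    int opened = 0;     /* the leading #ifndef has been seen */
    int closed = 0;     /* its matching #endif has been seen */
    const char *line = text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        const char *next = eol ? eol + 1 : line + strlen(line);
        const char *p = skip_blanks(line);

        if (*p != '\n' && *p != '\0') {         /* non-blank line */
            if (closed)
                return 0;                       /* text after the final #endif */
            if (!opened) {
                const char *arg = match_directive(line, "ifndef");
                size_t len = 0;

                if (arg == NULL)
                    return 0;                   /* text before the #ifndef */
                arg = skip_blanks(arg);
                while (isalnum((unsigned char)arg[len]) || arg[len] == '_')
                    len++;
                if (len == 0 || len >= guard_size)
                    return 0;
                memcpy(guard, arg, len);
                guard[len] = '\0';
                opened = 1;
                depth = 1;
            } else if (match_directive(line, "if") ||
                       match_directive(line, "ifdef") ||
                       match_directive(line, "ifndef")) {
                depth++;
            } else if (match_directive(line, "endif")) {
                if (--depth == 0)
                    closed = 1;
            } else if (depth == 1 && (match_directive(line, "else") ||
                                      match_directive(line, "elif"))) {
                return 0;   /* an alternate branch would be live on re-inclusion */
            }
        }
        line = next;
    }
    return closed;
}
```

A preprocessor could record the returned macro name next to the file
name after the first complete scan, and skip opening the file on a
later #include whenever that macro is still defined.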

In his posting, Detlefs summarized:

> Conclusion: IMHO, anybody who's read Nagle's post and implements a C
> preprocessor (or compiler that incorporates one) and doesn't use the
> technique doesn't recognize a good thing when it walks up and sits in
> his/her lap. Even if your CPP incorporates a mechanism such as
> #pragma once, this will still help if the compiler is used on any of
> the vast existing body of code that doesn't use #pragma once.

I certainly agree that this is an extremely elegant solution.
Moreover, as with most good optimizations, it is not based on
"theoretical gains with code of questionable utility". This
optimization will more than likely provide compilation performance
gains in 99 out of 100 include files that have some sort of
"recursive or iterative include blockage", and actually engage this
mechanism to prevent repeated inclusions.

I would also point out that advocates of the use of the "#pragma
once" should quickly see that this approach provides ALMOST
equivalent performance WITHOUT EXTENDING THE LANGUAGE. I believe
that pragmas should represent methods of implementing extensions to
the language only while or until the language CANNOT support the
construct simply.

Additional Comment

For completeness, I will mention why the above solution falls ever so
slightly short of equivalent functionality of the "#pragma once"
construct. Although the multiple inclusion in the case of sequential
inclusion:

#include "file.h"
/* some other code */
#include "file.h"

is handled the same by a "Nagle wise" preprocessor (with appropriate
blockage in "file.h") as it would with "#pragma once" under a "pragma
wise" preprocessor, recursive inclusion costs ever so slightly more
under the "Nagle optimization". Specifically, assume that "file.h"
includes "file2.h", and that "file2.h" includes "file.h". Under any
such recursive circumstances, even a "Nagle wise" preprocessor may be
forced to scan an included file (with the standard
recursive/sequential blockage) twice, but never more than twice (no
matter how many times the file is included). In contrast, the "#pragma once" would
only include the file once. (Recursive inclusion, preempted only by
the "#ifndef IDENTIFIER <newline> #define IDENTIFIER" is actually
useful in some circumstances that will be described in the second
half of this posting)

If you think about it, the double scan in the case of recursive
inclusion is not very significant. If you like complexity analysis,
you can read the next paragraph and see why. If you hate complexity
analysis, you should skip to the next section, which is probably more
interesting anyway ;-).

Having pointed out the minor shortcoming of Nagle's optimization, I will
also point out that this is actually insignificant to the performance
improvement in a typical scenario. The multiple inclusion
performance problem is most significant when a source file includes N
distinct header files, each of which includes the same "bothersome.h"
header file. In this case the "bothersome.h" file would be scanned N
times without the optimization, and only (at most) twice with the
optimization. In the absolutely worst case, a source file might
include N files, each of which included almost all of the "other" N-1
header files (recursion blocks prevent infinite looping, although
static inclusion depth limits in some compilers would act as a
secondary guarantee :-). Hence a worst case of the order of N
squared file scans are reduced to 2N scans. The bulk of the
performance problem is in both cases reduced by at least a factor of
N/2, or more significantly, the number of file opens and scans
grows ONLY in proportion to the number of files used. Without the
Nagle optimization it can actually grow as the square of the number
of files used!
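
The worst-case count can be written out as a short derivation. It
assumes, for illustration only, that each of the N headers includes
all N-1 others, and that every #include directive reached in live
text costs one scan of the named file.

```latex
\begin{align*}
S_{\text{plain}} &= \underbrace{N}_{\text{includes in the source file}}
  + \underbrace{N(N-1)}_{\text{includes inside the $N$ header bodies}}
  = N^2, \\
S_{\text{Nagle}} &\le 2N, \\
\frac{S_{\text{plain}}}{S_{\text{Nagle}}} &\ge \frac{N^2}{2N} = \frac{N}{2}.
\end{align*}
```

Each header body is live exactly once (on its first inclusion), which
is where the N(N-1) term comes from; every later inclusion is blocked
but still scanned when the optimization is absent.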

--------------------------------------------------------------------

RESOLVING CIRCULAR DEPENDENCIES IN INCLUDE FILES

Several posters (Dave Witherspoon, Paul Vaughn) mentioned difficulty
with apparent circular dependency in include files. Responses to the
problem-posters typically provided a "work-around" for the specific
problem, but ignored the more general problem. The general problem
is that it SHOULD NOT BE a puzzle FOR THE PROGRAMMER to figure out
how to sequence includes so that all the interdependent definitions
provided in multiple include files arrive (post pre-processing) at
the parser in a syntactically and semantically valid sequence.

The standard method of providing a single include file for a single
source file leads to the aforementioned "programming puzzle". The
typical solution to the puzzle involves carefully looking INTO the
foreign header files and developing a proper sequence of inclusion
(and often extracting foreign declarations and restating them
locally!?!). IMHO this violates the concepts of modularity and data
abstraction. This is most evident when a change in one header file
maligns the entire inclusion sequence, and the "puzzle" must be
solved again. I believe that this recurring puzzle can be solved
once and for all via an extension to the overly simplistic
"source.c", "source.h" file standards. Note that this does NOT
involve extending the language, but rather extending the naming
conventions for include files. I have used conventions of this sort
on several C language object oriented projects, and have found them
to be the "complete solution" to my problems. Hopefully this posting
will arouse some discussion of such conventions. If you successfully
post to me, I will summarize the suggestions on the net.

Example of Circular References

To make easier reading of this example, I will make the example use
both valid C and C++. Assume that the following two structures are
declared in the following 2 files:

file1.h contains:

#ifndef file1_h
#define file1_h
#include "file2.h" /* get typedef for TYPE_B */
typedef struct TYPE_A
{
TYPE_B * b;
/* other stuff */
} TYPE_A;
#endif
/* end of file file1.h*/

and file2.h contains:

#ifndef file2_h
#define file2_h
#include "file1.h" /* get typedef for TYPE_A */
typedef struct TYPE_B
{
TYPE_A * a;
/* other stuff */
} TYPE_B;
#endif
/* end of file file2.h*/

The "good news" about the above files is that they can be processed
without infinite recursive inclusion because of the protective
wrappers. The "bad news" is that the resulting sequence will force
error messages to abound. Clearly the DESIRED goal of a
"programmer/puzzle solver" is that including "file1.h" would provide
a preprocessed sequence equivalent to:

typedef struct TYPE_A
{
struct TYPE_B * b;
/* other stuff */
} TYPE_A;

The above is a valid, but basically impossible result. To achieve
the above result it would be necessary for the header file that
defines "struct TYPE_A" to KNOW that TYPE_B was really a structure!
Hence the data abstraction would be lost, as would our modularity.
(For those readers that might think that this fact is insignificant,
I would point out that the other file might easily be in a module
being designed, implemented, and regularly changed by another
programmer!)

In the next two sections I will suggest a hack solution to this
problem, and then the general solution. In each solution there is an
effort made to preserve modularity and the use of typedef'ed type
names to provide abstract types. Moreover, information about these
types is NOT propagated outside the defining module's header file(s).

A Hack Solution to the Example

Looking at the example cited above, we can partition the information
known to "file1.h" and "file2.h" so that neither is given information
about the other type. A more reasonable (modular) result AFTER
preprocessing of an optimal version of "file1.h" would be:

typedef struct TYPE_A TYPE_A; /* file1.h knows this */
typedef struct TYPE_B TYPE_B; /* file2.h knows this */

struct TYPE_A
{
TYPE_B * b;
/* other stuff */
}; /* file1.h knows this */
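
The reason this ordering works can be checked with a small
self-contained sketch: a typedef may name a struct that has not been
elaborated yet, and a pointer to such a type is legal. The "id"
members and the link_pair function are assumptions added for
illustration.

```c
#include <stddef.h>

/* Type names only; neither struct is elaborated yet. */
typedef struct TYPE_A TYPE_A;
typedef struct TYPE_B TYPE_B;

/* Each elaboration needs only the NAME of the other type,
   because it holds a pointer rather than an embedded object. */
struct TYPE_A
{
    TYPE_B *b;
    int id;         /* illustrative member */
};

struct TYPE_B
{
    TYPE_A *a;
    int id;         /* illustrative member */
};

/* Illustrative helper: make the two objects refer to each other. */
static void link_pair(TYPE_A *a, TYPE_B *b)
{
    a->b = b;
    b->a = a;
}
```

Swapping the order of the two struct elaborations makes no
difference, which is exactly the property the include files need.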

With this goal in mind, we might try to use the include files:

file1.h contains:

/* version 2 */
#ifndef file1_h
#define file1_h

typedef struct TYPE_A TYPE_A;

#include "file2.h" /* get typedef for TYPE_B */

struct TYPE_A
{
TYPE_B * b;
/* other stuff */
};
#endif
/* end of file file1.h*/

and file2.h contains:

/* version 2 */
#ifndef file2_h
#define file2_h

typedef struct TYPE_B TYPE_B;

#include "file1.h" /* get typedef for TYPE_A */

struct TYPE_B
{
TYPE_A * a;
/* other stuff */
};
#endif
/* end of file file2.h*/

Interestingly enough, the above "version 2" header files do provide
working code while remaining modular and abstract. There are two
problems with the version-2 approach. The first is that unnecessary
information is included (BOTH struct elaborations are always
provided), but more significantly, the method is not extensible to
the more general case. To be specific about the "unnecessary"
inclusion, the following text would result from including "file1.h":

/* results from including version 2 of file1.h */
typedef struct TYPE_A TYPE_A; /* file1.h knows this */
typedef struct TYPE_B TYPE_B; /* file2.h knows this */
struct TYPE_B
{
TYPE_A * a;
/* other stuff */
}; /* file2.h knows this */
struct TYPE_A
{
TYPE_B * b;
/* other stuff */
}; /* file1.h knows this */

Notice that this inclusion brought in the unnecessary information
involving the elaboration of "struct TYPE_B". Moreover, if the
(unnecessary) struct elaboration had needed any other prerequisite
typedef definitions, those would also have been pulled in!

The second difficulty with version 2 involves attempting to extend
this approach to an even more general problem. The nature of this
version 2 solution may be summarized as follows:

1) Partition the declaration information provided in a header
file into 2 distinct groups. The first group should have NO
dependencies on foreign declarations (eg: typedef struct
TYPE_A TYPE_A). The second group should include all other
information, which presumably has dependencies on foreign
header file declarations (eg: the structure elaboration of
TYPE_A, which depends on the foreign declaration of TYPE_B).

2) Construct a list of all foreign header files that provide
prerequisite declaration information for the "foreign
dependent" declaration. (in the file1.h example this was only
"file2.h").

3) Construct a header file of the following form:

/* multiple inclusion blockade*/
#ifndef FILENAME
#define FILENAME

/* All declarations with no external dependencies */

/* All prerequisite foreign includes */

/* All declaration that HAVE external dependencies */

#endif

The fault in this solution lies in the assumption that the
declarations can be nicely partitioned into "declarations with
external dependencies", and "declarations without external
dependencies". My experience has shown that there are commonly
several hierarchical levels of foreign dependencies. For example:

a) Declaration/definition information with NO dependencies.
This typically includes manifest constants that are chosen
empirically (or based on specifications) for a system.

b) Enumerated constants, which potentially have dependencies
on foreign information provided in (a). Examples of this
include the use of a manifest constant to provide a "gap" in
a list of enumerated constants, or an unusual starting
enumeration value.

c) Typedef definitions which MAY include array types. Such
typedefs may depend on foreign information provided in (a)
and or (b) in order to establish array sizes.

d) Structure elaborations. Such elaboration may easily
depend on foreign information provided in (a), (b), and (c).

e) Inline function definitions (solves posting by Paul
Vaughn). Since functions may accept and return structures
(including foreign elaborations), it is required that the
compiler get to see these foreign elaborations in (d) in
advance of providing the body of the inline function.

When the levels of the dependency hierarchy are this numerous, I doubt
that an inclusion scheme such as what is presented above in version 2
could be made to process the complexities.
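
To make the five levels concrete, here is a sketch of one declaration
per level, shown in a single translation unit in the order the levels
must reach the parser. All names (MAX_CHANNELS, channel_id,
gain_table, mixer, active_gain) are invented for this illustration,
and "static inline" assumes a C99 or C++ compiler.

```c
/* (a) manifest constant with no dependencies */
#define MAX_CHANNELS 4

/* (b) enumerated constants that depend on (a) */
enum channel_id
{
    CH_FIRST = 0,
    CH_LAST = MAX_CHANNELS - 1,
    CH_INVALID = MAX_CHANNELS + 10      /* a "gap" in the enumeration */
};

/* (c) typedef of an array type whose size depends on (a) */
typedef int gain_table[MAX_CHANNELS];

/* (d) structure elaboration that depends on (b) and (c) */
typedef struct mixer mixer;
struct mixer
{
    gain_table gains;
    enum channel_id active;
};

/* (e) inline function that needs the full elaboration from (d) */
static inline int active_gain(const mixer *m)
{
    return m->gains[m->active];
}
```

In a real project each of these levels could come from a different
module, which is why a two-way split into "independent" and
"dependent" declarations is not enough.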

In the next section I will present a general solution to this
problem. In the final section I will extend the general solution to
allow for inheritance (C++), or equivalently the inclusion of foreign
structures within local structures.

General Solution for Resolving Circular Dependencies in Include Files

We will start by giving the general solution for our example.
Although the original problem involved only "file1.h" and "file2.h",
the general solution is based on dividing each include file into 2
separate files. We will name the additional files "file1T.h" and
"file2T.h".

file1.h contains:

/* version 3 */
#ifndef file1_h
#define file1_h

#include "file1T.h" /* get locally defined typedefs */
#include "file2T.h" /* get typedef for TYPE_B */

struct TYPE_A
{
TYPE_B * b;
/* other stuff */
};
#endif
/* end of file file1.h*/

file1T.h contains:

#ifndef file1T_h
#define file1T_h

typedef struct TYPE_A TYPE_A;

#endif
/* end of file file1T.h*/

file2.h contains:

/* version 3 */
#ifndef file2_h
#define file2_h

#include "file2T.h" /* get locally defined typedefs */
#include "file1T.h" /* get typedef for TYPE_A */

struct TYPE_B
{
TYPE_A * a;
/* other stuff */
};
#endif
/* end of file file2.h*/

file2T.h contains:

/* version 3 */
#ifndef file2T_h
#define file2T_h

typedef struct TYPE_B TYPE_B;

#endif
/* end of file file2T.h*/

The result of including "file1.h" is then exactly the optimal source
text presented earlier:

typedef struct TYPE_A TYPE_A; /* file1T.h provided this */
typedef struct TYPE_B TYPE_B; /* file2T.h provided this */

struct TYPE_A
{
TYPE_B * b;
/* other stuff */
}; /* file1.h provided this */
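
A client that needs both elaborations can include the version 3
files in either order. The sketch below expands the less obvious
order ("file2.h" first, then "file1.h") by hand into one translation
unit. It assumes that each "T" file is guarded by its own macro
(file1T_h, file2T_h); the points_back function is hypothetical
client code.

```c
#include <stddef.h>

/* ---- hand expansion of: #include "file2.h" ---- */
#ifndef file2_h
#define file2_h
/* #include "file2T.h" */
#ifndef file2T_h
#define file2T_h
typedef struct TYPE_B TYPE_B;
#endif
/* #include "file1T.h" */
#ifndef file1T_h
#define file1T_h
typedef struct TYPE_A TYPE_A;
#endif
struct TYPE_B
{
    TYPE_A *a;
};
#endif

/* ---- hand expansion of: #include "file1.h" ---- */
#ifndef file1_h
#define file1_h
/* #include "file1T.h": already seen, so the body is skipped */
#ifndef file1T_h
#define file1T_h
typedef struct TYPE_A TYPE_A;
#endif
/* #include "file2T.h": already seen, so the body is skipped */
#ifndef file2T_h
#define file2T_h
typedef struct TYPE_B TYPE_B;
#endif
struct TYPE_A
{
    TYPE_B *b;
};
#endif

/* Hypothetical client code that needs both elaborations. */
static int points_back(const TYPE_A *a)
{
    return a->b != NULL && a->b->a == a;
}
```

Note that neither "file1.h" nor "file2.h" ever includes the other, so
no recursive inclusion arises at all in this scheme.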

Having seen the specific solution to the example, the following is
the generalization to provide include files when the foreign
dependencies are more complex than "none vs some". For this solution
to work, the project architect must first define the contents of the
N levels of the dependency hierarchy. In the previous section I
mentioned a possible list of a-e levels of dependency, which is a
total of 5 levels. With N levels, each traditional header file is
made to correspond to N separate header files. The information in
the traditional header file is divided among the separate header
files in accordance with the definition of the N levels (in our
example, N=2. The lowest level of information is "independent
typedefs", and is placed in "file1T.h". The other level corresponds
to all dependent information, and is placed in version 3 "file1.h").

The format of each file is then:

/* multiple inclusion blockade*/
#ifndef FILENAME
#define FILENAME

/* Inclusion of local file at LOWER level in hierarchy*/

/* All prerequisite foreign includes (lower level in
hierarchy)*/

/* All declaration information for this level */

#endif

Once a standardized header file naming convention like this is
implemented across a project it is very easy for a programmer to
modify the appropriate header files without causing any
inconsistencies in the resulting source text (i.e.: circular
dependencies CAN'T develop). Hence, at the expense of a bit of
planning, the programmer is relieved of the problem of carefully
evaluating and solving include dependency puzzles! (perhaps some
general standards can be proposed for the file naming scheme, and the
corresponding partitioning).

Compilation Time Performance Of General Solution

Someone is probably going to tell me in a reply post that "the
overhead of all these repeated includes will probably give the
programmer enough time to solve MANY puzzles, and perhaps even find a
new job :-)". To such a comment there are several nice replies given
below. Other users might question how they would ever construct the
ever changing MAKE dependencies, and that question also has a rather
nice answer.

The first comment on performance is that if an optimization scheme
such as what Nagle apparently proposed (see my discussion of
compilation performance on sources with repeated inclusion) is used, there
would be virtually no cost associated with the multitude of repeated
inclusions. There would be a cost associated with the fact that N
times as many include files must typically now be opened and closed
(i.e.: We divide the traditional header file into N separate header
files. It is faster on most systems to read a single large file,
than these N smaller files). There is however a benefit in the fact
that there will be potentially less total information provided by
this inclusion scenario than with most traditional inclusion
strategies. This may return a SLIGHT benefit in terms of reducing
actual compile times. Hopefully, this strategy can be made to work
reasonably efficiently, but if it doesn't ON YOUR SYSTEM, there is
always the next paragraph...

If your compiler does not support the Nagle optimization, nor the
#pragma once option, and you don't wish to munge your header files to
retrofit a protective wrapper around each #include directive, and the
performance of repeated multiple includes on your system is tragic,
there is a work-around. The basic work-around is to define a
"masterinclude.h" file that sequentially includes 1) all the lowest
level files, 2) all the next lowest level files, ...N) all the highest
level include files. In addition, the #include directives in each of
the header files that "automatically includes" whatever is
"prerequisite", should be removed (or at least #if'ed out :-). This
approach avoids all multiple inclusion, but ends up providing a VERY
large amount of header text into all of the source modules. It also
suffers from the fact that all individual headers are opened
during each source compilation.

If the cost of opening/reading/closing all N levels of ALL include
files is too great, even if multiple inclusion problems are avoided
(via masterinclude.h sequencing), then a more drastic work-around is
possible. The work-around involves "pre-concatenation" of all header
files at a given level L into a masterlevelL.h, followed by an ordered
concatenation of all N masterlevelL.h files into a masterlevel.h
file. This preconcatenation operation can be performed under control
of a MAKE utility, and effectively corresponds to a precompilation of
include files. The cost of this approach is that inclusion of
"masterlevel.h" again provides ALL include information for the entire
project, and may slow the actual compilation by introducing
extraneous declarations. If a sufficiently robust source code
control system is used, it is conceivable that the programmers could
edit this central masterlevel.h include file, and bypass the assembly
process (but I wouldn't recommend it). Even this most radical
work-around preserves the placement of declarations in a proper
position in the header hierarchy to avoid inducing sequencing puzzles.

The good news is that if any of the above work-arounds are used, the
programmer STILL benefits by not having to solve interdependent
inclusion puzzles.

The last comment in the section addresses the maintenance of MAKE
files when any of these complex include strategies are used.
Several preprocessors provide a list of all the files that were
included during a preprocessing pass in a format that is compatible
to MAKE. This output can be used to constantly update the
dependencies, with no programmer interaction. If your current
compiler can't provide you "automatically" with this output then a)
write a pleasant letter to the vendor suggesting this enhancement,
and b) write a program that generates this information from a source
listing. ( ... and yes, I know that GNU does support this option :-)
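
The core of such a home-made dependency generator is recognizing
#include lines. The sketch below shows only that step;
parse_quoted_include is a hypothetical name, and the function
ignores <...> includes, comments, and conditional compilation, so it
reports every quoted include it sees whether or not it is live.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if line has the form  # include "name" , copy
   name into the caller's buffer and return 1; otherwise return 0. */
int parse_quoted_include(const char *line, char *name, size_t name_size)
{
    const char *p = line;
    const char *end;
    size_t len;

    while (*p == ' ' || *p == '\t')
        p++;
    if (*p != '#')
        return 0;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    if (strncmp(p, "include", 7) != 0)
        return 0;
    p += 7;
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p != '"')
        return 0;               /* <...> headers are ignored in this sketch */
    p++;
    end = strchr(p, '"');
    if (end == NULL)
        return 0;               /* unterminated file name */
    len = (size_t)(end - p);
    if (len == 0 || len >= name_size)
        return 0;
    memcpy(name, p, len);
    name[len] = '\0';
    return 1;
}
```

A driver would call this on each line of a source file, recurse into
the named headers, and print the collected names as prerequisites of
the object file in MAKE syntax.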

Extending the General Solution to Accommodate Inheritance

Inheritance in C++, and equivalently inclusion of a structure within
a structure in C, requires that the original structure be elaborated
fully before being used. In C++ the base class must be elaborated
before a class is derived from it, and in C a structure must be
elaborated before it is used to declare a member in another
structure. On this topic there is some fundamental good news that
makes these interdependent includes "sequenceable". The rule is that
a class may not be derived directly, or recursively from itself, just
as a structure may not contain (directly or recursively) itself as a
member (although a pointer to such is valid, and no problem for us).

The simplest hack solution is to extend the dependency levels to
correspond by adding "C++ derivation" levels, as well as the "C
structure including structure" hierarchy. In both cases there can be
no loops, and hence the "hierarchical level" approach that we
suggested will work. Unfortunately, this again ties the entire
include policy to the actual types of data, and hence provides a new
puzzle whenever the inheritance or structure inclusion hierarchy
changes.

The much simpler solution follows if we choose to use levels such as
what was proposed earlier in this paper. Specifically we had levels
a-e, including:

a)...
b)...
c)...

d) Structure elaborations. Such elaboration may easily
depend on foreign information provided in (a), (b), and (c).

e) ...

Assuming the header files for the structure/class elaborations are
fairly concise (generally only one or two related elaborations), the
corresponding header file at level (d) can easily be adjusted to
accommodate structure inclusion or inheritance. Generally, header
files at level (d) ONLY #include header files at levels (a), (b), or
(c). In the case where a structure contains or is derived from
another structure, that header file should #include the corresponding
level (d) elaboration. Since there are no loops based on the
semantic restrictions of C++ and C, this #inclusion across level (d)
cannot introduce any conflicts. (Actually, it can introduce
conflicts if header files elaborate multiple structs. If this is a
problem, it is a sign that too many things are defined in the given
header file, and the file should be split within the level to remove
the loop).
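
As a sketch of the C case, the fragment below shows why the level
(d) file of a containing structure must see the level (d)
elaboration of the contained one: an embedded member needs a
complete type, where a pointer member would not. BASE, DERIVED, and
derived_refs are names invented for this illustration.

```c
/* ---- level (d) of a hypothetical module "base" ---- */
typedef struct BASE BASE;
struct BASE
{
    int refcount;
};

/* ---- level (d) of a hypothetical module "derived" ----
   BASE is embedded by value, so its full elaboration must already
   have been seen; a typedef name alone would not be enough here. */
typedef struct DERIVED DERIVED;
struct DERIVED
{
    BASE base;
    int extra;
};

/* Illustrative accessor reaching into the embedded structure. */
static int derived_refs(const DERIVED *d)
{
    return d->base.refcount;
}
```

Because DERIVED contains a BASE, BASE cannot also contain a DERIVED,
so the includes between level (d) files always form a chain, never a
loop.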

Philosophizing

This problem is most interesting because IMHO it WAS (past tense) NOT
a problem in a typical hierarchical design (historically typical of
C). The problem was aggravated by the introduction of function
prototypes (which are WONDERFUL) and circular references to structures
(which are INDISPENSABLE) in ANSI C, and has been brought to a head
with the use of object oriented design (even with great things come
problems :-). In a traditional top-down temporal/hierarchical
design, the dependency of include files is naturally formed into a
tree. Base level functions deal only with data at or below that
level. This tends to form the required inclusion sequence into a
nice tree. Once object oriented design methods have entered the
scene, classes and related methods quickly arise that reference other
classes and methods, which recursively reference the original classes.

Jim Roskind
Independent Consultant / Contract Programmer
516 Latania Palm Drive
Indialantic FL 32903-3816
(407)729-4348
...!eddie.mit.edu!ileaf!jar

--
Master Swordsman speak of humility; | sch...@ics.uci.edu (ARPA)
Philosophers speak of truth; | office: (714) 856-4034
Saints and wisemen speak of the Tao of no doubt;
The moon, sun, and sea speaks for itself. -- Hiroshi Hamada

Neil Hunt

7 Nov 1989, 14:32:55

In article <1989Nov6.1...@paris.ics.uci.edu> sch...@crimee.ics.uci.edu (Doug Schmidt) writes on behalf of Jim Roskind:

>----------------------------------------
>The posted responses to this topic have actually discussed two
>distinct problems.

>[...]

> 2) How to construct ANY header files in the face of apparent
>CIRCULAR dependencies between header files!

> RESOLVING CIRCULAR DEPENDENCIES IN INCLUDE FILES

>[Lengthy discussion and analysis of multiple inclusion problem deleted]

>Several posters (Dave Witherspoon, Paul Vaughn) mentioned difficulty
>with apparent circular dependency in include files. Responses to the
>problem-posters typically provided a "work-around" for the specific
>problem, but ignored the more general problem. The general problem
>is that it SHOULD NOT BE a puzzle FOR THE PROGRAMMER to figure out
>how to sequence includes so that all the interdependent definitions
>provided in multiple include files arrive (post pre-processing) at
>the parser in a syntactically and semantically valid sequence.

>[...]

>Example of Circular References

>[Standard example of two include files each declaring a structure
>containing a pointer to the structure declared in the other file.]

Both C and C++, for obvious reasons, allow a pointer to a structure (class)
to be declared or even defined when the class declaration is not available;
were this not so, it would not be possible to have mutually referencing
data structures. _All_ that is required is that the compiler recognise
the name ahead of the `*' as a type name: that is
simple if the declaration contains the keyword struct (class, in c++),
but is harder if a typedef (or class name without the class keyword)
is used. For this reason, the compilers accept (multiple) declarations of
the form `typedef struct opaque_type new_type_name;' (`class opaque;').

Note that such mechanisms do not get around the difficulty of making
one structure be part of another - in this case a full declaration of the
included structure must be available. However, this cannot be part
of a _circular_ dependency if the structure being described is not
to recurse infinitely. Only pointers can (and often must) be circularly
dependent.

The natural fix, then, is for each header file to ensure that
the compiler knows all the new type words, _without_ having to specify
the types explicitly. The example given in the posting then becomes:

file1.h:
typedef struct a_type a_type;
struct b_type { a_type *pa; /* ... */ };
file2.h:
typedef struct b_type b_type;
struct a_type { b_type *pb; /* ... */ };

In C++, a series of statements of the form `class a_type; class b_type;'
suffice to notify the compiler that the tokens a_type and b_type are
to be considered type words.

It seems to be good practice to include such type word introducing statements
for all the non standard type words for declaring pointers in a header file.
This solves the circular dependency problem as described in the original
posting.

One slight problem remains: there is no equivalent syntax for specifying
type words which refer to primitive types or arrays whose exact type is
left unspecified, to be defined in another header file: For example
if designer A defines a bit field type:
typedef unsigned short bit_field_t;
and designer B needs a pointer to that type, he must include the whole
declaration of bit_field_t; in fact, he ought to include the header file,
since he knows not whether it might change to become an int, or a char.

One might argue that there ought to be a syntax for declaring a new type
word for _any_ type, primitive or user defined. Perhaps something along
the lines of `typeword bit_field_t;'. However, I find it hard to find
any examples where this has been a problem (while I have many examples
where class and structure type words have been defined as above). Thus
I would vote for no new extensions to the languages in this area, and
would also resist breaking up my header files into multiple parts as
recommended by the original poster.

In summary - there is no problem; existing facilities permit the
circular pointer dependency difficulty to be overcome in a simple fashion,
not requiring new syntax, or splitting include files into multiple pieces.
By declaring all new non standard type words at the start of each header
file, the header files can be included in any order, as long as they are all
included before the first attempt to dereference or increment a pointer
of one of these types. (If not, a compilation error `unknown size' or similar,
will be generated.)
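
A minimal sketch of that rule: a pointer to a type that is still
incomplete may be declared and stored, but the type must be complete
before the pointer is dereferenced. The names list, node, and
head_value are illustrative only.

```c
struct node;                    /* incomplete type: only the tag is known */

struct list
{
    struct node *head;          /* pointer to an incomplete type is fine */
};

struct node                     /* the type is completed here */
{
    int value;
    struct node *next;
};

/* Dereferencing is legal only because struct node is complete by now.
   Moving this function above the elaboration of struct node would
   produce an error about an incomplete or unknown-size type. */
static int head_value(const struct list *l)
{
    return l->head->value;
}
```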

>Jim Roskind
>Independent Consultant / Contract Programmer

>...!eddie.mit.edu!ileaf!jar

Neil/. Neil%teleo...@ai.sri.com ...decwrl!argosy!teleos!neil

PS: See K&R2, 6.5 Self referential structures, page 139, 140:
Occasionally, one needs [...] two structures that refer to each other.
The way to do this is [...]
also, BS, 8.8 Typedef, page 290:
A declaration of the form [...] specifies that an identifier is the
name of some (possibly not yet defined) class or enumeration.
Such declarations allow the declarations of classes which refer to each
other.

John_-...@cup.portal.com

7 Nov 1989, 13:48:12

Doug Schmidt has correctly summarized what I proposed, and his
explanation is clearer than mine.

John Nagle
