
typedef redef


Bill A.

unread,
Oct 1, 1999, 3:00:00 AM10/1/99
to
Hello. I can't find where Standard C allows redefinition of typedef of the
same type without error:

typedef unsigned char Byte;

typedef unsigned char Byte; // This is OK using several compilers
I have

typedef unsigned int Byte; // This is an error

float Byte; // Of course this is an error too.

void test (void)
{
Byte Byte; // This is OK, different scope
}

It seems the redefinition (the second typedef above) is contradictory to the
rest of the language. Does anyone know whether the allowed redefinition is
ANSI-approved, or have compilers just been designed to accept it because
repeating the same typedefs is a common result of using header files?

Thanks,
Bill


Douglas A. Gwyn

unread,
Oct 1, 1999, 3:00:00 AM10/1/99
to
"Bill A." wrote:
> Hello. I can't find where Standard C allows redefinition of typedef of the
> same type without error:

It doesn't. Your compilers should have produced a diagnostic,
because the subsequent occurrences of the identifier specified
an existing type name.

This is a standard issue for implementors of headers (APIs).
Each typedef needs an associated "idempotency lock" to ensure
that it is seen no more than once.
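A minimal sketch of that idiom (the guard macro MYLIB_BYTE_DEFINED and the type name mylib_byte are made up for illustration; real headers use their own reserved names):

```c
/* Idempotency lock: each typedef is wrapped in a guard macro so that a
 * translation unit which pulls in both headers sees the typedef only
 * once.  MYLIB_BYTE_DEFINED and mylib_byte are invented names. */

/* ...as it would appear in a first header... */
#ifndef MYLIB_BYTE_DEFINED
#define MYLIB_BYTE_DEFINED
typedef unsigned char mylib_byte;
#endif

/* ...and repeated verbatim in a second header: skipped this time. */
#ifndef MYLIB_BYTE_DEFINED
#define MYLIB_BYTE_DEFINED
typedef unsigned char mylib_byte;
#endif

/* Uses the type, showing the guarded definition took effect exactly once. */
unsigned int lock_demo(void)
{
    mylib_byte b = 200;
    return b;
}
```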

Clive D.W. Feather

unread,
Oct 1, 1999, 3:00:00 AM10/1/99
to
In article <rv9mco...@corp.supernews.com>, Bill A.
<bi...@megahits.com> writes

>Hello. I can't find where Standard C allows redefinition of typedef of the
>same type without error:

It doesn't.

>typedef unsigned char Byte;
>
>typedef unsigned char Byte; // This is OK using several compilers

This is forbidden by all conforming C89 and C9X compilers; I think a
diagnostic is required, but I haven't got the FDIS to hand.

--
Clive D.W. Feather | Internet Expert | Work: <cl...@demon.net>
Tel: +44 20 8371 1138 | Demon Internet Ltd. | Home: <cl...@davros.org>
Fax: +44 20 8371 1037 | | Web: <http://www.davros.org>
Written on my laptop; please observe the Reply-To address

Ben Combee

unread,
Oct 2, 1999, 3:00:00 AM10/2/99
to
Clive D.W. Feather wrote:
> In article <rv9mco...@corp.supernews.com>, Bill A.
> <bi...@megahits.com> writes
> >Hello. I can't find where Standard C allows redefinition of typedef of the
> >same type without error:
>
> It doesn't.
>
> >typedef unsigned char Byte;
> >
> >typedef unsigned char Byte; // This is OK using several compilers
>
> This is forbidden by all conforming C89 and C9X compilers; I think a
> diagnostic is required, but I haven't got the FDIS to hand.

To give an example, CodeWarrior will raise an error about benign
typedef redefinition when in ANSI strict mode, and warn about it
otherwise, although it does silently allow it through in "MS
Extensions" mode when "extended error checking" is turned off, as this
problem shows up a lot in Microsoft's Win32 header files.

--
Ben Combee <bco...@metrowerks.com> -- x86/Win32/Linux/NetWare CompilerWarrior

David R Tribble

unread,
Oct 4, 1999, 3:00:00 AM10/4/99
to
"Douglas A. Gwyn" wrote:

>
> "Bill A." wrote:
>> Hello. I can't find where Standard C allows redefinition of typedef
>> of the same type without error:
>
> It doesn't. Your compilers should have produced a diagnostic,
> because the subsequent occurrences of the identifier specified
> an existing type name.
>
> This is a standard issue for implementors of headers (APIs).
> Each typedef needs an associated "idempotency lock" to ensure
> that it is seen no more than once.

I've been of the opinion that "benign" (i.e., token-for-token
identical) typedef redefinitions should be allowed in C. C++
allows them, IIRC. And C does allow benign preprocessor macro
definitions.

I really haven't seen a convincing argument to the contrary.
What, exactly, would be wrong with allowing:

// <stddef.h>

typedef unsigned int size_t;
...

// <stdlib.h>

typedef unsigned int size_t;
...

-- David R. Tribble, da...@tribble.com, http://www.david.tribble.com --

Clive D.W. Feather

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to
In article <37F92E7E...@tribble.com>, David R Tribble
<da...@tribble.com> writes

>I've been of the opinion that "benign" (i.e., token-for-token
>identical) typedef redefinitions should be allowed in C. C++
>allows them, IIRC.

This would be a major change for C, because it requires some equivalent
of the One Definition Rule. Actually, it would be even more painful:

int n;
....
typedef int vector [n++];
....
typedef int vector [n++];

These two are *not* identical, but it can't be determined until runtime.

>And C does allow benign preprocessor macro
>definitions.

Those require exactly the same token sequences, which is easy to test.
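For instance, the rule being described can be seen in a sketch like this (BUFLEN is an arbitrary example name):

```c
/* C permits redefining an object-like macro only when the replacement
 * list is an identical token sequence (whitespace differences aside). */
#define BUFLEN 128
#define BUFLEN 128      /* OK: identical token sequence */
/* #define BUFLEN (128)    constraint violation: different tokens */

int buflen_value(void)
{
    return BUFLEN;
}
```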

Douglas A. Gwyn

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to
David R Tribble wrote:
> I really haven't seen a convincing argument to the contrary.

There isn't an argument, from the programmer's perspective.
But it does make the compiler have to work a little harder.
Basically, "it's always been that way", so the existing
practice was canonicalized in the standard. It could be
changed in a future revision; if somebody is collecting a
list for C0x, please put this on the list.

Tore Lund

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to
Clive D.W. Feather wrote:
>
> int n;
> ....
> typedef int vector [n++];

Sorry for being dense, but is this valid C? And how would such a
typedef be used in a program? Surely I must be missing something.
--
Tore Lund <tl...@online.no>


Morris M. Keesan

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to

You mean C20XX, certainly, not C0X.
Have we learned nothing in the past few years?
--
Morris Keesan -- mke...@kenan.com

James Kuyper Jr.

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to
Tore Lund wrote:
>
> Clive D.W. Feather wrote:
> >
> > int n;
> > ....
> > typedef int vector [n++];
>
> Sorry for being dense, but is this valid C? And how would such a
> typedef be used in a program? Surely I must be missing something.

It will be valid in C99, which adds the concept of Variable Length
Arrays (VLAs).
Based upon n869, the 1999-01-18 draft of the C99 standard:
VLAs can't be declared 'static' or 'extern'. sizeof() becomes a
run-time expression which evaluates its operand, if that operand is a
VLA. The typedef given above can only be declared and used within the
scope of 'n'. The 'n++' expression gets re-evaluated each time a
statement 'n++;' in place of the declaration would be executed. It does
NOT get re-evaluated each time the typedef is used. It's unspecified
whether side-effects of evaluating the expression actually occur - 'n'
might or might not increase in value. As a result, I can't think of any
good reason for putting expressions like this that have side-effects in
a VLA declaration.

A plausible usage of VLA's is:

void func(int, int, int [*][*]);

void func(int m, int n, int A[m][n])
{
    /* The 'm' in A[m][n] is included only for internal documentation.
     * Just as with ordinary arrays, the parameter A[m][n] is
     * equivalent in this context to A[][n] or int (*A)[n].
     */
    int row[n];

    /* code. */
}

Among other advantages, this will make multi-dimensional arrays a lot
more convenient to use than they currently are.

This is one of the three most interesting new features in C99, in my
opinion; the other two are compound literals and designated
initializers.

John Hauser

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to

Bill A:
> But I suppose a question would be, what to do with:
>
> typedef long slong; // and
> typedef signed long slong;

If type equivalence is the rule, this is not ambiguous. `long' is
identical to `signed long'.

> or worse
>
> typedef char byte; // and
> typedef unsigned char byte;
>
> A compiler could default chars to signed making these unequivalent.

And so they would and should be. I can't see this as a problem.

- John Hauser

Keith Thompson

unread,
Oct 5, 1999, 3:00:00 AM10/5/99
to
"Bill A" <bi...@megahits.com> writes:
[...]

> typedef char byte; // and
> typedef unsigned char byte;
>
> A compiler could default chars to signed making these unequivalent.

The types "char" and "unsigned char" are distinct, even if plain char
happens to be unsigned.

This usually doesn't matter much, since they're compatible in most
contexts, but "char *" and "unsigned char *" are distinct and
incompatible.

If your plain char happens to be signed, you can change each
occurrence of "unsigned" to "signed" in the above two paragraphs.
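A small sketch of that distinction (read_via_unsigned is a made-up name; the commented-out line is the one that requires a diagnostic):

```c
/* "char" and "unsigned char" are distinct types even when plain char is
 * unsigned, so converting between the pointer types needs a cast.  The
 * two types share representation and alignment, so the value read
 * through either pointer is the same for values both can represent. */
unsigned char read_via_unsigned(char *pc)
{
    /* unsigned char *up = pc;    constraint violation: incompatible
                                  pointer types, diagnostic required */
    unsigned char *up = (unsigned char *)pc;  /* fine with a cast */
    return *up;
}
```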

--
Keith Thompson (The_Other_Keith) k...@cts.com <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
"Oh my gosh! You are SO ahead of your time!" -- anon.

Bill A

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
I think it's pretty easy to check whether a typedef is being redefined,
since all compilers already have to be able to detect equivalent and
compatible types. But I suppose a question would be, what to do with:

typedef long slong; // and
typedef signed long slong;

or worse

typedef char byte; // and
typedef unsigned char byte;

A compiler could default chars to signed making these unequivalent.

Douglas A. Gwyn wrote in message <37FA0061...@arl.mil>...

Tore Lund

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
James Kuyper Jr. wrote:
>
> Tore Lund wrote:
> >
> > Clive D.W. Feather wrote:
> > >
> > > int n;
> > > ....
> > > typedef int vector [n++];
> >
> > Sorry for being dense, but is this valid C? And how would such a
> > typedef be used in a program? Surely I must be missing something.
>
> It will be valid in C99, which adds the concept of a Variable Length
> Arrays (VLAs). [snip]

Thank you.
--
Tore Lund <tl...@online.no>


Clive D.W. Feather

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
In article <37FA8FDB...@wizard.net>, James Kuyper Jr.
<kuy...@wizard.net> writes

>> > int n;
>> > ....
>> > typedef int vector [n++];

>The 'n++' expression gets re-evaluated each time a
>statement 'n++;' in place of the declaration would be executed. It does
>NOT get re-evaluated each time the typedef is used.

Right.

>It's unspecified
>whether side-effects of evaluating the expression actually occur - 'n'
>might or might not increase in value.

Wrong. This is something that got changed at Portland - that expression
is *definitely* evaluated normally, exactly as if it read:

int __n = n++;
typedef int vector [__n];

The relevant wording now reads:

[#5] If the size is an expression that is not an integer
constant expression: if it occurs in a declaration at
function prototype scope, it is treated as if it were
replaced by *; otherwise, each time it is evaluated it shall
have a value greater than zero. The size of each instance
of a variable length array type does not change during its
lifetime. Where a size expression is part of the operand of
a sizeof operator and changing the value of the size
expression would not affect the result of the operator, it
is unspecified whether or not the size expression is
evaluated.

This last bit was a "political" compromise; it means that:

sizeof (void (*) (int [n++]))

might not increment n, though:

sizeof (int [n++])

definitely will.

James Kuyper Jr.

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
"Clive D.W. Feather" wrote:
...

> Wrong. This is something that got changed at Portland - that expression

Thanks - I was pretty sure there'd been a change, but I wasn't sure
exactly what it was. I figured it was safer to explicitly cite n869, and
correctly describe what it says, than to risk incorrectly describing
what the final standard says.

Paul Jarc

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
"Bill A" <bi...@megahits.com> writes:
> typedef char byte; // and
> typedef unsigned char byte;
>
> A compiler could default chars to signed making these unequivalent.

In this case, they'd not be equivalent anyway. char has the same
alignment, range, and representation as either signed or unsigned
char, but it is still a distinct, incompatible type from each of them.
Similarly, void* and {signed, unsigned, plain} char* all have the same
alignment and representation, but are distinct, incompatible types.


paul

David R Tribble

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
Bill A wrote:
>
> I think it's pretty easy to check if a typedef is being defined and
> all compilers have to be able to detect equivalent and compatible
> types. So the check is easy. But I suppose a question would be,
> what to do with:
>
> typedef long slong; // and
> typedef signed long slong;

These are equivalent types, by definition.

extern void foo(long x);

void bar(signed long y)
{
foo(y); // Okay, compatible types
}

> or worse
>
> typedef char byte; // and
> typedef unsigned char byte;
>
> A compiler could default chars to signed making these unequivalent.

These are not equivalent types, by definition.

extern void foo(char x);

void bar(signed char y)
{
foo(y); // Error, incompatible types
}

The concept of "compatible types" is defined quite precisely in
the standard. It is this concept that is used to determine if
assignments and function arguments are type-compatible, and it is
the very same kind of logic that would be used to determine whether
a typedef redefinition is benign or not.

David R Tribble

unread,
Oct 6, 1999, 3:00:00 AM10/6/99
to
"Clive D.W. Feather" wrote:
>
> David R Tribble <da...@tribble.com> writes:
>> I've been of the opinion that "benign" (i.e., token-for-token
>> identical) typedef redefinitions should be allowed in C. C++
>> allows them, IIRC.
>
> This would be a major change for C, because it requires some
> equivalent of the One Definition Rule. Actually, it would be even
> more painful:
>
> int n;
> ....
> typedef int vector [n++];
> ....
> typedef int vector [n++];
>
> These two are *not* identical, but it can't be determined until
> runtime.

Eesh. Perhaps in order to allow benign typedef redefinitions, we
would have to exclude typedef redefinitions involving VLAs.
(One of these boogers per scope ought to be enough for anyone anyway.)

As a side note, I favor the idea that expressions with potential
side effects be explicitly disallowed within VLA size declarations.

>> And C does allow benign preprocessor macro definitions.
>
> Those require exactly the same token sequences, which is easy to
> test.

Yes, but since a C compiler must check for compatibility between
types (for things like assignment expressions and function arguments),
it would not be that difficult for a compiler to check that two
typedefs are compatible (i.e., the "same" type). The rules for
this, which are pretty much already laid out in 6.2.7 et al, should
allow:

typedef int int_type;
typedef int int_type; // compatible
typedef signed int int_type; // compatible
typedef int (int_type); // compatible

typedef int * (*pif)();
typedef int_type * (((*(pif)))()); // compatible
typedef int * (*pif)(void); // incompatible, error

Bill A

unread,
Oct 7, 1999, 3:00:00 AM10/7/99
to
James Kuyper Jr. wrote in message <37FA8FDB...@wizard.net>...

>It will be valid in C99, which adds the concept of a Variable Length
>Arrays (VLAs).

This leads me to another puzzle, though not about the typedef.

Compilers "know" the start of an auto array at compile time (frame pointer
plus some constant) and can use that to calculate the offset for x[5].
However, once the compiler sees 'char x[n];' followed by 'char y[n];', how
does it calculate the address of y[5]? Does the compiler store n in a
temporary so that it can calculate the address of y[5] as '&x+n+5'?

And I assume for each VLA declaration, at runtime the frame pointer is
adjusted? Compilers that allocated all local storage once on function entry
(allowing some sections to overlap for nested scopes) must now adjust the
frame pointer as scopes are entered and exited if the scope has VLA.

Thanks,
Bill


James Kuyper Jr.

unread,
Oct 7, 1999, 3:00:00 AM10/7/99
to
Bill A wrote:
...

> Compilers "know" the start of an auto array at compile time (frame pointer
> plus some constant) and can use that to calculate the offset for x[5].
> However, once the compiler sees 'char x[n];' followed by 'char y[n];', how
> does it calculate the address of y[5]? Does the compiler store n in a
> temporary so that it can calculate the address of y[5] as '&x+n+5'?

It often will need to store a temporary, but usually not for that
purpose. The best way to implement VLAs depends upon the platform, but
one possible way is to make VLA's roughly equivalent to pointers
initialized with the results of an alloca() call. There are some subtle
differences, but that's the basic idea.
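A rough simulation of that idea, with malloc() standing in for whatever stack or alloca()-style mechanism a real implementation would use (vla_lowering_demo is a made-up name; no actual compiler is obliged to work this way):

```c
#include <stdlib.h>

/* Sketch of one lowering for "char x[n]; char y[n];": the size
 * expression is evaluated once into a hidden temporary, and each array
 * is addressed through its own base pointer, so y[5] is simply
 * *(y_base + 5), with no dependence on where x sits. */
char vla_lowering_demo(int n)
{
    size_t saved_n = (size_t)n;   /* hidden temporary holding the size */
    char *x = malloc(saved_n);    /* base pointer for x */
    char *y = malloc(saved_n);    /* base pointer for y */
    char result = '?';

    if (x != NULL && y != NULL && n > 5) {
        y[5] = 'Q';               /* compiles to *(y + 5) = 'Q' */
        result = *(y + 5);
    }
    free(x);
    free(y);
    return result;
}
```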

Nick Maclaren

unread,
Oct 7, 1999, 3:00:00 AM10/7/99
to

In article <37FC739C...@wizard.net>,

Yes. Many compilers already do that, either for all automatic arrays
or for ones larger than the direct addressability from the stack
pointer. The extra tweaks for VLAs are trivial, even for compilers
without alloca, except for a couple of extreme types of stack
implementation (which typically cause hell for debuggers anyway.)

Permitting redefinition would make it impossible to use a stack for
VLAs.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

Larry Jones

unread,
Oct 7, 1999, 3:00:00 AM10/7/99
to
David R Tribble (da...@tribble.com) wrote:
>
> As a side note, I favor the idea that expressions with potential
> side effects be explicitly disallowed within VLA size declarations.

Unfortunately, function calls are expressions with potential side
effects and they are not only reasonable but highly desirable in VLA
size declarations.

-Larry Jones

Fortunately, that was our plan from the start. -- Calvin

Doug Gwyn (ISTD/CNS) <gwyn>

unread,
Oct 7, 1999, 3:00:00 AM10/7/99
to
"Morris M. Keesan" wrote:
> You mean C20XX, certainly, not C0X.

Don't tell me what I mean.

> Have we learned nothing in the past few years?

Apparently some of us haven't.
C0x is just as valid an abbreviation as C9x.

Clive D.W. Feather

unread,
Oct 8, 1999, 3:00:00 AM10/8/99
to
In article <37FBF550...@tribble.com>, David R Tribble
<da...@tribble.com> writes

>> These two are *not* identical, but it can't be determined until
>> runtime.
>
>Eesh. Perhaps in order to allow benign typedef redefinitions, we
>would have to exclude typedef redefinitions involving VLAs.

Meaning you've got an odd wart in the system for no good reason (I don't
see allowing redefinition as important enough to break anything else).

>(One of these boogers per scope ought to be enough for anyone anyway.)

Why ?

>As a side note, I favor the idea that expressions with potential
>side effects be explicitly disallowed within VLA size declarations.

Why should *this* particular use of "expression" be constrained
differently to all other uses ? If "n++" is the most natural way to
write code, why should it be forbidden ? Orthogonality of constructs is
a highly useful feature of a language, and breaking it should *never* be
done lightly.

For example, given that:

 int x = a + b-- * c;

and:

 int x;
 x = a + b-- * c;

are equivalent, explain why:

 int v [a + b-- * c];

and:

 int v_size = a + b-- * c;
 int v [v_size];

shouldn't be.

In any case, what about function calls ? There was a *lot* of desire to
allow:

int v [get_size_of_table ()];

>>> And C does allow benign preprocessor macro definitions.
>> Those require exactly the same token sequences, which is easy to
>> test.
>Yes, but since a C compiler must check for compatibility between
>types (for things like assignment expressions and function arguments),
>it would not be that difficult for a compiler to check that two
>typedefs are compatible (i.e., the "same" type).

Not at compile time, it can't, because of VLAs.

> The rules for
>this, which are pretty much already laid out in 6.2.7 et al, should
>allow:

int v [n];
// ...
int v [n];

?

David R Tribble

unread,
Oct 8, 1999, 3:00:00 AM10/8/99
to
"Clive D.W. Feather" wrote:
>
> David R Tribble <da...@tribble.com> writes:
> >> These two are *not* identical, but it can't be determined until
> >> runtime.
> >
> > Eesh. Perhaps in order to allow benign typedef redefinitions, we
> > would have to exclude typedef redefinitions involving VLAs.
>
> Meaning you've got an odd wart in the system for no good reason

VLAs already create warts on existing language features, e.g., sizeof.

> (I don't see allowing redefinition as important enough to break
> anything else).

Well, for one thing, it would make writing header files containing
typedefs a lot easier, and not require cluttering up the preprocessor
(global) namespace.

Typical implementation:

// <stddef.h>

#ifndef __size_t_def
typedef unsigned size_t;
#define __size_t_def
#endif
etc...

// <stdio.h>

...same thing...

// <stdlib.h>

...same thing...

New implementation:

// <stddef.h>

typedef unsigned size_t;
etc...

// <stdio.h>

typedef unsigned size_t;
etc...

// <stdlib.h>

typedef unsigned size_t;
etc...

This has the added advantage that if one of the typedefs is changed,
and a program includes more than one header file containing that
typedef, an error will be issued; under current practice, you only
get the first typedef, correct or not, with no warnings.

I don't see how allowing benign typedef redefinitions will break
anything else in the language.

David R Tribble

unread,
Oct 8, 1999, 3:00:00 AM10/8/99
to
Clive D.W. Feather wrote:
>
> David R Tribble <da...@tribble.com> writes:
> [about VLA definitions]

>> (One of these boogers per scope ought to be enough for anyone
>> anyway.)
>
> Why ?

I meant that one typedef involving a VLA within the scope of the VLA
is probably sufficient for most purposes. I should have included a
smiley.

David R Tribble

unread,
Oct 8, 1999, 3:00:00 AM10/8/99
to
Clive D.W. Feather wrote:
>
> David R Tribble <da...@tribble.com> writes:
>> As a side note, I favor the idea that expressions with potential
>> side effects be explicitly disallowed within VLA size declarations.
>
> Why should *this* particular use of "expression" be constrained
> differently to all other uses ? If "n++" is the most natural way to
> write code, why should it be forbidden ? Orthogonality of constructs
> is a highly useful feature of a language, and breaking it should
> *never* be done lightly.

After being reminded that function calls have side effects, I take it
back.

I thought there were still some ambiguities to resolve about
VLA definitions having expressions with side effects, or are they
all thrashed out by now?

Clive D.W. Feather

unread,
Oct 8, 1999, 3:00:00 AM10/8/99
to
In article <37FE2635...@tribble.com>, David R Tribble
<da...@tribble.com> writes

>I thought there were still some ambiguities to resolve about
>VLA definitions having expressions with side effects, or are they
>all thrashed out by now?

All thrashed out. I thought I posted the relevant words.

Bill A

unread,
Oct 9, 1999, 3:00:00 AM10/9/99
to
Compilers for small embedded processors will have to handle this differently
since many have no dynamic memory allocation and therefore don't provide
alloca, etc. I would also say that adjusting the stack dynamically would be
more efficient in these systems, which is another benefit.

What happens with a VLA (either using alloca or a stack pointer
adjustment) when there is insufficient memory to accommodate the VLA?

What is the result of a VLA declaration when the expression evaluates to 0?

Thanks,
Bill

James Kuyper Jr. wrote in message <37FC739C...@wizard.net>...

Bill A

unread,
Oct 9, 1999, 3:00:00 AM10/9/99
to
Which leads me to ask, since most (if not all) production compilers merge
preprocessing and lexical analysis, was it ever considered to allow #ifdef
to work with any global namespace identifier? E.g.:

#ifndef Byte
typedef unsigned char Byte;
#endif

Bill

David R Tribble wrote in message <37FE24AA...@tribble.com>...

James Kuyper Jr.

unread,
Oct 9, 1999, 3:00:00 AM10/9/99
to
Bill A wrote:
>
> Compilers for small embedded processors will have to handle this differently
> since many have no dynamic memory allocation and therefore don't provide
> alloca, etc. I would also say that adjusting the stack dynamically would be
> more efficient in these systems, which is another benefit.

Adjusting the stack dynamically is what alloca() does. Don't confuse it
with malloc().

> What happens with a VLA (either using alloca or a stack pointer
> adjustment) when there is insufficient memory to accommodate the VLA?

Roughly the same thing that happens when there's insufficient memory for
any other automatic object, which is undefined behavior due to exceeding
an implementation's memory limits. The main difference is that the VLA
may come from a different memory source than other automatic variables,
so it might run out of memory under different circumstances.

Whether it will run out more easily or less easily is an implementation
detail. However, if the alternative to a VLA was a fixed length array
large enough to hold the largest possible request, then the VLA approach
will place less strain on overall memory resources. It may make things
worse, despite that, if it increases the strain on a memory source that
is already near its limit.

> What is the result of a VLA declaration when the expression evaluates to 0?

6.7.5.2 p3: "If the size expression is not a constant expression, and it
is evaluated at program execution time, it shall evaluate to a value
greater than zero."
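So portable code must keep the size positive itself before the declaration is reached; a minimal sketch, with a made-up helper name:

```c
/* A non-constant VLA size that evaluates to zero or less gives
 * undefined behavior in C99, so guard the value before the VLA is
 * declared. */
long sum_copy(int n, const int *src)
{
    if (n <= 0)
        return 0;        /* never reach the VLA with n <= 0 */

    int copy[n];         /* C99 VLA; n is known positive here */
    long total = 0;
    for (int i = 0; i < n; i++) {
        copy[i] = src[i];
        total += copy[i];
    }
    return total;
}
```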

Keith Thompson

unread,
Oct 9, 1999, 3:00:00 AM10/9/99
to
"Bill A" <bi...@megahits.com> writes:
> Which leads me to ask, since most (if not all) production compilers merge
> preprocessing and lexical analysis, was it ever considered to allow #ifdef
> to work with any global namespace identifier? E.g.:
>
> #ifndef Byte
> typedef unsigned char Byte;
> #endif
>
> Bill

That syntax wouldn't work; Byte could independently be defined as
a macro and/or as a typedef. Conceivably you could do something like:

#if !istypedef(Byte)
typedef unsigned char Byte;
#endif

For users, something like this could be extremely handy. For
implementers, it would be ugly. It's bad enough that the parser has
to know about typedef names; forcing that information all the way back
to the preprocessor would just make it worse. Remember that a lot of
compilers still implement the preprocessor as a separate program; the
standard is carefully written to allow this.

Realistically, I don't think there's much chance of this happening;
it's far too late for C9X^H^H^H C99. The compiler most likely to
implement something like this as an extension, gcc, uses a separate
preprocessor.

(The real problem, IMHO, is C's weak namespace control. If you could
define a type Byte without regard to whether something you've imported
uses the same name, this wouldn't be necessary. One common solution
is to call it something like xyz_Byte, where xyz_ is a unique prefix.)

Douglas A. Gwyn

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
Bill A wrote:
> #ifndef Byte
> typedef unsigned char Byte;
> #endif

Can't do anything like that, because it doesn't
fit the model for phases of translation.

Nick Maclaren

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
In article <37FFF8D9...@null.net>,

Look, those things were designed for descriptive convenience and aren't
the Seven Commandments!

I agree that, the way that the language is constructed and described,
such a facility would be a right pain to add - and would need great
care not to introduce catastrophic inconsistencies. That is the REAL
reason for not wanting to rock the boat of the translation phases.

However, as has been pointed out, there are enough problems with the
translation phases that there is a case for reworking them entirely.
IF that were done, THEN this (and many related requests, such as a
preprocessor sizeof) could be reconsidered. I don't know of anyone
who has the stomach for such a task.

James Kuyper Jr.

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
Nick Maclaren wrote:
>
> In article <37FFF8D9...@null.net>,
> Douglas A. Gwyn <DAG...@null.net> wrote:
> >Bill A wrote:
> >> #ifndef Byte
> >> typedef unsigned char Byte;
> >> #endif
> >
> >Can't do anything like that, because it doesn't
> >fit the model for phases of translation.
>
> Look, those things were designed for descriptive convenience and aren't
> the Seven Commandments!
>
> I agree that, the way that the language is constructed and described,
> such a facility would be a right pain to add - and would need great
> care not to introduce catastrophic inconsistencies. That is the REAL
> reason for not wanting to rock the boat of the translation phases.
>
> However, as has been pointed out, there are enough problems with the
> translation phases that there is a case for reworking them entirely.
> IF that were done, THEN this (and many related requests, such as a
> preprocessor sizeof) could be reconsidered. I don't know of anyone
> who has the stomach for such a task.

There are two related ideas that I think should be considered (way too
late for C99, of course):

1. A new kind of conditional, sort of a mix between #if and if(). The
condition would be evaluated in phase 7, which means that unlike if(),
the condition would be required to be a constant expression. Unlike #if,
the condition could refer to things such as sizeof().
If the condition turns out false, the code controlled by this new
conditional would be processed only through phase 6, which is farther
than #if, but not as far as if().

The purpose: to allow the writing of code that would be defective if a
given condition is false, and to prevent it from being a defect by
putting it inside a test of that condition.

Unfortunately, I haven't thought up a good syntax for this idea. #if() ?
:-)

2. Several additional tests on identifiers that count as constant
expressions, usable either in if() or in this new conditional. One
example would be something like 'is_declared(my_symbol)', which would
test for the existence of an in-scope declaration of an identifier,
which would cover Bill's question. Another would be same_type(A,B),
which returns one value if the two identifiers identify the same type, a
different value if they identify compatible types, and 0 otherwise.
Finally, tests that can be applied to arbitrary identifiers, to
determine various characteristics of the things they identify:
is_function(), is_object(), is_type(), is_typedef(), etc.
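Purely as a hypothetical sketch (none of this syntax exists in any C standard or draft; the `#if7` spelling is invented here only to make the idea concrete):

```
/* Hypothetical: a "phase 7" conditional using the proposed tests.
   is_declared() and same_type() do not exist in any C standard. */
#if7 is_declared(Byte) && same_type(Byte, unsigned char)
    /* Byte is already declared as the right type; nothing to do */
#else7
    typedef unsigned char Byte;
#endif7
```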

Douglas A. Gwyn

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
Nick Maclaren wrote:
> Look, those things were designed for descriptive convenience and aren't
> the Seven Commandments!

That preprocessing is performed first *is* a Commandment.

Bill A

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
Were these translation phases designed as a direct consequence of the
computing power and resources the systems had at the time (20 years ago)?
Systems now have more RAM than those systems had in hard disk space, so
wouldn't it make sense to move into the future, reconsidering the
translation phases in light of what compilers currently do?

Was preprocessing separated from scanning and lexical analysis because of
the times? I.e. there wasn't enough program memory to combine these. When
these activities are combined (as is the case with many compilers I know of)
the information available to the preprocessor allows all of these features
to be possible.

If one argues that this is too much of a change to require of a
compiler, I argue that it's not that bad, and that it's a small fraction
of all the other changes that standard revisions have required over the
past few years.

Bill

Nick Maclaren wrote in message <7tpk2u$os$1...@pegasus.csx.cam.ac.uk>...

Peter Curran

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
On Sun, 10 Oct 1999 09:51:34 -0400, "James Kuyper Jr."
<kuy...@wizard.net> wrote:

<snip>


>There are two related ideas that I think should be considered (way too
>late for C99, of course):
>
>1. A new kind of conditional, sort of a mix between #if and if(). The
>condition would be evaluated in phase 7, which means that unlike if(),
>the condition would be required to be a constant expression. Unlike #if,
>the condition could refer to things such as sizeof().
>If the condition turns out false, the code controlled by this new
>conditional would be processed only through phase 6, which is farther
>than #if, but not as far as if().
>
>The purpose: to allow the writing of code that would be defective if a
>given condition is false, and to prevent it from being a defect by
>putting it inside a test of that condition.
>
>Unfortunately, I haven't thought up a good syntax for this idea. #if() ?
>:-)
>
>2. Several additional tests on identifiers that count as constant
>expressions, usable either in if() or in this new conditional. One
>example would be something like 'is_declared(my_symbol)', which would
>test for the existence of an in-scope declaration of an identifier,
>which would cover Bill's question. Another would be same_type(A,B),
>which returns one value if the two identifiers identify the same type, a
>different value if they identify compatible types, and 0 otherwise.
>Finally, tests that can be applied to arbitrary identifiers, to
>determine various characteristics of the things they identify:
>is_function(), is_object(), is_type(), is_typedef(), etc.

Generalizing this, it seems to me that the whole concept of a
preprocessor should be eliminated. As others have pointed out, the
need for a preprocessor indicates flaws in the language. With the
addition to the language of enums, inline functions, consts,
optimizers that throw away dead code, etc., virtually all of the
intended uses of the preprocessor have been eliminated. (Yes, I
realize the preceding are not exact replacements for the corresponding
preprocessor functionality.) About the only thing left that isn't
available in another form is #include, and other languages have shown
good alternatives for this that would enhance C.

However, I don't think this will happen to C in the foreseeable
future.

--
Peter Curran pcu...@acm.gov (chg gov=>org)

James Kuyper Jr.

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
Bill A wrote:
>
> Were these translation phases designed as a direct consequence of the
> computing power and resources the systems had at the time (20 years ago).

The phases were defined to control the interactions between various
parts of the translation process - they've little to do with any
particular implementation's limitations.

Mark Williams

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to

But it's not a commandment that preprocessing (for the entire file)
finish before any of the tokens it produced are interpreted.

Mark Williams

Bill A

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
I don't think a language standard should dictate how a compiler is
implemented, and it sounds like by specifying translation phases it has
done this. I agree, as someone already pointed out, that it's not likely
ever to be changed. However, that isn't to say it doesn't need to be.

Bill

James Kuyper Jr. wrote in message <3800D91F...@wizard.net>...

James Kuyper Jr.

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
Bill A wrote:
>
> I don't think a language standard should be involved in dictating how a
> compiler is implemented, and it sounds like by specifying translation phases
> it has done this. I agree as someone already pointed out, it's not likely
> to ever be changed. However, this isn't to say it doesn't need to be.

As I said - the phases are there to control the interaction between
different language features required by the standard; not to force an
implementation to actually implement those features in the order given.
The standard explicitly says so in note 5: "Implementations shall behave
as if these separate phases occur, even though many are typically folded
together in practice". All that matters is that the results be the same
as if they were implemented in that order.

Dennis Yelle

unread,
Oct 10, 1999, 3:00:00 AM10/10/99
to
"Douglas A. Gwyn" wrote:
>
> Bill A wrote:
> > #ifndef Byte
> > typedef unsigned char Byte;
> > #endif
>
> Can't do anything like that, because it doesn't
> fit the model for phases of translation.

The standard's current model for phases of translation is silly.
It reminds me of something one might have done back in the 1960's
if one was trying to compile a language similar to C on a machine
with only about 64K bytes of main memory.

Dennis Yelle


Douglas A. Gwyn

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Bill A wrote:
> I don't think a language standard should be involved in dictating how
> a compiler is implemented, and it sounds like by specifying
> translation phases it has done this.

No! Translation phases specify a sequencing of many of the individual
transformations that make up the complete translation task. Since
the individual transformations are noncommutative, their order matters.
The *programmer* has to know the order of transformation in order to
get the right results; if this were not specified, portability would
be severely impacted.

Implementations are no more "dictated" by this than by any other part
of the spec. They have to perform the translation properly, but how
they accomplish that varies.

Nick Maclaren

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to

In article <38009A55...@null.net>, "Douglas A. Gwyn" <DAG...@null.net> writes:
|> Nick Maclaren wrote:
|> > Look, those things were designed for descriptive convenience and aren't
|> > the Seven Commandments!
|>
|> That preprocessing is performed first *is* a Commandment.

Just like "long is the widest signed integer type" or "Keep the
language small and simple"? Or any of the other dozen or so major
assumptions that were once generally true but were changed (often,
but not always, for good reasons) in C89 or C9X?

James Kuyper Jr.

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Dennis Yelle wrote:
>
> "Douglas A. Gwyn" wrote:
> >
> > Bill A wrote:
> > > #ifndef Byte
> > > typedef unsigned char Byte;
> > > #endif
> >
> > Can't do anything like that, because it doesn't
> > fit the model for phases of translation.
>
> The standard's current model for phases of translation is silly.
> It reminds me of something one might have done back in the 1960's
> if one was trying to compile a language similar to C on a machine
> with only about 64K bytes of main memory.
>
> Dennis Yelle

OK - how would you go about describing the proper interaction between
the following language features, without using the concept of phases? Is
your alternative equally clear? No more ambiguous? Meaningfully
different, yet backward compatible?

1. Mapping from physical source file multibyte characters to the source
character set.
2. Trigraph substitution (please don't reopen that debate; assume for
the purpose of your answer that we're keeping them).
3. Backslash escaping of new-line characters.
4. Parsing into pre-processing tokens.
5. Replacement of comments by single white-space characters.
6. Implementation of pre-processing directives.
7. Macro expansion.
8. Conversion of source character set members and escape sequences to
the execution character set in character and string literals.
9. Catenation of adjacent string literal tokens.
10. Parsing and syntactic and semantic analysis.
11. Resolution of external object and function references.
12. Creation of program image.

Nick Maclaren

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to

In article <3801D07A...@wizard.net>, "James Kuyper Jr." <kuy...@wizard.net> writes:
|> Dennis Yelle wrote:
|> > "Douglas A. Gwyn" wrote:
|> > >
|> > > Can't do anything like that, because it doesn't
|> > > fit the model for phases of translation.
|> >
|> > The standard's current model for phases of translation is silly.
|> > It reminds me of something one might have done back in the 1960's
|> > if one was trying to compile a language similar to C on a machine
|> > with only about 64K bytes of main memory.
|>
|> OK - how would you go about describing the proper interaction between
|> the following language features, without using the concept of phases? Is
|> your alternative equally clear? No more ambiguous? Meaningfully
|> different, yet backward compatible?
|>
|> 1. Mapping from physical source file multibyte characters to the source
|> character set.
|> 2. Trigraph substitution (please don't reopen that debate; assume for
|> the purpose of your answer that we're keeping them).
|> 3. Backslash escaping of new-line characters.
|> 4. Parsing into pre-processing tokens.
|> 5. Replacement of comments by single white-space characters.
|> 6. Implementation of pre-processing directives.
|> 7. Macro expansion.
|> 8. Conversion of source character set members and escape sequences to
|> the execution character set in character and string literals.
|> 9. Catenation of adjacent string literal tokens.
|> 10. Parsing and syntactic and semantic analysis.
|> 11. Resolution of external object and function references.
|> 12. Creation of program image.

One problem is that the current translation phases DON'T describe
their interactions properly, anyway! Yes, using a phase model is
sensible, but the current one has some quite serious flaws. For
example:

Your phases 6 and 10 are confounded, by the fact that the #if
definition says that expressions are interpreted as in execution,
despite there being semantic differences between preprocessing and
execution in expression handling. MOST of them are then tied down
by extra wording, but not all.

Your phase 10 is a composite phase, with several of its actions
required to be performed in certain orders, usually but not always
with extra wording to say so.

James Kuyper Jr.

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Nick Maclaren wrote:
>
> In article <3801D07A...@wizard.net>, "James Kuyper Jr." <kuy...@wizard.net> writes:
...

Agreed: I didn't mean to suggest that the phase model couldn't be
improved, merely that any worthwhile replacement would probably also
use phases - possibly more of them. I also didn't mean to suggest a
12-phase model: I was merely trying to list all the distinct things that
are controlled by the current model.

Nick Maclaren

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to

In article <3801E3C4...@wizard.net>, "James Kuyper Jr." <kuy...@wizard.net> writes:
|>
|> Agreed: I didn't mean to suggest that the phase model couldn't be
|> improved, merely that any worthwhile replacement would also probably
|> also use phases - possibly more of them. I also didn't mean to suggest a
|> 12-phase model: I was merely trying to list all the distinct things that
|> are controlled by the current model.

Ah! Sorry for misunderstanding. Yes, I agree with that.

David R Tribble

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Keith Thompson wrote:
[...]

> (The real problem, IMHO, is C's weak namespace control. If you could
> define a type Byte without regard to whether something you've imported
> uses the same name, this wouldn't be necessary. One common solution
> is to call it something like xyz_Byte, where xyz_ is a unique prefix.)

I once considered suggesting preprocessor namespaces, e.g.:

#define STD::NULL ...whatever...
#define STD::getchar() ...whatever...

But then I realized that this is not very different from simply
using unique prefixes:

#define STD_NULL ...whatever...
#define std_getchar() ...whatever...

And it's way too late to suggest that the standard library names
be prefixed with 'std_' and 'STD_', right? (Maybe as an
implementation-specific extension?)

OTOH, preprocessor namespaces could be abbreviated, assuming we
also added a preprocessor 'using' directive:

#using STD::NULL // NULL is alias for STD::NULL
#using STD // All FOO are aliases for all STD::FOO

I'm not overly enthusiastic about this idea, though.

David R Tribble

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Peter Curran wrote:
[...]

> Generalizing this, it seems to me that the whole concept of a
> preprocessor should be eliminated. As others have pointed out, the
> need for a preprocessor indicates flaws in the language. With the
> addition to the language of enums, inline functions, consts,
> optimizers that throw away dead code, etc., virtually all of the
> intended uses of the preprocessor have been eliminated. (Yes, I
> realize the preceding are not exact replacements for the corresponding
> preprocessor functionality.) About the only thing left that isn't
> available in another form is #include, and other languages have shown
> good alternates for this that would enhance C.

And little things like conditional compilation, for things like
writing code that is ported to many systems. Tell me, please, how
to write the following without preprocessing directives?:

#if OS_UNIX
...code for Unix, including extremely system-specific
...system calls like gettimeofday()
#elif OS_WIN32
...ditto, for GetSystemTimeAsFileTime()
#elif OS_MACOS
...
#else
#error Unsupported O/S
#endif

Or, here's an even simpler example:

#ifndef BUFSIZE
#define BUFSIZE 1024
#endif

static char buffer[BUFSIZE];

> However, I don't think this will happen to C in the foreseeable
> future.

Hopefully it won't ever happen.

The C preprocessor adds immense power to the language, allowing
programmers to do things that (still) can't be done any other way.

I've found that most proponents of eliminating the preprocessor
have never experienced porting a million lines of code to several
systems. I humbly ask them not to break the legs of those of us who
have.

David R Tribble

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Dennis Yelle wrote:
>
> "Douglas A. Gwyn" wrote:
> >
> > Bill A wrote:
> >> #ifndef Byte
> >> typedef unsigned char Byte;
> >> #endif
> >
> > Can't do anything like that, because it doesn't
> > fit the model for phases of translation.
>
> The standard's current model for phases of translation is silly.
> It reminds me of something one might have done back in the 1960's
> if one was trying to compile a language similar to C on a machine
> with only about 64K bytes of main memory.

Have you ever tried writing a preprocessor, lexical analyzer, or
parser for C? I have (many times), and I can tell you it's a damn
good thing that we have standardized phases of translation to work
with.

It could be worse. Compare standard C to COBOL some time if you
need convincing. (Picture strings, COPY directives, COPY REPLACING,
and pseudo-text, all of which interact with each other across levels,
are just a few of the joys you have to deal with in COBOL.) It
could be much, much worse.

As for the original example, you can always do this:


#ifndef Byte
typedef unsigned char Byte;

#define Byte Byte
#endif

David R Tribble

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Mark Williams wrote:
>
> Douglas A. Gwyn wrote:
> >
> > Nick Maclaren wrote:
> >> Look, those things were designed for descriptive convenience and
> >> aren't the Seven Commandments!
> >
> > That preprocessing is performed first *is* a Commandment.
>
> But its not a commandment that preprocessing (for the entire file)
> finish before interpreting any of the tokens that were produced by it.

But the standard allows either kind of behavior. Or to put it
another way, it doesn't *require* an implementation to be able to
process tokens before preprocessing is complete.

James Kuyper Jr.

unread,
Oct 11, 1999, 3:00:00 AM10/11/99
to
Peter Curran wrote:
>
> On Mon, 11 Oct 1999 16:22:01 -0500, David R Tribble
> <da...@tribble.com> wrote:
...

> >And little things like conditional compilation, for things like
> >writing code that is ported to many systems. Tell me, please, how
> >to write the following without preprocessing directives?:
> >
> > #if OS_UNIX
> > ...code for Unix, including extremely system-specific
> > ...system calls like gettimeofday()
> > #elif OS_WIN32
> > ...ditto, for GetSystemTimeAsFileTime()
> > #elif OS_MACOS
> > ...
> > #else
> > #error Unsupported O/S
> > #endif
> <snip>
>
> if (OS_UNIX) {
> ...
> }
> else if (OS_WIN32) {
> ...
> }
> etc.
>
> As I said in my posting, the non-preprocessor functionality is not
> fully equivalent to the preprocessor functionality. One obvious
> difference is that if statements aren't allowed outside of functions.
> Such details are not insurmountable problems.

Well other details are essentially insurmountable. Without a
preprocessor, or equivalent functionality, how do you place code inside
the 'if' that would be undefined behavior which terminates compilation
on one compiler, and a permitted extension on another?

Bill A

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
I agree totally - it's indispensable. We cannot eliminate it. But there
are shortcomings and reasonable changes that have already been mentioned
hundreds of times, and the preprocessor is being left as-is while the
rest of the language is allowed to grow with the times.

David R Tribble wrote in message <380254F9...@tribble.com>...

Peter Curran

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
On Mon, 11 Oct 1999 16:22:01 -0500, David R Tribble
<da...@tribble.com> wrote:

>Peter Curran wrote:
>[...]
>> Generalizing this, it seems to me that the whole concept of a
>> preprocessor should be eliminated. As others have pointed out, the
>> need for a preprocessor indicates flaws in the language. With the
>> addition to the language of enums, inline functions, consts,
>> optimizers that throw away dead code, etc., virtually all of the
>> intended uses of the preprocessor have been eliminated. (Yes, I
>> realize the preceding are not exact replacements for the corresponding
>> preprocessor functionality.) About the only thing left that isn't
>> available in another form is #include, and other languages have shown
>> good alternates for this that would enhance C.
>

>And little things like conditional compilation, for things like
>writing code that is ported to many systems. Tell me, please, how
>to write the following without preprocessing directives?:
>
> #if OS_UNIX
> ...code for Unix, including extremely system-specific
> ...system calls like gettimeofday()
> #elif OS_WIN32
> ...ditto, for GetSystemTimeAsFileTime()
> #elif OS_MACOS
> ...
> #else
> #error Unsupported O/S
> #endif
<snip>

if (OS_UNIX) {
...
}
else if (OS_WIN32) {
...
}
etc.

As I said in my posting, the non-preprocessor functionality is not
fully equivalent to the preprocessor functionality. One obvious
difference is that if statements aren't allowed outside of functions.
Such details are not insurmountable problems.

>> However, I don't think this will happen to C in the foreseeable


>> future.
>
>Hopefully it won't ever happen.
>

>The C preprocessor adds immense power to the language, allowing
>programmers to do things that (still) can't be done any other way.
>
>I've found that most proponents of eliminating the preprocessor
>have never experienced porting a million lines of code to several
>systems. I humbly ask them not to break the legs of those of us who
>have.

FWIW, I strongly suspect that I have ported more code that you will
ever see. My point (which is just repeating one that many others have
made before) is that the functionality of the preprocessor could, with
a little effort, be fully integrated into the language, eliminating
the bizarre limitations and differences that result from the use of a
separately defined preprocessor. Essentially no functionality need be
lost as a result.

I am not suggesting eliminating the functional capabilities that
result from the preprocessor, only that, with current compiler
technology, language modularity concepts, etc., these capabilities
could be integrated into the language more effectively and more
usefully. IMHO, it would be more sensible to move in that direction
than the yet-another-layer approach that the post to which I was
responding suggested. However, I don't think either is likely to be
adopted at this stage of the evolution of C.

"'For Historical Reasons' is the root of all evil."

Douglas A. Gwyn

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Dennis Yelle wrote:
> The standard's current model for phases of translation is silly.

It's easy to carp from the sidelines -- so what's *your*
suggestion for the *proper* phases of translation?
If they aren't close to the standard ones, then you're
not describing the C programming language.

Ross Ridge

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
David R Tribble <da...@tribble.com> wrote:
>And little things like conditional compilation, for things like
>writing code that is ported to many systems. Tell me, please, how
>to write the following without preprocessing directives?:
>
> #if OS_UNIX
> ...code for Unix, including extremely system-specific
> ...system calls like gettimeofday()
> #elif OS_WIN32
> ...ditto, for GetSystemTimeAsFileTime()
> #elif OS_MACOS
> ...
> #else
> #error Unsupported O/S
> #endif

extern int my_get_time();

Spaghetti #ifdefs inside functions, as in your example, are bad style as
far as I'm concerned.

>Or, here's an even simpler example:
>
> #ifndef BUFSIZE
> #define BUFSIZE 1024
> #endif
>
> static char buffer[BUFSIZE];

enum { BUFSIZE = 1024 }; /* in C++: static const int BUFSIZE = 1024; */

At the very least you should use just:

#define BUFSIZE 1024

so you get an error/warning if BUFSIZE is defined earlier.

>> However, I don't think this will happen to C in the foreseeable
>> future.
>
>Hopefully it won't ever happen.

While it's obviously not at this point yet, I think making the C
preprocessor completely redundant is a worthy goal.

>I've found that most proponents of eliminating the preprocessor
>have never experienced porting a million lines of code to several
>systems. I humbly ask them not to break the legs of those of us who
>have.

I've had the experience of maintaining code that's been ported to a wider
range of systems than most (from MS-DOS to a signed-magnitude machine),
and I've found that the more preprocessor trickery there is, the more
difficult and error-prone the job is.

Ross Ridge

--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rri...@csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/u/rridge/
db //

Helmut Leitner

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to

Peter Curran wrote:
>
> On Mon, 11 Oct 1999 16:22:01 -0500, David R Tribble


> <da...@tribble.com> wrote:
> >And little things like conditional compilation, for things like
> >writing code that is ported to many systems. Tell me, please, how
> >to write the following without preprocessing directives?:
> >
> > #if OS_UNIX
> > ...code for Unix, including extremely system-specific
> > ...system calls like gettimeofday()
> > #elif OS_WIN32
> > ...ditto, for GetSystemTimeAsFileTime()
> > #elif OS_MACOS
> > ...
> > #else
> > #error Unsupported O/S
> > #endif

> <snip>
>
> if (OS_UNIX) {
> ...
> }
> else if (OS_WIN32) {
> ...
> }
> etc.
>
> As I said in my posting, the non-preprocessor functionality is not
> fully equivalent to the preprocessor functionality. One obvious
> difference is that if statements aren't allowed outside of functions.
> Such details are not insurmountable problems.

How will the code e.g.
if(OS_UNIX) {
... fcntl-type file locking
} else if(OS_WIN32) {
... Windows file locking
}
*link* ???

--
Helmut Leitner lei...@hls.via.at
Graz, Austria www.hls-software.com

Zack Weinberg

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
In article <3802E29B...@chello.at>,
Helmut Leitner <helmut....@chello.at> wrote:

>
>
>Peter Curran wrote:
>>
>> if (OS_UNIX) {
>> ...
>> }
>> else if (OS_WIN32) {
>> ...
>> }
>> etc.
>>
>> As I said in my posting, the non-preprocessor functionality is not
>> fully equivalent to the preprocessor functionality. One obvious
>> difference is that if statements aren't allowed outside of functions.
>> Such details are not insurmountable problems.
>
>How will the code e.g.
> if(OS_UNIX) {
> ... fcntl-type file locking
> } else if(OS_WIN32) {
> ... Windows file locking
> }
>*link* ???

That's not the problem - assuming that OS_* are compile time constants, any
decent compiler ought to strip out the code for the other platform before
the linker ever sees it.

A harder problem is what to do about code like this:

while ((e = readdir(d)) != NULL) {
#ifdef DT_UNKNOWN
if (e->d_type != DT_UNKNOWN)
mode = DTTOIF(e->d_type);
else
#endif
if (lstat(e->d_name, sb)) {
perror(e->d_name);
continue;
} else
mode = (sb->st_mode & S_IFMT);
/* do stuff with mode here ... */
}

This is a snippet out of a file-tree-walking program I use. On platforms
that do not define DT_UNKNOWN, the structure pointed to by `e' does not have
a member called `d_type'. If the compiler proper sees that chunk of code on
a platform that doesn't have d_type, it will reject the program.

I'd certainly like to be able to rewrite that chunk like this:

if (exists(DT_UNKNOWN) && e->d_type != DT_UNKNOWN)
mode = DTTOIF(e->d_type);
else if (lstat(e->d_name, sb)) {
/* et cetera */

where exists() interrogates the symbol table - but for that to work, the
compiler would have to ignore the second half of the conditional and the
statements it controls if exists() evaluated false. More accurately, it'd
have to do syntactic analysis, but not semantic.

It can be done, but the precise behavior has to be nailed down carefully, or
it'd be worse than useless.

zw


Peter Curran

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
On Mon, 11 Oct 1999 21:56:53 -0400, "James Kuyper Jr."
<kuy...@wizard.net> wrote:

>> As I said in my posting, the non-preprocessor functionality is not
>> fully equivalent to the preprocessor functionality. One obvious
>> difference is that if statements aren't allowed outside of functions.
>> Such details are not insurmountable problems.
>

>Well other details are essentially insurmountable. Without a
>preprocessor, or equivalent functionality, how do you place code inside
>the 'if' that would be undefined behavior which terminates compilation
>on one compiler, and a permitted extension on another?

You wouldn't. A well-standardized language would not have either
undefined behaviour or (uncompilable) permitted extensions. A
detailed definition along the lines of what I was proposing is,
however, feasible even in C. There is, IMHO, no point in pursuing
this any further. The original posting was blue-skying, and so was my
response. I repeat: this is not doable in C in its current (over?)
mature state. However, an advance such as this would, IMHO,
significantly improve the language, especially w.r.t. portability.

Peter Curran

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
On Tue, 12 Oct 1999 06:13:46 GMT, Helmut Leitner
<helmut....@chello.at> wrote:


<snip>


>How will the code e.g.
> if(OS_UNIX) {
> ... fcntl-type file locking
> } else if(OS_WIN32) {
> ... Windows file locking
> }
>*link* ???

As I said in my original posting, I assume the existence of an
optimizer that eliminates dead code. Assuming (as the original question
appeared to) that the various OS_xxxx values are constants, only one
branch of the if is live; all other branches are dead and discarded.

IOW, this code would do exactly what the preprocessor does, except
that the conditions expressed in the if statements would be fully
compatible with those used at run time.

Mark Williams

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to

David R Tribble wrote:

> Mark Williams wrote:
> >
> > Douglas A. Gwyn wrote:
> > >
> > > Nick Maclaren wrote:
> > >> Look, those things were designed for descriptive convenience and
> > >> aren't the Seven Commandments!
> > >
> > > That preprocessing is performed first *is* a Commandment.
> >
> > But its not a commandment that preprocessing (for the entire file)
> > finish before interpreting any of the tokens that were produced by it.
>
> But the standard allows either kind of behavior. Or to put it
> another way, it doesn't *require* an implementation to be able to
> process tokens before preprocessing is complete.

Or going back several levels in the discussion... it _would_ be possible for
the standard to require that the implementation be able to process tokens at
specific points during compilation, which would allow e.g. preprocessing of
sizeof or __is_type_name(x) or even __is_local_variable(x) etc.
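A sketch of what such a cross-phase query might look like if it
existed (entirely hypothetical syntax; no conforming implementation
provides it, and today an undefined identifier in #if simply becomes 0):

```
#if __is_type_name(Byte)
    /* Byte already names a type in scope; skip the typedef */
#else
    typedef unsigned char Byte;
#endif
```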

Mark Williams


Nick Maclaren

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to

In article <38033BA2...@my-deja.com>, Mark Williams <mar...@my-deja.com> writes:
|>
|> Or going back several levels in the discussion... it _would_ be possible for
|> the standard to require that the implementation be able to process tokens at
|> specific points during compilation, which would allow eg preprocessing of
|> sizeof or __is_type_name(x) or even __is_local_variable(x) etc.

Yes, it would - or, rather, it would HAVE been. Up until C99,
C was a one-pass language, in the usual rather misleading sense.
Defining the meaning of such cross-phase functionality would be a
right pain, but would have at least been theoretically possible.

However, C99 isn't a one-pass language, and has just been accepted.
Of course, there is only one aspect that isn't one-pass, and it
is a moderately perverse one, so it COULD be changed incompatibly.
There is now lashings of precedent for this.

Douglas A. Gwyn (IST/CNS)

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Helmut Leitner wrote:
> How will the code e.g.
> if(OS_UNIX) {
> ... fcntl-type file locking
> } else if(OS_WIN32) {
> ... Windows file locking
> }
> *link* ???

Presumably, proponents of such an approach would *require*
dead-code elimination.

A bigger problem is that they assume all API declarations
are available in all environments.

Nick Maclaren

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to

In article <38033F20...@arl.mil>, "Douglas A. Gwyn (IST/CNS)" <gw...@arl.mil> writes:
|> Helmut Leitner wrote:
|> > How will the code e.g.
|> > if(OS_UNIX) {
|> > ... fcntl-type file locking
|> > } else if(OS_WIN32) {
|> > ... Windows file locking
|> > }
|> > *link* ???
|>
|> Presumably, proponents of such an approach would *require*
|> dead-code elimination.

Yes.

|> A bigger problem is that they assume all API declarations
|> are available in all environments.

Actually, no, they don't. All that they require is that the API
definitions do not affect the parsing. This approach has been taken
very successfully in many environments, but fits abominably with the
C language. If I were going there, I wouldn't start from here.

David R Tribble

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Peter Curran wrote:
>
> Helmut Leitner <helmut....@chello.at> wrote:
> <snip>
> > How will the code e.g.
> > if(OS_UNIX) {
> > ... fcntl-type file locking
> > } else if(OS_WIN32) {
> > ... Windows file locking
> > }
> > *link* ???
>
> As I said in my original posting, I assume the existence of an
> optimizer that dead code. Assuming (as the original question appeared
> to) that the various OS_xxxx values are constants, only one branch of
> the if is live; all other branches are dead and discarded.

That's not enough. The contents of the enclosed statement blocks
must still contain valid code.

if (OS_UNIX)
{
t = gettimeofday(a, b); // defined in <time.h>
...
}
else if (OS_WIN32)
{
t = GetSystemTimeAsFileTime(); // defined in <winbase.h>
...
}

Try compiling this on a system where OS_UNIX==1, and you'll get an
undefined GetSystemTimeAsFileTime function. It doesn't matter that
OS_WIN32==0; the code within that block must still be valid code.

'if' is not a replacement for '#if'.

> IOW, this code would do exactly what the preprocessor does, except
> that the conditions expressed in the if statements would be fully
> compatible with those used at run time.

Except that '#if' allows the compiler to completely ignore whole
blocks of code; 'if' doesn't.

David R Tribble

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Mark Williams wrote:
>
> David R Tribble wrote:
>
>> Mark Williams wrote:
>> >
>> > Douglas A. Gwyn wrote:
>> > >
>> > > Nick Maclaren wrote:
>> > >> Look, those things were designed for descriptive convenience and
>> > >> aren't the Seven Commandments!
>> > >
>> > > That preprocessing is performed first *is* a Commandment.
>> >
>> > But its not a commandment that preprocessing (for the entire file)
>> > finish before interpreting any of the tokens that were produced by
>> > it.
>>
>> But the standard allows either kind of behavior. Or to put it
>> another way, it doesn't *require* an implementation to be able to
>> process tokens before preprocessing is complete.

>
> Or going back several levels in the discussion... it _would_ be
> possible for the standard to require that the implementation be able
> to process tokens at specific points during compilation, which would
> allow eg preprocessing of sizeof or __is_type_name(x) or even
> __is_local_variable(x) etc.

Yes, but that would suddenly make compilers that implement the
preprocessor as a truly separate phase non-conforming.

It would also unduly burden the compiler, because the preprocessor
token stream would then be very closely tied to, and synchronized
with, the syntactic/semantic parsing stream.

And, finally, it is something that many of us simply do not want
or need.

David R Tribble

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Ross Ridge wrote:
>
> David R Tribble <da...@tribble.com> wrote:
> >> However, I don't think this will happen to C in the foreseeable
> >> future.
> >
> >Hopefully it won't ever happen.
>
> While it's obviously not at this point yet, I think making the C
> preprocessor completely redundant is a worthy goal.

Perhaps. There are still a few things that could be done to C to
get us closer to that goal. For instance, we could add true
compile-time constants like C++ has, so we wouldn't be stuck with
only enums.

But it is sheer hubris on the part of the language designer to assume
that the language is sufficiently expressive to meet all needs.
(Look at Java: no goto, no preprocessor, etc., and yet sometimes
(rarely) they are still sorely needed.)

But as long as "porting features" like the code below exist out there
in the real world, I really don't ever foresee the day that the
preprocessor will be removed:

#if defined(_WIN32) && defined(_DLL)
#define IMPORT __declspec(dllimport)
#define CFUNC __cdecl
#else
#define IMPORT
#define CFUNC
#endif

extern IMPORT int CFUNC my_read(int f, char *buf);
extern IMPORT int CFUNC my_write(int f, const char *buf);

Helmut Leitner

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to

Peter Curran wrote:
>
> On Tue, 12 Oct 1999 06:13:46 GMT, Helmut Leitner


> <helmut....@chello.at> wrote:
>
> <snip>
> >How will the code e.g.
> > if(OS_UNIX) {
> > ... fcntl-type file locking
> > } else if(OS_WIN32) {
> > ... Windows file locking
> > }
> >*link* ???
>
> As I said in my original posting, I assume the existence of an
> optimizer that eliminates dead code.

Sorry, I missed these few words in your posting.

> Assuming (as the original question appeared
> to) that the various OS_xxxx values are constants, only one branch of
> the if is live; all other branches are dead and discarded.

Although your argument is clear and clean, I would hate for the language
to change this way. Too often I have had to turn off optimizations for
one reason or another... :-(



> IOW, this code would do exactly what the preprocessor does, except
> that the conditions expressed in the if statements would be fully
> compatible with those used at run time.

The evil things we may do with the preprocessor will not become
less evil if we use other means to accomplish them.
So I cannot share your dislike of the preprocessor. I would rather
agree with James Kuyper Jr. that the preprocessor should be enhanced.

Dennis Yelle

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Nick Maclaren wrote:
>
> In article <38033BA2...@my-deja.com>, Mark Williams <mar...@my-deja.com> writes:
> |>
> |> Or going back several levels in the discussion... it _would_ be possible for
> |> the standard to require that the implementation be able to process tokens at
> |> specific points during compilation, which would allow eg preprocessing of
> |> sizeof or __is_type_name(x) or even __is_local_variable(x) etc.
>
> Yes, it would - or, rather, it would HAVE been. Up until C99,
> C was a one-pass language, in the usual rather misleading sense.
> Defining the meaning of such cross-phase functionality would be a
> right pain, but would have at least been theoretically possible.
>
> However, C99 isn't a one-pass language, and has just been accepted.
> Of course, there is only one aspect that isn't one-pass, and it
> is a moderately perverse one,

OK, Nick, I give up, what is this "one aspect" that you are
talking about here?

> so it COULD be changed incompatibly.
> There is now lashings of precedent for this.

Dennis Yelle

Dennis Yelle

unread,
Oct 12, 1999, 3:00:00 AM10/12/99
to
Previously,
Bill A wrote:
> #ifndef Byte
> typedef unsigned char Byte;
> #endif

"Douglas A. Gwyn" wrote:
> Can't do anything like that, because it doesn't
> fit the model for phases of translation.

Dennis Yelle wrote:


> The standard's current model for phases of translation is silly.

"Douglas A. Gwyn" wrote:
> It's easy to carp from the sidelines -- so what's *your*
> suggestion for the *proper* phases of translation?
> If they aren't close to the standard ones, then you're
> not describing the C programming language.

The phases themselves could be changed, but don't
need to be changed significantly to address this.
All one needs to do is to recognize that any C compiler
written in the last 20 years does NOT run the entire
program thru each phase n before it starts running phase n+1.
More likely, it runs each line thru phases 1..n before reading
the next line, except where it becomes more convenient to
process multiple lines, like when a line ends in backslash-newline.

It would be quite natural to say that definitions become
complete at their trailing ';' or '}', and that
those definitions become known to earlier phases
no later than at the next following unescaped newline.

Of course the syntax proposed above is not workable, but
something like this:

#if ! is_typedef(Byte)
typedef unsigned char Byte;
#endif

would not cause any significant problems.

On the other hand, this:

typedef unsigned char Byte;
typedef unsigned char Byte;

should not cause any problems either.

Dennis Yelle

--
Want to get paid for using the internet?
If so, go to: http://alladvantage.com/go.asp?refid=BAL536
or: http://www.sendmoreinfo.com/SubMakeCookie.cfm?ExtractId=87656

Bill A

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
David R Tribble wrote in message <3803720C...@tribble.com>...

>Yes, but that would suddenly make compilers that implement the
>preprocessor as a truly separate phase non-conforming.

And aren't they non-conforming already for all of the other changes made
thus far to the standard? Many of the changes aren't simply parser
changes for type synonyms. VLAs could require more work than forcing a
compiler to combine preprocessing and lexical analysis. Existing programs
that don't make use of any new preprocessing functionality won't break, and
existing compilers are no more broken by saying #if sizeof(int)==2 than
by VLAs. The standard can't stay compatible with existing compilers
anyway, since the mere fact of its changing requires compiler changes.

>It would also unduly burden the compiler by the fact that the
>preprocessor token stream wuold then be very closely tied to, and
>synchronized with, the syntactic/semantic parsing stream.

I don't see the difficulty in this.

>And, finally, it is something that many of us simply do not want
>or need.

Many of us do, and those that don't use the new preprocessor features are no
worse off.

Bill


Clive D.W. Feather

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <3803215c....@news1.on.sympatico.ca>, Peter Curran
<pcu...@acm.gov> writes

>IOW, this code would do exactly what the preprocessor does, except
>that the conditions expressed in the if statements would be fully
>compatible with those used at run time.

And, presumably, would have the same scope rules as the rest of the
language, rather than the all-but-scopeless preprocessor.

--
Clive D.W. Feather | Internet Expert | Work: <cl...@demon.net>
Tel: +44 20 8371 1138 | Demon Internet Ltd. | Home: <cl...@davros.org>
Fax: +44 20 8371 1037 | | Web: <http://www.davros.org>
Written on my laptop; please observe the Reply-To address

Clive D.W. Feather

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <nkwM3.405$IZ5....@news.rdc1.ct.home.com>, Bill A
<bi...@megahits.com> writes

>I agree totally - it's indispensable. We cannot eliminate it. But there are
>shortcomings and reasonable changes that have already been mentioned
>hundreds of times, and the preprocessor is being left as-is while the rest of
>the language is allowed to grow with the times.

Changes *have* been made to the preprocessor, both in C89 (v K&R) and
C99 (v C89). So why do you say this ?

Clive D.W. Feather

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <380366E4...@jps.net>, Dennis Yelle <denn...@jps.net>
writes

>> However, C99 isn't a one-pass language, and has just been accepted.
>> Of course, there is only one aspect that isn't one-pass, and it
>> is a moderately perverse one,
>
>OK, Nick, I give up, what is this "one aspect" that you are
>talking about here?

The interaction between inline and extern. It is no longer possible to
look at the definition of a function and say whether that definition is
visible outside the current translation unit, since its status is
affected by prototypes later in the TU.

Clive D.W. Feather

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <TrRM3.534$IZ5....@news.rdc1.ct.home.com>, Bill A
<bi...@megahits.com> writes

>>Yes, but that would suddenly make compilers that implement the
>>preprocessor as a truly separate phase non-conforming.
>
>And aren't they non-conforming already for all of the other changes made
>thus far to the stanard? Many of the changes aren't simply just parser
>changes for synonyms for types.

I can't think of anything new in C99 that means that you can't have a
separate preprocessor. What did you have in mind ?

Clive D.W. Feather

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <v24M3.249$IZ5....@news.rdc1.ct.home.com>, Bill A
<bi...@megahits.com> writes
>Were these translation phases designed as a direct consequence of the
>computing power and resources the systems had at the time (20 years ago).
[...]
>Was preprocessing separated from scanning and lexical analysis because of
>the times? I.e. there wasn't enough program memory to combine these.

This was probably the motivation behind the preprocessor/translator
split in early compilers, yes.

However, the reason the Standard splits the language into phases is to
make it clear how various features interact. There is no expectation
that, for example, there will really be a separate pass over the code
just to implement phase 6 (merging of string literals). Rather, writing
the Standard in this way allows us to omit reams of text along the lines
of "except that where one of the identifiers is #defined as a macro,
then the following effects apply after all the replacement has taken
place".

Clive D.W. Feather

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <3800C56F...@jps.net>, Dennis Yelle <denn...@jps.net>
writes

>The standard's current model for phases of translation is silly.
>It reminds me of something one might have done back in the 1960's
>if one was trying to compile a language similar to C on a machine
>with only about 64K bytes of main memory.

You are welcome to propose alternative wording for the Standard that
produces the same language. I'd be surprised if you end up with
something that is half as readable without having the phase concept in
it.

Remember, phase is not the same as pass or process.

Nick Maclaren

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <TrRM3.534$IZ5....@news.rdc1.ct.home.com>,

Bill A <bi...@megahits.com> wrote:
>David R Tribble wrote in message <3803720C...@tribble.com>...
>>Yes, but that would suddenly make compilers that implement the
>>preprocessor as a truly separate phase non-conforming.
>
>And aren't they non-conforming already for all of the other changes made
>thus far to the standard? ...

Not for that reason. But you are correct that adding most of the
requested preprocessor enhancements would be no more incompatible
than any of the other changes now accepted.

>>It would also unduly burden the compiler by the fact that the
>>preprocessor token stream wuold then be very closely tied to, and
>>synchronized with, the syntactic/semantic parsing stream.
>
>I don't see the difficulty in this.

There isn't one, unless you are trying to hack an old two-process
implementation to support the new facility.

>>And, finally, it is something that many of us simply do not want
>>or need.
>
>Many of us do, and those that don't use the new preprocessor features are no
>worse off.

That is true. There are several compile-time constant aspects of
the language that would be immensely useful to be able to test during
preprocessing. But, except for a few built-in facilities (mainly
sizeof and perhaps offsetof), C is not a good place to start when
adding such features.

Chris Torek

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
>Peter Curran wrote:
>> As I said in my original posting, I assume the existence of an
>> optimizer that [removes] dead code. ...

In article <3803714A...@tribble.com> David R Tribble


<da...@tribble.com> writes:
>That's not enough. The contents of the enclosed statement blocks
>must still contain valid code.

[compressed vertically:]


> if (OS_UNIX) {
> t = gettimeofday(a, b); // defined in <time.h>
> ...
> } else if (OS_WIN32) {
> t = GetSystemTimeAsFileTime(); // defined in <winbase.h>
> ...
> }

>Try compiling this on a system where OS_UNIX==1, and you'll get an

>undefined GetSystemTimeAsFileTime function. ...

This is not a hard problem, and in fact many (most?) current compilers
omit both the "dead code" and the linker references at the same time.
(Perhaps you mean "you will get a warning about an unprototyped
function", which of course depends on QoI, at least in C89.)

The hard problem in a language like C is really something like this:

if (OS_UNIX) {
struct stat st;
struct dirent *dp;
DIR *dirp;
...
} else if (OS_MSDOS) {
struct ffblk blk;
...
}

where "struct stat" or "struct ffblk" do not exist; or even worse:

if (VMS) {
SYS$ugly_datatype x;
SYS$foo(&3, &x);
}

where the bit inside the test is not even syntactically valid (the
"SYS$" and "&3" are both actual "VMS-C" examples).

One can simply rule this sort of thing out entirely, and/or require
programmers to decompose things into OS-specific modules to get
them out of the way. (One can also argue that this is superior to
#if in the first place! :-) ) Still, from where C is today, there
seems to be no way to reach a preprocessor-free utopia (or is that
a "nirvana"?). :-)
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.

Douglas A. Gwyn (IST/CNS)

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
"Clive D.W. Feather" wrote:
> The interaction between inline and extern.

I thought we had decided that an implementation could simply
ignore all occurrences of "inline" ?

Nick Maclaren

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to

It has to diagnose the errors implied by the constraints, for a
start.

IF it is going to ignore the inlining hint completely, THEN that
is all it has to do. But, if it takes ANY notice of that hint,
then it has to parse the whole translation unit before starting
to generate code.

David R Tribble

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
Clive D.W. Feather wrote:
>
> Bill A <bi...@megahits.com> writes
>> Were these translation phases designed as a direct consequence of
>> the computing power and resources the systems had at the time (20
>> years ago).
>> [...]
>> Was preprocessing separated from scanning and lexical analysis
>> because of the times? I.e. there wasn't enough program memory to
>> combine these.
>
> This was probably the motivation behind the preprocessor/translator
> split in early compilers, yes.
>
> However, the reason the Standard splits the language into phases is to
> make it clear how various features interact. There is no expectation
> that, for example, there will really be a separate pass over the code
> just to implement phase 6 (merging of string literals). Rather,
> writing the Standard in this way allows us to omit reams of text
> along the lines of "except that where one of the identifiers is
> #defined as a macro, then the following effects apply after all the
> replacement has taken place".

It also clears up huge areas of ambiguities (e.g., what happens when
a backslash-newline splits a preprocessor directive across multiple
lines). It makes the actual writing of lexers, preprocessors, and
parsers for C a more well-defined enterprise. It also allows
programmers to write portable code confidently, knowing that it is
portable.

As I've stated before, it could have been a lot worse, for both
programmers and compiler vendors, by leaving the phases of
translations out of the standard. (Cf. COBOL lexicon for a lesson
in how *not* to do it.)

Dennis Ritchie

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
David R Tribble asked:

> And little things like conditional compilation, for things like
> writing code that is ported to many systems. Tell me, please, how
> to write the following without preprocessing directives?:
>
> #if OS_UNIX
> ...code for Unix, including extremely system-specific
> ...system calls like gettimeofday()
> #elif OS_WIN32
> ...ditto, for GetSystemTimeAsFileTime()
> #elif OS_MACOS
> ...
> #else
> #error Unsupported O/S
> #endif

Surely the better way to handle this is to abstract the interfaces
you need, and put the implementation code in unix.c,
win32.c etc. Then load your portable, #ifdef-free program
with the appropriate library.

David R Tribble

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to

How do you know which library to load? Perhaps we can simply use
MAKE macros to build the appropriate object module for us -
but that's still a use of macros, right? ;-)

Seriously, we do all of the above; but when one implementation differs
from another by only a few lines of code, it becomes a maintenance
headache to split the file into completely separate source files.
If nothing else, adding a new function later on requires remembering
to edit more than one file, and someone always forgets.

On a personal note, how often do you rely on the C preprocessor in
your code?

Ben Pfaff

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
Keith Thompson <k...@cts.com> writes:

> What C compiler meaningfully supported sizeof in preprocessor
> directives? [...]

Turbo C++ 3.0 for DOS.
--
"Let others praise ancient times; I am glad I was born in these."
--Ovid (43 BC-18 AD)

Keith Thompson

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
"Bill A" <bi...@megahits.com> writes:
> Clive D.W. Feather wrote in message ...

> >I can't think of anything new in C99 that means that you can't have a
> >separate preprocessor. What did you have in mind ?
>
> To allow the preprocessor to use information contained in the TU up to the
> point of each directive. sizeof is the best example, yet the standard
> for the preprocessor was changed (or was enforced after many compilers
> had allowed the behavior - which came first, the chicken or the egg?), and
> this caused a lot of programs that used sizeof in #if's to fail to compile.

Here's a response Dennis Ritchie posted to comp.std.c on 1998-05-08
(excavated from deja.com):
> > You are right. It was nice back in the days when things like
> >
> > #if (sizeof(int) == 8)
> >
> > actually worked (on some compilers).
>
> Must have been before my time.
>
> Dennis

(I had that article taped to my office door for a while.)

What C compiler meaningfully supported sizeof in preprocessor

directives? Is it possible that some preprocessors simply treated
sizeof as an unrecognized identifier, but failed to complain about it?

--
Keith Thompson (The_Other_Keith) k...@cts.com <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
"Oh my gosh! You are SO ahead of your time!" -- anon.

Ben Pfaff

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
Ben Pfaff <pfaf...@msu.edu> writes:

> Keith Thompson <k...@cts.com> writes:
>
> > What C compiler meaningfully supported sizeof in preprocessor

> > directives? [...]
>
> Turbo C++ 3.0 for DOS.

I should qualify that I'm referring to the C compiler from TC++3.
Also, this is from memory. I do not have the manuals in front of me.

Keith Thompson

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
"Bill A" <bi...@megahits.com> writes:
[...]
> Sorry, I was recalling "The C Programming Language 2nd Edition" by Brian W.
> Kernighan and Dennis M Ritchie (Prentice Hall) where preprocessing is
> covered in Appendix A.

The entire language is covered in Appendix A; that's the reference
manual. The rest of the book is more of a tutorial; the preprocessor
is covered in section 4.11, starting on page 88.

Jerry Coffin

unread,
Oct 13, 1999, 3:00:00 AM10/13/99
to
In article <yec3dve...@king.cts.com>, k...@cts.com says...

[ ... ]

> What C compiler meaningfully supported sizeof in preprocessor

> directives? Is it possible that some preprocessors simply treated
> sizeof as an unrecognized identifier, but failed to complain about it?

Looking through ancient copies of _The Programmer's Journal_ for Rex
Jaeschke's testing of compilers at the time, it looks like Mix Power C
1.2 and Manx Aztec C 4.2 both did. I'm not sure how much it matters,
since I've heard little about either in, say, the last decade or so...

--
Later,
Jerry.

The universe is a figment of its own imagination.

Bill A

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to
Clive D.W. Feather wrote in message ...
>I can't think of anything new in C99 that means that you can't have a
>separate preprocessor. What did you have in mind ?

To allow the preprocessor to use information contained in the TU up to the
point of each directive. sizeof is the best example, yet the standard
for the preprocessor was changed (or was enforced after many compilers
had allowed the behavior - which came first, the chicken or the egg?), and
this caused a lot of programs that used sizeof in #if's to fail to compile.

And to be able to make tests. My #ifdef Byte was not as good as the #if
is_typedef(Byte) given by James Kuyper, but ideas along these lines would
be nice. Or little things like:

#ifdef ...
#elifdef ...
#endif

The preprocessor can do some awesome things. I've seen it used to handle
binary constants (e.g. B(11001010)) and I've seen a macro expand to more
than 10k of C code. But when you need to check that a struct remains a
static size, you can't easily do it.

Bill


Douglas A. Gwyn

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to
> > #if OS_UNIX
> > ...code for Unix, including extremely system-specific
> > ...system calls like gettimeofday()
> > #elif OS_WIN32
> > ...ditto, for GetSystemTimeAsFileTime()
> > #elif OS_MACOS
> > ...
> > #else
> > #error Unsupported O/S
> > #endif

Dennis Ritchie wrote:
> Surely the better way to handle this is to abstract the interfaces
> you need, and put the implementation code in unix.c,
> win32.c etc. Then load your portable, #ifdef-free program
> with the appropriate library.

Sure, it is better, if you have the luxury of designing the code
that way. Sometimes, however, a ton of code already exists and
a simple "Band-Aid"(tm) will get it to work in a slightly
changed environment. For example, one might have a signal
handler and want it to catch a lot of known possible signals,
but not *all* (such as SIGCLD, which has horrible semantics),
so a simple loop over signal numbers doesn't work, leaving
#ifdef SIGUSR2
signal(SIGUSR2, my_handler);
#endif
#ifdef SIGPIPE
signal(SIGPIPE, my_handler);
#endif
...
Or a loop through a table of signal numbers, where some of the
table entries have to be conditionalized like that.

C preprocessing, or macro processing in general, provides a form
of functionality that "pure" language semantics don't, at least
not for C. For one thing, function-like macros have "call by
name" argument semantics, for which there is no equivalent in
the language proper. And stringizing and token pasting have
their uses, especially in large projects. My main regret is
that C's macro processing is not more general; not infrequently
I find that I need to do something (such as the "counter" that
somebody recently asked about, in comp.lang.c.moderated I think)
for which C does not provide tools for a good solution. In the
huge MUVES project some ten years ago, we found occasion to use
"awk" to build C source code (tables of functions, etc.), and
occasionally a little C program to output the required C source.
One has the feeling that in a "perfect" programming language
(which doesn't yet exist) there would always be a good way for
the programmer to specify what he wants, without going outside
the language.

Bill A

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to
Clive D.W. Feather wrote in message <2rlPZ6Hd...@romana.davros.org>...

>Changes *have* been made to the preprocessor, both in C89 (v K&R) and
>C99 (v C89). So why do you say this ?

Sorry, I was too strong. Okay, you're right, there have been changes, but
the preprocessor hasn't really kept up with the times, changes, and needs
the way the C language proper has. And it's still viewed as a separate
process, when I think it should be described as being more integrated with
the language. In earlier standards it was even relegated to an appendix,
or so I recall.

Bill


Douglas A. Gwyn

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to
Bill A wrote:
> Sorry, I was too strong. Okay, you're right, there have been
> changes, but it hasn't really kept up with the times, changes,
> and needs as the C language proper has. And that it's viewed
> as a separate process when I think it should be described as
> being more integrated with the language. In earlier standards,
> it was even delegated to an Appendix, or so I recall.

Well, you recall incorrectly.

As to the rest, those are your opinions, unsubstantiated by
evidence or examples. Certainly the original "modern" C
preprocessor, by John Reiser, was separately invokable (and
in fact would be invoked only when the C source file began
with a #), but even fifteen years ago many production-quality
C compilers had already integrated preprocessing into the
tokenizer module of the compiler, and the C89 standard in
effect further encouraged that style of implementation.

Dennis Ritchie

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to
David R Tribble wrote, re my suggestion to use implementation-dependent
files vs. interspersing #ifdefs:

> How do you know which library to load? Perhaps we can simply use
> MAKE macros to build the appropriate object module for us -
> but that's still a use of macros, right? ;-)

Yes; but it more cleanly separates the portable part.

> Seriously, we do all of the above; but when one implementation differs
> from another by only a few lines of code, it becomes a maintenance
> headache to split the file into completely separate source files.
> If nothing else, adding a new function later on requires remembering
> to edit more than one file, and someone always forgets.

But if the #ifdefs might occur in any of the source files
they are even harder to find and remember to touch up,
as well as making the source harder to understand.


>
> On a personal note, how often do you rely on the C preprocessor in
> your code?

I looked at /sys/src and found a megaline of code with a bit
above 800 #ifdef occurrences; two particular imported programs
accounted for most of them (down to 200).

Of course little of this is mine own. I use the preprocessor
for simple #define and #include.

Gwyn observed later in the thread

> Sure, it is better, if you have the luxury of designing the code

> that way. Sometimes, however, a ton of code already exists ...

True enough. But now and again new programs are created, and my
argument is just that it is better to put the necessarily non-portable
parts in their separate worlds than strew them mercilessly throughout.

Dennis

Bill A

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to
Douglas A. Gwyn wrote in message <38052D38...@null.net>...

>> In earlier standards, it was even delegated to an Appendix, or so I
recall.
>
>Well, you recall incorrectly.

Sorry, I was recalling "The C Programming Language 2nd Edition" by Brian W.


Kernighan and Dennis M Ritchie (Prentice Hall) where preprocessing is
covered in Appendix A.

>As to the rest, those are your opinions, unsubstantiated by
>evidence or examples.

You give more examples in a later post of exactly what I'm talking about.
Counting using the preprocessor is a good one. #repeat is another; #input,
to prompt for user input, yet another. Don't confuse me with those people
talking about wanting to get rid of the preprocessor - I'd like it to do a
whole lot more. Sounds like you do, too. I'd use these features many more
times than VLAs.

Bill


Helmut Leitner

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to

Ben Pfaff wrote:


>
> Keith Thompson <k...@cts.com> writes:
>
> > What C compiler meaningfully supported sizeof in preprocessor

> > directives? [...]
>
> Turbo C++ 3.0 for DOS.

and Borland C/C++ 4.0, 4.5, 5.0x
(for DOS, DOS+DPMI, Win16, Win32)

--
Helmut Leitner lei...@hls.via.at
Graz, Austria www.hls-software.com

Helmut Leitner

unread,
Oct 14, 1999, 3:00:00 AM10/14/99
to

Dennis Ritchie wrote:
> True enough. But now and again new programs are created, and my
> argument is just that it is better to put the necessarily non-portable
> parts in their separate worlds than strew them mercilessly throughout.

On the other hand this style leads to source code redundancy,
which reduces maintainability. There are also situations
where a simple DOS/WINDOWS/UNIX/ABC separation may need to be further
split, e.g. by CONSOLE/TEXTSCREEN/GUI ifdefs. Putting every
useful combination in a separate library would be a real pain.
