Computing sizeof() during compilation


Dann Corbit

May 7, 1998, 3:00:00 AM

Since C does not have run time binding, why can't the sizeof() operator be
computed at the preprocessor phase of compile time? That would make things
so much nicer, so that we could have constructs like:
#if (sizeof(int) == 8)
or whatever.
--
Hypertext C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
C-FAQ ftp: ftp://rtfm.mit.edu, C-FAQ Book: ISBN 0-201-84519-9
Try "C Programming: A Modern Approach" ISBN 0-393-96945-2
Want Software? Algorithms? Pubs? http://www.infoseek.com

Dennis Yelle

May 7, 1998, 3:00:00 AM

In article <6it2mh$dhq$1...@client3.news.psi.net> "Dann Corbit" <dco...@solutionsiq.com> writes:
>Since C does not have run time binding, why can't the sizeof() operator be
>computed at the preprocessor phase of compile time. That would make things
>so much nicer, so that we could have constructs like:
>#if (sizeof(int) == 8)
>or whatever.

You are right. It was nice back in the days when things like

#if (sizeof(int) == 8)

actually worked (on some compilers).

Dennis Yelle

--
den...@netcom.com (Dennis Yelle)
"You must do the thing you think you cannot do." -- Eleanor Roosevelt

Dave Hansen

May 7, 1998, 3:00:00 AM

On Thu, 7 May 1998 19:50:01 GMT, den...@netcom.com (Dennis Yelle)
wrote:

>In article <6it2mh$dhq$1...@client3.news.psi.net> "Dann Corbit" <dco...@solutionsiq.com> writes:
>>Since C does not have run time binding, why can't the sizeof() operator be
>>computed at the preprocessor phase of compile time. That would make things
>>so much nicer, so that we could have constructs like:
>>#if (sizeof(int) == 8)
>>or whatever.
>
>You are right. It was nice back in the days when things like
>
>#if (sizeof(int) == 8)
>
>actually worked (on some compilers).

I agree it would be nice.

However, I suspect the reason it's not allowed is the complexity it
adds to the preprocessor (which might be a separate executable). Not
only would cpp have to know how to parse structures and unions, but it
would have to know about padding, which might change depending on
compiler switches, etc. Consider:

union x {
    struct y {
        char yc;
        int yi;
    } xy;

    struct z {
        int xi;
        char xc;
    } xz;
} ux;

#if (sizeof(struct y) == sizeof(ux.xz))
/* Hopefully the preprocessor and the compiler made the same
   assumptions about padding, or this isn't going to come
   out right! */
#endif

Not that compiler writers aren't clever enough to figure out how to
make this all work...
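
To make the padding concern concrete, here is a minimal sketch (not part of the original post) that lets the compiler proper report how it laid out the two structures. The function names are hypothetical, and the results depend on the implementation and on any packing switches in effect.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* The two layouts from the example above, declared at file scope. */
struct y { char yc; int yi; };
struct z { int xi; char xc; };

/* Bytes the compiler inserted between yc and yi (alignment padding). */
size_t gap_before_yi(void)
{
    return offsetof(struct y, yi) - sizeof(char);
}

/* Bytes of struct z that belong to no member (typically trailing padding). */
size_t padding_in_z(void)
{
    return sizeof(struct z) - (sizeof(int) + sizeof(char));
}

/* Nonzero if the two layouts end up the same size on this implementation. */
int y_and_z_same_size(void)
{
    /* Every implementation must at least leave room for both members. */
    assert(sizeof(struct y) >= sizeof(char) + sizeof(int));
    return sizeof(struct y) == sizeof(struct z);
}
```

With a 4-byte int aligned on a 4-byte boundary, both padding figures commonly come out as 3 and the two sizes match, but nothing in the language promises that. A preprocessor answering the #if above would have to reproduce exactly these decisions.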

Regards,

-=Dave
dha...@btree.com
Just my (10-010) cents
I can barely peak for myself, so I certainly can't speak for B-Tree.

Nick Maclaren

May 7, 1998, 3:00:00 AM

In article <dennisEs...@netcom.com>,

Dennis Yelle <den...@netcom.com> wrote:
>In article <6it2mh$dhq$1...@client3.news.psi.net> "Dann Corbit" <dco...@solutionsiq.com> writes:
>>Since C does not have run time binding, why can't the sizeof() operator be
>>computed at the preprocessor phase of compile time. That would make things
>>so much nicer, so that we could have constructs like:
>>#if (sizeof(int) == 8)
>>or whatever.
>
>You are right. It was nice back in the days when things like
>
>#if (sizeof(int) == 8)
>
>actually worked (on some compilers).

The trouble was that it only sometimes worked :-(

I have never understood why it was so hard to handle this during
preprocessing. There are basically three cases:

1) Traditional K&R C or C89, compiled on the target system. This
can easily be made to work. sizeof is a bit weird, being the only
'normal' identifier known to the preprocessor, but so what?

2) Traditional K&R C or C89, cross-compiled. sizeof could be
implementation-defined, and might produce a diagnostic. For some
reason, this was regarded as unacceptable.

3) C9X with VLAs. sizeof is a non-constant expression, and is not
allowed in preprocessor expressions.
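
Case 3 can be illustrated with a short sketch, assuming a compiler that supports variable-length arrays as they were later standardized in C99. The function name vla_bytes is hypothetical. The point is only that the operand's size is not known until the function runs, so no earlier phase could supply it.

```c
#include <stddef.h>   /* size_t */

/* Returns the size in bytes of a VLA with n elements.
   n must be greater than zero. */
size_t vla_bytes(int n)
{
    double buf[n];       /* variable-length array: length fixed at run time */
    return sizeof buf;   /* evaluated at run time: n * sizeof(double) */
}
```

Calling vla_bytes(3) and vla_bytes(10) from the same program yields two different values of sizeof for the same expression in the source.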

Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

Clive D.W. Feather

May 7, 1998, 3:00:00 AM

In article <6it661$8ks$1...@lyra.csx.cam.ac.uk>, Nick Maclaren
<nm...@cus.cam.ac.uk> writes

>I have never understood why it was so hard to handle this during
>preprocessing. There are basically three cases:
>
> 1) Traditional K&R C or C89, compiled on the target system. This
>can easily be made to work. sizeof is a bit weird, being the only
>'normal' identifier known to the preprocessor, but so what?

If you only want sizeof (type name using basic types) to work, then it
probably could be made to do so. If you want to use identifiers,
typedefs, and so on, then the preprocessor has to understand the entire
compiled language. The confusion probably isn't worth it.

--
Clive D.W. Feather | Director of Software Development | Home email:
Tel: +44 181 371 1138 | Demon Internet Ltd. | <cl...@davros.org>
Fax: +44 181 371 1037 | <cl...@demon.net> |
Written on my laptop; please observe the Reply-To address |

Dennis Ritchie

May 8, 1998, 3:00:00 AM

> You are right. It was nice back in the days when things like
>
> #if (sizeof(int) == 8)
>
> actually worked (on some compilers).

Must have been before my time.

Dennis

Douglas A. Gwyn

May 8, 1998, 3:00:00 AM

> > You are right. It was nice back in the days when things like
> > #if (sizeof(int) == 8)
> > actually worked (on some compilers).

Dennis Ritchie wrote:
> Must have been before my time.

I think it may have been along some axis other than time (or space).
#define sizeof(whatever) 2 /* almost right on 6th Ed. */
#if sizeof(int) == 8 /* we don't need no stinking parens */
...
Of course, that makes sizeof a trifle less useful than normal.
:-)

Nick Maclaren

May 8, 1998, 3:00:00 AM

In article <35529755...@null.net>,

Douglas A. Gwyn <DAG...@null.net> wrote:
>> > You are right. It was nice back in the days when things like
>> > #if (sizeof(int) == 8)
>> > actually worked (on some compilers).
>
>Dennis Ritchie wrote:
>> Must have been before my time.
>
>I think it may have been along some axis other than time (or space).

I believe that is precisely correct. I have used C compilers where
the preprocessor DID understand sizeof, but I vaguely remember them
being early drafts of so-called ANSI compilers - and many of those
early implementations were extremely so-called :-(

Christian Bau

May 8, 1998, 3:00:00 AM

In article <6it2mh$dhq$1...@client3.news.psi.net>, "Dann Corbit"
<dco...@solutionsiq.com> wrote:

> Since C does not have run time binding, why can't the sizeof() operator be
> computed at the preprocessor phase of compile time. That would make things

> so much nicer, so that we could have constructs like:
> #if (sizeof(int) == 8)
> or whatever.

Actually, the very first C compiler that I ever used could do exactly this
for standard C types!

What you can do portably: Let's say you want to have a type UInt32 that
should be an unsigned int type of exactly 32 bits. Then you can write

#include <limits.h>
#if UINT_MAX == 0xffffffff
typedef unsigned int UInt32;
#else
#if ULONG_MAX == 0xffffffff
typedef unsigned long UInt32;
#else
#error There is no type of exactly 32 bits
#endif
#endif

which will continue working if you use a compiler that decides 16 bit
chars would be a good idea.
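
A variation on the same technique, offered as a sketch: if "at least 32 bits" is good enough, the test can never fail, because unsigned long is guaranteed to hold values up to 0xFFFFFFFF. The names UIntLeast32 and max_u32 are hypothetical and only serve the illustration.

```c
#include <limits.h>   /* UINT_MAX */

/* The preprocessor can compare ranges, even though it cannot use sizeof. */
#if UINT_MAX >= 0xFFFFFFFF
typedef unsigned int UIntLeast32;
#else
typedef unsigned long UIntLeast32;   /* always at least 32 bits wide */
#endif

/* Largest 32-bit value, held in whichever type was selected above. */
UIntLeast32 max_u32(void)
{
    return (UIntLeast32)0xFFFFFFFFUL;
}
```

Unlike the exact-width version, this one needs no #error branch, at the cost of possibly getting a wider type than asked for.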

Stephen Baynes

May 8, 1998, 3:00:00 AM

It also works if there are 'holes' in the types.

At best, (sizeof(int) == 8) tells you that int can hold _at_most_ 8*CHAR_BIT
bits and no more. It does not guarantee that you will get that many.
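
The difference between the two measures can be checked directly. This is a minimal sketch with hypothetical function names: one counts the value bits actually present in UINT_MAX, the other computes the upper bound implied by sizeof.

```c
#include <limits.h>   /* UINT_MAX, CHAR_BIT */

/* Number of value bits in unsigned int, found by counting the bits of
   UINT_MAX (which is always of the form 2^N - 1). */
int uint_value_bits(void)
{
    unsigned int v = UINT_MAX;
    int bits = 0;
    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Upper bound on the number of bits, derived from the storage size. */
int uint_storage_bits(void)
{
    return (int)(sizeof(unsigned int) * CHAR_BIT);
}
```

On most current implementations the two numbers agree; on one with padding bits ("holes") the first would be smaller, which is why the limits.h test is the safer one.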

--
Stephen Baynes CEng MBCS Stephen...@soton.sc.philips.com
Philips Semiconductors Ltd
Southampton SO15 0DJ +44 (01703) 316431
United Kingdom My views are my own.
Do you use ISO8859-1? Yes if you see © as copyright, ÷ as division and ½ as 1/2.

James Kuyper

May 8, 1998, 3:00:00 AM

Dann Corbit wrote:
>
> Since C does not have run time binding, why can't the sizeof() operator be
> computed at the preprocessor phase of compile time. That would make things
> so much nicer, so that we could have constructs like:
> #if (sizeof(int) == 8)
> or whatever.

Well, for one thing, this won't work on VLAs.

Also, it would require the pre-processor to parse the language in
sufficient detail to calculate the size of any argument that could
appear in a sizeof() expression. The compiler can't determine the size
of a structure or union, for instance, until phase 7. Pre-processing
currently occurs in phase 4.

You could specify that using sizeof() on user-defined types is undefined
as far as the pre-processor is concerned. However, that still has the
disadvantage that the pre-processor must know enough about C to identify
the standard types, and know enough about the specific implementation to
determine how the size of the standard types varies with the compiler
options. That would prohibit one popular type of implementation, in
which the pre-processor is an actual separate program, which knows next
to nothing about C, and can therefore be used on its own in contexts
that have nothing to do with C.
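
Since sizeof is available from phase 7 onward, a size check can still be made at compile time, just not by the preprocessor. The following is a sketch of one well-known C89-compatible idiom, not something proposed in this thread; the macro name COMPILE_TIME_ASSERT and the other identifiers are hypothetical.

```c
#include <limits.h>   /* CHAR_BIT */

/* Declares an array type of size -1 when cond is false, which the
   compiler must diagnose. The check happens in the compiler proper. */
#define COMPILE_TIME_ASSERT(name, cond) \
    typedef char name[(cond) ? 1 : -1]

/* Holds on every conforming implementation: int has at least 16 bits. */
COMPILE_TIME_ASSERT(int_has_at_least_16_bits, sizeof(int) * CHAR_BIT >= 16);

/* An ordinary `if` on sizeof. The condition is a constant expression,
   so a compiler can typically discard the branch that is not taken. */
int int_is_8_bytes(void)
{
    if (sizeof(int) == 8)
        return 1;
    return 0;
}
```

This does not replace #if, since both branches of the `if` must still compile, but it covers the common case of refusing to build when a size assumption is wrong.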

Nick Maclaren

May 8, 1998, 3:00:00 AM

In article <NDJG1NBz...@romana.davros.org>, "Clive D.W. Feather" <cl...@on-the-train.demon.co.uk> writes:
|> In article <6it661$8ks$1...@lyra.csx.cam.ac.uk>, Nick Maclaren
|> <nm...@cus.cam.ac.uk> writes
|> >I have never understood why it was so hard to handle this during
|> >preprocessing. There are basically three cases:
|> >
|> > 1) Traditional K&R C or C89, compiled on the target system. This
|> >can easily be made to work. sizeof is a bit weird, being the only
|> >'normal' identifier known to the preprocessor, but so what?
|>
|> If you only want sizeof (type name using basic types) to work, then
|> it probably could be made to do so. If you want to use identifiers,
|> typedefs, and so on, then the preprocessor has to understand the
|> entire compiled language. The confusion probably isn't worth it.

Not really. It already has to parse and evaluate expressions. To
handle sizeof, it would need to include perhaps 50% of the remainder
of the parser (though it would probably be easier to include the lot)
and the fairly simple structure allocation (i.e. alignment) code; the
remaining work is trivial. You need at least as much for standard
Unix C utilities like cflow!

And, anyway, that is an argument that applies to K&R C only. In C89,
the language includes both the preprocessor and type mechanism, and
so an implementation can trivially merge the two. This can be done
without eliminating 'cpp' by simply stopping after the relevant
phases of compilation - this has been done in many C89 compilers, so
I am not just speculating.

I don't think that there are any language arguments against it, and
there aren't even any very strong technical difficulties. But I can
see that vendors with completely separate preprocessors and compilers
were reluctant to accept a change that would have forced them to
merge them, or copy code from one to the other.

Clive D.W. Feather

May 10, 1998, 3:00:00 AM

In article <6iv5hi$adb$1...@lyra.csx.cam.ac.uk>, Nick Maclaren
<nm...@cus.cam.ac.uk> writes

>|> If you only want sizeof (type name using basic types) to work, then
>|> it probably could be made to do so. If you want to use identifiers,
>|> typedefs, and so on, then the preprocessor has to understand the
>|> entire compiled language. The confusion probably isn't worth it.
>Not really. It already has to parse and evaluate expressions. To
>handle sizeof, it would need to include perhaps 50% of the remainder
>of the parser (though it would probably be easier to include the lot)
>and the fairly simple structure allocation (i.e. alignment) code; the
>remaining work is trivial.

No - it also needs to keep track of the types of identifiers, and
therefore of the whole scope system. Consider:

typedef struct { int mumble; float sink; } t;
struct { t u; t *next; } s;
#if sizeof (s) > 42
/* ... */

and think what I could do with nested scopes, macros containing braces,
and so on. Currently the preprocessor puts no semantics on non-macro
identifiers.
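
A small sketch of the scope point, with hypothetical names: the same spelling, sizeof s, denotes two different objects depending on where it appears, so anything evaluating it has to track declarations and block structure.

```c
#include <stddef.h>   /* size_t */

struct { int a; } s;          /* file-scope s */

/* Here s is the file-scope structure. */
size_t outer_size(void)
{
    return sizeof s;
}

/* Here a block-scope declaration hides the file-scope s. */
size_t inner_size(void)
{
    char s[100];
    return sizeof s;          /* size of the array: 100 */
}
```

A preprocessor asked to evaluate #if sizeof (s) > 42 inside inner_size would have to know that the array, not the structure, is in scope at that point.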

>And, anyway, that is an argument that applies to K&R C only. In C89,
>the language includes both the preprocessor and type mechanism, and
>so an implementation can trivially merge the two.

Yes, but it can't trivially provide the required feedback. Your argument
is a red herring.

Nick Maclaren

May 11, 1998, 3:00:00 AM

In article <mW7cQ5A9...@romana.davros.org>,

Clive D.W. Feather <cl...@demon.net> wrote:
>
>No - it also needs to keep track of the types of identifiers, and
>therefore of the whole scope system. Consider:
>
> typedef struct { int mumble; float sink; } t;
> struct { t u; t *next; } s;
> #if sizeof (s) > 42
> /* ... */
>
>and think what I could do with nested scopes, macros containing braces,
>and so on. Currently the preprocessor puts no semantics on non-macro
>identifiers.

I think that you are overstating the problems. We can exclude labels,
we don't have to distinguish storage classes etc., and the necessary
parts of the C scope system are pretty trivial. Heck, the hard part of
writing C source-munging tools is the necessity to either emulate or
call the preprocessor!

Remember the 'one-pass' nature of C - this turns what can be an extremely
hard problem into something that is really pretty simple. I quite agree
that we aren't going to see it, and you will have noticed that I haven't
bothered even proposing it, but the arguments are not linguistic. They
are the amount of hassle that it would cause many vendors.

>>And, anyway, that is an argument that applies to K&R C only. In C89,
>>the language includes both the preprocessor and type mechanism, and
>>so an implementation can trivially merge the two.
>
>Yes, but it can't trivially provide the required feedback. Your argument
>is a red herring.

Yes, it can - in the LANGUAGE. An implementation that handles the two
phases entirely separately can't, of course. An implementation that
handles preprocessing in parallel with syntax analysis can, and with
minimal extra work.

I could be wrong, but I am pretty sure that the changes to the wording of
the standard to support this would be minor, as there are already words
saying when identifiers become visible and when declarations are complete.
A little more tightening might be necessary.

I stand by my view that it could fairly easily be done, but I accept that
such a change would imply that many vendors would have to rewrite large
chunks of their compilation systems. This then becomes a cost-benefit
decision, and we know what was decided :-)

Clive D.W. Feather

May 14, 1998, 3:00:00 AM

In article <6j6fht$rfv$1...@lyra.csx.cam.ac.uk>, Nick Maclaren
<nm...@cus.cam.ac.uk> writes

>>No - it also needs to keep track of the types of identifiers, and
>>therefore of the whole scope system.
[...]

>I think that you are overstating the problems. We can exclude labels,
>we don't have to distinguish storage classes etc.,

True.

>and the necessary
>parts of the C scope system are pretty trivial. Heck, the hard part of
>writing C source-munging tools is the necessity to either emulate or
>call the preprocessor!

But it's not just source-munging. You need to keep all identifiers and
types to hand and handle all the scope issues. It brings a chunk of the
compiler proper into the preprocessor.

>I could be wrong, but I am pretty sure that the changes to the wording of
>the standard to support this would be minor, as there are already words
>saying when identifiers become visible and when declarations are complete.
>A little more tightening might be necessary.

I have my doubts; I suspect that there are pathological cases
(particularly since the language bit *doesn't* take preprocessing into
account).

--
Clive D.W. Feather | Director of Software Development | Home email:

Tel: +44 181 371 1138 | Demon Internet Ltd. | <cd...@i.am>
Fax: +44 181 371 1037 | <cl...@demon.net> | Home web:
Written on my laptop; please observe the Reply-To address | http://i.am/davros
