
long longs in c


John R. Mashey

Aug 17, 1995
In article <danpop.808659017@rscernix>, Dan...@mail.cern.ch (Dan Pop) writes:
|>
|> In <40tdmr$j...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:
|>
|> >1) long longs are not part of ANSI C ... but probably will be, since:
(lots of people have implemented it, if not previously, as instigated
by 64-bit working group in 1992).
|>
|> Well, you'd better have a look at comp.std.c. None of the committee
|> people posting there seems to be favouring the addition of long long
|> in C9X. They're considering other schemes. long long seems to be
|> doomed to be a vendor extension.


I believe this conclusion to be unwarranted ....
a) Some features are random extensions by individual vendors.
b) Some extensions get widely implemented in advance of the
standard, because they solve some problem that cannot wait
until the next standard ... after all, standards have no
business changing overnight.
c) Standards committees may well need to sometimes invent new
things [like, when volatile was added years ago].
d) However, if an extension is widely implemented, it is incumbent
on an open standards committee to give that extension serious
consideration ... because otherwise, there is a strong
tendency for the de facto standards to evolve away from the
de jure standard... which is probably not a good idea.
e) Again, as I said before, 1-2 members of the 1992 group were also
in the ANSI C group ... which is where I got the opinion above,
i.e., don't let the non-existence of long long in the standard
stop you from making progress - it is better to do something
consistent.

IF long long has definitively been ruled out (as opposed to being disliked by
a few committee members), it would be interesting to hear more... as it
seems inconsistent with past behavior, which has at least sometimes
ratified existing practices that were less than elegant... and was
appropriate in doing so.

--
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: ma...@sgi.com
DDD: 415-390-3090 FAX: 415-967-8496
USPS: Silicon Graphics 6L-005, 2011 N. Shoreline Blvd, Mountain View, CA 94039-7311

FFarance

Aug 18, 1995
(Sorry if this got posted twice. The AOL news server went down in the
middle of
posting. AOL is a *real* pile.)

> From: ma...@mash.engr.sgi.com (John R. Mashey)
>
> In article <danpop.808659017@rscernix>, Dan...@mail.cern.ch (Dan Pop) writes:
> |>
> |> In <40tdmr$j...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:
> |>
> |> >1) long longs are not part of ANSI C ... but probably will be, since:
> (lots of people have implemented it, if not previously, as instigated
> by 64-bit working group in 1992).

The "long long" type is unlikely to be included in C9X. Although the
problem has been discussed in the EIR (extended integer range) working
group of NCEG (numeric C extensions group -- X3J11.1) for several years,
over the past two years it has been recognized as a faulty solution.

I've covered this point several times on "comp.std.c". You can read my
SBEIR (specification-based extended integer range) proposal in:

ftp://ftp.dmk.com/DMK/sc22wg14/c9x/extended-integers/sbeir.*

I'll be updating the proposal in about two weeks to include a FAQ with
many of the questions posed in "comp.std.c". Until then, I will summarize
the points I made in this thread about a month ago:

- After much analysis, the problem is not ``can I standardize
"long long", or how to I get a 64-bit type, or what is the name
of a 64-bit type'', but ``loss of intent information causes
portability problems''. This isn't an obvious conclusion. You
should read the paper to understand the *real* problem.

- Most people on WG14 and X3J11 now understand that this is the
real problem.

- Solving this problem revolves around capturing programmer
intent when declaring the integral type. Intent is specified
in terms of signedness, specified precision (what you want),
exactness (exactly N bits, at least N bits), performance
attributes (fastest, smallest, unoptimized).

- The nature of the solution matches the nature of the problem:

typedef signed fast int atleast:64 X;
typedef signed int exact:64 Y;

- The use of "long long" causes more harm (really!) because it
creates more porting problems. As a simple example, while we
might believe that "long long" is a 64-bit type, what happens
when you move the code to 128-bit machines? The "long long"
type will probably map into the X or Y typedef above. This
will cause porting problems because whatever "long long" is
mapped into, some will believe it is (and use it as) ``the
fastest type of at least 64 bits'' and others will believe it
is ``exactly 64 bits''. Thus, in the port to 128-bit machines,
we have to track down these implicit assumptions because programmers
*rarely* document their intent (e.g., ``I want the fastest 32-bit
type'') and, mostly, they believe the type itself documents
intent (!). This is how porting problems are created.
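
To make the intent-loss point concrete, here is a small invented
example (assuming a compiler that accepts the "long long" extension):
both declarations look the same today, but only one of the two intents
survives a port to a machine where "long long" becomes 128 bits.

    /* Intent A: exactly 64 bits, because the struct mirrors an on-disk record. */
    struct disk_record {
        long long offset;               /* assumed to be exactly 8 bytes */
        char      name[24];
    };

    /* Intent B: the fastest type of at least 64 bits, used only for arithmetic. */
    long long checksum(const unsigned char *p, unsigned long n)
    {
        long long sum = 0;
        while (n-- > 0)
            sum += *p++;
        return sum;
    }

    /* On a 128-bit "long long", intent B still works; intent A silently
       changes the external record layout. */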

> |> Well, you'd better have a look at comp.std.c. None of the committee
> |> people posting there seems to be favouring the addition of long long
> |> in C9X. They're considering other schemes. long long seems to be
> |> doomed to be a vendor extension.

It should probably remain a vendor extension.

> I believe this conclusion to be unwarranted ....
> a) Some features are random extensions by individual vendors.
> b) Some extensions get widely implemented in advance of the
> standard, because they solve some problem that cannot wait
> until the next standard ... after all, standards have no
> business changing overnight.

Yes, "long long" has been implemented by many vendors to solve *their*
immediate problem. However the 64-bit vendors could not agree on a
mapping of char/short/int/long to specific precisions (e.g., 8/16/32/64,
8/16/32/32, etc.). The reason they couldn't agree is that there was
another problem that caused them to have different opinions -- it's
related to the ``loss of intent information causes portability problems'':
the vendors couldn't agree on the mapping of type intents to the same
precisions on all architectures. While these vendors weren't discussing
these problems in SBEIR terms, they were blocked from solving the problem
because of this.

> c) Standards committees may well need to sometimes invent new
> things [like, when volatile was added years ago].

This solution wasn't ``just invented'', but developed over years by
analyzing what the *real* problem is. The nature of the solution matches
the nature of the problem. BTW, bit/byte ordering/alignment and
representation (e.g., two's complement) will be addressed in separate
proposals. The SBEIR proposal only addresses range extensions.

> d) However, if an extension is widely implemented, it is incumbent
> on an open standards committee to give that extension serious
> consideration ... because otherwise, there is a strong
> tendency for the de facto standards to evolve away from the
> de jure standard... which is probably not a good idea.

The "long long" extension was given serious consideration. It was worked
on for several years, but several difficult problems kept coming back.
It took us several years to recognize the true nature of the problem.

> e) Again, as I said before, 1-2 members of the 1992 group were also
> in the ANSI C group ... which is where I got the opinion above,
> i.e., don't let the non-existence of long long in the standard
> stop you from making progress - it is better to do something
> consistent.

In 1992, that was probably a reasonable opinion. Since then we have come
to understand the problem, and solutions are being worked on now.

> IF long long has definitively been ruled out (as opposed to being
> disliked by a few committee members), it would be interesting to hear
> more... as it seems inconsistent with past behavior, which has at least
> sometimes ratified existing practices that were less than elegant...
> and was appropriate in doing so.

I think if we could have fixed "long long", even with a 90% solution,
we would have done it. Among the reasons for not including "long long"
are: we'd have to solve this problem again 10 years from now when people
were asking for "long long long" for their 128-bit machines; "long long"
causes more portability problems *across different architectures* than
it helps. Years ago, many people wondered aloud if we could find
a ``right'' solution that solved the problem once and for all. The
SBEIR proposal is one solution.

-FF
-------------------------------------------------------------------
(``I only use AOL for reading netnews.'')
Frank Farance, Farance Inc.
E-mail: fr...@farance.com, Telephone: +1 212 486 4700
ISO JTC1/SC22/WG14 & ANSI X3J11 (C Programming Language) Project Editor

hal...@caip.rutgers.edu

Aug 19, 1995
In article <4102js$1...@murrow.corp.sgi.com>, ma...@mash.engr.sgi.com (John R. Mashey) writes
: ....
: IF long long has definitively been ruled out (as opposed to being disliked by

: a few committee members), it would be interesting to hear more... as it
: seems inconsistent with past behavior, which has at least sometimes
: ratified existing practices that were less than elegant... and was
: appropriate in doing so.

What is more, the suggested alternative is in many respects _less_ elegant.

Steve Summit

Aug 19, 1995
I probably shouldn't step in here; I should probably stay
hunkered down with the rabble in comp.lang.c. I barely follow
comp.std.c or X3J11, I haven't read the SBEIR proposal, and I'll
probably step on some toes.

In article <412dkr$7...@newsbf02.news.aol.com>, ffar...@aol.com writes:
> - After much analysis, the problem is not ``can I standardize
> "long long", or how to I get a 64-bit type, or what is the name
> of a 64-bit type'', but ``loss of intent information causes
> portability problems''.

That's certainly *a* problem, but my sense is that an even more
fundamental one is, "People are unclear and disagree on what they
want out of the language."

> - Solving this problem revolves around capturing programmer
> intent when declaring the integral type. Intent is specified
> in terms of signedness, specified precision (what you want),
> exactness (exactly N bits, at least N bits), performance
> attributes (fastest, smallest, unoptimized).
>
> - The nature of the solution matches the nature of the problem:
>
> typedef signed fast int atleast:64 X;
> typedef signed int exact:64 Y;

This isn't C. This is PL/I.

I believe it's risky to declare, after much analysis, that a
symptom is a fundamental problem and that the problem can and
must be solved.

> - The use of "long long" causes more harm (really!) because it
> creates more porting problems. As a simple example, while we
> might believe that "long long" is a 64-bit type, what happens
> when you move the code to 128-bit machines? The "long long"
> type will probably map into the X or Y typedef above. This
> will cause porting problems because whatever "long long" is
> mapped into, some will believe it is (and use it as) ``the
> fastest type of at least 64 bits'' and others will believe it
> is ``exactly 64 bits''.

But this is absolutely *not* a new problem! This is the exact
problem people had when porting code from 16 to 32 bit machines,
or vice versa.

My thesis -- and as I have not followed the latest round of
arguments in comp.std.c it is probably a naive one -- is that
while there are certainly several interrelated problems here, they
are in fact so fundamental that they count as postulates of the
language, and that if we feel they *need* solving we should
discard the language and start from scratch, not try to backpatch
them into C.

> Thus, in the port to 128-bit machines,
> we have to track down these implicit assumptions because programmers
> *rarely* document their intent (e.g., ``I want the fastest 32-bit
> type'') and, mostly, they believe the type itself documents
> intent (!). This is how porting problems are created.

Precisely. The real "problem" is poor programmers. Can we solve
this problem by fiddling with the type system? Even if we could,
should we?

> Yes, "long long" has been implemented by many vendors to solve *their*
> immediate problem. However the 64-bit vendors could not agree on a
> mapping of char/short/int/long to specific precisions (e.g., 8/16/32/64,
> 8/16/32/32, etc.).

But this is madness! There has never been, and there was never
supposed to be, a single, defined mapping between char/short/int/
long and any set of specific precisions! That simply is not how
C's type system works.

I realize that everyone else, at some level, knows this, too.
I realize that I'm tilting at windmills.
I realize that since 99% of all C programmers apparently cannot
use C's type system as it was intended, the pragmatic conclusion
can be reached that it's time to abandon C's type system and give
the 99% what they think they want, namely complete control over
exact type sizes. But I think it would be a real shame, a real
capitulation, and a real step backward to do this, and a
"solution" which does so should be labeled as such, as not really
being C any more.

> This solution wasn't ``just invented'', but developed over years by
> analyzing what the *real* problem is. The nature of the solution matches
> the nature of the problem.

I disagree that it took years just to develop this solution.

The first time you see a programmer struggling with elaborate
config scripts and typedefs such as int16 and int32, it's obvious
that a more "ideal" solution would be to introduce a full,
general, orthogonal set of type specifiers covering size, exact
vs. at-least, fast vs. small code vs. small data, etc., in other
words, just about what it sounds like SBEIR is. I'm sure I
started thinking along these lines at least five years ago,
perhaps even before 64-bit machines were a serious consideration.
I don't think that full, orthogonal control is a new or difficult
idea; plenty of languages have something like it. But "ideal" is
in quotes because it was obvious to me then, and is still obvious
to me now, that this solution is not C.
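
For concreteness, the kind of hand-rolled header being alluded to might
look like the sketch below; the int16/int32 names and the platform
macros are invented for illustration, and such a file is usually spat
out by a configure script.

    #ifndef PORTTYPES_H
    #define PORTTYPES_H

    /* Hand-maintained mapping from "intent" names to whatever each
       platform happens to provide. */
    #if defined(PDP11) || defined(SIXTEEN_BIT_INT)
    typedef int            int16;
    typedef long           int32;
    #else                               /* most 32-bit workstations */
    typedef short          int16;
    typedef int            int32;
    #endif

    typedef unsigned short uint16;      /* and so on for unsigned flavors */

    #endif /* PORTTYPES_H */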

So what has it taken years to do, if not to come up with the
mechanics of a solution? I suspect it's taken years to summon
the will to introduce this solution. What the intervening years
have wrought is that C, despite these awful "problems," has
continued to grow in popularity, such that even more people who
don't understand it are trying desperately to misuse it, and
their voices have grown to deafening proportions, and the
original intent of C's type system has been even more dimly
forgotten, such that today it's possible to argue with a straight
face that this enormous bucket of extra specification ought to be
bolted onto C's type system.

Don't get me wrong. If I were designing a language, it would
have something exactly like that "enormous bucket of extra
specification." Although I claim that the extra level of
precision in declaration is needed far less of the time than is
usually assumed, it is needed some of the time, and It Would Be
Nice If there were a complete solution. But if the elusive
"Spirit of C" still held any real sway in determining the
language's future, a proposal like this would be a non-starter.
C's original philosophy (like Unix's) used to involve the
heretical repudiation of the last 10% of functionality which
would have accounted for an extra 90% of complexity and
development cost. It's true that the missing 10% of
functionality has engendered a constant stream of shrill
criticism all along, but I've come to realize that the omission
was in fact one of the most salient features, and perhaps a
major, unappreciated, "stealth" factor behind the language's
(operating system's) galloping success.

The nature of the proposed solution, though, does very definitely
match the nature of the problem. The problem is that people
continue to wish that C were something it is not, not realizing
that if C were what they thought they wanted it to be it would
never have succeeded and they wouldn't be using it in the first
place. And the solution is in tune with those wishes.

> BTW, bit/byte ordering/alignment and
> representation (e.g., two's complement) will be addressed in separate
> proposals.

This note highlights another significant aspect of what the real
problem really is. If all we cared about was computation, we
obviously wouldn't care about alignment and byte order. If we do
care, it's obvious that what we're worried about is that great
bugaboo: conforming to externally-imposed storage layouts.
Everybody desperately wants to do binary I/O in an "efficient"
way, namely by using fread and fwrite to read and write
structures which are used both for internal manipulation and
external interchange. And since some people, at least, also want
this binary I/O to be portable, they end up wanting complete
control over structure layout just so that they can get the I/O
to come out right.
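
As a concrete illustration of the practice being described (the record
layout and field widths here are invented), compare the convenient but
layout-dependent form with the explicit translation people end up
writing when the external layout is fixed:

    #include <stdio.h>

    struct rec { unsigned long id; unsigned int flags; };

    /* Convenient but unportable: writes whatever padding, sizes, and
       byte order the compiler happens to use. */
    static int save_native(FILE *fp, const struct rec *r)
    {
        return fwrite(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
    }

    /* Portable to a fixed external layout: a 4-byte big-endian id
       followed by a 2-byte big-endian flags field. */
    static int save_external(FILE *fp, const struct rec *r)
    {
        unsigned char buf[6];

        buf[0] = (unsigned char)((r->id    >> 24) & 0xFF);
        buf[1] = (unsigned char)((r->id    >> 16) & 0xFF);
        buf[2] = (unsigned char)((r->id    >>  8) & 0xFF);
        buf[3] = (unsigned char)( r->id           & 0xFF);
        buf[4] = (unsigned char)((r->flags >>  8) & 0xFF);
        buf[5] = (unsigned char)( r->flags        & 0xFF);

        return fwrite(buf, 1, sizeof buf, fp) == sizeof buf ? 0 : -1;
    }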

It occurs to me that we could probably address 90% of what people
really want when they complain about type sizes, in a more direct
way, in a way no more radical than the SBEIR proposal, and
perhaps less disruptively, by doing one or both of the following:

1. Give programmers more control over the alignment and
packing of structures, perhaps by extending the existing
bitfield notation.

2. Build the rudiments of direct, binary I/O into the
language, with input and output statements which could
read and write data structures with as much convenience
to programmers as fread and fwrite, but with an
opportunity for appropriate translations to be
automatically introduced to conform to externally-imposed
storage layouts.

Number 2 is certainly radical, and it breaks another fundamental
postulate of the language (namely that I/O is not built in), and
I don't expect it will be taken seriously, but I submit that it
is no more radical or heretical than SBEIR. Neither 1 nor 2
above propagates sticky problems into parts of the language
unconcerned with external storage layouts -- neither necessitates
massive additions to the various sets of promotion rules; neither
induces screaming nightmares with respect to printf formats.

> I think if we could have fixed "long long", even with a 90% solution,
> we would have done it. Among the reasons for not including "long long"
> are: we'd have to solve this problem again 10 years from now when people
> were asking for "long long long" for their 128-bit machines; "long long"
> causes more portability problems *across different architectures* than
> it helps. Years ago, many people wondered out aloud if we could find
> a ``right'' solution that solved the problem for once and all.

The reason that "long long" couldn't be agreed upon was that
there was no *right* solution. If you were zealous about the
preservation of the original spirit of C, it was obvious that
making single "long" be a 64-bit type was the right solution.
If you were more pragmatic, or had hapless customers who insisted
that sizeof(int) had to remain equal to sizeof(long), or felt
that there ought to be a way to say "at least 64 bits," or wanted
people to be able to port code that made assumptions about exact
sizes without using typedefs, you felt a compelling need for
something like "long long".

When 128-bit machines come along, if you're on the zealous & pure
side but "long long" has somehow gotten its foot in the door,
there's no problem: long is 64 bits, and long long is 128 bits.


The real shame is that none of this is, or ought to be, as
important as it now seems to be. Most code that worries about
the exact sizes of things would be better off if it didn't. Much
binary I/O that worries about external storage layout would be
better off if it used more flexible representations (such as
text). If we capitulate to the demands of already-spoiled
programmers, by giving them all this extra control that they'd
be better off not needing and not using, we'll not only be
contributing to unnecessary tight coupling and fragility of code,
but we'll also be seriously degrading the simplicity and
learnability of the language. I realize that this is a
slippery-slope argument, but if something like SBEIR goes into
C9X, and even though the new exact and fast and atleast keywords
will be strictly optional, most new programmers learning C will
be taught that for every variable they declare they should make a
choice between at-least/exact and small/fast, and though they
won't realize it, their annoyance at being told to do so, and
their lingering suspicion that perhaps they shouldn't have to,
will be exactly what they'd have felt if they'd been learning
PL/I.

Steve Summit
s...@eskimo.com

FFarance

Aug 20, 1995
(This posting is part 1 of 2. I think the news server is having a
problem with the long message.)

> Path:
newsbf01.news.aol.com!newstf01.news.aol.com!uunet!in2.uu.net!eskimo!scs
> From: s...@eskimo.com (Steve Summit)


>
> I probably shouldn't step in here; I should probably stay
> hunkered down with the rabble in comp.lang.c. I barely follow
> comp.std.c or X3J11, I haven't read the SBEIR proposal, and I'll
> probably step on some toes.

In summary, you make many good points that also happen to be made
in the SBEIR paper. You should read the paper to understand the
problem and its solution completely. I've been giving highlights
of this in "comp.std.c", but the paper says it more completely.
With this in mind, I'll address some of your points; for others I
will refer you to the paper.

> In article <412dkr$7...@newsbf02.news.aol.com>, ffar...@aol.com writes:
> > - After much analysis, the problem is not ``can I standardize
> > "long long", or how to I get a 64-bit type, or what is the name
> > of a 64-bit type'', but ``loss of intent information causes
> > portability problems''.
>
> That's certainly *a* problem, but my sense is that an even more
> fundamental one is, "People are unclear and disagree on what they
> want out of the language."

Considering C must satisfy several needs, e.g., writing code that
works the same on all machines (strictly conforming programs) and
code that works well on a single machine (conforming programs), there
is certainly disagreement on what people want in a language. This
is one of the reasons the C types char/short/int/long cause problems.

> > - Solving this problem revolves around capturing programmer
> > intent when declaring the integral type. Intent is specified
> > in terms of signedness, specified precision (what you want),
> > exactness (exactly N bits, at least N bits), performance
> > attributes (fastest, smallest, unoptimized).
> >
> > - The nature of the solution matches the nature of the problem:
> >
> > typedef signed fast int atleast:64 X;
> > typedef signed int exact:64 Y;
>
> This isn't C. This is PL/I.

Possibly, but there isn't anything wrong with borrowing other
technology from other languages. In fact, the C9X charter makes it
clear that we may be influenced by other languages.

> I believe it's risky to declare, after much analysis, that a
> symptom is a fundamental problem and that the problem can and
> must be solved.

Maybe one person's fundamental problem is another's symptom. I think
we did the necessary amount of analysis (which was a lot) to solve the
requirements: people needed a portable way to get more integer
range. Like much software engineering, the document that states
the customer's requirements (i.e., portable mechanism for extended
range) doesn't completely match the functional specification after
we analyze the problem (i.e., loss of type intent information
causes portability problems).

> > - The use of "long long" causes more harm (really!) because it
> > creates more porting problems. As a simple example, while we
> > might believe that "long long" is a 64-bit type, what happens
> > when you move the code to 128-bit machines? The "long long"
> > type will probably map into the X or Y typedef above. This
> > will cause porting problems because whatever "long long" is
> > mapped into, some will believe it is (and use it as) ``the
> > fastest type of at least 64 bits'' and others will believe it
> > is ``exactly 64 bits''.
>
> But this is absolutely *not* a new problem! This is the exact
> problem people had when porting code from 16 to 32 bit machines,
> or vice versa.

You *must* have read the SBEIR paper (:-) because the SBEIR paper
discusses moving from 16 to 32 bits and 32 bits to 64 bits (we are
discovering the same porting problems again). If you were using
C in the 1970's or early 1980's you probably saw many of these
porting issues.

> My thesis -- and as I have not followed the latest round of
> arguments in comp.std.c it is probably a naive one -- is that
> while there are certainly several interrelated problems here, they
> are in fact so fundamental that they count as postulates of the
> language, and that if we feel they *need* solving we should
> discard the language and start from scratch, not try to backpatch
> them into C.

I don't think we are ``backpatching'' this into C. In fact, the
SBEIR is *completely* compatible with existing C types, semantics,
and promotion rules (each vendor must describe the mapping of
char/short/int/long into an SBEIR type). The SBEIR conceptual
model allows you to specify more precisely (so to speak) the
parameters of the type. You don't have to use SBEIR types, but
if you are writing code that must work on many architectures with
space/time performance constraints (who isn't on large projects?),
then you'd find the SBEIR types solve these kinds of porting problems.
IMPORTANT: If you're writing code for a single architecture, or
space/time performance isn't a major issue, or programmers (software
porting and maintenance) are free, then SBEIR isn't for you.
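
Purely as an illustration of the vendor mapping mentioned above, a
hypothetical 32-bit implementation might document its binding roughly
like this (the notation is the proposal's tentative syntax and the
particular choices are invented):

    /* Hypothetical binding of the existing C types to SBEIR types: */
    /*
        char   ->  signed int atleast:8
        short  ->  signed int atleast:16
        int    ->  signed fast int atleast:32
        long   ->  signed int atleast:32
    */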

> > Thus, in the port to 128-bit machines,
> > we have to track down these implicit assumptions because programmers
> > *rarely* document their intent (e.g., ``I want the fastest 32-bit
> > type'') and, mostly, they believe the type itself documents
> > intent (!). This is how porting problems are created.
>
> Precisely. The real "problem" is poor programmers. Can we solve
> this problem by fiddling with the type system? Even if we could,
> should we?

While it's true that poor programmers aggravate the problem, even
good programmers have a hard time solving this problem. See the
SBEIR topics on experiment programs, preprocessor magic, and choosing
a correct C type. Good C programmers might have had some hope in
porting code between 16-bit and 32-bit systems, but they will have
a hard time getting it right when you add 64-bit systems (64-bit
systems had much less consensus on mapping char/short/int/long
into common precisions -- you point this out below).

> > Yes, "long long" has been implemented by many vendors to solve *their*
> > immediate problem. However the 64-bit vendors could not agree on a
> > mapping of char/short/int/long to specific precisions (e.g., 8/16/32/64,
> > 8/16/32/32, etc.).
>
> But this is madness! There has never been, and there was never
> supposed to be, a single, defined mapping between char/short/int/
> long and any set of specific precisions! That simply is not how
> C's type system works.

I'm not suggesting that char/short/int/long should be mapped the
same on all machines -- quite the opposite. My point was that
someone in that group of vendors around 1992 thought that (1) this
was part of the problem (true) and (2) that they could provide
a common mapping (false). My statement above was pointing to the
reason they couldn't solve the problem: they weren't aware that
they were all trying to map type intents into the same C types
across all 64-bit architectures -- this is impossible. Even
if you restrict the problem to specifying the precision (4 choices:
8, 16, 32, 64), exactness (2 choices: exact or at-least),
performance (3 choices: space, time, none), conformance (2 choices:
strictly conforming or conforming), addressability (2 choices:
required or not required), you get 96 (4 x 2 x 3 x 2 x 2) possibilities
of type intents. You would map these onto the 5 C ``types'': bit field,
char, short, int, long (yes, I know bit fields aren't a type, that
is why I used quote marks). As you can see, information is lost
in the mapping (96->5) and this mapping is *different* for each
64-bit architecture and certainly different for other architectures --
this is the root of the problem. To put it another way, the
``fastest type of at least 16 bits'' (a type intent) is called
(mapped to) "short", "int", or "long" on some machines. This is
implemented as 16 bits, 32 bits, and 64 bits (not necessarily
corresponding to "short", "int", and "long").

> I realize that everyone else, at some level, knows this, too.
> I realize that I'm tilting at windmills.
> I realize that since 99% of all C programmers apparently cannot
> use C's type system as it was intended, the pragmatic conclusion
> can be reached that it's time to abandon C's type system and give
> the 99% what they think they want, namely complete control over
> exact type sizes. But I think it would be a real shame, a real
> capitulation, and a real step backward to do this, and a
> "solution" which does so should be labeled as such, as not really
> being C any more.

I think that C's original type system was faulty. This problem wasn't
as obvious when you only had 16-bit machines and, later on, 32-bit
machines. The problem is that the language is meeting several
needs at once: portable across all systems, works only on this
system, allows programmer to get close to hardware, allows programmer
to avoid knowledge of the hardware, programmers need space
optimizations, programmers need time optimizations, and so on. When
C ran on a limited set of similar architectures, the mapping to
hardware was similar (``you can have whatever color you like as
long as it's black''). Extending C's type system by creating
"long long" only further aggrevates the problem.

(This message is continued in the next posting.)

Lawrence Kirby

Aug 20, 1995
In article <417oh5$j...@newsbf02.news.aol.com> ffar...@aol.com "FFarance" writes:

>To put it another way, the
>``fastest type of at least 16 bits'' (a type intent) is called
>(mapped to) "short", "int", or "long" on some machines.

Eh? You have just paraphrased the standard's description of int. int should
fulfill these requirements on any platform.

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------

FFarance

Aug 21, 1995
(This is part 2 of the posting. My mail server is having problems
accepting the complete posting.)

> Path:
newsbf01.news.aol.com!newstf01.news.aol.com!uunet!in2.uu.net!eskimo!scs
> From: s...@eskimo.com (Steve Summit)

> ...


> > This solution wasn't ``just invented'', but developed over years by
> > analyzing what the *real* problem is. The nature of the solution matches
> > the nature of the problem.
>
> I disagree that it took years just to develop this solution.
>
> The first time you see a programmer struggling with elaborate
> config scripts and typedefs such as int16 and int32, it's obvious
> that a more "ideal" solution would be to introduce a full,
> general, orthogonal set of type specifiers covering size, exact
> vs. at-least, fast vs. small code vs. small data, etc., in other
> words, just about what it sounds like SBEIR is. I'm sure I
> started thinking along these lines at least five years ago,
> perhaps even before 64-bit machines were a serious consideration.
> I don't think that full, orthogonal control is a new or difficult
> idea; plenty of languages have something like it. But "ideal" is
> in quotes because it was obvious to me then, and is still obvious
> to me now, that this solution is not C.

I don't think we claim it as a ``new'' idea -- parameterized types
have been around for a long while. I think what was ``new'' was
discovering that the questions I listed above (e.g., ``what is
the name of the 64-bit type'', ``can you standardize "long long"'',
etc.) were not the questions we needed to answer. If this has been
obvious to you (and others), then my hat's off to you. My experience
with standards committees is that, sometimes, it takes several years
to formulate the right questions to answer -- especially when
dealing with new technology. Maybe you should join the committee
to help us along in other areas where we are struggling (really!).

> So what has it taken years to do, if not to come up with the
> mechanics of a solution?

My experience with good solutions is that they are obvious once
they are explained. You scratch your head and say ``I can't see
how we could think of this any other way''. Another problem that
is *not* obvious to people outside the standards committee is
solving all the related problems across *all* architectures. Most
people (based on the 100's of proposals I've reviewed) come up with
some novel syntax that (at best) expresses 80% of the solution and the
submitter believes that the remainder of the (supposedly non-creative)
work will be performed by the committee. In reality, the remaining
20% takes years to work out -- this is the hard, creative work.
As a simple demonstration, some of the not-so-obvious problems are:

- How do these types work in the promotion rules?
- What are the types of intermediate results?
- How do these types interact with existing C types?
- How do I print or scan values?
- What about preprocessor arithmetic?
- What about constants?
- What assumptions can I make with respect to performance
attributes?
- How do I write portable programs?
- How do I create temporary values?
- What about systems that want to implement the minimum
possible extensions?
- How do I teach this easily?

While these might be obvious *now*, they weren't obvious in 1992
(no one raised these issues). In fact, some of the issues in 1992 were
``what's the right mapping of char/short/int/long'' (this question
came from vendors), ``how would we write portable code if this
mapping differed'', ``just tell me right now how to declare a
64-bit type so that I can write much application code''.

> I suspect it's taken years to summon
> the will to introduce this solution. What the intervening years
> have wrought is that C, despite these awful "problems," has
> continued to grow in popularity, such that even more people who
> don't understand it are trying desperately to misuse it, and
> their voices have grown to deafening proportions, and the
> original intent of C's type system has been even more dimly
> forgotten, such that today it's possible to argue with a straight
> face that this enormous bucket of extra specification ought to be
> bolted onto C's type system.

I think the hardest part has been getting people to understand what
the real problem is. Once they understand that, they understand
the limitations of C's original type system. Even if they understand
C's type system, they now understand why it has been hard (and
getting harder) to write portable, good-performing programs. At
first glance, it appears to be an enormous bucket (probably because
the tentative syntax for declarations is long). In reality, the
SBEIR proposal makes very few, localized changes to the C Standard
(see the SBEIR paper for a list of changes).

The next hardest part is getting implementors to realize this isn't
as difficult as it looks (we'll be publishing a version of GCC that
has this feature). With respect to changing your compiler to
include SBEIR, we've presented our design on what changes you'd
need to make. The summary is:

- You have to recognize the additional keywords for
the declaration syntax.
- You have to store the parameter information in the
symbol table.
- When operating on two arithmetic types, you have to
choose a type for the intermediate result (we've given
a simple algorithm for this).
- If you choose to implement types outside your native
types (e.g., 48-bit integers), you must provide the
additional support. There are publicly available
multiprecision libraries if you're lazy.
- You must recognize the "precof()" operator (gets the
extra information you stored in the symbol table).
- You must modify "printf" and "scanf" to understand the
"?" type modifier (e.g., "printf("%?d",precof(x),x);").
- You must supply the "strtoint()" library function.
- You must supply the "<stdint.h>" header.

> Don't get me wrong. If I were designing a language, it would
> have something exactly like that "enormous bucket of extra
> specification." Although I claim that the extra level of
> precision in declaration is needed far less of the time than is
> usually assumed, it is needed some of the time, and It Would Be
> Nice If there were a complete solution. But if the elusive
> "Spirit of C" still held any real sway in determining the
> language's future, a proposal like this would be a non-starter.

I think this is in the ``Spirit of C'' because this really does
allow the programmer to ``get close to the hardware'' in a
portable way.

> C's original philosophy (like Unix's) used to involve the
> heretical repudiation of the last 10% of functionality which
> would have accounted for an extra 90% of complexity and
> development cost. It's true that the missing 10% of
> functionality has engendered a constant stream of shrill
> criticism all along, but I've come to realize that the omission
> was in fact one of the most salient features, and perhaps a
> major, unappreciated, "stealth" factor behind the language's
> (operating system's) galloping success.

I've outlined the development cost (assuming that you're a compiler
vendor). If you're speaking of application development cost, you'll
appreciate the reduction in porting cost. In your 90-10 model,
you'd expect the cost to be 10-90. But here, the additional SBEIR
features reduce application development cost. Having ported over
10 million lines of code, I can safely say that this would have
reduced much porting cost.

You're probably right that the omission was a salient feature and
contributed to the success of C over the past two decades. My guess
is that C conveniently didn't have to address this problem for most
of the architectures of the 1970's and 1980's. However, this paradigm
has reached its limitations. For C applications now, the SBEIR
features will contribute to more success over the next 10-15 years,
while not having SBEIR would add greatly to porting cost.

> The nature of the proposed solution, though, does very definitely
> match the nature of the problem. The problem is that people
> continue to wish that C were something it is not, not realizing
> that if C were what they thought they wanted it to be it would
> never have succeeded and they wouldn't be using it in the first
> place. And the solution is in tune with those wishes.

I still hear people complain about ANSI C being like Pascal (a
``police-state language'') because of function prototypes. These
people still like the K&R function definitions. I think prototypes
have greatly helped most programmers (to the detriment of people
being lazy about casting properly).

(Continued in the next posting.)

FFarance

Aug 21, 1995
(This is part 3 of the posting. My mail server is having problems
accepting the complete posting.)

> Path:
newsbf01.news.aol.com!newstf01.news.aol.com!uunet!in2.uu.net!eskimo!scs
> From: s...@eskimo.com (Steve Summit)
> ...

> > BTW, bit/byte ordering/alignment and
> > representation (e.g., two's complement) will be addressed in separate
> > proposals.
>
> This note highlights another significant aspect of what the real
> problem really is. If all we cared about was computation, we
> obviously wouldn't care about alignment and byte order. If we do
> care, it's obvious that what we're worried about is that great
> bugaboo: conforming to externally-imposed storage layouts.

Yes, this is an important issue. However, it was separated from
SBEIR because ordering/alignment is an orthogonal issue. As you
point out, ordering/alignment issues are only of concern when
conforming to externally-imposed layouts (e.g., shared memory,
data files, network packets).

> Everybody desperately wants to do binary I/O in an "efficient"
> way, namely by using fread and fwrite to read and write
> structures which are used both for internal manipulation and
> external interchange. And since some people, at least, also want
> this binary I/O to be portable, they end up wanting complete
> control over structure layout just so that they can get the I/O
> to come out right.

Since "fread" and "fwrite" aren't the only I/O mechanisms (shared
memory, memory mapped I/O, system calls, port I/O), you'd have to
fix all the I/O mechanisms to get this right. The ordering/alignment
features apply to the type, not the I/O operation, thus simplifying
its use. Maybe what you want is a ``binary'' "printf()" function.
However, that wouldn't work for a shared memory data structure (use
a ``binary'' "sprintf()" rather than an assignment?).

> It occurs to me that we could probably address 90% of what people
> really want when they complain about type sizes, in a more direct
> way, in a way no more radical than the SBEIR proposal, and
> perhaps less disruptively, by doing one or both of the following:
>
> 1. Give programmers more control over the alignment and
> packing of structures, perhaps by extending the existing
> bitfield notation.

Interesting idea, but before you talk syntax, what is the conceptual
model and semantics? Syntax is the last thing that gets decided.

> 2. Build the rudiments of direct, binary I/O into the
> language, with input and output statements which could
> read and write data structures with as much convenience
> to programmers as fread and fwrite, but with an
> opportunity for appropriate translations to be
> automatically introduced to conform to externally-imposed
> storage layouts.

Sounds like you want to modify the "%c" format in "printf" to
solve this problem ("%c" in "scanf" already takes a precision).
Have you considered submitting a proposal on this? If you need
info in submitting a proposal or the location of the SBEIR paper,
send me E-mail at "fr...@farance.com".

> Number 2 is certainly radical, and it breaks another fundamental
> postulate of the language (namely that I/O is not built in), and
> I don't expect it will be taken seriously, but I submit that it
> is no more radical or heretical than SBEIR.

I don't think modifying "%c" in "printf()" to take a precision
is all that radical. However, it only solves the stream I/O
problem, not other types of I/O.

> Neither 1 nor 2
> above propagates sticky problems into parts of the language
> unconcerned with external storage layouts -- neither necessitates
> massive additions to the various sets of promotion rules; neither
> induces screaming nightmares with respect to printf formats.

If you are only solving ordering/alignment problems, then promotion
rules aren't an issue. Number 1 above doesn't solve the range
problem because:

- There is no minimum maxima. Bit fields are only
portable up to 16 bits.
- Bit fields aren't addressable.
- Bit fields don't address performance issues: give me
the fastest type of at least 16 bits (possibly 16, 32,
or 64 bits).
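
The addressability point is easy to demonstrate with ordinary C; any
conforming compiler must reject the last line of this minimal example
(the example is mine, not from the paper):

    struct flags { unsigned int mode : 12; } f;

    unsigned int *p = &f.mode;      /* constraint violation: the unary &
                                       operator may not be applied to a
                                       bit-field member */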

In developing your solution, these are the kinds of problems that
take a while (possibly years) to develop correctly. I think you
should start with a conceptual model first, followed by semantics,
then with syntax. Starting with syntax first is like writing a
huge program without a design. After you get your conceptual
model and semantics defined, send them out for review -- there's
no point developing syntax and standards wording if there are
functional and design problems.

> ...


> The real shame is that none of this is, or ought to be, as
> important as it now seems to be.

I think this has become important now because of the diverse
architectures.

> Most code that worries about
> the exact sizes of things would be better off if it didn't. Much
> binary I/O that worries about external storage layout would be
> better off if it used more flexible representations (such as
> text).

Yes, this might be possible in some applications, but binary I/O is
important for performance reasons -- this is why we use binary
integers rather than BCD or text strings as numbers. And there
are many systems applications that have binary interfaces (e.g.,
networking protocols). Wishing binary away won't make it go away.

> If we capitulate to the demands of already-spoiled
> programmers, by giving them all this extra control that they'd
> be better off not needing and not using, we'll not only be
> contributing to unnecessary tight coupling and fragility of code,
> but we'll also be seriously degrading the simplicity and
> learnability of the language.

I would claim the opposite is true. Even good programmers need
this. Using SBEIR types gets the programmers to think about the
intent of the type (and compilers can check this, too). A similar
argument would have been made about function prototypes: this is
a hardship because it really forces the programmer to commit to
an interface to a function. While non-SBEIR types and K&R-style
functions allow the programmer to *initially* develop code marginally
faster (the compiler doesn't complain as much), in the end they are
drains on programmer productivity and porting cost.

> I realize that this is a
> slippery-slope argument, but if something like SBEIR goes into
> C9X, and even though the new exact and fast and atleast keywords
> will be strictly optional, most new programmers learning C will
> be taught that for every variable they declare they should make a
> choice between at-least/exact and small/fast, and though they
> won't realize it, their annoyance at being told to do so, and
> their lingering suspicion that perhaps they shouldn't have to,
> will be exactly what they'd have felt if they'd been learning
> PL/I.

Choosing "atleast" vs. "exact" is a semantic issue which they
*really* should understand. Choosing "small", "fast", or
unoptimized is a performance tuning issue. If beginning programmers
are writing code without much understanding of its intent, I'd
suspect that you will have a very expensive development cycle.
However, beginning programmers aren't writing big systems where
performance is critical. Beginning programmers can use the
traditional C types for learning, but should advance to SBEIR
as their understanding of the language develops -- similar to
programmers learning "getchar" and "putchar", graduating to
"printf" and "scanf", then "fread" and "fwrite", and finally
to the operating system's "read" and "write". You don't start
beginning programmers with I/O via sockets.

Thanx for your comments.

FFarance

Aug 21, 1995
> From: Lawrence Kirby <fr...@genesis.demon.co.uk>

>
> In article <417oh5$j...@newsbf02.news.aol.com> ffar...@aol.com
> "FFarance" writes:
>
> >To put it another way, the
> >``fastest type of at least 16 bits'' (a type intent) is called
> >(mapped to) "short", "int", or "long" on some machines.
>
> Eh? You have just paraphrased the standard's description of int. int
> should fulfill these requirements on any platform.

No it doesn't. The C Standard says that "int" should have at least 16
bits precision (there is no performance requirement for the C Standard).
As compiler implementors ``bind'' (i.e., implement) the C language to
their platforms, they are free to choose any native data type as long
as the following requirements are met for "int":

- The precision is at least 16 bits.
- The ordering char <= short <= int <= long is held true.

In fact the type ``fastest type of at least 16 bits'' is mapped to
a "long" on some systems (e.g., some 64-bit systems). On other systems,
there is no C type (i.e., char/short/int/long) for this specification --
there is a C type for ``type of at least 16 bits'' though.

The confusing part here is that the C Standard doesn't require "int" to
be mapped into a specific SBEIR type, even though we talk about the
C Standard and SBEIR types in similar terminology (i.e., ``an integer
type of at least N bits''). Other standards (e.g., POSIX) have similar
terminology about minimum precision. The following diagram might help
illustrate these concepts.

Level 3   C Standard           "int" is at least 16 bits

          --- binding to vendor implementation ---

Level 2   vendor's compiler    "int" is "atleast:32" -- SBEIR type

          --- binding to hardware ---

Level 1   hardware             32-bit value

When the vendor implements the "int" type, it may have the SBEIR type of
``atleast:32'' -- this type satisfies the requirements for the C
Standard (assuming the ordering of types is maintained -- see above).
In turn, this might be implemented as a 32-bit value.

In summary:

- The specification of "int" in Standard C doesn't require
the implementation to choose an SBEIR type of ``fast
atleast:16''.

- The type "int" might not be the fastest type of at least 16
bits. This worked in many 16-bit and 32-bit systems (when there
were fewer choices), but it doesn't apply in general. One cannot
assume that "int" has these qualifications just as "int" doesn't
represent the size of a pointer or doesn't represent the native
word size.
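
None of this can be deduced portably from the language itself; the most
a program can do is report what a particular implementation chose. A
small sketch using only standard headers:

    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        /* Report the ranges and sizes this implementation bound to the C types. */
        printf("int:  %d..%d\n", INT_MIN, INT_MAX);
        printf("long: %ld..%ld\n", LONG_MIN, LONG_MAX);
        printf("sizeof(int) = %lu, sizeof(long) = %lu\n",
               (unsigned long)sizeof(int), (unsigned long)sizeof(long));
        return 0;
    }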

Alan Stokes

Aug 21, 1995
In <412dkr$7...@newsbf02.news.aol.com> ffar...@aol.com (FFarance) writes:
>I've covered this point several times on "comp.std.c". You can read my
>SBEIR (specification-based extended integer range) proposal in:

> ftp://ftp.dmk.com/DMK/sc22wg14/c9x/extended-integers/sbeir.*

I've read your proposal, and in general think it excellent. However,
you only allow for precision to be specified as a number of bits.
I frequently want to be able to specify precision as a range (bits is
fine for bit twiddling, but if I just want to store a value between
0 and SOME_MACRO efficiently I'm stuck).

I would suggest either extending your notation to allow a range to be
specified, or at the very least add a (preprocessor) operator to take
the integer value of log to the base 2, so I can convert ranges to bits
myself.
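
For what it's worth, a crude version of that conversion can already be
faked with an ordinary macro; this sketch rounds the range up to a
common width rather than computing an exact log2, and the BITS_FOR
name is invented:

    #include <stdio.h>

    /* Smallest common width (8/16/32 bits) whose unsigned range holds
       0..max; an integer constant expression, so it is usable in #if. */
    #define BITS_FOR(max)  ((max) <= 0xFFUL   ?  8 : \
                            (max) <= 0xFFFFUL ? 16 : 32)

    #define SOME_MACRO 40000UL

    int main(void)
    {
        printf("values 0..%lu need a %d-bit type\n",
               (unsigned long)SOME_MACRO, BITS_FOR(SOME_MACRO));
        return 0;
    }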

(And to head off any flamers out there, yes I know where to find Pascal &
Ada, but no I don't want them.)

--
Alan Stokes (al...@rcp.co.uk)
RCP Consultants Ltd
Didcot, UK

Thad Smith

Aug 21, 1995
In article <41a0ov$9...@newsbf02.news.aol.com>,

ffar...@aol.com (FFarance) wrote:
> From: Lawrence Kirby <fr...@genesis.demon.co.uk>

> In article <417oh5$j...@newsbf02.news.aol.com> ffar...@aol.com
> "FFarance" writes:

> > To put it another way, the ``fastest type of at least 16 bits''
> > (a type intent) is called (mapped to) "short", "int", or "long"
> > on some machines.

> Eh? You have just paraphrased the standard's description of int.
> int should fulfill these requirements on any platform.

> No it doesn't. The C Standard says that "int" should have at
> least 16 bits precision (there is no performance requirement for
> the C Standard). As compiler implementors ``bind'' (i.e.,
> implement) the C language to their platforms, they are free to
> choose any native data type as long as the following requirements
> are met for "int":

> - The precision is at least 16 bits.
> - The ordering char <= short <= int <= long is held true.

The standard also says "A 'plain' int object has the natural size
suggested by the architecture of the execution environment..." [ANSI
Classic 3.1.2.5]. While there is no guarantee that the natural size
is the fastest type, I expect it is normally true. The trick then is
to determine what natural size is suggested by the architecture when

- addressable unit of memory is A bits
- memory bus width is B bits
- instructions support literals up to C bits
- register size is D bits
- instructions support double precision 2*D-bit operations.

Most of the embedded processors I work with have a natural size of 8
bits, which doesn't even qualify for an int.

Thad

Zefram

Aug 21, 1995
Steve Summit <s...@eskimo.com> wrote:
[a bit snipped]

>Precisely. The real "problem" is poor programmers. Can we solve
>this problem by fiddling with the type system? Even if we could,
>should we?

This `poor programming' is the result of the limitations of C's current
type system. There is currently no portable way to get a type greater
than 32 bits, which *is* a problem. (Observe the "llseek" function in
Linux -- a bad solution, in my opinion.) Programmers' assumptions about
type sizes are inevitable when there is such a limited choice of
types. The SBEIR proposal allows programmers to clearly, and portably,
document the type they actually want in the code.
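
A minimal sketch of the kind of workaround the missing type forces on
portable code today; the struct and function names are invented, and it
assumes only that unsigned long holds at least 32 bits, which the
standard guarantees:

    #include <stdio.h>

    struct u64 { unsigned long hi, lo; };   /* each half holds 32 value bits */

    static struct u64 u64_add(struct u64 a, struct u64 b)
    {
        struct u64 r;
        r.lo = (a.lo + b.lo) & 0xFFFFFFFFUL;
        r.hi = (a.hi + b.hi + (r.lo < a.lo)) & 0xFFFFFFFFUL;    /* carry in */
        return r;
    }

    int main(void)
    {
        struct u64 x = { 0x0UL, 0xFFFFFFFFUL };
        struct u64 y = { 0x0UL, 0x1UL };
        struct u64 z = u64_add(x, y);

        printf("%08lx%08lx\n", z.hi, z.lo);     /* prints 0000000100000000 */
        return 0;
    }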

>But this is madness! There has never been, and there was never
>supposed to be, a single, defined mapping between char/short/int/
>long and any set of specific precisions! That simply is not how
>C's type system works.

That would be appropriate for other high-level programming languages.
Ideally, we would only need one integer type, which would be big enough
for everything we need to do on that platform. But C is a practical
programming language, intended for low-level systems programming and
hardware interfaces. This type of use requires types of known size.
It is a failing of C that it guarantees very little about type sizes.

[another big chunk, much of which I agree with, snipped]


> I realize that this is a
>slippery-slope argument, but if something like SBEIR goes into
>C9X, and even though the new exact and fast and atleast keywords
>will be strictly optional, most new programmers learning C will
>be taught that for every variable they declare they should make a
>choice between at-least/exact and small/fast, and though they
>won't realize it, their annoyance at being told to do so, and
>their lingering suspicion that perhaps they shouldn't have to,
>will be exactly what they'd have felt if they'd been learning
>PL/I.

This, I think, is a very minor problem. For normal use, int provides
an adequate integer type. int is guaranteed to be at least 16 bits,
which is good for many purposes, and long can be used where a larger
counter is required. It would be silly to start using the
atleast/exact/small/fast notations for simple cases. I suspect that
many people learning C will actually be taught to *avoid* the SBEIR
specifications.

-zefram

Lawrence Kirby

Aug 21, 1995
In article <41a0ov$9...@newsbf02.news.aol.com> ffar...@aol.com "FFarance" writes:

>> From: Lawrence Kirby <fr...@genesis.demon.co.uk>
>>
>> In article <417oh5$j...@newsbf02.news.aol.com> ffar...@aol.com
>> "FFarance" writes:
>>
>> >To put it another way, the
>> >``fastest type of at least 16 bits'' (a type intent) is called
>> >(mapped to) "short", "int", or "long" on some machines.
>>
>> Eh? You have just paraphrased the standard's description of int. int
>> should fulfill these requirements on any platform.
>
>No it doesn't. The C Standard says that "int" should have at least 16
>bits precision (there is no performance requirement for the C Standard).

6.1.2.5:

"A ``plain'' int object has the natural size suggested by the archetecture
of the execution environment (large enough to contain any value in the range
INT_MIN to INT_MAX as defined in the header <limits.h>)."

I maintain you are just paraphrasing the standard's description of int.
There may be some debate about what 'the natural size' means but it should
certainly relate closely to the fastest type.

John R. Mashey

Aug 21, 1995
In article <412dkr$7...@newsbf02.news.aol.com>, ffar...@aol.com (FFarance) writes:
|> > |> In <40tdmr$j...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:
|> >|>
|> > |> >1) long longs are not part of ANSI C ... but probably will be, since:
|> > (lots of people have implemented it, if not previously, as instigated
|> > by 64-bit working group in 1992).
|>
|> The "long long" type is unlikely to be included in C9X. Although the
|> problem has been discussed in the EIR (extended integer range) working
|> group of NCEG (numeric C extensions group -- X3J11.1) for several years,
|> over the past two years it has been recognized as a faulty solution.

It is informative to hear that this has been recognized over the last two
years as faulty ... but there is a *serious* problem here...

SO, WHEN DO WE GET A NON-FAULTY SOLUTION?
I.e., there are proposals. When does one get accepted *enough* that
vendors dare go implement it and expect it will actually persist
(or close enough) in the final standard? (For example, years ago,
"volatile" was clearly known to be coming soon enough (1985/1986)
to save those of us worried about serious optimization, even though
the standard didn't get approved until later.)
=========
I'd like to go through some background, facts, and then a few opinions, to
observe that in the effort to get a "perfect" solution, we are now in the
awkward position of lacking a seriously-necessary feature / direction
in the standard, with the usual result: individual vendors go implement
extensions, not necessarily compatible. This particular one is *very*
frustrating, since it is not rocket science, but rather predictable.
Note: none of this is meant to be criticism of people involved in the
standards process, which is inherently a frustrating, and often thankless
task. It is meant as a plea to balance perfectionism versus pragmatism,
of which both are needed to make progress. It is also a plea that people
involved in this *must* have a good feel for likely hardware progress,
especially for a language like C that has always been meant to make
reasonably efficient use of hardware. While I have no special love for
"long long", especially its syntax, and while there are plenty of issues that need to be dealt with, and while I hardly believe C's type system is perfect ...

I believe that we have a *serious* problem in 1995, to *not* have multiple
implementations of compilers accepting a 64-bit integer data type, such that
it was already well-lined up to become part of the standard.
The situation we are in, is like where we would have been ~1978, had we
not already had "long" in the language for several years. That is:
a) PDP-11 UNIX would have been really ugly in the area of file pointers,
since int hadn't been big enough for a long time, that is, the
16-bit systems needed first-class 32-bit data, regardless of
anything else. Structs with 2 ints really wouldn't have been
very pleasant. Likewise, limiting files to 64KB wasn't either.
b) Preparing cross-compilers and tools for 32-bit systems would have
been more painful, that is, it was good to have first-class data
on the 16-bit system to prepare the tools, and when done, to get
code that made sense on both 16- and 32-bit systems.
c) It would have been far more difficult to maintain portability
and interoperability between 16- and 32-bit systems, that is,
one could both write portable code if one was careful, and
especially, one *could* provide structs for external data that
looked the same, since both 16- and 32-bit systems could
describe 8-, 16-, and 32-bit data.
Of course, this was in the days before people had converted as much code to
using typedefs, which made it pretty painful.

Deja vu... in 1995:
a) There is a great desire by many to have a UNIX API for 64-bit
files on 32-bit systems (called the Large File Summit), since
2GB file limits are behind the power curve of disks these days. This
is no problem on 64-bit systems, and it's not really too unclean
on 32-bit systems (if you've added a first-class 64-bit type
and can typedef onto it. Yes, some older code breaks ... but
well-typedefed code is OK.) Some people implemented this in 1994.
b) Every major RISC microprocessor family used in systems either already
has a 64-bit version on the market [1992: MIPS & DEC], or has
one coming soon [1995: Sun UltraSPARC, IBM/Moto PPC620, 1996: HP
PA-8000]. Hence, some people have already done compilers that have
to run on 32-bit machines, to produce code for 64-bit machines ...
just like running PDP-11 compilers to produce VAX code.
c) Right now, without a 64-bit integer datatype usable in 32-bit C,
there is the same awkwardness we would have had, had we not had
long back in the 16->32 days.
(But what about 128-bits: I'd be pleased to have a 128-bit type as well ...
however, a pragmatic view says: we have the 64-bit problem right now, we've had it for several years; we won't have the 128-bit problem for quite a few years.
Based on the typical 2 bits every 3 years increase in addressing, a gross
estimate is: 32 bits/2 bits * 3 years = 48 years, or 1992+48 = 2040.
Personally, I'm aggressive, so might say year 2020 for wanting 128-bit
computers ... but on the other hand, there are some fairly serious penalties
for building 128-bit wide integer datapaths, and there are some other impediments to making the 64->128-bit transition as smooth as the 32->64 one;
In any case, I will be very surprised to see any widespread use, in general-purpose systems, of 128-bit-wide integer datapaths, in 10 years (2005).
I wouldn't be surprised to see 128-bit floating-point in some micros, but
128-bit integers would indeed surprise me. Hence, I'd much rather have a
simple solution for 64-bit right now. Of course, a plan that allows something
sensible for the bigger integers over time is goodness.

BACKGROUND
If 64-bit microprocessors are unfamiliar, consider reading:
John R. Mashey, "64-bit Computing", BYTE, Sept 1991, 135-142. This explained
what 64-bit micros were, the hardware trends leading to this, and that
there would be widespread use of them by 1995 (there is). While a little
old, most of what I said there still seems OK.

SOME FACTS
1) Vector supercomputers have been 64-bit systems for years. One may argue
that these are low-volume, and for several reasons (word-addressing on
CRAYs, non-existence of 32-bit family members, etc, stronger importance of
FORTRAN, etc), some people might argue that these are not very relevant
to C ... but still, there are several $B of hardware installed, and,
for example, CONVEX has supported long long as 64-bit integer for years.
CRAY made int 64 bits, and short 32 bits.

2) In 1992, 64-bit microprocessors became available from MIPS (R4000) and
DEC (Alpha), and started shipping in systems. For {SPARC, PPC, HP PA},
the same thing happens in 1995 or 1996 - the chips have all been announced;
some people guess the Intel/HP effort appears in 1998.

3) From 1992 thru current, I estimate there must be about $10B installed base
of 64-bit-capable microprocessor hardware already sold. Most of it is
still running 32-bit OSs, although some of the 32-bit OS code uses 64-bit
integer manipulations for speed.
I *think* >$1B worth is already running 64-bit
UNIX + programming environments, i.e., DEC UNIX and SGI IRIX 6 (shipped 12
months ago).
While some 64-bit hardware will stay running 32-bit software for a while,
new OS releases may well convert some of the existing hardware to 64-bit OSs, and an increasing percentage of newer systems will run the 64-bit OSs,
especially in larger servers. [2GB/4GB main memory limits do not make
the grade for big servers these days; while one can get above this on
32-bit hardware, it starts to get painful.]

4) DEC UNIX is a "pure" 64-bit system, that is, there is no 32-bit programming
model, since there was no such installed base of Alpha software, i.e.,
that was a plausible choice for DEC. SGI's IRIX 6 is a "mixed 64/32" model, i.e., it is a 64-bit OS that supports both 32- and 64-bit models, and will continue to,
i.e., that is not a transitional choice, as we believe that many applications
will stick in 32-bit for a long time. IRIX 5 & 6 both support a 64-bit
interface to 64-bit file systems in 32-bit user programs, i.e., somewhere
underneath is a long long, although carefully typedeffed to avoid
direct references in user code.
DEC UNIX proves you can port a lot of software to 64-bit; IRIX proves you
can make code reasonably portable between 32- and 64-bit.

Both of these systems use the so-called LP64 model, i.e., at this instant,
the total installed base of 64-bit software (with possible exception of
CRAY T3D) uses LP64:
                     sizes in bits
Name    char  short  int   long  ptr   long long   Notes
ILP32     8     16    32    32    32      64        many
LLP64     8     16    32    32    64      64        longlong needed
LP64      8     16    32    64    64      64        DEC, SGI
ILP64     8     16    64    64    64      64        (needs 32-bit)

(The comments mean: in LLP64 (Longlong+Pointer are 64), you need *something* to
describe 64-bit integers; in ILP64 (integer, long, pointer are 64) you'll
want to add some other type to describe 32-bit integers. I didn't invent
this nomenclature, which is less than elegant :-)
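
To make the table concrete, here is a rough sketch of how a 64-bit type
might be mapped under these models; the int64 spelling and the
model-selection macros are illustrative only, not anything blessed by a
standard or by any particular vendor:

/* Hypothetical portability header: pick a 64-bit integer type for
   whichever model the compiler presents.  __LP64__ and __ILP64__
   stand in for whatever macros a given vendor actually defines. */
#if defined(__LP64__) || defined(__ILP64__)
typedef long               int64;     /* long is already 64 bits   */
typedef unsigned long      uint64;
#else                                 /* ILP32 or LLP64             */
typedef long long          int64;     /* relies on the extension   */
typedef unsigned long long uint64;
#endif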

5) In 1992, there was a 6-month effort among {whole bunch of vendors} to see
if we could agree on a choice of {LLP64, LP64, ILP64}. There was *serious*
talent involved from around the industry, but at that time, we could not
agree. As it turns out, it probably doesn't matter much for well-typedeffed
code, i.e., newer applications. Some older code breaks no matter what you
choose, and some older code works on 1-2 of the models and breaks on the
other(s), with the breakage depending on the specific application. What we
did agree on was (1) Supply some standard typedef names that application
vendors could use, if they wanted, and if they didn't already have their
own set of typedefs. Some vendors have done this. (2) Do long long as a
64-bit integer datatype (NOT as a might-be-any-size >= long), so we'd
at least have one. NOTE: this is more for the necessities of ILP32;
LP64 and ILP64 could get away without it, but the problem is in dealing
with 64-bit integers from ILP32 programs ... similar to the 16/32-bit days.
As noted, there were several people in this group also involved in ANSI C,
and we asked them about the wisdom of doing this, and were told, unambiguously,
that we might as well go ahead and do it. Whether it was good or not, it
was not for lack of communication...

6) Now, there is a new 64-bit initiative to get some 64-bit API and data
representation issues settled. The first part (API), is crucial, and ISVs
really want it badly; that is, some vendors have already done 64-bit ports,
but a lot more are getting there, and we're starting to get into the "big"
applications that have masses of software, and not surprisingly, the ISVs
do not want to have to redo things any more than they need to.

OPINIONS:
1) There is a right time, and two wrong times, to standardize something.
If it is standardized too early, before some relevant experience has accumulated,
bad mistakes can be made. If it is standardized too late, a whole bunch of people
will have already done it, likely in more-or-less incompatible ways, especially
in the subtle cases, or will have gotten into a less-than-elegant solution,
basically out of desperation to get something done. I'd distinguish between
two cases:
a) Add an extension because it is cool, because customers have
asked for it, because it helps performance, etc, etc ... or
because some competitor puts it in :-)
b) Add an extension because fundamental external industry trends
make it *excruciatingly painful* to do without the extension or
some equivalent.
I think "long long" fits b) better than a); people aren't doing this for fun;
they are doing it to fit the needs of straightforward, predictable,
hardware trends that mostly look like straight lines on semi-log charts,
with a transition coming very similar to that which occurred going from
PDP-11 to VAX, i.e., not rocket science, not needing brilliant innovations.

2) So, when is the right time to have at least gotten a simple data type
available to represent 64-bit integers (in a 32-bit environment, i.e.,
assuming that long was unavailable)?

1989: nobody would even admit to working on 64-bit micros.
1990: MIPS R4000 announced (late in year).
1991: Various vendors admit to 64-bit plans; 2GB (31-bits) SCSI disks starting
1992: 64-bit micros (MIPS, Alpha) ship in systems from multiple vendors
1992/1993: DEC ships OSF/1 (I can't recall whether late in 1992, or in 1993)
1994: SGI ships IRIX 6 (64/32-bit)
1996: IBM/Motorola PPC620, Sun UltraSPARC, HP PA8000 out in systems;
(PPC620 & UltraSPARC might be out in 1995, but for sure by 1996).
1996: ?
1998: ? Intel/HP 64-bit chip ??

From the above, it sure looks to me like we really needed to get *something*
for a 64-bit datatype in C (again, general agreement, not in a formal
standard), usable in ILP32 environments:
1991: would have been wonderful, but too much to expect.
1992: more likely, and there were some people with experience, and there
were several real chips to help check speed assumptions.
1993: starting to get a little late
1995: too late to catch most of the effort.

Without going through the sequences in detail, the usual realities of
adding this kind of extension usually mean that somebody is adding the
extension to C a year before it would ship (on 32-bit system), and probably
2 years before there's a 64-bit UNIX shipped. This means there were several
companies with committed development efforts in 1991/1992.

So, in summary: it would have been really nice if we could have gotten
something agreeable (that is not blessed as standard, that always takes longer,
but with some agreement of intent) in 1991, or at least in 1992, late enough
for people to have some experience, but early enough to get something
consistent to customers that could still have a chance of being blessed
later on. Proposals in 1995 ... are late enough that many vendors will
have already done the compiler work that they need to ship 64-bit
products in 1996 or 1997... Of course, this is hindsight, and I do feel
a little bad for not pushing harder on this in 1991.


|> - After much analysis, the problem is not ``can I standardize
|> "long long", or how to I get a 64-bit type, or what is the name
|> of a 64-bit type'', but ``loss of intent information causes
|> portability problems''. This isn't an obvious conclusion. You
|> should read the paper to understand the *real* problem.

Hmmm. Having started with C in 1973, from Dennis' 25-pager (that's all there
was), and having gone thru the 16- to 32-bit transitions, and Little-Endian ->
Big-Endian transitions, and being old enough to be at least acknowledged in
the first K&R C book, and having managed various UNIX ports,
and worked on C compilers, and having moved applications around, and having
helped design RISC micros with some strong input from what real C code
looked like, and having helped design the first 64-bit RISC micro ...
I think I understand "loss of intent", which was certainly a major topic
of the 1992 series of meetings. (we just couldn't agree on which intents
were more common or more important.)

One more time: I claim "how I get a 64-bit type" IS a problem; I don't
think it's the only problem, and there may well be more general ways to
handle these issues (and as soon as I dig up gnu zip so I can look at
the files, I'll look at the SBEIR files).

BUT, I CLAIM THAT IT IS A *REAL*
PROBLEM WHEN $9B OF COMPUTERS CAN'T EVEN USE A SIMPLE C INTEGER DATATYPE TO
DESCRIBE THEIR OWN INTEGER REGISTERS. ($9B = $10B - $1B running 64-bit OSs).

|> - The use of "long long" causes more harm (really!) because it
|> creates more porting problems. As a simple example, while we

Causes more harm than what? Remember, some of us had no choice but to
figure out something to do in 1991 or 1992, to get 32-bitters ready to
deal with 64-bit. In any case, whether it causes more harm or not,
a whole bunch of us found some 64-bit integer data type *necessary*.

|> might believe that "long long" is a 64-bit type, what happens
|> when you move the code to 128-bit machines? The "long long"

It is *very* unlikely that there will be any 128-bit-integer CPUs used in
general-purpose systems in the next few years;
it would be nice if we could handle them, and the
64->128 transition earlier in the sequence than we did this time.
I'd be delighted if a better type system were in place well before the
time somebody has to worry about it. [I expect to have retired long before that,
but I do have colleagues young enough that they will have to worry about it :-)]

We tell everybody to use typedefs anyway; and some do;
we do our best to typedef all of the APIs so people use the right things;
would it have made people happier to have called this int64_t or __int64_t?
But, in any case, as far as I can tell, anyone who is using this is just
treating it as a 64-bit integer. If somebody is doing something else,
I'd be interested in hearing it.
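
A minimal sketch of what "typedef all of the APIs" looks like in practice --
the names here are made up for the example, not a claim about any particular
vendor's headers:

/* Hypothetical system header: application code declares bigoff_t and
   calls the 64-bit seek routine; only this header ever spells out
   "long long", so the underlying type can change without touching
   user source. */
typedef long long bigoff_t;

extern bigoff_t lseek64bit(int fd, bigoff_t offset, int whence);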

|> type will probably map into the X or Y typedef above. This
|> will cause porting problems because whatever "long long" is
|> mapped into, some will believe it is (and use it as) ``the
|> fastest type of at least 64 bits'' and others will believe it
|> is ``exactly 64 bits''. Thus, in the port to 128-bit machines,
|> we have to track down these implicit assumptions because
|> programmers
|> *rarely* document their intent (e.g., ``I want the fastest 32-bit
|> type'') and, mostly, they believe the type itself documents
|> intent (!). This is how porting problems are created.

Like I say, I am *seriously* worried about supporting 64-bit integers on
32-bit machines, and *seriously* worried about source compatibility
between 32- and 64-bit machines ... 128-bit machines are far away, and
citing them as a big concern isn't a big help right now, although any
major change to the type scheme should certainly be played off versus the
realities, especially before all of us old fogies who've actually gone
through 2 factor-of-2-up-bits changes are out of this business :-)


|> > c) Standards committees may well need to sometimes invent new
|> > things [like, when volatile was added years ago].
|>
|> This solution wasn't ``just invented'', but developed over years by
|> analyzing what the *real* problem is. The nature of the solution matches
|> the nature of the problem. BTW, bit/byte ordering/alignment and
|> representation (e.g., two's complement) will be addressed in separate
|> proposals. The SBEIR proposal only addresses range extensions.

Sorry, I didn't mean to imply that committees invented random features on the
spur of the moment, but rather sometimes had to create features found in
few, if any existing implementations. I.e., "invent" was not a pejorative
in any way.

|> > e) Again, as I said before, the 1-2 members of the 1992 group were
|> also
|> > in the ANSI C group ... were where I got the opinion above
|> from,
|> > i.e., don't let the non-existence of long long in the standard
|> > stop you from making progress - it is better to do something
|> > consistent.
|>
|> In 1992, that was probably a reasonable opinion. Since then we understand
|> the problem and have solutions being worked now.

Again ... if the solutions are being worked on now ... they are too late,
I'm afraid.

|> I think if we could have fixed "long long", even with a 90% solution,
|> we would have done it. Among the reasons for not including "long long"
|> are: we'd have to solve this problem again 10 years from now when people
|> were asking for "long long long" for their 128-bit machines; "long long"
|> causes more portability problems *across different architectures* than
|> it helps. Years ago, many people wondered out aloud if we could find
|> a ``right'' solution that solved the problem for once and all. The
|> SBEIR proposal is one solution.

We agree on lots of things; I don't think long long solves all the problems.
I'd hope there's something better for
128-bit than long long long ... but I am really concerned that the common
law "the best is the enemy of the very good" is in operation here.

I think I have good reason to believe that 128-bit-integer machines are
25 years away, i.e., longer than the existence of C...

Meanwhile, $9B (and growing fast) worth of computers ... and having long long,
*demonstrably* has helped a bunch of porting efforts already.

Clive D.W. Feather

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In article <jZLOwA7...@rcp.co.uk>, Alan Stokes <al...@rcp.co.uk> wrote:
> I frequently want to be able to specify precision as a range (bits is
> fine for bit twiddling, but if I just want to store a value between
> 0 and SOME_MACRO efficiently I'm stuck).
>
> I would suggest either extending your notation to allow a range to be
> specified, or at the very least add a (preprocessor) operator to take
> the integer value of log to the base 2, so I can convert ranges to bits
> myself.

I thought the proposal already contained two macros giving the integers
either side of log-base-2 of the argument (I don't have a copy to hand).
Near the end, together with the macros like EIR_IS_FAST.

--
Clive D.W. Feather | If you lie to the compiler,
cl...@demon.net (work, preferred) | it will get its revenge.
cl...@stdc.demon.co.uk (home) | - Henry Spencer

FFarance

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
> From: al...@rcp.co.uk (Alan Stokes)

>
> I've read your proposal, and in general think it excellent. However,
> you only allow for precision to be specified as a number of bits.
> I frequently want to be able to specify precision as a range (bits is
> fine for bit twiddling, but if I just want to store a value between
> 0 and SOME_MACRO efficiently I'm stuck).

In the SBEIR paper, the EIR macros include "EIR_LG2CEIL" which takes
the logarithm base 2 and truncates *upward* (the ``ceiling'' function).
You might use it as:

signed int atleast:EIR_LG2CEIL(SOME_MACRO) X;

FFarance

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
> From: ma...@mash.engr.sgi.com (John R. Mashey)
> ... (an excellent article with good historical references)

John-

I, too, learned C from the same 25-page handout (starting in 1976) and
worked on many versions of UNIX kernels and C compilers for varying
architectures. Your history closely follows the history I describe in the
SBEIR paper. I, too, draw similar conclusions: the problems of the 16->32
transition are the same all over again in 32->64.

There is one aspect of the history that you are missing which might shed
some light on the faults of "long long". Whenever you have type X which
is useful for all the applications for the future (say, the next 3-8
years)
you don't worry too much about it because you can always ``cast up'' to
it for safety (also known as the ``largest type we care about''). In the
early 1970's, "int" was fine. Later on we realized we needed more and
"long" was added. After that some vendors added "long long". The lure
for "long long" is that it has the same appeal that the others had: they
were useful because it encompasses everything we care about ... for now.

Here's the interesting part: look at all the code that used those types
after they were ``overtaken'' by larger types. You'll find that unless
you are using the smallest type (always "char") or the largest type
("long long" as you suggest), all the types in the middle become
portability nightmares. Thus, "long long" is attractive *now*, but will
cause problems with 128-bit architectures. 32-bit machines started to
arrive around 1978 and 64-bit machines around 1991 (my dates are
approximate). 128-bit machines will become available around 2004.

From the perspective of WG14, we expect to complete C9X around 1999. It
seems silly to solve the same problem 10 years from now. Also, the
portability problems *greatly* increase with the use of "long long" even
if you restrict yourself to 16-bit, 32-bit, and 64-bit architectures.

John R. Mashey

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In article <41cv0r$p...@newsbf02.news.aol.com>, ffar...@aol.com (FFarance) writes:

|> nightmares. Thus, "long long" is attractive *now*, but will cause
|> problems
|> with 128-bit architectures. 32-bit machines started to arrive around 1978
|> and 64-bit machines around 1991 (my dates are approximate). 128-bit
|> machines will become available around 2004.

I agree with many of the comments before this, but we still have the problem
upon us already. Again, I make no representation that long long, or
whatever it's called, is a panacea... Again, we are where we would have
been if we hadn't had long, years ago, in doing the 16->32 transition.

re: 128-bit machines available in 2004: let me explain why I seriously
doubt that this is going to happen in any widespread way. You apparently
got 13 = 1991-1978, then 2004 = 1991+13.
a) 32-bit machines, of course, got popular in the 1960s with S/360s...
however, the time between generations is more closely related to the
number of bits of addressing, i.e., proportional to number of bits added.
b) But in any case, using DRAM curves, and microprocessor history, and
sizes of physical memory actually shipped by vendors (I've plotted all
these things at one time or another, some is in the BYTE article I noted):
1) DRAM sizes get 4X larger every 3 years; this has been consistent
for years; if anything it might slow down a little after the
next generation, or maybe not.
2) 4X larger = 2 more bits of physical addressing.
3) Of course, virtual addressing can consume address bits faster.
For a program actually using the data, a reasonable rule-of-thumb
is that there are practical programs 4X larger than the physical
memory that are still usable, i.e., whose reference patterns
don't make them page too much. Hennessy disagrees with me some,
claiming that memory-mapped file usage can burn virtual memory
faster, and I somewhat agree, but I also think it takes a while for
such techniques to become widely used. In any case, even this
tends to be at least somewhat bound by the actual size of
physical disks.
4) So, assuming that large microprocessor servers started hitting
4GB (32 bits) in 1994 (and that's a reasonable date: SGI sold
some 4GB systems in 1994, and some 8GBers either at the end of
1994 or beginning of 1995). So, if I pick a date for 4GB,
knowing there are always some bigger systems, it's 1994.
1994: 32 bits (4GB)
1997: 34 bits (16GB)
2000: 36 bits (64GB)
2003: 38 bits (256GB)
....
2042: 64 bits (16Billion GB) (hmmm, seems unlikely :-)
On the other hand, my 4:1 rule claims that the virtual memory
pressure is at least 2 bits ahead of the physical, or 3 years
earlier, and then there's the increasing use of mapped files,
and then allowing for somebody being more aggressive than the rest
of the crowd .... and I come back to my 2020 estimate. (There is a
small code sketch of this extrapolation just after this list.)
5) Note that IRIX already has a 64-bit file system and files;
the largest single file we've seen is 370GB. Assuming disks
somehow maintain their current progress of 2X every 2 years,
and that 4GB 3.5" SCSI disks are around in force:
Right now, a 64-bit file pointer can address all the data in
4Billion 4GB disks ... which not everyone can afford :-)
by 2020, assuming straight-line progression, suppose we've gotten
13 doublings, you'd want to have a single disk of 32,000 GB (!),
and now a 64-bit file pointer can only address 2**19, or
512,000, of such disks, still likely to be adequate for most uses.
6) Finally, while the first 64-bit micro came out in 1991/1992,
and the second in 1992, it is 1998 (?) before all of the major
microprocessor families get there, and whereas there were at least
some 64-bit systems many years ago in the supercomputer world,
offering some useful experience, I haven't noticed *any*
128-bit systems anywhere.
7) Bottom line: something very strange would have to happen to start
seeing serious use of 128-bitters in 2004. While your comments
on C have serious credibility ... I'd like to see some reasoning
to justify 2004, because every bit of trend analysis I've done or
seen says much later...
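
Here is the small sketch promised in point 4: the extrapolation is nothing
more than the stated assumptions (32 bits of physical addressing in 1994,
2 more bits every 3 years) turned into arithmetic:

#include <stdio.h>

/* Project physical address bits from the DRAM trend: 4X capacity
   (2 more bits) every 3 years, starting from 4GB (32 bits) in 1994. */
int main(void)
{
    int year;

    for (year = 1994; year <= 2042; year += 3)
        printf("%d: %d bits\n", year, 32 + 2 * (year - 1994) / 3);
    return 0;
}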

|> From the perspective of WG14, we expect to complete C9X around 1999. It
|> seems silly to solve the same problem 10 years from now. Also, the
|> portability problems *greatly* increase with the use of "long long" even
|> if you restrict yourself to 16-bit, 32-bit, and 64-bit architectures.

But again, I've got some serious portability problems, right now, that
don't seem to get solved except with an integral 64-bit type that is
useful in 32-bit environments, and of course, must persist into 64's.
While I used to care about non-power-of-2-sized architectures, that seems
less of an issue than it was in the old days.

Oh well, it looks like we're doomed to a sad state of affairs over the
next few years: a whole lot of people will write code with an extension
that won't be part of the standard; the extension won't benefit from the
standards process, but it will get used, perhaps with yet more flags
(like -ANSI and -ANSI+longlong, i.e., use ANSI, but don't flag long longs.
Sigh.)

FFarance

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
> From: ma...@mash.engr.sgi.com (John R. Mashey)
>
> In article <41cv0r$p...@newsbf02.news.aol.com>, ffar...@aol.com (FFarance) writes:
>
> |> nightmares. Thus, "long long" is attractive *now*, but will cause
> |> problems
> |> with 128-bit architectures. 32-bit machines started to arrive around 1978
> |> and 64-bit machines around 1991 (my dates are approximate). 128-bit
> |> machines will become available around 2004.
> ...

> re: 128-bit machines available in 2004: let me explain why I seriously
> doubt that this is going to happen in any widespread way. You apparently
> got 13 = 1991-1978, then 2004 = 1991+13.

Yes, the analysis was as simple as that. But I made a mistake with
imprecise words on 128-bits. I think I should have replaced
``architecture'' with the word ``application''. In the past, the
advance of architectures paralleled the advance of applications. While
I believe that there will be some 128-bit machines in 2004, I believe
there will *definitely* be 128-bit applications (possibly run on 64-bit
machines). In fact (as mfinney pointed out on this thread), they
have 128-bit applications now. Some crypto applications now require
1024-bit arithmetic.

> ...


> 7) Bottom line: something very strange would have to happen to start
> seeing serious use of 128-bitters in 2004. While your comments
> on C have serious credibility ... I'd like to see some reasoning
> to justify 2004, because every bit of trend analysis I've done or
> seen says much later...

In any case, making these kind of predictions isn't an exact science. It
sounds like you're better qualified to analyze these kind of trends.
However, I think it is safe to say that WG14 will have to address the
128-bit problem reasonably soon.

> |> From the perspective of WG14, we expect to complete C9X around 1999. It
> |> seems silly to solve the same problem 10 years from now. Also, the
> |> portability problems *greatly* increase with the use of "long long" even
> |> if you restrict yourself to 16-bit, 32-bit, and 64-bit architectures.
> But again, I've got some serious portability problems, right now, that
> don't seem to get solved except with an integral 64-bit type that is
> useful in 32-bit environments, and of course, must persist into 64's.
> While I used to care about non-power-of-2-sized architectures, that seems
> less of an issue that it was the old days.

As I said, "long long" appears attractive because it will be implemented
as a 64-bit type (now the largest type). There aren't portability problems
with the largest type -- it's always the intermediate types.

Someone said the ``obvious'' choice is that "long" becomes the 64-bit type.
Just that change on a compiler would cause a serious (binary) incompatibility
problem along with breaking source code (portability problem). Why?
Programmers (including experiment programs and preprocessor magic) made
assumptions about the types "int" and "long". From a development
perspective, any change in:

- the bit/byte ordering
- the bit/byte alignment
- the data representation
- the actual precision of the data types
- the size of the data types
- the stack alignment
- the operators

causes development that is, in effect, a porting effort. So changing
"long" from 32-bits to 64-bits on some machine is equivalent to porting
to a new architecture. Even so, doing the porting effort doesn't make
your code run the same on all machines. For example, on one of Cray's
machines they don't have a 16-bit type. This would be an issue for
applications that used "short".

> Oh well, it looks like we're doomed to a sad state of affairs over the
> next few years: a whole lot of people will write code with an extension
> that won't be part of the standard; teh extension won't benefit from the
> standards process, but it will get used, perhaps with yet more flags
> (like -ANSI and -ANSI+longlong, i.e., useANSI, but don't flag long
longs.
> Sigh.)

Yes, it is a sad state of affairs. Just as there was much BSD code that
assumed "int" was 32 bits. The industry went along with this
``extension''
for a while, but was burned by it ultimately. Could ``"int" means
32-bits'' benefitted from the standards process? Yes it could have.
Would
it be the right thing to standardize? No. The same is true for
"long long". I bet there's a good amount of "long long" code out there
already that assumes that it is (exactly) a 64-bit integer. A sad state
of affairs when you know what the problem and solution are. The thing
about this kind of problem (porting the code base to new architectures)
is that it only occurs about once every 10-15 years. Thus, we aren't forced to
analyze why this keeps occurring. It would be sad not to fix this problem,
especially considering the cost of porting.

Thomas Koenig

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In comp.std.c, ffar...@aol.com (FFarance) wrote:

>From the perspective of WG14, we expect to complete C9X around 1999. It
>seems silly to solve the same problem 10 years from now. Also, the
>portability problems *greatly* increase with the use of "long long" even
>if you restrict yourself to 16-bit, 32-bit, and 64-bit architectures.

So, those are the problems. What are possible solutions? Something
akin to Fortran's KIND?
--
Thomas König, Thomas...@ciw.uni-karlsruhe.de, ig...@dkauni2.bitnet.
The joy of engineering is to find a straight line on a double
logarithmic diagram.

FFarance

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
> From: Jonathan Coxhead <jcox...@acorn.co.uk>
>
> Although I follow this group regularly, I haven't seen a reference to
> where to find these 'SBEIR proposals' of which everyone speaks.

The proposal is in:

ftp://ftp.dmk.com/DMK/sc22wg14/c9x/extended-integers/sbeir.*

> I've always imagined the adopted solution to the "long long"
> problem would look something like this:
>
> Consider that given hardware supports only a finite (small) number of
> integer widths: the problem is to provide access to these, without
> requiring compilers to do bizarre emulations for integers of widths that
> can't be implemented directly.

There are two main categories of SBEIR types: exact and at-least. An
exact type functions identically to a bit field, except that it is stored
in a container. For example:

unsigned int exact:7 X;

is a 7-bit value. You can use "sizeof" and "&" on X -- you can't
on bit fields. An exact type is necessary because some applications
need these semantics. The compiler already knows how to do this because
it already supports bit fields. The same optimizations are available:
avoiding masking, using a native type when the precision matches a
type (e.g., exactly 32-bits).

The at-least type requires a minimum precision. For example, at-least
types requiring 20, 24, 28, and 32 bits would probably all be implemented
as a 32-bit type (assuming that it is a native type). Thus, the program
that uses:

signed int atleast:24 Y;

would always get *at-least* 24 bits precision, yet it might be implemented
as a *native* 24-bit, 32-bit, or 64-bit integer. This simplifies the
programming rather than having preprocessor magic or experiment programs
that try to ``discover'' the correct type that meets the needs of the
programmer.

Of course with SBEIR, you can quickly determine the type required for
temporary values:

typedef ... factor1_t;
typedef ... factor2_t;

/* widths (in bits) of the two factors, and of a product wide enough
   never to overflow: the product needs the sum of the two widths */
#define factor1_len EIR_BIT(precof(factor1_t))
#define factor2_len EIR_BIT(precof(factor2_t))
#define product_len (factor1_len+factor2_len)

signed int atleast:product_len
my_product
(
    factor1_t f1,
    factor2_t f2
)
{
    signed int atleast:product_len p;

    p = f1;
    p *= f2;
    return p;
}

Determining the size required for a temporary would be difficult
with preprocessor magic or experiment programs.

In summary, SBEIR uses features already available in the compiler (this
isn't a run-time issue). SBEIR provides a typing mechanism that allows
programmers to write portable programs. SBEIR supports up to 128 bits of
precision. For the compilers that don't support 128-bit arithmetic, they
can (1) implement it themselves, (2) implement the simpler operators
(e.g., copy, add, subtract, negate, bit operators, bit shift, etc.) but
call a library routine for the more complex operators (e.g., multiply,
divide, remainder), (3) call a library routine for any operator. There
are several publicly available multi-precision libraries for those who
are lazy.

mfi...@inmind.com

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In <41d31q$i...@murrow.corp.sgi.com>, ma...@mash.engr.sgi.com (John R. Mashey) writes:
>re: 128-bit machines available in 2004: let me explain why I seriously
>doubt that this is going to happen in any widespread way. You apparently
>got 13 = 1991-1978, then 2004 = 1991+13.

I agree with everything you are saying...but you are still wrong.
If you gave me 128-bit numeric types today, or even 256-bit types
I would be using them before the month was out.

The reason is that the physical size of memory is not related to the
usefulness of wide integral types. I currently use 64-bits (painfully)
on a 32-bit system, but there are areas where I would like wider
types (especially random numbers and cryptography).

Also, you are finding wider data types already, some graphics
processors are now using 192-bit wide data buses. The newer
processors are using 64-bit wide (and probably some are using
128-bit wide) data, even when the address width is 32-bits.

So everything you said is correct. But irrelevant.

Michael Lee Finney


Carlie Coats

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In article <danpop.809087008@rscernix>,
Dan Pop <Dan...@mail.cern.ch> wrote:
>...It's high time to stop looking at the model imposed by the VAX
>as being the holy grail.

A minor exception to this: it was not necessary that "long"
on a VAX be 32-bit. The claim that the machine architecture
demanded it is spurious --"long" on a Z80 is not 8-bit. Instead,
this misdesign is a Berzerkeleyism which should never have happened.
Back when BSD 4 was being developed (if not sooner), they should have
bit the bullet and declared that they were doing "long" as 64-bit.

fwiw.

Carlie J. Coats, Jr. co...@ncsc.org *or* x...@hpcc.epa.gov
Environmental Programs phone: (919)248-9241
North Carolina Supercomputing Center fax: (919)248-9245
3021 Cornwallis Road P. O. Box 12889
Research Triangle Park, N. C. 27709-2889 USA
"My opinions are my own, and I've got *lots* of them!"


John R. Mashey

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In article <danpop.809087008@rscernix>, Dan...@mail.cern.ch (Dan Pop) writes:
(Hmm, some rather strong and pejorative statements about many people):

|> In <41b0qq$j...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:
|>
|> >One more time: I claim "how I get a 64-bit type" IS a problem; I don't
|>
|> And the solution is straightforward: have a 64-bit long. C has 4 basic
|> integral types and each of them can have a different size: 8, 16, 32 and
|> 64 bits. Only brain dead software, making unwarranted assumptions about
|> the relative sizes of int's, long's and pointers will be affected.
In the real world, any vendor who, in 1991, declared that in their 32-bit
environment, the sizeof long would now be 8 bytes, would have been lynched by
their ISVs. Worse, such vendors would immediately have dropped to the
bottom of the port lists, incurring serious financial damage.
These things may be irrelevant to someone in a research environment, some of
which place highest priorities on 1) their own code and 2) free software
and relatively little on software from ISVs.
But these things are *not* irrelevant to many of the rest of us. Those
in research environments, paid for with research funding, may not consider
these things important ... but a vendor that ignores such issues usually
gets hurt badly, in many cases, going out of business. This effect is
most commonly seen in high-end technical computing, where mean-time to
bankruptcy is an important parameter for purchase, and where
environments difficult to program have died pretty badly.

|> Because of DEC OSF/1, most free software has been already fixed.


|> It's high time to stop looking at the model imposed by the VAX as being
|> the holy grail.

Nobody that I know involved in such decisions thinks the VAX model is
the holy grail...

|> The "long long" pseudo-solution wasn't needed in the first place, it was
|> a mistake made by vendors who didn't have the balls to do the right thing,
|> then other vendors followed like lemmings. It comes a time when the
|> mistakes of the past have to be admitted and corrected.
These are fairly strong, and unnecessarily impolite words, that cast aspersions
upon people with whom you may not agree, but may well have to deal with
differing sets of requirements than yours.

John Carr

unread,
Aug 22, 1995, 3:00:00 AM8/22/95
to
In article <41dhd4$2...@newsbf02.news.aol.com>,
FFarance <ffar...@aol.com> wrote:

>In fact (as mfinney pointed out on this thread), they
>have 128-bit applications now. Some crypto applications now require
>1024-bit arithmetic.

These applications will never be satisfied with integer types only a few
times word size. Key sizes will increase as processing power increases.

--
John Carr (j...@mit.edu)

John R. Mashey

unread,
Aug 23, 1995, 3:00:00 AM8/23/95
to
In article <41d50f$4...@mujibur.inmind.com>, mfi...@inmind.com writes:

|> I agree with everything you are saying...but you are still wrong.
|> If you gave me 128-bit numeric types today, or even 256-bit types
|> I would be using them before the month was out.



|> The reason is that the physical size of memory is not related to the
|> usefulness of wide integral types. I currently use 64-bits (painfully)
|> on a 32-bit system, but there are areas where I would like wider
|> types (especially random numbers and cryptography).
|>
|> Also, you are finding wider data types already, some graphics
|> processors are now using 192-bit wide data buss. The newer
|> processors are using 64-bit wide (and probably some are using
|> 128-bit wide) data, even when the address width is 32-bits.
|>
|> So everything you said is correct. But irrelevant.

I don't think so ... please consider reading the 1991 BYTE article,
in which I discussed the issues of bus sizes, big integers, and at least
mentioned cryptography codes (p.136); I've spent a fair number of
hours talking to people whose agencies are unnamed, and have done things
like making foils that only talk about various-sized integer multiply/divide
times for them, and SGI sells some gear into that community...

This piece of this discussion got started with the belief that 128-bit
architectures would come in 2004, and I was answering that. Multi-precision
support in a language is an orthogonal and worthy issue in its own right;
however, the audience of people for whom the address issue is important is
substantially larger than those who do a lot of big-integer (especially
multiply-divides) work. [Note that MIPS RISC has always had hardware
for integer mul/div, unlike some RISCs, so we haven't ignored that piece.]
Put another way:
a) One may choose to put various degrees of multi-precision in a
language at any time relative to when higher-precision hardware
occurs.
b) But when the vendor community moves en masse to the next higher
power of 2, the necessities will *force* something to happen,
hence my discussion of the hardware trends.

Dan Pop

unread,
Aug 23, 1995, 3:00:00 AM8/23/95
to
In <41d4a2$i...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:

>In article <danpop.809087008@rscernix>, Dan...@mail.cern.ch (Dan Pop) writes:
>(Hmm, some rather strong and pejorative statements about many people):

>|> In <41b0qq$j...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:
>|>
>|> >One more time: I claim "how I get a 64-bit type" IS a problem; I don't
>|>

>|> And the solution is straightforward: have a 64-bit long. C has 4 basic
>|> integral types and each of them can have a different size: 8, 16, 32 and
>|> 64 bits. Only brain dead software, making unwarranted assumptions about
>|> the relative sizes of int's, long's and pointers will be affected.

>In the real world, any vendor who, in 1991, declared that in their 32-bit
>environment, the sizeof long would now be 8 bytes, would have been lynched by

I fail to see the relevance of the bitness of the environment. long's
have been 32-bit types in 8-bit environments for ages, and nobody complained.
What's so special about 32-bit environments that a 64-bit long is
unacceptable, while it was perfectly acceptable for DEC OSF/1 and DEC
announced plans to switch to 64-bit in OpenVMS, as well?

>their ISVs. Worse, such vendors would immediately have dropped to the
>bottom of the port lists, incurring serious financial damage.

Yet Sun made considerably more radical changes when they introduced
Solaris and all their ISV's started to port their apps (even if they
dreamed about lynching Sun :-) Anybody who ported software from SunOS
to Solaris knows that it was much worse than just fixing bugs caused by
passing long's to functions expecting int's and vice-versa or by the
assumption that sizeof(long) == 4.

>These things may be irrelevant to someone in a research environment, some of
>which place highest priorities on 1) their own code and 2) free software
>and relatively little on software from ISVs.
>But these things are *not* irrelevant to many of the rest of us. Those
>in research environments, paid for with research funding, may not consider
>these things important ... but a vendor that ignores such issues usually
>gets hurt badly, in many cases, going out of business. This effect is
>most commonly seen in high-end tchnical computing, where mean-time to
>bankruptcy is a an important parameter for purchase, and where
>environments difficult to program have died pretty badly.

Yeah, like the DEC Alpha, right? There, the 64-bit long's were a piece
of cake compared to the fact that sizeof(void *) != sizeof(int). Yet the
architecture survived and nobody considers it today as "difficult to
program". It is precisely because of the pioneering work done by DEC
that most software (free or commercial) is clean of any stupid assumptions
about the sizes of various data types.

>|> Because of DEC OSF/1, most free software has been already fixed.
>|> It's high time to stop looking at the model imposed by the VAX as being
>|> the holy grail.
>
>Nobody that I know involved in such decisions thinks the VAX model is
>the holy grail...

Yet, they blindly follow the VAX model, which was stupid even for the VAX
itself.

>|> The "long long" pseudo-solution wasn't needed in the first place, it was
>|> a mistake made by vendors who didn't have the balls to do the right thing,
>|> then other vendors followed like lemmings. It comes a time when the
>|> mistakes of the past have to be admitted and corrected.
>
>These are fairly strong, and unnecessarily impolite words, that cast aspersions
>upon people with whom you may not agree, but may well have to deal with
>differing sets of requirements than yours.

These were intended as strong words. Nevertheless, they are true.
If you know of any compelling reasons for making long a 32-bit type on
a 32-bit architecture in the first place, please share them with us.
Even today, the "compelling" reason for not fixing it is that it would
invalidate a lot of already broken software, which would have to be fixed.
And fixing broken software doesn't seem to be very high on any vendor's
priority list.

The committee did a very good job when they decided that a proper solution
is a better idea than validating a quick and dirty hack promoted by some
vendors. It's worth mentioning that X3J3 (the Fortran committee) took a
similar decision and rejected the type*N specifications, even if
they've been implemented for at least two decades by most vendors,
providing a proper way to specify a type in terms of range and
precision instead.

Dan
--
Dan Pop
CERN, CN Division
Email: Dan...@mail.cern.ch
Mail: CERN - PPE, Bat. 31 R-004, CH-1211 Geneve 23, Switzerland

John R. Mashey

unread,
Aug 24, 1995, 3:00:00 AM8/24/95
to
In article <danpop.809172764@rscernix>, Dan...@mail.cern.ch (Dan Pop) writes:

|> >In the real world, any vendor who, in 1991, declared that in their 32-bit
|> >environment, the sizeof long would now be 8 bytes, would have been lynched by
|>
|> I fail to see the relevance of the bitness of the environment. long's
|> have been 32-bit types in 8-bit environments for ages, and nobody complained.
|> What's so special about 32-bit environments that a 64-bit long is
|> unacceptable, while it was perfectly acceptable for DEC OSF/1 and DEC
|> announced plans to switch to 64-bit in OpenVMS, as well.

Nothing is special about a 32-bit environment, except that for any vendor
in 1991 to have done this would have broken all their binaries, and had to
double all their libraries, for no particular benefit to their customers.

|>
|> >their ISVs. Worse, such vendors would immediately have dropped to the
|> >bottom of the port lists, incurring serious financial damage.

|> Yet Sun made considerably more radical changes when they introduced
|> Solaris and all their ISV's started to port their apps (even if they
|> dreamed about lynching Sun :-) Anybody who ported software from SunOS
|> to Solaris knows that it was much worse than just fixing bugs caused by
|> passing long's to functions expecting int's and vice-versa or by the
|> assumption that sizeof(long) == 4.

Yes, but the SunOS->Solaris transition has been widely viewed as
unnecessarily traumatic, i.e., it is not a good example ... may they do more :-)

|>
|> >These things may be irrelevant to someone in a research environment, some of
|> >which place highest priorities on 1) their own code and 2) free software
|> >and relatively little on software from ISVs.
|> >But these things are *not* irrelevant to many of the rest of us. Those
|> >in research environments, paid for with research funding, may not consider
|> >these things important ... but a vendor that ignores such issues usually
|> >gets hurt badly, in many cases, going out of business. This effect is
|> >most commonly seen in high-end tchnical computing, where mean-time to
|> >bankruptcy is a an important parameter for purchase, and where
|> >environments difficult to program have died pretty badly.
|>
|> Yeah, like the DEC Alpha, right? There, the 64-bit long's were a piece
|> of cake compared to the fact that sizeof(void *) != sizeof(int). Yet the
|> architecture survived and nobody considers it today as "difficult to
|> program". It is precisely because of the pioneering work done by DEC
|> that most software (free or commercial) is clean of any stupid assumptions
|> about the sizes of various data types.

I wouldn't call Alphas difficult to program; SGI uses the same LP64 model as
DEC for our 64-bit environments. On the other hand, it is well-known that
it cost a lot of money for DEC to get 64-bit ports, and I've talked to more
than one (major) ISV whose application got no value from being 64-bit,
but had to make a lot of code changes earlier than they would have;
lack of applications hurt DEC's revenues for quite a while. This is not
to criticize their decision to go LP64 for Alpha UNIX: I'd probably
have done the same thing, and it makes a lot more sense to do it when
you have a brand-new architecture that has no installed base.


|>
|> >|> Because of DEC OSF/1, most free software has been already fixed.
|> >|> It's high time to stop looking at the model imposed by the VAX as being
|> >|> the holy grail.
|> >
|> >Nobody that I know involved in such decisions thinks the VAX model is
|> >the holy grail...
|>
|> Yet, they blindly follow the VAX model, which was stupid even for the VAX
|> itself.

Maybe somebody "blindly" followed the VAX model ... but most people I know
made a reasoned decision that said:
a) If we follow the VAX model, a lot of software will port with minimal
effort.
b) If we don't follow the VAX model, and make sizeof(long) == 8,
we'll have more effort porting the UNIX code, and we'll have to
do all the multi-precision work now, before we get anything out,
and we'll have a lot of resistance from ISVs, and there will be
no obvious benefit.


|> >|> The "long long" pseudo-solution wasn't needed in the first place, it was
|> >|> a mistake made by vendors who didn't have the balls to do the right thing,
|> >|> then other vendors followed like lemmings. It comes a time when the
|> >|> mistakes of the past have to be admitted and corrected.
|> >
|> >These are fairly strong, and unnecessarily impolite words, that cast aspersions
|> >upon people with whom you may not agree, but may well have to deal with
|> >differing sets of requirements than yours.
|>
|> These were intended as strong words. Nevertheless, they are true.

This is your opinion....

Dan Pop

unread,
Aug 24, 1995, 3:00:00 AM8/24/95
to
In <41gjkp$3...@murrow.corp.sgi.com> ma...@mash.engr.sgi.com (John R. Mashey) writes:

>In article <danpop.809172764@rscernix>, Dan...@mail.cern.ch (Dan Pop) writes:
>
>|> >In the real world, any vendor who, in 1991, declared that in their 32-bit
>|> >environment, the sizeof long would now be 8 bytes, would have been lynched by
>|>
>|> I fail to see the relevance of the bitness of the environment. long's
>|> have been 32-bit types in 8-bit environments for ages, and nobody complained.
>|> What's so special about 32-bit environments that a 64-bit long is
>|> unacceptable, while it was perfectly acceptable for DEC OSF/1 and DEC
>|> announced plans to switch to 64-bit in OpenVMS, as well.

>Nothing is special about a 32-bit environment, except that for any vendor
>in 1991 to have done this would have broken all their binaries, and had to
>double all their libraries, for no particular benefit to their customers.

I must be missing something. How is changing the size of long going to
break _any_ binary? A binary is a binary is a binary. The language used
in the source file is completely irrelevant.

And what's so special about 1991? What's preventing the vendors from
making the change in 1995, with the next release of their operating system?
Only one set of libraries will be enough. It's quite common that programs
compiled for an older version of the OS don't work on the newer one (and
vice-versa), so this won't be something that never happened before.


>
>|>
>|> >their ISVs. Worse, such vendors would immediately have dropped to the
>|> >bottom of the port lists, incurring serious financial damage.
>
>|> Yet Sun made considerably more radical changes when they introduced
>|> Solaris and all their ISV's started to port their apps (even if they
>|> dreamed about lynching Sun :-) Anybody who ported software from SunOS
>|> to Solaris knows that it was much worse than just fixing bugs caused by
>|> passing long's to functions expecting int's and vice-versa or by the
>|> assumption that sizeof(long) == 4.

>Yes, but the SunOS->Solaris transition has been widely viewed as
>unnecesarily traumatic, i.e., it is not a good example ... may they do more :-)

It was unnecessarily traumatic because of the poor quality of the first
versions of Solaris 2, not because Solaris was incompatible with SunOS 4.

DEC did something very similar, with the transition from ULTRIX to DEC OSF/1.
Yet, both companies survived. The ISV's didn't reject them or even put
them at the end of their priority lists, as you claimed.

DEC had an installed base of MIPS machines, which used 32-bit longs and
32-pointers. All the brain dead software of DEC customers had to be fixed
when they switched to the new Alpha machines.

>|> >|> Because of DEC OSF/1, most free software has been already fixed.
>|> >|> It's high time to stop looking at the model imposed by the VAX as being
>|> >|> the holy grail.
>|> >
>|> >Nobody that I know involved in such decisions thinks the VAX model is
>|> >the holy grail...
>|>

>|> Yet, they blindly followed the VAX model, which was stupid even for the VAX
>|> itself.

>Maybe somebody "blindly" followed the VAX model ... but most people I know
>made a reasoned decision that said:
> a) If we follow the VAX model, a lot of software will port with minimal
> effort.
> b) If we don't follow the VAX model, and make sizeof(long) == 8,
> we'll have more effort porting the UNIX code, and we'll have to
> do all the multi-precision work now, before we get anything out,
> and we'll have a lot of resistance from ISVs, and there will be
> no obvious benefit.

Again, DEC's experience proved the bogosity of this pseudo-argumentation.
Doing the right thing is more important (and more rewarding in the long
term) than doing the easiest thing.

>|> These were intended as strong words. Nevertheless, they are true.

>This is your opinion....

Until proven wrong.

mfi...@inmind.com

Aug 24, 1995
In <41dvmg$4...@murrow.corp.sgi.com>, ma...@mash.engr.sgi.com (John R. Mashey) writes:
>I don't think so ... please consider reading the 1991 BYTE article,
>in which I discussed the issues of bus sizes, big integers, and at least
>mentioned cryptography codes (p.136);

I would, if I could take 4 hours to dig it out of the magazine stack
in the basement! <G> I have 'em all the way back.

>however, the audience of people for whom the address issue is important is
>substantially larger than those who do a lot of big-integer (especially
>multiply-divides) work.

However, the driving force for large data types is not really the address
width, although that DOES play a part. It is the data width that
everybody needs.

>Put another way:
> a) One may choose to put various degrees of multi-precision in a
> language at any time relative to when higher-precision hardware
> occurs.

No argument.

> b) But when the vendor community moves en masse to the next higher
> power of 2, the necessities will *force* something to happen,
> hence my discussion of the hardware trends.

No argument here, either. However, in this case I think it is the data
width (such as the need for 64-bit file offsets) that is more of a
driving force.

Michael Lee Finney


FFarance

Aug 24, 1995
> From: Dan...@mail.cern.ch (Dan Pop)

>
> >|> I fail to see the relevance of the bitness of the environment. long's
> >|> have been 32-bit types in 8-bit environments for ages, and nobody complained.
> >|> What's so special about 32-bit environments that a 64-bit long is
> >|> unacceptable, while it was perfectly acceptable for DEC OSF/1 and DEC
> >|> announced plans to switch to 64-bit in OpenVMS, as well.
>
> >Nothing is special about a 32-bit environment, except that for any vendor
> >in 1991 to have done this would have broken all their binaries, and had to
> >double all their libraries, for no particular benefit to their customers.
>
> I must be missing something. How is changing the size of long going to
> break _any_ binary? A binary is a binary is a binary. The language used
> in the source file is completely irrelevant.

Changing the size, alignment, effective precision, etc., of a "long" or
any other data type will break binaries. You'll be forced to recompile
and port everything. For example, your library routine uses a structure
that has a "long" in it and the library, other libraries, and applications
expect that it is 4 bytes. Now you change the compiler and only recompile
the application with "long" as 8 bytes. The newly compiled application
now has a different, incompatible idea of how the structure is
organized. This is what is meant by breaking binaries.
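
[A minimal sketch of the structure problem just described; the sizes in the
comments assume a typical compiler with 4-byte longs versus one with 8-byte
longs, and the exact padding is implementation-defined:]

struct record {
    long id;     /* 4 bytes before the change, 8 bytes after          */
    char tag;    /* its offset and the trailing padding change, too   */
};

/* A library built with the old compiler and an application built with the
   new one now disagree about sizeof(struct record) and the offset of "tag",
   so every call that passes a struct record between them is silently wrong. */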

If the application had used SBEIR types, along with OAX (ordering and
alignment extensions) and REP (data representation extensions), the
binaries *might* be compatible: the data structures would be compatible
but the calling procedures might not be compatible (stack alignment,
return values, etc.). Ah, all we need is to add a C binding to WG11's
LIPC (language independent procedure calls).

John R. Mashey

Aug 25, 1995
In article <41je47$2...@newsbf02.news.aol.com>, ffar...@aol.com (FFarance) writes:
|> > From: Dan...@mail.cern.ch (Dan Pop)

|> > I must be missing something.

Yes.
Frank answered it well:

|> Changing the size, alignment, effective precision, etc., of a "long" or
|> any other data type will break binaries. You'll be forced to recompile
|> and port everything. For example, your library routine uses a structure

But I'd add a few more:
a) People use shared-libraries; you need to double those to support
both cases, since all the binaries a customer has don't magically
disappear and get replaced when you ship a new system.
b) Even more, ISVs, especially some rather important ones, create complex
applications that do things like dynamically loading binaries, that may
well have come from 3rd parties, and again, they don't all magically
get recompiled at the same time.
c) Strangely enough, not every program is self-contained; some read/write
data to disk. If they ever wrote data structures containing longs to
disk, and the compiler then decides that longs changed size, then even
a simple, single program breaks. You can't just recompile it, you've got
to go through a serious cleanup. [This, of course, is where "exact"
descriptors are good things, since they'd be the same under any model.]
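
[A sketch of point c): dumping a structure containing a long ties the file
format to whatever sizeof(long) happens to be, while writing an exact-width
representation by hand does not. The function names are invented here:]

#include <stdio.h>

struct hdr { long count; long offset; };

/* Fragile: the on-disk layout silently tracks sizeof(long) and padding. */
void save_raw(FILE *fp, const struct hdr *h)
{
    fwrite(h, sizeof *h, 1, fp);
}

/* Robust: always writes exactly 4 bytes per field, least significant byte
   first, no matter how wide long is on this machine. */
void save_exact(FILE *fp, const struct hdr *h)
{
    unsigned long v;
    int i;

    v = (unsigned long)h->count;
    for (i = 0; i < 4; i++, v >>= 8)
        putc((int)(v & 0xFF), fp);
    v = (unsigned long)h->offset;
    for (i = 0; i < 4; i++, v >>= 8)
        putc((int)(v & 0xFF), fp);
}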

DEC changed going from Ultrix to OSF/1 and this was sensible. I note they
didn't change VMS... I make no claim that every vendor that makes
a difficult transition will die, just that many who've made it hard to
program have, and that even the survivors have suffered.

The "reasoned decision" comment was for 32-bit machines, i.e., why
everybody followed ILP32 in the 1980s.

You comment on ISVs ... I don't know which ones you talk to, but I
talk to some pretty serious ones fairly often ... which is where I get
the opinions.
--

John R MacMillan

Aug 25, 1995
I have read the SBEIR proposal, and I believe it's well thought-out,
but I'm still unconvinced that it is a necessary/desirable feature to
add to C9X. I certainly do not buy a number of the arguments that have
been put forth in this newsgroup, elsewhere in this thread and
others. I'm willing to be convinced otherwise, however, so I'll try
to clearly list my concerns so that others can respond.

The proposal correctly diagnoses that failure to consider and document
the requirements a program has of its types causes porting problems,
but it's not clear to me that SBEIR is a solution. The paper also
claims that SBEIR would have greatly eased the 16-to-32 bit and
32-to-64 bit transitions we have seen and are seeing, but I'm not
certain I buy this for reasons I will outline below.

Like Frank, I've ported millions of lines of code, much to an
implementation with 64 bit longs and ints, 32 bit shorts, and 48 bit
pointers (and the null pointer was not all-bits-zero, but that's
another nightmare :-) ), and one of the large pieces used a long list
of typedefs to cover the at-least and exact requirements on the
types. So the designers had considered these requirements, and even
tried to address the problem, but the port was _still_ difficult
because the programmers misused types and made bad decisions about the
type requirements.

I also ported well-written code to the same machine, without problems,
and with the existing types.

The failing was not that they couldn't have written more portable
code, the failing was that they didn't. And I think they still will
with SBEIR; people will assume that sizeof(int exact:16) == 2 (even
though the proposal makes clear it is not), and probably even that
sizeof(int atleast:16) is also 2, and that you can stick a pointer
into and int atleast:128 (and that you get the same pointer back when
you go the other way), and so on.

I know that SBEIR should not be rejected because people will still
write non-portable code, but I guess I don't believe that it will
encourage people to write portable code, so I do not believe that
SBEIR will ease, for instance, the 64-to-128 bit transition.

It has also been claimed that since C's types are ``at-least'' types,
getting exact semantics is difficult. While it is a bit of work, I
agree with Steve that most code I have ported that wanted exact types
didn't really need them, and using them just made porting more
difficult (for example, networking code that tried to make its
internal representation match that on the wire). I think at-least
types are useful in most cases, and so I'm hesitant to encumber the
language with a new typing scheme just to hit the minority of cases.

C's current at-least numbers have also been cited as being difficult
to work with, since if you want an at-least 32, you need to use long,
and that might not be fastest (or smallest) when int might have done.
Well, if performance (or space) is that much of a concern then it
seems easier to me to deal with those specific cases as they arise at
present (it really isn't that hard to check limits.h to see if int is
big enough) and consider changing the current minimum limits. This
seems much less drastic than what is being proposed.
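
[For concreteness, the kind of <limits.h> check I mean -- a sketch, with an
invented typedef name:]

#include <limits.h>

/* Use int if it holds at least 32 bits, otherwise fall back to long. */
#if INT_MAX >= 2147483647
typedef int  least32;
#else
typedef long least32;
#endif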

It has been argued that larger integer types are necessary for, for
example, cryptography, but many of these need larger types than are
being discussed anyway, and again I think that they represent the
minority of the applications. Also, it has been pointed out that
there are arbitrary-precision libraries for compiler vendors too
``lazy'' to write their own, but if this is true, then why can't the
applications that require them use them? Sure, function calls are a
bit messier than operators, but they aren't _that_ bad.

I haven't really thought much about the specifics of how I will have
to implement this in our compiler, should it become necessary, so I
can't really say how hard it will be. I agree that the _conceptual_
changes to the language are small, but I'm not certain that will
translate to the implementation changes also being small. I do know
for certain it would have been very hard to add to our old compiler.
And I definitely disagree with the claim that it won't really increase
code size because there are tricks that can be done to avoid including
support for these features if they are not used (as is sometimes done
with floating-point). If they are not typically used, then I think we
should consider long and hard (or perhaps long long and hard :-) )
before adding them, and if they are typically used, then they will
increase code size.

One point that I think I must be completely missing was that SBEIR
allowed the programmer to get closer to the hardware in a portable
manner. I'm not sure I understand this, since as you get closer to
hardware, you typically get less portable to dissimilar hardware. If
portability is the goal, I think it's a bad idea to _try_ to get close
to the hardware.

C types are not perfect, but I think they've proven workable (if
annoying at times). The SBEIR proposal may fix some of these short-
comings, but then again, it may not, and it's a pretty serious change
to make before it has been proven. I think the greatest success of
the Standard is that it mostly codified existing practice, ie. it
standardized something that was proven, rather than standardized on
something and then checked to see if that worked (like some of the
other language standards).

I'm not sure what it would take to convince me, but I haven't seen it
here yet. :-)

FFarance

Aug 26, 1995
> From: jo...@sco.com (John R MacMillan)

> I have read the SBEIR proposal, and I believe it's well thought-out,
> but I'm still unconvinced that it is a necessary/desirable feature to
> add to C9X.

You make several good points. I'll respond to each one.

> Like Frank, I've ported millions of lines of code, much to an
> implementation with 64 bit longs and ints, 32 bit shorts, and 48 bit
> pointers (and the null pointer was not all-bits-zero, but that's
> another nightmare :-) ), and one of the large pieces used a long list
> of typedefs to cover the at-least and exact requirements on the
> types. So the designers had considered these requirements, and even
> tried to address the problem, but the port was _still_ difficult
> because the programmers misused types and made bad decisions about the
> type requirements.

It is hard to prevent programmers from making mistakes or bad assumptions
even with full knowledge of the intent of a type, just as it is hard to
prevent them from dereferencing NULL or storing more than 16 bits in a
16-bit value. SBEIR doesn't address this problem.

In the paper, I had pointed out that loss of (intent) information
causes portability problems. I had pointed to code where the programmer
incorrectly believes that the type itself documents the intent. I had
observed that I had seen little code that documents the intent of the
type at the point of declaration or typedef. As a coding convention,
having typedefs something like:

typedef ... SFA32; /* signed fast at-least 32 */
typedef ... UUE16; /* unsigned unoptimized exact 16 */

might have been helpful, but these don't solve the problem completely.
What goes in the ``...'' has to be determined by someone or something.
For some of the types, i.e., the ones that happen to directly map into
native types, this is easy. Some of the optimized at-least types are
harder. For example, does "fast atleast:16" map into a 32-bit or a
64-bit integer? You might have to write experiment programs to find
out on each system you ported to. The compiler and the compiler writer
already know how these performance attributes should map. In this sense,
SBEIR allows you to delegate this decision to the compiler (who knows
best).

Another problem I'm sure you've encountered is that getting an exact
type is difficult when the compiler doesn't support it. For example,
some Cray machines don't support a 16-bit type. It would be impossible
to write the typedef for "UUE16" above. Even though 16-bit bitfields
are supported, you can't have a bitfield as a typedef. You would
probably rewrite this as:

#ifdef SUPPORTS_16_BIT_TYPE
typedef ... UUE16;
#else
typedef struct
{
    unsigned int x:16;
} UUE16;
#endif


main()
{
    /* ... */
#ifdef SUPPORTS_16_BIT_TYPE
    a = b+func(c,d);
#else
    a = b+func(c,d).x;
#endif
}

UUE16
func
(
    UUE16 p,
    UUE16 q
)
{
    UUE16 r;

#ifdef SUPPORTS_16_BIT_TYPE
    r = p+q;
#else
    r.x = p.x+q.x;
#endif
    return r;
}

Now add in the possibility that some machines don't support another
type you care about (e.g., a 32-bit type) and you can imagine the nested
"#ifdef"s and complexity. At that point, you might say ``forget about
these exact types'' they are too hard to get right. The problem here
isn't that the compiler can't do them -- it can do the arithmetic
already. The problem is that the notation (i.e., type system) doesn't
support it conveniently. As programmers, we finally contort the program
to something that's acceptable to all implementations (e.g., using
at least types with masking), yet could perform better on most systems
because this compiler already supports these features. This is why
you still have programmers inserting ugly code: they know how to get
access to the feature in that compiler, but not portably. Many times
these ugly solutions are added to address performance (space or time)
concerns. And this is how we complete the cycle of creating more
non-portable code.

BTW, there are other design styles that C supports. The ``Spirit Of C''
includes two features:

- It should be possible to write portable code.
- It should be possible to write non-portable code.

So far, we've probably been discussing writing programs in a ``strictly
conforming style'' using SBEIR, i.e., the programs use only the
specified precision, not the actual precision during calculations.

Some programmers may write programs for a specific machine. SBEIR doesn't
solve the problems of these programs.

Some programmers write in ``conforming style'' (rather than ``strictly
conforming style''). In Standard C, these programmers probably use
"<limits.h>" to determine the actual precision of their types (rather than
using only the specified precision). An example of this style would be a
program that used the type "int atleast:16" to break a problem (e.g.,
emulating decimal arithmetic) into smaller chunks. If the actual
precision of the type is larger than 16 bits, the program would take
advantage of the additional size (see the "EIR_BIT" macro) to solve the
problem in larger chunks.

> ...


> I know that SBEIR should not be rejected because people will still
> write non-portable code, but I guess I don't believe that it will
> encourage people to write portable code, so I do not believe that
> SBEIR will ease, for instance, the 64-to-128 bit transition.

The problem before was that some people didn't know how to write portable
code, and the others who did had a hard time doing it because of limitations
in the type system. The rest of the compiler, in general, wasn't the
problem because it could generate code for exact types or call libraries
for code it couldn't generate (it called libraries anyway for some
arithmetic operations, e.g., long multiply/divide, some of the IEEE
floating point arithmetic).

With SBEIR the type system will support the features needed by programmers.
We will be able to educate programmers how to do it right because there will
be a way to express this in the compiler. Certainly, SBEIR won't prevent
all porting problems. SBEIR will help facilitate portable integer
operations -- a large portion of C code.

> It has also been claimed that since C's types are ``at-least'' types,
> getting exact semantics is difficult. While it is a bit of work, I
> agree with Steve that most code I have ported that wanted exact types
> didn't really need them, and using them just made porting more
> difficult (for example, networking code that tried to make its
> internal representation match that on the wire). I think at-least
> types are useful in most cases, and so I'm hesitant to encumber the
> language with a new typing scheme just to hit the minority of cases.

There are a good number of applications that need exact types. While
exact types could be rewritten as at least types, the convenience of
a bit field of N bits is fairly useful because the compiler hides the
masking and can generate better code. Even so, having programmers
generate the bit masks is a common source of error. Making sure the
compiler gets it right is one failure point. Making sure the programmer
gets it right in N places is N failure points. A similar point is
made in the OAX (ordering and alignment extensions) proposal: although
programmers can call library functions (e.g., "htons", "htonl") to map
bit/byte ordering, many programmers get this wrong because of conceptual
issues writing this in C (i.e., a hazard of the language). Thus, having the
compiler mechanically generate code is more reliable than programmers
generating the same code (a good reason for using higher level languages).
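
[To make the byte-ordering point concrete, here is roughly the hand-written
mapping that programmers keep getting wrong -- a sketch for a 32-bit value;
"put_be32" is an invented name:]

/* Store a 32-bit value into a buffer in network (big-endian) byte order.
   Every shift count and mask is an opportunity for error; a library call
   or compiler-generated mapping does this once, correctly. */
void put_be32(unsigned char *buf, unsigned long x)
{
    buf[0] = (unsigned char)((x >> 24) & 0xFF);
    buf[1] = (unsigned char)((x >> 16) & 0xFF);
    buf[2] = (unsigned char)((x >> 8)  & 0xFF);
    buf[3] = (unsigned char)( x        & 0xFF);
}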

> C's current at-least numbers have also been cited as being difficult
> to work with, since if you want an at-least 32, you need to use long,
> and that might not be fastest (or smallest) when int might have done.
> Well, if performance (or space) is that much of a concern then it
> seems easier to me to deal with those specific cases as they arise at
> present (it really isn't that hard to check limits.h to see if int is
> big enough) and consider changing the current minimum limits. This
> seems much less drastic than what is being proposed.

If you look at how people solve this performance problem and what the
type system allows, you find that the solutions to this problem are
very machine specific: this is the source of portability problems. As
I said in a previous posting, SBEIR isn't for everyone. If you are
interested in portable applications that are performance sensitive
and the cost of maintenance/porting is high, then you'll be interested
in SBEIR. If you're not interested in portable applications, or
performance isn't an issue, or maintenance/porting is cheap then SBEIR
doesn't provide much help. I claim that many large, industrial-strength
applications probably have some need for SBEIR features.

> It has been argued that larger integer types are necessary for, for
> example, cryptography, but many of these need larger types than are
> being discussed anyway, and again I think that they represent the
> minority of the applications. Also, it has been pointed out that
> there are arbitrary-precision libraries for compiler vendors too
> ``lazy'' to write their own, but if this is true, then why can't the
> applications that require them use them? Sure, function calls are a
> bit messier than operators, but they aren't _that_ bad.

If you have to perform all your arithmetic via function calls, it
becomes messy. I've spent much time tracking down minor bugs on
code and/or libraries that emulate 64-bit arithmetic because the
compiler didn't support it directly. This is a problem for N-bit
arithmetic in general. The discussion of larger types was in the
context of what minimum precision the Standard should require.

> I haven't really thought much about the specifics of how I will have
> to implement this in our compiler, should it become necessary, so I

> can't really say how hard it will be. ...

We'll include a post-implementation analysis after we publish the port
of GCC.

> One point that I think I must be completely missing was that SBEIR
> allowed the programmer to get closer to the hardware in a portable
> manner. I'm not sure I understand this, since as you get closer to
> hardware, you typically get less portable to dissimilar hardware. If
> portability is the goal, I think it's a bad idea to _try_ to get close
> to the hardware.

In this context, ``getting closer to the hardware'' means ``being better
able to take advantage of the hardware''. Because you have access to
space/time optimizations, you can specify these generically rather than
including knowledge of specific implementations (e.g., the type X is
the fastest type of at least 16 bits on hardware Y).

> C types are not perfect, but I think they've proven workable (if
> annoying at times). The SBEIR proposal may fix some of these short-
> comings, but then again, it may not, and it's a pretty serious change
> to make before it has been proven. I think the greatest success of
> the Standard is that it mostly codified existing practice, ie. it
> standardized something that was proven, rather than standardized on
> something and then checked to see if that worked (like some of the
> other language standards).

The goal of C89 was simply to codify existing practice. We weren't
interested in inventing unless absolutely necessary (we did add
"const", "volatile", function prototypes, and locales). Since C89,
the Numeric C Extension Group has been addressing several needs for
numeric programming. One of them was extended integer range. There
has been much experience with extended integer types. The IEEE 1596.5
Standard requires 8, 16, 32, 64, and 128 bit types. As best as I can
tell from the information from that standards committee, there are
compilers that run on HP, Sun, DEC, and IBM that support these types (I'm
not sure if they are commercially available). The use of multiprecision
libraries has been out for a while. The extension of arithmetic operators
for larger types has been available for a while (e.g., C++ -- although
C++'s operator overloading scheme is messed up). The C9X charter considers
prior art, but the prior art is not required to be solely within
the C language. The GCC implementation and others will be a basis for
testing the C binding of these features. These features have already
been proven elsewhere.

Alan Stokes

Aug 29, 1995
In <41cu3g$p...@newsbf02.news.aol.com> ffar...@aol.com (FFarance) writes:
>> From: al...@rcp.co.uk (Alan Stokes)
>>
>> I've read your proposal, and in general think it excellent. However,
>> you only allow for precision to be specified as a number of bits.
>> I frequently want to be able to specify precision as a range (bits is
>> fine for bit twiddling, but if I just want to store a value between
>> 0 and SOME_MACRO efficiently I'm stuck).

>In the SBEIR paper, the EIR macros include "EIR_LG2CEIL" which takes
>the logarithm base 2 and truncates *upward* (the ``ceiling'' function).
>You might use it as:

> signed int atleast:EIR_LG2CEIL(SOME_MACRO) X;

Fair enough then (I missed that the first time I read the paper, I'm
afraid).

Peter Curran

Aug 31, 1995
In <41d31q$i...@murrow.corp.sgi.com>, John R. Mashey
(ma...@mash.engr.sgi.com) wrote

<An interesting argument suggesting 128-bit machines will be a long
time in coming, which I will not repeat.>

Allow me to present another argument, arriving at much the same
conclusion w.r.t. C. I will assume, for the sake of concreteness, a
conventional 8-bit/byte model, but that is not critical to the
discussion.

In my experience, integer usage is subject to a version of the
well-known 80-20 rule.

80% of all integer usage can be satisfied with 8-bit integers (this
includes all character and string processing.)

80% of the remaining 20% can be satisfied with 16-bit integers (most
loop counters, and the like, for example.)

80% of the remaining 4% can be satisfied with 32-bit integers (or should
that be 24-bit integers?)

And so on. Obviously these numbers are pure speculation, based on
a lot of experience but no serious study. However, I am reasonably
sure that they are in the right ballpark.
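
[For reference, the cumulative arithmetic: 80%, plus 80% of the remaining
20% (16%), plus 80% of the remaining 4% (3.2%), covers 99.2% of integer
usage with 32 bits or less, leaving under 1% for anything wider.]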

The implication is that there are very few uses for 64-bit integers,
and almost none for 128-bit integers. There may be good reason to
build 128-bit machines, for throughput reasons, but that doesn't mean
there are a lot of programs waiting for 128-bit integers. I am
not saying they are useless, but I am saying that I don't think there
are many applications that need integers larger than 64 bits but not
larger than 128 bits.

The implication of this for C, if I'm right, is that there isn't a
great need for a method of handling integers larger than 64 bits.
There are a few such applications, but they could be handled on an
_ad_hoc_ basis (e.g. using macros, vendor-specific types, etc.) with
less-than-perfect portability. People who work with such code, and
who need to port their code (a small number, I suggest) would have to
pay a small price to do the porting. (When I went to school, portable
code was defined as "code that can be moved more easily than it can be
rewritten.") The rest of us, who have no need for such esoterica,
would not have to get involved with it.

In other words, IMHO, simply adding a 64-bit-integer type for C would
satisfy the vast majority of users, without all the complexity of the
other proposals floating around. (My preference would be to make
'long' serve this purpose, and make those who can't read the standard,
who assumed 'long' is 32 bits, fix their code, but I realize that
isn't going to happen.)

--
Peter Curran pcu...@isgtec.com
ISG Technologies, Inc (905) 672-2100 X315
6509 Airport Road, http://www.isgtec.com Fax (905) 672-2307
Mississauga, Ontario, Canada Usual disclaimers apply


FFarance

Sep 6, 1995
> From: pcu...@isgtec.com (Peter Curran)
>
> ...

>
> The implication of this for C, if I'm right, is that there isn't a
> great need for a method of handling integers larger than 64 bits.
> There are a few such applications, but they could be handled on an
> _ad_hoc_ basis (e.g. using macros, vendor-specific types, etc.) with
> less-than-perfect portability.

The same was said of 64 bits years ago. This is now an expensive
porting problem because the mapping of type intents (see SBEIR paper)
to C ``types'' is different for 64-bit machines AND different among
64-bit machines.

One could easily say the same was said of 32 bit machines 15-20 years
ago (e.g., prior to the "long" type the time was passed as an array
of 2 "int"s).

> People who work with such code, and
> who need to port their code (a small number, I suggest) would have to
> pay a small price to do the porting. (When I went to school, portable
> code was defined as "code that can moved more easily than it can be
> rewritten.") The rest of us, who have no need for such esoterica,
> would not have to get involved with it.

I agree with your definition of porting. However, porting code is fairly
expensive. All solutions that don't maintain the programmer's intent
(i.e., lose information) become expensive to port because in these
cases, where information is lost, the programmer has embedded his/her
assumptions within the code (see below -- your 64-bit integer). The
expense comes from having to re-investigate what the programmer had
intended (was it ``exactly 32 bits'', ``at least 32 bits'', ``the fastest
type of at least 32 bits'', etc.). Even after you've ported the code
to the next system (probably bracketed with "#ifdef"s for different
systems), you still have the porting problem again.

You might say, ``just use "short", "int", and "long" the same for
each program''. This might work, but in commercial and/or high
performance systems, using a non-optimum type can cause severe
space or time problems (i.e., performance problems). Thus, you can't
use the same C type on each port even though it satisfies the functional
requirements.

> In other words, IMHO, simply adding a 64-bit-integer type for C would
> satisfy the vast majority of users, without all the complexity of the
> other proposals floating around. (My preference would be to make
> 'long' serve this purpose, and make those who can't read the standard,
> who assumed 'long' is 32 bits, fix their code, but I realize that
> isn't going to happen.)

Adding a 64-bit type, say "int64", would be very short-sighted. For
the programmers that use this type, what happens when they move to
a different architecture (e.g., 72-bit, 96-bit, 128-bit)? Would "int64"
mean ``exactly 64 bits'', ``at least 64 bits'', ``the fastest type of
at least 64 bits'', ``the smallest type of at least 64 bits''? Not all
people would make the same assumptions. Even if we standardize "int64"
to mean ``at least 64 bits'', this wouldn't solve the porting problem
because if you meant ``the fastest type of at least 64 bits'', you might
use "int64" on one system, "long" on another, and "int128" on other
systems. Thus, you haven't solved the porting problem. In fact, adding
a type like this causes more portability problems because less information
(statistically) is retained (see the SBEIR paper).

John Carr

Sep 6, 1995
In article <42kc1t$r...@newsbf02.news.aol.com>,
FFarance <ffar...@aol.com> wrote:

>One could easily say the same was said of 32 bit machines 15-20 years
>ago (e.g., prior to the "long" type the time was passed as an array
>of 2 "int"s).

The situation is not comparable.

32 bit (and larger) machines had been around for years when C was developed.
A portable language 20 years ago should have allowed for 32 bit (or larger)
data types. C was developed on a low end system without plans for
portability to high end systems.

--
John Carr (j...@mit.edu)

Andries Brouwer

Sep 6, 1995
ffar...@aol.com (FFarance) writes:

:: From: pcu...@isgtec.com (Peter Curran)


::
:: ...
:: The implication of this for C, if I'm right, is that there isn't a
:: great need for a method of handling integers larger than 64 bits.
:: There are a few such applications, but they could be handled on an
:: _ad_hoc_ basis (e.g. using macros, vendor-specific types, etc.) with
:: less-than-perfect portability.

: The same was said of 64 bits years ago. This is now an expensive
: porting problem because the mapping of type intents (see SBEIR paper)
: to C ``types'' is different for 64-bit machines AND different among
: 64-bit machines.

: One could easily say the same was said of 32 bit machines 15-20 years
: ago (e.g., prior to the "long" type the time was passed as an array
: of 2 "int"s).

I don't think so - I have worked with 27-, 36- and 60-bit machines
long ago (at least in the 27-bit case, over 24 years ago).

What is an integer? A number, something that you add and multiply.
My claim is that you never need more than 64 bits, and if you do
(as I do all the time, doing exact linear programming on rationals)
128 bits do not suffice either, you need unlimited precision.

For what other purposes does one use an integer?
As representation for a set, that is, a bitfield of given length.
As representation for a structure, if all the fields happen to fit.
As vehicle for moving other things around in case they happen
to fit, like pointers or floats.

But of course, if you need a structure or a bitfield then it is
better to declare one, and leave the representation to the compiler.
If you want to store a pointer into an integer, you deserve to be
punished, you need a union.

Then there is one further use: to address certain hardware registers
that have a given size. Such issues are hardware dependent and are
better handled by a given compiler on a given machine.

In short: so far I have not seen a single justification for adding
provisions to the language to describe integers (those things that
one adds and multiplies) of some fixed length larger than 64.
But having bitfields of arbitrary length, such that one could take
their address, and have boolean operations on them, would certainly
be very useful (and easy to implement).


Martin Kealey

Sep 7, 1995
Richard A. O'Keefe (o...@goanna.cs.rmit.edu.au) wrote:

> ffar...@aol.com (FFarance) writes:
> >Another problem I'm sure you've encountered is that getting an exact
> >type is difficult when the compiler doesn't support it.

> I am still rather bewildered by this. If the compiler _doesn't_
> support a type of some specific size, then how can you possibly get it?

[The compiler has to generate code to make it work, since the programmer
who asked for it would have had a good reason to use it - like a CRC
calculation, for instance.]

> Example 1: The hardware is a DEC-20 supporting 18-bit and 36-bit integer
> arithmetic. The program asks for _exactly_ 32 bits. A single variable
> is allocated. What happens?
> [Probable answer: a 36-bit word is allocated, and some kind of checking
> or masking is done when the variable is assigned to.]

The use of the term "exact" as Frank Farance uses it in this context means
"when you read the value you get exactly this number of bits". The point
about exactness is the behaviour of the (mod 2^N) integer field.

The question of whether you pack bits between machine words is handled
(potentially) by the "small" attribute. Even that is only a "hint" that,
since we're going to declare a humungous array of these things, the
compiler really ought to make them as compact as possible - but even
then padding would be allowed, since we're talking about the semantics
of the abstract program, not actually the process of matching hardware
characteristics.

What is being proposed is probably more comprehensive in its expressive
power than is available in any other mainstream language today; the use
of several orthogonal sets of modifiers allows very fine definition of
what the required behaviour is.

Occasionally I think Frank gets the order of things a little bit confused:
the SBEIR is the *most abstract* level, which is then mapped onto specific
real object sizes which are "well suited" to the particular architecture,
which in turn are mapped onto entities like machine registers, co-processor
registers, parameter stack words, and "general" memory words. These last
set may not all have the same size, but the requirement is that they
eventually produce the same "as if" results for the abstract program.

> An array of these objects is allocated. What happens?
> Is it supposed to be the way Pascal 'packed arrays' were originally
...
> As it happens, the DEC-20 _could_ do this, because it could address
> bitfields anywhere in memory, using a fattened pointer.

In this case, yes, "int exact:5 small x[8000]" would use 5000 bytes of
storage. On a machine which doesn't have built-in unpacking, it might
use 8000 bytes.

> Example 2: The hardware is a 24-bit DSP, and the program asks for
> _exactly_ 16 bits (nothing that size) or _exactly_ 32 bits.

If you want a 32-bit CRC, then you want a 32-bit CRC; no amount of saying
"but the hardware doesn't support it" is going to get around that
requirement; it's up to the compiler to do its utmost.

> Example 3: The hardware is a very nice 32-bit RISC chip.
> The program asks for _exactly_ 24 bits (in order to talk to the DSP chip
> mentioned in example 2).

Then you are dealing with (non-portable) hardware, and know what your
compiler is going to produce when you give it this request.
"int exact:24 fast DSP" is the obvious way to phrase it, but even it's not
guaranteed to produce what you want.

> What I'm trying to understand is how "exactly N bit" types can have any
> role in *portable* code.

see note about 32-bit CRC above.

> As soon as bit fields entered the language, I switched
> over to using them. I switched back as soon as I found I was getting
> _worse_ code (which stayed true for a _long_ time), and when the C
> standard came out and I found out just how non-portable they were, I
> decided never ever to use another C bit-field in my life. I do have some
> code that could use bit-fields, but I use explicit shifts and masks
> FOR BETTER PORTABILITY.

This is sad; the compiler should be able to do at least as well as
a human at automatically generating the shift-and-mask, since it should
also know exactly when it can shortcircuit things (like, it happens
to already have the shortened value held in another register).
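
[For reference, the two forms being compared -- a sketch with invented
field names; the bit-field layout is implementation-defined, which is
exactly the portability complaint, while the shift-and-mask form pins the
layout down in the source:]

/* Bit-field version: compact to write, layout decided by the compiler. */
struct status_bf {
    unsigned int mode  : 3;
    unsigned int ready : 1;
    unsigned int count : 12;
};

/* Shift-and-mask version: layout is explicit, at the cost of writing the
   masking by hand at every use. */
#define COUNT_SHIFT 4
#define COUNT_MASK  0xFFFu
#define GET_COUNT(w)    (((w) >> COUNT_SHIFT) & COUNT_MASK)
#define SET_COUNT(w, v) ((w) = ((w) & ~(COUNT_MASK << COUNT_SHIFT)) | \
                               (((v) & COUNT_MASK) << COUNT_SHIFT))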

Maybe we should be able to attach the "register" or "fast" tag to any
particular bitfield to say "make this one go the fastest", which would
normally mean "put this in the low order bits so it doesn't have to be
shifted". (I'd prefer "register" since it would be orthogonal to the
"fast" attribute, simply specifying *which* bitfield was prefered
for the low-order bits without changing the pointer convention, while
"fast" has the potential to change the pointer width.)

> >The IEEE 1596.5
> >Standard requires 8, 16, 32, 64, and 128 bit types.

> In that case, why not just require a header
> <ieee1596.h>
> that typedefs
> s8int, u8int, ..., s128int, u128int
> and leave it at that?

The argument about these types not specifying enough information has
been hashed over many times; since most machines are going to implement
these typenames as exact types by default, it's going to make life hard
later on when someone realises they can make their compiler make faster
code by using "atleast" semantics, while some other programmer somewhere
else has assumed that these types imply "exact". All of a sudden, what
was thought to be portable code keels over & dies, and the compiler
vendor will probably get the rap for doing what they're supposed to be
allowed to do.

> I do hope the SBEIR proposal includes a way to specify integer sizes
> in human-oriented units, so that I don't have to figure out some
> kluge to approximate log-to-the-base-2 in macros.

Actually, I agree - it would be nice to specify ranges, since in theory
one could store a value of type "int small range:-5...250" in 8 bits.

Hey Frank, what about this? Add "range:X...Y" as another modifier which
could be used (in place of "atleast:" or "exact:" where applicable).

One more for the "wish list" - bit rotation operators, eg "<<|" and ">>|"
to make optimal use of exact bit width values.
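
[For comparison, the way such a rotation has to be spelled today on a
32-bit value -- a sketch; it assumes the value has already been masked to
32 bits, and guards against the undefined shift by 32:]

unsigned long rotl32(unsigned long x, unsigned int n)
{
    n &= 31;
    if (n == 0)
        return x & 0xFFFFFFFFUL;
    return ((x << n) | ((x & 0xFFFFFFFFUL) >> (32 - n))) & 0xFFFFFFFFUL;
}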

-Martin.

Nathan Sidwell

Sep 7, 1995
FFarance (ffar...@aol.com) wrote:
: > From: pcu...@isgtec.com (Peter Curran)

: > The implication of this for C, if I'm right, is that there isn't a
: > great need for a method of handling integers larger than 64 bits.
: > There are a few such applications, but they could be handled on an
: > _ad_hoc_ basis (e.g. using macros, vendor-specific types, etc.) with
: > less-than-perfect portability.

: The same was said of 64 bits years ago. This is now an expensive
: porting problem because the mapping of type intents (see SBEIR paper)
: to C ``types'' is different for 64-bit machines AND different among
: 64-bit machines.

: One could easily say the same was said of 32 bit machines 15-20 years
: ago (e.g., prior to the "long" type the time was passed as an array
: of 2 "int"s).

The reason the natural integer size of machines has gone from 16 bits
to 32 bits to 64 bits is _not_ _primarily_ because programs are
processing larger integers, but because memory sizes have gone up.

As addressable memory increases, you need larger pointers to address it
with. You need to be able to keep a whole pointer in a register (or
register pair, but that gets awkward). Therefore it's natural to make the
register width the same as the width of a pointer. Now when you have
pointers, you need to perform arithmetic on them, so the architecture
must have some operations which work on the same width as the register.
The upshot of this is that you get integers of the same size as
registers.

As the driving force behind increasing register size is accessible memory,
it seems unlikely that 128bit scalar machines will be required. 2^64
is a phenomenally large number, it'll take you nearly 6 years
to fill it up at a bandwidth of 100 Gigabytes per second.
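
[For reference, the arithmetic: 2^64 bytes is about 1.8 x 10^19; at 100
gigabytes per second, i.e. about 10^11 bytes per second, that is roughly
1.8 x 10^8 seconds, a little under six years.]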

The class of problems which require large integers (cryptography etc)
are not necessarily best supported by just increasing the register
size on the machine. Having the ability to chain arithmetic instructions
to perform long arithmetic is likely to be a better solution (although
current architectures can be awkward to use this way).

nathan

--
Nathan Sidwell Holder of the Xmris home page
Chameleon Architecture Group at SGS-Thomson, formerly Inmos
http://www.pact.srf.ac.uk/~nathan/ Tel 0117 9707182
nat...@inmos.co.uk or nat...@bristol.st.com or nat...@pact.srf.ac.uk

Thad Smith

Sep 7, 1995
In article <6495249.3...@kcbbs.gen.nz>,
mar...@kcbbs.gen.nz (Martin Kealey) wrote:

>> An array of these objects is allocated. What happens?
>> Is it supposed to be the way Pascal 'packed arrays' were originally
>...
>> As it happens, the DEC-20 _could_ do this, because it could address
>> bitfields anywhere in memory, using a fattened pointer.
>
>In this case, yes, "int exact:5 small x[8000]" would use 5000 bytes of
>storage. On a machine which doesn't have built-in unpacking, it might
>use 8000 bytes.

I don't see how that could happen and be compatible with C pointer
arithmetic, in which the byte is the smallest addressable unit. What
is sizeof x[0]?

>> As soon as bit fields entered the language, I switched
>> over to using them. I switched back as soon as I found I was getting
>> _worse_ code (which stayed true for a _long_ time), and when the C
>> standard came out and I found out just how non-portable they were, I
>> decided never ever to use another C bit-field in my life. I do have some
>> code that could use bit-fields, but I use explicit shifts and masks
>> FOR BETTER PORTABILITY.
>
>This is sad; the compiler should be able to do at least as well as
>a human at automatically generating the shift-and-mask, since it should
>also know exactly when it can shortcircuit things (like, it happens
>to already have the shortened value held in another register).

I think the issue here is the implementation-defined mapping of
bit-fields to memory, namely the order of allocation within the
storage unit (msb first or lsb first) and whether/when alignment takes
place. Borland, for example, uses the addressability characteristics
of the 80x86 memory system to allow bit-fields to straddle bytes and
words and still get efficient operations. I suspect a lot of
compilers don't do this. What is usually desired, I think, is a good
high-level way of specifying the layout of binary data for reading and
writing to standard external binary formats. I recall that Frank said
that some proposal is being prepared to address this issue.

Thad

Paul Eggert

Sep 7, 1995
mar...@kcbbs.gen.nz (Martin Kealey) writes:

> What is being proposed is probably more comprehensive in its expressive
> power than is available in any other mainstream language today; the use
> of several orthogonal sets of modifiers allows very fine definition of
> what the required behaviour is.

> Occasionally I think Frank gets the order of things a little bit confused:

Perhaps the proposal is a little _too_ comprehensive,
if even its proponents get confused by its complexity.

Will Rose

Sep 7, 1995
Peter Curran (pcu...@isgtec.com) wrote:
[...]
: In other words, IMHO, simply adding a 64-bit-integer type for C would

: satisfy the vast majority of users, without all the complexity of the
: other proposals floating around. (My preference would be to make
: 'long' serve this purpose, and make those who can't read the standard,
: who assumed 'long' is 32 bits, fix their code, but I realize that
: isn't going to happen.)

I messed up a lot of software in my youth, when I moved between
16 and 32 bit ints. After that painful exercise, I tried to
stick to the 'natural sizes', and assume nothing. Is the assumption
of 32 bit ints widespread?

OTOH, I sometimes 'lock' a size by using a short, which I've
always assumed to be 16 bits; this can be a false assumption,
and still dangerous. Perhaps we do need a generally available
64-bit type, but not long, which is spoken for; the largest
integer the compiler wants to support...

Will
c...@crash.cts.com


Peter Curran

Sep 7, 1995
ffar...@aol.com (FFarance) wrote:

>> From: pcu...@isgtec.com (Peter Curran)
>> The implication of this for C, if I'm right, is that there isn't a
>> great need for a method of handling integers larger than 64 bits.

>The same was said of 64 bits years ago. This is now an expensive
>porting problem because the mapping of type intents (see SBEIR paper)
>to C ``types'' is different for 64-bit machines AND different among
>64-bit machines.

Yes - that is why I said using 'long' for the 64-bit type could not
happen.

>One could easily say the same was said of 32 bit machines 15-20 years
>ago (e.g., prior to the "long" type the time was passed as an array
>of 2 "int"s).

No - from the earliest days of C it was clear a 32-bit type was needed
- it just didn't exist in the hardware of the time, and so it wasn't
provided in the language. That is why the interface to functions like
time() is so awkward. It is also why, when defining a language, we
should listen to software needs over hardware needs. (And, IMHO, it
is very hard to see a need for very long integers.)

>I agree with your definition of porting. However, porting code is fairly
>expensive. All solutions that don't maintain the programmer's intent
>(i.e., lose information) become expensive to port because in these
>cases, where information is lost, the programmer has embedded his/her
>assumptions within the code (see below -- your 64-bit integer). The
>expense comes from having to re-investigate what the programmer had
>intended (was is ``exactly 32 bits?'', ``atleast 32 bits?'', ``the fastest
>type of at least 32 bits?'', etc.). Even after you've ported the code
>to the next system (probably bracketed with "#ifdef"s for different
>systems), you still have the porting problem again.

First, I am well aware of the cost of porting, when the software was
not written with porting in mind. I don't think the SBEIR proposal
will change that one bit. It is entirely possible to write highly
portable code in C now (with the implementation minimums, etc., firmly
in mind). With the SBEIR proposal, people who write with porting in
mind will have a lot more details to handle, and people who don't will
still make a mess.

Second, I think it is pointless to try to write code that is fast at
the micro level and portable at the same time. Portable speed comes
from algorithm design, not bit twiddling. Making a specific variable
"fast" or "slow" provides no guarantee whatsoever about the
performance of the program on new hardware - your assumptions about
the hardware could be completely wrong. Performance (at the lowest
level) and portability are fundamentally incompatible.

>Adding a 64-bit type, say "int64", would be very short-sighted. For
>the programmers that use this type, what happens when they move to
>a different architecture (e.g., 72-bit, 96-bit, 128-bit)? Would "int64"
>mean ``exactly 64 bits'', ``at least 64 bits'', ``the fastest type of
>at least 64 bits'', ``the smallest type of atleast 64 bits''? Not all
>people would make the same assumptions. Even if we standardize "int64"
>to mean ``at least 64 bits'', this wouldn't solve the porting problem
>because if you meant ``the fastest type of at least 64 bits'', you might
>use "int64" on one system, "long" on another, and "int128" on other
>systems. Thus, you haven't solved the porting problem. In fact, adding
>a type list this causes more portability problems because less information
>(statistically) is retained (see the SBEIR paper).

As I said in my original article, I was assuming a conventional
8-bit/byte architecture for simplicity - I did not mean literally 64
bits. I meant 64 bits in the same sense as a "long" is currently 32
bits - the minimum length the standard requires.

In 25 years of programming in C, I have never encountered a need for a
variable that is exactly 'n' bits, except when working on inherently
non-portable software - e.g. device drivers and the like, or when
doing machine-specific optimizations. I have never encountered an
algorithm that could not be implemented effectively given only a
minimum guarantee on the size of the variables involved. That does
not mean such algorithms don't exist - but IMHO the whole of the C
community does not need to be burdened with the cost of providing for
such esoteric requirements.

[Please note change of address.]

--
Peter Curran pcu...@inforamp.net


Peter Curran

Sep 7, 1995
c...@crash.cts.com (Will Rose) wrote:
>I messed up a lot of software in my youth, when I moved between
>16 and 32 bit ints. After that painful exercise, I tried to
>stick to the 'natural sizes', and assume nothing. Is the assumption
>of 32 bit ints widespread?

The standard currently guarantees that 'long' is at least 32 bits. It
would be completely consistent with the standard for an implementation
to make it 64 bits, and some have done so. However, the principal
argument I have heard against making this the conventional way of
providing 64-bit integers is that it would break too much code that
assumes long is exactly 32 bits. I personally haven't seen much code of that
sort, but I have to assume it exists.

>OTOH, I sometimes 'lock' a size by using a short, which I've
>always assumed to be 16 bits; this can be a false assumption,
>and still dangerous. Perhaps we do need a generally available
>64bit type, but not long, which is spoken for; the largest
>integer the compiler wants to support...

Unfortunately, while that may have been the intent, in practice 'long'
appears to mean 'the shortest natural integer size of at least 32
bits.' Ugly, but apparently that's reality.

--
Peter Curran pcu...@inforamp.net


Lawrence Kirby

Sep 8, 1995
In article <6495249.3...@kcbbs.gen.nz>
mar...@kcbbs.gen.nz "Martin Kealey" writes:

>[The compiler has to generate code to make it work, since the programmer
>who asked for it would have had a good reason to use it - like a CRC
>calculation, for instance.]

A CRC only requires a minimum variable width corresponding to the CRC
width. A totally portable 32 bit CRC generator can be written in plain
ANSI C using longs.
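
[For example, the core of such a generator -- a sketch using the usual
reflected CRC-32 polynomial 0xEDB88320; because the loop only shifts right,
the value never exceeds 32 bits even where unsigned long is wider:]

#include <stddef.h>

unsigned long crc32(const unsigned char *buf, size_t len)
{
    unsigned long crc = 0xFFFFFFFFUL;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1UL)
                crc = (crc >> 1) ^ 0xEDB88320UL;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFUL;
}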

I first thought that SBEIR was an interesting idea but now I feel that
it is totally inappropriate for a language like C. At the very least it
needs to be proved and hence it should not be considered for the imminent
revision of the C standard, possibly for later ones. All we really
need is a couple of extra types guaranteeing the existence of larger integer
widths. For example int32, int64 and int128 which correspond to the
smallest widths that can reasonably be implemented on the platform of at least
32, 64 and 128 bits respectively. This frees int to remain the "natural size
suggested by the architecture" as the current standard says. It also frees
long from the somewhat conflicting roles it enjoys in practice as a 32 bit
type (which is all portable code can assume) and as a 64 bit type on 64 bit
systems. This solution also makes expanding printf and scanf format specifiers
(not to mention general standard library support) relatively painless. They
would be a nightmare under SBEIR.

>> Example 1: The hardware is a DEC-20 supporting 18-bit and 36-bit integer
>> arithmetic. The program asks for _exactly_ 32 bits. A single variable
>> is allocated. What happens?
>> [Probable answer: a 36-bit word is allocated, and some kind of checking
>> or masking is done when the variable is assigned to.]

Frankly the use for an exact 36 bit type is so limited it doesn't warrant
inclusion in the language. In the very few instances where you need to
perform 36 bit calculations use a 64 bit type (which should certainly be
supported by the language revision) and mask appropriately. The object code
may even end up more efficient than using a compiler supplied 36 bit type
since masking only tends to be required at certain key points and you, the
programmer, are in a much better position to know what they are.
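
[A sketch of what "mask appropriately" looks like -- exact 36-bit unsigned
arithmetic built on a wider type; "uint64" is just a stand-in for whatever
unsigned type of at least 64 bits the implementation offers:]

typedef unsigned long uint64;   /* assumes unsigned long is >= 64 bits here */

#define MASK36 (((uint64)1 << 36) - 1)

uint64 add36(uint64 a, uint64 b) { return (a + b) & MASK36; }
uint64 sub36(uint64 a, uint64 b) { return (a - b) & MASK36; }
uint64 mul36(uint64 a, uint64 b) { return (a * b) & MASK36; }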

...

>What is being proposed is probably more comprehensive in its expressive
>power than is available in any other mainstream language today; the use
>of several orthogonal sets of modifiers allows very fine definition of
>what the required behaviour is.

The problem is that it is too expressive and will get misused. It is extremely
rare to require an integer type that does not correspond to some simple
multiple or factor of the machine's register width. SBEIR is simply a recipe
for unnecessarily inefficient programs.

>> Example 2: The hardware is a 24-bit DSP, and the program asks for
>> _exactly_ 16 bits (nothing that size) or _exactly_ 32 bits.
>
>If you want a 32-bit CRC, then you want a 32-bit CRC; no amount of saying
>"but the hardware doesn't support it" is going to get around that
>requirement; it's up to the compiler to do its utmost.

No, it is up to the programmer to write it correctly. The tools already
exist.

>> Example 3: The hardware is a very nice 32-bit RISC chip.
>> The program asks for _exactly_ 24 bits (in order to talk to the DSP chip
>> mentioned in example 2).
>
>Then you are dealing with (non-portable) hardware, and know what your
>compiler is going to produce when you give it this request.
>"int exact:24 fast DSP" is the obvious way to phrase it, but even it's not
>guaranteed to produce what you want.

It is code that can be non-portable, not hardware. Like 36-bit operations,
24-bit operations can be simulated quite easily in C. The sort of behind the scenes
jiggery-pokery that would be required to get the compiler to do it is against
the general C principle of minimising hidden overheads. Again the general
need for exact 24 bit objects is so low that the language should not be
burdened with it.

>> What I'm trying to understand is how "exactly N bit" types can have any
>> role in *portable* code.
>
>see note about 32-bit CRC above.
>
>> As soon as bit fields entered the language, I switched
>> over to using them. I switched back as soon as I found I was getting
>> _worse_ code (which stayed true for a _long_ time), and when the C
>> standard came out and I found out just how non-portable they were, I
>> decided never ever to use another C bit-field in my life. I do have some
>> code that could use bit-fields, but I use explicit shifts and masks
>> FOR BETTER PORTABILITY.
>
>This is sad; the compiler should be able to do at least as well as
>a human at automatically generating the shift-and-mask, since it should
>also know exactly when it can shortcircuit things (like, it happens
>to already have the shortened value held in another register).

The compiler can only assume as much as it can deduce from the code. The
programmer can usually go further.

>> In that case, why not just require a header
>> <ieee1596.h>
>> that typedefs
>> s8int, u8int, ..., s128int, u128int
>> and leave it at that?
>
>The argument about these types not specifying enough information has
>been hashed over many times; since most machines are going to implement
>these typenames as exact types by default, it's going to make life hard
>later on when someone realises they can make their compiler make faster
>code by using "atleast" semantics, while some other programmer somewhere
>else has assumed that these types imply "exact".

This is the problem with long I referred to earlier. The standard should make
it quite clear that these new types should be as small as possible. Perhaps
a simple solution is to require, say, int32 to be between 32 and 63 bits and
int64 to be between 64 and 127 bits. I think this will have the desired
effect.

> All of a sudden, what
>was thought to be portable code keels over & dies, and the compiler
>vendor will probably get the rap for doing what they're supposed to be
>allowed to do.

Not really. Portable code has never been allowed to die simply because the
type was wider than necessary.

>> I do hope the SBEIR proposal includes a way to specify integer sizes
>> in human-oriented units, so that I don't have to figure out some
>> kluge to approximate log-to-the-base-2 in macros.
>
>Actually, I agree - it would be nice to specify ranges, since in theory
>one could store a value of type "int small range:-5...250" in 8 bits.
>
>Hey Frank, what about this? Add "range:X...Y" as another modifier which
>could be used (in place of "atleast:" or "exact:" where applicable).

You really need a stronger typing system than C has, bounds checking, and
language-defined I/O (anybody mention PASCAL?) for this to be useful.

>One more for the "wish list" - bit rotation operators, eg "<<|" and ">>|"
>to make optimal use of exact bit width values.

Interesting idea which would map efficiently onto many system architectures.
I don't think I've ever come across a use for it though. At the lower level
rotate-through-carry is generally more useful since it facilitates
multi-precision operations. Carry support would certainly make C a lower-level
language though!
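
For completeness, a rotate can already be written portably with the existing
shift operators (a sketch, assuming the value is kept within 32 bits and the
count is in the range 1..31):

    /* rotate the low 32 bits of x left by n (1 <= n <= 31) */
    unsigned long rotl32(unsigned long x, unsigned int n)
    {
        x &= 0xFFFFFFFFUL;
        return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFFUL;
    }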

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------

John R MacMillan

unread,
Sep 8, 1995, 3:00:00 AM9/8/95
to
|The use of the term "exact" as Frank Farance uses it in this context means
|"when you read the value you get exactly this number of bits". The point
|about exactness is the behaviour of the (mod 2^N) integer field.

Except signed integers don't have defined overflow behaviour, only the
unsigned types do, right? So "exact" signed integers can always be
treated as "atleast" integers that size or larger, since padding is
allowed, can they not?

This seems to practically limit the use of the "exact" modifier to
unsigned types. I was not convinced exact types were necessary or
desirable before; I'm even less convinced now.

|What is being proposed is probably more comprehensive in its expressive
|power than is available in any other mainstream language today; the use
|of several orthogonal sets of modifiers allows very fine definition of
|what the required behaviour is.

Are you saying there's no prior art? ;-)

|> What I'm trying to understand is how "exactly N bit" types can have any
|> role in *portable* code.
|
|see note about 32-bit CRC above.

There are portable 32-bit CRC implementations with the existing C type
system; this certainly does not require an exact type, though I grant
that it might be easier. But for every piece of portable code that is
helped by an exact type, there are _tons_ of pieces that don't. The
former fall into the class where something about the algorithm
specifies a bit-size (I don't think hardware counts, because if you're
dealing with specific hardware, your code is not portable anyway),
while the latter are everything else.
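
To make that concrete, here is the sort of CRC I mean (a sketch, assuming only
that unsigned long holds at least 32 bits; the working value never grows past
32 bits, so no masking is even needed inside the loop):

    /* bitwise CRC-32 (the IEEE 802.3 polynomial, reflected form) */
    unsigned long crc32(const unsigned char *p, unsigned long len)
    {
        unsigned long crc = 0xFFFFFFFFUL;
        unsigned int i;

        while (len--) {
            crc ^= *p++;
            for (i = 0; i < 8; i++)
                crc = (crc >> 1) ^ (0xEDB88320UL & (0UL - (crc & 1UL)));
        }
        return crc ^ 0xFFFFFFFFUL;
    }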

I still think it's a major addition to the language that only serves a
minor number of uses.

|> >The IEEE 1596.5
|> >Standard requires 8, 16, 32, 64, and 128 bit types.
|
|> In that case, why not just require a header
|> <ieee1596.h>
|> that typedefs
|> s8int, u8int, ..., s128int, u128int
|> and leave it at that?
|
|The argument about these types not specifying enough information has
|been hashed over many times; since most machines are going to implement
|these typenames as exact types by default, it's going to make life hard
|later on when someone realises they can make their compiler make faster
|code by using "atleast" semantics, while some other programmer somewhere
|else has assumed that these types imply "exact". All of a sudden, what
|was thought to be portable code keels over & dies, and the compiler
|vendor will probably get the rap for doing what they're supposed to be
|allowed to do.

Surely IEEE 1596.5 defines whether the types are exact or not? If so,
the implementation is not at liberty to change this. If they are not
exact, no matter _how_ you get them (ie. with SBEIR or with some
compiler-specific magic in ieee1596.h) some people will assume they are
exact. Just like some people now assume int is exactly 32 bits.

All this brings us back to whether or not we want to change C to
support exact types in the first place. I'd say no. Do we need a
better way to specify at least types in C? Again, I say no. So while
I think the SBEIR is well-crafted, and interesting, I don't think it
should be added to C. Perhaps D (or P, or C^2, or whatever :-) ).

Martin Kealey

unread,
Sep 9, 1995, 3:00:00 AM9/9/95
to
John R MacMillan (jo...@sco.com) wrote:
> |The point
> |about exactness is the behaviour of the (mod 2^N) integer field.

> Except signed integers don't have defined overflow behaviour, only the
> unsigned types do, right?

Good point, but what happens to ordinary signed bit fields which are
assigned too large a value? (sorry, don't have copy of std to hand).
I must admit it's pretty hard to think of a specific use for an exact
signed integer, however, it starts to make more sense in the
context of specified-range types - I *never* want a value outside
the specified range, and it should preferably fault on assignment rather
than waiting until I try to use the value later on (but yes I know
that is an implementation issue).

> So "exact" signed integers can always be treated as "atleast" integers
> that size or larger, since padding is allowed, can they not?

Padding may or may not be allowed with either atleast or exact types, much
as there may be padding bits between bitfields currently, but that doesn't
change the value semantics of the bitfield - the value read from a normal
bitfield will always be in the Z(mod 2^N) range (or signed equivalent).
Exact bit range integers are an orthogonal completion of structure
bitfields.

> Are you saying there's no prior art? ;-)

Ok, I'll bite: care to point one out? :-) [I'm actually hoping I'll learn
something here...]

> There are portable 32-bit CRC implementations with the existing C type
> system; this certainly does not require an exact type, though I grant
> that it might be easier. But for every piece of portable code that is
> helped by an exact type, there are _tons_ of pieces that don't.

(nor are they disadvantaged however by a feature they don't use)

> I still think it's a major addition to the language that only serves a
> minor number of uses.

I'd consider its greatest benefit that of documentation; if someone is
given an explicit mechanism to specify "exact" and then fails to use it,
they can't turn around & blame the compiler vendor for changing their
implementation of "atleast".


> |> >The IEEE 1596.5
> |> >Standard requires 8, 16, 32, 64, and 128 bit types.

> Surely IEEE 1596.5 defines whether the types are exact or not?

Sorry, I don't have that to hand either - but unless it defines BOTH
sets, then someone's code is going to break tomorrow because they made the
wrong assumption about the one set that is defined. (The breakage is
minor one way - loss of efficiency - but major the other - excess bits
causing out-of-range values.)

> All this brings us back to whether or not we want to change C to
> support exact types in the first place. I'd say no. Do we need a
> better way to specify at least types in C? Again, I say no.

This is true; the point is however, we need a way of documenting NOW when
exact types are needed so that we don't trip over them in today's code
when it's being recompiled on tomorrow's machines.

> So while
> I think the SBEIR is well-crafted, and interesting, I don't think it
> should be added to C. Perhaps D (or P, or C^2, or whatever :-) ).

Well I only use C because I have to; I'd much rather use a better
language, and if I can use a better language and still have it called
C so that I'm allowed to, then I'll be happy - in short, I think we
should do whatever is necessary to make C into a "proper" [:->]
language.

- Martin

Martin Kealey

unread,
Sep 9, 1995, 3:00:00 AM9/9/95
to
Thad Smith (th...@csn.net) wrote:
> In article <6495249.3...@kcbbs.gen.nz>,
> mar...@kcbbs.gen.nz (Martin Kealey) wrote:

> >> As it happens, the DEC-20 _could_ do this, because it could address
> >> bitfields anywhere in memory, using a fattened pointer.

> >In this case, yes, "int exact:5 small x[8000]" would use 5000 bytes of
> >storage. On a machine which doesn't have built-in unpacking, it might
> >use 8000 bytes.

> I don't see how that could happen and be compatible with C pointer
> arithmetic, in which the byte is the smallest addressable unit. What
> is sizeof x[0]?

As I recall, the term "byte" isn't defined by the standard; all that is
necessary is to define the sizeof operator for this architecture to return
the number of bits. (Can't remember if sizeof(char) is defined to be one,
but I don't think so.) Of course, although this would work, I don't expect
anyone will take this too seriously; it really just highlights a shortcoming
of C's addressing model.

A preferable basis for "sizeof" would be to count in whatever the
machine's minimal addressible units are, be they bits, octets, or words;
then this problem just evaporates.

> I think the issue here is the implementation-defined mapping of
> bit-fields to memory, namely the order of allocation within the
> storage unit (msb first or lsb first) and whether/when alignment takes
> place.

There are two reasons for using bitfields: (1) because you want to conform
with some external dataspace mapping, and (2) because you want a variable
to have exactly some maximum range of value. Point (1) is covered by
the data layout extensions, while point (2) is addressed by having an
"exact" specifier.

- Martin.

Chris Torek

unread,
Sep 9, 1995, 3:00:00 AM9/9/95
to
In article <DELvv...@sco.COM> John R MacMillan (jo...@sco.com) writes:
>>... for every piece of portable code that is helped by an exact
>>type, there are _tons_ of pieces that [are not].

In article <6495251.3...@kcbbs.gen.nz> Martin Kealey
<mar...@kcbbs.gen.nz> writes:
>(nor are they disadvantaged however by a feature they don't use)

Ah, would that this were so....

It has been quite a while since I wrote and maintained code in
FORTRAN. I was rather strict about my own code, using `IMPLICIT
UNDEFINED' (a vendor extension, but a common one) to avoid having
typographical errors turn into unexpected code. You might then say
that I was not disadvantaged by FORTRAN's implicit declarations,
but in fact I was. *Other* code, which I had to maintain, made
heavy use of such declarations.[%]

The situations are not entirely comparable, but the metaphor holds.
Those unused rooms still have to be heated, the windows kept clean,
the exterior painted. Nothing is ever quite free.

(Incidentally, I consider C's automatic function declaration one of
its bigger faults. Fortunately, most compilers have a switch to warn
about them. But there is still all that *other* code....)
-----
% Footnote for interesting anecdote: On learning of the old
UNIX `struct' utility, I began to run such code through
struct to convert it to Ratfor. (I had written my own
variant of a Ratfor translator while in high school, so it
was already familiar.) Struct was very good at ferreting
out the actual structure of the code, and I found its
results far more maintainable than the original inputs.
In the rare cases when struct gave up, the code was typically
incorrect in the first place.
-----

>I'd consider its greatest benefit that of documentation; if someone is
>given an explicit mechanism to specify "exact" and then fails to use it,
>they can't turn around & blame the compiler vendor for changing their
>implementation of "atleast".

There is an old saying about writing a compiler for natural languages:
you then discover that programmers cannot write in those either.
Pascal, which is really very much like C, offers range types and
packed arrays. The result is that programmers use them, but
inconsistently and often incorrectly. Yes, it is wonderful to be
able to say: `It is not our fault. We gave them options; they just
misused them.' But we should *expect* them to be misused and plan
accordingly: New features should pay for themselves in the *usual*
cases, and not detract when *mis*used in the usual *errors*. It
can be hard to predict what those usual cases and errors might be,
but here we have other examples to draw from (Pascal, PL/I, etc.).

If SBEIR is considered `different enough' or `interesting enough',
it should still be tested (in some reasonably large-scale manner,
e.g., by implementing it as yet another variant of GNUC) before
being made into a standard.

FFarance

unread,
Sep 9, 1995, 3:00:00 AM9/9/95
to
> From: th...@csn.net (Thad Smith)

>
> >> An array of these objects is allocated. What happens?
> >> Is it supposed to be the way Pascal 'packed arrays' were originally
> >...
> >> As it happens, the DEC-20 _could_ do this, because it could address
> >> bitfields anywhere in memory, using a fattened pointer.
> >
> >In this case, yes, "int exact:5 small x[8000]" would use 5000 bytes of
> >storage. On a machine which doesn't have built-in unpacking, it might
> >use 8000 bytes.
>
> I don't see how that could happen and be compatible with C pointer
> arithmetic, in which the byte is the smallest addressable unit. What
> is sizeof x[0]?

The type "int exact:5 small" must be addressible (use ``address-of''
operator'') and sizable (use "sizeof" operator). Semantically, this
type is similar to:

struct
{
int z:5;
};

Since a structure is addressible and sizable, this would be rounded up
to some addressible boundary. The "small" qualifier means that this
is rounded up to the *next* addressible boundary. The ordering is
implementation-defined, as usual, for bit fields. However, if you
use the OAX (ordering and alignment) and REP (data representation)
features, you can get the kind of specification you want. I'll post
the proposals as soon as they are on the WG14 & X3J11 FTP site.

In summary, the array above would use 8000 bytes.

Clive D.W. Feather

unread,
Sep 9, 1995, 3:00:00 AM9/9/95
to
In article <6495251.3...@kcbbs.gen.nz>,

Martin Kealey <mar...@kcbbs.gen.nz> wrote:
> As I recall, the term "byte" isn't defined by the standard;

Yes it is, as equivalent to char.

--
Clive D.W. Feather | If you lie to the compiler,
cl...@demon.net (work, preferred) | it will get its revenge.
cl...@stdc.demon.co.uk (home) | - Henry Spencer

Thad Smith

unread,
Sep 9, 1995, 3:00:00 AM9/9/95
to
In article <6495251.3...@kcbbs.gen.nz>,
mar...@kcbbs.gen.nz (Martin Kealey) wrote:
>Thad Smith (th...@csn.net) wrote:

>> >In this case, yes, "int exact:5 small x[8000]" would use 5000 bytes of
>> >storage. On a machine which doesn't have built-in unpacking, it might
>> >use 8000 bytes.
>
>> I don't see how that could happen and be compatible with C pointer
>> arithmetic, in which the byte is the smallest addressable unit. What
>> is sizeof x[0]?
>

>As I recall, the term "byte" isn't defined by the standard; all that is
>necessary is to define the sizeof operator for this architecture to return
>the number of bits. (Can't remember if sizeof(char) is defined to be one,
>but I don't think so.)

Byte is defined in the Standard to be "the unit of data storage large
enough to hold any member of the basic character set of the execution
environment..." I was going to say that the defined connection
between a byte and type char is rather loose, but found that the
sizeof operator gives the size of an object or type in "bytes" and also
that sizeof (char) is defined to be 1. This seems to establish that a
char must be the same size as a byte.

>A preferable basis for "sizeof" would be to count in whatever the
>machine's minimal addressible units are, be they bits, octets, or words;
>then this problem just evaporates.

This is already defined for C. A byte, however, can be a much larger
unit if small units are not directly addressable.

>> I think the issue here is the implementation-defined mapping of
>> bit-fields to memory, namely the order of allocation within the
>> storage unit (msb first or lsb first) and whether/when alignment takes
>> place.
>
>There are two reasons for using bitfields: (1) because you want to conform
>with some external dataspace mapping, and (2) because you want a variable
>to have exactly some maximum range of value. Point (1) is covered by
>the data layout extensions, while point (2) is addressed by having an
>"exact" specifier.

I would like to see a proposal for data layout extensions. For
the second reason, rather than using bitfields to specify a maximum
value, I use bitfields to conserve storage. In general I don't care
what the maximum value is. If I do, I explicitly use comparisons
and masks to handle maximums.
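
For instance (a sketch; the 12-bit limit and the names are hypothetical):

    #define SEQ_MAX 0x0FFFu                 /* hypothetical 12-bit maximum */

    /* wrap explicitly instead of relying on a bit-field's width */
    unsigned int next_seq(unsigned int seq)
    {
        return (seq + 1u) & SEQ_MAX;
    }

    /* or range-check it explicitly */
    unsigned int clamp_seq(unsigned int v)
    {
        return (v > SEQ_MAX) ? SEQ_MAX : v;
    }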

Ian Cargill

unread,
Sep 10, 1995, 3:00:00 AM9/10/95
to

>As I recall, the term "byte" isn't defined by the standard;

Yes, it is.

## 3.4 byte: The unit of data storage large enough to hold
## any member of the basic character set of the execution
## environment. It shall be possible to express the address of
## each individual byte of an object uniquely. A byte is
## composed of a contiguous sequence of bits, the number of
## which is implementation-defined. The least significant bit
## is called the low-order bit; the most significant bit is
## called the high-order bit.


>all that is
>necessary is to define the sizeof operator for this architecture to return
>the number of bits. (Can't remember if sizeof(char) is defined to be one,
>but I don't think so.)

The standard specifically states that sizeof (char) {or signed char
or unsigned char} MUST be 1. (Semantics of 6.3.3.4)

>A preferable basis for "sizeof" would be to count in whatever the
>machine's minimal addressible units are, be they bits, octets, or words;
>then this problem just evaporates.

And other problems would take their place. If sizeof returned
the number of bits, malloc would be problematical, since the amount
of memory required is not always the same as the number of bits used.
(e.g. packed structs for a familiar example.) You could have a 5 bit
int which still required 8 bits to be allocated. Seems to me that you
just swap one set of problems for another.


--
============================================================================
Ian Cargill CEng MIEE | Find out about the Association of C and C++ Users
Soliton Software Ltd. | in...@accu.org OR http://bach.cis.temple.edu/accu

Markus Freericks

unread,
Sep 10, 1995, 3:00:00 AM9/10/95
to
In article <zudUwQ9y...@csn.net> th...@csn.net (Thad Smith) writes:
> Byte is defined in the Standard to be "the unit of data storage large
> enough to hold any member of the basic character set of the execution
> environment..." I was going to say that the defined connection
> between a byte and type char is rather loose, but found that the
> sizeof operator gives the size of an object or type in "bytes" and also
> that sizeof (char) is defined to be 1. This seems to establish that a
> char must be the same size as a byte.

Wouldn't it simplify the understanding of the standard if the term "byte"
was either totally removed from the standard (as it is currently synonymous
with "char"), or totally separated from it?

As it is, I don't see any value to the term "byte". Now if "byte" was
defined as the smallest addressable unit of memory, and sizeof(char) >= 1
(e.g., sizeof(char)==2 on a nybble-addressable machine, if such a beast
exists), bytes would be very useful..

-- Markus

(followup set to comp.std.c)

Mark Brader

unread,
Sep 10, 1995, 3:00:00 AM9/10/95
to
Markus Freericks (m...@cs.tu-berlin.de) writes:
> Wouldn't it simplify the understanding of the standard if the term "byte"
> was either totally removed from the standard (as it is currently synonymous
> with "char"), or totally separated from it?

It's not actually synonymous: the two things are the same size, but one is
a data type and the other is a unit that storage comes in.



> As it is, I don't see any value to the term "byte". Now if "byte" was
> defined as the smallest addressable unit of memory, and sizeof(char) >= 1
> (e.g., sizeof(char)==2 on a nybble-addressable machine, if such a beast
> exists), bytes would be very useful..

This sort of thing was proposed during standardization, not so much for
the benefit of "nybble-addressable machines" as for those addressed in
8-bit units but using 2 or more of those units for a character (as in
Unicode and other character sets handling languages like Japanese).

The committee basically felt that such a change would break too much
existing code; although not promised in K&R1, it was generally assumed
that sizeof(char) would always be 1. The Rationale, in section 3.3.3.4,
puts this in terms that seem a bit too emphatic to me: "It is fundamental to
the correct usage of functions such as malloc and fread that sizeof (char)
be exactly one."

Since the standard continues to promise that sizeof(char) is 1, it is
equally infeasible to introduce this change today.
--
Mark Brader | "I don't care HOW you format char c; while ((c =
m...@sq.com | getchar()) != EOF) putchar(c); ... this code is a
SoftQuad Inc., Toronto | bug waiting to happen from the outset." --Doug Gwyn

My text in this article is in the public domain.

John R MacMillan

unread,
Sep 11, 1995, 3:00:00 AM9/11/95
to
|> Except signed integers don't have defined overflow behaviour, only the
|> unsigned types do, right?
|
|Good point, but what happens to ordinary signed bit fields which are
|assigned too large a value? (sorry, don't have copy of std to hand).

I believe if you assign to it, the behaviour is implementation
defined. But if an exception (which includes a result not in the
range of its type) occurs during the evaluation of an expression, the
behaviour is undefined.

|I must admit it's pretty hard to think of a specific use for an exact
|signed integer, however, it starts to make more sense in the
|context of specified-range types - I *never* want a value outside
|the specified range, and it should preferably fault on assignment rather
|than waiting until I try to use the value later on (but yes I know
|that is an implementation issue).

I don't think exact types would be able to require that the results
are never out range without requiring an exception, which is different
from current behaviour of integers and is another burden I wouldn't
want to impose on an implementation. Besides, as a specified-range
type, the SBEIR types are pretty lousy; they only work if your range
happens to map to a size exactly.

|> Are you saying there's no prior art? ;-)
|
|Ok, I'll bite: care to point one out? :-) [I'm actually hoping I'll learn
|something here...]

I'm not aware of any as ambitious as SBEIR, either. But the C9X
charter (on the archive site, anyway) says that new features _should_
have prior art.

|I'd consider its greatest benefit that of documentation; if someone is
|given an explicit mechanism to specify "exact" and then fails to use it,
|they can't turn around & blame the compiler vendor for changing their
|implementation of "atleast".

/*
* There's a simpler method of documenting your intent already in the
* language. :-)
*/

As for who is at fault on certain things, C already gives an explicit
mechanism for specifying certain atleast types, and no method (other
than bitfields or explicit masking) for exact types, and people manage
to misuse the current type system (and blame the vendors). Perhaps
I'm a pessimist, but I don't think that will change with SBEIR.

|Well I only use C because I have to; I'd much rather use a better
|language, and if I can use a better language and still have it called
|C so that I'm allowed to, then I'll be happy - in short, I think we
|should do whatever is necessary to make C into a "proper" [:->]
|language.

I guess this is the root of our disagreement. I tend to think of C as
being more or less ``done.'' I have no objection to minor cosmetic
fixes, but I think the C community benefits most from having a stable,
proven language. C is not perfect, but I think that the better
approach is to create a new language that is better than C (even if it
looks a lot like C in some ways) and show it to be better, rather than
trying to ``fix'' C. And you should convince your boss to let you use
the best tool for the job. :-)

What happens if the changes to C turn out to be a bad idea? You can't
really back them out; you're stuck with them. If Objective-C or C++
(or, dare I say it, C+@) end up failing miserably, we can just ignore
them, and C is still a useful language. If they take off, they may
supplant C in whole or in part, and that's fine too.

Derick J.R. Qua-Gonzalez

unread,
Sep 11, 1995, 3:00:00 AM9/11/95
to
mar...@kcbbs.gen.nz (Martin Kealey) writes:

>As I recall, the term "byte" isn't defined by the standard; all that is
>necessary is to define the sizeof operator for this architecture to return
>the number of bits. (Can't remember if sizeof(char) is defined to be one,
>but I don't think so.)

sizeof(char) is defined to be unity. The sizeof operator always returns
the number of chars required to store a type. If you want the number of
bits, you have to multiply this by the manifest constant CHAR_BIT in
<limits.h>.
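
For example (a sketch; note this counts bits of storage, which need not all be
usable value bits):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("long occupies %lu bits of storage\n",
               (unsigned long)(sizeof(long) * CHAR_BIT));
        return 0;
    }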

// DQG.
--
+------------------------------------------------------------------------+
| Derick R. Qua-Gonzalez | ________ |
| Department of Physics, California State University | \ / |
| dq...@dqua.EarthLink.Net | \ / |
| ``It is better to be hated for what one is, | \ / |
| than loved for what one is not.'' (A. Gide) | G \/ USA|
+-------------------------------------------------------------+----------+

Zefram

unread,
Sep 11, 1995, 3:00:00 AM9/11/95
to
Chris Torek <to...@bsdi.com> wrote:
>In article <6495251.3...@kcbbs.gen.nz> Martin Kealey
><mar...@kcbbs.gen.nz> writes:
>>(nor are they disadvantaged however by a feature they don't use)
>
>Ah, would that this were so....
>
>It has been quite a while since I wrote and maintained code in
>FORTRAN. I was rather strict about my own code, using `IMPLICIT
>UNDEFINED' (a vendor extension, but a common one) to avoid having
>typographical errors turn into unexpected code. You might then say
>that I was not disadvantaged by FORTRAN's implicit declarations,
>but in fact I was. *Other* code, which I had to maintain, made
>heavy use of such declarations.[%]

You were not disadvantaged by that feature of FORTRAN directly, but by
other people's use of it. And considering that use of implicit
declarations is the norm in FORTRAN, you have a very weak argument
here. (And wasn't the usual syntax IMPLICIT NONE?)

-zefram

Thad Smith

unread,
Sep 11, 1995, 3:00:00 AM9/11/95
to
In article <42vd3s$t...@news.cs.tu-berlin.de>,

m...@cs.tu-berlin.de (Markus Freericks) wrote:
>In article <zudUwQ9y...@csn.net> th...@csn.net (Thad Smith) writes:
>> Byte is defined in the Standard to be "the unit of data storage large
>> enough to hold any member of the basic character set of the execution
>> environment..." I was going to say that the defined connection
>> between a byte and type char is rather loose, but found that the
>> sizeof operator gives the size of an object or type in "bytes" and also
>> that sizeof (char) is defined to be 1. This seems to establish that a
>> char must be the same size as a byte.
>
>Wouldn't it simplify the understanding of the standard if the term "byte"
>was either totally removed from the standard (as it is currently synonymous
>with "char"), or totally separated from it?

I agree. Defining sizeof (char) to be one byte is a rather
round-about way of equating the two, IMHO. It is interesting to note
that CHAR_BIT defines the number of bits in a byte, not a char.

Thad

Richard A. O'Keefe

unread,
Sep 11, 1995, 3:00:00 AM9/11/95
to
ffar...@aol.com (FFarance) wrote:
: Another problem I'm sure you've encountered is that getting an exact
: type is difficult when the compiler doesn't support it.

I wrote:
@ I am still rather bewildered by this. If the compiler _doesn't_
@ support a type of some specific size, then how can you possibly get it?

mar...@kcbbs.gen.nz (Martin Kealey) wrote:
>[The compiler has to generate code to make it work, since the programmer
>who asked for it would have had a good reason to use it - like a CRC
>calculation, for instance.]

But that is to say that the compiler *does* support the type in question!
What can it possibly mean to say that "the compiler generates code to
make it work" but it "doesn't support it"?

>The use of the term "exact" as Frank Farance uses it in this context means
>"when you read the value you get exactly this number of bits". The point
>about exactness is the behaviour of the (mod 2^N) integer field.

Surely, when you *read* a variable, you should get back exactly what you
last *wrote* into it. However, if that is all that is meant by an "exact"
type, it fails miserably to provide what some people have been asking for.
If I say "give me exactly 16 bits", then this interpretation would allow
the compiler to _really_ use 128 bits, and mask all but 16 of them off
whenever I used such a variable. That may perhaps be a useful thing to
have, but it is not what people asking for "exactly 16 bits" have meant
in the past.

In Ada 95 you can ask for a type which is "exactly N bits" in this sense
of N-bit *values* by declaring
type N_Bit_Integer is mod 2**N;

If you want to control the *storage size* of such a type, which is what
people asking for exact types have previously been requesting, you use
a representation clause:

for N_Bit_Integer'Size use N;

>What is being proposed is probably more comprehensive in its expressive
>power than is available in any other mainstream language today; the use
>of several orthogonal sets of modifiers allows very fine definition of
>what the required behaviour is.

I think I must have lost the thread somewhere; wasn't C supposed to be a
*small* language?

I have not been able to get my hands on a copy of the SBEIR.
(Nor can I get access to the C9X documents at ftp://ftp.dmk.com ; net
software at this end insists that there is no such host.)

>In this case, yes, "int exact:5 small x[8000]" would use 5000 bytes of
>storage. On a machine which doesn't have built-in unpacking, it might
>use 8000 bytes.

You are telling me that the programmer doesn't know what size (s)he'll
get, nor whether the type will be cheap to use or extremely costly.

>> Example 3: The hardware is a very nice 32-bit RISC chip.
>> The program asks for _exactly_ 24 bits (in order to talk to the DSP chip
>> mentioned in example 2).

>Then you are dealing with (non-portable) hardware, and know what your
>compiler is going to produce when you give it this request.
>"int exact:24 fast DSP" is the obvious way to phrase it, but even it's not
>guaranteed to produce what you want.

Then what is the *point* of it? For what it's worth, I have programmed a
machine where a mainframe and a communications processor shared the same
physical memory but had different word sizes. Basically, you seem to be
saying that 'exact' doesn't actually buy me anything very useful; if I
want a type that is exactly N bits in storage, I have a keyword that will
fool me and thousands of other programmers into believing that it is
relevant, but it isn't.

I wrote:
@ I do have some
@ code that could use bit-fields, but I use explicit shifts and masks
@ FOR BETTER PORTABILITY.

>This is sad; the compiler should be able to do at least as well as
>a human at automatically generating the shift-and-mask, since it should
>also know exactly when it can shortcircuit things (like, it happens
>to already have the shortened value held in another register).

As soon as you try to produce _portable_ code, you cease to regard talk
about _the_ compiler as meaningful. Who said anything about efficiency?
The C standard *defines* shifts and masks more strictly than it defines
bitfields; for better portability you use the better-defined parts of the
language. Let's face it, any compiler that does a good job of bit-fields
should be able to do a good job of shifts with constant shift counts and
masking with constant masks by turning them into its internal representation
of bit fields. Let me turn it around for you: a compiler should be able
to do at least as well at automatically generating a bit field as a human.
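
To make the comparison concrete, the shift-and-mask style looks like this
(a sketch; the field position is hypothetical):

    /* a 5-bit field at bit offset 11 of a 32-bit word, without bit-fields */
    #define FIELD_SHIFT 11
    #define FIELD_MASK  0x1FUL

    unsigned long get_field(unsigned long word)
    {
        return (word >> FIELD_SHIFT) & FIELD_MASK;
    }

    unsigned long set_field(unsigned long word, unsigned long val)
    {
        return (word & ~(FIELD_MASK << FIELD_SHIFT))
             | ((val & FIELD_MASK) << FIELD_SHIFT);
    }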

>Maybe we should be able to attach the "register" or "fast" tag to any
>particular bitfield to say "make this one go the fastest", which would
>normally mean "put this in the low order bits so it doesn't have to be
>shifted". (I'd prefer "register" since it would be orthogonal to the
>"fast" attribute, simply specifying *which* bitfield was prefered
>for the low-order bits without changing the pointer convention, while
>"fast" has the potential to change the pointer width.)

: The IEEE 1596.5
: Standard requires 8, 16, 32, 64, and 128 bit types.

@ In that case, why not just require a header
@ <ieee1596.h>
@ that typedefs
@ s8int, u8int, ..., s128int, u128int
@ and leave it at that?

>The argument about these types not specifying enough information has
>been hashed over many times

and it has been misunderstood this time. If some IEEE standard requires
certain types, then it requires them, whether they specify enough
information or not. The semantics of these types, whether at-least
or pseudo-exact or real exact, are presumably specified by that standard,
or what is the point of it? And in that case the argument about the
semantics being changed holds no water.

>One more for the "wish list" - bit rotation operators, eg "<<|" and ">>|"
>to make optimal use of exact bit width values.

I would settle for signed shifts. There's a nice irony here: the current
C standard refuses to require a compiler to support signed right shift for
right shift _operators_, but in effect _does_ require it for bit fields...
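
If you need an arithmetic right shift you can, of course, build one from the
better-defined cases (a sketch, assuming 0 <= n < the width of long):

    /* shift x right arithmetically by n bits, i.e. floor(x / 2^n),
       without relying on the implementation-defined behaviour of >>
       on negative values */
    long asr(long x, unsigned int n)
    {
        if (x >= 0)
            return x >> n;                  /* well-defined for non-negative values */
        return -1L - ((-1L - x) >> n);      /* -1L - x is non-negative, so this is too */
    }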

--
"conventional orthography is ... a near optimal system for the
lexical representation of English words." Chomsky & Halle, S.P.E.
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.

Richard A. O'Keefe

unread,
Sep 11, 1995, 3:00:00 AM9/11/95
to
mar...@kcbbs.gen.nz (Martin Kealey) writes about SBEIR:

>John R MacMillan (jo...@sco.com) wrote:
>> But for every piece of portable code that is
>> helped by an exact type, there are _tons_ of pieces that don't.

>(nor are they disadvantaged however by a feature they don't use)

I am afraid this is not true. Compiler writers have only finite
resources: effort spent implementing a complex way of specifying
integers (which as so far described here has no apparent advantages
over Ada 95's rather more transparent scheme) is effort not
available for other tasks, such as porting to new machines, tuning
for Hexia, debugging, providing better test coverage tools, you name it.

Stephen Baynes

unread,
Sep 12, 1995, 3:00:00 AM9/12/95
to
Derick J.R. Qua-Gonzalez (dq...@earthlink.net) wrote:
: mar...@kcbbs.gen.nz (Martin Kealey) writes:


: sizeof(char) is defined to be unity. The sizeof operator always returns
: the number of chars required to store a type. If you want the number of
: bits, you have to multiply this by the manifest constant CHAR_BIT in
: <limits.h>.

Note that with the exception of char - not all those bits are accessible.
So you can't assume that because CHAR_BIT is 8 and sizeof(int) is 4 that
you have 32 bits available in an int (it could be 31 or 24 or whatever
and it won't be more than 32).
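
If you want to know how many value bits you actually have, <limits.h> will
tell you (a sketch):

    #include <limits.h>

    /* count the value bits actually usable in an unsigned int,
       independent of any padding in the object representation */
    int uint_value_bits(void)
    {
        int n = 0;
        unsigned int max = UINT_MAX;

        while (max != 0) {
            n++;
            max >>= 1;
        }
        return n;
    }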

--
Stephen Baynes bay...@mulsoc2.serigate.philips.nl
Philips Semiconductors Ltd
Southampton My views are my own.

walter briscoe

unread,
Sep 12, 1995, 3:00:00 AM9/12/95
to
In article <DEs6w...@ukpsshp1.serigate.philips.nl>
bay...@ukpsshp1.serigate.philips.nl "Stephen Baynes" writes:

> Note that with the exception of char - not all those bits are accessible.
> So you can't assume that because CHAR_BIT is 8 and sizeof(int) is 4 that
> you have 32 bits available in an int (it could be 31 or 24 or whatever
> and it won't be more than 32).

Fascinating assertion! Justification in standard?
--
walter briscoe

Stephen Baynes

unread,
Sep 12, 1995, 3:00:00 AM9/12/95
to
Markus Freericks (m...@cs.tu-berlin.de) wrote:

: Wouldn't it simplify the understanding of the standard if the term "byte"


: was either totally removed from the standard (as it is currently synonymous
: with "char"), or totally separated from it?

"Byte" is nearly but not totally synonymous with "char". A byte is a collection
of bits (CHAR_BIT of them - though that should be BYTE_BIT). "char" is a byte
sized integral type, ie a byte being interpreted as an integer.

I think that this is a useful distinction, however it would be possible to
use the term "char" for "byte" as is often done without real loss.

--
Stephen Baynes bay...@mulsoc2.serigate.philips.nl
Philips Semiconductors Ltd
Southampton My views are my own.

United Kingdom

Mark Brader

unread,
Sep 13, 1995, 3:00:00 AM9/13/95
to
Stephen Baynes (bay...@ukpsshp1.serigate.philips.nl) writes:

> > Note that with the exception of char - not all those bits are accessible.
> > So you can't assume that because CHAR_BIT is 8 and sizeof(int) is 4 that
> > you have 32 bits available in an int (it could be 31 or 24 or whatever
> > and it won't be more than 32).

Walter Briscoe (wal...@wbriscoe.demon.co.uk) comments, in surprisingly
telegraphic style:



> Fascinating assertion! Justification in standard?

I, for one, don't think there is any. However, I am told that there are
machines where C implementers say they need to do this to keep things as
efficient as C should be. (No, I don't remember which ones specifically.)

Defect Report 069 asked for clarification of this and related issues.
The draft response is now up for balloting as part of the Record of
Response 2 / Technical Corrigendum 2 package. Without amending the
text of the standard, it accepted the model described by Stephen,
introducing the terms "object representation" (the 32 bits of Stephen's
message), "value representation" (the 31 or 24 bits), and "holes" (the
other bits).

I submitted the following for use in constructing Canada's official
response to the ballot:

| At least some of the material in the response appears to be
| normative; nothing in the standard seems to correspond to the
| distinction made between object representations and value
| representations. If the response without amending 6.1.2.5 is
| correct, it should be clarified to explain what in the standard
| leads to this distinction. Alternatively, there should be an
| amendment to 6.1.2.5 incorporating something along the lines of
| the first few paragraphs of the response.

(I also complained about the term "hole" being confusing, in view of
the existing uses of this word and "padding" in the standard and in
K&R 1 and 2, and suggested other terminology.)
--
Mark Brader, m...@sq.com "History will be kind to me, for I intend
SoftQuad Inc., Toronto to write it." -- Churchill

Stephen Baynes

unread,
Sep 13, 1995, 3:00:00 AM9/13/95
to
Mark Brader (m...@sq.com) wrote:
: Stephen Baynes (bay...@ukpsshp1.serigate.philips.nl) writes:

: > > Note that with the exception of char - not all those bits are accessible.
: > > So you can't assume that because CHAR_BIT is 8 and sizeof(int) is 4 that
: > > you have 32 bits available in an int (it could be 31 or 24 or whatever
: > > and it won't be more than 32).

: Walter Briscoe (wal...@wbriscoe.demon.co.uk) comments, in surprisingly
: telegraphic style:
:
: > Fascinating assertion! Justification in standard?

: I, for one, don't think there is any. However, I am told that there are
: machines where C implementers say they need to do this to keep things as
: efficient as C should be. (No, I don't remember which ones specifically.)

: Defect Report 069 asked for clarification of this and related issues.
: The draft response is now up for balloting as part of the Record of
: Response 2 / Technical Corrigendum 2 package. Without amending the
: text of the standard, it accepted the model described by Stephen,
: introducing the terms "object representation" (the 32 bits of Stephen's
: message), "value representation" (the 31 or 24 bits), and "holes" (the
: other bits).

I don't have a full standard in front of me so I can only work from what
has been said on the net - eg reports about the above defect report.

Basically my understanding is that the standard says nothing much to require
that the "object representation" and "value representation" are the same. Unless
it can be justified that they are the same, they can be different. For char
there is some justification that they must be the same so that functions
such as fread can work (as all IO is done as if by fputc/fgetc).

I would prefer to see the "object representation" and the "value
representation" kept as separate as possible so that the number of
architectures that ANSI C can be implemented on is not reduced for unnecessary
reasons. I cannot see a necessary reason to make them the same (except perhaps
an explicit special case for char - even then with memcpy etc - does one need
it?). All the code that I have seen that might require a relationship between
the object and value representations assumes a specific value of CHAR_BIT and
2s complement arithmetic and lots of other things (such as byte and bit
ordering).

Certainly the thinking in the standard needs to be clarified on this issue,
as the generalizations from the original K&R C

FFarance

unread,
Sep 13, 1995, 3:00:00 AM9/13/95
to
> From: pcu...@inforamp.net (Peter Curran)
>
> ...

> First, I am well aware of the cost of porting, when the software was
> not written with porting in mind. I don't thing the SBEIR proposal
> will change that one bit. It is entirely possible to write highly
> portable code in C now (with the implementation minimums, etc., firmly
> in mind). With the SBEIR proposal, people who write with porting in
> mind will have a lot more details to handle, and people who don't will
> still make a mess.

SBEIR doesn't help poorly-written code. SBEIR is useful if: (1) the
cost of development or maintenance isn't free, (2) you desire programs
that run on several architectures, (3) performance is a design or coding
consideration in your application.

Although you claim you can write portable code, its performance isn't
particularly great *across machine architectures*. For example, you
can't use bit fields greater than 16 bits (for exact types) or get
more than 32 bits of precision in your application. Even if you don't
care about extra precision, you can't get the performance (space or
time) when moving to different architectures. To get the performance,
you must map the ``type intents'' to appropriate C type that maps into
the desired type in that architecture. To get the correct type, you
must write experiment programs, perform preprocessor magic, or manually
map the types (the manual approach is done for most programming).

SBEIR allows you to take the ``type intents'' and specify them as the
type you desire. The compiler does the mapping to the *native* types
and/or features that already exist in the compiler (e.g., bit fields
in structures). This doesn't add complexity to the compiler because
the compiler already knows how to do this. For example, a compiler
knows that "short" is mapped into its native 16-bit type. The same
compiler might map "int atleast:16" into the same native 16-bit type.
Other than the mapping of C types in to native types, the remainder of
the compiler deals with native types. In other words, the changes to
a compiler are fairly localized. The only area where there might be
an issue is arithmetic on non-native sizes, but this *only* applies
to "small atleast" types which must be rounded up to the next CHAR_BIT
boundary. In this case, you can call a multi-precision library (there
are several publicly available libraries) for the operation -- or you
might even generate the code.

> Second, I think it is pointless to try to write code that is fast at
> the micro level and portable at the same time. Portable speed comes
> from algorithm design, not bit twiddling. Making a specific variable
> "fast" or "slow" provides no guarantee whatsoever about the
> performance of the program on new hardware - your assumptions about
> the hardware could be completely wrong. Performance (at the lowest
> level) and portability are fundamentally incompatible.

I'm assuming you've already optimized the algorithm. Once you've
optimized the algorithm, then you become interested in the implementation
details of the datatypes you are using. The ``type intents'' (i.e., the
SBEIR parameterization of the types) are generally what you are interested
in. You're right that "fast" doesn't make any performance guarantee,
but it is likely that the compiler will provide access to the fastest
type of your desire because it probably is a native type that it already
generates code for. On many machines, for types with less precision
than an "int", it would probably map into the "int" type (assuming the
compiler vendor has mapped "int" into the fastest type). The use of
"small" does provide a performance guarantee: it is the smallest type
rounded up to the next "char" -- it probably isn't fast, but it's the
smallest addressable type.

Of course, there is always the case that your whole algorithm is implemented
as a single instruction on the compiler you're using. SBEIR doesn't
address that problem, nor is that problem particularly common.

> In 25 years of programming in C, I have never encountered a need for a
> variable that is exactly 'n' bits, except when working on inherently
> non-portable software - e.g. device drivers and the like, or when
> doing machine-specific optimizations.. I have never encountered an
> algorithm that could not be implemented effectively given only a
> minimum guarantee on the size of the variables involved. That does
> not mean such algorithms don't exist - but IMHO the whole of the C
> community does not need to be burdened with the cost of providing for
> such esoteric requirements.

I think you'll find exact types more common than you think. For
example, any code that uses a "long" (or "unsigned long") and expects
that the type is 32-bits *and* doesn't bracket all operations with
masks of 0xFFFFFFFF is an example of an exact type. If they used the
mask of 0xFFFFFFFF, you could make the case that the programmer expected
the type was an at-least type and he/she was performing the masking
manually. However, there is much code out there that makes assumptions
like this (the same applies for 8-bit, 16-bit, and 64-bit types). Exact
types are fairly common in existing code.
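
To illustrate the distinction (a sketch, assuming unsigned long has at least
32 bits):

    /* the "exact" assumption: correct only if unsigned long is exactly 32 bits */
    unsigned long sum_exact(unsigned long x)
    {
        return x + 0x12345678UL;            /* silently carries into bit 32 and up */
    }

    /* the "at-least" style: correct for any unsigned long of 32 or more bits */
    unsigned long sum_atleast(unsigned long x)
    {
        return (x + 0x12345678UL) & 0xFFFFFFFFUL;
    }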

FFarance

unread,
Sep 13, 1995, 3:00:00 AM9/13/95
to
> From: jo...@sco.com (John R MacMillan)

>
> Surely IEEE 1596.5 defines whether the types are exact or not? If so,
> the implementation is not at liberty to chage this. If they are not
> exact, no matter _how_ you get them (ie. with SBEIR or with some
> compiler-specific magic in ieee1596.h) some people will assume they are
> exact. Just like some people now assume int is exactly 32 bits.

IEEE 1596.5 has both exact and at-least types. The type "Doublet"
("Quadlet", "Octlet", etc.) is exactly 16 bits. The type "Doubmin"
("Quadmin", etc.) is at least 16 bits. The programmer chooses the
correct type, based upon type intent.

Stephen Baynes

unread,
Sep 13, 1995, 3:00:00 AM9/13/95
to
Stephen Baynes (bay...@ukpsshp1.serigate.philips.nl) wrote:
: Markus Freericks (m...@cs.tu-berlin.de) wrote:

: : Wouldn't it simplify the understanding of the standard if the term "byte"
: : was either totally removed from the standard (as it is currently synonymous
: : with "char"), or totally separated from it?

: "Byte" is nearly but not totally synonymous with "char". A byte is a collection
: of bits (CHAR_BIT of them - though that should be BYTE_BIT). "char" is a byte
: sized integral type, ie a byte being interpreted as an integer.

: I think that this is a useful distinction, however it would be possible to
: use the term "char" for "byte" as is often done without real loss.

It gets worse. I have just looked at the section on memcpy, 7.11.2.1, [as given
in Plauger's book]. This says "The memcpy function copies n characters...". If
one assumes that characters are the same as char, then for memcpy to be useful
copying a char must copy all the object representation of the byte that
implements the char.

I think there might be material for a defect report here about the use
of "n characters". I think that should be either "n bytes" or "n chars" or
"an object of size n". The last would be more consistant with the wording
for malloc (7.10.3.3) and fread (7.9.8.1). Anyone else care to comment?

Thad Smith

unread,
Sep 13, 1995, 3:00:00 AM9/13/95
to
In article <DEu4E...@ukpsshp1.serigate.philips.nl>,
bay...@ukpsshp1.serigate.philips.nl (Stephen Baynes) wrote:

>It gets worse. I have just looked at the section on memcpy, 7.11.2.1, [as given
>in Plauger's book]. This says "The memcpy function copies n characters...".

...


>I think there might be material for a defect report here about the use
>of "n characters". I think that should be either "n bytes" or "n chars" or
>"an object of size n". The last would be more consistant with the wording
>for malloc (7.10.3.3) and fread (7.9.8.1). Anyone else care to comment?

I agree with this comment about n bytes being more precise. The last
option doesn't seem right because memcpy() can work on partial objects
or multiple objects as well as a single object.

Thad

Thad Smith

unread,
Sep 14, 1995, 3:00:00 AM9/14/95
to
In article <1995Sep13....@sq.com>, m...@sq.com (Mark Brader) wrote:
>> ... memcpy() can work on partial objects or multiple objects as well
>> as a single object.
>
>There is no such thing as a partial object in C.

I don't see the importance of this distinction. If I copy the first
sizeof(double)/2 bytes of a double (assume sizeof(double) > 1), I
would call this part of an object which has type double, or a partial
object for short. This can be combined with the remaining bytes to
completely copy an object of type double.

>To paraphrase Isaac
>Asimov said, if you break a piece of chalk in half, you don't get two
>partial piece of chalk; you get two pieces. An object is a region of
>storage, so it works the same way.

Chalk is uniform, so each piece serves the purpose of the whole. This
isn't true of half a double. Every occurrence of "object" that I have
seen in the standard refers to an object with some corresponding type.
This doesn't fit with the half double, unless you want to say that the
object is an array of a char type. While we can call the copied
portion an object which is an array of chars, it makes more sense to
me when copying part of a double to say that a partial object was
copied.

Thad

John R MacMillan

unread,
Sep 14, 1995, 3:00:00 AM9/14/95
to
|SBEIR doesn't help poorly-written code. SBEIR is useful if: (1) the
|cost of development or maintenance isn't free, (2) you desire programs
|that run on several architectures, (3) performance is a design or coding
|consideration in your application.

C is already useful in those situations. SBEIR is useful if: 1) you
require exact sized types, 2) you want to specify a preference for
space/speed optimization, 3) you require types larger than 32 bits.

In the case of 1), SBEIR is only more convenient than current usage in
C. In case of 2), it's not clear that simple keywords will be of much
use ("register" for instance has not proven particularly useful in
generating portable faster code because it is often used casually, and
compilers value it differently). In the case of 3), SBEIR provides
much more than you require.

|... The compiler does the mapping to the *native* types
|and/or features that already exist in the compiler (e.g., bit fields
|in structures). This doesn't add complexity to the compiler because
|the compiler already knows how to do this.

This is not true. SBEIR requires the compiler to carry additional
information for each additional integer type, and adds complexity to
the promotion rules. And of course there is complexity added to
declaration processing, the library, as well as required support for
precof() and the EIR macros.

|I'm assuming you've already optimized the algorithm. Once you've
|optimized the algorithm, then you become interested in the implementation
|details of the datatypes you are using.

No, typically at this point, after having profiled and improved the
algorithm, it's usually meeting performance requirements. If not,
then I look at the optimizations the compiler is doing, and how on
this architecture with this compiler (ie. not portably) I can improve
the performance. Of these optimizations, datatypes may be one portion.

|I think you'll find exact types more common than you think.

No one has yet given a portable use for an exact signed type. Very
few portable uses for exact unsigned types have been given. Most
examples have dealt with specific hardware, or with binary I/O, and so
are inherently non-portable.

|For
|example, any code that uses a "long" (or "unsigned long") and expects
|that the type is 32-bits *and* doesn't bracket all operations with
|masks of 0xFFFFFFFF is an example of an exact type.

Right. This is an example of bad code, since C never promised long
was exactly 32 bits, and you acknowledged earlier in this article that
SBEIR does not help with poorly written code.

|... Exact
|types are fairly common in existing code.

Exact types are fairly commonly misused in existing code, when they
needn't be, usually dealing with hardware or binary I/O both
internally as well as at the interface where the exactness is
required. If we allow programmers to specify them ala SBEIR, I
predict that these misuses will simply become codified, and will
perform poorly on machines that do not natively support the exact
type. I think this is a bad thing.

John R MacMillan

unread,
Sep 14, 1995, 3:00:00 AM9/14/95
to
|Since SBEIR maps into existing types or constructs (e.g., bit fields in
|structures), this has already been proven by the existence of C.

SBEIR goes well beyond what has been proven by C. Bitfields were a
limited implementation of exact types (no address-of or pointers to,
and the promotion rules were essentially ``turn into int''). SBEIR is a
completely new method of describing types along several orthogonal
axes (space/speed optimization, size, and atleast/exact semantics).

In fact, SBEIR is in some ways more expressive than any other
language with which I am familiar.

|> The problem is that it is too expressive and will get misused. It is
|> extremely rare to require an integer type that does not correspond to
|> some simple multiple or factor of the machine's register width. SBEIR
|> is simply a recipe for unnecessarily inefficient programs.
|
|Quite the opposite is true. Many people misread the "int atleast:16" as
|this must be a 16-bit integer.

That should be a clue. If many people who follow comp.std.c are
misreading it, then many more programmers at large will misread and
misunderstand.

|For exact types, presumably the programmer needed exactly N bits. If
|he/she doesn't need exactly N bits, he/she shouldn't use an exact type.

I have seen _much_ code that relies on having exact types that did not
need exact types. In fact, _all_ non-hardware-specific code I have
seen that relied on exact types did not need it in many of the places
it was used, though I am prepared to admit that there may be some.

The Amorphous Mass

unread,
Sep 14, 1995, 3:00:00 AM9/14/95
to
On 13 Sep 1995, FFarance wrote:
> However, the smallest type isn't what we always want. For storing records
> in databases, this might be true. We probably want the fastest type for
> inner loops -- how would the programmer request that?

register x; /* that is the fastest type on any architecture, no? */

___________
Bushido, n.: the ancient art of keeping your | James Robinson
cool when a US President ralphs in your lap. | james-f-...@uiowa.edu


Stephen Baynes

unread,
Sep 15, 1995, 3:00:00 AM9/15/95
to
Thad Smith (th...@csn.net) wrote:
: In article <1995Sep13....@sq.com>, m...@sq.com (Mark Brader) wrote:
: >To paraphrase Isaac
: >Asimov said, if you break a piece of chalk in half, you don't get two
: >partial piece of chalk; you get two pieces. An object is a region of
: >storage, so it works the same way.

: Chalk is uniform, so each piece serves the purpose of the whole. This
: isn't true of half a double. Every occurrence of "object" that I have
: seen in the standard refers to an object with some corresponding type.
: This doesn't fit with the half double, unless you want to say that the
: object is an array of a char type. While we can call the copied
: portion an object which is an array of chars, it makes more sense to
: me when copying part of a double to say that a partial object was
: copied.

But it does make sense to copy a double that is a field (a part) of a
structure. C objects can be broken down meaningfully into smaller objects
along certain lines. However, it must at least be valid to break along
character lines - as memcpy does when copying. The characters must be
valid so that they can be copied, even if on their own they are not
meaningful.
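
For instance, something like this rough sketch (the structure and values
here are invented for illustration) copies one double member out of a
larger object, and memcpy necessarily moves it a character at a time:

    #include <stdio.h>
    #include <string.h>

    struct record {
        double value;
        char   tag[8];
    };

    int main(void)
    {
        struct record a, b;

        a.value = 3.14;
        memcpy(a.tag, "example", 8);

        /* copying one member is copying part of the containing object;
           memcpy moves it char by char, so each char must be copyable
           even though a lone char of a double means nothing by itself */
        memcpy(&b.value, &a.value, sizeof a.value);

        printf("%f\n", b.value);
        return 0;
    }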

Michael Smith

unread,
Sep 18, 1995, 3:00:00 AM9/18/95
to
For the record: Silicon Graphics IRIX 6.x (a 64 bit UNIX implementation)
uses 64 bit longs, 32 bit ints and 16 bit shorts.

It does not make good sense to tie an integer definition to a number of
bits as machine architectures always change. If you create "long long"
for a 64 bit integer, what will we call a 128 bit integer? If I remain
in the computer industry for the next 20 years (and I plan to), I expect
to see integers much bigger than 128 bits as computer word sizes continue to
increase exponentially.

Code which relies on the number of bits in an integer is unlikely to be
portable anyway. Life's hard. If you need to write bit-level code then
you should keep the non-portable routines together and accept that they
will need tweaking when you move platforms. If your code is littered
with "32" instead of "sizeof(long)" then it's simply sloppy and chances
are that you will have many other problems.

If you really need a 32 bit integer (or any other size) you can use
typedefs and conditional code.

#define IRIX6 1
#define SUNOS 2
#define WINDOWS 3

#define OS IRIX6

#if OS==IRIX6
typedef int INT32;
#elif OS==SUNOS || OS==WINDOWS
typedef long INT32;
#else
#error "Need to define integer sizes in intdef.h"
#endif
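
An alternative sketch (untested, reusing the INT32 name above) is to key
the test off <limits.h> instead of a hand-maintained OS macro, so the
typedef adapts without naming each platform:

    #include <limits.h>

    #if UINT_MAX == 0xFFFFFFFF
    typedef int INT32;
    #elif ULONG_MAX == 0xFFFFFFFF
    typedef long INT32;
    #else
    #error "Need to define integer sizes in intdef.h"
    #endif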


--
#####################################################################
Michael Smith msm...@mpx.com.au

Emmenjay Consulting
Making computers work for you, not the other way around.

PO Box 909 Ph 018 240 704
Kensington 2033
AUSTRALIA
#####################################################################


hal...@caip.rutgers.edu

unread,
Sep 19, 1995, 3:00:00 AM9/19/95
to
In article <6495251.3...@kcbbs.gen.nz>, mar...@kcbbs.gen.nz (Martin Kealey) writes
: ....
: This is true; the point is however, we need a way of documenting NOW when
: exact types are needed so that we don't trip over them in today's code
: when it's being recompiled on tomorrow's machines.

That is simple: _never_. No program needs exact-length 'int's.

James Kanze US/ESC 60/3/141 #40763

unread,
Sep 19, 1995, 3:00:00 AM9/19/95
to
In article <43lmfo$8...@caip.rutgers.edu> hal...@caip.rutgers.edu
writes:

|> In article <6495251.3...@kcbbs.gen.nz>, mar...@kcbbs.gen.nz (Martin Kealey) writes
|> > ....

|> > As I recall, the term "byte" isn't defined by the standard; all that is
|> > necessary is to define the sizeof operator for this architecture to return
|> > the number of bits. (Can't remember if sizeof(char) is defined to be one,
|> > but I don't think so.)

|> Others have written that, by definition,

Others being, in this case, the C standards committee, when they wrote
the standard.

|> sizeof(char) == 1.
|> I suspect that on the PDP-10 this has queer outcomes. The commonest
|> byte-length for text-files is 7 bits; this puts 5 bytes into one 36-bit
|> word. What value of "CHAR_BIT" is right?

According to the standard, a byte must be at least 8 bits. Although
I'm not familiar with the PDP-10, it is my impression that most 36 bit
machines use 9 bit bytes.

|> Furthermore, although the basic integer is 36 bits long, there is also
|> halfword support, instructions that make using 18-bit-long integers not
|> too inconvenient. I guess that there is no supporting this in C, for then
|> sizeof (short int)
|> is 2 1/2.

Sizeof must return a size_t, which must be an unsigned integral type.
Since sizeof must be defined for all data types, this practically
means that every data type must occupy an integral multiple of
CHAR_BIT bits.

|> > A preferable basis for "sizeof" would be to count in whatever the
|> > machine's minimal addressible units are, be they bits, octets, or words;
|> > then this problem just evaporates.

|> On the PDP-11 we would want "sizeof" to reckon in bits, even though
|> it addresses in 36-bit words.

I have no problem with the restriction that sizeof count in bytes.
The C language requires that bytes (char's) appear addressable, and
provides no way to address smaller entities. In addition, it requires
that all other types consist of whole bytes, so in fact their size must
be an integral multiple of that of a char.

Anything else would be a radical break with existing practice,
including existing practice on machines with funny sized bytes and
word addressing.
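
A trivial illustration (nothing more than a sketch): printing each size
scaled by CHAR_BIT necessarily gives a whole number of bits on any
conforming implementation.

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* sizeof counts in chars; CHAR_BIT converts that count to bits */
        printf("char : %lu bits\n", (unsigned long)(sizeof(char)  * CHAR_BIT));
        printf("short: %lu bits\n", (unsigned long)(sizeof(short) * CHAR_BIT));
        printf("int  : %lu bits\n", (unsigned long)(sizeof(int)   * CHAR_BIT));
        printf("long : %lu bits\n", (unsigned long)(sizeof(long)  * CHAR_BIT));
        return 0;
    }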
--
James Kanze Tel.: (+33) 88 14 49 00 email: ka...@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
-- Beratung in industrieller Datenverarbeitung

hal...@caip.rutgers.edu

unread,
Sep 19, 1995, 3:00:00 AM9/19/95
to
In article <KANZE.95S...@slsvhdt.lts.sel.alcatel.de>, ka...@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763) writes
> ....

> Anything else would be a radical break with existing practice,
> including existing practice on machines with funny sized bytes and
> word addressing.

Yes, quite. In this case, if implementing C on the PDP-10 were a real
issue, there would be no means of implementing it and holding to the
practice of both PDP-10 programming and C -- which was my point.

hal...@caip.rutgers.edu

unread,
Sep 19, 1995, 3:00:00 AM9/19/95
to
In article <6495251.3...@kcbbs.gen.nz>, mar...@kcbbs.gen.nz (Martin Kealey) writes
> ....
> As I recall, the term "byte" isn't defined by the standard; all that is
> necessary is to define the sizeof operator for this architecture to return
> the number of bits. (Can't remember if sizeof(char) is defined to be one,
> but I don't think so.)
Others have written that, by definition,
sizeof(char) == 1.
I suspect that on the PDP-10 this has queer outcomes. The commonest
byte-length for text-files is 7 bits; this puts 5 bytes into one 36-bit
word. What value of "CHAR_BIT" is right?

Furthermore, although the basic integer is 36 bits long, there is also
halfword support, instructions that make using 18-bit-long integers not
too inconvenient. I guess that there is no supporting this in C, for then
sizeof (short int)
is 2 1/2.

> A preferable basis for "sizeof" would be to count in whatever the
> machine's minimal addressible units are, be they bits, octets, or words;
> then this problem just evaporates.

On the PDP-11 we would want "sizeof" to reckon in bits, even though
it addresses in 36-bit words.

Derick J.R. Qua-Gonzalez

unread,
Sep 19, 1995, 3:00:00 AM9/19/95
to
msm...@jolt.mpx.com.au (Michael Smith) writes:

>It does not make good sense to tie an integer definition to a number of
>bits as machine architectures always change. If you create "long long"
>for a 64 bit integer, what will we call a 128 bit integer?

Quite true. How about "longer long int"? :-)

> ... Code which relies on the number of bits in an integer is
>unlikely to be portable anyway. ... If your code is littered with
>"32" instead of "sizeof(long)" then it's simply sloppy and chances
>are that you will have many other problems.

Actually, you probably mean sizeof(long)*CHAR_BIT, since sizeof
returns the number of chars comprising the type, not the number of
bits.

Best regards,
Derick.

Douglas Rogers

unread,
Sep 21, 1995, 3:00:00 AM9/21/95
to
In article <dqua.811538130@dqua> dq...@earthlink.net (Derick J.R. Qua-Gonzalez) writes:


dqua> msm...@jolt.mpx.com.au (Michael Smith) writes:
>> It does not make good sense to tie an integer definition to a
>> number of bits as machine architectures always change. If you
>> create "long long" for a 64 bit integer, what will we call a 128
>> bit integer?

It has struck me over the past few months that extensions to the
standard are often very useful. Long longs are a case in point:-

We have already the following definitions:-

short int - at least 16 bits
long int - at least 32 bits

Now I have someone who wants to do arithmetic that needs at least 64
bits, so what do I do? Point him to the `standard' extension of:-

long long int - at least 64 bits

All the compilers on our system support this, and he is happy EXCEPT
there is no published norm on how to print out a long long! Clearly
the language could not make long long a required type, as not all
machines support 64 bit arithmetic. Even better would be if he could
write portable code with a test that would tell him if the code was not
compatible with the compiler/system he was trying to port it to.

This is not the only example of standard extensions; other examples
of useful extensions are the ASCII character set, dynamic allocation on
the stack, and standard floating point overflow mechanisms.

My point, then, is: couldn't the standard encompass standard extensions
and provide ways of testing whether they are supported by the currently
used compiler/system?
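
Today the best he can do is a guarded sketch like the one below, where
HAVE_LONG_LONG and LONG64_FMT are hypothetical per-platform configuration
macros, and "%lld" is merely the format most compilers with the extension
happen to accept -- nothing guarantees either:

    #include <stdio.h>

    #ifdef HAVE_LONG_LONG
    typedef long long big_int;
    #define LONG64_FMT "%lld"    /* common, but vendor-specific */
    #else
    typedef long big_int;        /* falls back to at least 32 bits */
    #define LONG64_FMT "%ld"
    #endif

    int main(void)
    {
        big_int x = 123456789;
        printf("x = " LONG64_FMT "\n", x);
        return 0;
    }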

--
Douglas

---
=============================================================================
Douglas Rogers MAIL: d...@dcs.ed.ac.uk Tel: +44 31-650 5172 (direct dial)
Fax: +44 31-667 7209
============================= Mostly harmless ===============================

Lawrence Kirby

unread,
Sep 21, 1995, 3:00:00 AM9/21/95
to

>Furthermore, although the basic integer is 36 bits long, there is also
>halfword support, instructions that make using 18-bit-long integers not
>too inconvenient. I guess that there is no supporting this in C, for then
> sizeof (short int)
>is 2 1/2.

Since the minimum char width is 8, the sensible approach is to use 9 bit
chars in this case. Shorts at 18 bits and ints at 36 bits then fall out very
naturally. A line has to be drawn somewhere for the minimum size of a type,
otherwise it would be nearly impossible to write a strictly
conforming/portable program. 8 is a very natural number for this since it
is used on the vast majority of systems and can hold many international
character sets. Even in this particular case, where the system can support
a shorter size, C can be implemented quite efficiently using a slightly
longer one, so it's not a great issue, nor is it a particularly common or
important case.

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------

John R MacMillan

unread,
Sep 21, 1995, 3:00:00 AM9/21/95
to
|Overall, this isn't that much complexity to be added. Adding "long long"
|or any other type has the same complexity.

Surely this cannot be true. Long long is a subset of what's required
by SBEIR; the rest of SBEIR cannot be free. (I'm not sure adding long
long is the answer, either, but I am sure it's not as complex as the
SBEIR proposal.)

|Exact signed types are portable, as long as: (1) you don't shift them,
|(2) you don't overflow them, (3) you don't depend on the most positive
|(e.g., +32768) or most negative (e.g., -32768) number.

Perhaps I wasn't clear. There can be no portable _need_ for exact signed
types given that exact types may have padding and have undefined
behaviour on overflow, etc., since they are indistinguishable in use
from atleast types that are that large. They are no more useful than
atleast types (being indistinguishable), and IMHO can only portray
programmer confusion more than intent (the programmer is saying ``I
need exactly N bits'' when they cannot need exactly N bits).

|If you add the companion proposal on data representation (REP), ...
|
| [...]
|
|Adding in the ordering and alignment extensions (OAX) proposal, ...

I have not yet seen these proposals (I could not find them on DMK's
archive, are they there?) but I think I'm going to disagree heartily
with them, as well. I just don't think those things belong in C.
Sorry. :-(

FFarance

unread,
Sep 21, 1995, 3:00:00 AM9/21/95
to
(Response is in multiple parts. This is part 1.)

> From: jo...@sco.com (John R MacMillan)
>

> |SBEIR doesn't help poorly-written code. SBEIR is useful if: (1) the
> |cost of development or maintenance isn't free, (2) you desire programs
> |that run on several architectures, (3) performance is a design or
> |coding consideration in your application.
>
> C is already useful in those situations. SBEIR is useful if: 1) you
> require exact sized types, 2) you want to specify a preference for
> space/speed optimization, 3) you require types larger than 32 bits.

Considering that I make these same points in my paper, I'd find it hard
to disagree with myself :-). The first set of points addresses a
project management and technical management perspective. The second set
of points addresses linguistic issues. Neither set excludes the other.
When I was making the point ``SBEIR doesn't help poorly-written code ...''
I was talking about coding practices and portability issues, i.e., in
the realm of project management and technical management. There are
many other good reasons to use SBEIR. You've included a couple above.

> In the case of 1), SBEIR is only more convenient than current usage in
> C. In case of 2), it's not clear that simple keywords will be of much
> use ("register" for instance has not proven particularly useful in
> generating portable faster code because it is often used casually, and
> compilers value it differently). In the case of 3), SBEIR provides
> much more than you require.

#1 addresses porting cost, i.e., coding isn't free. If it were free, then
you probably wouldn't care about porting cost. SBEIR helps reduce porting
cost, i.e., makes code more portable.

#2 addresses the mapping of type intents to native types. Clearly, it is
possible to have low quality compilers. For example, low quality
compilers are free to implement all types as 32-bits and not use any
features of the hardware (e.g., registers). Can we require higher
quality? No we can't. However, in higher quality implementations you'll
want to have a choice among the implementations of the data types. Most C
code characterizes integral data types with the parameters described in
the SBEIR paper. The original programmers might not have been aware of
the parameters, but anyone who has ported the code has answered these
questions about the use of the types (i.e., type intent).

#3 addresses the implementation needs of the application. If performance
isn't really an issue, then your choice of types isn't really that
important, e.g., using a bit field, "char", "short", "int", or "long" to
represent a boolean value (0 or 1) all work equally well if performance
(and addressability for bit fields) isn't an issue. BTW, this isn't
intended to get the ``boolean'' discussion rolling again.

You claim ``SBEIR provides much more than you require''. This may be
true for some people, but these needs were developed over the past
3 or so years. It isn't clear from your posting what is ``extra''.

> |... The compiler does the mapping to the *native* types
> |and/or features that already exist in the compiler (e.g., bit fields
> |in structures). This doesn't add complexity to the compiler because
> |the compiler already knows how to do this.
>
> This is not true. SBEIR requires the compiler to carry additional
> information for each additional integer type, and adds complexity to
> the promotion rules. And of course there is complexity added to
> declaration processing, the library, as well as required support for
> precof() and the EIR macros.

Overall, this isn't that much complexity to be added. Adding "long long"
or any other type has the same complexity. The "precof" operator gets
the integer type from its operand. Since this is stored in the node
already, this is almost trivial. Any feature adds *some* complexity to
a system. I claim that this is about the same as adding any other type.

> ...


> |I think you'll find exact types more common than you think.
>
> No one has yet given a portable use for an exact signed type. Very
> few portable uses for exact unsigned types have been given. Most
> examples have dealt with specific hardware, or with binary I/O, and so
> are inherently non-portable.

Exact signed types are portable, as long as: (1) you don't shift them,
(2) you don't overflow them, (3) you don't depend on the most positive
(e.g., +32768) or most negative (e.g., -32768) number. For example, you'd
be able to use values in the range -32767 to +32767 portably within
16-bit signed bit fields in Standard C (as long as you don't do #1-3).
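
A minimal sketch of that usage (the structure and function names are
invented):

    struct sample {
        signed int value : 16;   /* exactly 16 bits participate in the value */
    };

    /* portable provided the field is never shifted, never overflowed, and
       the extreme value -32768 is never relied upon */
    int store_and_read(void)
    {
        struct sample s;
        s.value = -32767;
        return s.value;          /* promotes to int, yielding -32767 */
    }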

If you add the companion proposal on data representation (REP), you
have "twoscomp" which gives two's complement representation. You'd
be able to use the values -32768 to +32767 portably if you used the
declaration:

signed twoscomp int exact:16 X;

The only restriction with this type is that you can't bit shift
portably. However, a type cast to unsigned and back to signed
would give you a portable bit shift, but this type of shift isn't
sign-preserving.

Adding in the ordering and alignment extensions (OAX) proposal, you
could have specific bit/byte ordering:

typedef twoscomp bigend align:2 signed int exact:16
AlignedBigSignedDoublet; /* an IEEE 1596.5 type */

(Continued in next part.)

FFarance

unread,
Sep 21, 1995, 3:00:00 AM9/21/95
to
(This is part 2 of the posting.)

> From: jo...@sco.com (John R MacMillan)
>

> ...


> |For
> |example, any code that uses a "long" (or "unsigned long") and expects
> |that the type is 32-bits *and* doesn't bracket all operations with
> |masks of 0xFFFFFFFF is an example of an exact type.
>
> Right. This is an example of bad code, since C never promised long
> was exactly 32 bits, and you acknowledged earlier in this article that
> SBEIR does not help with poorly written code.

Yes, but this is an example of where the programmer expects an exact
type: 32-bits. My claim was that programmers have needs and expectations
for this type.

> |... Exact
> |types are fairly common in existing code.
>
> Exact types are fairly commonly misused in existing code, where they
> needn't be: code dealing with hardware or binary I/O tends to use them
> internally as well as at the interface where the exactness is actually
> required.

While hardware and binary I/O are common applications (which some people
consider non-portable so why bother -- I don't agree), there are other
applications such as shared memory and networking applications. In the
latter set of applications, you must conform to *externally* imposed
formats. You can write this with shifts and masks, but this has proven
to be *very* error prone.

> If we allow programmers to specify them ala SBEIR, I
> predict that these misuses will simply become codified, and will
> perform poorly on machines that do not natively support the exact
> type. I think this is a bad thing.

If programmers start using exact types because they need them (i.e., they
are explicit about their intent), then programmers will reduce one aspect
of portability costs (they are still free to make many expensive
mistakes elsewhere :-). However, the performance of exact types on
hardware that doesn't support them natively will be about the same whether
they use exact types or perform the bit masking themselves.

The mistake you are making in your analysis here is assuming X to
make point 1 and assuming not X to make point 2, then you make both
points 1 and 2. Here ``X'' is ``the programmer intends an
exact type''.

If the programmer intends an exact type, then he/she shouldn't have
to do any masking because the compiler has provided an exact type. If
the program is ported to a 36-bit machine, the programmer will get lower
performance because of the bit masking. The lower performance would
happen anyway because the programmer needed exactly 32-bits. If this
were implemented manually, the programmer would be masking with 0xFFFFFFFF
(what the compiler is doing for you) so the performance would be the
same. In your analysis, you assume that because the programmer doesn't
do the masking, the application would work right on a 36-bit machine:
it wouldn't because the masking hadn't been done to give exactly 32-bits
(what the programmer had intended). Thus, when comparing exact types to
manually implemented shifting/masking, there is no difference for a given
architecture.
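
For concreteness, the manual emulation looks roughly like this sketch
(unsigned, and the names are mine): the mask is exactly the work a
compiler implementing an exact 32-bit type would emit on a wider machine.

    #define MASK32 0xFFFFFFFFUL

    /* keep a 32-bit "exact" unsigned counter in an unsigned long that
       may be 36 or 64 bits wide; mask after each operation */
    unsigned long add32(unsigned long a, unsigned long b)
    {
        return (a + b) & MASK32;
    }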

If the programmer intends an at-least type, then he/she wouldn't have
to do any masking because the programmer knows that 32 or more bits
of precision are available. If the programmer is writing in a
``minimalist
style'' (see the updated SBEIR paper), then he/she only uses 32 bits
of precision (i.e., the *specified* precision) even though there may be
more. This would perform equally well on 32-bit and 36-bit machines. If
the programmer is writing in an ``adaptive style'', then he/she may use
the available (*actual*) precision. In this case, too, the program
performs well on both 32-bit and 36-bit machines.

In summary, if you need an exact type, then your performance should be
about the same (for a particular architecture) whether you implement it or
the compiler implements it. Having the compiler implement this reduces
errors because bit shifting/masking has been demonstrated to be
error-prone.
The performance of exact types will vary from machine to machine -- this
is expected behavior. The use of at-least types allow the compiler to
use native types with minimum ranges. It doesn't make sense to use
at-least types with bit shifting/masks for emulating exact types because
the compiler can do a better, less error-prone job. Adding the qualifier
"fast" to an exact type allows the compiler to choose a faster type
(e.g., a 3-bit bit field is stored in a 64-bit type rather than an
8-bit type). If you don't know if you need an exact type or an at-least
type, then you will be unable to successfully port the code to many
architectures because you are missing a crucial aspect of the type intent.

Michael Quinlan

unread,
Sep 22, 1995, 3:00:00 AM9/22/95
to
hal...@caip.rutgers.edu wrote:

>On the PDP-11 we would want "sizeof" to reckon in bits, even though
>it addresses in 36-bit words.

Is the 'PDP-11' above one of your strange misspellings of PDP-10? The
PDP-11 has 16-bit words and 8-bit bytes.

+--------------------------------------------+
| Michael Quinlan |
| mi...@primenet.com |
| http://www.primenet.com/~mikeq |
+--------------------------------------------+


hal...@caip.rutgers.edu

unread,
Sep 25, 1995, 3:00:00 AM9/25/95
to
hal...@caip.rutgers.edu wrote:
>On the PDP-10 we would want "sizeof" to reckon in bits, even though
>it addresses in 36-bit words.

<I had written "PDP-11", but that was a mistake for "PDP-10".>

>From dq...@dqua.earthlink.net Sat 1995 Sep 23 23:57:09:
:Hi... have you tried writing a Sizeof(...) macro that does essentially:
:
: #define Sizeof(t) (sizeof(t)*CHAR_BIT)
:
:where CHAR_BIT is from <limits.h>.

This is useless if the C conforms to TOPS-20's text-file custom, which is that
ordinarily bytes are 7 bits long, 5 to a word: that is 35 bits of a 36-bit
word (the spare bit has been used to mark line numbers).

>From ka...@lts.sel.alcatel.de:
:According to the standard, a byte must be at least 8 bits. Although
:I'm not familiar with the PDP-10, it is my impression that most 36 bit
:machines use 9 bit bytes.

At MIT this was so, but not in DEC's wares.

Ka...@lts.sel.alcatel.de quotes further restrictions which show that C on
the PDP-10 under TOPS-10/20 breaks with either local custom or the standard.
Under Tandem's Guardian there is similar trouble -- this time with line
length -- and the C implementors devised a new text-file kind for C.

John R MacMillan

unread,
Sep 25, 1995, 3:00:00 AM9/25/95
to
|: Incidentally, there has been a lot of discussion about exact-length types.
|: Would someone who supports the idea of exact length types please post up
|: some code that depends on their existence, that can't be implemented in any
|: other (relatively painless) way.
|
|Not code, but an example; and not of portability between machines but
|between different compilers on the same machine.

[example deleted]

The SBEIR proposal will not help in your example, since exact types
may be larger than requested. From the proposal:

An exact type is one that has exactly N bits that participate
in its value. The type may consume storage larger than N bits
(e.g., to pad to a convenient byte or word boundary), but the
remaining storage does not participate (holes) in the value.
For example, a type of exactly 24 bits might be implemented as
3 8-bit bytes or as 32-bit word with the necessary bit mask
operations.

So your exact:16 type may consume 32 bits of storage.

As an aside, IMHO your best bet to be portable across compilers in your
example is to use an array of chars, and use a set of macros and/or
functions to manipulate the appropriate bits. It's not pretty, but
SBEIR won't make it any prettier.
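
Something along these lines (a rough sketch; the names and the choice of
byte order are mine, not anything mandated):

    /* a 16-bit quantity kept as two explicit 8-bit groups in an array of
       unsigned char, so the layout is fixed by the code itself rather than
       by any one compiler's bit-field packing */
    typedef unsigned char field16[2];

    #define GET16(f)    (((unsigned)((f)[0] & 0xFF) << 8) | ((f)[1] & 0xFF))
    #define PUT16(f, v) ((f)[0] = (unsigned char)(((v) >> 8) & 0xFF), \
                         (f)[1] = (unsigned char)((v) & 0xFF))

A caller writes PUT16(buf, 1234) and reads it back with GET16(buf), and the
result does not depend on which compiler laid out the bits.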

Richard A. O'Keefe

unread,
Oct 5, 1995, 3:00:00 AM10/5/95
to
hal...@caip.rutgers.edu writes:
>Others have written that, by definition,
> sizeof(char) == 1.
>I suspect that on the PDP-10 this has queer outcomes. The commonest
>byte-length for text-files is 7 bits; this puts 5 bytes into one 36-bit
>word. What value of "CHAR_BIT" is right?

It has been a long time since I used the Snyder C compiler under TOPS-10,
but as I recall it,
sizeof (char) == sizeof (short) == sizeof (int) == sizeof (long) == 1
CHAR_BIT did not exist at that time; if it had existed its value would have
been 36.

getc() _did_ return numbers in the range -1 .. 127. What's more,
putc() wrote 7-bit characters. Again, remember that at that time
there was no 'binary mode' for files, so this compiler _did_ satisfy
the key requirement of the Standard: all members of the execution character
set _could_ be written to a file and read back.

>Furthermore, although the basic integer is 36 bits long, there is also
>halfword support, instructions that make using 18-bit-long integers not
>too inconvenient.

Indeed there is half-word support in the hardware, but where is it written
that a C implementation has to provide direct access to everything the
hardware supports?

>> A preferable basis for "sizeof" would be to count in whatever the
>> machine's minimal addressible units are, be they bits, octets, or words;
>> then this problem just evaporates.

>On the PDP-10 we would want "sizeof" to reckon in bits, even though
>it addresses in 36-bit words.

It all depends on what you mean by "minimal addressable units". Yes, you
could form addresses
[offset 0 <= X < 36, length 0 < Y <= 36-X, word 0 <= Z < 2**18]
BUT you couldn't do address _arithmetic_ with these addresses in most models.
(The KL-10 processors did have an "add integer to bitfield address"
instruction, but it's one of the classic arguments for RISC; it was slower
than doing it with a sequence of KI-10 instructions.)
--
"conventional orthography is ... a near optimal system for the
lexical representation of English words." Chomsky & Halle, S.P.E.
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.
