
Zero overhead overflow checking


jacob navia

Sep 7, 2009, 7:47:34 AM
Abstract:

Overflow checking is not done in C. This article proposes a solution
that closes this hole in the language with almost no impact on
run-time behavior.

1: The situation now
--------------------

Any of the four operations on signed integers can overflow. The result
of the operation is then meaningless, which can have a catastrophic
impact on the operations that follow. There are important security
issues associated with overflows.

The only way to catch overflows today is to write cumbersome C
expressions that force modifications to the source code, and that are
very slow, since the checking is done in C.
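For illustration, here is the kind of hand-written check the author means. This is a sketch only; the helper name `checked_add` is invented for the example and is not part of the proposal:

```c
#include <limits.h>

/* A typical hand-written overflow check for signed addition. The test
   must be done BEFORE the add, because signed overflow is undefined
   behavior in C; this is the source-level boilerplate the article
   complains about. */
static int checked_add(int a, int b, int *overflowed)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        *overflowed = 1;
        return 0;               /* result is meaningless on overflow */
    }
    *overflowed = 0;
    return a + b;
}
```

Every checked operation needs a call like this in place of a plain `+`, which is what makes the approach intrusive.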

2: The proposed solution
------------------------

In the experimental compiler lcc-win, I have had an overflow-checking
mechanism implemented since 2003.

From that work I have derived this proposal.

2.A: A new pragma
------------------

#pragma STDC OVERFLOW_CHECK on_off_flag

When in the ON state, any overflow of an addition, subtraction,
multiplication or division provokes a call to the overflow handler.
The compound assignments +=, -=, *=, and /= are covered as well.

Only the types signed int and signed long long are covered.

The initial state of the overflow flag is implementation defined.

2.B: Setting the handler for overflows
---------------------------------------

overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);

The function set_overflow_handler sets the function to be called in
case of overflow to the specified value. If "newvalue" is NULL,
the function sets the handler to the default value (the value
it had at program startup).

2.C: The handler function
-------------------------

typedef void (*overflow_handler_t)(unsigned line_number,
                                   char *filename,
                                   char *function_name, ...);

This function will be called when an overflow is detected. The
arguments have the same values as __LINE__, __FILE__, and __FUNC__.

If this function returns, execution continues with an implementation
defined value as the result of the operation that overflowed.
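To make the proposed API concrete, here is a sketch of its use. Since `set_overflow_handler` exists only in this proposal (and in lcc-win), a stub stands in for the compiler's runtime:

```c
#include <stddef.h>
#include <stdio.h>

/* The proposal's types, reproduced as stubs: no shipping compiler
   provides set_overflow_handler, so this only illustrates the
   intended usage and the "NULL restores the default" rule. */
typedef void (*overflow_handler_t)(unsigned line_number,
                                   char *filename,
                                   char *function_name, ...);

static overflow_handler_t current_handler = NULL;  /* NULL = default */

overflow_handler_t set_overflow_handler(overflow_handler_t newvalue)
{
    overflow_handler_t old = current_handler;
    current_handler = newvalue;        /* NULL restores the default */
    return old;
}

/* A user handler that logs the overflow's coordinates and returns,
   after which execution would continue with an implementation-defined
   result for the operation that overflowed. */
static void log_overflow(unsigned line, char *file, char *func, ...)
{
    fprintf(stderr, "overflow at %s:%u in %s()\n", file, line, func);
}
```

A program would call `set_overflow_handler(log_overflow)` once at startup, and `set_overflow_handler(NULL)` to go back to the default behavior.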

-------------------------------------------------------------------

Implementation.

I have implemented this solution, and the overhead is almost zero.
The most important point for implementors is to realize that the
normal flow (i.e. when there is no overflow) should not be disturbed.

No overhead implementation:
--------------------------
1. Perform operation (add, subtract, etc)
2. Jump on overflow to an error label
3. Go on with the rest of the program

The overhead of this is below measurement accuracy on a PC system.
It can't be measured.

Implementation with small overhead (3-5%)
1. Perform operation
2. If no overflow jump to continuation
3. save registers
4. Push arguments
5. Call handler
6. Pop arguments
7. Restore registers
continuation:

The problem with the second implementation is that the flow of control
is disturbed. The branch to the continuation code will be mispredicted
since it is a forward branch. This provokes pipeline turbulence.

The first solution provokes no pipeline turbulence since the forward
jump will be predicted as not taken. This will be a good prediction
in the overwhelming majority of situations (no overflow). The only
overhead is just an additional instruction, i.e. almost nothing.
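The add-then-jump-on-overflow sequence described above is exactly what the checked-arithmetic builtins in modern GCC and Clang generate. As a present-day illustration (these builtins arrived in GCC 5, years after this 2009 thread), a sketch:

```c
#include <limits.h>

/* __builtin_add_overflow (a GCC 5+/Clang extension) compiles to the
   sequence described above: one add, then a conditional jump on the
   CPU's overflow flag. Shown as the closest present-day equivalent
   of the proposed checking; here the handler is replaced by simple
   saturation. */
static int add_or_clamp(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result))     /* rare, cold path */
        return (a > 0) ? INT_MAX : INT_MIN;        /* saturate */
    return result;                                 /* normal flow */
}
```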
----------------------------------------------------------------------

Dag-Erling Smørgrav

Sep 7, 2009, 8:53:38 AM
jacob navia <ja...@nospam.org> writes:
> Any of the four operations on signed integers can overflow. The result
> of the operation is meaningless, what can have a catastrophic impact
> on the operations that follow. There are important security issues
> associated with overflows.

There are plenty of cases - such as a linear congruential PRNG or
certain parts of a TCP stack - where overflow is either unimportant or
intentional. There are also plenty of cases where the programmer can
safely assume that overflow can not possibly happen.

> The only way to catch overflows now is to use cumbersome C expressions
> that force modifications of source code, and are very slow since done
> in C.

Your solution also requires modifying the source code, and to claim that
something is very slow just because it is "done in C" is simply idiotic.
As a compiler author, you should know better. A good compiler could
recognize at least some (correctly implemented) overflow checks as such
and generate code that checks the CPU's integer overflow flag instead of
performing an explicit comparison.
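One such "correctly implemented" check, written entirely in well-defined C, might look like this sketch (the function name is invented for illustration):

```c
#include <limits.h>

/* An overflow check in plain, well-defined C: the addition is done in
   unsigned arithmetic (which wraps, with no undefined behavior), and
   overflow is detected from the sign bits. Signed overflow occurred
   iff both operands have the same sign and the result's sign differs.
   A compiler can recognize this pattern and lower it to a single add
   plus a test of the overflow flag. */
static int signed_add_overflows(int a, int b)
{
    unsigned int ua = (unsigned int)a, ub = (unsigned int)b;
    unsigned int sum = ua + ub;                /* wraps, well defined */
    /* Shift the combined sign-difference bit down to bit 0. */
    return (((ua ^ sum) & (ub ^ sum)) >> (sizeof(int) * CHAR_BIT - 1)) != 0;
}
```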

> 2.A: A new pragma
> ------------------
>
> #pragma STDC OVERFLOW_CHECK on_off_flag
>
> When in the ON state, any overflow of an addition, subtraction,
> multiplication or division provokes a call to the overflow handler.
> Operations like +=, -=, *=, and /= are counted also.
>
> Only the types signed int and signed long long are concerned.

What about signed long? And why only signed? Overflow can be just as
painful in unsigned arithmetic.

> 2.B: Setting the handler for overflows
> ---------------------------------------
>
> overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);
>
> The function set_overflow_handler sets the function to be called in
> case of overflow to the specified value. If "newvalue" is NULL,
> the function sets the handler to the default value (the value
> it had at program startup).
>
> 2.C: The handler function
> -------------------------
>
> typedef void (*overflow_handler_t)(unsigned line_number,
> char *filename,
> char *function_name,...);
> This function will be called when an overflow is detected. The
> arguments have the same values as __LINE__ __FILE__ and __FUNC__

C already has a tried and tested mechanism for this kind of thing:
signals.
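A sketch of that alternative: an implementation could report overflow by raising SIGFPE, which the program catches with the standard signal machinery. The overflow itself is simulated here with `raise()`, since no compiler is actually delivering one:

```c
#include <signal.h>

/* The signal-based mechanism sketched. Portably, a handler may do
   little more than set a volatile sig_atomic_t flag. */
static volatile sig_atomic_t saw_overflow = 0;

static void on_overflow(int sig)
{
    (void)sig;
    saw_overflow = 1;
}

static int demo(void)
{
    signal(SIGFPE, on_overflow);
    raise(SIGFPE);      /* stands in for a detected overflow */
    return saw_overflow;
}
```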

> I have implemented this solution, and the overhead is almost zero.
> The most important point for implementors is to realize that the
> normal flow (i.e. when there is no overflow) should not be disturbed.

I find that slightly condescending. Compiler writers are (usually not)
idiots.

> No overhead implementation:
> --------------------------
> 1. Perform operation (add, subtract, etc)
> 2. Jump on overflow to an error label
> 3: Go on with the rest of the program

I assume that the code that follows the error label calls the handler.
What happens when the handler returns? Does the flow of control return
to the point immediately after the expression or sub-expression that
triggered the overflow?

> The overhead of this is below accuracy in a PC system.
> It can't be measured.

That depends entirely on the workload. Programs that perform large
amounts of integer arithmetic (e.g. signal processing, or numerical
analysis using arbitrary-precision arithmetic) may be noticeably
affected - not just because of the processing overhead, but also because
of the extra instructions, which reduce spatial locality.

> Implementation with small overhead (3-5%)
> 1. Perform operation
> 2. If no overflow jump to continuation
> 3. save registers
> 4. Push arguments
> 5. Call handler
> 6. Pop arguments
> 7. Restore registers
> continuation:
>
> The problem with the second implementation is that the flow of control
> is disturbed. The branch to the continuation code will be mispredicted
> since it is a forward branch. This provokes pipeline turbulence.

The branch in your "no-overhead" implementation is also a forward
branch.

Unlike lcc, the C standard is not limited to i386 and amd64 systems.
Most CPUs don't do branch prediction; many (if not most) of those that
do allow the compiler to provide hints. Some compilers (such as gcc)
allow the programmer to specify which branch is more likely to be taken.
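The gcc facility alluded to is `__builtin_expect`. A sketch of how such a hint would combine with an overflow check (`__builtin_add_overflow`, a later GCC/Clang extension, stands in for the detection itself):

```c
#include <limits.h>

/* The programmer marks the overflow branch as unlikely, so the
   compiler lays the cold path out of line regardless of what the
   CPU's branch predictor assumes. */
#define unlikely(x) __builtin_expect(!!(x), 0)

static int add_checked(int a, int b, int *out)
{
    if (unlikely(__builtin_add_overflow(a, b, out)))
        return -1;      /* cold path: overflow */
    return 0;           /* hot path: falls straight through */
}
```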

Even in the absence of hints, you can't make any assumptions about what
the CPU will do; it is reasonable (and, in this case, correct) for the
CPU to assume that a branch conditional on the overflow flag is used to
handle exceptional conditions, and therefore less likely to be followed.

BTW, I have yet to see a C compiler where the caller is responsible for
saving and restoring registers. Perhaps that is a peculiarity of lcc,
or of the Windows ABI? Usually, the callee saves and restores registers
that it intends to use for itself; some CPUs use register renaming or
other mechanisms to avoid pushing registers onto the stack.

> The first solution provokes no pipeline turbulence since the forward
> jump will be predicted as not taken. This will be a good prediction
> in the overwhelming majority of situations (no overflow). The only
> overhead is just an additional instruction, i.e. almost nothing.

On many microcontrollers, an additional instruction means an additional
clock cycle, no matter what.

DES
--
Dag-Erling Smørgrav - d...@des.no

Chris Dollin

Sep 7, 2009, 9:07:54 AM
Dag-Erling Smørgrav wrote:

> jacob navia <ja...@nospam.org> writes:

>> #pragma STDC OVERFLOW_CHECK on_off_flag
>>
>> When in the ON state, any overflow of an addition, subtraction,
>> multiplication or division provokes a call to the overflow handler.
>> Operations like +=, -=, *=, and /= are counted also.
>>
>> Only the types signed int and signed long long are concerned.
>
> What about signed long? And why only signed? Overflow can be just as
> painful in unsigned arithmetic.

The C standard specifies that unsigned arithmetic wraps around and
does not "overflow"; there's nothing to check and no room to manoeuver.
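Concretely: unsigned arithmetic is defined (C99 6.2.5p9) to be reduced modulo UINT_MAX + 1, so there is no overflow condition to report:

```c
#include <limits.h>

/* UINT_MAX + 1 is simply 0, by definition; this is well-defined
   behavior, not an overflow. */
static unsigned int wrap_demo(void)
{
    unsigned int u = UINT_MAX;
    return u + 1u;      /* wraps to 0 */
}
```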

--
"The career of Hern VI from its native Acolyte cluster across the
centre of the galaxy made history -- particularly in the field of
instrumentation."              - James Blish, /Earthman, Come Home/

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

jacob navia

Sep 7, 2009, 9:28:34 AM
Dag-Erling Smørgrav wrote:

> jacob navia <ja...@nospam.org> writes:
>> Any of the four operations on signed integers can overflow. The result
>> of the operation is meaningless, what can have a catastrophic impact
>> on the operations that follow. There are important security issues
>> associated with overflows.
>
> There are plenty of cases - such as a linear congruential PRNG or
> certain parts of a TCP stack - where overflow is either unimportant or
> intentional. There are also plenty of cases where the programmer can
> safely assume that overflow can not possibly happen.
>

In those cases you have... nothing to do. If you do not enable the
checking with the pragma, the code does the same thing as before.

I just do not understand your objection. Or maybe you start answering
messages before reading them to the end?

:-)

>> The only way to catch overflows now is to use cumbersome C expressions
>> that force modifications of source code, and are very slow since done
>> in C.
>
> Your solution also requires modifying the source code,

No. The lcc-win compiler accepts a command line argument that sets the
overflow checking for the compilation unit.

> and to claim that
> something is very slow just because it is "done in C" is simply idiotic.

C is in general slower than assembly language, especially here.

> As a compiler author, you should know better. A good compiler could
> recognize at least some (correctly implemented) overflow checks as such
> and generate code that checks the CPU's integer overflow flag instead of
> performing an explicit comparison.
>


There is NO compiler in the world that does that. And with good reason.

>> 2.A: A new pragma
>> ------------------
>>
>> #pragma STDC OVERFLOW_CHECK on_off_flag
>>
>> When in the ON state, any overflow of an addition, subtraction,
>> multiplication or division provokes a call to the overflow handler.
>> Operations like +=, -=, *=, and /= are counted also.
>>
>> Only the types signed int and signed long long are concerned.
>
> What about signed long? And why only signed? Overflow can be just as
> painful in unsigned arithmetic.
>

Unsigned arithmetic is defined in standard C. Please read the
corresponding pages of the standard.


>> 2.B: Setting the handler for overflows
>> ---------------------------------------
>>
>> overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);
>>
>> The function set_overflow_handler sets the function to be called in
>> case of overflow to the specified value. If "newvalue" is NULL,
>> the function sets the handler to the default value (the value
>> it had at program startup).
>>
>> 2.C: The handler function
>> -------------------------
>>
>> typedef void (*overflow_handler_t)(unsigned line_number,
>> char *filename,
>> char *function_name,...);
>> This function will be called when an overflow is detected. The
>> arguments have the same values as __LINE__ __FILE__ and __FUNC__
>
> C already has a tried and tested mechanism for this kind of thing:
> signals.
>

Why not use the signal mechanism?

The problem with signals is that here more information about WHERE the
error occurs is passed to the program. This is important for using
this feature effectively. Obviously the information can be available
in a debugger, if you put a breakpoint in a signal handler, but the
solution I propose doesn't need that, and can be used in a production
setting (logging the coordinates to a fault file, for instance).

>> I have implemented this solution, and the overhead is almost zero.
>> The most important point for implementors is to realize that the
>> normal flow (i.e. when there is no overflow) should not be disturbed.
>
> I find that slightly condescending. Compiler writers are (usually not)
> idiots.
>

Sorry, I can't please everyone. In another message Mr Nilsson asked me

Have you offered to show M$ or GCC how to implement it?

When I do that, you complain now.

>> No overhead implementation:
>> --------------------------
>> 1. Perform operation (add, subtract, etc)
>> 2. Jump on overflow to an error label
>> 3: Go on with the rest of the program
>
> I assume that the code that follows the error label calls the handler.
> What happens when the handler returns?

You should read the entire message before answering. That question
is answered just below.

> Does the flow of control return
> to the point immediately after the expression or sub-expression that
> triggered the overflow?
>

Yes.

>> The overhead of this is below accuracy in a PC system.
>> It can't be measured.
>
> That depends entirely on the workload. Programs that perform large
> amounts of integer arithmetic (e.g. signal processing, or numerical
> analysis using arbitrary-precision arithmetic) may be noticeably
> affected - not just because of the processing overhead, but also because
> of the extra instructions, which reduce spatial locality.
>

Maybe it is measurable, maybe not. In any case, since you can turn it
off at any time, this is not so important.

>> Implementation with small overhead (3-5%)
>> 1. Perform operation
>> 2. If no overflow jump to continuation
>> 3. save registers
>> 4. Push arguments
>> 5. Call handler
>> 6. Pop arguments
>> 7. Restore registers
>> continuation:
>>
>> The problem with the second implementation is that the flow of control
>> is disturbed. The branch to the continuation code will be mispredicted
>> since it is a forward branch. This provokes pipeline turbulence.
>
> The branch in your "no-overhead" implementation is also a forward
> branch.
>

Again. It is a forward branch that will be correctly predicted
since in most cases you have no overflow!

In the other implementation you have a forward branch that will be
almost always taken, provoking the turbulence.

> Unlike lcc, the C standard is not limited to i386 and amd64 systems.

That is big news to me.

:-)

> Most CPUs don't do branch prediction; many (if not most) of those that
> do allow the compiler to provide hints. Some compilers (such as gcc)
> allow the programmer to specify which branch is more likely to be taken.
>

So what? What is your point here?

> Even in the absence of hints, you can't make any assumptions about what
> the CPU will do; it is reasonable (and, in this case, correct) for the
> CPU to assume that a branch conditional on the overflow flag is used to
> handle exceptional conditions, and therefore less likely to be followed.
>
> BTW, I have yet to see a C compiler where the caller is responsible for
> saving and restoring registers. Perhaps that is a peculiarity of lcc,
> or of the Windows ABI? Usually, the callee saves and restores registers
> that it intends to use for itself; some CPUs use register renaming or
> other mechanisms to avoid pushing registers onto the stack.
>

lcc saves all registers before calling the handler procedure because an
operation is interrupted before we reach the next sequence point.

Normally, as you correctly point out, the callee doesn't save any
registers because the scratch registers are saved before calling a
function. But here we may be in the middle of an expression:

z = (R+67)/(w-34)+Height;

Scratch registers are holding values that need to be saved since
if the handler returns, the computation goes on.

> On many microcontrollers, an additional instruction means an additional
> clock cycle, no matter what.
>

You seem to be more interested in fast execution with possibly wrong
results than in correct results at all times. In any case, if you are
programming a coffee machine and overflow is not a problem, it
suffices to do NOTHING and everything will work as before.

What is the problem then?

jacob navia

Sep 7, 2009, 9:30:58 AM
jacob navia wrote:

> Only the types signed int and signed long long are concerned.

AAAARGH!

I forgot signed long. Sorry.

Thanks to Mr Smørgrav, who pointed out this error.

Gordon Burditt

Sep 7, 2009, 9:40:00 AM
>Overflow checking is not done in C. This article proposes a solution
>to close this hole in the language that has almost no impact in the run
>time behavior.
>
>1: The situation now
>--------------------
>
>Any of the four operations on signed integers can overflow. The result

There are a lot more than four operations. In particular, serious
attention needs to be given to increment and decrement, which probably
cause a lot more overflows than division.

>of the operation is meaningless, what can have a catastrophic impact
>on the operations that follow. There are important security issues
>associated with overflows.
>
>The only way to catch overflows now is to use cumbersome C expressions
>that force modifications of source code, and are very slow since done
>in C.
>
>2: The proposed solution
>------------------------
>
>In the experimental compiler lcc-win, I have implemented since 2003 an
>overflow checking mechanism.
>
> From that work I have derived this proposal.
>
>2.A: A new pragma
>------------------
>
>#pragma STDC OVERFLOW_CHECK on_off_flag
>
>When in the ON state, any overflow of an addition, subtraction,
>multiplication or division provokes a call to the overflow handler.
>Operations like +=, -=, *=, and /= are counted also.

Specific mention of ++ and -- here is more important than mentioning
+=, etc.

How fine-grained does this have to operate? If, for example:
e = ((a+b)
#pragma STDC OVERFLOW_CHECK on
*
#pragma STDC OVERFLOW_CHECK off
(c+d))+1;
will that check only the multiplication for overflow? If not, how do
I check only the multiplication for overflow?


>Only the types signed int and signed long long are concerned.

Why not signed long also? And signed short? Why can't signed chars
overflow? And why no checks for floating point? Pointers can wrap,
too.

>The initial state of the overflow flag is implementation defined.

This seems problematical for existing (bad) code.

>2.B: Setting the handler for overflows
>---------------------------------------
>
>overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);

This intrudes on the programmer's namespace, unless you put
declarations like this into a new (to-become-standard) header file
where they will interfere only if that header file is included.

>The function set_overflow_handler sets the function to be called in
>case of overflow to the specified value. If "newvalue" is NULL,
>the function sets the handler to the default value (the value
>it had at program startup).

You have failed to state what the default handler *does* when
it is called. This is too important to make it "implementation
defined" and leave it at that.

>
>2.C: The handler function
>-------------------------
>
>typedef void (*overflow_handler_t)(unsigned line_number,
> char *filename,
> char *function_name,...);

This is a variable-argument function? Why?

>This function will be called when an overflow is detected. The
>arguments have the same values as __LINE__ __FILE__ and __FUNC__

Under what circumstances will it be called with more than 3 arguments?

This makes it very difficult to deal with overflow in a way other
than spitting out an error message and optionally dying. In
particular, if I want to test if a particular expression overflowed,
I end up embedding *LINE NUMBERS* into the code, line numbers which
may very well change even if the only change is running it through
"indent" or editing comments. (A try/catch structure would work
better here.)

>If this function returns, execution continues with an implementation
>defined value as the result of the operation that overflowed.

Can this be a trap value? If so, the program may die before I
decide that the whole computation produced a useless value and
substitute a default or demand better input.

>-------------------------------------------------------------------
>
>Implementation.
>
>I have implemented this solution, and the overhead is almost zero.
>The most important point for implementors is to realize that the
>normal flow (i.e. when there is no overflow) should not be disturbed.
>
>No overhead implementation:

Don't turn "almost zero" into "zero": that's a standard marketing
lie. A technical proposal doesn't need this kind of exaggeration,
and it's not necessary. You can call it "almost zero overhead" if
you like, at least on an x86 architecture.

Gordon Burditt

Sep 7, 2009, 9:53:39 AM
>> There are plenty of cases - such as a linear congruential PRNG or
>> certain parts of a TCP stack - where overflow is either unimportant or
>> intentional. There are also plenty of cases where the programmer can
>> safely assume that overflow can not possibly happen.
>>
>
>In that case, in MOST cases you have... nothing to do.
>If you do not enable the checking with the pragma
>the code does the same thing as before.

This is inconsistent with your proposal that the initial state
of overflow checking is implementation-defined.

>> Even in the absence of hints, you can't make any assumptions about what
>> the CPU will do; it is reasonable (and, in this case, correct) for the
>> CPU to assume that a branch conditional on the overflow flag is used to
>> handle exceptional conditions, and therefore less likely to be followed.
>>
>> BTW, I have yet to see a C compiler where the caller is responsible for
>> saving and restoring registers. Perhaps that is a peculiarity of lcc,
>> or of the Windows ABI? Usually, the callee saves and restores registers
>> that it intends to use for itself; some CPUs use register renaming or
>> other mechanisms to avoid pushing registers onto the stack.
>>
>
>lcc saves all registers before calling the handler procedure because an
>operation is interrupted before we reach the next sequence point.

Also, you may need free registers, depending on the CPU, to push
arguments onto the stack. Also, the callee doesn't usually save
and restore the register(s) used for the return value, which might
be in use as a scratch register.

jacob navia

Sep 7, 2009, 10:06:40 AM
Gordon Burditt wrote:

>
> Specific mention of ++ and -- here is more important than mentioning
> +=, etc.
>

I will add them.

> How fine-grained does this have to operate? If, for example:
> e = ((a+b)
> #pragma STDC OVERFLOW_CHECK on
> *
> #pragma STDC OVERFLOW_CHECK off
> (c+d))+1;
> will that check only the multiplication for overflow? If not, how do
> I check only the multiplication for overflow?
>

This will work in lcc-win, but for a standard it is problematic because
it could be difficult for the optimizer to move code around.

>
>> Only the types signed int and signed long long are concerned.
>
> Why not signed long also?

I corrected that already. It was an oversight.

> And signed short? Why can't signed chars
> overflow?

Because they are promoted to ints when doing arithmetic.

> And why no checks for floating point?

Because they have their own set of flags.

> Pointers can wrap, too.

Sure, but if the wrapping around is done using unsigned arithmetic
the behavior is correct.

>
>> The initial state of the overflow flag is implementation defined.
>
> This seems problematical for existing (bad) code.
>

The goal here is to be as compatible with existing code as possible.
If the default were ON, as mandated by the standard, old code would no
longer work, because it would call a handler at overflow and the
program would crash. Not much fun.

This also allows an implementation of C for secure environments,
where overflow can't be tolerated, to set the checking ON by default,
or to control it with some global configuration file, command-line
option, whatever.


>> 2.B: Setting the handler for overflows
>> ---------------------------------------
>>
>> overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);
>
> This intrudes on the programmer's namespace, unless you put
> declarations like this into a new (to-become-standard) header file
> where they will interfere only if that header file is included.
>

Yes, I would propose that we use the same header file as Microsoft's
safer C library, or some new header such as

<overflowcheck.h>

>> The function set_overflow_handler sets the function to be called in
>> case of overflow to the specified value. If "newvalue" is NULL,
>> the function sets the handler to the default value (the value
>> it had at program startup).
>
> You have failed to state what the default handler *does* when
> it is called. This is too important to make it "implementation
> defined" and leave it at that.
>

It does... whatever it wants. How can we specify what a handler does?

Let's get real. Did the standard specify what a signal handler does?

>> 2.C: The handler function
>> -------------------------
>>
>> typedef void (*overflow_handler_t)(unsigned line_number,
>> char *filename,
>> char *function_name,...);
>
> This is a variable-argument function? Why?
>
>> This function will be called when an overflow is detected. The
>> arguments have the same values as __LINE__ __FILE__ and __FUNC__
>
> Under what circumstances will it be called with more than 3 arguments?
>

It could be that some implementations pass MORE information to the
overflow handler than the bare required minimum. What that may be is
up to the implementation.

> This makes it very difficult to deal with overflow in a way other
> than spitting out an error message and optionally dying. In
> particular, if I want to test if a particular expression overflowed,
> I end up embedding *LINE NUMBERS* into the code, line numbers which
> may very well change even if the only change is running it through
> "indent" or editing comments. (A try/catch structure would work
> better here.)


Sure, lcc-win implements try/catch. But it is already difficult (as
you see) to convince people of this extremely simple stuff; try/catch
is much more complex.


>
>> If this function returns, execution continues with an implementation
>> defined value as the result of the operation that overflowed.
>
> Can this be a trap value? If so, the program may die before I
> decide that the whole computation produced a useless value and
> substitute a default or demand better input.
>

Then you have to look for a better implementation; what do you expect?

You can always set up a setjmp BEFORE the computation and treat the
error in the setjmp clause. Your handler just does a longjmp.
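That recovery scheme can be sketched as follows. The call to the handler is simulated, since no shipping compiler inserts it:

```c
#include <setjmp.h>

/* A recovery point is set before the computation; the overflow
   handler longjmps to it, abandoning the half-evaluated expression
   entirely and substituting a fallback value. */
static jmp_buf recover;

static void my_overflow_handler(void)
{
    longjmp(recover, 1);        /* never returns to the expression */
}

static int compute_or_default(int fallback)
{
    if (setjmp(recover) != 0)
        return fallback;        /* reached only via the handler */
    my_overflow_handler();      /* stands in for a detected overflow */
    return 0;                   /* not reached in this demo */
}
```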

Dag-Erling Smørgrav

Sep 7, 2009, 10:13:57 AM
jacob navia <ja...@nospam.org> writes:

> Dag-Erling Smørgrav <d...@des.no> writes:
> > Most CPUs don't do branch prediction; many (if not most) of those that
> > do allow the compiler to provide hints. Some compilers (such as gcc)
> > allow the programmer to specify which branch is more likely to be taken.
> So what? What is your point here?

That your claims about the comparative performance of the two
implementations have no basis in reality.

> > BTW, I have yet to see a C compiler where the caller is responsible for
> > saving and restoring registers. Perhaps that is a peculiarity of lcc,
> > or of the Windows ABI? Usually, the callee saves and restores registers
> > that it intends to use for itself; some CPUs use register renaming or
> > other mechanisms to avoid pushing registers onto the stack.
> lcc saves all registers before calling the handler procedure because an
> operation is interrupted before we reach the next sequence point.

I'll have to remember never to use it, then.

> Normally, as you correctly point out, the callee doesn't save any
> registers because the scratch registers are saved before calling a
> function.

I said the *caller* normally doesn't save any registers.

> You seem to be more interested in fast execution with maybe wrong
> results than correct results at all times.

*you* were the one who brought up performance; *you* were the one who
claimed that overflow checking has no performance overhead.

jacob navia

Sep 7, 2009, 10:26:41 AM
Dag-Erling Smørgrav wrote:

> jacob navia <ja...@nospam.org> writes:
>> Dag-Erling Smørgrav <d...@des.no> writes:
>>> Most CPUs don't do branch prediction; many (if not most) of those that
>>> do allow the compiler to provide hints. Some compilers (such as gcc)
>>> allow the programmer to specify which branch is more likely to be taken.
>> So what? What is your point here?
>
> That your claims about the comparative performance of the two
> implementations have no basis in reality.
>

I measured the performance difference on my machine,
but I haven't done an extensive study.

What is obvious is that the performance hit will be almost
zero for most advanced CPUs.


If, as you said above, some compilers allow the programmer to specify
which branch is likely to be taken, it will be even easier for the
compiler to do that in assembly, without the user having to modify
the code.

>>> BTW, I have yet to see a C compiler where the caller is responsible for
>>> saving and restoring registers. Perhaps that is a peculiarity of lcc,
>>> or of the Windows ABI? Usually, the callee saves and restores registers
>>> that it intends to use for itself; some CPUs use register renaming or
>>> other mechanisms to avoid pushing registers onto the stack.
>> lcc saves all registers before calling the handler procedure because an
>> operation is interrupted before we reach the next sequence point.
>
> I'll have to remember never to use it, then.
>

Sure, nobody is forcing you to use it. If you took the trouble to
understand what I am saying, however, it would be more practical.

In an expression evaluation (which you snipped), scratch registers are
used to hold intermediate values. Since we may come back to this point
(if the handler returns), those values will be needed in the
evaluation of the full expression. We have to save them.

This is not a normal call. In a normal call, all scratch registers have
been saved to RAM or are unused, so there is no need to save them in the
caller code. This is NOT a normal call, since it has to preserve
context BETWEEN two sequence points.


>> Normally, as you correctly point out, the callee doesn't save any
>> registers because the scratch registers are saved before calling a
>> function.
>
> I said the *caller* normally doesn't save any registers.
>

Yes, you said that. But you misunderstood. See the explanation above.

>> You seem to be more interested in fast execution with maybe wrong
>> results than correct results at all times.
>
> *you* were the one who brought up performance; *you* were the one who
> claimed that overflow checking has no performance overhead.
>

I brought up performance because many people are concerned about it. If
you review this discussion, performance issues are the bulk of the
criticism of this proposal. So I tried to address those concerns with a
solution that doesn't impact performance at all.

Dag-Erling Smørgrav

unread,
Sep 7, 2009, 10:53:47 AM9/7/09
to
jacob navia <ja...@nospam.org> writes:
> What is obvious is that the performance hit will be almost zero for
> most advanced CPUs.

C is not restricted to "the most advanced CPUs". For every "advanced
CPU" in the world, there are tens or hundreds of embedded processors,
microcontrollers, DSPs etc. Even a COTS desktop, laptop or server with
an "advanced CPU" can contain multiple secondary processors: I've worked
with IBM servers that had an i486 (IIRC) on the backplane monitoring the
main CPUs.

Eric Sosman

unread,
Sep 7, 2009, 11:17:33 AM9/7/09
to
jacob navia wrote:
> Abstract:
>
> Overflow checking is not done in C. This article proposes a solution
> to close this hole in the language that has almost no impact in the run
> time behavior.
>
> 1: The situation now
> --------------------
>
> Any of the four operations on signed integers can overflow. The result
> of the operation is meaningless, which can have a catastrophic impact
> on the operations that follow. There are important security issues
> associated with overflows.

It seems "the four operations" are +, -, *, / and their
variants. Why omit <<? (Especially, why omit << if an
optimizer is likely to substitute it for *?)

> The only way to catch overflows now is to use cumbersome C expressions
> that force modifications of source code, and are very slow since done
> in C.
>
> 2: The proposed solution
> ------------------------
>
> In the experimental compiler lcc-win, I have implemented since 2003 an
> overflow checking mechanism.
>
> From that work I have derived this proposal.
>
> 2.A: A new pragma
> ------------------
>
> #pragma STDC OVERFLOW_CHECK on_off_flag
>
> When in the ON state, any overflow of an addition, subtraction,
> multiplication or division provokes a call to the overflow handler.
> Operations like +=, -=, *=, and /= are counted also.
>
> Only the types signed int and signed long long are concerned.

A peculiar limitation. Is there a reason for omitting
`signed long', and the signed <stdint.h> types that don't
promote to `int'? If the implementation permits wider-than-int
bit-fields, shouldn't their arithmetic be testable, too?

> The initial state of the overflow flag is implementation defined.
>
> 2.B: Setting the handler for overflows
> ---------------------------------------
>
> overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);
>
> The function set_overflow_handler sets the function to be called in
> case of overflow to the specified value. If "newvalue" is NULL,
> the function sets the handler to the default value (the value
> it had at program startup).

The main alternative would be to raise a signal and run the
signal handler. A weakness of that approach is that it's hard
to smuggle much information through C's bare-bones signal scheme;
the "wider" interface described here is more flexible.
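The point about the bare-bones signal scheme is easy to see in miniature:
a handler installed with `signal` learns only the signal number, so there
is no slot for a file name, line number, or the offending operands. A
small sketch, using `raise` to stand in for a hardware-detected overflow:

```c
#include <signal.h>

/* The only information a C signal handler ever receives is the
   signal number; contrast with the proposal's richer
   (line, file, function, ...) handler interface. */
static volatile sig_atomic_t got_overflow = 0;

static void fpe_handler(int sig)
{
    (void)sig;              /* no room for file/line/operand details */
    got_overflow = 1;
}

/* raise() simulates a detected overflow; returning from a handler
   entered via raise() is well defined in C. */
static int simulate_overflow(void)
{
    signal(SIGFPE, fpe_handler);
    raise(SIGFPE);
    return got_overflow;
}
```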

Note that since the overflow handler function may be called
at pretty much any moment in the evaluation of an expression,
the values of variables modified by that expression are uncertain.
This shouldn't be any more troublesome than `y[i++] = f(x[j++])',
though.

If overflow checking is enabled (perhaps by way of the I-D
initial state) but set_overflow_handler() has not been called,
what happens if an overflow occurs? Presumably there's a default
handler pre-set by the implementation; what does (should) it do?

> 2.C: The handler function
> -------------------------
>
> typedef void (*overflow_handler_t)(unsigned line_number,
> char *filename,
> char *function_name,...);
> This function will be called when an overflow is detected. The
> arguments have the same values as __LINE__ __FILE__ and __FUNC__

ITYM __func__, but I also question the utility. On an
implementation where __FILE__ is meaningful, it and __LINE__
suffice to locate the overflow site within a reasonably narrow
range, and __func__ does not narrow the range any further. Note
that all three pieces of information are meaningful only to a
person with access to the source code; none is helpful to a
source-less end user.

What variadic arguments are supplied, and how does the
handler learn about them?

> If this function returns, execution continues with an implementation
> defined value as the result of the operation that overflowed.
>
> -------------------------------------------------------------------
>
> Implementation.
>
> I have implemented this solution, and the overhead is almost zero.
> The most important point for implementors is to realize that the
> normal flow (i.e. when there is no overflow) should not be disturbed.
>
> No overhead implementation:
> --------------------------
> 1. Perform operation (add, subtract, etc)
> 2. Jump on overflow to an error label

The jump can't be a "pure" jump unless there's a separate
error label for every possible overflow site. Without some
notion of where the jump came from, I don't see how you could
figure out the __LINE__ value, nor how you could resume execution
if the overflow handler returns. If you jump to an error label
that's shared by all overflow sites in the function, you'll need
some kind of "JSR" or "BAL" jump. On architectures where this
kind of jump is unconditional, you'd need both a conditional
jump to test the overflow and a jump-with-back-link to get to
the overflow processing code.
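As an aside, the add-then-branch-on-overflow pattern under discussion can
be observed today through a compiler extension: GCC and Clang provide
`__builtin_add_overflow` (not standard C, and not part of this proposal),
which on x86 typically compiles to the operation followed by a conditional
jump on the overflow flag:

```c
#include <limits.h>
#include <stdbool.h>

/* On gcc/clang for x86 this typically becomes "add; jo <label>":
   the operation itself, then a conditional jump testing the CPU
   overflow flag, as in the "no overhead" scheme above. */
static bool checked_add(int a, int b, int *result)
{
    return __builtin_add_overflow(a, b, result);  /* true on overflow */
}
```

The builtin also sidesteps the back-link problem Eric raises, since the
compiler knows the site of each check and can branch to site-specific code.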

> 3: Go on with the rest of the program
>
> The overhead of this is below accuracy in a PC system.
> It can't be measured.
>
> Implementation with small overhead (3-5%)
> 1. Perform operation
> 2. If no overflow jump to continuation
> 3. save registers
> 4. Push arguments
> 5. Call handler
> 6. Pop arguments
> 7. Restore registers
> continuation:
>
> The problem with the second implementation is that the flow of control
> is disturbed. The branch to the continuation code will be mispredicted
> since it is a forward branch. This provokes pipeline turbulence.
>
> The first solution provokes no pipeline turbulence since the forward
> jump will be predicted as not taken. This will be a good prediction
> in the overwhelming majority of situations (no overflow). The only
> overhead is just an additional instruction, i.e. almost nothing.

Architectures with "branch delay slots" may also require
inserting a no-op, unless the code generator can find something
that can be done safely regardless of whether the
overflow handler is called.

There remains the question of implementation for architectures
where overflow detection is more burdensome. The Digital Alpha
has been cited as a machine where detection requires additional
compare instructions. Also, if you need both the operands and
the results to do post-hoc overflow detection, you can't use
instructions that overwrite operands with results; this means
that arithmetic uses more distinct CPU registers than it would
otherwise, restricting the optimizer's freedom to use those
registers for other purposes.

On the whole, the proposal seems a reasonable beginning.
Some of the design decisions (ignoring shifts, ignoring long,
handler arguments, ...) need more scrutiny, and the questions
of implementation efficiency are not altogether settled.

--
Eric Sosman
eso...@ieee-dot-org.invalid

jacob navia

unread,
Sep 7, 2009, 11:52:38 AM9/7/09
to
Eric Sosman wrote:

>
> It seems "the four operations" are +, -, *, / and their
> variants. Why omit <<? (Especially, why omit << if an
> optimizer is likely to substitute it for *?)
>

Why omit << and >>
------------------

Signed overflow on shifts could mean that when shifting bits out,
the sign changes. I do not think that in a shift operation a call
to a trap handler, with all its overhead, is justified. Most shift
operations handle the numbers as a bit sequence, not as arithmetical
values. True, you can substitute a multiplication by 2 with a shift,
and then the overflow wouldn't be detected, but that's life: you can't
have everything.

The compiler should be careful that when the pragma overflow check
is ON, the usual optimization of substituting a multiplication by a
power of two with a shift is no longer valid!
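Concretely, the substitution becomes invalid because a shift discards the
high bits without setting any overflow indication, so a checking compiler
has to keep the full range test. A sketch of that test for `x * 8` (the
helper name is illustrative):

```c
#include <limits.h>
#include <stdbool.h>

/* The test a checking compiler must evaluate for "x * 8": a plain
   "x << 3" would shift the excess bits away without setting the CPU
   overflow flag, hiding the overflow from the handler mechanism. */
static bool mul_by_8_overflows(int x)
{
    return x > INT_MAX / 8 || x < INT_MIN / 8;
}
```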

[snip]

>> Only the types signed int and signed long long are concerned.
>
> A peculiar limitation. Is there a reason for omitting
> `signed long',

This was an oversight.

> and the signed <stdint.h> types that don't
> promote to `int'?

Which ones? I mean short and chars promote to int as far as
I know.

> If the implementation permits wider-than-int
> bit-fields, shouldn't their arithmetic be testable, too?
>

If the bit-fields are promoted they will be tested anyway.

>
> The main alternative would be to raise a signal and run the
> signal handler. A weakness of that approach is that it's hard
> to smuggle much information through C's bare-bones signal scheme;
> the "wider" interface described here is more flexible.
>
> Note that since the overflow handler function may be called
> at pretty much any moment in the evaluation of an expression,
> the values of variables modified by that expression are uncertain.
> This shouldn't be any more troublesome than `y[i++] = f(x[j++])',
> though.
>

If there is an overflow, the whole expression is undefined. As things
stand, nobody is told, and sometimes a program gives wrong results or
makes the software crash hours later.

> If overflow checking is enabled (perhaps by way of the I-D
> initial state) but set_overflow_handler() has not been called,
> what happens if an overflow occurs?

The default handler is called.

> Presumably there's a default
> handler pre-set by the implementation; what does (should) it do?
>

This is implementation defined.

>> 2.C: The handler function
>> -------------------------
>>
>> typedef void (*overflow_handler_t)(unsigned line_number,
>> char *filename,
>> char *function_name,...);
>> This function will be called when an overflow is detected. The
>> arguments have the same values as __LINE__ __FILE__ and __FUNC__
>
> ITYM __func__, but I also question the utility. On an
> implementation where __FILE__ is meaningful, it and __LINE__
> suffice to locate the overflow site within a reasonably narrow
> range, and __func__ does not narrow the range any further. Note
> that all three pieces of information are meaningful only to a
> person with access to the source code; none is helpful to a
> source-less end user.
>

Like all bugs, they aren't user-friendly. But what would you expect
the C runtime to tell the user?

You should set up a handler that does something sensible, for instance
using the name of the function as a key into a hash table of recovery
points where the program can recover.


> What variadic arguments are supplied, and how does the
> handler learn about them?
>

This is implementation defined. An implementation could store the source
code of the offending lines and pass it as a char * to the handler,
or it could pass a pointer to the stack frame so that the program
could change the values of some variables, who knows?

There are a lot of possibilities. The variable arguments allow
implementations to pass more information...

>
> The jump can't be a "pure" jump unless there's a separate
> error label for every possible overflow site.

I do exactly that. I set up a special label for each line of
code where an overflow is possible.

> Without some
> notion of where the jump came from, I don't see how you could
> figure out the __LINE__ value, nor how you could resume execution
> if the overflow handler returns.

At the beginning of this I thought exactly as you do, and I had
a version with:
operation
if not overflow goto continuation
call handler
continuation:

But this incurs a 3-5% performance hit.

> If you jump to an error label
> that's shared by all overflow sites in the function, you'll need
> some kind of "JSR" or "BAL" jump. On architectures where this
> kind of jump is unconditional, you'd need both a conditional
> jump to test the overflow and a jump-with-back-link to get to
> the overflow processing code.
>

Yes.

>
> Architectures with "branch delay slots" may also require
> inserting a no-op, unless the code generator can find something
> that can be done safely regardless of whether the
> overflow handler is called.
>

Sure, but that is with all jumps.

> There remains the question of implementation for architectures
> where overflow detection is more burdensome. The Digital Alpha
> has been cited as a machine where detection requires additional
> compare instructions.

Happily, DEC is no longer there, and Compaq, which bought DEC,
is no longer there either, having been bought by Hewlett-Packard.

> Also, if you need both the operands and
> the results to do post-hoc overflow detection, you can't use
> instructions that overwrite operands with results; this means
> that arithmetic uses more distinct CPU registers than it would
> otherwise, restricting the optimizer's freedom to use those
> registers for other purposes.
>

Maybe. If overflow checking is too expensive and your application
doesn't need it, then just do not turn it on.

Keith Thompson

unread,
Sep 7, 2009, 1:04:14 PM9/7/09
to
jacob navia <ja...@nospam.org> writes:
> Gordon Burditt wrote:
[...]

>> Pointers can wrap, too.
>
> Sure, but if the wrapping around is done using unsigned arithmetic
> the behavior is correct.

If pointer arithmetic causes a pointer to wrap around, it's almost
certainly not the correct behavior, unless the implementation has
allocated an object in an address range that wraps around to zero.

Pointers are not numbers. Pointer "overflow" occurs when a pointer
goes beyond the bounds of the object to which it points, not when its
numeric value overflows. Some kind of checking for pointer "overflow"
would certainly be useful if it could be done reasonably efficiently,
but I don't think it should be part of a proposal for integer overflow
checking. It would require some sort of "fat pointers", and it's not
practical to require that for all implementations.
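To illustrate what such "fat pointers" might look like, here is a
hypothetical sketch; the struct layout and function are inventions for
illustration, and a real implementation would require ABI changes:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": carries its bounds so that pointer
   arithmetic can be range-checked at run time. */
typedef struct {
    int   *base;
    size_t length;   /* number of elements in the object */
    size_t index;    /* current position within it       */
} fat_ptr;

/* Returns false if the arithmetic would leave [base, base+length];
   index == length (one past the end) is legal, as in C. */
static bool fat_advance(fat_ptr *p, ptrdiff_t delta)
{
    if (delta < 0 ? (size_t)-delta > p->index
                  : (size_t)delta > p->length - p->index)
        return false;
    p->index += delta;
    return true;
}
```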

[...]

>> You have failed to state what the default handler *does* when
>> it is called. This is too important to make it "implementation
>> defined" and leave it at that.
>
> It does... whatever it wants. How can we specify what a handler does?

Um, by specifying what the default handler does. Remember, we're
talking about a default handler that's set up *by the implementation*.

There's an argument to be made for leaving it implementation-defined,
but it certainly *could* be specified.

[...]

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson

unread,
Sep 7, 2009, 1:09:41 PM9/7/09
to
Eric Sosman <eso...@ieee-dot-org.invalid> writes:
> jacob navia wrote:
[...]

>> 2.C: The handler function
>> -------------------------
>>
>> typedef void (*overflow_handler_t)(unsigned line_number,
>> char *filename,
>> char *function_name,...);
>> This function will be called when an overflow is detected. The
>> arguments have the same values as __LINE__ __FILE__ and __FUNC__
>
> ITYM __func__, but I also question the utility. On an
> implementation where __FILE__ is meaningful, it and __LINE__
> suffice to locate the overflow site within a reasonably narrow
> range, and __func__ does not narrow the range any further. Note
> that all three pieces of information are meaningful only to a
> person with access to the source code; none is helpful to a
> source-less end user.

There is precedent: the assert() macro prints the values of __FILE__,
__LINE__, and __func__. In another thread, I had suggested that jacob
should borrow that wording.

Keith Thompson

unread,
Sep 7, 2009, 1:14:38 PM9/7/09
to
jacob navia <ja...@nospam.org> writes:
> Eric Sosman wrote:
[...]

>> There remains the question of implementation for architectures
>> where overflow detection is more burdensome. The Digital Alpha
>> has been cited as a machine where detection requires additional
>> compare instructions.
>
> Happily, DEC is no longer there, and Compaq, which bought DEC,
> is no longer there either, having been bought by Hewlett-Packard.
[...]

Alpha processors are still in production use.

Are there other processors, perhaps even new ones, that use similar
schemes to what the Alpha uses? Or ones that use other schemes?
Personally, I don't know, and unless you've done an exhaustive study
of all CPUs currently in use, I suggest that you don't know either.

If this proposal is to have any chance of being accepted into the
standard, you can't ignore architectures that differ from the ones
you're accustomed to.

Maybe the best possible implementation of your proposal on the XYZ-137
imposes a 15% overhead on typical code -- and maybe that's acceptable.
I'm not saying you need to know everything about every CPU, just that
you need to acknowledge the issue. If you give the impression, even
unintentionally, of having an "All the world's an x86" attitude,
you'll be taken less seriously.

Keith Thompson

unread,
Sep 7, 2009, 1:25:17 PM9/7/09
to
jacob navia <ja...@nospam.org> writes:
> Abstract:
>
> Overflow checking is not done in C.

Correction: Overflow checking is not required in C. Since the
behavior of signed integer overflow is undefined, an implementation
can already do anything it likes, including checking.

Of course a standardized solution would have the advantage that users
could write portable code that uses it, unlike the current situation
where extensions are either inconsistent or nonexistent.

[...]

> Any of the four operations on signed integers can overflow.

Restricting this to +, -, *, /, even including ++, --, and the
compound assignment operators, is IMHO a bad idea.

The shift operators are not fundamentally different from +-*/.
Treating them differently would be an inconsistency in the language.

"/" can overflow in one rare case: INT_MIN/-1. "%" has problems with
the same operands; <TYPE>_MIN%-1 doesn't produce a mathematical result
that's outside the range of the type, but the C standard committee
recently added wording to 6.5.5 saying that the behavior of "%" is
undefined when "/" would overflow on the same operands. Your proposal
should cover this case.
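The two cases above can be guarded in portable C today. A minimal sketch;
the function name and the bool-return convention are illustrative, not
proposed API:

```c
#include <limits.h>
#include <stdbool.h>

/* Guards the cases discussed above: division by zero and
   INT_MIN / -1, whose quotient (INT_MAX + 1) is unrepresentable;
   per the C standard's 6.5.5 wording, the matching % is then
   undefined as well, so both are rejected together. */
static bool safe_divmod(int a, int b, int *quot, int *rem)
{
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;           /* caller decides how to recover */
    *quot = a / b;
    *rem  = a % b;
    return true;
}
```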

[...]


> Only the types signed int and signed long long are concerned.

And signed long, as you've already acknowledged.

What about extended integer types? ("lcc-win doesn't support extended
integer types" is not a good answer.)

[...]

Eric Sosman

unread,
Sep 7, 2009, 2:23:37 PM9/7/09
to
jacob navia wrote:
> [...]

> 2.A: A new pragma
> ------------------
>
> #pragma STDC OVERFLOW_CHECK on_off_flag
>
> When in the ON state, any overflow of an addition, subtraction,
> multiplication or division provokes a call to the overflow handler.
> Operations like +=, -=, *=, and /= are counted also.
> [...]

It occurs to me that this #pragma needs some tightening
up, because there are corner cases:

#pragma STDC OVERFLOW_CHECK OFF
x = a
#pragma STDC OVERFLOW_CHECK ON
+ b + c;


/* Different (?) from the above */
#pragma STDC OVERFLOW_CHECK OFF
x = a +
#pragma STDC OVERFLOW_CHECK ON
b + c;

One possibility would be to leave the tightening to the
implementation: If an expression contains sub-expressions in
which the state of overflow checking differs, let it be up to
the implementation how much of the expression actually gets
overflow checking.

A (weakly) related issue is to describe how faithfully the
generated code must follow the abstract machine. For example,
is it permissible to rewrite

#define OVERHEAD 1
#pragma STDC OVERFLOW_CHECK ON
x = a + OVERHEAD - 1;
as
x = a;

? The original expression can overflow if executed literally,
but the rewritten expression cannot; is the transformation
allowed? What optimizations (if any) must overflow detection
inhibit?

--
Eric Sosman
eso...@ieee-dot-org.invalid

christian.bau

unread,
Sep 7, 2009, 2:34:29 PM9/7/09
to
Some comments:

1. In addition to "off" and "on", the pragma could have a third
setting "restore" which will restore to the state before the previous
"on" or "off", allowing this to be nested. So if someone doing
encryption wants no overflow checking, they put "off" and "restore"
around their code, and after that code we are back to the initial
setting.

2. Instead of specifying what will happen, there could be wording like
"this pragma is intended to have documented behavior that can be
noticed and can be used to find problems instead of producing
undefined behavior when an overflow happens". If the compiler can
guarantee that any overflow will trap into the debugger, that would be
something I would find useful. This would also be more efficient with
processors that have a "sticky" overflow flag.

3. This should handle as many overflow situations as possible. For
example signed left shifts, and pointer overflow (if p is a pointer and
i is an int, one would expect p+i > p if i > 0 and p+i < p if i < 0.
In practice this happens only for a limited range of integers i;
adding i outside that range would be overflow).

4. Obviously any compiler implementing this and caring about speed
would find what is the fastest code _for the target machine_. A
conditional branch leaving normal control flow will obviously be
optimal on _some_ processors; for other processors a different
strategy would be used.

5. I would have suggested an operator overflow () ("operator" in the
sense that "sizeof" is an operator): The "overflow" operator has a
single argument which is an expression; it evaluates the expression
including side effects as other operators would do and yields 1 if any
of the operators in the source code of the expression produced an
overflow, 0 otherwise. For example:

if (overflow (x = a+b))
    printf ("Calculation produced overflow\n");
else
    printf ("Sum of a and b is %d\n", x);

If the expression contains function calls, the function calls would
not be checked.

As in the "pragma" version, I think this is easiest to implement if
the compiler generates tokens like "checked-operator-plus" instead of
"operator-plus", depending on the situation. This handles things like
an inline function compiled with overflow checking when overflow
checking is disabled where it is called; this would perform checks for
the inline function even when it is inlined.

Keith Thompson

unread,
Sep 7, 2009, 2:49:35 PM9/7/09
to
jacob navia <ja...@nospam.org> writes:
[...]

> From that work I have derived this proposal.
>
> 2.A: A new pragma
> ------------------
>
> #pragma STDC OVERFLOW_CHECK on_off_flag
[...]

A tiny quibble: the standard refers to this as "on-off-switch", not
"on_off_flag" (note '-' rather than '_').

Eric Sosman

unread,
Sep 7, 2009, 2:51:01 PM9/7/09
to
jacob navia wrote:
> Eric Sosman wrote:
>>
>> It seems "the four operations" are +, -, *, / and their
>> variants. Why omit <<? (Especially, why omit << if an
>> optimizer is likely to substitute it for *?)
>>
>
> Why omit << and >>
> ------------------
>
> Signed overflow on shifts could mean that when shifting bits out,
> the sign changes. I do not think that in a shift operation a call
> to a trap handler and all the big overhead is justified. Most shift-out
> operations handle the numbers as a bit sequence, not as arithmetical
> values. True, you can substitute a multiplication by 2 with a shift,
> and then the overflow wouldn't be detected, but that's life, you can't
> have everything
>
> The compiler should be careful that when the pragma overflow check
> is ON, the usual optimization of substituting a multiplication by a
> power of two with a shift is no longer valid!

The decision seems capricious. Also, forbidding the use
of shift to evaluate a multiplication adds to the cost of doing
the overflow checks (on the assumption that shifts are cheaper
than multiplications).

>>> Only the types signed int and signed long long are concerned.
>>
>> A peculiar limitation. Is there a reason for omitting
>> `signed long',
>
> This was an oversight.
>
>> and the signed <stdint.h> types that don't
>> promote to `int'?
>
> Which ones? I mean short and chars promote to int as far as
> I know.

Yes (except that char might promote to unsigned int). But
<stdint.h> can define types that are wider than int, and these
types do not promote at all. Arithmetic on int39_t values is
done (as if) in 39-bit arithmetic, not in promoted-to-64-bit
arithmetic. (This has implications for detection of overflow,
if the underlying hardware uses 64-bit arithmetic to simulate
39-bit operations.)

>> If the implementation permits wider-than-int
>> bit-fields, shouldn't their arithmetic be testable, too?
>
> If the bit-fields are promoted they will be tested anyway.

A bit-field wider than int cannot possibly promote to int.
(It can't promote at all, in fact.)

>>> 2.C: The handler function
>>> -------------------------
>>>
>>> typedef void (*overflow_handler_t)(unsigned line_number,
>>> char *filename,
>>> char *function_name,...);

>> [...]


>> What variadic arguments are supplied, and how does the
>> handler learn about them?
>
> This is implementation defined. An implementation could store the source
> code of the offending lines and pass it as a char * to the handler,
> or it could pass a pointer to the stack frame so that the program
> could change the values of some variables, who knows?

Ugh. That means the source code needs a different version
of the overflow handler for every implementation it might run on.
Sounds like a return to the pre-ANSI days, with separate #ifdef
blocks for every compiler known to Man.

>> The jump can't be a "pure" jump unless there's a separate
>> error label for every possible overflow site.
>
> I do exactly that. I setup a special label for each line of
> code where an overflow is possible.

Is that enough? You say that if the overflow handler returns,
execution proceeds as if the overflowing operation had produced
some implementation-defined value. But if there are several
possible overflow sites in the same expression, and you have
only one "bail out" label per line, how can you know where in
the expression execution should resume?

x = a * f() >= 0 ? b * g() : c * h();

The choice of "What next?" -- and the observable side-effects
of that choice -- cannot be figured out based only on knowing
that the overflow occurred in line 42.

>> Architectures with "branch delay slots" may also require
>> inserting a no-op, unless the code generator can find something
>> that can be done safely regardless of whether the
>> overflow handler is called.
>>
> Sure, but that is with all jumps.

Sure, but you're going to greatly increase the number of
jumps by inserting a new one after every arithmetic operation.
Many more jumps means many more delay slots, which means it's
less likely that useful work can be found for them, which means
that more of them will be filled with no-ops. Instead of one
ADD or whatever, you get ADD,JOV,NOP -- maybe not quite as bad
as it looks because there are other instructions to load operands
and store results and stuff, but it still has the effect of
shrinking the instruction cache.

>> There remains the question of implementation for architectures
>> where overflow detection is more burdensome. The Digital Alpha
>> has been cited as a machine where detection requires additional
>> compare instructions.
>
> Happily, DEC is no longer there, and Compaq, which bought DEC,
> is no longer there either, having been bought by Hewlett-Packard.

It's my impression -- only an impression, mind you -- that
ISO is not interested in a "Programming Languages - C for x86"
standard. You can't simply live in the 8086 past and ignore
modern designs.

>> Also, if you need both the operands and
>> the results to do post-hoc overflow detection, you can't use
>> instructions that overwrite operands with results; this means
>> that arithmetic uses more distinct CPU registers than it would
>> otherwise, restricting the optimizer's freedom to use those
>> registers for other purposes.
>
> Maybe. If overflow checking is too expensive and your application
> doesn't need it, then just do not turn it on.

What happened to "zero overhead?"

--
Eric Sosman
eso...@ieee-dot-org.invalid

Keith Thompson

unread,
Sep 7, 2009, 2:57:45 PM9/7/09
to
"christian.bau" <christ...@cbau.wanadoo.co.uk> writes:
> 1. In addition to "off" and "on", the pragma could have a third
> setting "restore" which will restore to the state before the previous
> "on" or "off", allowing this to be nested. So if someone doing
> encryption wants no overflow checking, they put "off" and "restore"
> around their code, and after that code we are back to the initial
> setting.

The existing STDC pragmas (see C99 6.10.6) take an "on-off-switch"
argument, which can be any of ON, OFF, or DEFAULT.

What you're suggesting would require an implicit stack of settings.
Having every occurrence of the pragma push a new value onto that stack
seems wasteful, conceptually if not practically.

I think there's some precedent for having PUSH and POP arguments. So
you could have:

#pragma STDC OVERFLOW_CHECK PUSH
/* doesn't change the current state, but sets things up for a
following POP */

#pragma STDC OVERFLOW_CHECK ON

...

#pragma STDC OVERFLOW_CHECK POP
/* restores previous state */

Or perhaps:

#pragma STDC OVERFLOW_CHECK PUSH_ON
...
#pragma STDC OVERFLOW_CHECK POP

I'm undecided whether this is worth doing, and if so just how it
should be specified. But if this were to be done for the
OVERFLOW_CHECK pragma, it should be done for all the STDC pragmas.

[...]

Hallvard B Furuseth

unread,
Sep 7, 2009, 3:02:10 PM9/7/09
to
Eric Sosman writes:
> jacob navia wrote:
>> [...]
>> 2.A: A new pragma
>> ------------------
>>
>> #pragma STDC OVERFLOW_CHECK on_off_flag
>>
>> When in the ON state, any overflow of an addition, subtraction,
>> multiplication or division provokes a call to the overflow handler.
>> Operations like +=, -=, *=, and /= are counted also.
>> [...]
>
> It occurs to me that this #pragma needs some tightening
> up, because there are corner cases:
>
> #pragma STDC OVERFLOW_CHECK OFF
> x = a
> #pragma STDC OVERFLOW_CHECK ON
> + b + c;
> (...)

Make it scoped and copy the scope rules of the floating-point pragmas.
(#pragma STDC FP_CONTRACT, FENV_ACCESS, CX_LIMITED_RANGE).

Well, actually I haven't checked carefully if they have the same rules,
but it seems preferable to not have different arithmetic pragmas with
different scope rules.

> A (weakly) related issue is to describe how faithfully the
> generated code must follow the abstract machine. For example,
> is it permissible to rewrite
>
> #define OVERHEAD 1
> #pragma STDC OVERFLOW_CHECK ON
> x = a + OVERHEAD - 1;
> as
> x = a;
> ?

Heh. I can argue both sides of that one. If it's permissible, the
pragma definition would need to define what kind of code rewrites are
permissible. Anything between two sequence points, perhaps?

Nitpick: That's the same as the simpler "x = a + 1 - 1" since macros
are textually replaced, the replacement text doesn't carry a private
"OVERFLOW_CHECK" tagging around.

Some other thoughts:

char and short arithmetic can overflow, if they are as wide as int so
promotion to int does not protect from overflow. (In the case of char,
that means char must be at least 16 bits wide.)

Defining an overflow handler function is still fairly limiting for what
one can do once overflow is detected. Since the example implementation
already does a goto, it'd be possible to turn this into a try-except
with unfriendly syntax instead, by allowing something like
#pragma STDC OVERFLOW_CHECK goto oops;
Except if the hardware traps on overflow instead of setting a flag.
Maybe C++ people can say how exception handling deals with that, and
what it'd cost to insert it in C. Possibly too much except as an
option, I don't know. Jacob has already been invited to submit it to
C++ folks, who can submit to C after working with it.

I'd call it #pragma STDC INT_OVERFLOW_CHECK or something, to make the
distinction from floating-point pragmas visible.

--
Hallvard

Hallvard B Furuseth
Sep 7, 2009, 3:06:25 PM
I wrote:
> char and short arithmetic can overflow, if they are as wide as int so
> promotion to int does not protect from overflow. (In the case of char,
> that means char must be at least 16 bits wide.)

Duh, sorry. They get promoted to int first, of course.

--
Hallvard

jacob navia
Sep 7, 2009, 3:15:59 PM
christian.bau wrote:

> Some comments:
>
> 1. In addition to "off" and "on", the pragma could have a third
> setting "restore" which will restore to the state before the previous
> "on" or "off", allowing this to be nested. So if someone doing
> encryption wants no overflow checking, they put "off" and "restore"
> around their code, and after that code we are back to the initial
> setting.
>

Lcc-win offers that for ALL pragmas it supports. The syntax is:

#pragma something(push,on)

#pragma something(pop)

This was introduced by the Microsoft compiler years ago, at least since
the 32-bit versions. I have generalized it to all pragmas. But that
is ANOTHER proposal. :-)

> 2. Instead of specifying what will happen, there could be wording like
> "this pragma is intended to have documented behavior that can be
> noticed and can be used to find problems instead of producing
> undefined behavior when an overflow happens". If the compiler can
> guarantee that any overflow will trap into the debugger, that would be
> something I would find useful. This would also be more efficient with
> processors that have a "sticky" overflow flag.
>

That is too vague to be really useful. A defined API and a defined
type are more concrete and allow for more portable code.

> 3. This should handle as many overflow situations as possible. For
> example signed left shifts, pointer overflow (if p is a pointer and i
> is an int, then one would expect p+i > p if i > 0 and p+i < p if i <
> i. In practice this happens only for a limited range of integers i.
> Adding i outside that range would be overflow).
>

I am not sure if (p+i) --> p + (unsigned)i. The relevant part of the
standard (6.5.6) doesn't require this, so I think we could add your
suggestion.


The addition of a pointer with an integer should be checked.


>
> 5. I would have suggested an operator overflow () ("operator" in the
> sense that "sizeof" is an operator): The "overflow" operator has a
> single argument which is an expression; it evaluates the expression
> including side effects as other operators would do and yields 1 if any
> of the operators in the source code of the expression produced an
> overflow, 0 otherwise. For example:
>
> if (overflow (x = a+b)) printf ("Calculation produced overflow
> \n");
> else printf ("Sum of a and b is %d\n", x);
>
> If the expression contains function calls, the function calls would
> not be checked.
>
> As in the "pragma" version, I think this is easiest to implement if
> the compiler generates tokens like "checked-operator-plus" instead of
> "operator-plus", depending on the situation. This handles things like
> an inline function compiled with overflow checking when overflow
> checking is disabled where it is called; this would perform checks for
> the inline function even when it is inlined.

1. Your proposal is very good when you are interested in checking a
single expression. It is bad for checking a whole program.

2. I think your approach is complementary to mine. But it is a much
bigger change in the language.

Given the opposition my proposals have met, I have greatly reduced
their scope. In principle your approach is very good and
I would support it.
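Today, absent either proposal, such a check has to be written out by hand. A minimal sketch of the kind of cumbersome expression involved, checking signed int addition; the helper name is illustrative, not part of either proposal:

```c
#include <limits.h>

/* Hand-written overflow check for signed int addition. The operands
   must be tested *before* adding, since signed overflow itself is
   undefined behavior. */
static int checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) ||   /* sum would exceed INT_MAX */
        (b < 0 && a < INT_MIN - b))     /* sum would fall below INT_MIN */
        return 0;                       /* overflow: nothing stored */
    *result = a + b;
    return 1;
}
```

Every checked operation needs a test-and-branch like this written at the call site, which is exactly the source-code intrusion the pragma proposal tries to avoid.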

jacob navia
Sep 7, 2009, 3:18:58 PM
Eric Sosman wrote:

> jacob navia wrote:
>> Eric Sosman wrote:
>>> What variadic arguments are supplied, and how does the
>>> handler learn about them?
>>
>> This is implementation defined. An implementation could store the source
>> code of the offending lines and pass it as a char * to the handler,
>> or it could pass a pointer to the stack frame so that the program
>> could change the values of some variables, who knows?
>
> Ugh. That means the source code needs a different version
> of the overflow handler for every implementation it might run on.
> Sounds like a return to the pre-ANSI days, with separate #ifdef
> blocks for every compiler known to Man.
>

ONLY if you want to handle those extra arguments!

If you want maximum portability, you stick to the three arguments
specified in the standard. If you handle the optional arguments,
the code is no longer portable, of course. The extra arguments are
there to give the implementors of this flexibility.

jacob navia
Sep 7, 2009, 3:31:00 PM
Eric Sosman wrote:

>>> and the signed <stdint.h> types that don't
>>> promote to `int'?
>>
>> Which ones? I mean short and chars promote to int as far as
>> I know.
>
> Yes (except that char might promote to unsigned int). But
> <stdint.h> can define types that are wider than int, and these
> types do not promote at all. Arithmetic on int39_t values is
> done (as if) in 39-bit arithmetic, not in promoted-to-64-bit
> arithmetic. (This has implications for detection of overflow,
> if the underlying hardware uses 64-bit arithmetic to simulate
> 39-bit operations.)
>

Actually, the overflow checking should be done for all signed integer
types at least as wide as int.

>>> The jump can't be a "pure" jump unless there's a separate
>>> error label for every possible overflow site.
>>
>> I do exactly that. I setup a special label for each line of
>> code where an overflow is possible.
>
> Is that enough? You say that if the overflow handler returns,
> execution proceeds as if the overflowing operation had produced
> some implementation-defined value. But if there are several
> possible overflow sites in the same expression, and you have
> only one "bail out" label per line, how can you know where in
> the expression execution should resume?
>
> x = a * f() >= 0 ? b * g() : c * h();
>

You do not have access to that information. If you want to
know which expression overflows, you break that expression into
several lines. Some implementations could generate code that
stores the variables being checked and passes them to the handler,
so that you would receive (in the extra arguments provided) the
whole expression and the position within it.

You see why the extra arguments are useful?

The standard would prescribe the minimum, other implementations,
with implementation specific flags could do much more.

> The choice of "What next?" -- and the observable side-effects
> of that choice -- cannot be figured out based only on knowing
> that the overflow occurred in line 42.
>

You can very easily break up the expression. Besides, with this
proposal you know at least that there was an overflow.

With the current situation you just get a wrong result with
no warning whatsoever!

>>> Architectures with "branch delay slots" may also require
>>> inserting a no-op, unless the code generator can find something
>>> something that can be done safely regardless of whether the
>>> overflow handler is called.
>>>
>> Sure, but that is with all jumps.
>
> Sure, but you're going to greatly increase the number of
> jumps by inserting a new one after every arithmetic operation.
> Many more jumps means many more delay slots, which means it's
> less likely that useful work can be found for them, which means
> that more of them will be filled with no-ops. Instead of one
> ADD or whatever, you get ADD,JOV,NOP -- maybe not quite as bad
> as it looks because there are other instructions to load operands
> and store results and stuff, but it still has the effect of
> shrinking the instruction cache.
>

Why insert nops? You do not insert anything. If there is an overflow,
it doesn't matter if one extra instruction was executed because you
are going to go to the exception handler anyway and the result of
the whole expression is undefined, so you can avoid those NOPs
without any danger.

>>> There remains the question implementation for architectures
>>> where overflow detection is more burdensome. The Digital Alpha
>>> has been cited as a machine where detection requires additional
>>> compare instructions.
>>
>> Happily DEC is no longer there, and COMPAQ, that bought DEC
>> is no longer there either, being bought by Hewlett Packard.
>
> It's my impression -- only an impression, mind you -- that
> ISO is not interested in a "Programming Languages - C for x86"
> standard. You can't simply live in the 8086 past and ignore
> modern designs.
>

The Intel/Amd architecture "the past" ???

Well, I am sorry, but it really doesn't look that way, mind you.

But anyway, even machines like the Alpha can detect overflow without
a lot of problems... They will take a small performance hit.

If that is too much for you, avoid those machines or do not check for
overflow.

>>> Also, if you need both the operands and
>>> the results to do post-hoc overflow detection, you can't use
>>> instructions that overwrite operands with results; this means
>>> that arithmetic uses more distinct CPU registers than it would
>>> otherwise, restricting the optimizer's freedom to use those
>>> registers for other purposes.
>>
>> Maybe. If overflow checking is too expensive and your application
>> doesn't need it, then just do not turn it on.
>
> What happened to "zero overhead?"
>

I said zero overhead for normal RISC machines or for the x86. Not
for the alpha, or for any brain dead machines out there!

jacob navia
Sep 7, 2009, 3:33:03 PM
Keith Thompson wrote:


Microsoft proposed

#pragma something(push, newvalue)
and
#pragma something(pop)

I have generalized that, so ALL lcc-win pragmas have a stack.

This would be very useful for many things but it is ANOTHER
discussion and ANOTHER proposal.

jacob navia
Sep 7, 2009, 3:44:07 PM
Eric Sosman wrote:

> A (weakly) related issue is to describe how faithfully the
> generated code must follow the abstract machine. For example,
> is it permissible to rewrite
>
> #define OVERHEAD 1
> #pragma STDC OVERFLOW_CHECK ON
> x = a + OVERHEAD - 1;
> as
> x = a;
>
> ? The original expression can overflow if executed literally,
> but the rewritten expression cannot; is the transformation
> allowed? What optimizations (if any) must overflow detection
> inhibit?
>

This is implementation defined.

This checking should NOT interfere with the language in such a manner
as to be very expensive and make optimizations impossible. If constant
propagation, a very simple and safe optimization, is rendered impossible
because of the overflow checking, nobody will want to use this.

Keith Thompson
Sep 7, 2009, 3:58:54 PM
jacob navia <ja...@nospam.org> writes:
> Eric Sosman wrote:
>> jacob navia wrote:
>>> Eric Sosman wrote:

The prototype you propose is:

typedef void (*overflow_handler_t)(unsigned line_number,
char *filename,
char *function_name,...);

For any variadic function, the values of the parameters that precede
the ", ..." must at least let the function tell whether to start
looking for more arguments.

How is an overflow handler going to be able to tell, from the values
of line_number, filename, and function_name, whether any more
arguments were passed?

Keith Thompson
Sep 7, 2009, 4:01:55 PM
jacob navia <ja...@nospam.org> writes:
> Keith Thompson wrote:
[...]

>> Or perhaps:
>>
>> #pragma STDC OVERFLOW_CHECK PUSH_ON
>> ...
>> #pragma STDC OVERFLOW_CHECK POP
>>
>> I'm undecided whether this is worth doing, and if so just how it
>> should be specified. But if this were to be done for the
>> OVERFLOW_CHECK pragma, it should be done for all the STDC pragmas.
>>
>> [...]
>
> Microsoft proposed
>
> #pragma something(push, newvalue)
> and
> #pragma something(pop)
>
> I have generalized that, so ALL lcc-win pragmas have a stack.
>
> This would be very useful for many things but it is ANOTHER
> discussion and ANOTHER proposal.

If you're going to use the "#pragma STDC" syntax, I think you need
to be consistent with the existing (and any future) STDC pragmas.
If you're going to propose a new mechanism to be used with #pragma
STDC OVERFLOW_CHECK, I think you should propose the same mechanism
for all of them, just to avoid creating a gratuitous inconsistency
in the language.

Keith Thompson
Sep 7, 2009, 4:02:52 PM
Hallvard B Furuseth <h.b.fu...@usit.uio.no> writes:
[...]

> I'd call it #pragma STDC INT_OVERFLOW_CHECK or something, to make the
> distinction from floating-point pragmas visible.

INTEGER_OVERFLOW_CHECK, not INT_OVERFLOW_CHECK, since it doesn't just
apply to type int.

Keith Thompson
Sep 7, 2009, 4:05:58 PM
jacob navia <ja...@nospam.org> writes:
> christian.bau wrote:
[...]

>> 3. This should handle as many overflow situations as possible. For
>> example signed left shifts, pointer overflow (if p is a pointer and i
>> is an int, then one would expect p+i > p if i > 0 and p+i < p if i <
>> i. In practice this happens only for a limited range of integers i.
>> Adding i outside that range would be overflow).
>
> I am not sure if (p+i) --> p + (unsigned)i. The relevant part of the
> standard (6.5.6) doesn't require this, so I think we could add your
> suggestion.

No, p+i is certainly not equivalent to p+(unsigned)i.

int arr[10];
int *p = arr+5;
int i = -1;
p + i; /* points to arr[4] */

"Overflow" for pointer arithmetic is defined in terms of the bounds of
the object being pointed to. I suggest that checking such overflows
is beyond the scope of your proposal. I've discussed this in more
detail elsethread.
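Even though such checking is out of scope for the proposal, a bounds check of this kind can be written by hand when the enclosing array is known. A sketch under that assumption (the compiler has no portable way to recover `base` and `len` on its own; the function name is illustrative):

```c
#include <stddef.h>

/* Manual range check for p + i, valid only when p already points into
   base[0..len]. Pointing one past the end is allowed, per 6.5.6. */
static int ptr_add_in_bounds(const int *base, size_t len,
                             const int *p, ptrdiff_t i)
{
    ptrdiff_t pos = p - base;   /* current index, in 0..len */
    /* Rearranged so pos + i is never computed directly,
       avoiding overflow in the check itself. */
    return i >= -pos && i <= (ptrdiff_t)len - pos;
}
```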

[...]

Eric Sosman
Sep 7, 2009, 4:17:37 PM
Keith Thompson wrote:
> [...]

> The prototype you propose is:
>
> typedef void (*overflow_handler_t)(unsigned line_number,
> char *filename,
> char *function_name,...);
>
> For any variadic function, the values of the parameters that precede
> the ", ..." need to be able at least to tell the function whether to
> start looking for more arguments.

It suffices that the function be able to tell "somehow," not
that the information be conveyed in the fixed arguments. If the
implementation-specific stuff changes from one implementation to
another but not within a single implementation, then a suitable
#ifdef will do it.

That said, I think

typedef void (*overflow_handler_t)
(struct overflow_handler_data *);

would be a better choice. The struct would have certain "always
present" elements and others that the implementation might choose
to add, in the manner of various other library functions.
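A sketch of what that struct-based interface might look like; all names here are hypothetical, invented for illustration:

```c
/* Hypothetical struct-based handler interface. The first members are
   the "always present" ones; an implementation could append more
   members of its own after them. */
struct overflow_handler_data {
    unsigned    line_number;
    const char *filename;
    const char *function_name;
};

typedef void (*overflow_handler_t)(struct overflow_handler_data *);

/* Example handler: record where the last overflow happened. */
static unsigned last_overflow_line;

static void record_overflow(struct overflow_handler_data *d)
{
    last_overflow_line = d->line_number;
}
```

Because the fixed members travel inside a struct, an implementation can extend it without changing the handler's type, and the handler needs no variadic machinery at all.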

> How is an overflow handler going to be able to tell, from the values
> of line_number, filename, and function_name, whether any more
> arguments were passed?

By not asking questions it doesn't need to ask?

--
Eric Sosman
eso...@ieee-dot-org.invalid

Eric Sosman
Sep 7, 2009, 4:34:33 PM

The original proposal says "If [the handler] returns,
execution continues with an implementation defined value as
the result of the operation that overflowed." How can you
accomplish that if all you know is that the overflow occurred
somewhere in line 42? If all three potential overflows are
lumped together as "line 42," where do you resume execution
without knowing which of the three multiplications overflowed?

> You see why the extra arguments are useful?

I did not say they weren't. But they're not enough, absent
a way to get back to the point just after the overflow.

>> The choice of "What next?" -- and the observable side-effects
>> of that choice -- cannot be figured out based only on knowing
>> that the overflow occurred in line 42.
>
> You can very easily break up the expression. Besides, with this
> proposal you know at least that there was an overflow.

Is it permissible to have more than one potentially-overflowing
operator in an expression, *and* for the handler to return? If so,
the implementation needs to keep track of the restart point.

>>>> Architectures with "branch delay slots" may also require
>>>> inserting a no-op, unless the code generator can find something
>>>> something that can be done safely regardless of whether the
>>>> overflow handler is called.
>>>>
>>> Sure, but that is with all jumps.
>>
>> Sure, but you're going to greatly increase the number of
>> jumps by inserting a new one after every arithmetic operation.
>> Many more jumps means many more delay slots, which means it's
>> less likely that useful work can be found for them, which means
>> that more of them will be filled with no-ops. Instead of one
>> ADD or whatever, you get ADD,JOV,NOP -- maybe not quite as bad
>> as it looks because there are other instructions to load operands
>> and store results and stuff, but it still has the effect of
>> shrinking the instruction cache.
>
> Why insert nops? You do not insert anything. If there is an overflow,
> it doesn't matter if one extra instruction was executed because you
> are going to go to the exception handler anyway and the result of
> the whole expression is undefined, so you can avoid those NOPs
> without any danger.

Not if the instruction in the branch delay slot changes the
state -- for example, by executing another arithmetic opcode that
also overflows ...

>>>> There remains the question implementation for architectures
>>>> where overflow detection is more burdensome. The Digital Alpha
>>>> has been cited as a machine where detection requires additional
>>>> compare instructions.
>>>
>>> Happily DEC is no longer there, and COMPAQ, that bought DEC
>>> is no longer there either, being bought by Hewlett Packard.
>>
>> It's my impression -- only an impression, mind you -- that
>> ISO is not interested in a "Programming Languages - C for x86"
>> standard. You can't simply live in the 8086 past and ignore
>> modern designs.
>
> The Intel/Amd architecture "the past" ???

Yes. The 8086 reached the market in the middle of the Carter
administration, and its origins stretch back to the Nixon years,
if not further. It's a pretty old design.

> But anyway, even machines like the alpha can detect overflow with
> not a lot of problems... They will have a small performance hit

You've measured that performance hit, have you? Or at least
made a serious attempt to estimate it?

>>> Maybe. If overflow checking is too expensive and your application
>>> doesn't need it, then just do not turn it on.
>>
>> What happened to "zero overhead?"
>>
> I said zero overhead for normal RISC machines or for the x86. Not
> for the alpha, or for any brain dead machines out there!

Too bad -- You'd been doing so well, so very well, and then ...
(If you wear wooden shoes, does shooting yourself in the foot
count as sabotage?)

--
Eric Sosman
eso...@ieee-dot-org.invalid

jacob navia
Sep 7, 2009, 6:31:09 PM
Eric Sosman wrote:

> I did not say they weren't. But they're not enough, absent
> a way to get back to the point just after the overflow.
>

The point just after the overflow is just a label. Look at the generated
code:

subl %ebx,%ecx
jo _$L5 ; jump if overflow to L5
_$L6:
cdq
idivl %ecx
;; the rest of the normal control flow
;; goes here

After the end of the function we have label L5:
_$L5:
pusha
pushl $6
pushl $main__labelname
call __overflow
addl $8,%esp
popa
jmp _$L6

You see now?

>
> Is it permissible to have more than one potentially-overflowing
> operator in an expression, *and* for the handler to return? If so,
> the implementation needs to keep track of the restart point.
>

Yes, and I do that above.

>
> Not if the instruction in the branch delay slot changes the
> state -- for example, by executing another arithmetic opcode that
> also overflows ...
>

The result of the expression is undefined. It doesn't matter.

>> The Intel/Amd architecture "the past" ???
>
> Yes. The 8086 reached the market in the middle of the Carter
> administration, and its origins stretch back to the Nixon years,
> if not further. It's a pretty old design.
>

Mmmm, I would say there are a few differences between the 8086 and
the Intel i7 with 8 cores I am using now, excuse me. Obviously
just small differences from your point of view

:-)


>> But anyway, even machines like the alpha can detect overflow with
>> not a lot of problems... They will have a small performance hit
>
> You've measured that performance hit, have you? Or at least
> made a serious attempt to estimate it?
>

Yes. According to the discussion we had, it has instructions that test
for overflow. You have to do a pipeline flush to keep the overflow
flag in sync with the execution unit, so there is pipeline turbulence.

This is an old problem. Chicken or egg?

Since C doesn't test overflow, hardware designers started getting rid
of the overflow flag. This #pragma could make them think again.


>>>> Maybe. If overflow checking is too expensive and your application
>>>> doesn't need it, then just do not turn it on.
>>>
>>> What happened to "zero overhead?"
>>>
>> I said zero overhead for normal RISC machines or for the x86. Not
>> for the alpha, or for any brain dead machines out there!
>
> Too bad -- You'd been doing so well, so very well, and then ...
> (If you wear wooden shoes, does shooting yourself in the foot
> count as sabotage?)
>

I think that a machine where overflow can't be easily checked is
well and truly brain dead. The DEC people were known for their VAX,
which arbitrarily decided to cut strings at 64K.

Well, each company has its bugs and stuff. Intel people have other bugs.

Richard Bos
Sep 7, 2009, 7:08:55 PM
Dag-Erling Smørgrav <d...@des.no> wrote:

> jacob navia <ja...@nospam.org> writes:
> > What is obvious is that the performance hit will be almost zero for
> > most advanced CPUs.
>
> C is not restricted to "the most advanced CPUs". For every "advanced
> CPU" in the world, there are tens or hundreds of embedded processors,
> microcontrollers, DSPs etc. Even a COTS desktop, laptop or server with
> an "advanced CPU" can contain multiple secondary processors: I've worked
> with IBM servers that had an i486 (IIRC) on the backplane monitoring the
> main CPUs.

Yes, but they don't count, because they don't support jacob's compiler.

Richard

Bart
Sep 7, 2009, 8:49:41 PM

It must be great being a CPU designer, being able to create novel
hardware incompatible with anything in the past, present or future,
and apparently not caring whether it's compatible with any software
either.

That all seems to be perfectly acceptable. But when it comes to
languages, we're only allowed to have this single, monolithic
super-language that must run on any conceivable hardware, from the
lowliest crummy microprocessor up to supercomputers, even though
there are several evidently different strata of application areas.

Odd, isn't it. (And I'm talking about C, since I don't know of any
mainstream languages quite like it.)

Personally I wouldn't have a problem with, say, a C-86 language that
is targeted at x86-class processors. It would make a lot of things a
lot simpler. And people who use other processors can have their own
slightly different versions. C is after all supposed to work at the
machine level; finally it would know exactly what that machine is!

--
Bartc

Eric Sosman
Sep 7, 2009, 9:09:37 PM
jacob navia wrote:
> Eric Sosman wrote:
>> I did not say they weren't. But they're not enough, absent
>> a way to get back to the point just after the overflow.
>>
>
> The point just after the overflow is just a label. Look at the generated
> code:
>
> subl %ebx,%ecx
> jo _$L5 ; jump if overflow to L5
> _$L6:
> cdq
> idivl %ecx
> ;; the rest of the normal control flow
> ;; goes here
>
> After the end of the function we have label L5:
> _$L5:
> pusha
> pushl $6
> pushl $main__labelname
> call __overflow
> addl $8,%esp
> popa
> jmp _$L6
>
> You see now?

I see that you cannot implement the specification you
promulgated, not with one "L5" per line of possibly-overflowing
code. You need one "L5" per potential overflow *site*, and
that's another matter.

The penalty can be reduced, if each "L5" is just a call-and-
return to The Great Caller of the Overflow Handler. But you still
need something like four instructions (three typically not executed)
per arithmetic operation: ADD, JOV, CALLequiv, RETequiv.

>> Is it permissible to have more than one potentially-overflowing
>> operator in an expression, *and* for the handler to return? If so,
>> the implementation needs to keep track of the restart point.
>
> Yes, and I do that above.

No, you fail to do that above. If all you've got is an "L5"
that jumps unconditionally to "L6," you've got either an infinite
loop or a total abortion.

>> Not if the instruction in the branch delay slot changes the
>> state -- for example, by executing another arithmetic opcode that
>> also overflows ...
>
> The result of the expression is undefined. It doen't matter.

Then how can you say that "execution continues?" What does
it mean to "continue" an execution, but either skipping its side-
effects or performing them infinitely often?

Perhaps the proposal would be improved (it would certainly
be simplified) by saying that the behavior is undefined if the
handler returns. (See precedents in signal handlers.) Take
careful note: This is a constructive suggestion, not an attack.

>>> The Intel/Amd architecture "the past" ???
>>
>> Yes. The 8086 reached the market in the middle of the Carter
>> administration, and its origins stretch back to the Nixon years,
>> if not further. It's a pretty old design.
>
> Mmmm I would say there are a few differences between the 8086 and
> the intel i7 with 8 cores I am using now, excuse me. Obviously
> just small differences from your point of view

The "overflow flag" that your implementation rests upon was
present in the original 8086 thirty-plus years ago, was even then
a conscious imitation of still earlier designs, and as far as I can
see has not changed materially since that time. New instructions
that can set, clear, and test the flag may have been added in the
meantime -- but the overflow flag itself is a relic of the far past,
a vermiform appendix that some more modern designs have chosen to
do without.

>>> But anyway, even machines like the alpha can detect overflow with
>>> not a lot of problems... They will have a small performance hit
>>
>> You've measured that performance hit, have you? Or at least
>> made a serious attempt to estimate it?
>
> Yes. According to the discussion we had it has instructions that test
> overflow.

Yes: You hang on to the operands, and do some comparisons
involving them and the computed result to see whether overflow
has occurred. Observe that these would be instructions *always*
executed, not branches seldom taken -- put that in your "zero
overhead" pipe and smoke it!

> You have to make a pipeline flush, to keep the overflow
> flag in synch with the execution unit, so there is pipeline turbulence.

You've lost me.

> This is an old problem. Chicken or egg?

You've lost me again. Tonight, it was chicken, on the grill
with a Secret Sauce of my wife's invention.

> Since C doesn't test overflow, hardware designers start getting rid of
> the overflow flag. This #prgma could make them think again.

No; hardware designers got rid of overflow flags (and carry
flags and sign flags and zero flags and parity flags and The Holy
Flag Of The Cause) because they are contention points, reducing
the amount of parallelism one can achieve in the hardware. Hardware
designers pay attention to Amdahl's Law (even as software designers
blunder along in cack-handed ignorance).

>>>>> Maybe. If overflow checking is too expensive and your application
>>>>> doesn't need it, then just do not turn it on.
>>>>
>>>> What happened to "zero overhead?"
>>>>
>>> I said zero overhead for normal RISC machines or for the x86. Not
>>> for the alpha, or for any brain dead machines out there!
>>
>> Too bad -- You'd been doing so well, so very well, and then ...
>> (If you wear wooden shoes, does shooting yourself in the foot
>> count as sabotage?)
>
> I think that a machine where overflow can't be easily checked is
> well brain dead. The DEC people were known for their VAX, that
> decided arbitrarily to cut strings at 64K.

Isn't that the "counted string" you were so vigorously
promoting just a week or two ago? My, but how fashions change!
Ah, but that's a rant already ranted.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Eric Sosman
Sep 7, 2009, 9:28:53 PM
Eric Sosman wrote:
> [...]

> The penalty can be reduced, if each "L5" is just a call-and-
> return to The Great Caller of the Overflow Handler. But you still
> need something like four instructions (three typically not executed)
> per arithmetic operation: ADD, JOV, CALLequiv, RETequiv.

Sorry; thinko; that should have been "ADD, JOV, CALLequiv, JMP."

--
Eric Sosman
eso...@ieee-dot-org.invalid

Keith Thompson
Sep 7, 2009, 9:47:11 PM
Bart <b...@freeuk.com> writes:
[...]

> Personally I wouldn't have a problem with, say, a C-86 language, that
> is targetted at x86-class processors. It would make a lot of things a
> lot simpler. And people who use other processors can have their own
> slightly different version. C is after all supposed to work at the
> machine level; finally it will know exactly what that machine is!

I'd have a *big* problem with that, if it meant that software
written for x86 systems won't run on SPARC, or ARM, or even x86-64.

I generally don't even think about what CPU I'm using at the moment,
because it doesn't matter. Making it matter would not be a step
forward.

Bart
Sep 7, 2009, 11:04:35 PM
On Sep 8, 2:47 am, Keith Thompson <ks...@mib.org> wrote:
> Bart <b...@freeuk.com> writes:
>
> [...]
>
> > Personally I wouldn't have a problem with, say, a C-86 language, that
> > is targetted at x86-class processors. It would make a lot of things a
> > lot simpler. And people who use other processors can have their own
> > slightly different version. C is after all supposed to work at the
> > machine level; finally it will know exactly what that machine is!
>
> I'd have a *big* problem with that, if it meant that software
> written for x86 systems won't run on SPARC, or ARM, or even x86-64.

Yet hardware created around a SPARC processor presumably won't work
with an ARM? Somebody made a decision to use a specific set of
hardware, requiring different circuitry, peripherals, power supplies,
different manuals and expertise; the software, however, must work,
unchanged, across the lot?

OK, fair enough. I'm sure the C code driving that 486 monitoring the
IBM servers that someone mentioned, will also work unchanged driving
the Dec Alpha monitoring Fuji servers instead (I've no idea what this
stuff does, and I suspect Fuji actually make film stock...).

My point is that C software can be considered an integral part of a
system and therefore can be allowed to be specific to that system in
the same way the bits of hardware can be. Ie., not just doing a
specific job but taking advantage of known characteristics of the
processor.

--
Bartc

Gordon Burditt
Sep 8, 2009, 12:53:18 AM
>> How fine-grained does this have to operate? If, for example:
>> e = ((a+b)
>> #pragma STDC OVERFLOW_CHECK on
>> *
>> #pragma STDC OVERFLOW_CHECK off
>> (c+d))+1;
>> will that check only the multiplication for overflow? If not, how do
>> I check only the multiplication for overflow?
>>
>
>This will work in lcc-win, but for a standard it is problematic because
>it could be difficult for the optimizer to move code around.

Isn't it still possible to move code around if each operation is
tagged as overflow-checked or not? You also have to tag it with
the line number and file it came from, for calling the handler.

>>> Only the types signed int and signed long long are concerned.
>>

>> Why not signed long also?
>
>I corrected that already. It was an oversight.
>
>> And signed short? Why can't signed chars
>> overflow?
>
>Because they are promoted to ints when doing arithmetic.

signed char foo = SCHAR_MAX;
signed short foo2 = SHRT_MAX;
foo++; /* oops */
foo2++; /* oops if it survives the last oops */

>And why no checks for floating point?
>
>Because they have their own set of flags.

Ok, I guess I'll buy that, but it seems awkward to have two systems
and conversions between the two in the same expression are likely
to get dropped on the floor.

>> Pointers can wrap, too.
>
>Sure, but if the wrapping around is done using unsigned arithmetic
>the behavior is correct.

I'm not sure I'll buy that one. Unsigned integer calculations have
the wrapping behavior because they are *defined* as unsigned, not
because they are *implemented* as unsigned. Remember, many of the
signed integer calculations are implemented (e.g. on x86) as wrapping
non-trapping signed integer calculations. This seems to be a major
complaint of yours. Implementing it as non-trapping signed integer
calculations doesn't excuse not checking it.

It seems to me that, with overflow checking on:
char a;
char *p;

for (p = &a; ; )
p++;

p should have at least one checked overflow before p == &a again.


>>> 2.B: Setting the handler for overflows
>>> ---------------------------------------
>>>
>>> overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);
>>
>> This intrudes on the programmer's namespace, unless you put
>> declarations like this into a new (to-become-standard) header file
>> where they will interfere only if that header file is included.
>>
>
>Yes I would propose that we use the same file as the safer C library
>of Microsoft or some

For your own sanity, don't make your proposal dependent on some
Microsoft header file which you are trying to edit for them and
then get them to distribute. Make it a separate file.

><overflowcheck.h>

This will work much better.

>>> The function set_overflow_handler sets the function to be called in
>>> case of overflow to the specified value. If "newvalue" is NULL,
>>> the function sets the handler to the default value (the value
>>> it had at program startup).
>>
>> You have failed to state what the default handler *does* when
>> it is called. This is too important to make it "implementation
>> defined" and leave it at that.
>>
>
>It does... whatever it wants. How can we specify what a handler does?
>
>Let's get real. Did the standard specify what a signal handler does?

Did the standard specify what a *DEFAULT* signal handler does? Yes,
absolutely. Terminate the program. Dumping core or generating an
error message is apparently not required nor prohibited.

>>> 2.C: The handler function
>>> -------------------------
>>>

>>> typedef void (*overflow_handler_t)(unsigned line_number,
>>> char *filename,
>>> char *function_name,...);
>>

>> This is a variable-argument function? Why?
>>
>>> This function will be called when an overflow is detected. The
>>> arguments have the same values as __LINE__ __FILE__ and __FUNC__
>>
>> Under what circumstances will it be called with more than 3 arguments?
>>
>
>It could be that some implementations pass MORE information to the
>overflow handler than the bare required minimum. What that can be
>is up to the implementation.

>>> If this function returns, execution continues with an implementation


>>> defined value as the result of the operation that overflowed.
>>

>> Can this be a trap value? If so, the program may die before I
>> decide that the whole computation produced a useless value and
>> substitute a default or demand better input.
>
>Then you have to search for a better implementation, what do you expect?
>
>You can always have a setjmp BEFORE the computation and treat the error
>in the setjmp clause. Your handler just makes a longjmp.

Ok, are you going to guarantee that this is allowed? Even if the overflow
happens in a signal handler?


Incidentally, what happens if overflow happens in the overflow handler?
Is that undefined behavior? Or does the overflow handler have to be
compiled with overflow unchecked?

Keith Thompson

Sep 8, 2009, 2:14:35 AM
Bart <b...@freeuk.com> writes:
> On Sep 8, 2:47 am, Keith Thompson <ks...@mib.org> wrote:
>> Bart <b...@freeuk.com> writes:
>>
>> [...]
>>
>> > Personally I wouldn't have a problem with, say, a C-86 language, that
>> > is targetted at x86-class processors. It would make a lot of things a
>> > lot simpler. And people who use other processors can have their own
>> > slightly different version. C is after all supposed to work at the
>> > machine level; finally it will know exactly what that machine is!
>>
>> I'd have a *big* problem with that, if it meant that software
>> written for x86 systems won't run on SPARC, or ARM, or even x86-64.
>
> Yet, hardware created around a SPARC processor presumably won't work
> with an ARM? Somebody made a decision to use a specific set of
> hardware, requiring different circuitry, peripherals, power supply,
> different manuals and expertise, the software however must work,
> unchanged, across the lot?

Um, yes.

> OK, fair enough. I'm sure the C code driving that 486 monitoring the
> IBM servers that someone mentioned, will also work unchanged driving
> the Dec Alpha monitoring Fuji servers instead (I've no idea what this
> stuff does, and I suspect Fuji actually make film stock...).
>
> My point is that C software can be considered an integral part of a
> system and therefore can be allowed to be specific to that system in
> the same way the bits of hardware can be. Ie., not just doing a
> specific job but taking advantage of known characteristics of the
> processor.

My point is that C can be used both for portable software (including
most of the software carrying these words between my keyboard and your
monitor) as well as for non-portable software (such as device
drivers).

Would your hypothetical C-86 language have enough advantages for
x86-specific code to make up for the fact that it wouldn't work *at
all* on anything else?

Phil Carmody

Sep 8, 2009, 3:49:51 AM
Chris Dollin <chris....@hp.com> writes:

> Dag-Erling Smørgrav wrote:
>
>> jacob navia <ja...@nospam.org> writes:
>
>>> #pragma STDC OVERFLOW_CHECK on_off_flag
>>>
>>> When in the ON state, any overflow of an addition, subtraction,
>>> multiplication or division provokes a call to the overflow handler.
>>> Operations like +=, -=, *=, and /= are counted also.
>>>
>>> Only the types signed int and signed long long are concerned.
>>
>> What about signed long? And why only signed? Overflow can be just as
>> painful in unsigned arithmetic.
>
> The C standard specifies that unsigned arithmetic wraps around and
> does not "overflow"; there's nothing to check and no room to manoeuver.

At least n869.txt *does not directly define* what it means by
overflow, nor, it appears, does n1256.txt. However, I think it's
clear that many people, and at least one large microprocessor
manufacturer, consider that trying to fit a large thing into a
smaller thing, such that it doesn't fit, can be called "overflow".

If you accept it as valid usage of the word overflow, then clearly
C's unsigned types can overflow when being shifted to the left, as
the standard defines the resulting value of E1<<E2 in terms of the
*mathematical* value E1 * 2^E2 which it reduces modulo an appropriate
number. That mathematical value may not fit into a variable of the
desired type, and therefore I think it's fair to consider that the
operation has involved an overflow, albeit one which has precisely
defined numerical semantics.

It's clear that the C standardisation committee do not support this
use of the word overflow, given their blanket assertion about it not
applying to these operations on unsigned types.

In contrast, it's also clear that Intel does support this use of
the word when they describe the semantics of the overflow bit.

In the face of such huge opposition, it might be prudent to make
sure that no ambiguity is possible, by defining what is meant by
the term in advance of first use.

Given my cross-post location, I hope this will be given formal
consideration. Should there be anything else I need to do, I will
so do.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Francis Glassborow

Sep 8, 2009, 5:19:45 AM
Bart wrote:

> Personally I wouldn't have a problem with, say, a C-86 language, that
> is targetted at x86-class processors. It would make a lot of things a
> lot simpler. And people who use other processors can have their own
> slightly different version. C is after all supposed to work at the
> machine level; finally it will know exactly what that machine is!
>
> --
> Bartc

And lose all the benefits of portability :(
Actually C is targeted at an abstract machine (as is almost every
other language I know). Some languages, such as Java, require that the
abstract machine be provided by the real hardware, however much that
may cost in performance; the theory being that you then have complete
portability over hardware. In reality this ideal is not met (though
most get close).

My problem is that C is inconsistent in supporting common features of
hardware. It supports a clock and defines what should happen if there is
no clock but it does not provide any mechanism for accessing an overflow
bit even though that is common on CPUs.

In addition to providing a standard way to access an overflow bit
(actually the problem with that access is that we would also need a
way to clear it), I think that it would be helpful if an
implementation defined how it deals with integer overflow. In
practice most handle it by wrapping (the most logical solution on a
2's complement machine); some may saturate (I believe that there are
CPUs that work that way); for others it can be a serious problem, but
even they should be able to signal overflow to the process.

I have never liked C making signed integer overflow UB. In theory
implementations can document the behaviour but in practice most do not
seem to do so.

The common answer that expert programmers can avoid overflow is elitist.
Unlike most undefined behaviour which is easy to teach how to avoid,
integer overflow is quite hard and is not tackled in most books and
courses. Indeed most books and courses do not even mention that there is
a UB problem (probably because in practice i*j does not result in
drastic damage outside the program itself.)

The unfortunate message to students is that UB often does not matter
because the program will do something perfectly sensible, repeatable
and predictable on most hardware.

jacob navia

Sep 8, 2009, 5:27:14 AM
Eric Sosman wrote:

> The "overflow flag" that your implementation rests upon was
> present in the original 8086 thirty-plus years ago, was even then
> a conscious imitation of still earlier designs, and as far as I can
> see has not changed materially since that time. New instructions
> that can set, clear, and test the flag may have been added in the
> meantime -- but the overflow flag itself is a relic of the far past,
> a vermiform appendix that some more modern designs have chosen to
> do without.
>

This is just not true. Take IBM's Cell processor, a modern RISC
architecture for parallel processing. It has (just like Intel's
x86/x64) a flags register where each executed instruction records its
status. True, on the Cell you can do without the flags by using the
versions of the instructions that do NOT set them, but you can use
the flags if you want.

A processor that can't report overflow is unusable. If I propose
overflow checking for C, it is because the language has a hole here:
overflow checking is necessary to avoid getting GARBAGE out of your
calculations.

If you think the DEC alpha is the "state of the art" in hardware
processing please think again.

Dag-Erling Smørgrav

Sep 8, 2009, 5:49:50 AM
jacob navia <ja...@nospam.org> writes:
> Eric Sosman <eso...@ieee.org> writes:
> > A peculiar limitation. Is there a reason for omitting `signed
> > long', and the signed <stdint.h> types that don't promote to `int'?

> Which ones? I mean short and chars promote to int as far as
> I know.

short a = 20000;
short b = 20000;

a += b;

The addition won't overflow, but the assignment will (assuming 16-bit
short).

DES
--
Dag-Erling Smørgrav - d...@des.no

Dag-Erling Smørgrav

Sep 8, 2009, 5:55:15 AM
Bart <b...@freeuk.com> writes:
> Yet, hardware created around a SPARC processor presumably won't work
> with an ARM?

http://www.linux.org/
http://www.gnu.org/
http://www.freebsd.org/
http://www.netbsd.org/

not to mention 90% of the third-party software written for those
platforms.

Nick Keighley

Sep 8, 2009, 6:37:25 AM
On 8 Sep, 01:49, Bart <b...@freeuk.com> wrote:
> On Sep 8, 12:08 am, ralt...@xs4all.nl (Richard Bos) wrote:
> > =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= <d...@des.no> wrote:
> > > jacob navia <ja...@nospam.org> writes

> > > > What is obvious is that the performance hit will be almost zero for
> > > > most advanced CPUs.

"almost-zero" isn't "zero overhead"

> > > C is not restricted to "the most advanced CPUs".  For every "advanced
> > > CPU" in the world, there are tens or hundreds of embedded processors,
> > > microcontrollers, DSPs etc.  Even a COTS desktop, laptop or server with
> > > an "advanced CPU" can contain multiple secondary processors: I've worked
> > > with IBM servers that had an i486 (IIRC) on the backplane monitoring the
> > > main CPUs.
>
> > Yes, but they don't count, because they don't support jacob's compiler.

:-)

> It must be great being a cpu designer, being able to create novel new
> hardware incompatible with anything in the past, present or future,
> and apparently not caring whether it's compatible with any software
> either.

pretty rare I'd have thought these days. Intel's processors wouldn't
look the way they do if they had a clean sheet.

> That all seems to be perfectly acceptable. But when it comes to
> languages, we're only allowed to have this single, monolithic super-
> language that must run on any conceivable hardware, from the lowliest
> crummy microprocessor up to supercomputers, even though there are
> several evidently different stratas of application areas.

that's what C does, yes. If you want Java...

> Odd, isn't it. (And I'm talking about C, since I don't know of any
> mainstream languages quite like it.)

I'd always thought C was a mainstream language.

> Personally I wouldn't have a problem with, say, a C-86 language, that
> is targetted at x86-class processors.

yuk. In that case rename it C-86 or something. Or even better give
it a completely different name. EightySixScript or something. Script-86.
Gosh the least portable HLL ever. 40 years of computer science
and computer engineering, vanished like a soap bubble.

> It would make a lot of things a lot simpler.

and few things that some people think are important much harder.

> And people who use other processors can have their own
> slightly different version.

bleah

> C is after all supposed to work at the
> machine level; finally it will know exactly what that machine is!

Read some computer science books until you understand the term
"abstraction"


--
"Programs must be written for people to read, and only
incidentally for machines to execute."
- Abelson & Sussman, Structure and Interpretation of Computer Programs

In the development of the understanding of complex phenomena,
the most powerful tool available to the human intellect is
abstraction. Abstraction arises from the recognition of similarities
between certain objects, situations, or processes in the real world
and the decision to concentrate on these similarities and to ignore,
for the time being, their differences.
- C.A.R. Hoare

Nick Keighley

Sep 8, 2009, 6:47:05 AM
On 8 Sep, 04:04, Bart <b...@freeuk.com> wrote:
> On Sep 8, 2:47 am, Keith Thompson <ks...@mib.org> wrote:
> > Bart <b...@freeuk.com> writes:

> > > Personally I wouldn't have a problem with, say, a C-86 language, that
> > > is targetted at x86-class processors. It would make a lot of things a
> > > lot simpler. And people who use other processors can have their own
> > > slightly different version. C is after all supposed to work at the
> > > machine level; finally it will know exactly what that machine is!
>
> > I'd have a *big* problem with that, if it meant that software
> > written for x86 systems won't run on SPARC, or ARM, or even x86-64.
>
> Yet, hardware created around a SPARC processor presumably won't work
> with an ARM? Somebody made a decision to use a specific set of
> hardware, requiring different circuitry, peripherals, power supply,
> different manuals and expertise, the software however must work,
> unchanged, across the lot?

that's kind of the point. That's what device drivers are all about.
I can run the same software on my new(ish) laptop that I ran on my
old computer. The manufacturers of computers, OSs and their drivers
go to a fair bit of trouble to make the hardware invisible. There's a
reason for this.

People who develop embedded software (the hidden software in your
phone, DVD player, washing machine, car, etc.) can test much of it on
bog-standard desktops even though it's going to run on some weird
hardware with a chip named after an Acorn.

The software can be tested *before* the hardware even exists.

C portability makes this easier.


> OK, fair enough. I'm sure the C code driving that 486 monitoring the
> IBM servers that someone mentioned, will also work unchanged driving
> the Dec Alpha monitoring Fuji servers instead (I've no idea what this
> stuff does, and I suspect Fuji actually make film stock...).

Fuji make (or used to make) computers. They make an awful lot of
stuff.

> My point is that C software can be considered an integral part of a
> system and therefore can be allowed to be specific to that system in
> the same way the bits of hardware can be. Ie., not just doing a
> specific job but taking advantage of known characteristics of the
> processor.

But it's easier if it doesn't. Oh, some parts will be hardware
specific. But they should be restricted to small localised parts of
the software (think drivers again). The bulk of the application can
be tested sans hardware. Handy if the real hardware isn't real
portable or failure isn't an option. (I suspect avionics software is
tested on the ground first.)

Nick Keighley

Sep 8, 2009, 7:03:59 AM
On 8 Sep, 02:09, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> jacob navia wrote:
> > Eric Sosman wrote:

<snip>

> > I think that a machine where overflow can't be easily checked is
> > well brain dead. The DEC people were known for their VAX, that
> > decided arbitrarily to cut strings at 64K.
>
>      Isn't that the "counted string" you were so vigorously
> promoting just a week or two ago?  My, but how fashions change!
> Ah, but that's a rant already ranted.

ooh! So Navia strings use a 24-bit or 32-bit (or, I suppose 64-bit)
count value. Let's assume it's an int.

So on a 32-bit architecture.

Nstring s = NS_make ("a");

s takes 5 bytes


Francis Glassborow

Sep 8, 2009, 7:09:50 AM
and assuming a 16-bit int, the addition will overflow.

However this raises the whole issue of narrowing conversions. They
are allowed in C, but what should happen if the converted value loses
information?

Note that unlike overflow as a result of computation, there is no
implicit problem with narrowing integer conversions: just drop the
high bits (on a sign-and-magnitude machine you will need to retain
the sign bit).

Eric Sosman

Sep 8, 2009, 7:37:55 AM
jacob navia wrote:
> Eric Sosman wrote:
>> The "overflow flag" that your implementation rests upon was
>> present in the original 8086 thirty-plus years ago, was even then
>> a conscious imitation of still earlier designs, and as far as I can
>> see has not changed materially since that time. New instructions
>> that can set, clear, and test the flag may have been added in the
>> meantime -- but the overflow flag itself is a relic of the far past,
>> a vermiform appendix that some more modern designs have chosen to
>> do without.
>
> This is just not true.

"Some more modern designs have chosen to do without [flags]"
is "just not true?" Every CPU designed in the last three decades
has arithmetic condition flags in its architecture? You're sure
of this, are you?

--
Eric Sosman
eso...@ieee-dot-org.invalid

Bart

Sep 8, 2009, 8:30:58 AM
On Sep 8, 11:37 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:

> On 8 Sep, 01:49, Bart <b...@freeuk.com> wrote:

> > It must be great being a cpu designer, being able to create novel new
> > hardware incompatible with anything in the past, present or future,
> > and apparently not caring whether it's compatible with any software
> > either.
>
> pretty rare I'd have thought these days. Intel's processors wouldn't
> look the way they do if they had a clean sheet.

Someone mentioned hundreds of embedded processors for each advanced
processor. I guess these must be all a little different.

>
> > That all seems to be perfectly acceptable. But when it comes to
> > languages, we're only allowed to have this single, monolithic super-
> > language that must run on any conceivable hardware, from the lowliest
> > crummy microprocessor up to supercomputers, even though there are
> > several evidently different stratas of application areas.
>
> that's what C does, yes. If you want Java...

C does it with penalties, such as making some kinds of programming a
minefield because this won't work on processor X, and that is a wrong
assumption for processor Y, even though this application is designed
to run only on processor Z.

>
> > Odd, isn't it. (And I'm talking about C, since I don't know of any
> > mainstream languages quite like it.)
>
> I'd always thougt C was a mainstream language.

I didn't say it wasn't. Just that there are no others that I know
of, which are like C, that are mainstream (but presumably plenty of
private or in-house ones, like one or two of mine).

> > Personally I wouldn't have a problem with, say, a C-86 language, that
> > is targetted at x86-class processors.
>
> yuk. In that case rename it C-86 or something. Or even better give
> it completly different name. EightySixScript or something. Script-86.
> Gosh the least portable HLL ever. 40 years of computer science
> and computer engineering, vanished like a soap bubble.

But this doesn't apply to hardware? Why can't that abstraction layer
that you mention a bit later be applied to C-86?

>
> > It would make a lot of things a lot simpler.
>
> and few things that some people think are important much harder.

Have a look at C#'s basic types:

Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits. Now try
to get the same hard and fast facts about C's types; you can't! It's
like asking basic questions of a politician.

That's one thing that would be simpler; what sort of things would be
harder (other than the obvious one of running on a system where these
type sizes are all different)?

--
Bartc

Bart

Sep 8, 2009, 8:55:56 AM
On Sep 8, 7:14 am, Keith Thompson <ks...@mib.org> wrote:
> Bart <b...@freeuk.com> writes:

> Would your hypothetical C-86 language have enough advantages for
> x86-specific code to make up for the fact that it wouldn't work *at
> all* on anything else?

C-86 might be as simple as a bunch of assumptions (such as basic type
sizes, alignment needs, byte-order and so on). And I guess some people
may be programming in C-86 already!

As for portability, you'd have to ask them; it may not be a big deal
if they know their product will have to run on x86 for the next few
years.

Or perhaps C is used for (firmware for) a processor inside a phone
say, but that phone will be produced in the millions. Surely it must
help (in programmer effort, performance, code size, any sort of
measure except actual portability) to have a tailored C version for
that system.

(When it comes to downloadable software, that is a different matter,
and perhaps a more adept language is needed, or just a non-specific
C!)

--
Bartc

Dag-Erling Smørgrav

Sep 8, 2009, 9:27:03 AM
Bart <b...@freeuk.com> writes:
> Or perhaps C is used for (firmware for) a processor inside a phone
> say, but that phone will be produced in the millions. Surely it must
> help (in programmer effort, performance, code sise, any sort of
> measure except actual portability) to have a tailored C version for
> that system.

You'd be surprised how many simply use gcc; many microcontroller vendors
port gcc to every new chip (or pay someone to do it for them).

jacob navia

Sep 8, 2009, 9:33:24 AM
Dag-Erling Smørgrav wrote:

There is no way gcc can function correctly for small architectures.
Obviously there are many ports of it to many architectures, most of
them full of bugs.

The code of gcc is around 12-15MB source code. To rewrite a BIG
part of this for a small microprocessor is like trying to kill a fly
with an atomic bomb.

Yes, maybe the fly dies, but it could fly away until the missile arrives

:-)

I am biased, of course. Like you are biased for gcc. I just want to
restore a sense of proportion. Porting gcc to a new architecture and
debugging the resulting code is a work of several years until all is
fixed.

And no, it can't be done by a single person.

Chris Dollin

Sep 8, 2009, 9:45:02 AM
jacob navia wrote:

> Dag-Erling Smørgrav wrote:

>> You'd be surprised how many simply use gcc; many microcontroller vendors
>> port gcc to every new chip (or pay someone to do it for them).
>

> There is no way gcc can function correctly for small architectures.

That's an interesting claim; what's your reasoning? [I read Dag as
saying that the gcc /back-end/ has been ported and that gcc is
being used as a cross-compiler.]

> Obviously there are many ports of it to many architectures, most of
> them full of bugs.

Isn't "full" so ridiculous a claim as to weaken what case you have?
The way I'd use "full of bugs" would make a product essentially
useless. Is that the claim you're making?

> The code of gcc is around 12-15MB source code. To rewrite a BIG
> part of this for a small microprocessor is like trying to kill a fly
> with an atomic bomb.

/Is/ it a BIG part that needs to be rewritten?

> I am biased, of course. Like you are biased for gcc. I just want to
> restore a sense of proportion. Porting gcc to a new architecture and
> debug the resulting code is a work of several years until all is
> debugged and fixed.

That's going to depend heavily on how new a "new" architecture
is, yes?

--
"My name is Hannelore Ellicott-Chatham. I *end messes*." Hannelore,
/Questionable Content/

Hewlett-Packard Limited registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN

jacob navia

Sep 8, 2009, 10:15:28 AM
Chris Dollin wrote:

> jacob navia wrote:
>
>> Dag-Erling Smørgrav wrote:
>
>>> You'd be surprised how many simply use gcc; many microcontroller vendors
>>> port gcc to every new chip (or pay someone to do it for them).
>> There is no way gcc can function correctly for small architectures.
>
> That's an interesting claim; what's your reasoning? [I read Dag as
> saying that the gcc /back-end/ has been ported and that gcc is
> being used as a cross-compiler.]
>

1) Harvard architectures are not supported in gcc's conceptual model.
Those architectures (where code and data occupy separate address
spaces) are common for microcontrollers.

2) Gcc has no "back end". The front end and the back end are
intermixed in an extremely complex way, and there are very few people
in the world who understand those connections well enough to write a
port of gcc to a new architecture.

3) I tried two weeks to fix a bug in gcc's code generation for the
x86 (32 bit windows version). I had a debugger, and some scattered
docs. The bug was a very small one concerning stdcall support.

After two weeks of intensive work I gave up. The learning curve
is at least several years. The learning curve till I mastered
the original lcc was around 8 months (full time) and that version
of lcc was only 200K code with a very good documentation.

For gcc you would have to multiply that by 5-10 at least.

>> Obviously there are many ports of it to many architectures, most of
>> them full of bugs.
>
> Isn't "full" so ridiculous a claim as to weaken what case you have?
> The way I'd use "full of bugs" would make a product essentially
> useless. Is that the claim you're making?
>

They aren't tested well, since few programs run on them.

The only port of gcc to a microcontroller was done in 1999 by a team
of 5-8 people. They stopped in 2001. It was a version for the
ATMEL microprocessor. And that was in those happy times of gcc 2.

We are in gcc 4 now, and the size was multiplied by 2, complexity by
100. Now, I am not saying that you can't do it.

You can do it if you put a team of 10 gcc experts to work for you.
They are well paid since there aren't a lot of them. Project would
take around 1.5 - 2 years. With associated costs (machines, offices,
secretary, etc) you have for around 2 million dollars/port.

There are better adapted, smaller, compilers for microprocessors
than gcc.


>> The code of gcc is around 12-15MB source code. To rewrite a BIG
>> part of this for a small microprocessor is like trying to kill a fly
>> with an atomic bomb.
>
> /Is/ it a BIG part that needs to be rewritten?
>

For a new architecture?

Yes.

>> I am biased, of course. Like you are biased for gcc. I just want to
>> restore a sense of proportion. Porting gcc to a new architecture and
>> debug the resulting code is a work of several years until all is
>> debugged and fixed.
>
> That's going to depend heavily on how new a "new" architecture
> is, yes?
>

Yes. For a new version of ARM that adds a few instructions you would
need one guy with some knowledge of RTL and 4-5 months (including
debugging)

Bart

Sep 8, 2009, 10:25:58 AM
On Sep 8, 12:37 pm, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> jacob navia wrote:
> > Eric Sosman a écrit :

Knuth's MMIX architecture also makes use of integer overflow (in the
form of a trip flag). So he seems to think it's still worthwhile.
However he also acknowledges he had help from the "people at [Dec?]
Alpha", so perhaps it's fortunate overflow handling didn't go out the
window.

--
Bartc

Dag-Erling Smørgrav

Sep 8, 2009, 10:42:28 AM
jacob navia <ja...@nospam.org> writes:
> The only port of gcc to a microcontroller was done in 1999 by a team
> of 5-8 people. They stopped in 2001. It was a version for the
> ATMEL microprocessor. And that was in those happy times of gcc 2.

Atmel is one of those vendors I mentioned that port gcc to their
microcontrollers. In fact, they ship devkits with Linux preloaded.

You really have no clue, do you? You seem to have a worldview that has
very little to do with reality and much to do with supporting your
religious belief that you are smarter than everybody else and that your
compiler is superior to every other compiler. It's called cognitive
dissonance; look it up on Wikipedia.

jacob navia

Sep 8, 2009, 10:59:34 AM
Dag-Erling Smørgrav wrote:

Mr Smorgrav

Yes, I have no clue and cognitive dissonance, I have
a religious belief that I am superior and that my compiler is superior,
and my worldview has little to do with reality.

I propose that we stop discussing then.

Yours sincerely

Keith Thompson

unread,
Sep 8, 2009, 11:26:05 AM9/8/09
to
jacob navia <ja...@nospam.org> writes:
[...]

> A processor that can't report overflow is unusable. If I
> propose to check for overflow in C, it is because the language has
> a hole here: overflow checking is necessary to avoid getting
> GARBAGE out of your calculations.
>
> If you think the DEC alpha is the "state of the art" in hardware
> processing please think again.

The DEC Alpha certainly can detect overflow. It just uses a
different mechanism than the x86.

Don't confuse the ability to detect overflow (something that I
agree any CPU should have, and I don't know of any that can't)
with the particular mechanism of a dedicated flag.

I don't think your recent proposal actually depends on that
particular mechanism, so I don't know what the argument is about
anyway. If your proposed feature would impose some extra overhead
on systems that use different overflow detection mechanisms (and I
don't know that it would), I don't think that's a serious problem,
as long as the overhead is not too bad and can be avoided by
disabling checking.

Think of the DEC Alpha as just one example of a CPU that, while it's
perfectly capable of detecting overflow, doesn't happen to use the
same mechanism as the CPUs you usually work with. There are likely
other examples, and there are likely to be yet more in the future.
The fact that DEC has been acquired by another company is hardly
relevant.

Keith Thompson

unread,
Sep 8, 2009, 11:29:07 AM9/8/09
to

The standard says overflow on a signed integer computation
invokes undefined behavior, but overflow on conversion either
yields an implementation-defined result or (in C99) raises an
implementation-defined signal.

I've never understood why the language treats these two kinds of
overflow differently.

But if the language were to add a mechanism for handling overflows,
I'd want it to apply to both arithmetic operations and conversions.

jacob navia

unread,
Sep 8, 2009, 11:44:30 AM9/8/09
to
Keith Thompson a écrit :

> The standard says overflow on a signed integer computation
> invokes undefined behavior, but overflow on conversion either
> yields an implementation-defined result or (in C99) raises an
> implementation-defined signal.
>
> I've never understood why the language treats these two kinds of
> overflow differently.
>
> But if the language were to add a mechanism for handling overflows,
> I'd want it to apply to both arithmetic operations and conversions.
>


In principle you are right but in practice...

There are SO many places where overflow in integer conversions is
assumed that the whole thing would be unusable.

int i;
char p[12];

p[1] = i;

assuming that the compiler will do the equivalent of
p[1]=i&0xff;

The checking could be done, of course, but I would make it a different
proposal with a different name. In any case, for the language the two
overflows are NOT the same.

Francis Glassborow

unread,
Sep 8, 2009, 12:55:31 PM9/8/09
to

My immediate reaction is 'rubbish'. The whole raison d'etre of C was to
enable quick porting of unix to new hardware. The only new stuff needed
is the code generator and I see no reason that should not be the work of
a single person in a relatively short time.

Beej Jorgensen

unread,
Sep 8, 2009, 12:58:30 PM9/8/09
to
Bart <b...@freeuk.com> wrote:
>Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits. Now try
>and get the same hard and fast facts about C's types, you can't!

I'm going to go with:

Byte int_least8_t
Short int_least16_t
Int int_least32_t
Long int_least64_t

-Beej

Keith Thompson

unread,
Sep 8, 2009, 12:59:15 PM9/8/09
to
jacob navia <ja...@nospam.org> writes:
> Keith Thompson a écrit :
>> The standard says overflow on a signed integer computation
>> invokes undefined behavior, but overflow on conversion either
>> yields an implementation-defined result or (in C99) raises an
>> implementation-defined signal.
>>
>> I've never understood why the language treats these two kinds of
>> overflow differently.
>>
>> But if the language were to add a mechanism for handling overflows,
>> I'd want it to apply to both arithmetic operations and conversions.
>
> In principle you are right but in practice...
>
> There are SO many places where overflow in integer conversions is
> assumed that the whole thing would be unusable.
>
> int i;
> char p[12];
>
> p[1] = i;
>
> assuming that the compiler will do the equivalent of
> p[1]=i&0xff;

Sure, there's plenty of code that *assumes* they're equivalent.

They're not, and programmers have had 20 years warning that you
can't make that assumption.

I think there's even more code that assumes that the value of i is
in the range CHAR_MIN..CHAR_MAX; checking would detect cases where
that assumption is incorrect due to a programming error.

> The checking could be done, of course, but I would make it a different
> proposal with a different name. In any case, for the language the two
> overflows are NOT the same.

I disagree.

Francis Glassborow

unread,
Sep 8, 2009, 1:01:49 PM9/8/09
to
Bart wrote:

>>> Odd, isn't it. (And I'm talking about C, since I don't know of any
>>> mainstream languages quite like it.)
>> I'd always thougt C was a mainstream language.
>
> I didn't say it wasn't. Just that there no others that I know of,
> which are like C, that are mainstream (but presumably plenty of
> private or in-house ones, like one or two of mine).
>

Actually you did because you left out an essential 'other'

And I am talking about C since I don't know of any OTHER mainstream
languages quite like it.

jacob navia

unread,
Sep 8, 2009, 1:05:56 PM9/8/09
to
Francis Glassborow a écrit :

>
> My immediate reaction is 'rubbish'. The whole raison d'etre of C was to
> enable quick porting of unix to new hardware. The only new stuff needed
> is the code generator and I see no reason that should not be the work of
> a single person in a relatively short time.

For a compiler like lcc it took me at least 8 months to get
some confidence into the modifications I was doing in the code
generator. It took me much longer to fully understand the
machine description and being able to write new rules
for it.

lcc's code was 250K or so. It has VERY GOOD DOCUMENTATION.

gcc's code is 15MB or more, with confusing documentation.

With all respect, I disagree with you.

But we are going away from the main subject of this
discussion that was the overflow checking proposal.

Let's agree then, that we disagree in this point.

:-)

Eric Sosman

unread,
Sep 8, 2009, 1:29:21 PM9/8/09
to
Bart wrote:
> [...]

> Have a look at C#'s basic types:
>
> Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits. Now try
> and get the same hard and fast facts about C's types, you can't! It's
> like asking basic questions of a politician.

Basic questions like "How long is a piece of string?"

Besides, what has this to do with detecting overflow? Or are you
trying to make some obscure point about overflow detection in C#?

--
Eric....@sun.com

Eric Sosman

unread,
Sep 8, 2009, 1:36:04 PM9/8/09
to
Keith Thompson wrote:
> jacob navia <ja...@nospam.org> writes:
>> [... concerning integer overflow vs. narrowing conversions ...]

>> The checking could be done, of course, but I would make it a different
>> proposal with a different name. In any case, for the language the two
>> overflows are NOT the same.
>
> I disagree.

They're certainly different in the eyes of the C Standard. One
produces undefined behavior (3.4.3), while the other produces
implementation-defined behavior (6.3.1.3p3). Although they pose
similar hazards to the program, the language considers them different
conditions with different descriptions and different outcomes.

--
Eric....@sun.com

Michael Foukarakis

unread,
Sep 8, 2009, 1:52:43 PM9/8/09
to
On Sep 7, 2:47 pm, jacob navia <ja...@nospam.org> wrote:

>
> 2.A: A new pragma
> ------------------
>
> #pragma STDC OVERFLOW_CHECK on_off_flag
>
> When in the ON state, any overflow of an addition, subtraction,
> multiplication or division provokes a call to the overflow handler.
> Operations like +=, -=, *=, and /= are counted also.

Assignment can also cause an overflow. You should consider such cases
as well.

>
> Only the types signed int and signed long long are concerned.
>

Signed integers that are converted to unsigned types for expression
evaluation are also concerned. Several such cases have led to
exploitable bugs. Pointers that wrap around also cause similar
problems. Do not limit your problem space unless necessary.

> The initial state of the overflow flag is implementation defined.
>
> 2.B: Setting the handler for overflows
> ---------------------------------------
>
> overflow_handler_t set_overflow_handler(overflow_handler_t newvalue);
>

All your functions and variables should be named appropriately to
avoid namespace pollution, but I understand this is an early draft.
Just keep it in mind.

> The function set_overflow_handler sets the function to be called in
> case of overflow to the specified value. If "newvalue" is NULL,
> the function sets the handler to the default value (the value
> it had at program startup).

What you describe here implies runtime behaviour. How are you going to
determine "newvalue" in a static context? Are you considering partial
code evaluation? Emulation? Please describe your method more
thoroughly so that appropriate feedback can be given.

> 2.C: The handler function
> -------------------------
>
> typedef void (*overflow_handler_t)(unsigned line_number,
>                                     char *filename,
>                                     char *function_name,...);
> This function will be called when an overflow is detected. The
> arguments have the same values as __LINE__ __FILE__ and __FUNC__
>
> If this function returns, execution continues with an implementation
> defined value as the result of the operation that overflowed.
>

I assume this is provided for the user to handle his/her overflows as
deems fit. This is indeed zero-overhead, but is also zero-work. What
exactly is the novelty you're providing us with? There are well known
methods to detect overflows and many have already been posted in clc
lately - what is the fundamental advantage your compiler will provide
me that will give me reason to stop using those and perform my own
error (overflow) handling in my programs?

> -------------------------------------------------------------------
>
> Implementation.
>
> I have implemented this solution, and the overhead is almost zero.

If you do not provide concrete numbers you cannot make such claims.

> The most important point for implementors is to realize that the
> normal flow (i.e. when there is no overflow) should not be disturbed.
>
> No overhead implementation:
> --------------------------
>         1. Perform operation (add, subtract, etc)
>         2. Jump on overflow to an error label
>         3: Go on with the rest of the program
>
> The overhead of this is below measurement accuracy on a PC system.
> It can't be measured.

Wrong. There are well-known methods to measure even the number
of instructions executed between two observation points. Use them and
provide us with results for specific test setups.


>
> Implementation with small overhead (3-5%)
>         1. Perform operation
>         2. If no overflow jump to continuation
>         3. save registers
>         4. Push arguments
>         5. Call handler
>         6. Pop arguments
>         7. Restore registers
>     continuation:
>
> The problem with the second implementation is that the flow of control
> is disturbed.

Your implementation is also changing the flow of control. Let me say
here that implementation-defined does not relieve you of all
responsibility - you could at the very least specify some desired
behaviour or properties that should be held true. You also need to
consider the case where an expression contains more than one variable
or subexpression which can overflow. Where does program flow resume then?
You can be pessimistic and return before any of the expressions are
evaluated, but in concurrent environments there is the possibility
something is messed up even then.

> The branch to the continuation code will be mispredicted
> since it is a forward branch. This provokes pipeline turbulence.
>
> The first solution provokes no pipeline turbulence since the forward
> jump will be predicted as not taken.

This is not always true; consider the gcc likely/unlikely macros.

> This will be a good prediction
> in the overwhelming majority of situations (no overflow). The only
> overhead is just an additional instruction, i.e. almost nothing.
>

Overall, very good initiative, but it needs a lot of work.

Bart

unread,
Sep 8, 2009, 2:20:35 PM9/8/09
to

I've partly lost the thread but I think I was making a case for
processor-specific versions of C, one where it's clear whether or not
features such as overflow checking exist and exactly how they will
work.

C# was an example of one area where its specs are transparent
compared to C.

If it's necessary to cover every computer in existence, and every one
not yet in existence, I don't think there will be any agreement (about
the overflow thing).

--
Bartc

Stephen Sprunk

unread,
Sep 8, 2009, 2:32:54 PM9/8/09
to
jacob navia wrote:
> Dag-Erling Smørgrav a écrit :
>> Bart <b...@freeuk.com> writes:
>>> Or perhaps C is used for (firmware for) a processor inside a phone
>>> say, but that phone will be produced in the millions. Surely it must
>>> help (in programmer effort, performance, code size, any sort of
>>> measure except actual portability) to have a tailored C version for
>>> that system.
>>
>> You'd be surprised how many simply use gcc; many microcontroller vendors
>> port gcc to every new chip (or pay someone to do it for them).
>
> There is no way gcc can function correctly for small architectures.
> Obviously there are many ports of it to many architectures, most of
> them full of bugs.

Given that GCC has become the default compiler for virtually every
"small architecture", it obviously can function correctly and isn't all
that buggy, at least compared to the commercial compilers it replaced.

My company uses GCC on several different "small architectures" and has
only found a handful of bugs over the years, almost all of which were
already fixed in a more recent version than what we were using at the
time. The remaining few were easy enough to work around.

> The code of gcc is around 12-15MB source code. To rewrite a BIG
> part of this for a small microprocessor is like trying to kill a fly
> with an atomic bomb.

The back-end part that translates RTL to assembly is quite small, and
that is often all that needs porting for a new architecture. Most "new"
architectures are variations on existing ones, in part due to the
conscious desire to make it easier to port compilers, in which case you
only need to tweak a few things.

> I am biased, of course. Like you are biased for gcc. I just want to
> restore a sense of proportion. Porting gcc to a new architecture and
> debugging the resulting code is a work of several years until all is
> debugged and fixed.
>
> And no, it can't be done by a single person.

That depends on how different the new system is from the nearest
existing system(s). There are also companies dedicated to doing that
work, if you don't have folks in-house to do it. It doesn't take years;
the vendor effectively can't sell their new chip until GCC works on it,
so they get that part done very, very quickly. In fact, the better ones
even feed back info from the compiler team about how GCC generates code
for their chips so that they can improve future generations.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Isaac Jaffe

Bart

unread,
Sep 8, 2009, 2:41:03 PM9/8/09
to
On Sep 8, 6:52 pm, Michael Foukarakis <electricde...@gmail.com> wrote:
> On Sep 7, 2:47 pm, jacob navia <ja...@nospam.org> wrote:
>
>
>
> > 2.A: A new pragma
> > ------------------
>
> > #pragma STDC OVERFLOW_CHECK on_off_flag
>
> > When in the ON state, any overflow of an addition, subtraction,
> > multiplication or division provokes a call to the overflow handler.
> > Operations like +=, -=, *=, and /= are counted also.
>
> Assignment can also cause an overflow. You should consider such cases
> as well.

And casts (implicit and explicit) as well. But I would call this range
checking, different from overflow, because with overflow you don't
have a value to check against, you only get an indication of the
overflow.

Range checking is also a possibility, but needs more care because I'm
sure many behaviours depend on truncating an out-of-range value.

--
Bartc

Eric Sosman

unread,
Sep 8, 2009, 2:58:54 PM9/8/09
to
Bart wrote:
> On Sep 8, 6:29 pm, Eric Sosman <Eric.Sos...@sun.com> wrote:
>> Bart wrote:
>>> [...]
>>> Have a look at C#'s basic types:
>>> Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits. Now try
>>> and get the same hard and fast facts about C's types, you can't! It's
>>> like asking basic questions of a politician.
>> Basic questions like "How long is a piece of string?"
>>
>> Besides, what has this to do with detecting overflow? Or are you
>> trying to make some obscure point about overflow detection in C#?
>
> I've partly lost the thread but I think I was making a case for
> processor-specific versions of C, one where it's clear whether or not
> features such as overflow checking exist and exactly how they will
> work.

Ah, yes: Back to the Good Old Days when computer time was scarce
and costly, programmer time cheap and plentiful.

Computer hardware advances too quickly for processor-specific
software to be worthwhile, except in edge cases. Are you, even now,
rewriting all your code for the "Nehalem" processor? Or are you
running the same code on "Nehalem" as you did on "Core" and on
"Yonah" and so on back into the mists of time?

Yes, there's a certain amount of processor-specific code in all
of these. Most of *that* is to hide the model-to-model idiosyncracies,
to give "ordinary" code the illusion that nothing has changed. A driver
here, a virtualization hook there, knit one, purl two, and behold! It's
"the same" processor you've always known (or thought you knew). And
what's the benefit of all this trickery? It allows the ENORMOUS body
of existing portable code to run without change, that's what.

> If it's necessary to cover every computer in existence, and every one
> not yet in existence, I don't think there will be any agreement (about
> the overflow thing).

If it were necessary to throw out and re-invent all the software
every time a new CPU design came along, the computer industry would die.
To some extent it has: The ubiquity of the x86 architecture forces
software to accommodate an endlessly-accreted, duct-tape-and-baling-wire
feature forest grown atop a rush job that was only meant to buy time for
the iAPX 432 (saith Wikipedia). Which is why, I guess, some people seem
unable to contemplate any other approach to computing than one thrown
together three-plus decades ago by some engineers in a hurry.

--
Eric....@sun.com

Keith Thompson

unread,
Sep 8, 2009, 3:10:46 PM9/8/09
to

You're right, the language does treat them differently.

The point on which I disagree is that I think that any proposal for
integer overflow checking in the C standard should apply to both.
I should have said so more clearly.

I'm also (still) curious about *why* the standard makes this
distinction, rather than treating conversion like any other
arithmetic operation. A quick look at the C99 Rationale was
unilluminating.

I suspect it's just a matter of practicality, that real-world
behavior on arithmetic overflow varies enough that it wasn't
practical for the standard to narrow the options, but the range
of behavior on conversions made it possible to say simply that the
result is implementation-defined.

Keith Thompson

unread,
Sep 8, 2009, 3:17:31 PM9/8/09
to
Michael Foukarakis <electr...@gmail.com> writes:
> On Sep 7, 2:47 pm, jacob navia <ja...@nospam.org> wrote:
>> 2.A: A new pragma
>> ------------------
>>
>> #pragma STDC OVERFLOW_CHECK on_off_flag
>>
>> When in the ON state, any overflow of an addition, subtraction,
>> multiplication or division provokes a call to the overflow handler.
>> Operations like +=, -=, *=, and /= are counted also.
>
> Assignment can also cause an overflow. You should consider such cases
> as well.

Assignment itself cannot *directly* cause an overflow; it just
copies a value into an object. A conversion that's implicit in
an assignment can cause an "overflow", though the standard doesn't
use that term; see C99 6.3.1.3p3:

Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined
or an implementation-defined signal is raised.

For consistency, all forms of conversions should be treated alike.
That includes explicit conversions resulting from a cast, as well as
implicit conversions resulting from assignment, argument passing,
return statements, the "usual arithmetic conversions", and whatever
other cases I've forgotten.

I've argued that conversions should be treated like arithmetic
operators; there's obviously some disagreement on that point.

[big snip]

Keith Thompson

unread,
Sep 8, 2009, 3:25:41 PM9/8/09
to

There's no need to just "agree to disagree" on this point, since it
can be answered more or less definitively with a little research.
gcc has been ported to new architectures many times. It shouldn't
be difficult to find out how long it typically takes.

Michael Foukarakis

unread,
Sep 8, 2009, 3:47:31 PM9/8/09
to
On Sep 8, 22:17, Keith Thompson <ks...@mib.org> wrote:

> Michael Foukarakis <electricde...@gmail.com> writes:
> > On Sep 7, 2:47 pm, jacob navia <ja...@nospam.org> wrote:
> >> 2.A: A new pragma
> >> ------------------
>
> >> #pragma STDC OVERFLOW_CHECK on_off_flag
>
> >> When in the ON state, any overflow of an addition, subtraction,
> >> multiplication or division provokes a call to the overflow handler.
> >> Operations like +=, -=, *=, and /= are counted also.
>
> > Assignment can also cause an overflow. You should consider such cases
> > as well.
>
> Assignment itself cannot *directly* cause an overflow; it just
> copies a value into an object.  A conversion that's implicit in
> an assignment can cause an "overflow", though the standard doesn't
> use that term; see C99 6.3.1.3p3:
>
>     Otherwise, the new type is signed and the value cannot be
>     represented in it; either the result is implementation-defined
>     or an implementation-defined signal is raised.
>

Valid point. I had the = operator in mind, while it's clear that
conversion/truncation and friends are the cause of the actual
overflow.

> For consistency, all forms of conversions should be treated alike.
> That includes explicit conversions resulting from a cast, as well as
> implicit conversions resulting from assignment, argument passing,
> return statements, the "usual arithmetic conversions", and whatever
> other cases I've forgotten.
>

I agree; to me it seems pointless to distinguish between those cases.

jacob navia

unread,
Sep 8, 2009, 3:52:34 PM9/8/09
to
Keith Thompson a écrit :

> jacob navia <j...@nospam.org> writes:
>> lcc's code was 250K or so. It has VERY GOOD DOCUMENTATION.
>>
>> gcc's code is 15MB or more, with confusing documentation.
>>
>> With all respect, I disagree with you.
>>
>> But we are going away from the main subject of this
>> discussion that was the overflow checking proposal.
>>
>> Let's agree then, that we disagree in this point.
>
> There's no need to just "agree to disagree" on this point, since it
> can be answered more or less definitively with a little research.
> gcc has been ported to new architectures many times. It shouldn't
> be difficult to find out how long it typically takes.
>

OK, sure.

Here is an example for the ATMEL microprocessor...

http://users.rcn.com/rneswold/avr/index.html

---------------------------------------------------------------
A GNU Development Environment for the AVR Microcontroller
Rich Neswold

rnes...@earthlink.net

Copyright © 1999, 2000, 2001, 2002 by Richard M. Neswold, Jr.

This document attempts to cover the details of the GNU Tools that are
specific to the AVR family of processors.

Acknowledgements

This document tries to tie together the labors of a large group of
people. Without these individuals' efforts, we wouldn't have a terrific,
free set of tools to develop AVR projects. We all owe thanks to:

* The GCC Team, which produced a very capable set of development
  tools for an amazing number of platforms and processors.
* Denis Chertykov <den...@overta.ru> for making the AVR-specific
  changes to the GNU tools.
* Denis Chertykov and Marek Michalkiewicz <mar...@linux.org.pl> for
  developing the standard libraries and startup code for AVR-GCC.
* Uros Platise for developing the AVR programmer tool, uisp.
* Joerg Wunsch <jo...@FreeBSD.ORG> for adding all the AVR
  development tools to the FreeBSD ports tree and for providing the demo
  project in Chapter 2.
* Brian Dean <b...@bsdhome.com> for developing avrprog (an alternate
  to uisp) and for contributing Section 1.4.1 which describes how to use it.


---------------------------------------------------------------------

It took them from 1999 to 2002 (see the copyright notices), and they
surely do not mention all the people involved, just the main ones.

OK?

But this has nothing to do with the discussion. Please let's come back
to the overflow discussion.

Bart

unread,
Sep 8, 2009, 6:04:21 PM9/8/09
to

I've seen how portable C can be: a long int is 32 bits under Windows,
and 64 bits under Linux, under gcc on the *same machine*! And if I have
to download an application, and it's only available as C source code
(as a myriad files in a bz2 tarball to boot), then I simply don't
bother; life's too short.

Other languages do the portable thing much better than C, so why not
let C concentrate on what it's good at -- implementing systems to
build on top of.

I suspect anyway that many are already using C in exactly the non-
portable ways I'm suggesting (there's no law against it). On a
machine where char=int=64 bits, is it really possible to code a fast,
tight program completely oblivious of this fact?

--
Bartc

Keith Thompson

unread,
Sep 8, 2009, 6:24:07 PM9/8/09
to
Bart <b...@freeuk.com> writes:
[...]

> I've seen how portable C came be: a long int is 32-bits under Windows,
> and 64-bits under Linux, under gcc on the *same machine*!

And yet there's plenty of C source code that works in both
environments. You propose to take away that advantage.

> And if have
> to download an application, and it's only available as C source code
> (as a myriad files in a bz2 tarball to boot), then I simply don't
> bother; life's too short.

That's your decision, but as long as you have the right environment,
installing such a package is nearly trivial. (I use a wrapper script
that handles the half dozen or so commands required.)

> Other languages do the portable thing much better than C, so why not
> let C concentrate on what it's good at -- implementing systems to
> build on top of.

It already does that.

[...]

robert...@yahoo.com

unread,
Sep 8, 2009, 7:43:01 PM9/8/09
to
On Sep 7, 12:14 pm, Keith Thompson <ks...@mib.org> wrote:
> Alpha processors are still in production use.
>
> Are there other processors, perhaps even new ones, that use similar
> schemes to what the Alpha uses?


MIPS, for example.


>Or ones that use other schemes?


S/360..zSeries, for example.

jacob navia

unread,
Sep 8, 2009, 8:10:09 PM9/8/09
to
Michael Foukarakis a écrit :

The disagreement (from my point of view) is in the scope
of this proposal.

I wanted to check overflow for the 4 operations. Truncating overflow
is used so much as a way of discarding irrelevant data (maybe after
some shifts, maybe not) that any overlap of the overflow proposal
with THAT problem would confuse everything.

To access some byte or some subset of the data stored in an integer,
how much code does something like
char c = (integer >> 8);

meaning

char c = (integer >> 8)&0xff;


I agree with you that it should have been written in the second form
but there is just too much code that is already written in the first
form.

Your proposal should be handled by ANOTHER check macro

#pragma STDC CHECK_ASSIGNMENT_OVERFLOW on-off-flag

That is another discussion.

Keith Thompson

unread,
Sep 8, 2009, 8:40:41 PM9/8/09
to
jacob navia <j...@nospam.org> writes:
[...]

> The disagreement (from my point of view) is in the scope
> of this proposal.
>
> I wanted to check overflow for the 4 operations. Truncating overflow
> is used so much as a way of discarding irrelevant data (maybe after
> some shifts, maybe not) that any overlap of the overflow proposal
> with THAT problem would confuse everything.
>
> To access some byte or some subset of the data stored in an integer,
> how much code does something like
> char c = (integer >> 8);
>
> meaning
>
> char c = (integer >> 8)&0xff;

I don't know how much code does that kind of thing.
It's certainly not something I'd write, and it already
stores an implementation-defined value in c (or raises an
implementation-defined signal).

> I agree with you that it should have been written in the second form
> but there is just too much code that is already written in the first
> form.

It *should* have been written using unsigned types. Bitwise or shift
operators should be used on signed types only if you're sure that the
result is representable; if it isn't, that's a bug in the program,
just the kind of thing that overflow checking should catch. And yes,
in this case it's the assignment, and the implicit conversion
associated with it, that's the immediate cause of the problem.

> Your proposal should be handled by ANOTHER check macro
>
> #pragma STDC CHECK_ASSIGNMENT_OVERFLOW on-off-flag
>
> That is another discussion.

It should be CHECK_CONVERSION_OVERFLOW, not CHECK_ASSIGNMENT_OVERFLOW.
(And it's a pragma, not a macro.)

Ok, if it's going to be controlled by a pragma, I wouldn't object to
using a separate one for conversions. But I would object to adding
one to the language without the other.

jacob navia

unread,
Sep 8, 2009, 8:46:41 PM9/8/09
to
Keith Thompson a écrit :

>
> It should be CHECK_CONVERSION_OVERFLOW, not CHECK_ASSIGNMENT_OVERFLOW.
> (And it's a pragma, not a macro.)
>
> Ok, if it's going to be controlled by a pragma, I wouldn't object to
> using a separate one for conversions. But I would object to adding
> one to the language without the other.
>

Great.

Plan A:

I go on with the
#pragma STDC OVERFLOW_CHECK

and you start

#pragma STDC CHECK_CONVERSION_OVERFLOW

I will surely support your proposal and you mine.

:-)

It would be much more productive for all if you worked
to propose things too.

Dik T. Winter

unread,
Sep 8, 2009, 9:48:51 PM9/8/09
to
In article <h82rs9$ugj$1...@aioe.org> j...@nospam.org writes:
...

> No overhead implementation:
> --------------------------
> 1. Perform operation (add, subtract, etc)
> 2. Jump on overflow to an error label
> 3: Go on with the rest of the program

Do you check for overflow after every operation?
--
dik t. winter, cwi, science park 123, 1098 xg amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

John Nagle

unread,
Sep 9, 2009, 2:53:44 AM9/9/09
to
jacob navia wrote:
> Abstract:
>
> Overflow checking is not done in C. This article proposes a solution
> to close this hole in the language that has almost no impact in the run
> time behavior.

I actually implemented this back in 1982.

The DEC VAX had hardware to trap on integer overflow. The
trap was enabled by a bit in the "entry mask" generated for
each function. The CALL instruction read the entry mask and
set various modes for execution of the code.

I modified the C compiler that came with 4.1BSD to generate
the entry mask with integer overflow checking enabled. Then
I rebuilt all the standard programs with that compiler.
About half of them worked without any fixes. The others
were overflowing silently for one reason or another.

Because that compiler wasn't careful about generating
different instructions for signed and unsigned arithmetic,
it didn't quite generate the right code to allow unsigned
arithmetic to wrap. It looked like a big project to
clean up the compiler, so we never went on to do that.

Back in 1982, integer overflow wasn't considered OK.
Pascal and Ada checked for it. C was considered sloppy for not
doing that. There was more interest in correctness back then.

If C were to have overflow-free semantics, the right way
to do it would be this:

1. Overflow can only occur in user-defined variables.
The compiler must create longer intermediates for expressions
when necessary to prevent overflow within an expression
which would not cause overflow in the result. For example, in

short a,b,c,d;
a = (b * c) / d;

"(b * c)" needs to be a "long". If an intermediate is required
that is too large for the available hardware, a compile error
must be reported.

2. Unsigned arithmetic should be checked for overflow.
If wrapped arithmetic is desired, it should be indicated by
idioms such as

unsigned short a;
a = (a + 1) % 65536;

or

a = (a + 1) & 0xffff;

The compiler is free to optimize such expressions into unchecked
wrapped arithmetic. For the 1% or less of the time that you
really want wrapped arithmetic, that's how to express it.

This has the nice property that you get the same answer on all
platforms. However, with the final demise of the 36-bit ones'-complement
machines (UNISYS finally ended the ClearPath line, the successor to the
UNISYS B series, the UNIVAC 2200 series, and the UNIVAC 1100 series, in
early 2009), this is no longer a real issue. It was, back in 1982, when
DECsystem 36-bit machines were powering most of academia.

At this point, it's way too late in the history of C to fix this.
However, it could have been fixed.

I once put a lot of effort into overflow theory. See
http://www.animats.com/papers/verifier/verifiermanual.pdf
That's from 1982.

John Nagle

jacob navia

unread,
Sep 9, 2009, 3:35:54 AM9/9/09
to
Dik T. Winter wrote:

> In article <h82rs9$ugj$1...@aioe.org> j...@nospam.org writes:
> ...
> > No overhead implementation:
> > --------------------------
> > 1. Perform operation (add, subtract, etc)
> > 2. Jump on overflow to an error label
> > 3: Go on with the rest of the program
>
> Do you check for overflow after every operation?

No, after addition, subtraction, and multiplication
only.

I haven't done division yet. It can overflow only in one special
situation (INT_MIN / -1).

jacob navia

unread,
Sep 9, 2009, 5:19:00 AM9/9/09
to
John Nagle wrote:

>
> I actually implemented this back in 1982.
>
> The DEC VAX had hardware to trap on integer overflow. The
> trap was enabled by a bit in the "entry mask" generated for
> each function. The CALL instruction read the entry mask and
> set various modes for execution of the code.
>
> I modified the C compiler that came with 4.1BSD to generate
> the entry mask with integer overflow checking enabled. Then
> I rebuilt all the standard programs with that compiler.
> About half of them worked without any fixes. The others
> were overflowing silently for one reason or another.
>

Imagine that. 50% of the programs at that time were producing
incorrect results in some situations!

I think the number could be the same today. We will never know
until we get this into the language.

> Because that compiler wasn't careful about generating
> different instructions for signed and unsigned arithmetic,
> it didn't quite generate the right code to allow unsigned
> arithmetic to wrap. It looked like a big project to
> clean up the compiler, so we never went on to do that.
>
> Back in 1982, integer overflow wasn't considered OK.
> Pascal and Ada checked for it. C was considered sloppy for not
> doing that. There was more interest in correctness back then.
>

No. Today there is even greater interest in correctness. The problem
is that in the C community an attitude of general sloppiness exists.
That is why people are leaving the language and going to others
like Java or C#.


> If C were to have overflow-free semantics, the right way
> to do it would be this:
>
> 1. Overflow can only occur in user-defined variables.
> The compiler must create longer intermediates for expressions
> when necessary to prevent overflow within an expression
> which would not cause overflow in the result. For example, in
>
> short a,b,c,d;
> a = (b * c) / d;
>
> "(b * c)" needs to be a "long". If an intermediate is required
> that is too large for the available hardware, a compile error
> must be reported.
>

Most machines report overflows. It is only necessary to TEST the
overflow flag at each operation. This is very cheap!

> 2. Unsigned arithmetic should be checked for overflow.
> If wrapped arithmetic is desired, it should be indicated by
> idioms such as
>
> unsigned short a;
> a = (a + 1) % 65536;
>
> or
>
> a = (a + 1) & 0xffff;
>
> The compiler is free to optimize such expressions into unchecked
> wrapped arithmetic. For the 1% or less of the time that you
> really want wrapped arithmetic, that's how to express it.
>
> This has the nice property that you get the same answer on all platforms.
> However, with the final demise of the 36-bit ones-complement machines.
> (UNISYS finally ended the ClearPath line, the successor to the UNISYS
> B series, the UNIVAC 2200 series, and the UNIVAC 1100 series, in
> early 2009) this is no longer a real issue. It was, back in 1982, when
> DECsystem 36-bit machines were powering most of academia.
>

Unsigned arithmetic is not undefined behavior where overflow is
concerned... it wraps around. My proposal applies only to
signed arithmetic.


> At this point, it's way too late in the history of C to fix this.
> However, it could have been fixed.
>

Better late than never. I do not see why it should be "too late".


> I once put a lot of effort into overflow theory. See
> http://www.animats.com/papers/verifier/verifiermanual.pdf
> That's from 1982.
>
> John Nagle

Nothing has moved since then. It is a pity. It could have been done in 1982.

Nick Keighley

unread,
Sep 9, 2009, 5:35:48 AM9/9/09
to
On 8 Sep, 13:30, Bart <b...@freeuk.com> wrote:
> On Sep 8, 11:37 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
> wrote:
> > On 8 Sep, 01:49, Bart <b...@freeuk.com> wrote:

> > > It must be great being a cpu designer, being able to create novel new
> > > hardware incompatible with anything in the past, present or future,
> > > and apparently not caring whether it's compatible with any software
> > > either.
>
> > pretty rare I'd have thought these days. Intel's processors wouldn't
> > look the way they do if they had a clean sheet.
>
> Someone mentioned hundreds of embedded processors for each advanced
> processor. I guess these must be all a little different.

I can't parse that. The point is chip designers *don't* have a
completely free hand. They *do* have to pay some attention to history.


> > > That all seems to be perfectly acceptable. But when it comes to
> > > languages, we're only allowed to have this single, monolithic super-
> > > language that must run on any conceivable hardware, from the lowliest
> > > crummy microprocessor up to supercomputers, even though there are
> > > several evidently different stratas of application areas.
>
> > that's what C does, yes. If you want Java...
>
> C does it with penalties, such as making some kinds of programming a
> minefield because this won't work on processor X, and that is a wrong
> assumption for processor Y, even though this application is designed
> to run only on processor Z.

this is actually much less hard than you make it sound.
I know a program where a good chunk of the code ran on both a Z80
(an 8-bit microprocessor) and on a 32-bit Sun (I can't remember if the
Sun was a 68000 or a SPARC, but that's rather the point).


> > > Odd, isn't it. (And I'm talking about C, since I don't know of any
> > > mainstream languages quite like it.)
>

> > I'd always [thought] C was a mainstream language.


>
> I didn't say it wasn't. Just that there no others that I know of,
> which are like C, that are mainstream (but presumably plenty of
> private or in-house ones, like one or two of mine).

well no other language is quite like C, because then it would be C!
Fortran was renowned for its portability. It's true the only examples
of portable programs I have experienced are C programs. I think that
says a lot about C (and perhaps a bit about me!). But many languages
are highly portable.

Consider Chicken Scheme (an implementation of a Lisp-like language).
According to its web page, Chicken is...

"Highly portable and known to run on many platforms, including x86,
x86-64, IA-64, PowerPC, SPARC and UltraSPARC, Alpha, MIPS, ARM and
S/390"

though that list looks a little old they are trying to be portable.
And yes it can be compiled.

Of course it's implemented in and compiles to C...


> > > Personally I wouldn't have a problem with, say, a C-86 language, that
> > > is targetted at x86-class processors.
>
> > yuk. In that case rename it C-86 or something. Or even better give
> > it completly different name. EightySixScript or something. Script-86.
> > Gosh the least portable HLL ever. 40 years of computer science
> > and computer engineering, vanished like a soap bubble.
>
> But this doesn't apply to hardware? Why can't that abstraction layer
> that you mention a bit later be applied to C-86?

you lost me. Hardware varies because it has to deal with the physics
of the real world. Software can provide an insulation layer that hides
that variability. I'm not sure where you're trying to move the
abstraction to.

> > > It would make a lot of things a lot simpler.
>
> > and few things that some people think are important much harder.


>
> Have a look at C#'s basic types:
>
> Byte: 8 bits; Short: 16 bits; Int: 32 bits; Long: 64 bits.

which means it won't run efficiently on some hardware.

> Now try
> and get the same hard and fast facts about C's types, you can't! It's
> like asking basic questions of a politician.

you get certain minimum guarantees. In my experience tying down
the size of things to that extent is unhelpful. I've worked
on a project where use of raw types was banned. I hated it.

character char
octet unsigned char
16 bits int
32 bits long

I hardly ever use short. I've not yet needed 64 bits, and I'm aware C
has problems there.


> That's one thing that would be simpler;

I don't agree. You need to think a bit, but once you've thought it's
easy.

> what sort of things would be
> harder

writing software that runs on many platforms. I gave you the example
of embedded systems.


> (other than the obvious one of running on a system where these
> type sizes are all different)?

Dik T. Winter

unread,
Sep 9, 2009, 5:52:19 AM9/9/09
to
In article <h87rtk$ep4$1...@aioe.org> j...@nospam.org writes:
> John Nagle wrote:
...

> > I modified the C compiler that came with 4.1BSD to generate
> > the entry mask with integer overflow checking enabled. Then
> > I rebuilt all the standard programs with that compiler.
> > About half of them worked without any fixes. The others
> > were overflowing silently for one reason or another.
>
> Imagine that. 50% of the programs at that time were producing
> incorrect results in some situations!

You read that wrong. Not every program that was overflowing delivered
the wrong result! It can be that on at least some of the programs the
overflow was intentional.

Dag-Erling Smørgrav

unread,
Sep 9, 2009, 8:02:17 AM9/9/09
to
Bart <b...@freeuk.com> writes:
> Someone mentioned hundreds of embedded processors for each advanced
> processor. I guess these must be all a little different.

I think you misunderstand me. I did not say that for every advanced CPU
model there are hundreds of embedded CPU models; I said that for every
advanced CPU that comes off the figurative assembly line, tens or
hundreds of embedded CPUs do so as well.

Your cell phone contains at least one GPP and one DSP (possibly combined
on the same chip). The same probably goes for your desk phone. The
cell towers that your cell phone talks to and the exchanges that your
desk phone talks to contain hundreds of GPPs and DSPs. If you have a
multifunction digital watch, chances are it contains a microcontroller.
Your car contains dozens of microcontrollers. Your alarm clock, your TV
set, your DVD player, their respective remote controls, your food
processor, your microwave oven, your dishwasher, your burglary alarm,
probably some of the sensors connected to it, your printer, your copier,
your web camera, your switch (at least if it's managed), your DSL
router... the list goes on.

Chances are many of those microcontrollers are similar (many are based
on the ARM7 or ARM9 architecture) if not identical, and chances are many
of those run one of a handful of real-time operating systems (VxWorks,
QNX, RTEMS) or even Linux or BSD, all of which are written mostly in C
(thus disproving Jacob's claim that C can't be implemented on Harvard
machines like the ARM9).

DES
--
Dag-Erling Smørgrav - d...@des.no

Ben Bacarisse

unread,
Sep 9, 2009, 8:30:07 AM9/9/09
to
Dag-Erling Smørgrav <d...@des.no> writes:
<snip>

> Chances are many of those microcontrollers are similar (many are based
> on the ARM7 or ARM9 architecture) if not identical, and chances are many
> of those run one of a handful of real-time operating systems (VxWorks,
> QNX, RTEMS) or even Linux or BSD, all of which are written mostly in C
> (thus disproving Jacob's claim that C can't be implemented on Harvard
> machines like the ARM9).

Small but important point: I don't think he said that. He said (over
in comp.std.c) that "Harvard architectures are not supported in gcc's
conceptual model", which is not the same. I think you posted a
counter-example to that claim as well, but we should avoid putting
words into others' mouths.

--
Ben.

Hallvard B Furuseth

unread,
Sep 9, 2009, 11:23:30 AM9/9/09
to
Keith Thompson writes:
> The standard says overflow on a signed integer computation
> invokes undefined behavior, but overflow on conversion either
> yields an implementation-defined result or (in C99) raises an
> implementation-defined signal.
>
> I've never understood why the language treats these two kinds of
> overflow differently.

char *s;
int c;
while ((c = getchar()) != EOF) { ... *s++ = c; }

Have you ever written code like that?

This can "overflow on coversion" when char is signed. Yet C strings are
char, not unsigned char. That makes it a major pain in the *ss to
handle strings a formally correct way, too much so for my taste. I just
don't worry about it, trusting the market to isolate me from anyone
producing one's complement char implementations or whatever.

--
Hallvard

Keith Thompson

unread,
Sep 9, 2009, 12:46:37 PM9/9/09
to

Ok, that's a good point. Common idioms like this depend on the
assumption that signed and unsigned chars are interchangeable,
even though the language doesn't support this assumption, and yes,
I've depended on that myself.

But that still doesn't quite explain the discrepancy. In C99,
the above code could easily blow up if the conversion raises an
implementation-defined signal. Even in C90, it could fail badly if
the implementation-defined result is something odd (like, say,
if the result saturates to CHAR_MAX) -- though it could only fail
for character codes exceeding CHAR_MAX, and historically those were
somewhat unusual. If the behavior of the conversion were undefined,
the situation wouldn't be much worse than it already is.

Eric Sosman

unread,
Sep 9, 2009, 12:50:28 PM9/9/09
to
Bart wrote:
>
> I've seen how portable C can be: a long int is 32 bits under Windows,
> and 64 bits under Linux, under gcc on the *same machine*! And if I have
> to download an application, and it's only available as C source code
> (as a myriad files in a bz2 tarball to boot), then I simply don't
> bother; life's too short.

Okay, fine: C is inherently non-portable, is implemented on only
an insignificant handful of machines, and it takes years to port C
code from one machine to the next. Useless, a failed language.

So why are you wasting your time here? Life's too unsigned short.

--
Eric....@sun.com
