8 bit signed overflowing problem

Øyvind Teig

Sep 12, 2015, 9:15:46 AM
to ISO C++ Standard - Discussion
This code does not seem to work as expected in C and C++ (several compilers tested). It works as expected in Go (http://play.golang.org/p/lH2SzZQEHG).

#include <stdio.h>
int main (void)
{
    unsigned int  Cnt1 = 0; // 16 bits (kind of system clock)
    unsigned int  Cnt2 = 0; // 16 bits (observer)
    signed   char Timer_Secs = 0;             // 8 bits with sign
    signed   char FutureTimeout_1Minute = 60; // 8 bits with sign

    while (Cnt1 < 602) { //                      ##
        if ((Timer_Secs - FutureTimeout_1Minute) >= 0) {            
            FutureTimeout_1Minute += 60; // Overflow/underflow allowed, 
                                         // so wraps and changes sign 
            Cnt2++; 
        } else {} // No code
        Timer_Secs++; 
        Cnt1++;
    }
    //               ## 
    // Cnt2 is 35 if >  0 above 
    // Cnt2 is 35 if >= 0 above 
    // Cnt2 is 10 if == 0 above EXPECTED
    //         10 is expected
    printf ("Cnt2 %u\n", Cnt2); // %u because Cnt2 is unsigned
}

Why is this so? Or is this a plain error in the language/compiler(s)?

See blog note "Know your timer's types" at
     Disclaimer: no money or gift associated with that url or any other url in my blog notes

Miro Knejp

Sep 12, 2015, 9:35:31 AM
to std-dis...@isocpp.org
Your mistake is assuming overflow/underflow does what you expect it to do.
The language does not define what happens on *signed* under/overflow because processors don't all do the same. If the compiler/optimizer detects that signed over/underflow happens, all bets are off and it can do whatever it wants, even remove the code. Considering your example has constants everywhere, a sufficiently smart compiler will detect that "FutureTimeout_1Minute" overflows. Languages where signed overflow is well defined either support only hardware with the expected behavior or do additional checking at the cost of runtime performance.

Furthermore, if you want exactly sized types, use (u)int16_t and (u)int8_t, as "unsigned int" and "signed char" only have specified *minimum* sizes and can vary between platforms, especially on embedded systems.
--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussio...@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-discussion/.

Øyvind Teig

Sep 12, 2015, 11:05:25 AM
to ISO C++ Standard - Discussion
Thanks. Using int8_t causes the same "unexpected" behaviour (http://www.edaplayground.com/x/LSL)

> Languages where signed overflow is well defined either support only hardware with expected behavior or do additional checking at the cost of runtime performance.

The occam solution (mentioned in my blog note) would cause exception on overflow or underflow on '+' and '-' operations, but would allow defined wraparound on PLUS, MINUS (and AFTER) operations.

The rationale of the C/C++/C++11/C++14(?) designers for not defining a two's complement number to behave as one, but instead letting it depend on the compiler writer's or processor designer's choices, sounds to me rather contrary to the idea of a "high level" language.

How deep is this thinking in this community? Is this line of reasoning challenged in any way, or do the language designers really mean that this kind of behaviour (undefined semantics) is what's expected from C++ coders?

C/C++ does it right for wider words than 8 bits, even signed. Are you certain that this is not an 8-bit implementation glitch? What's the idea of unexpected (=incorrect) behaviour of a narrow width and expected (=correct) behaviour on a wider width?

I work with safety critical systems. I found this error while working with some code at work. So far I am not relieved. It's not a mistake of mine.

Øyvind

Thiago Macieira

Sep 12, 2015, 11:17:32 AM
to std-dis...@isocpp.org
On Saturday 12 September 2015 08:05:24 Øyvind Teig wrote:
> C/C++ does it right for wider words than 8 bits, even signed.

No, it doesn't. You MUST write code such that signed integers do not overflow,
because the compiler is allowed to assume that you wrote it like that. That
applies to all sizes.

Therefore, this line is wrong because it relies on UB:
FutureTimeout_1Minute += 60; // Overflow/underflow allowed,
// so wraps and changes sign

The compiler knows that you started with a positive value and you only mutate
FutureTimeout_1Minute by adding positive values. Therefore,
FutureTimeout_1Minute is always positive (remember, it does not overflow). Your
comment is wrong.

Yes, this is a source of a lot of complaints by developers. It has caused one
of the biggest flamewars in GCC's bugzilla, when it started optimising UB code
away. The side-effect of that was the warning

warning: assuming signed overflow does not occur when assuming that (X + c) < X
is always false

If you used unsigned instead of signed, then the code would become worse,
since now the compiler needs to insert overflow checks where there would
otherwise be none.

So the rule is simple:
- if you need overflow characteristics, use unsigned
- if you know your operations don't overflow, use signed
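
As a minimal sketch of the first rule applied to the timer from the original post: the helper names `reached` and `advance` are hypothetical, and the comparison assumes the clock and the deadline are never more than 127 ticks apart.

```cpp
#include <cstdint>

// Hypothetical helper: true once `now` has reached or passed `deadline`,
// assuming the two values are never more than 127 ticks apart.
constexpr bool reached(std::uint8_t now, std::uint8_t deadline)
{
    // Both operands are promoted to int, so the subtraction itself cannot
    // overflow. Converting the result to uint8_t reduces it modulo 256,
    // which is well defined for unsigned types.
    return static_cast<std::uint8_t>(now - deadline) < 128;
}

// Hypothetical helper: moves a deadline forward by `step` ticks, wrapping
// modulo 256.
constexpr std::uint8_t advance(std::uint8_t deadline, std::uint8_t step)
{
    return static_cast<std::uint8_t>(deadline + step);
}
```

With these, a deadline of 240 advanced by 60 becomes 44, and `reached(44, 44)` is true while `reached(250, 44)` is false, because 250 lies "before" the wrapped deadline.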

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Miro Knejp

Sep 12, 2015, 11:52:05 AM
to std-dis...@isocpp.org
On 12 Sep 2015, at 17:05 , Øyvind Teig <oyvin...@gmail.com> wrote:

> Thanks. Using int8_t causes the same "unexpected" behaviour (http://www.edaplayground.com/x/LSL)
>
>> Languages where signed overflow is well defined either support only hardware with expected behavior or do additional checking at the cost of runtime performance.
>
> The occam solution (mentioned in my blog note) would cause exception on overflow or underflow on '+' and '-' operations, but would allow defined wraparound on PLUS, MINUS (and AFTER) operations.
>
> The rationale of the C/C++/C++11/C++14(?) designers for not defining a two's complement number to behave as one, but instead letting it depend on the compiler writer's or processor designer's choices, sounds to me rather contrary to the idea of a "high level" language.
C++ does not dictate 2’s complement. See [basic.fundamental]/7 “this International Standard permits 2’s complement, 1’s complement and signed magnitude representations for integral types". This allows C++ to be supported on a wider range of hardware. And since all of those have different behavior on signed overflow the language doesn’t mandate what to do.

However, unsigned integer overflow is defined: [basic.fundamental]/4 “Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer."
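
The modulo rule quoted above can be checked at compile time. This is only an illustrative sketch; `succ` and `pred` are hypothetical names, not anything from the standard.

```cpp
#include <climits>

// (2^n - 1) + 1 is 0 modulo 2^n, so the maximum value wraps to zero.
constexpr unsigned int succ(unsigned int x) { return x + 1u; }

// 0 - 1 is 2^n - 1 modulo 2^n, so zero wraps to the maximum value.
constexpr unsigned int pred(unsigned int x) { return x - 1u; }

static_assert(succ(UINT_MAX) == 0u, "unsigned addition wraps");
static_assert(pred(0u) == UINT_MAX, "unsigned subtraction wraps");
```

Because a constant expression is not allowed to contain undefined behavior, the fact that these assertions compile is itself a sign that the wraparound is defined.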


> How deep is this thinking in this community? Is this line of reasoning challenged in any way, or do the language designers really mean that this kind of behaviour (undefined semantics) is what's expected from C++ coders?
Those operations are usually undefined because there is either no consensus among hardware implementations (like different bit patterns after overflow depending on representation) or because making it defined would incur unnecessary runtime checking overhead.


> C/C++ does it right for wider words than 8 bits, even signed. Are you certain that this is not an 8-bit implementation glitch? What's the idea of unexpected (=incorrect) behaviour of a narrow width and expected (=correct) behaviour on a wider width?
The size of the integer doesn’t matter. The fact it works on your machine with a 16 bit int but not 8 bit int is mere coincidence and of absolutely no concern to the language. You created a signed overflow situation and therefore the language no longer cares what happens and you’re at the mercy of your compiler and processor. That is the point of “undefined behavior”. It may work, it may not work, or it may make demons fly out of your nose. The moral of the story is don’t create situations that result in undefined behavior.

If you want to know why it only happens for 8 bit ints, then look at the difference in generated assembly and read your processor’s reference manual to figure out why it works for 16 bit (maybe it’s using a different register or instruction). But relying on that behavior will make your code inherently non-portable and the next compiler update may break it again.


> I work with safety critical systems.
Then don’t rely on undefined behavior. There are many operations in the language for which behavior is not defined. Get to know them, avoid them, and find tools that detect them in your code (static analysis tools or clang’s UB sanitizer). This is especially important when using an optimizing compiler, as it does not have to acknowledge undefined behavior when it considers the correctness of its transformations. So in your example, if it can show that the offending addition always causes overflow, it may even remove it and pretend it never existed (and compilers do that).
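
One way to avoid the undefined case is to test for it before doing the arithmetic. The sketch below uses a hypothetical helper name, `add_would_overflow`, and only covers `int` addition. GCC and Clang can also report such overflows at run time when the program is built with `-fsanitize=undefined`.

```cpp
#include <climits>

// Hypothetical helper: true if a + b would leave the range of int.
// The checks are arranged so that they never overflow themselves:
// INT_MAX - b is only evaluated for b > 0, INT_MIN - b only for b < 0.
constexpr bool add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

A caller would branch on the result and only perform `a + b` when it returns false, handling the overflow case explicitly (saturating, reporting an error, or switching to an unsigned type).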

> I found this error while working with some code at work. So far I am not relieved. It's not a mistake of mine.
It is. You violated the language rules. You expect something to behave in a certain way that is intentionally left undefined in the language. If you don’t play by the rules don’t be surprised when your expectations aren’t met.

David Krauss

Sep 12, 2015, 12:35:02 PM
to std-dis...@isocpp.org
On 2015–09–12, at 11:17 PM, Thiago Macieira <thi...@macieira.org> wrote:

> Therefore, this line is wrong because it relies on UB:
>            FutureTimeout_1Minute += 60; // Overflow/underflow allowed,
>                                         // so wraps and changes sign

Oddly, 8-bit addition really can’t overflow because the operands have to be promoted to at least 16 bits first. When the wider result is converted back to 8 bits the result is implementation-defined (§4.7/3). The standard has this to say about implementation-defined behavior (§1.3.11):

“behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents”

Non-deterministic results are not specifically excluded, but reasonable QOI seems to demand consistency.
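
The promotion rule can be checked directly, and it also accounts for the numbers in the first post. The sketch below (C++14, hypothetical function name `count_timeouts`) replays that loop with every conversion written out. It assumes an 8-bit char and an implementation that converts out-of-range values to signed char by wrapping modulo 256, which is implementation-defined but is what mainstream compilers document.

```cpp
#include <type_traits>

// The operands of binary '-' are promoted, so the difference has type int.
static_assert(std::is_same<decltype(static_cast<signed char>(0) -
                                    static_cast<signed char>(0)),
                           int>::value,
              "signed char - signed char is computed as int");

// Replays the loop from the first post. When `narrow` is true, the
// difference is converted back to signed char before the comparison.
constexpr unsigned count_timeouts(bool narrow)
{
    signed char timer = 0;
    signed char deadline = 60;
    unsigned fired = 0;
    for (unsigned tick = 0; tick < 602; ++tick) {
        int diff = timer - deadline;  // int arithmetic, cannot overflow here
        if (narrow)
            diff = static_cast<signed char>(diff);  // implementation-defined
        if (diff >= 0) {
            // Same effect as `deadline += 60`: add in int, then convert back.
            deadline = static_cast<signed char>(deadline + 60);
            ++fired;
        }
        // Same effect as `timer++`: add in int, then convert back.
        timer = static_cast<signed char>(timer + 1);
    }
    return fired;
}
```

Under the stated assumptions, `count_timeouts(false)` gives 35 and `count_timeouts(true)` gives 10. The unexpected count comes from comparing the un-narrowed int difference, not from an overflowing 8-bit addition.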


Øyvind Teig

Sep 12, 2015, 2:24:39 PM
to ISO C++ Standard - Discussion
Miro, Thiago and David

Thanks for your clarifying descriptions. I have learnt more than I'd like to admit, and I have updated my blog note with a warning. More warning than I like! We certainly use static analysis tools at work.

But building platform-independent software by carefully using platform-dependent Lego bricks certainly makes me doubt. I cannot for the life of me understand why it needs to be this way.

I recently listened to a lecture by a guy who had sw on board the Philae Spacecraft that landed on Comet 67P/Churyumov-Gerasimenko after 10 years in space. He said he was sceptical about the direction of languages and methodologies since he wrote that sw. I understood that he meant it's becoming more complicated not less. I think I agree.

Øyvind 

Brent Friedman

Sep 12, 2015, 2:29:37 PM
to std-dis...@isocpp.org
If you don't like the semantics of builtin signed integers, perhaps you should consider using a library that implements your desired semantics.
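
To give an idea of what such a library type might look like, here is a small sketch. `Wrap8` and `as_signed` are hypothetical names invented for illustration and do not refer to any existing library.

```cpp
#include <cstdint>

// Hypothetical 8-bit value with wraparound semantics. The bits are stored
// as unsigned so that all arithmetic is modulo 256.
struct Wrap8 {
    std::uint8_t bits;
};

constexpr Wrap8 operator+(Wrap8 a, Wrap8 b)
{
    return Wrap8{static_cast<std::uint8_t>(a.bits + b.bits)};
}

constexpr Wrap8 operator-(Wrap8 a, Wrap8 b)
{
    return Wrap8{static_cast<std::uint8_t>(a.bits - b.bits)};
}

// Reads the stored bits as a two's complement value in [-128, 127] using
// ordinary int arithmetic, so no implementation-defined conversion occurs.
constexpr int as_signed(Wrap8 w)
{
    return w.bits < 128 ? w.bits : w.bits - 256;
}
```

For example, `as_signed(Wrap8{120} + Wrap8{60})` is -76, the wrapped result the original post expected from `FutureTimeout_1Minute += 60`.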

Myriachan

Sep 12, 2015, 4:02:47 PM
to ISO C++ Standard - Discussion
On Saturday, September 12, 2015 at 8:52:05 AM UTC-7, Miro Knejp wrote:
> C++ does not dictate 2’s complement. See [basic.fundamental]/7 “this International Standard permits 2’s complement, 1’s complement and signed magnitude representations for integral types". This allows C++ to be supported on a wider range of hardware. And since all of those have different behavior on signed overflow the language doesn’t mandate what to do.


And this is holding C++ back.  Pretty much all the other major compiled or JITted languages define overflow.
 
> However, unsigned integer overflow is defined: [basic.fundamental]/4 “Unsigned integers shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer."


It's very hard to do unsigned overflow exactly properly, because the language likes to make everything signed.  For example, the following code has undefined behavior on most platforms today:

std::uint16_t meow = 0xFFFFu;
meow *= meow;
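
The snippet above is undefined where int is 32 bits because both operands are promoted to int and 0xFFFF * 0xFFFF does not fit. A possible workaround, sketched here with a hypothetical helper name `square_u16`, is to force the multiplication into unsigned int:

```cpp
#include <cstdint>

// Squares a 16-bit value modulo 2^16. Multiplying by 1u first converts the
// operands to unsigned int, so the product is computed with unsigned
// (modular) arithmetic instead of signed int arithmetic.
constexpr std::uint16_t square_u16(std::uint16_t x)
{
    return static_cast<std::uint16_t>(1u * x * x);
}
```

With this, `square_u16(0xFFFF)` is 1, since 0xFFFF * 0xFFFF is 0xFFFE0001 and only the low 16 bits are kept.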

The language not working according to programmer expectations is something that will kill C and C++ in the long run.

Melissa

Miro Knejp

Sep 12, 2015, 4:07:35 PM
to std-dis...@isocpp.org
On 12.09.2015 at 20:24, Øyvind Teig wrote:
> Miro, Thiago and David
>
> Thanks for your clarifying descriptions. I have learnt more than I'd like to admit, and I have updated my blog note with a warning. More warning than I like! We certainly use static analysis tools at work.
I suggest you point out in that article that the 16 bit version is *exactly as broken* as the 8 bit version. It is mere coincidence that it works for you. Particularly the comment "// Overflow/underflow allowed" is definitely *wrong* in both examples (it is not allowed in C/C++) and might lead readers to unfortunate conclusions.


> But building platform-independent software by carefully using platform-dependent Lego bricks certainly makes me doubt. I cannot for the life of me understand why it needs to be this way.
People have done this all the time. The standard library, Boost, POCO, Qt, etc are all prime examples of abstracting platform-dependent code behind platform-independent interfaces.

Øyvind Teig

Sep 12, 2015, 5:39:16 PM
to ISO C++ Standard - Discussion
So, does making Go portable need C or C++ if it has to be ported to machines that have only one's complement numbers, given that the language supports two's complement only? I think the latest version of it is all in Go, compiler and run-time. I assume the compiler could add a Go function to transform between two's and one's complement, callable only from the generated code? I thought I saw that every day when I delve into generated assembler. Teach me.

Having two roles, as both high level and low level is a demanding double role for any tool. Not to say its user.

Is it portability-wise layered well enough? 

The platform-dependent matters seem to be so thinly spread. Are they?

Øyvind Teig

Sep 13, 2015, 5:51:39 AM
to ISO C++ Standard - Discussion
Since the 8-bit signed example I showed always gave Cnt2 of 35, 35 and the expected 10 this looks like an implementation pattern. What's going on under the hood? And what's the rationale for the implementation decision?

Øyvind 

Edward Catmur

Sep 14, 2015, 11:20:19 AM
to ISO C++ Standard - Discussion
This is not an appropriate place to discuss the behavior of implementations, except insofar as they conform or do not conform to the Standard. The compiler you are using conforms to the Standard in its translation of your program.

That said, it's easy enough to guess at the implementation decision; arithmetic is being performed in the machine word while comparison is being performed in 8-bit signed integers. This is an efficient strategy on most modern architectures that provide sub-word data types, and is a correct transformation since signed arithmetic does not overflow or wrap around.

Øyvind Teig

Sep 14, 2015, 3:45:26 PM
to ISO C++ Standard - Discussion
Thank you. Interesting! 

I hope that a C version that actually works within your Standard might be appropriate here. Suggested in the XMOS thread in my blog note (here):

if ((signed char) (Timer_Secs - FutureTimeout_1Minute) > 0)  {...} else {...}

I have not found any platform where this does not work as expected, for both ==, >= and >. I'd certainly like to see how a C++ template (?) might make a general (char, short, long, long long (??)) version of this. I do realise that even if it works (for me) there may not ever be a general platform-independent solution?

Edward Catmur

Sep 15, 2015, 5:42:59 AM
to std-dis...@isocpp.org
Yes, that will work, although it has implementation-defined behavior, since you are converting a (promoted) int with a value not representable in the target (signed char) type; [conv.integral]/3. It is reasonable to expect that implementation-defined behavior to be unsurprising given knowledge of the characteristics of the platform, and in any case it can be ascertained by reading implementation documentation (which is required to be provided).

Note though that this will only work for types with lesser range than int, e.g. signed char and short; otherwise signed overflow will occur, which has undefined behavior.

For a solution working with general integral types, it is always well-defined to perform arithmetic that may wraparound in the corresponding unsigned type; the problem is that converting back to the signed type has implementation-defined behavior. If the signed type has the same size range as the unsigned type (e.g. 2's or 1's complement; not sign-magnitude) you can write a conversion function; you can also model signed arithmetic with unsigned types.
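
As a rough sketch of the general approach described above (C++14): the template name `wrapped_after` is hypothetical, and it assumes the two values are less than half the type's range apart. It does the arithmetic in the corresponding unsigned type and, rather than converting back to the signed type, compares the unsigned difference against half the range.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper: true if `a` is strictly after `b` on a wrapping
// counter of integer type T.
template <typename T>
constexpr bool wrapped_after(T a, T b)
{
    using U = typename std::make_unsigned<T>::type;
    // Conversion to unsigned and unsigned subtraction are both modulo 2^n.
    // The outer cast matters for types narrower than int, whose operands
    // are promoted to int before the subtraction.
    const U diff = static_cast<U>(static_cast<U>(a) - static_cast<U>(b));
    // A difference in [1, max/2] corresponds to a positive signed distance.
    return diff != 0 && diff <= std::numeric_limits<U>::max() / 2;
}
```

For instance, `wrapped_after<signed char>(-128, 127)` is true, because a counter that has just wrapped from 127 to -128 is one tick after 127, and the same template works for int and long long without signed overflow.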

Øyvind Teig

Sep 15, 2015, 1:44:26 PM
to ISO C++ Standard - Discussion


On Tuesday, September 15, 2015 at 11:42:59 UTC+2, Edward Catmur wrote:
> Yes, that will work, although it has implementation-defined behavior, since you are converting a (promoted) int with a value not representable in the target (signed char) type; [conv.integral]/3. It is reasonable to expect that implementation-defined behavior to be unsurprising given knowledge of the characteristics of the platform, and in any case it can be ascertained by reading implementation documentation (which is required to be provided).
>
> Note though that this will only work for types with lesser range than int, e.g. signed char and short; otherwise signed overflow will occur, which has undefined behavior.

I have tested the AFTER_32 macro with int, and both of these seem to work:
#define AFTER_32(a,b) ((a-b)>0)      // works
#define AFTER_32(a,b) ((int)(a-b)>0) // also works  
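
Both macros above subtract two int values directly, so near the wrap point `a-b` is a signed overflow and the result is undefined even if it appears to work. A possible 32-bit variant that subtracts in unsigned arithmetic is sketched below. The function name `after_32` is hypothetical, and the final conversion back to `std::int32_t` is implementation-defined before C++20, although mainstream compilers document it as wrapping.

```cpp
#include <cstdint>

// Hypothetical replacement for AFTER_32: subtracts as uint32_t (modulo
// 2^32, well defined) and then reinterprets the difference as signed.
constexpr bool after_32(std::int32_t a, std::int32_t b)
{
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) -
                                     static_cast<std::uint32_t>(b)) > 0;
}
```

With a counter that has just wrapped, `after_32(INT32_MIN + 5, INT32_MAX - 5)` is true (the distance is 11 ticks), whereas evaluating `(INT32_MIN + 5) - (INT32_MAX - 5)` directly in int would overflow.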