
two's complement idea


Bonita Montero

Nov 4, 2019, 2:48:27 PM
What would really speak against numeric_limits<type> including
a flag that says whether a type uses two's complement, and maybe another
flag that says that the type has two's complement wrap-around?

Bonita Montero

Nov 4, 2019, 3:19:56 PM
> What would really speak against numeric_limits<type> including
> a flag that says whether a type uses two's complement, and maybe another
> flag that says that the type has two's complement wrap-around?

And a further flag that says that signed shifts have the usual behaviour
as on most platforms?

Öö Tiib

Nov 4, 2019, 5:40:06 PM
On Monday, 4 November 2019 22:19:56 UTC+2, Bonita Montero wrote:
> > What would really speak against numeric_limits<type> including
> > a flag that says whether a type uses two's complement, and maybe another

You yourself started a thread at the end of July ("In the end, rason
will come") discussing proposal paper p0907r0, which would make this redundant.

> > flag that says that the type has two's complement wrap-around?

numeric_limits<type>::is_modulo is already there. Unfortunately
no compiler sets it for signed types (even with the -fwrapv flag).

> And a further flag that says that signed shifts have the usual behaviour
> as on most platforms?

Language lawyers at isocpp.org seem quite uncertain what they want:
<https://groups.google.com/a/isocpp.org/forum/#!msg/std-proposals/MZzCyAL1qRo/p493_UdUAgAJ>
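A minimal sketch of querying that existing trait (the printed values depend
entirely on the implementation; as noted above, mainstream compilers report
false for signed types even with -fwrapv):

#include <iostream>
#include <limits>

int main()
{
    std::cout << std::boolalpha
              << "int is_modulo:      " << std::numeric_limits<int>::is_modulo << '\n'
              << "unsigned is_modulo: " << std::numeric_limits<unsigned>::is_modulo << '\n';
}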

Bo Persson

Nov 4, 2019, 5:55:24 PM
Possibly that in C++20 all signed integers *are* two's complement. :-)

https://wg21.link/P0907

Siri Cruise

Nov 4, 2019, 10:55:53 PM
In article <qppva0$cbj$1...@news.albasani.net>,
#define ONESCOMPLEMENT (-1==~1)
#define TWOSCOMPLEMENT (-1==~0)

--
:-<> Siri Seal of Disavowal #000-001. Disavowed. Denied. Deleted.
'I desire mercy, not sacrifice.'
The first law of discordiamism: The more energy
to make order is more energy made into entropy.

Alf P. Steinbach

Nov 4, 2019, 11:52:44 PM
On 05.11.2019 03:55, Siri Cruise wrote:
> In article <qppva0$cbj$1...@news.albasani.net>,
> Bonita Montero <Bonita....@gmail.com> wrote:
>
>> What would really speak against numeric_limits<type> including
>> a flag that says whether a type uses two's complement, and maybe another
>> flag that says that the type has two's complement wrap-around?
>
> #define ONESCOMPLEMENT (-1==~1)
> #define TWOSCOMPLEMENT (-1==~0)

Nice idea.

But please use `constexpr` or `const` declarations, not macros.

Using macros conveys the idea that macros are a reasonable solution to
the problem of naming a compile time constant.


- Alf
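A minimal sketch of that suggestion applied to the two macros above (the
constant names are invented here; note these only probe the representation of
negative values, not whether overflow wraps):

constexpr bool ones_complement = (-1 == ~1);  // true only on a one's-complement representation
constexpr bool twos_complement = (-1 == ~0);  // true on a two's-complement representation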

Ian Collins

Nov 5, 2019, 12:17:55 AM
:)

--
Ian.

Bonita Montero

Nov 5, 2019, 12:56:51 AM
> #define TWOSCOMPLEMENT (-1==~0)

Or ...


#include <limits>

using namespace std;

constexpr bool hasTwosComplementAndWrapAround()
{
    return (numeric_limits<int>::max() + 1) == numeric_limits<int>::min();
}

constexpr bool hasSignedShift()
{
    return (-2 >> 2) == -1;
}


Christian Gollwitzer

Nov 5, 2019, 2:24:17 AM
Am 05.11.19 um 06:56 schrieb Bonita Montero:

>     return (numeric_limits<int>::max() + 1) == numeric_limits<int>::min();

Couldn't this in theory format your hard drive (although in practice it
never will)? Or more realistically, always return a false negative from
a zealous optimizer?

Christian

Bonita Montero

Nov 5, 2019, 2:35:56 AM
>>      return (numeric_limits<int>::max() + 1) ==
>> numeric_limits<int>::min();

> Or more realistically, always return a false negative from
> a zealous optimizer?

When it returns a false-negative, the function does what
it should: it says that you can't rely on the wrap-around.

Öö Tiib

Nov 5, 2019, 2:36:38 AM
Yes, when numeric_limits<int>::is_modulo is false, then
numeric_limits<int>::max() + 1 is undefined behavior.
Judging by the progress of P0907, it seems that it will continue to be
undefined behavior after C++20 too.

David Brown

Nov 5, 2019, 4:31:39 AM
I think there are a few points that can be taken from the discussions on
this topic in various places.

1. Two's complement representation, with no padding bits, should be
standardised. There simply is no need for anything else in the C++
world - there are no implementations of C++ with any other
representation, and no likelihood of them being used in the future.
Picking the one fixed representation simplifies a few things in the
standard.

2. Some people get very worked up about signed integer overflow
behaviour. There are also several widely believed, but incorrect, myths
about it - such as ideas that it used to be defined behaviour, or that
it always wraps on x86 systems, or that defining wrapping behaviour is
always a good idea, or that the undefined nature of signed integer
overflow is just a hangover from when different representations were
allowed. None of these is true.

3. Some code is written assuming that signed integers wrap on overflow,
and the code is incorrect if that is not the case for the implementation
used.

4. Some code would be easier to write in a correct and efficient way if
signed integer overflow wrapped.

5. Some code is more efficient if signed integer overflow is undefined
behaviour (assuming a suitable optimising compiler).

6. Some errors are significantly easier to detect (statically, or using
run-time tools like sanitizers) when signed integer overflow is
undefined behaviour.

7. Sometimes it would also be useful for unsigned integer overflow to be
undefined behaviour, for efficiency reasons or for detecting errors in
the code.


To my mind, this all says C++ should support both models - indeed, it
should support several.

I would propose a set of template classes, such as:

std::overflow_wrap<T>
std::overflow_undef<T>
std::overflow_saturate<T>
std::overflow_throw<T>

These could be used with any integer types, signed and unsigned. The
standard unsigned types are equivalent to the overflow_wrap types. The
standard signed types are overflow_undef. Implementations would be free
to implement overflow_undef (and therefore plain signed integer types)
in any way they want, with the expectation that it is efficient. But
compilers should document clearly if they consider plain ints to be
overflow_undef. And if they say they are overflow_wrap, then they must
be entirely consistent about it. (If you look in MSVC's documentation,
you'll get the impression that it wraps on overflow - except when it
optimises on the assumption that overflow doesn't occur.)

I'd also be in favour of having a standardised pragma to change the
behaviour of plain int types, as an aid to dealing with existing code.

When people can't agree, and when every option has its advantages and
disadvantages, the only rational solution is to give people an explicit
choice.
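A rough sketch of what one of the proposed wrappers could look like. The name
overflow_wrap is taken from the list above; the implementation (routing the
arithmetic through the corresponding unsigned type) is just one obvious way to
get guaranteed wrapping for a signed T on current compilers, not part of any
actual proposal, and only operator+ is shown:

#include <type_traits>

template <typename T>
class overflow_wrap {
    static_assert(std::is_integral_v<T>, "integer types only");
    T value;
public:
    constexpr overflow_wrap(T v = T{}) : value(v) {}
    constexpr T get() const { return value; }

    friend constexpr overflow_wrap operator+(overflow_wrap a, overflow_wrap b) {
        using U = std::make_unsigned_t<T>;
        // Unsigned arithmetic wraps by definition; converting the result back to T
        // is implementation-defined before C++20 and modular (wrapping) in C++20.
        return overflow_wrap(static_cast<T>(static_cast<U>(a.value) + static_cast<U>(b.value)));
    }
};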


David Brown

Nov 5, 2019, 4:32:58 AM
Two's complement representation, but /not/ wrapping.

Bonita Montero

Nov 5, 2019, 5:23:35 AM
> 3. Some code is written assuming that signed integers wrap on overflow,
> and the code is incorrect if that is not the case for the implementation
> used.

On which architectures today?

> 5. Some code is more efficient if signed integer overflow is undefined
> behaviour (assuming a suitable optimising compiler).

Which optimization?
Optimizing the following away:

int i;
...
if( i + 1 < i )
...

doesn't make sense.

James Kuyper

Nov 5, 2019, 9:20:12 AM
On 11/5/19 2:36 AM, Öö Tiib wrote:
> On Tuesday, 5 November 2019 09:24:17 UTC+2, Christian Gollwitzer wrote:
>> Am 05.11.19 um 06:56 schrieb Bonita Montero:
>>
>>>     return (numeric_limits<int>::max() + 1) == numeric_limits<int>::min();
>>
>> Couldn't this in theory format your hard drive (although in practice it
>> never will)? Or more realistically, always return a false negative from
>> a zealous optimizer?

The trait is_modulo is supposed to be true if signed overflow is defined
by the implementation to result in wrap-around, as an extension to C++
(21.3.4.1p62). Only if is_modulo is false is the behavior defined by
neither the standard nor the implementation. If undefined behavior
causes hasTwosComplementAndWrapAround() to return false, that means that
it does NOT wrap around, so the function is returning precisely the
result that it's supposed to return.

The important problem is the possibility of a false positive result:
hasTwosComplementAndWrapAround() returning true when signed overflow
does NOT wrap around, which is entirely possible, given that the
alternative to wrapping around is undefined behavior.

>
> Yes, when numeric_limits<int>::is_modulo is false then the
> numeric_limits<int>::max() + 1 is undefined behavior.

Note, in particular, that if numeric_limits<T>::is_modulo is false,
then signed overflow might, for instance, always give a result of
numeric_limits<T>::min(). That would cause Bonita's implementation for
hasTwosComplementAndWrapAround() to return true, despite the fact that
overflow does NOT wrap around.
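One way to stay on defined ground, sketched here using only the standard
traits: consult is_modulo instead of evaluating an expression that is undefined
exactly when the answer would be "no". The second condition cannot overflow and
equals -1 only for two's complement; on mainstream compilers the whole thing
simply reports false, which, per the above, is the honest answer:

#include <limits>

constexpr bool hasTwosComplementAndWrapAround()
{
    return std::numeric_limits<int>::is_modulo
        && (std::numeric_limits<int>::min() + std::numeric_limits<int>::max() == -1);
}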

David Brown

Nov 5, 2019, 9:43:23 AM
On 05/11/2019 11:23, Bonita Montero wrote:
>> 3. Some code is written assuming that signed integers wrap on overflow,
>> and the code is incorrect if that is not the case for the implementation
>> used.
>
> On which architectures today?

All architectures. Some (few) compilers specifically document that they
have wrapping behaviour on overflow - most don't, even if they /usually/
wrap.

Most hardware these days has wrapping on signed integer overflow, unless
you specifically choose instructions with different behaviour (like
trapping on overflow or saturation). So you often get it as a
side-effect of the compiler generating efficient code. But that does
not mean the language supports it, or the compiler supports it.
Remember, "undefined behaviour" includes "sometimes giving wrapped results".

Consider this code:

int foo(int x) {
    return (x * 20) / 5;
}

What is foo(214748367) (assuming 32-bit ints) ?

With x = 214748367, x * 20 is 4294967340, equal to 0x10000002c. So with
wrapping signed arithmetic, x * 20 is 44. Dividing by 5 gives 8
(rounding down).


On the other hand, it's easy to think that "foo" can be reduced to "x *
4". As long as there are no overflows in calculating "x * 20", that is
always true. So an optimising compiler that knows signed integer
overflow never happens, will generate "x * 4" code for foo - it's just
an "lea" or a "shl" instruction in x86, and much more efficient than
doing a multiply by 20 and then a divide by 5.

Applying this, foo(214748367) is 858993468.

These are two completely different answers. One is with wrapping
overflow semantics, the other is with an efficient implementation but
relies on optimising from the assumption that undefined behaviour,
signed integer overflow, never occurs.


Let's look in practice.

Paste this into godbolt.org, and look at it for different compilers
(with optimisations enabled).

const int testval = 214748367;

int foo(int x) {
    return (x * 20) / 5;
}

int foo1(void) {
    return foo(testval);
}

MSVC generates "shl eax, 2" for "foo" - efficient code, for undefined
behaviour in signed integer overflow. "foo1" is "mov eax, 8" - when
doing compile-time calculation for the value, it uses wrapping semantics
and gives a different value from the result of using the compiled function.

There is nothing wrong with that, of course - it is perfectly good C and
C++ behaviour. But it would be unexpected for people who think signed
integers wrap.


gcc gives an "lea" instruction for "foo", which is effectively the same
as MSVC. For "foo1", it returns 858993468, which is the same value as
it gives running the instructions of foo, but does not match wrapping
behaviour.

gcc also has the option "-fwrapv" to give full wrapping semantics.
"foo" is then handled by a full multiply by 20 followed by a divide by
5, and "foo1" returns 8.

Again, this is all fine. If you want wrapping signed semantics, you can
ask for it and get it. If not, you get efficient code.


>
>> 5. Some code is more efficient if signed integer overflow is undefined
>> behaviour (assuming a suitable optimising compiler).
>
> Which optimization?
> Optimizing the following away:
>
> int i;
> ...
> if( i + 1 < i )
>    ...
>
> doesn't make sense.

Sure it makes sense. It's basic maths - if you add 1 to a number, with
normal arithmetic, you can't make it smaller. If you specifically have
a different arithmetic system, such as a modulo system, then it's a
different matter. But C and C++ don't have that for signed integers
(though any compiler is free to give you those semantics if it wants).

Bonita Montero

Nov 5, 2019, 10:58:12 AM
>> Optimizing the following away:
>>
>> int i;
>> ...
>> if( i + 1 < i )
>> ...
>>
>> doesn't make sense.

> Sure it makes sense. It's basic maths - if you add 1 to a number, with
> normal arithmetic, you can't make it smaller.

With the same reasoning you could say that unsigneds might never
wrap; but in fact they're specified to wrap. And that's not how
computers work. If someone writes the above intentionally,
there's only _one_ reason for it: he wants to check for wrap-around.
So this is an optimization which no one asked for.

Öö Tiib

Nov 5, 2019, 12:03:00 PM
On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
>
> When people can't agree, and when every option has its advantages and
> disadvantages, the only rational solution is to give people an explicit
> choice.

There are lot of aspects that majority of people involved seem to
agree with.

* on majority of cases overflow (even unsigned) is programming error
* trap (overflow that signals or throws) helps to find programming
errors most reliably and quickly
* wrap (overflow with modular arithmetic) can be used intentionally
and efficiently in some algorithms
* snap (overflow that results with saturating nan) is good when
damaged, disconnected or short circuited (temperature?) sensor
should not turn our device (airplane?) into brick
* compiler assuming that overflow not occurring is ensured by some
external logic may allow noteworthy optimizations
* efficiency is of low importance for lot of code
* efficiency of C++ is one of its design goals
* it is good when programmer's intention can be indicated in code

So if to think of it then it is large puzzle indeed how to resolve it
elegantly and it continues to be bad to leave it all up to
implementations to decide and then each developer to struggle
with it on his own with "this standard imposes no requirements".

I myself would like most current operators to trap on overflow
(by raising SIGFPE or throwing std::overflow_error) both for
signed and unsigned.
For wrap, snap and "compiler may assume that the operation
does not overflow" cases I would like new operators.

Scott Lurndal

Nov 5, 2019, 12:15:56 PM
Öö Tiib <oot...@hot.ee> writes:
>On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
>>
>> When people can't agree, and when every option has its advantages and
>> disadvantages, the only rational solution is to give people an explicit
>> choice.
>
>There are lot of aspects that majority of people involved seem to
>agree with.
>
>* on majority of cases overflow (even unsigned) is programming error
>* trap (overflow that signals or throws) helps to find programming
> errors most reliably and quickly

While true, see efficiency, below.

>* wrap (overflow with modular arithmetic) can be used intentionally
> and efficiently in some algorithms
>* snap (overflow that results with saturating nan) is good when
> damaged, disconnected or short circuited (temperature?) sensor
> should not turn our device (airplane?) into brick
>* compiler assuming that overflow not occurring is ensured by some
> external logic may allow noteworthy optimizations
>* efficiency is of low importance for lot of code

With this, I cannot agree.

>* efficiency of C++ is one of its design goals
>* it is good when programmer's intention can be indicated in code
>
>So if to think of it then it is large puzzle indeed how to resolve it
>elegantly and it continues to be bad to leave it all up to
>implementations to decide and then each developer to struggle
>with it on his own with "this standard imposes no requirements".
>
>I myself would like most current operators to trap on overflow
>(by raising SIGFPE or throwing std::overflow_error) both for
>signed and unsigned.

Integer? Float? Both? Isn't that a function of the hardware? Or do you expect to
generate a conditional branch to inject a signal
after every arithmetic instruction (or sequence thereof)
that could overflow?

Note that ARM supports a cumulative overflow (floating point only)
that can be used to check if one of several consecutive operations
overflowed without needing to check each one.

Even someone who believes that "efficiency is of low importance"
wouldn't be willing to accept the performance degradation caused
by such checking code.

Öö Tiib

Nov 5, 2019, 1:20:08 PM
On Tuesday, 5 November 2019 19:15:56 UTC+2, Scott Lurndal wrote:
> Öö Tiib <oot...@hot.ee> writes:
> >On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
> >>
> >> When people can't agree, and when every option has its advantages and
> >> disadvantages, the only rational solution is to give people an explicit
> >> choice.
> >
> >There are lot of aspects that majority of people involved seem to
> >agree with.
> >
> >* on majority of cases overflow (even unsigned) is programming error
> >* trap (overflow that signals or throws) helps to find programming
> > errors most reliably and quickly
>
> While true, see efficiency, below.
>
> >* wrap (overflow with modular arithmetic) can be used intentionally
> > and efficiently in some algorithms
> >* snap (overflow that results with saturating nan) is good when
> > damaged, disconnected or short circuited (temperature?) sensor
> > should not turn our device (airplane?) into brick
> >* compiler assuming that overflow not occurring is ensured by some
> > external logic may allow noteworthy optimizations
> >* efficiency is of low importance for lot of code
>
> With this, I cannot agree.

Can you elaborate slightly more? Perhaps you disagreed before
reading "for lot of code".

>
> >* efficiency of C++ is one of its design goals
> >* it is good when programmer's intention can be indicated in code
> >
> >So if to think of it then it is large puzzle indeed how to resolve it
> >elegantly and it continues to be bad to leave it all up to
> >implementations to decide and then each developer to struggle
> >with it on his own with "this standard imposes no requirements".
> >
> >I myself would like most current operators to trap on overflow
> >(by raising SIGFPE or throwing std::overflow_error) both for
> >signed and unsigned.
>
> Integer? Float? Both? Isn't that a function of the hardware?

I meant unsigned and signed integers. IEEE floating point has
exceptions disabled by default, so I would leave it like that.
I see no actual dichotomy between hardware and software.

> Or do you expect to
> generate a conditional branch to inject a signal
> after every arithmetic instruction (or sequence thereof)
> that could overflow?

Where the hardware does not help, software has to emulate it.

> Note that ARM supports a cumulative overflow (floating point only)
> that can be used to check if one of several consecutive operations
> overflowed without needing to check each one.

Yes but I meant integers.

> Even someone who believes that "efficiency is of low importance"
> wouldn't be willing to accept the performance degradation caused
> by such checking code.

I meant (and tried to say) that "for a lot of code efficiency is
of low importance". About 90% of most code bases runs in less than
1% of the total run-time. That is a lot of code. Also, it is often
less well tested, and defects in it sometimes manifest as
hard-to-reproduce instability.

What is worth optimizing, or where overflow is not an error, I
would like to be clearly indicated with syntax:

Manfred

Nov 5, 2019, 1:31:45 PM
On 11/5/19 6:02 PM, Öö Tiib wrote:
> On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
>>
>> When people can't agree, and when every option has its advantages and
>> disadvantages, the only rational solution is to give people an explicit
>> choice.

Except when giving a choice complicates the matter more than the
benefits it can give.
From this perspective I like compiler flags more than adding new source
code constructs. Maybe because I think that signed wrapping, unlike
unsigned, is of no practical use.

>
> There are lot of aspects that majority of people involved seem to
> agree with.

I, for one, find some of them questionable.

>
> * on majority of cases overflow (even unsigned) is programming error
That's my first - unsigned wrapping /can/ be useful in some contexts. I
can't say if it is about the majority of cases, but it is enough to keep
it in place.

> * trap (overflow that signals or throws) helps to find programming
> errors most reliably and quickly
I would say it is a better debugging aid than nothing, but as
reliability goes it can only catch what is executed, and it is often
impossible to test all possible combinations at runtime. On the other
hand, it would degrade efficiency to a possibly unacceptable level (As
Scott pointed out).

> * wrap (overflow with modular arithmetic) can be used intentionally
> and efficiently in some algorithms
> * snap (overflow that results with saturating nan) is good when
> damaged, disconnected or short circuited (temperature?) sensor
> should not turn our device (airplane?) into brick
I am not sure about this.
Floating point arithmetic does something of the kind (NaN propagates
through), but I think that if it is about a sensor failure, and
especially if it is about safety, then the hardware itself or the device
driver should handle this properly, i.e. with specific error conditions,
rather than the language.

> * compiler assuming that overflow not occurring is ensured by some
> external logic may allow noteworthy optimizations
True.

> * efficiency is of low importance for lot of code
Disagree. It is not always important, but it is important enough to be
one of the main reasons to choose C++, so it should be of primary
importance for the language.

> * efficiency of C++ is one of its design goals
> * it is good when programmer's intention can be indicated in code
True, as long as verbosity is kept at a reasonable level.

>
> So if to think of it then it is large puzzle indeed how to resolve it
> elegantly and it continues to be bad to leave it all up to
> implementations to decide and then each developer to struggle
> with it on his own with "this standard imposes no requirements".
>
> I myself would like most current operators to trap on overflow
> (by raising SIGFPE or throwing std::overflow_error) both for
> signed and unsigned.
I agree with Scott here.

> For wrap, snap and "compiler may assume that the operation
> does not overflow" cases I would like new operators.
>
See above, my impression is that this would result in excessive bloat of
the language. Better use compiler flags, or at most #pragmas for
sections of code.

Paavo Helde

Nov 5, 2019, 2:45:23 PM
On 5.11.2019 17:58, Bonita Montero wrote:
>
> With the same reasoning you could say that unsigneds might never
> wrap; but in fact they're specified to wrap.

In retrospect, this (wrapping unsigneds) looks like a major design mistake.

IMO, wrapping integers (signed or unsigned) are an example of
"optimization which nobody asked for", and they are there basically only
because the hardware happened to support such operations.

Öö Tiib

Nov 5, 2019, 2:56:04 PM
On Tuesday, 5 November 2019 20:31:45 UTC+2, Manfred wrote:
> On 11/5/19 6:02 PM, Öö Tiib wrote:
> > On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
> >>
> >> When people can't agree, and when every option has its advantages and
> >> disadvantages, the only rational solution is to give people an explicit
> >> choice.
>
> Except when giving a choice complicates the matter more than the
> benefits it can give.
> From this perspective I like compiler flags more than adding new source
> code constructs. Maybe because I think that signed wrapping, unlike
> unsigned, is of no practical use.
>
> >
> > There are lot of aspects that majority of people involved seem to
> > agree with.
>
> I, for one, find some of them questionable.
>
> >
> > * on majority of cases overflow (even unsigned) is programming error
> That's my first - unsigned wrapping /can/ be useful in some contexts. I
> can't say if it is about the majority of cases, but it is enough to keep
> it in place.

Perhaps there is some difference in the meaning of the word "majority".
For me "majority" means "more than 50% of all". Also, "all" means
all such operations in the code base. You seem to express disagreement
with something else.

> > * trap (overflow that signals or throws) helps to find programming
> > errors most reliably and quickly
> I would say it is a better debugging aid than nothing, but as
> reliability goes it can only catch what is executed, and it is often
> impossible to test all possible combinations at runtime. On the other
> hand, it would degrade efficiency to a possibly unacceptable level (As
> Scott pointed out).

To disagree, you need to say what helps to find programming errors
involving arithmetic overflow more reliably and quickly.
Also, regarding efficiency, it would push compiler optimization developers
in the right direction.
Instead of the logic that it is up to the programmer to ensure that there
is no overflow (so if there is, this code can be erased),
they would try to prove that overflow is impossible themselves, so that
they can avoid generating the trapping checks.

> > * wrap (overflow with modular arithmetic) can be used intentionally
> > and efficiently in some algorithms
> > * snap (overflow that results with saturating nan) is good when
> > damaged, disconnected or short circuited (temperature?) sensor
> > should not turn our device (airplane?) into brick
> I am not sure bout this.
> Floating point arithmetic does something of the kind (NaN propagates
> through), but I think that if it is about a sensor failure, and
> especially if it is about safety, then the hardware itself or the device
> driver should handle this properly, i.e. with specific error conditions,
> rather than the language.

A physically damaged, disconnected or short-circuited temperature sensor
cannot repair or reconnect itself. So the software of a device that
has such a sensor has to work with NaNs for the currently measured
temperature, or the device turns into a brick. There is nothing else to do.
Hardware like that of the T-1000 from Terminator 2 is not yet invented.

>
> > * compiler assuming that overflow not occurring is ensured by some
> > external logic may allow noteworthy optimizations
> True.
>
> > * efficiency is of low importance for lot of code
> Disagree. It is not always important, but it is important enough to be
> one of the main reasons to choose C++, so it should be of primary
> importance for the language.
>
> > * efficiency of C++ is one of its design goals
> > * it is good when programmer's intention can be indicated in code
> True, as long as verbosity is kept at a reasonable level.
>
> >
> > So if to think of it then it is large puzzle indeed how to resolve it
> > elegantly and it continues to be bad to leave it all up to
> > implementations to decide and then each developer to struggle
> > with it on his own with "this standard imposes no requirements".
> >
> > I myself would like most current operators to trap on overflow
> > (by raising SIGFPE or throwing std::overflow_error) both for
> > signed and unsigned.
> I agree with Scott here.
>
> > For wrap, snap and "compiler may assume that the operation
> > does not overflow" cases I would like new operators.
> >
> See above, my impression is that this would result in excessive bloat of
> the language. Better use compiler flags, or at most #pragmas for
> sections of code.

My reasoning is that operators (like say a +% b for wrapping add)
are the most terse and can be mixed with each other most flexibly.
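For comparison, a minimal sketch of what a wrapping "a +% b" for int could
desugar to in today's C++ (the helper name is invented here):

constexpr int wrapping_add(int a, int b)
{
    // Unsigned addition is defined to wrap; converting the out-of-range result
    // back to int is implementation-defined before C++20 and wraps in C++20.
    return static_cast<int>(static_cast<unsigned>(a) + static_cast<unsigned>(b));
}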

David Brown

Nov 5, 2019, 3:48:14 PM
On 05/11/2019 16:58, Bonita Montero wrote:
> >> Optimizing the following away:
> >>
> >> int i;
> >> ...
> >> if( i + 1 < i )
> >>     ...
> >>
> >> doesn't make sense.
>
> > Sure it makes sense.  It's basic maths - if you add 1 to a number, with
> > normal arithmetic, you can't make it smaller.
>
> With the same reasoning you could say that unsigneds might never
> wrap; but in fact they're specified to wrap.

Unsigneds are specified to wrap, yes - but almost all occurrences of
unsigned overflow are bugs in the code.

/Very/ occasionally, you want wrapping semantics for integer arithmetic.
It thus made sense for C (and C++) to provide a way to get wrapping
when you need it - and it was easy to specify it for unsigned types, but
would be unduly inefficient for signed types.

It makes no sense that if you have 4294967295 apples in a pile, add an
apple, you get 0 apples. It makes no sense that if you have 2147483647
apples in a pile, add an apple, you get -2147483648 apples.


> And That's not how
> computers work.

That is utterly irrelevant. Computers work with electrical signals, and
at a different level, with bits. That has no bearing on what makes
sense in a computer language or a program. The wrapping and the two's
complement format is simply the cheapest and fastest way to make the
hardware, nothing more than that.

> If someone writes the above intentionally,
> there's only _one_ reason for it: he wants to check for wrap-around.
> So this is an optimization which no one asked for.
>

The only reason to write code like that manually is through a
misunderstanding of the language.

But weird and irrational code can be generated as a result of inlining,
macros, templates, constant propagation, etc. And you /want/ the
compiler to optimise these and remove code that could not possibly run.


Oh, and please tell me you read the rest of my post and understood it -
both how it demonstrated an optimisation based on undefined signed
overflows, and how compilers do not treat signed integers as wrapping.

(While it would be nice to get an answer there, I know you won't give it
- I know you don't understand basic human qualities like politeness,
and I know you will do anything to avoid admitting that you were wrong
and prefer to remain ignorant. I wrote my posts hoping other people
will benefit from them too. But if you prove /me/ wrong by replying
properly to my posts and questions, I would be much obliged.)

Manfred

Nov 5, 2019, 4:48:46 PM
On 11/5/2019 8:55 PM, Öö Tiib wrote:
> On Tuesday, 5 November 2019 20:31:45 UTC+2, Manfred wrote:
>> On 11/5/19 6:02 PM, Öö Tiib wrote:
>>> On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
>>>>
>>>> When people can't agree, and when every option has its advantages and
>>>> disadvantages, the only rational solution is to give people an explicit
>>>> choice.
>>
>> Except when giving a choice complicates the matter more than the
>> benefits it can give.
>> From this perspective I like compiler flags more than adding new source
>> code constructs. Maybe because I think that signed wrapping, unlike
>> unsigned, is of no practical use.
>>
>>>
>>> There are lot of aspects that majority of people involved seem to
>>> agree with.
>>
>> I, for one, find some of them questionable.
>>
>>>
>>> * on majority of cases overflow (even unsigned) is programming error
>> That's my first - unsigned wrapping /can/ be useful in some contexts. I
>> can't say if it is about the majority of cases, but it is enough to keep
>> it in place.
>
> Perhaps here is some difference in meaning of word "majority".
> For me "majority" means "more than 50% of all". Also "all" is
> all such operations in code base. You seem to express disagreement
> with something else.
My point is about what to do with the fact that many times overflow is a
programming error: even if this is true I think that unsigned overflow
should have defined behavior (and wrap) rather than being handled as an
error by the compiler.
> Physically damaged, disconnected or short-circuited temperature sensor
> can no way repair or reconnect itself.
Undoubtedly, but that's not what I wrote.

> So the software of device that
> has such sensor has to work with NaNs for currently measured temperature
> or has to turn into brick. There are nothing else to do.
> Hardware like that of T-1000 from Terminator-2 is not yet invented.
No need to call in Schwarzenegger for help.
My point is that rather than using NaNs the hardware or driver should
raise specific error signals (like some error code on the control I/O
port, or at the API level) instead.

David Brown

Nov 5, 2019, 5:00:16 PM
On 05/11/2019 18:02, Öö Tiib wrote:
> On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown wrote:
>>
>> When people can't agree, and when every option has its advantages and
>> disadvantages, the only rational solution is to give people an explicit
>> choice.
>
> There are lot of aspects that majority of people involved seem to
> agree with.
>
> * on majority of cases overflow (even unsigned) is programming error

Agreed.

> * trap (overflow that signals or throws) helps to find programming
> errors most reliably and quickly

Agreed (where "trap" could mean any kind of notification, exception,
error log, etc.). But this is something you might only want during
debugging - it comes at a significant efficiency cost.

> * wrap (overflow with modular arithmetic) can be used intentionally
> and efficiently in some algorithms

It is occasionally, but rarely, useful.

> * snap (overflow that results with saturating nan) is good when
> damaged, disconnected or short circuited (temperature?) sensor
> should not turn our device (airplane?) into brick

Saturation in general can be useful.

> * compiler assuming that overflow not occurring is ensured by some
> external logic may allow noteworthy optimizations

Agreed.

> * efficiency is of low importance for lot of code

Agreed.

> * efficiency of C++ is one of its design goals

Agreed.

> * it is good when programmer's intention can be indicated in code

Agreed.


A couple of other options that I didn't mention before for overflows are:

* NaN (signalling or quiet), so that you can do long calculations and
then check at the end if it worked.

* Unspecified value - the overflow will give a valid int (or whatever
type is in use) value, but the compiler gives no information about what
the value might be. This can be useful when you can detect something
has failed, and return (valid, value) pairs. It will allow the compiler
to take shortcuts and simplify calculations but not allow undefined
behaviour to "move" forwards or backwards.

>
> So if to think of it then it is large puzzle indeed how to resolve it
> elegantly and it continues to be bad to leave it all up to
> implementations to decide and then each developer to struggle
> with it on his own with "this standard imposes no requirements".
>
> I myself would like most current operators to trap on overflow
> (by raising SIGFPE or throwing std::overflow_error) both for
> signed and unsigned.

I like that in debugging or finding problems - with tools like
sanitizers. But I would not want that in normal code. With this kind
of semantics, the compiler can't even simplify "x + 1 - 1" to "x". I
much prefer to be able to write code in whatever way is simplest,
clearest or most maintainable for /me/, knowing that the compiler will
turn it into the most efficient results. I intentionally use an
optimising compiler for C and C++ programming - when efficiency doesn't
matter, I'll program in Python where integers grow to avoid overflow.

> For wrap, snap and "compiler may assume that the operation
> does not overflow" cases I would like new operators.
>

I suggested new types for the different behaviour, but of course it is
the operations that have the behaviour, not the types. However, I can't
see a convenient way to specify overflow behaviour on operations - using
types is the best balance between flexibility and legible code.

David Brown

Nov 5, 2019, 5:15:09 PM
On 05/11/2019 19:31, Manfred wrote:
> On 11/5/19 6:02 PM, Öö Tiib wrote:
>> On Tuesday, 5 November 2019 11:31:39 UTC+2, David Brown  wrote:
>>>
>>> When people can't agree, and when every option has its advantages and
>>> disadvantages, the only rational solution is to give people an explicit
>>> choice.
>
> Except when giving a choice complicates the matter more than the
> benefits it can give.
> From this perspective I like compiler flags more than adding new source
> code constructs. Maybe because I think that signed wrapping, unlike
> unsigned, is of no practical use.
>

Personally, I think signed wrapping is only useful very occasionally -
and unsigned wrapping only slightly more often. But when you need
wrapping, it is a lot more efficient to have it than to have to mess
around with checks for overflows.

I don't like compiler flags as a solution here, because it is
disconnected from the code. (Though flags are excellent for when you
have to work with code written by someone who doesn't understand the
language and has only tested with inferior compilers.) If I have code
that makes use of additional semantics, like wrapping integers or
invalid pointer casts, I put the flags in the code as pragmas (or gcc
function attributes). That way the code will work regardless of the
flags used to compile the code. It also gives a fine place to put
comments explaining the issue.

>>
>> There are lot of aspects that majority of people involved seem to
>> agree with.
>
> I, for one, find some of them questionable.
>
>>
>> * on majority of cases overflow (even unsigned) is programming error
> That's my first - unsigned wrapping /can/ be useful in some contexts. I
> can't say if it is about the majority of cases, but it is enough to keep
> it in place.
>

I think it can be agreed that sometimes the wrapping is useful, and
sometimes it is a programming error. But people disagree about the
split, both for signed and unsigned types.

>
>> * efficiency is of low importance for lot of code
> Disagree. It is not always important, but it is important enough to be
> one of the main reasons to choose C++, so it should be of primary
> importance for the language.

There are many good reasons for choosing C++ other than efficiency. It
is a well-established fact that in almost all code, only small sections
are performance critical. That does not mean we want intentionally
inefficient code in the rest of it, of course. And if the code gets
full of run-time checks then it /would/ end up as inefficient.

Öö Tiib

Nov 5, 2019, 6:49:30 PM
I just listed facts that a lot of people agree with, like
"on majority of cases overflow (even unsigned) is programming error".
I did not say what to conclude from these facts here.

> >>> * wrap (overflow with modular arithmetic) can be used intentionally
> >>> and efficiently in some algorithms
> >>> * snap (overflow that results with saturating nan) is good when
> >>> damaged, disconnected or short circuited (temperature?) sensor
> >>> should not turn our device (airplane?) into brick
> >> I am not sure bout this.
> >> Floating point arithmetic does something of the kind (NaN propagates
> >> through), but I think that if it is about a sensor failure, and
> >> especially if it is about safety, then the hardware itself or the device
> >> driver should handle this properly, i.e. with specific error conditions,
> >> rather than the language.
> >
> > Physically damaged, disconnected or short-circuited temperature sensor
> > can no way repair or reconnect itself.
> Undoubtedly, but that's not what I wrote.

Ok.

> > So the software of device that
> > has such sensor has to work with NaNs for currently measured temperature
> > or has to turn into brick. There are nothing else to do.
> > Hardware like that of T-1000 from Terminator-2 is not yet invented.
> No need to call in Schwarzenegger for help.
> My point is that rather than using NaNs the hardware or driver should
> raise specific error signals (like some error code on the control I/O
> port, or at the API level) instead.

The device has to operate on incomplete data, and a saturating silent NaN
works perfectly as that missing piece of data. A driver that panics,
throws up and signals too much has to be killed to reduce the disturbance.
Panic solves nothing, regardless of whether you are Schwarzenegger or not. ;)

Bonita Montero

Nov 6, 2019, 1:08:18 AM
>> And That's not how computers work.

> That is utterly irrelevant.

You can rely on it that p0907r0 will be included in an upcoming standard
and all implementations will have std::numeric_limits<signed...>::
is_modulo set to true; so g++ must drop the shown optimization.
There are so many language properties that represent how a CPU
logically works, why not this property?

David Brown

Nov 6, 2019, 3:30:33 AM
On 06/11/2019 07:08, Bonita Montero wrote:
>>> And That's not how computers work.
>
>> That is utterly irrelevant.
>
> You can rely on that p0907r0 will be included in an upcoming standard
> and all implementations will have std::numeric_limits<signed...>::

Have you actually /read/ the paper, and its subsequent revisions (we are
now on p0907r4) ?

<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r4.html>

Signed integer overflow remains undefined behaviour. This is what the
majority of the committee, the majority of compiler vendors, and the
majority of users want.

> is_modulo to be set to true; so g++ must drop thé shown optimization.

"is_modulo" can be (but doesn't need to be) set to true if the
implementation gives signed integer arithmetic wrapping semantics.

/If/ an implementation has is_modulo set true for signed types, then you
are correct that it can't do the kind of optimisations I showed (or many
other optimisations). gcc, clang and MSVC currently have is_modulo
false for signed integer types, and do not guarantee wrapping behaviour.
This is fine, and the way it should be. (gcc and clang leave it false
even under "-fwrapv", which is also fine.)

> There are so many language-properties that represent how a CPU logi-
> cally works, why not this property?

C and C++ are high level languages, abstracted from the underlying cpu.

And it has already been explained to you why undefined signed integer
overflow is a good idea.

Bonita Montero

Nov 6, 2019, 3:57:52 AM
>> There are so many language-properties that represent how a CPU logi-
>> cally works, why not this property?

> C and C++ are high level languages, abstracted from the underlying cpu.

C isn't high-level and C++ is high-level as well as low-level.
And the issue we're talking about is low-level.

Manfred

Nov 6, 2019, 7:33:48 AM
On 11/5/2019 8:45 PM, Paavo Helde wrote:
> On 5.11.2019 17:58, Bonita Montero wrote:
>>
>> With the same reasoning you could say that unsigneds might never
>> wrap; but in fact they're specified to wrap.
>
> In retrospect, this (wrapping unsigneds) looks like a major design mistake.

No, it isn't.

>
> IMO, wrapping integers (signed or unsigned) are an example of
> "optimization which nobody asked for", and they are there basically only
> because the hardware happened to support such operations.
>

Look at the following code and see for yourself how efficient it is to
check for integer overflow if unsigned integers do wrap.
Achieving the same would be much more verbose (and less efficient) if
unsigned overflow were not defined behavior.

(taken from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm)

char* make_pathname (const char *dir, const char *fname, const char *ext)
{
    size_t dirlen = strlen (dir);
    size_t filelen = strlen (fname);
    size_t extlen = strlen (ext);

    size_t pathlen = dirlen;

    // detect and handle integer wrapping
    if (   (pathlen += filelen) < filelen
        || (pathlen += extlen) < extlen
        || (pathlen += 3) < 3)
        return 0;

    char *p, *path = malloc (pathlen);
    if (!path)
        return 0;

    p = memcpy (path, dir, dirlen);
    p [dirlen] = '/';

    p = memcpy (p + dirlen + 1, fname, filelen);
    p [filelen] = '.';

    memcpy (p + filelen + 1, ext, extlen + 1);

    return path;
}

Bonita Montero

Nov 6, 2019, 7:51:30 AM
> char* make_pathname (const char *dir, const char *fname, const char *ext)
>   {
>       size_t dirlen = strlen (dir);
>       size_t filelen = strlen (fname);
>       size_t extlen = strlen (ext);
>       size_t pathlen = dirlen;
>       // detect and handle integer wrapping
>       if (   (pathlen += filelen) < filelen
>           || (pathlen += extlen) < extlen
>           || (pathlen += 3) < 3)
>           return 0;

Sorry, but when are paths longer than a size_t can represent?

Öö Tiib

Nov 6, 2019, 8:26:51 AM
On Wednesday, 6 November 2019 00:00:16 UTC+2, David Brown wrote:
> On 05/11/2019 18:02, Öö Tiib wrote:
>
> > * trap (overflow that signals or throws) helps to find programming
> > errors most reliably and quickly
>
> Agreed (where "trap" could mean any kind of notification, exception,
> error log, etc.). But this is something you might only want during
> debugging - it is of significant efficiency cost.

Indeed, majority of programming errors should be found during
debugging.

> > I myself would like most current operators to trap on overflow
> > (by raising SIGFPE or throwing std::overflow_error) both for
> > signed and unsigned.
>
> I like that in debugging or finding problems - with tools like
> sanitizers. But I would not want that in normal code. With this kind
> of semantics, the compiler can't even simplify "x + 1 - 1" to "x".

Maybe it can, maybe it can't; that depends on the wording.
I have not really thought through how to word the semantics
precisely. The major purpose is to get errors when the program
actually stores a value into a type where it does not fit (IOW
really overflows).

An analogous argument is that automatic storage overflow may
not be trappable in principle, since that would disallow optimizing
recursions (that exhaust the stack) into loops (that don't exhaust
the stack). The rules can likely still be worded in a way that the
implementation is allowed not to trap when it manages to get
the job done without exhausting automatic storage somehow.

> I much prefer to be able to write code in whatever way is simplest,
> clearest or most maintainable for /me/, knowing that the compiler will
> turn it into the most efficient results. I intentionally use an
> optimising compiler for C and C++ programming - when efficiency doesn't
> matter, I'll program in Python where integers grow to avoid overflow.

I do almost the same, but I think of some of it slightly differently.
In programming, the exact formula is unfortunately more important
than its clarity and intuitiveness for the reader.

For example, we need to calculate the average of two values A and B.
Mathematically there are a lot of ways to calculate it, and what is
most intuitive may depend on the meaning of A and B. Like:
1) (A + B) / 2
2) A / 2 + B / 2
3) A + (B - A) / 2
4) B + (A - B) / 2
etc.
But in software these can be very different expressions, because
they have different potential overflows and/or losses of
accuracy. Until something helps to reduce that issue, it is
all about exactly that formula, period.
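A small illustration of that point: the mathematically equal formulas 1 and 3
from the list above behave differently at the edges of int, and C++20's
std::midpoint is specified to avoid the intermediate overflow:

#include <numeric>   // std::midpoint, C++20

int avg_naive(int a, int b)   { return (a + b) / 2; }         // a + b can overflow
int avg_shifted(int a, int b) { return a + (b - a) / 2; }     // b - a can still overflow for opposite signs
int avg_std(int a, int b)     { return std::midpoint(a, b); } // specified to avoid intermediate overflow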

As for efficiency, it is often uncertain anyway until it is
shown where the bottlenecks are. Also, that can often only be
shown by profiling products with realistic worst-case loads
of data, and then it is usually a small subset of the code that
can change the overall efficiency.

Python I use less, not because of its bad performance but
because I have failed to use it scalably. Lots of little script
programs are great, but when any of those starts to grow bigger,
my productivity with it drops. For C++ the same size feels
nonsensically unimportant. Somehow in C++ I have
learned to separate different concerns and to abstract details
away, but not in Python.

> > For wrap, snap and "compiler may assume that the operation
> > does not overflow" cases I would like new operators.
> >
>
> I suggested new types for the different behaviour, but of course it is
> the operations that have the behaviour, not the types. However, I can't
> see a convenient way to specify overflow behaviour on operations - using
> types is the best balance between flexibility and legible code.

I mean totally new "advanced" operators like (A +% B) or (C +^ D).
Yes, there will be precedence (and maybe associativity etc.)
to define, but that is business as usual and not some show-stopper
issue. In some languages (like Swift) it is done and it seems to
work fine.

David Brown

Nov 6, 2019, 8:37:23 AM
That is just silly, in all sorts of ways.

First, decide if the function is an "internal" function where you can
trust the parameters, and have undefined behaviour if assumptions don't
hold, or an "external" function where you have to check the validity of
the parameters.

If it is internal, you know the lengths of the passed strings will not
sum to more than 4G - or you don't care if someone does something
ridiculous. (And on most modern systems, size_t is 64-bit - overflowing
here would require 16 EB ram for storing the strings.)

If it is external, the checking is too little - if you have char*
pointers from an unknown source, you should be wary about running
strlen() on them because you don't know if it will actually end with a 0
in a reasonable limit.


You only need to check for overflow if it is possible for the
calculations to overflow. If the operands are too small to cause an
overflow, there will not be an overflow.

And until you are talking about large integers for cryptography or that
sort of thing, adding up realistic numbers will not overflow a 64-bit type.

So /if/ you have an old 32-bit size_t system, and /if/ you have
maliciously crafted parameters that point to huge strings (and you'll
have to make them point within the same string - you don't get over 4 GB
user memory address space with 32-bit size_t), then you can do your
adding up using 64-bit types and you get zero risk of overflow.


uint_least64_t dirlen = strlen (dir);
uint_least64_t filelen = strlen (fname);
uint_least64_t extlen = strlen (ext);

uint_least64_t pathlen = dirlen + filelen + extlen;

if (pathlen > MAX_SANE_PATHLENGTH) return 0;


There are times when unsigned wrapping overflow is useful. This is not
one of them.

Manfred

Nov 6, 2019, 9:38:40 AM
You realize that this comes from the glibc maintainers, don't you?
You can say they wrote silly code for this example (I don't), but I
doubt there are many people more knowledgeable about this kind of
matter than they are.

Moreover, I took this as an example of detection of integer overflow.
The fact that it happens to be about pathname strings is irrelevant to
this discussion.

>
> First, decide if the function is an "internal" function where you can
> trust the parameters, and have undefined behaviour if assumptions don't
> hold, or an "external" function where you have to check the validity of
> the parameters.

This example was written about code safety, so yes, I believe it is
pretty clear the assumption is that the strings come from an external source.
Obviously this applies to the string /contents/; the pointers themselves can
only be internal to the program (can't they?), so there is no need to check
for null pointers.
On the other hand, the contents of the strings are checked by ensuring that
the results of strlen and their combination are valid. This is ensured
/exactly/ by making use of unsigned wrapping behavior.

>
> If it is internal, you know the lengths of the passed strings will not
> sum to more than 4G - or you don't care if someone does something
> ridiculous. (And on most modern systems, size_t is 64-bit - overflowing
> here would require 16 EB ram for storing the strings.)
>
> If it is external, the checking is too little - if you have char*
> pointers from an unknown source, you should be wary about running
> strlen() on them because you don't know if it will actually end with a 0
> in a reasonable limit.

This code handles C strings, so there is no way to check for their
length other than running strlen.
The fact that you seem to miss is that it is exactly thanks to the check
that you call "silly" that it is ensured that they "actually end with a
0 in a reasonable limit".

We could argue about what happens with /read/ access to a
non-0-terminated string, but I would simply assume that the strings are
0 terminated, since the function is going to be called by some other
part of the program that can take care that there is a 0 at the end of
the buffer. What is not guaranteed is that the strings actually contain
pathnames, and don't contain very long malicious text instead (e.g. they
could come from stdin).
That risk is avoided by the code you call silly.
So, no there is not too little checking.

>
>
> You only need to check for overflow if it is possible for the
> calculations to overflow. If the operands are too small to cause an
> overflow, there will not be an overflow.
>
> And until you are talking about large integers for cryptography or that
> sort of thing, adding up realistic numbers will not overflow a 64-bit type.

In fact cryptography is another example where unsigned wrap is useful,
but it would be much more complex (and off topic) to draw an example of
this (not that I claim to be an expert in this area)

And no, just assuming that "adding up realistic numbers will not
overflow a 64-bit type" is not what safe code is about.

>
> So /if/ you have an old 32-bit size_t system, and /if/ you have
> maliciously crafted parameters that point to huge strings (and you'll
> have to make them point within the same string - you don't get over 4 GB
> user memory address space with 32-bit size_t), then you can do your
> adding up using 64-bit types and you get zero risk of overflow.
>
>
> uint_least64_t dirlen = strlen (dir);
> uint_least64_t filelen = strlen (fname);
> uint_least64_t extlen = strlen (ext);
>
> uint_least64_t pathlen = dirlen + filelen + extlen;
>
> if (big_size_t > MAX_SANE_PATHLENGTH) return 0;

You realize that this code is less efficient than the original one,
don't you?
And what would be the correct value for MAX_SANE_PATHLENGTH? Are you
aware of the trouble that has been caused by Windows MAX_PATH?

The example I posted achieves the same level of safety, using less
resources, and allowing for the maximum string length that the system
can /safely/ handle (don't miss the check after malloc). What more do
you want?

>
>
> There are times when unsigned wrapping overflow is useful. This is not
> one of them.
>

I suggest you read the code again (and its source - it is instructive)

Bonita Montero

Nov 6, 2019, 9:54:00 AM
>>> (taken from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm)
>>>
>>> char* make_pathname (const char *dir, const char *fname, const char
>>> *ext)
>>>    {
>>>        size_t dirlen = strlen (dir);
>>>        size_t filelen = strlen (fname);
>>>        size_t extlen = strlen (ext);
>>>
>>>        size_t pathlen = dirlen;
>>>
>>>        // detect and handle integer wrapping
>>>        if (   (pathlen += filelen) < filelen
>>>            || (pathlen += extlen) < extlen
>>>            || (pathlen += 3) < 3)
>>>            return 0;

>> That is just silly, in all sorts of ways.

> You realize that this comes from the glibc maintainers, don't you?
> You can say they wrote silly code for this example (I don't), but I
> doubt there are many more knowledgeable people about this kind of
> matter than them.

Yes, this is useless code.

James Kuyper

Nov 6, 2019, 10:06:55 AM
On 11/6/19 9:38 AM, Manfred wrote:
...
> And no, just assuming that "adding up realistic numbers will not
> overflow a 64-bit type" is not what safe code is about.

Assuming it: no. Verifying it: yes. If you validate your inputs, you can
often place upper and lower limits on the value of an expression
calculated from those inputs. If those limits fall within the range that
is guaranteed to be representable in the expression's type, it is
perfectly legitimate not to bother including an overflow check.
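A toy sketch of that approach, with an invented limit: once the inputs are
validated against a known bound, the sum is provably far below SIZE_MAX and
needs no wrap-around check:

#include <cstddef>
#include <optional>

constexpr std::size_t max_component = 255;   // hypothetical validated limit per component

std::optional<std::size_t> total_length(std::size_t dirlen, std::size_t filelen, std::size_t extlen)
{
    if (dirlen > max_component || filelen > max_component || extlen > max_component)
        return std::nullopt;                 // reject before doing any arithmetic
    return dirlen + filelen + extlen + 3;    // bounded by 3 * 255 + 3, cannot overflow size_t
}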

Paavo Helde

Nov 6, 2019, 10:55:12 AM
Seriously?

std::string make_pathname(const std::string& dir,
    const std::string& fname, const std::string& ext)
{
    return dir + "/" + fname + "." + ext;
}

No need to check for any overflows.

Not to speak about that there cannot be overflow in the first place
because if pathlen overflows the three strings dir, fname and ext would
not fit in the process memory anyway.

Not to speak about that the time lost for a more explicit check for
overflow would be zero or unmeasurable, compared to any file access
itself, or even when compared to the malloc() call in the same function.


Alf P. Steinbach

unread,
Nov 6, 2019, 11:00:45 AM11/6/19
to
On 06.11.2019 15:38, Manfred wrote:
> On 11/6/2019 2:37 PM, David Brown wrote:
>> On 06/11/2019 13:33, Manfred wrote:
>>> [snip]
>>> Look at the following code and see for yourself how efficient it is to
>>> check for integer overflow if unsigned integers do wrap.
>>> Achieving the same would be much more verbose (and less efficient) if
>>> unsigned overflow were not defined behavior.
>>>
>>> (taken from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm)
>>>
>>> char* make_pathname (const char *dir, const char *fname, const char
>>> *ext)
>>>    {
>>>        size_t dirlen = strlen (dir);
>>>        size_t filelen = strlen (fname);
>>>        size_t extlen = strlen (ext);
>>>
>>>        size_t pathlen = dirlen;
>>>
>>>        // detect and handle integer wrapping
>>>        if (   (pathlen += filelen) < filelen
>>>            || (pathlen += extlen) < extlen
>>>            || (pathlen += 3) < 3)
>>>            return 0;
>>
>> That is just silly, in all sorts of ways.
>
> You realize that this comes from the glibc maintainers, don't you?
> You can say they wrote silly code for this example (I don't), but I
> doubt there are many more knowledgeable people about this kind of matter
> than them.

David has a point that with 32-bit `size_t` there's no way to have
separate strings whose lengths sum to >= 4G.

So some of the arguments have to point within the same superlong string
in order for the checking to end up at `return 0;`.

Whether it's silly to try to give well-defined behavior also for such an
unlikely case: maybe silly when one just codes up something for limited
use and with limited time, but probably not silly when one's crafting
widely used library code.

I.e. the context, what it's made for, "glibc", is important.

However I think the appeal to authority, "glibc /maintainers/", is a
fallacious argument.


- Alf

Manfred

unread,
Nov 6, 2019, 11:48:11 AM11/6/19
to
I should check the details (if I had the time and will to do it) but
even if this is true for the physical memory address space, if I
remember correctly the 386 has way larger virtual memory addressing
space: it does have segmenting capability, even if most OSs never used it.
I don't remember if it is possible for the 386 to address more than 4G
within a single process, though. Theoretically it is nonetheless
possible, using segments, to have a 32-bit architecture wherein the
lengths sum up to more than 4G.

More practically, the example was about code safety, and so the
possibility of malicious usage has to be assumed, hence the need for the
check (at least for the +3 part).

>
> So some of the arguments have to point within the same superlong string
> in order for the checking to end up at `return 0;`.
>
> Whether it's silly to try to give well-defined behavior also for such an
> unlikely case: maybe silly when one just codes up something for limited
> use and with limited time, but probably not silly when one's crafting
> widely used library code.
>
> I.e. the context, what it's made for, "glibc", is important.
>
> However I think the appeal to authority, "glibc /maintainers/", is a
> fallacious argument.

It would be if it was only an appeal to authority.
After giving context (and yes, pointing out that this example was not
just rubbish taken from some dump in the internet), in the followup my
argument has gone into the subject of the matter.

>
>
> - Alf

Bonita Montero

unread,
Nov 6, 2019, 11:55:01 AM11/6/19
to
> I should check the details (if I had the time and will to do it)
> but even if this is true for the physical memory address space, if
> I remember correctly the 386 has way larger virtual memory addressing
> space: it does have segmenting capability, even if most OSs never used
> it.

And there is an operating-system using the glibc in a segmented
environment?

> space: it does have segmenting capability, even if most OSs never used
> it. I don't remember if it is possible for the 386 to address more than
> 4G within a single process, though. Theoretically it is nonetheless
> possible, using segments, to have a 32-bit architecture wherein the
> lengths sum up to more than 4G.

I think it would be rather stupid to continue the segmented behaviour
of the 286 protected mode with the 386 protected mode, although it is
hypothetically possible. Also because the 32-bit machines almost never
had more memory than 4GB.

Manfred

unread,
Nov 6, 2019, 12:16:08 PM11/6/19
to
How do you think that overflow check is done inside std::string?

>
> Not to speak about that there cannot be overflow in the first place
> because if pathlen overflows the three strings dir, fname and ext would
> not fit in the process memory anyway.
>
> Not to speak about that the time lost for a more explicit check for
> overflow would be zero or unmeasurable, compared to any file access
> itself, or even when compared to the malloc() call in the same function.
>
>

What do you mean with "more explicit check for overflow"?

Assuming you know the variables are 32-bit unsigned, I suppose you could do
if (pathlen < 0xFFFFFFFF - filelen)
{
    pathlen += filelen;
}
else
{
    return 0;
}

and then repeat, but honestly I don't see the benefit of it compared to
the above (as a first you are introducing a dependency on the specific
integer size).
Or you can cast to a wider type, but then you are not solving the
problem, you are only moving it forward, and still I wouldn't see the
benefit.

Besides, this is about /integer/ overflow check, so it could apply to
more computations other than memory size.

Bonita Montero

unread,
Nov 6, 2019, 12:33:49 PM11/6/19
to
> What do you mean with "more explicit check for overflow"?

Concatenating strings in C++ with the + operator is reliable.

Öö Tiib

unread,
Nov 6, 2019, 1:55:20 PM11/6/19
to
On Wednesday, 6 November 2019 19:16:08 UTC+2, Manfred wrote:
> On 11/6/2019 4:54 PM, Paavo Helde wrote:
> > Seriously?
> >
> > std::string make_pathname(const std::string& dir,
> >     const std::string& fname, const std::string& ext)
> > {
> >     return dir + "/" + fname + "." + ext;
> > }
> >
> > No need to check for any overflows.
>
> How do you think that overflow check is done inside std::string?

All standard library writers are rather good programmers.
Obviously they have something where it is easy to see from afar that it
can in no way overflow. Likely it is some short inline member, called
when the size is supposed to grow, that does the check:

if (max_size() - size() < size_to_add)
    throw std::length_error(text_to_throw);

Why don't you look into any of the implementations on your computer?
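
In compilable form the same check is just this (a sketch, the name is mine):

#include <stdexcept>
#include <string>

// Sketch of the kind of pre-growth check: since size() <= max_size(),
// the subtraction cannot wrap, and the comparison catches any request
// that would push the string past max_size().
void check_can_grow(const std::string& s, std::string::size_type size_to_add)
{
    if (s.max_size() - s.size() < size_to_add)
        throw std::length_error("resulting string too long");
}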

Bo Persson

unread,
Nov 6, 2019, 2:00:16 PM11/6/19
to
The original problem wasn't only about memory.

The designers of Windows NT *did* briefly consider adding support for
more than one 4GB segment in a program.

However, to load a new segment you first have to swap the old 4GB
segment out to disk. And they couldn't see PC hard disks ever becoming
that large. :-)


Bo Persson

Bonita Montero

unread,
Nov 6, 2019, 2:13:24 PM11/6/19
to
> The designers of Windows NT *did* briefly consider adding support for
> more than one 4GB segment in a program.

They actually have it today for a small segment in which the
thread-information-block resides:
https://en.wikipedia.org/wiki/Win32_Thread_Information_Block

> However, to load a new segment you first have to swap the old 4GB
> segment out to disk.

That's not necessarily true.

Bonita Montero

unread,
Nov 6, 2019, 2:16:02 PM11/6/19
to
> All standard library writers are rather good programmers.

Another aspect is: if millions of developers reuse code instead of
writing it on their own, errors will be noticed for sure; but when
you do everything yourself, a lot of errors won't be noticed.

David Brown

unread,
Nov 6, 2019, 2:36:42 PM11/6/19
to
No. But that doesn't change my opinion - at least not much. It makes a
bit more sense if it has to be portable to DOS, for example, with 16-bit
size_t.

> You can say they wrote silly code for this example (I don't), but I
> doubt there are many more knowledgeable people about this kind of matter
> than them.
>
> Moreover, I took this as an example of detection of integer overflow.
> The fact that it happens to be about pathname strings is irrelevant to
> this discussion.

It is, IMHO, a fine example of when overflow should not be issue - and
that defined overflow behaviour is a lot less useful than many people think.

(I still agree that occasionally it is useful.)

>
>>
>> First, decide if the function is an "internal" function where you can
>> trust the parameters, and have undefined behaviour if assumptions don't
>> hold, or an "external" function where you have to check the validity of
>> the parameters.
>
> This example was written about code safety, so yes, I believe it is
> pretty clear the assumption is that strings come from an external source.
> Obviously this applies to string /contents/; the pointer themselves can
> only be internal to the program (can't they?), so no need to check for
> null pointer.

I don't know the circumstances for this function, so I can't answer
that. But if it is an externally accessible function, I think checks
for null pointers would make sense.

> On the other hand, contents of the string is checked by ensuring that
> the result of strlen and their combination is valid. This is ensured
> /exactly/ by making use of unsigned wrapping behavior.
>

The code is going out of its way to check something that is almost
impossible to occur, even with malicious intentions - while ignoring
problems that could easily occur even by accident.

>>
>> If it is internal, you know the lengths of the passed strings will not
>> sum to more than 4G - or you don't care if someone does something
>> ridiculous.  (And on most modern systems, size_t is 64-bit - overflowing
>> here would require 16 EB ram for storing the strings.)
>>
>> If it is external, the checking is too little - if you have char*
>> pointers from an unknown source, you should be wary about running
>> strlen() on them because you don't know if it will actually end with a 0
>> in a reasonable limit.
>
> This code handles C strings, so there is no way to check for their
> length other than running strlen.

size_t limited_strlen(const char* s, size_t n) {
    const char* p = memchr(s, 0, n);
    if (!p) return (size_t) -1;        // Indicate bad string
    return p - s;
}

> The fact that you seem to miss is that it is exactly thanks to the check
> that you call "silly" that it is ensured that they "actually end with a
> 0 in a reasonable limit".

The checks don't do that. If the three parameters pointed to strings
adding up to 4 GB - 1 characters in total, by what stretch of the
imagination is that "in a reasonable limit" ?

>
> We could argue about what happens with /read/ access to a
> non-0-terminated string, but I would simply assume that the strings are
> 0 terminated, since the function is going to be called by some other
> part of the program that can take care that there is a 0 at the end of
> the buffer.

Failing to 0 terminate a character sequence that is intended to be a
string is not an uncommon bug.

> What is not guaranteed is that the strings actually contain
> pathnames, and don't contain very long malicious text instead (e.g. they
> could come from stdin).

You need text sequences totalling more than 4 GB to cause a problem (and
that's on older 32-bit size_t systems). Since the user program can't
have as much as 4 GB of data space, you'd need not only huge malicious
texts, but you'd also need at least two of the three parameters to overlap.


> That risk is avoided by the code you call silly.
> So, no there is not too little checking.
>
>>
>>
>> You only need to check for overflow if it is possible for the
>> calculations to overflow.  If the operands are too small to cause an
>> overflow, there will not be an overflow.
>>
>> And until you are talking about large integers for cryptography or that
>> sort of thing, adding up realistic numbers will not overflow a 64-bit
>> type.
>
> In fact cryptography is another example where unsigned wrap is useful,
> but it would be much more complex (and off topic) to draw an example of
> this (not that I claim to be an expert in this area)

Yes, I agree. As I said, sometimes unsigned wrap is useful. But not in
this example.

>
> And no, just assuming that "adding up realistic numbers will not
> overflow a 64-bit type" is not what safe code is about.

Safe coding is never about assuming anything. But safe coding involves
looking at what values can occur (intentionally and unintentionally),
and making checks /only/ when you need checks. It involves /thinking/,
rather than making checks in the blind "just in case". And when you
have 64-bit types, there are few non-specialist occasions when your
values can overflow with simple addition. If there is a risk of there
being an overflow, then you probably have not sanity checked the values
coming in.

>
>>
>> So /if/ you have an old 32-bit size_t system, and /if/ you have
>> maliciously crafted parameters that point to huge strings (and you'll
>> have to make them point within the same string - you don't get over 4 GB
>> user memory address space with 32-bit size_t), then you can do your
>> adding up using 64-bit types and you get zero risk of overflow.
>>
>>
>>          uint_least64_t dirlen = strlen (dir);
>>          uint_least64_t filelen = strlen (fname);
>>          uint_least64_t extlen = strlen (ext);
>>
>>     uint_least64_t pathlen = dirlen + filelen + extlen;
>>
>>     if (pathlen > MAX_SANE_PATHLENGTH) return 0;
>
> You realize that this code is less efficient than the original one,
> don't you?

You realise that it is /more/ efficient on any 64-bit system? And that
on 32-bit systems it is at most marginally less efficient, but might be
more efficient (by avoiding branches)? And that this code does
something /useful/ by checking for a sensible maximum pathlength? And
that any few instruction cycles lost here are negligible compared to the
hundreds that are needed for copying even a small path string?

> And what would be the correct value for MAX_SANE_PATHLENGTH? Are you
> aware of the trouble that has been caused by Windows MAX_PATH?

You'd pick a big value for it - but a lot less than 4G. Most
filesystems limit file names to 255 bytes. What would you say is a
maximum depth of directories before it is clear that you have thoroughly
screwed up somewhere - maybe 1000? So pick MAX_SANE_PATHLENGTH at 256K
on 32-bit systems, and 16M on 64-bit systems. People with more
knowledge of the target system in question can pick better values.

POSIX used to have a limit of 4096, which was okay in the old days but
got a bit limiting. Windows is limited to 32K now, which seems to be
fine - the old limit was 260 characters which is clearly too small.
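
Putting the pieces together, the whole function could look something like
this (untested, and the limit value is only an illustration):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SANE_PATHLENGTH (256u * 1024u)   /* pick per target system */

char* make_pathname(const char* dir, const char* fname, const char* ext)
{
    uint_least64_t dirlen  = strlen(dir);
    uint_least64_t filelen = strlen(fname);
    uint_least64_t extlen  = strlen(ext);

    /* A 64-bit sum of three in-memory string lengths cannot realistically
       overflow - the strings would have to total around 16 EB. */
    uint_least64_t pathlen = dirlen + filelen + extlen + 3;

    if (pathlen > MAX_SANE_PATHLENGTH)
        return 0;

    char* path = (char*) malloc((size_t) pathlen);
    if (!path)
        return 0;

    strcpy(path, dir);
    strcat(path, "/");
    strcat(path, fname);
    strcat(path, ".");
    strcat(path, ext);
    return path;
}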

>
> The example I posted achieves the same level of safety, using less
> resources, and allowing for the maximum string length that the system
> can /safely/ handle (don't miss the check after malloc). What more do
> you want?

See what I wrote above.

>
>>
>>
>> There are times when unsigned wrapping overflow is useful.  This is not
>> one of them.
>>
>
> I suggest you read the code again (and its source - it is instructive)

I did.

David Brown

unread,
Nov 6, 2019, 2:44:29 PM11/6/19
to
The 32-bit x86 could have a larger virtual memory space than 4 GB, and
processors with the PAE could have more physical memory too (I think 64
GB, but I am not sure). However, individual processes on *nix or
Windows could only access 4 GB (split either 2G/2G or 1G/3G between
kernel space and user space). The pointers here would all be within the
same 4 GB limit.


> I don't remember if it is possible for the 386 to address more than 4G
> within a single process, though. Theoretically it is nonetheless
> possible, using segments, to have a 32-bit architecture wherein the
> lengths sum up to more than 4G.
>
> More practically, the example was about code safety, and so the
> possibility of malicious usage has to be assumed, hence the need for the
> check (at least for the +3 part).
>

Checking parameters for sanity is vital for externally-facing functions.
This one fails to check for null pointers, fails to check for sane
sizes of the strings (strlen is not a good idea on unknown strings, if
you consider excessive run time to be a problem), and fails to check for
any kind of appropriate size on the data. It checks that the curtains
are closed while leaving the front door open.

Manfred

unread,
Nov 6, 2019, 2:53:58 PM11/6/19
to
My reply was about the comment "No need to check for any overflows." -
obviously if you do C++ programming you need to know how to check for
integer overflow (and we all know how to use '+' with std::string, but
we may also need to calculate e.g. a length of a std::vector to resize,
and check for overflow in the process)

The fragment you posted obviously works well with no need for unsigned
wrapping (I didn't say you can't do without, I said it's more efficient
and less verbose if you have it), but it also has different requirements
than the example I posted: two strings only, with the precondition that
the 'this' string is less than max_size() long.

Just to stick with this example (although my point was about integer
overflow rather than string concatenation), you see that if you lift
these preconditions and need to concat more than 2 strings (as it was in
the example) then either you precalculate the total length of the final
string with a series of additions (and you see that the example is more
efficient, more clear and less verbose than a series of if/else) or you
do string+string+string and you end up with multiple allocations and
copies (and yes, even though this is often totally irrelevant, sometimes this
does matter).

This is to say that if you want to handle integer overflow efficiently,
then unsigned wrapping /is/ useful, it is not a "major design mistake" -
which is what triggered all this (for me at least).
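
To make the idiom explicit, it can be factored into a tiny helper (a
sketch, the name is mine):

#include <cstddef>

// Adds b into a; returns false if the addition wrapped around.
// After an unsigned addition that wraps, the result is necessarily
// smaller than either operand, which is what makes this check cheap.
inline bool add_with_wrap_check(std::size_t& a, std::size_t b)
{
    a += b;
    return a >= b;
}

// usage, mirroring the example above:
//   size_t pathlen = dirlen;
//   if (!add_with_wrap_check(pathlen, filelen)
//       || !add_with_wrap_check(pathlen, extlen)
//       || !add_with_wrap_check(pathlen, 3))
//       return 0;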

Paavo Helde

unread,
Nov 6, 2019, 3:12:30 PM11/6/19
to
On 6.11.2019 19:15, Manfred wrote:
> On 11/6/2019 4:54 PM, Paavo Helde wrote:
>
>> std::string make_pathname(const std::string& dir,
>> const std::string& fname, const std::string& ext)
>> {
>> return dir + "/" + fname + "." + ext;
>> }
>>
>> No need to check for any overflows.
>
> How do you think that overflow check is done inside std::string?

I'm pretty confident it's done properly. Note that std::string would
even be allowed to use implementation-specific features which are UB by
the standard, should it be necessary.

>>
>> Not to speak about that there cannot be overflow in the first place
>> because if pathlen overflows the three strings dir, fname and ext
>> would not fit in the process memory anyway.
>>
>> Not to speak about that the time lost for a more explicit check for
>> overflow would be zero or unmeasurable, compared to any file access
>> itself, or even when compared to the malloc() call in the same function.
>>
>>
>
> What do you mean with "more explicit check for overflow"?
>
> Assuming you know the variables are 32-bit unsigned, I suppose you could do
> if (pathlen < 0xFFFFFFFF-filelen)
> {
> pathlen += filelen;
> }
> else
> {
> return 0;
> }

Yes, that's what I meant by "more explicit". And there is
std::numeric_limits<T>::max(), no need for 0xFFFFFFFF magic numbers.
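
For example something like this (a sketch):

#include <cstddef>
#include <limits>

// Explicit pre-check without relying on wrap-around and without
// magic numbers: a + b would overflow exactly when a > max - b.
bool add_would_overflow(std::size_t a, std::size_t b)
{
    return a > std::numeric_limits<std::size_t>::max() - b;
}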

>
> and then repeat, but honestly I don't see the benefit of it compared to
> the above (as a first you are introducing a dependency on the specific
> integer size).
> Or you can cast to a wider type, but then you are not solving the
> problem, you are only moving it forward, and still I wouldn't see the
> benefit.

To be honest, the check I would write would be something like:

if (strlen(filename) > 10000) {
    throw std::runtime_error("You must be kidding");
}

and it would happen way before any make_pathname() function.

>
> Besides, this is about /integer/ overflow check, so it could apply to
> more computations other than memory size.

No, the variables in your example were size_t.

Paavo Helde

unread,
Nov 6, 2019, 3:25:34 PM11/6/19
to
On 6.11.2019 21:53, Manfred wrote:
> Just to stick with this example (although my point was about integer
> overflow rather than string concatenation), you see that if you lift
> these preconditions and need to concat more than 2 strings (as it was in
> the example) then either you precalculate the total length of the final
> string with a series of additions (and you see that the example is more
> efficient, more clear and less verbose than a series of if/else) or you
> do string+string+string and you end up with multiple allocations and
> copies (and yes, as often as this is totally irrelevant, sometimes this
> does matter).

If the profiler shows the multiple allocations in this function are
creating a bottleneck, you can avoid this easily:

std::string make_pathname(const std::string& dir,
    const std::string& fname, const std::string& ext)
{
    std::string result;
    result.reserve(dir.length() + fname.length() + ext.length() + 2);
    result += dir;
    result += '/';
    result += fname;
    result += '.';
    result += ext;
    return result;
}

There is still no need to check for overflow.

Note that if there is an overflow, reserve() will reserve less memory
than needed, but that's OK because the later += operations will check
for it anyway.

And if a future C++ implementation makes unsigned wrapping undefined and
the above addition overflows, triggers an assertion and terminates the
program, then this would be a better result than trying to compose a
filename which won't fit in memory.

David Brown

unread,
Nov 6, 2019, 3:50:30 PM11/6/19
to
On 06/11/2019 14:26, Öö Tiib wrote:
> On Wednesday, 6 November 2019 00:00:16 UTC+2, David Brown wrote:
>> On 05/11/2019 18:02, Öö Tiib wrote:
>>
>>> * trap (overflow that signals or throws) helps to find programming
>>> errors most reliably and quickly
>>
>> Agreed (where "trap" could mean any kind of notification, exception,
>> error log, etc.). But this is something you might only want during
>> debugging - it is of significant efficiency cost.
>
> Indeed, majority of programming errors should be found during
> debugging.
>
>>> I myself would like most current operators to trap on overflow
>>> (by raising SIGFPE or throwing std::overflow_error) both for
>>> signed and unsigned.
>>
>> I like that in debugging or finding problems - with tools like
>> sanitizers. But I would not want that in normal code. With this kind
>> of semantics, the compiler can't even simplify "x + 1 - 1" to "x".
>
> It may be can or may be can not that depends on wording.
> I have not really thought it thru how to word the semantics
> precisely. The major purpose is to get errors when program
> is actually storing value into type where it does not fit (IOW
> really overflows).

It is possible to consider intermediary results as "infinite precision",
which would give you that optimisation. But how would you treat:

x = x + 1;
x = x - 1;

Would that cause a trap if x started at 0x7fffffff (assuming 32-bit int)?
What if "x" is held in a register rather than stored in memory? There
are lots of possible inconsistencies here. (If it is any guide, gcc
with "-ftrapv" optimises the two statements into "x". With
"-fsanitize=signed-integer-overflow", the single expression "x + 1 - 1"
is simplified to "x", but the two statements lead to checks.)
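
In concrete terms, the two forms I mean are simply:

// Build with -O2, then with -O2 -ftrapv, then with
// -O2 -fsanitize=signed-integer-overflow, and compare the output.
int f(int x)
{
    return x + 1 - 1;   // single expression
}

int g(int x)
{
    x = x + 1;          // two statements - the intermediate value
    x = x - 1;          // is now potentially observable
    return x;
}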

Actually, it is worth looking at some gcc (or clang) generated code with
-ftrapv and -fsanitize=signed-integer-overflow, perhaps with
<https://godbolt.org>, to see just how costly trapping semantics can
be. They are very far from negligible.
Yes. It is up to the programmer to ensure the expression is written in
a way that does not overflow or lose accuracy (at least, not more than
you are willing to lose). It is up to the compiler to pick the most
efficient way to get the results requested from that expression,
according to the rules of the language (and compiler flags) that
determine the semantics of the expression.

>
> As of efficiency It is anyway often uncertain until it is
> shown where the bottlenecks are. Also it can be often only
> shown by profiling products with realistic worst case loads
> of data and then it is usually small subset of code that can
> change overall efficiency.
>
> Python I use less not because of its bad performance but
> because I have failed to use it scalably. Lots of little script
> programs are great, but when any of those starts to grow bigger
> then my productivity with those drops. For C++ same feels like
> nonsensically unimportant size. Somehow in C++ I have
> learned to separate different concerns and to abstract details
> away but not in Python.

That's fine. Different languages have their advantages and
disadvantages - and programmer preference is very subjective.

>
>>> For wrap, snap and "compiler may assume that the operation
>>> does not overflow" cases I would like new operators.
>>>
>>
>> I suggested new types for the different behaviour, but of course it is
>> the operations that have the behaviour, not the types. However, I can't
>> see a convenient way to specify overflow behaviour on operations - using
>> types is the best balance between flexibility and legible code.
>
> I mean totally new "advanced" operators like (A +% B) or (C +^ D).
> Yes there will be precedence (and may be associativity etc.)
> to define but it is business as usual and not some show-stopper
> issue. In some languages (like Swift) it is done and it seems to
> work fine.
>

Ah, okay. I think there is a big risk of confusion here, and for people
to forget which operators do what. (There is also the risk of
accidentally writing smilies in your code!).

I am happier with the type-based solution, and simply refusing to
compile combinations of types that don't make sense (like adding a
wrapping type and a saturation type). Undefined or unspecified
behaviour types can be silently promoted to other more defined behaviour
as needed.

That also has the advantage that it can be done today, with current C++.
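
A bare-bones sketch of what I mean, just to show the shape of it (names
invented, wrapping only, one width):

#include <cstdint>

// Arithmetic on this type is explicitly modulo 2^32.  A corresponding
// saturating type would provide its own operators, and no operator
// would exist for mixing the two, so such a mix fails to compile.
struct wrap_u32
{
    std::uint32_t value;
};

inline wrap_u32 operator+(wrap_u32 a, wrap_u32 b)
{
    // unsigned arithmetic wraps by definition
    return wrap_u32{ static_cast<std::uint32_t>(a.value + b.value) };
}

inline wrap_u32 operator-(wrap_u32 a, wrap_u32 b)
{
    return wrap_u32{ static_cast<std::uint32_t>(a.value - b.value) };
}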

David Brown

unread,
Nov 6, 2019, 4:20:14 PM11/6/19
to
On 06/11/2019 20:53, Manfred wrote:

> This is to say that if you want to handle integer overflow efficiently,
> then unsigned wrapping /is/ useful, it is not a "major design mistake" -
> which is what triggered all this (for me at least).

I partly agree on this (despite disagreeing about the example you gave).
Unsigned wrapping is one way to handle integer overflow efficiently.
It is not the only way, and not necessarily the best way, but it can
sometimes be useful.

One thing to note is that when the checks are handled within the
standard library (such as for C++ string concatenation), the
implementation does not need to stick to portable, standard C++. A gcc
standard library could use "#pragma GCC optimize(("fwrapv"))" and
wrapping signed integer arithmetic, if it preferred. Or it could use
__int128 types, or the overflow __builtins - whatever gives the best
results.

But if you are writing portable, standard C or C++ and need to check for
overflows, unsigned wrapping is sometimes the best solution.

However, I would say that the main reason for having unsigned integers
wrap is simply that sometimes you want wrapping behaviour - overflow
detection is only one of a number of possible applications for wrapping.

Manfred

unread,
Nov 6, 2019, 4:54:55 PM11/6/19
to
[...]

>>>
>>> First, decide if the function is an "internal" function where you can
>>> trust the parameters, and have undefined behaviour if assumptions don't
>>> hold, or an "external" function where you have to check the validity of
>>> the parameters.
>>
>> This example was written about code safety, so yes, I believe it is
>> pretty clear the assumption is that strings come from an external source.
>> Obviously this applies to string /contents/; the pointer themselves
>> can only be internal to the program (can't they?), so no need to check
>> for null pointer.
>
> I don't know the circumstances for this function, so I can't answer
> that.  But if it is an externally accessible function, I think checks
> for null pointers would make sense.

I think there is some misunderstanding about what we mean for "external"
here, and below.
This is an example of application code, in the context of code safety,
that may be written as part of the final program or as part of a
library. Either way, the pointers are in control of the programmer who
writes the program, and he does not want to pass null pointers here.

Yes, this may happen if there is a mistake somewhere in the program, but
this is not what code safety is about, it is about code correctness,
which is a somewhat different thing (although it is true that incorrect
code is almost certainly unsafe code as well).
It is for this reason that standard library functions do not check for
null on all pointers passed in, and I am sure you don't check for null
within every function you write.
Code safety as I understand it is about ensuring the program works as
expected with every kind of data input that is supplied to the /program/
- like through an I/O stream (kernel safety is a different matter of
course).
In other words, when you read from std::cin, a file or a socket, you
allocate a buffer and you know that its pointer is valid. When you reach
EOF you also know that the buffer is 0-terminated - you place a 0
yourself if the runtime does not do this for you.

What you don't know (you can't control it) is if the /contents/ between
the pointer address and the terminating 0 has a valid format, whether it
is too large or whatever, and that's what you need to ensure that your
program is robust against in order to be safe.

>
>> On the other hand, contents of the string is checked by ensuring that
>> the result of strlen and their combination is valid. This is ensured
>> /exactly/ by making use of unsigned wrapping behavior.
>>
>
> The code is going out of its way to check something that is almost
> impossible to occur, even with malicious intentions - while ignoring
> problems that could easily occur even by accident.
See above.

>
>>>
>>> If it is internal, you know the lengths of the passed strings will not
>>> sum to more than 4G - or you don't care if someone does something
>>> ridiculous.  (And on most modern systems, size_t is 64-bit - overflowing
>>> here would require 16 EB ram for storing the strings.)
>>>
>>> If it is external, the checking is too little - if you have char*
>>> pointers from an unknown source, you should be wary about running
>>> strlen() on them because you don't know if it will actually end with a 0
>>> in a reasonable limit.
>>
>> This code handles C strings, so there is no way to check for their
>> length other than running strlen.
>
> size_t limited_strlen(const char* s, size_t n) {
>     const char* p = memchr(s, 0, n);
>     if (!p) return (size_t) -1;        // Indicate bad string
>     return p - s;
> }

Incidentally, this takes a (pointer, length) pair, which is not a C string.
But that's irrelevant, I explained above that the presence of the
terminating 0 can and needs to be ensured at the calling level.

>
>> The fact that you seem to miss is that it is exactly thanks to the
>> check that you call "silly" that it is ensured that they "actually end
>> with a 0 in a reasonable limit".
>
> The checks don't do that.  If the three parameters pointed to strings
> adding up to 4 GB - 1 characters in total, by what stretch of the
> imagination is that "in a reasonable limit" ?

Define reasonable.
The checks ensure that the code will concat correctly whatever length
can be /safely/ handled by the system, and handle correctly excessive
lengths as well, thus covering the entire input domain. That's enough
for safety, with no need to introduce any dependency on an additional
and arbitrary parameter.

>
>>
>> We could argue about what happens with /read/ access to a
>> non-0-terminated string, but I would simply assume that the strings
>> are 0 terminated, since the function is going to be called by some
>> other part of the program that can take care that there is a 0 at the
>> end of the buffer.
>
> Failing to 0 terminate a character sequence that is intended to be a
> string is not an uncommon bug.
See above.

>
>> What is not guaranteed is that the strings actually contain pathnames,
>> and don't contain very long malicious text instead (e.g. they could
>> come from stdin).
>
> You need text sequences totalling more than 4 GB to cause a problem (and
> that's on older 32-bit size_t systems).  Since the user program can't
> have as much as 4 GB of data space, you'd need not only huge malicious
> texts, but you'd also need at least two of the three parameters to overlap.
>
>
>> That risk is avoided by the code you call silly.
>> So, no there is not too little checking.
>>
>>>
>>>
>>> You only need to check for overflow if it is possible for the
>>> calculations to overflow.  If the operands are too small to cause an
>>> overflow, there will not be an overflow.
>>>
Yes, keep "if the operands are too small" in mind, and see below.

[...]

>>
>>>
>>> So /if/ you have an old 32-bit size_t system, and /if/ you have
>>> maliciously crafted parameters that point to huge strings (and you'll
>>> have to make them point within the same string - you don't get over 4 GB
>>> user memory address space with 32-bit size_t), then you can do your
>>> adding up using 64-bit types and you get zero risk of overflow.
>>>
>>>
>>>          uint_least64_t dirlen = strlen (dir);
>>>          uint_least64_t filelen = strlen (fname);
>>>          uint_least64_t extlen = strlen (ext);
>>>
>>>     uint_least64_t pathlen = dirlen + filelen + extlen;
>>>
>>>     if (pathlen > MAX_SANE_PATHLENGTH) return 0;
>>
>> You realize that this code is less efficient than the original one,
>> don't you?
>
> You realise that it is /more/ efficient on any 64-bit system?

Watch out: you wrote this example for 32-bit, but on 64 bit it can fail
for large enough inputs:

// the following /can/ now silently overflow
>>> uint_least64_t pathlen = dirlen + filelen + extlen;
>>>
>>> if (pathlen > MAX_SANE_PATHLENGTH) return 0;

That is, on 64-bit it becomes unsafe.

>  And that
> on 32-bit systems it is at most marginally less efficient, but might be
> more efficient (by avoiding branches)?  And that this code does
> something /useful/ by checking for a sensible maximum pathlength?  And
> that any few instruction cycles lost here are negligible compared to the
> hundreds that are needed for copying even a small path string?
As I wrote elsethread, if you are on 32-bit, and cast to 64-bit you
don't really solve the problem, you move it ahead - in other words, you
can't think of handling integer overflow by assuming you always have a
wider type available.
Remember, the topic here is about unsigned wrapping, i.e. integer overflow.

However, without avoiding your point, your alternative works under some
conditions, but it is more wasteful of resources (and as a programmer
involved with microcontrollers and embedded software you are sensitive to
this matter). I don't see the benefit of an arbitrary
MAX_SANE_PATHLENGTH - in fact I'd rather not bother with any, and I
don't see how it would be any better than the original.

Manfred

unread,
Nov 6, 2019, 5:05:38 PM11/6/19
to
Note that here you are still assuming wrapping.

>
> And if a future C++ implementation makes unsigned wrapping undefined
Undefined? Then I believe it wouldn't work.

> and
> the above addition overflows, triggers an assertion and terminates the
> program, then this would be a better result than trying to compose a
> filename which won't fit in memory.
>
And here you are still assuming defined behavior (implying that every
single integer operation is checked at runtime).
Moreover, program termination is /not/ safe coding - you know what a DoS
attack is.

David Brown

unread,
Nov 6, 2019, 6:52:57 PM11/6/19
to
That would make it an internal function, not an external function.
Checking for absurd string lengths (and therefore the overflow) is thus
unnecessary.

> Yes, this may happen if there is a mistake somewhere in the program, but
> this is not what code safety is about, it is about code correctness,
> which is a somewhat different thing (although it is true that incorrect
> code is almost certainly unsafe code as well).
> It is for this reason that standard library functions do not check for
> null on all pointers passed in, and I am sure you don't check for null
> within every function you write.

I also don't check sizes, ranges, etc., because my functions are all
internal (since I code for small embedded systems, not libraries).
Functions have a defined set of values that are valid for their
parameters - and they can assume those rules are upheld (baring program
bugs, as you say).

> Code safety as I understand it is about ensuring the program works as
> expected with every kind of data input that is supplied to the /program/
> - like through an I/O stream (kernel safety is a different matter of
> course).

Correct.

> In other words, when you read from std::cin, a file or a socket, you
> allocate a buffer and you know that its pointer is valid. When you reach
> EOF you also know that the buffer is 0-terminated - you place a 0
> yourself if the runtime does not do this for you.
>
> What you don't know (you can't control it) is if the /contents/ between
> the pointer address and the terminating 0 has a valid format, whether it
> is too large or whatever, and that's what you need to ensure that your
> program is robust against in order to be safe.

Agreed. So you check it for validity. 4 GB path strings are not valid
input. You don't need to check for overflow on the sum of the lengths -
you need to check that the data is sane.

>
>>
>>> On the other hand, contents of the string is checked by ensuring that
>>> the result of strlen and their combination is valid. This is ensured
>>> /exactly/ by making use of unsigned wrapping behavior.
>>>
>>
>> The code is going out of its way to check something that is almost
>> impossible to occur, even with malicious intentions - while ignoring
>> problems that could easily occur even by accident.
> See above.
>
>>
>>>>
>>>> If it is internal, you know the lengths of the passed strings will not
>>>> sum to more than 4G - or you don't care if someone does something
>>>> ridiculous.  (And on most modern systems, size_t is 64-bit -
>>>> overflowing
>>>> here would require 16 EB ram for storing the strings.)
>>>>
>>>> If it is external, the checking is too little - if you have char*
>>>> pointers from an unknown source, you should be wary about running
>>>> strlen() on them because you don't know if it will actually end with
>>>> a 0
>>>> in a reasonable limit.
>>>
>>> This code handles C strings, so there is no way to check for their
>>> length other than running strlen.
>>
>> size_t limited_strlen(const char* s, size_t n) {
>>      const char* p = memchr(s, 0, n);
>>      if (!p) return (size_t) -1;        // Indicate bad string
>>      return p - s;
>> }
>
> Incidentally, this takes a (pointer, length) pair, which is not a C string.

It is checking that the non-null const char * pointer points to a C
string of no greater than n in length. It will return a failure
indication if that is not the case (without bothering to check if it is
a C string that is too long, or if it is not a string at all).

> But that's irrelevant, I explained above that the presence of the
> terminating 0 can and needs to be ensured at the calling level.

The checks need to be made somewhere. I don't know where this function
fits in the scheme of things, so I don't know whether it makes sense to
check in the function, or check before calling the function. But I do
know that calculating the strlen values and checking for overflow on
adding them fails to check for real problems if checks are needed, and
is redundant if checks are not needed.

>
>>
>>> The fact that you seem to miss is that it is exactly thanks to the
>>> check that you call "silly" that it is ensured that they "actually
>>> end with a 0 in a reasonable limit".
>>
>> The checks don't do that.  If the three parameters pointed to strings
>> adding up to 4 GB - 1 characters in total, by what stretch of the
>> imagination is that "in a reasonable limit" ?
>
> Define reasonable.

That is the job of the person specifying the function.

> The checks ensure that the code will concat correctly whatever length
> can be /safely/ handled by the system, and handle correctly excessive
> lengths as well, thus covering the entire input domain. That's enough
> for safety, with no need to introduce any dependency on an additional
> and arbitrary parameter.

Nonsense. 4 GB path strings - and 16 EB path strings on 64-bit systems
- cannot be handled safely by the system. They might be blocked by
something further down the chain of calls, but you are not going to
persuade anyone that 4 GB path strings are sensible and can be used
safely for the purpose of dealing with files.
I know exactly what I wrote. And I know that it is safe - because you
won't get overflow until strlen has finished counting these 16 EB long
strings, and the earth has melted into the dying sun.

>
>   And that
>> on 32-bit systems it is at most marginally less efficient, but might
>> be more efficient (by avoiding branches)?  And that this code does
>> something /useful/ by checking for a sensible maximum pathlength?  And
>> that any few instruction cycles lost here are negligible compared to
>> the hundreds that are needed for copying even a small path string?
> As I wrote elsethread, if you are on 32-bit, and cast to 64-bit you
> don't really solve the problem, you move it ahead - in other words, you
> can't think of handling integer overflow by assuming you always have a
> wider type available.
> Remember, the topic here is about unsigned wrapping, i.e. integer overflow.

And I have been demonstrating how it is not appropriate in your example
- it is neither necessary nor sufficient.

>
> However, without avoiding your point, your alternative works under some
> conditions, but it is more wasteful of resources (and as a programmer
> involved with microcontrollers and embedded software you are sensible to
> this matter), I don't see the benefit of an arbitratry
> MAX_SANE_PATHLENGTH - in fact I'd rather not bother about any, and I
> don't see how it would be any better than the original.
>

Sorry, but my solution is /not/ wasteful of resources on a realistic
system - it is more efficient for 64-bit systems.

(You would be unlikely to have such a function on a small
microcontroller, and you know your limits on such a system - you would
not need to check for overflow.)

>>
>>> And what would be the correct value for MAX_SANE_PATHLENGTH? Are you
>>> aware of the trouble that has been caused by Windows MAX_PATH?
>>
>> You'd pick a big value for it - but a lot less than 4G.  Most
>> filesystems limit file names to 255 bytes.  What would you say is a
>> maximum depth of directories before it is clear that you have
>> thoroughly screwed up somewhere - maybe 1000?  So pick
>> MAX_SANE_PATHLENGTH at 256K on 32-bit systems, and 16M on 64-bit
>> systems.  People with more knowledge of the target system in question
>> can pick better values.
>>
>> POSIX used to have a limit of 4096, which was okay in the old days but
>> got a bit limiting.  Windows is limited to 32K now, which seems to be
>> fine - the old limit was 260 characters which is clearly too small.
>>
>>>
>>> The example I posted achieves the same level of safety, using less
>>> resources, and allowing for the maximum string length that the system
>>> can /safely/ handle (don't miss the check after malloc). What more do
>>> you want?
>>
>> See what I wrote above.
>>
>>>
>>>>
>>>>
>>>> There are times when unsigned wrapping overflow is useful.  This is not
>>>> one of them.
>>>>
>>>
>>> I suggest you read the code again (and its source - it is instructive)
>>
>> I did.
>

Then let's leave it there. I don't think there is much to be gained by
continuing in this line.

Paavo Helde

unread,
Nov 7, 2019, 3:23:19 AM11/7/19
to
Yes, one needs to protect against attacks. However, if somebody has
managed to send in gigabytes or exabytes of path strings into my program
the DoS attack has already happened, as storing and processing them will
take a lot of resources.

I agree that one needs to check against overflows where appropriate, but
this pathname concatenation example is about the silliest one could
imagine. If you look at the original page
"http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1969.htm" you see that
this is not the part he is demonstrating or discussing (it's not colored
in red like rsize_t additions). That's just some lines which he has
copied over from the earlier "unsafe" sample without any changes. He
does not argue that this is the best way to check for integer overflows,
that's just some copy-paste lines which are short and do no harm.

Note that the whole page is about *buffer overflows*, not *integer
overflows*.

Öö Tiib

unread,
Nov 8, 2019, 5:02:03 AM11/8/19
to
It is all very similar to the issue where the compiler uses 80-bit
long double (as best supported by hardware) to do mid-way
double arithmetic, and so the end result does not saturate into
INF or -INF on that platform but it does on another that uses
64-bit double all the way.
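
For example (what it prints depends on whether intermediates stay in
80-bit registers):

#include <cfloat>
#include <cstdio>

int main()
{
    volatile double big = DBL_MAX;   // volatile to keep it a runtime value
    // With strict 64-bit evaluation big * 2.0 is already +inf and the
    // division cannot bring it back.  With 80-bit x87 intermediates the
    // product is still finite and d can come out equal to DBL_MAX.
    double d = big * 2.0 / 2.0;
    std::printf("%g\n", d);
    return 0;
}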

I feel dealing with such an issue to be a *lot* easier than dealing
with utter undefined behavior (which is what all such examples are by
current scripture). For me every solution is better than
undefined behavior.

> Actually, it is worth looking at some gcc (or clang) generated code with
> -ftrapv and -fsanitize=signed-integer-overflow, perhaps with
> <https://godbolt.org>, to see just how costly trapping semantics can
> be. They are very far from negligible.

I have looked and measured its actual impact quite a lot. My
thinking of it (is surely biased but it) revolves around handful of
points.

1) It is about default behavior, in most code its efficiency does
not matter and for where it matters there indeed *must* be
alternatives.

2) Efficiency of sub-steps of processing usually does matter
where we intensively process lot of different data. With lot of data
we do hit memory latency. Processing has been for decades so
much cheaper that it can be worth to reduce memory latency
problems by adding even more processing. For example 1:4
uncompressing is faster than copy.
Note: it is ultra rare case where speed of memcpy is our actual
bottleneck and not stupidity of copying same data over and
over too lot or even copying it at all. Same is with processing
same thing pointlessly over and over.

3) World evolves. The Swift, Rust, Go, and D communities
already ask for hardware support for trapping integer
conversions and trapping arithmetic. People like the languages.
If hardware starts to support it better then C++ will seem
backwards without convenient ways to use trapping integers
in production code.

4) Compilers will evolve. What happened to "register" or
"inline"? Same can happen with usage of current "assume
that it does not overflow" operations. Just make those not
default but performance optimizations like register and
inline.

> >>> For wrap, snap and "compiler may assume that the operation
> >>> does not overflow" cases I would like new operators.
> >>>
> >>
> >> I suggested new types for the different behaviour, but of course it is
> >> the operations that have the behaviour, not the types. However, I can't
> >> see a convenient way to specify overflow behaviour on operations - using
> >> types is the best balance between flexibility and legible code.
> >
> > I mean totally new "advanced" operators like (A +% B) or (C +^ D).
> > Yes there will be precedence (and may be associativity etc.)
> > to define but it is business as usual and not some show-stopper
> > issue. In some languages (like Swift) it is done and it seems to
> > work fine.
> >
>
> Ah, okay. I think there is a big risk of confusion here, and for people
> to forget which operators do what. (There is also the risk of
> accidentally writing smilies in your code!).

It is not that. We see wrapping and saturating is used rarely.
Also wrapping is on half of cases used without need where
it can confuse like "for (unsigned u = 10; u < 11; --u)".
There will not be lot of mixed usages, just sometimes.

> I am happier with the type-based solution, and simply refusing to
> compile combinations of types that don't make sense (like adding a
> wrapping type and a saturation type). Undefined or unspecified
> behaviour types can be silently promoted to other more defined behaviour
> as needed.

That saturating type perhaps has to be separate type indeed. It
will have "magical" values and so needs separate protection.
On rest of cases operators feel to be best. About too implicit
conversions ... it is another source of hard-to-notice defects.
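
For example a saturating add can be sketched like that (just a sketch):

#include <limits>

// Saturating addition for int: clamps to the representable range
// instead of overflowing, so no undefined behaviour is reachable.
inline int sat_add(int a, int b)
{
    const int hi = std::numeric_limits<int>::max();
    const int lo = std::numeric_limits<int>::min();
    if (b > 0 && a > hi - b) return hi;   // would overflow upwards
    if (b < 0 && a < lo - b) return lo;   // would overflow downwards
    return a + b;
}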

> That also has the advantage that it can be done today, with current C++.

Life has shown that library features (or even worse "debugging tools")
as opposed to language features will be adopted slowly by programmers
and the compiler writers are even slower to start thinking in correct
direction.

David Brown

unread,
Nov 8, 2019, 8:03:54 AM11/8/19
to
Yes - and that is a real mess for standardisation and repeatability.
It's okay if you are not bothered about the portability and
repeatability of your floating point calculations, and are happy to
accept that an unrelated change in one part of your program can lead to
different results somewhere else just because it changed when floating
point registers get moved out into the stack. If that's fine, you can
use "-ffast-math" or equivalent switches and tell your compiler it is
free to make such code. (Personally, I use "-ffast-math" in my own code
- but it is not something everyone wants.)

I do not want that situation in integer code. I fully expect the
compiler to generate identical code - equally efficient, equal results,
equal levels of static checking - whether I break a calculation into
multiple statements or have it all as one expression. I want to make
the decisions about the arrangements based on the clarity of the code,
not on where I want the compiler to optimise or where I want it to
insert hidden range checks.

I am much happier with the function of
"-fsanitize=signed-integer-overflow" here - it generates more checks,
less efficient code, but the functionality is clear and consistent.
I'll use that when I am happy to take the performance hit in the name of
debugging code.

> I feel dealing with such an issue to be a *lot* easier than dealing
> with utter undefined behavior (which is what all such examples are by
> current scripture). For me every solution is better than
> undefined behavior.

The whole point is that you /don't/ hit undefined behaviour. Any time
your code attempts to execute something with undefined behaviour, it is
a bug in your code. Tools (like sanitizers) that help you find the
undefined behaviour are good. Static analysis can help too. Once you
have your code correct, you are not doing anything undefined, and there
is no need for run-time checks that can't ever be triggered.

To me, undefined behaviour is /better/ than the alternatives for most
purposes. It helps me write better and more correct code, and as a
bonus it is more efficient.

Maybe this is something that stems from my programming education - I was
taught in terms of specifications. Any function has a specification
that gives its pre-conditions and its post-conditions. The function
will assume the pre-conditions are valid, and establish the
post-conditions. If you don't ensure you satisfy the pre-conditions
before calling the function, you have no right to expect any known
behaviour at all from the function. This principle is popularly known
as "garbage in, garbage out". It has been understood since the birth of
the programmable computer:

"""
On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into
the machine wrong figures, will the right answers come out?' I am not
able rightly to apprehend the kind of confusion of ideas that could
provoke such a question.
"""

If you want to say that some programmers don't know that integer
overflow is undefined behaviour, and think it wraps, then blame the way
C and C++ is taught and what the books say, and do something about that
- don't try to dumb down and water out the language.

>
>> Actually, it is worth looking at some gcc (or clang) generated code with
>> -ftrapv and -fsanitize=signed-integer-overflow, perhaps with
>> <https://gotbolt.org>, to see just host costly trapping semantics can
>> be. They are very far from negligible.
>
> I have looked and measured its actual impact quite a lot. My
> thinking of it (is surely biased but it) revolves around handful of
> points.
>
> 1) It is about default behavior, in most code its efficiency does
> not matter and for where it matters there indeed *must* be
> alternatives.

C++ (and C) are often chosen because of their efficiency. If you start
adding run-time tests everywhere, you lose that. Why would you then use
C and C++ at all? If you want extra safety and are willing to pay the
efficiency price, there are other languages to choose from - such as
Java or C#. When you choose to program in C or C++, you are accepting a
responsibility to program carefully and correctly. You are picking a
language that trusts the programmer. Why would you then want the
language or compiler to tell you "I /know/ you claim to be a good
programmer who knows the language and codes carefully - but I don't
believe you, and I'm going to nanny you all the way" ?

And once you start saying "C++ programmers can't be trusted to write
expressions that don't overflow", where do you stop? Can you trust them
to use pointers, or do you want run-time checks on all uses of pointers?
Can you trust them to use arrays? Function pointers? Memory
allocation? Multi-threading? There are many, many causes of faults and
bugs in C++ programs that are going to turn up a lot more often than
integer overflows (at least in the 32-bit world - most calculations
don't come close to that limit).

>
> 2) Efficiency of sub-steps of processing usually does matter
> where we intensively process lot of different data. With lot of data
> we do hit memory latency. Processing has been for decades so
> much cheaper that it can be worth to reduce memory latency
> problems by adding even more processing. For example 1:4
> uncompressing is faster than copy.
> Note: it is ultra rare case where speed of memcpy is our actual
> bottleneck and not stupidity of copying same data over and
> over too lot or even copying it at all. Same is with processing
> same thing pointlessly over and over.
>
> 3) World evolves. The Swift, Rust, Go, and D communities
> already ask for hardware support for trapping integer
> conversions and trapping arithmetic. People like the languages.
> If hardware starts to support it better then C++ will seem
> backwards without convenient ways to use trapping integers
> in production code.

I have nothing against using trapping integer arithmetic in cases where
it is useful. But it should only be as an active choice - either by
choosing special types in the code, or by enabling debug options in the
tools.

And hardware-assisted trapping cannot reach the efficiency of having
overflow as undefined behaviour, or even of wrapping, precisely for the
reasons I gave above about optimising "x + 1 - 1".

>
> 4) Compilers will evolve. What happened to "register" or
> "inline"? Same can happen with usage of current "assume
> that it does not overflow" operations. Just make those not
> default but performance optimizations like register and
> inline.
>
>>>>> For wrap, snap and "compiler may assume that the operation
>>>>> does not overflow" cases I would like new operators.
>>>>>
>>>>
>>>> I suggested new types for the different behaviour, but of course it is
>>>> the operations that have the behaviour, not the types. However, I can't
>>>> see a convenient way to specify overflow behaviour on operations - using
>>>> types is the best balance between flexibility and legible code.
>>>
>>> I mean totally new "advanced" operators like (A +% B) or (C +^ D).
>>> Yes there will be precedence (and may be associativity etc.)
>>> to define but it is business as usual and not some show-stopper
>>> issue. In some languages (like Swift) it is done and it seems to
>>> work fine.
>>>
>>
>> Ah, okay. I think there is a big risk of confusion here, and for people
>> to forget which operators do what. (There is also the risk of
>> accidentally writing smilies in your code!).
>
> It is not that. We see wrapping and saturating is used rarely.

And thus it should be written explicitly, with types that have names.
Symbol combinations are easily mixed up and forgotten, especially if
they are rarely used.

One option - which could easily be implemented in C++ of today - would
be to write "x + wrapping + y" as a wrapping addition operator.
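
A rough sketch of how that might look today, assuming made-up names
(wrapping_t, wrapping_lhs - nothing like this exists in the standard
library or any proposal I know of):

#include <type_traits>

struct wrapping_t {};
constexpr wrapping_t wrapping{};

template <typename T>
struct wrapping_lhs { T value; };   // remembers the left operand

// x + wrapping  ->  wrapping_lhs<T>{x}
template <typename T,
          typename = std::enable_if_t<std::is_integral<T>::value &&
                                       std::is_signed<T>::value>>
constexpr wrapping_lhs<T> operator+(T x, wrapping_t) { return {x}; }

// (x + wrapping) + y  ->  modular addition via unsigned arithmetic
template <typename T>
constexpr T operator+(wrapping_lhs<T> lhs, T y)
{
    using U = std::make_unsigned_t<T>;
    // unsigned addition wraps; the conversion back to T is
    // implementation-defined before C++20 and modular from C++20 on
    return static_cast<T>(static_cast<U>(lhs.value) + static_cast<U>(y));
}

// usage:  int r = x + wrapping + y;   // wraps instead of being UB on overflow

The left-to-right associativity of the built-in + is what makes the
"x + wrapping + y" spelling parse as intended, without inventing any new
precedence rules.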

> Also wrapping is on half of cases used without need where
> it can confuse like "for (unsigned u = 10; u < 11; --u)".

That is an example of well-defined behaviour that is void of meaning.

> There will not be lot of mixed usages, just sometimes.
>
>> I am happier with the type-based solution, and simply refusing to
>> compile combinations of types that don't make sense (like adding a
>> wrapping type and a saturation type). Undefined or unspecified
>> behaviour types can be silently promoted to other more defined behaviour
>> as needed.
>
> That saturating type perhaps has to be separate type indeed. It
> will have "magical" values and so needs separate protection.
> On rest of cases operators feel to be best. About too implicit
> conversions ... it is another source of hard-to-notice defects.
>

I would be careful about providing implicit conversions.

>> That also has the advantage that it can be done today, with current C++.
>
> Life has shown that library features (or even worse "debugging tools")
> as opposed to language features will be adopted slowly by programmers
> and the compiler writers are even slower to start thinking in correct
> direction.
>

Life has shown that library features are precisely how C++ has evolved
in the last decade or two - language features have mostly been added in
order to make library features safer to use, easier to use, more
efficient, or easier to implement. The next big leap for the C++
language - metaclasses - is precisely so that we can write libraries
that handle things that would previously need language changes.

Öö Tiib

unread,
Nov 8, 2019, 3:08:47 PM11/8/19
to
Yes, and for me it is fine if the standard requires "x + 1 - 1" to trap
if x + 1 does overflow. Compilers will add their fast-math options
in an eye-blink anyway.

> I do not want that situation in integer code.

Then there won't be.

> I am much happier with the function of
> "-fsanitize=signed-integer-overflow" here - it generates more checks,
> less efficient code, but the functionality is clear and consistent.
> I'll use that when I am happy to take the performance hit in the name of
> debugging code.

It is not trapping math there but a debugging tool that catches "undefined
behavior". It does not help me in cases where I need trapping maths.

>
> > I feel dealing with such issue" to be *lot* easier than dealing
> > with utter undefined behavior (that all such examples are by
> > current scripture). For me every solution is better than
> > undefined behavior.
>
> The whole point is that you /don't/ hit undefined behaviour.

The whole issue that you seemingly don't understand is that there are
cases (and those seem to be the majority) where I need neither undefined
behavior nor wrapping behavior nor arbitrarily growing precision.
I want it to trap by default and sometimes to have snapping behavior.

But I have either undefined behavior with signed, or wrapping
behavior with unsigned plus non-portable intrinsic functions.
So I have to choose which hack is uglier for writing trapping or
snapping myself manually, and everyone else in the same situation
has to do the same.

> To me, undefined behaviour is /better/ than the alternatives for most
> purposes. It helps me write better and more correct code, and as a
> bonus it is more efficient.

It is better only when I know that it can in no way overflow, because the
values make sense in the thousands while the variables can count in the billions.

> Maybe this is something that stems from my programming education - I was
> taught in terms of specifications. Any function has a specification
> that gives its pre-conditions and its post-conditions. The function
> will assume the pre-conditions are valid, and establish the
> post-conditions. If you don't ensure you satisfy the pre-conditions
> before calling the function, you have no right to expect any known
> behaviour at all from the function. This principle is known popularly
> as "garbage in, garbage out". It has been understood since the birth of
> the programmable computer:
>
> """
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into
> the machine wrong figures, will the right answers come out?' I am not
> able rightly to apprehend the kind of confusion of ideas that could
> provoke such a question.
> """

Yes, and I want to turn a language full of specified undefined
behaviors, contradictions, useless features and defects into
programs that have zero of them. When wrong figures
are put into my programs then I want the programs to give
answers that follow from those, and when contradicting figures
are put in then I want my programs to refuse in a civil manner.
For me a quarter of the good cases working is not good
enough quality. And I'm frustrated when I can't find a way to
express that and instead have to use some kind of nonconforming
hacks from the compiler vendor.

> If you want to say that some programmers don't know that integer
> overflow is undefined behaviour, and think it wraps, then blame the way
> C and C++ is taught and what the books say, and do something about that
> - don't try to dumb down and water out the language.

A language that has neither simple features for this nor convenient ways to
implement them is dumbed down by its makers, not by me.

> >
> >> Actually, it is worth looking at some gcc (or clang) generated code with
> >> -ftrapv and -fsanitize=signed-integer-overflow, perhaps with
> >> <https://gotbolt.org>, to see just host costly trapping semantics can
> >> be. They are very far from negligible.
> >
> > I have looked and measured its actual impact quite a lot. My
> > thinking of it (is surely biased but it) revolves around handful of
> > points.
> >
> > 1) It is about default behavior, in most code its efficiency does
> > not matter and for where it matters there indeed *must* be
> > alternatives.
>
> C++ (and C) are often chosen because of their efficiency. If you start
> adding run-time tests everywhere, you lose that. Why would you then use
> C and C++ at all? If you want extra safety and are willing to pay the
> efficiency price, there are other languages to choose from - such as
> Java or C#. When you choose to program in C or C++, you are accepting a
> responsibility to program carefully and correctly. You are picking a
> language that trusts the programmer. Why would you then want the
> language or compiler to tell you "I /know/ you claim to be a good
> programmer who knows the language and codes carefully - but I don't
> believe you, and I'm going to nanny you all the way" ?

The languages C and C++ stopped trusting me ages ago. When I write
"register" they treat it like I actually meant "I don't want pointers
to it", and when I write "inline" they read it as "I want to
define it in a header file". English is, yes, my 4th language, but did I
really write that?

> And once you start saying "C++ programmers can't be trusted to write
> expressions that don't overflow", where do you stop? Can you trust them
> to use pointers, or do you want run-time checks on all uses of pointers?
> Can you trust them to use arrays? Function pointers? Memory
> allocation? Multi-threading? There are many, many causes of faults and
> bugs in C++ programs that are going to turn up a lot more often than
> integer overflows (at least in the 32-bit world - most calculations
> don't come close to that limit).

Sometimes I want to write code where overflow is not my programming
error but an allowable contradiction in the input data. There I want to
throw, or to get INF as the result, on overflow. In a similar manner I
want it to throw or to produce NaN when the request is to add kilograms
to meters, to subtract them from degrees of angle, or to take the
diameter of the 11th apple out of 10. The choice between silent and
trapping failures depends on whether I want to process the whole data
regardless of inconsistencies or constraint violations in it, or to
stop at the first such problem in the data. Don't such use-cases make sense?

>
> >
> > 2) Efficiency of sub-steps of processing usually does matter
> > where we intensively process lot of different data. With lot of data
> > we do hit memory latency. Processing has been for decades so
> > much cheaper that it can be worth to reduce memory latency
> > problems by adding even more processing. For example 1:4
> > uncompressing is faster than copy.
> > Note: it is ultra rare case where speed of memcpy is our actual
> > bottleneck and not stupidity of copying same data over and
> > over too lot or even copying it at all. Same is with processing
> > same thing pointlessly over and over.
> >
> > 3) World evolves. The Swift, Rust, Go, and D communities
> > already ask for hardware support for trapping integer
> > conversions and trapping arithmetic. People like the languages.
> > If hardware starts to support it better then C++ will seem
> > backwards without convenient ways to use trapping integers
> > in production code.
>
> I have nothing against using trapping integer arithmetic in cases where
> it is useful. But it should only be as an active choice - either by
> choosing special types in the code, or by enabling debug options in the
> tools.
>
> And hardware-assisted trapping cannot reach the efficiency of having
> overflow as undefined behaviour, or even of wrapping, precisely for the
> reasons I gave above about optimising "x + 1 - 1".

Currently there is outright nothing. Integer math is in the same state
(or even worse) than it was in the eighties.
Technically I mostly use wrapping to implement trapping, snapping or
otherwise refusing math. The code makes everybody's eyeballs bleed
(sorry if you happen to be a fan of those __builtin_add_overflow
things), but if there were any better way to have refusing math I
would not touch wrapping math at all.
Robert Ramey started writing his safe throwing integer library a decade
ago and it still isn't added to Boost, I think.
Too slow, and too much goes into useless semantic garbage. At least
std::byte gets useful in C++20. We may finally assign a whopping char
to it without UB in half of the cases. :D
https://wg21.cmeerw.net/cwg/issue2338
So there are tiny bits of progress and good news all the time.

Tim Rentsch

unread,
Nov 9, 2019, 8:55:38 AM11/9/19
to
"Alf P. Steinbach" <alf.p.stein...@gmail.com> writes:

> On 05.11.2019 03:55, Siri Cruise wrote:
>
>> In article <qppva0$cbj$1...@news.albasani.net>,
>> Bonita Montero <Bonita....@gmail.com> wrote:
>>
>>> What would realls speak against that numieric_limits<type> would include
>>> a flag whether a type has a two's complement and maybe another flag that
>>> says that a type has a two's complement wrap-around?
>>
>> #define ONESCOMPLEMENT (-1==~1)
>> #define TWOSCOMPLEMENT (-1==~0)
>
> Nice idea.
>
> But please use `constexpr` or `const` declarations, not macros.
>
> Using macros conveys the idea that macros are a reasonable solution to
> the problem of naming a compile time constant.

Why do C++ fans hate the C preprocessor so much?

Tim Rentsch

unread,
Nov 9, 2019, 9:13:21 AM11/9/19
to
Manfred <non...@add.invalid> writes:

> My point is about what to do with the fact that many times
> overflow is a programming error: even if this is true I think that
> unsigned overflow should have defined behavior (and wrap) rather
> than being handled as an error by the compiler.

I absolutely concur. Your other comments in the thread also are
right on the money. Frankly I am rather baffled by the comments
of those on the other side of what you've been saying.

Tim Rentsch

unread,
Nov 9, 2019, 9:24:57 AM11/9/19
to
Paavo Helde <myfir...@osa.pri.ee> writes:

> On 5.11.2019 17:58, Bonita Montero wrote:
>
>> With the same reasoning you could say that unsigneds might never
>> wrap; but in fact they're specified to wrap.
>
> In retrospect, this (wrapping unsigneds) looks like a major design mistake.
>
> IMO, wrapping integers (signed or unsigned) are an example of
> "optimization which nobody asked for", and they are there basically
> only because the hardware happened to support such operations.

Apparently you think how the hardware behaves (specifically, for
unsigned or two's complement operands) is some sort of accident
of history. It isn't. It was a deliberate choice, made by smart
people. It isn't just a coincidence that the semantics of
wrapping works nicely in a variety of applications.

Siri Cruise

unread,
Nov 9, 2019, 9:34:28 AM11/9/19
to
In article <86tv7dc...@linuxsc.com>,
Because C++ is dedicated to new, obscure, ambiguous syntax every time anyone has
a hangnail.

Do you want to ensure a variable is defined before first use? Well, you could
dedicate years of effort to synchronising loaders and compilers with new object
file tables, or

#define variable (*variable_())
type_t* variable_ () {
    static lock_t lock = unlocked;
    static bool defined = false;
    static type_t v;
    if (!defined) {
        if (!try_lock(lock)) error("circular initialisation");
        if (!defined) {
            defined = true;
            v = (initialisation());
        }
        unlock(lock);
    }
    return &v;
}


Öö Tiib

unread,
Nov 9, 2019, 9:46:16 AM11/9/19
to
It is not hate. We prefer to use macros when what these do can not
be done as simply by using inline functions, templates or constants.

Reasons are that macros are names that ignore scope and work on
textual substitutions. Macros are harder to debug and cause
unanticipated effects and confusion in clumsy hands. However for
teams that manage to hire members with godly hands macros are
fine. :D

Siri Cruise

unread,
Nov 9, 2019, 10:14:52 AM11/9/19
to
In article <86lfspc...@linuxsc.com>,
CPUs with an overflow flag note additions that overflow the register. A compiler
can implement integer overflow exceptions on such CPUs.

Öö Tiib

unread,
Nov 9, 2019, 10:46:23 AM11/9/19
to
Since C++11 such singletons can be written like:

type_t* variable_ () {
    static type_t v = (initialisation());
    return &v;
}
It can deadlock when (initialisation()) contains recursive
calls to variable_(), but such recursion does not make sense
and would have to be fixed in the program anyway.

Bonita Montero

unread,
Nov 9, 2019, 11:03:12 AM11/9/19
to
>> Using macros conveys the idea that macros are a reasonable solution to
>> the problem of naming a compile time constant.

> Why do C++ fans hate the C preprocessor so much?

Because most of the time where you would use macros you can use inline
functions, templated or not, as well; and their advantage over macros is
that they can be debugged. And substituting larger templated functions
with macros is a mess.

David Brown

unread,
Nov 10, 2019, 5:52:18 AM11/10/19
to
Wrapping in hardware was not a deliberate choice of semantics - it was a
deliberate choice of the simplest, fastest and most efficient hardware
with little concern for the semantics.

There are three primary choices for defined behaviour of signed integer
overflow - saturation, trapping, and wrapping. None of these matches
mathematical integer behaviour, of course. Saturation is a natural and
useful choice in many circumstances. Trapping (including NaN
implementations) is a good way of seeing that something has gone wrong,
so that you either have the right answer, or know there has been an
error. Wrapping has no logical use or point as a semantic of an
operation. It is, however, a very convenient aid to multi-precision
arithmetic.
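
To illustrate what the saturation choice means in code, a minimal sketch
(saturate_add is a made-up name, and it assumes int is narrower than
long long) could be:

#include <limits>

constexpr int saturate_add(int a, int b)
{
    long long sum = static_cast<long long>(a) + b;   // wide enough, cannot overflow here
    if (sum > std::numeric_limits<int>::max()) return std::numeric_limits<int>::max();
    if (sum < std::numeric_limits<int>::min()) return std::numeric_limits<int>::min();
    return static_cast<int>(sum);
}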

Implementations of signed integer representations other than two's
complement, such as ones' complement and sign-magnitude, do not
typically have wrapping semantics - they will have trapping or
saturation semantics, if they have defined semantics.

But in hardware, two's complement representation is the simplest and
most efficient method. And if you have wrapping for addition and
subtraction, it becomes easily expandable - a vital feature from the
days of 8-bit ALUs being used for 16-bit, 32-bit and larger arithmetic,
and still useful occasionally. This is related to the reason why in
many CPUs there is a "carry" flag for addition that doubles as the
"not borrow" flag for subtraction - it means you implement the
subtraction "x - y" as "x + ~y + 1" (an add with the carry-in set) and
re-use almost all the hardware elements.

Saturation and trapping cost hardware - the wrapping is free. It would
be the standard in hardware even if it didn't have the use for
multi-precision arithmetic.

Paavo is right when viewed from the high-level software perspective -
wrapping is not something people looked for, either for signed or
unsigned arithmetic. For very low-level software, it /is/ useful in the
case where you are building up larger arithmetic operations from smaller
ones, and for certain kinds of comparisons - these are things that are
mostly hidden by the compiler in a high level language like C.

David Brown

unread,
Nov 10, 2019, 7:17:25 AM11/10/19
to
I am not at all keen on the idea of the standards defining certain
behaviour and then compilers having flags to disregard those definitions
- that is, I think, a terrible idea. If trapping were to be added to
the standards, it would be much better if the standards offered a
user-selectable choice (I suppose via pragmas).

Remember, features like gcc's "-ffast-math" do not instruct the compiler
to ignore parts of the /C/ standard - they ignore parts of the IEEE
floating point standards. This is a very different matter. As far as I
know, "gcc -ffast-math" is no more non-compliant than "gcc".

>
>> I do not want that situation in integer code.
>
> Then there won't be.

I am entirely confident that trapping on overflow will never be required
by the C or C++ standards anyway, so this is hypothetical. But it is
feasible that there could be optional support defined (such as via a new
library section in C++), and there I would want the consistency.

>
>> I am much happier with the function of
>> "-fsanitize=signed-integer-overflow" here - it generates more checks,
>> less efficient code, but the functionality is clear and consistent.
>> I'll use that when I am happy to take the performance hit in the name of
>> debugging code.
>
> It is not trapping math there but debugging tool that caught "undefined
> behavior". It does not help me on cases when I need trapping maths.
>

Trapping maths can be handled in many ways - kill with an error message
is often useful in debugging, but as you say it is not useful for
handling it within the program. Throwing an exception, calling a trap
handler, returning a signalling NaN, are other alternatives. My point
was that I wanted the semantics of -fsanitize=signed-integer-overflow
for detecting the overflow - how that overflow is then handled is another matter.

>>
>>> I feel dealing with such issue" to be *lot* easier than dealing
>>> with utter undefined behavior (that all such examples are by
>>> current scripture). For me every solution is better than
>>> undefined behavior.
>>
>> The whole point is that you /don't/ hit undefined behaviour.
>
> The whole issue that you seemingly don't understand is that there are
> cases (and those seem to be majority) where I don't need to have undefined
> behaviors nor wrapping behaviors nor arbitrarily growing precisions.
> I want it to trap by default and sometimes to have snapping behavior.
>
> But I have either undefined behavior with signed or wrapping
> behavior with unsigned and non-portable intrinsic functions.
> So I have to choose what hack is uglier to write trapping or
> snapping myself manually and everyone else in same situation
> like me has to do same.

I am all in favour of choice, but there are a few things on which I
disagree with you. First, you seem to want trapping to be the default in
the standards - that is a very costly idea to impose on everyone else, as
well as being rather difficult to specify and implement.

Secondly, you seem to want this for all arithmetic. I'm fine with using
trapping explicitly in code when it is particular useful - that would
work naturally with exceptions in C++. But I would want the lower-level
code that is not going to overflow, to be free from the cost of this
checking. (Compilers might be able to eliminate some of the overhead,
but they won't get everything.)

Thirdly, you seem to think trapping is useful semantics in general - I
disagree. There are times when it could be convenient in code where
overflow is a realistic but unlikely scenario, but mostly it sounds like
you are happy to release code that is buggy as long as the consequences
of the bugs are somewhat limited. I don't like that attitude, and I
don't think you do either - so there is maybe still something I don't
understand here.

>
>> To me, undefined behaviour is /better/ than the alternatives for most
>> purposes. It helps me write better and more correct code, and as a
>> bonus it is more efficient.
>
> It is better only when I know that it can no way overflow because the
> values make sense in thousands while the variables can count in billions.

To me, that should /always/ be the case - though I am quite happy with
values that only make sense up to a thousand with a variable that can
count to a thousand. I always want to know the limits of my values, and
there is no need to work with bigger sizes than needed.

>
>> Maybe this is something that stems from my programming education - I was
>> taught in terms of specifications. Any function has a specification
>> that gives its pre-conditions and its post-conditions. The function
>> will assume the pre-conditions are valid, and establish the
>> post-conditions. If you don't ensure you satisfy the pre-conditions
>> before calling the function, you have no right to expect any known
>> behaviour at all from the function. This principle is known popularly
>> as "garbage in, garbage out". It has been understood since the birth of
>> the programmable computer:
>>
>> """
>> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into
>> the machine wrong figures, will the right answers come out?' I am not
>> able rightly to apprehend the kind of confusion of ideas that could
>> provoke such a question.
>> """
>
> Yes and I want to turn the language full of specified undefined
> behaviors and contradictions and useless features and defects into
> programs where there are zero.

You can't. It won't work. (Well, you can probably deal with
contradictions in the language, though I don't know what you are
referring to here.) Languages /always/ have undefined behaviours. And
there is rarely a benefit in turning undefined behaviour into defined
behaviour unless that behaviour is correct.

If a program tries to calculate "foo(x)", and the calculation overflows
and gives the wrong answer, the program is broken. Giving a definition
to the overflow - whether it is wrapping, trapping, throwing, saturating
- does not make the program correct. It is still broken.

Some types of defined behaviour can make it easier to find the bug
during testing - such as error messages on overflows. Some make it
harder to find the bugs, such as wrapping. Some make it easier to limit
the damage from the bugs, such as throwing an exception. None change
the fact that the program has a bug.

And if you look at the language, there are /loads/ of undefined
behaviours. Integer overflows are just a tiny one, and one that has
become a good deal less common since 32-bit int became the norm. C++
will still let programmers shoot themselves in the foot in all sorts of
ways - worrying about the splinters in their fingers will not fix the
hole in their feet.

> When wrong figures
> are put into my programs then I want the program to give
> answers that follow from those and when contradicting figures
> are put in then I want my programs to refuse in civil manner.

That is fair enough - that is why you should check the figures that are
put into the program, not the calculations in the good code that you
have written.

> For me quarter of good cases working is not good
> enough quality. And I'm frustrated when I can't find a way to
> use it and have to use some kind of nonconforming hacks by
> compiler vendor.
>
>> If you want to say that some programmers don't know that integer
>> overflow is undefined behaviour, and think it wraps, then blame the way
>> C and C++ is taught and what the books say, and do something about that
>> - don't try to dumb down and water out the language.
>
> A language that does not have simple features nor convenient ways to
> implement such is dumbed down by its makers, not by me.
>

No language covers everything that all programmers want - there are
always compromises.
Yes, you did. At least, that's what the definition of the languages
says. The words used in a programming language don't always correspond
to their normal meanings in English. Nor do they always remain the same
as they move from compiler extensions through to standards. It is
unfortunate, but true - and it is the programmer's job to write in a way
that is clear to the reader /and/ precise in the language.

(Incidentally, you write better English than many people I know who have it
as their first language. And it is not as if Estonian is close to English!)

>
>> And once you start saying "C++ programmers can't be trusted to write
>> expressions that don't overflow", where do you stop? Can you trust them
>> to use pointers, or do you want run-time checks on all uses of pointers?
>> Can you trust them to use arrays? Function pointers? Memory
>> allocation? Multi-threading? There are many, many causes of faults and
>> bugs in C++ programs that are going to turn up a lot more often than
>> integer overflows (at least in the 32-bit world - most calculations
>> don't come close to that limit).
>
> Sometimes I want to write code where overflow is not my programming
> error but allowable contradiction in input data. There I want to throw or
> to result with INF on overflow.

That should be handled by C++ classes that have such semantics on
overflow - which you use explicitly when you want that behaviour. I'd
be happy to see a standard library with this behaviour.

> In similar manner like I want it to
> throw or to result with NaN when kilograms were requested
> to add to meters or to subtract from degrees of angle or diameter
> of 11-th apple from 10 is requested.

That is a completely different matter. Personally, I would not want an
exception or a NaN here - I would want a compile time error. Again,
this can be handled perfectly well in C++ using strong class types. And
again, it would be nice to see such libraries standardised.

In both these cases, I think concepts will make the definition and use
of such class template libraries a good deal neater.
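
A minimal sketch of that strong-type idea (Meters and Kilograms are toy
names made up here; real unit libraries such as Boost.Units are far more
complete):

struct Meters    { double value; };
struct Kilograms { double value; };

constexpr Meters    operator+(Meters a, Meters b)       { return {a.value + b.value}; }
constexpr Kilograms operator+(Kilograms a, Kilograms b) { return {a.value + b.value}; }

// Meters d = Meters{2.0} + Kilograms{3.0};   // does not compile: no matching operator+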

> Choice between if I want silent
> or trapping failures depends on if I want to process the whole data
> regardless of inconsistencies or constraint violations in it or to
> interrupt on first such problem in data. Don't such use-cases make sense?

Yes, which is why it is far better to deal with it using explicit
choices of the types used rather than making it part of the language.
Which is fine for many people. And since hardware assistance would give
you very little benefit for what you want here, it is not surprising
that it does not exist. (Hardware assistance could be a big help in a
wide variety of other undefined behaviour, bugs, and low-level features
in C and C++, but that is a different discussion.)
I don't think anyone will tell you __builtin_add_overflow leads to
pretty code! But you write these sorts of things once, and hide them
away in a library so that the use of the functions is simple and clear.
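
For example (a sketch only - checked_add is a made-up name, and
__builtin_add_overflow is a GCC/Clang extension, not standard C++):

#include <stdexcept>

inline int checked_add(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result))   // returns true if the sum overflowed
        throw std::overflow_error("integer overflow in checked_add");
    return result;
}

// usage:  int total = checked_add(x, y);   // throws instead of invoking UB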

Alf P. Steinbach

unread,
Nov 10, 2019, 8:37:48 AM11/10/19
to
On 10.11.2019 13:17, David Brown wrote:
> [snip]
> Remember, features like gcc's "-ffast-math" do not instruct the compiler
> to ignore parts of the /C/ standard - they ignore parts of the IEEE
> floating point standards.  This is a very different matter.  As far as I
> know, "gcc -ffast-math" is no more non-compliant than "gcc".

Possibly you're talking about C only, I'm not sure of the wider thread
context here.

Sorry for jumping in.

However, with C++ the `-ffast-math` option for g++ is in conflict with
the standard because `numeric_limits` then still reports IEEE 754
compliance, while e.g. NaN comparisons are then non-compliant.

I remember once engaging in a heated (!) debate with Gabriel Dos Reis
about this, so it's possible to have different opinions about it, even
among knowledgeable people, for certain.

IMO the fast and the compliant floating point types should at least by
default be differently named types, not the same type with different
behavior depending on an option.

So I regard it as a quite nasty design bug.


- Alf

Paavo Helde

unread,
Nov 10, 2019, 9:31:27 AM11/10/19
to
The hardware which I am familiar with properly sets the carry and
overflow flags for arithmetic operations. In the case of multiplication it
also produces a result twice as wide as the operands, in order not to
lose information. So at the hardware level the results are always
well-defined and complete, this is very fine. The calling program can
study the flags and decide what to do about this.

The problem with C is that they decided the best way to deal with this
is just to ignore the overflow flag and the upper half of the
multiplication result. Not sure about PDP-11 itself, but they had to
think about other hardware too, and at some point they decided that the
best way to deal with the problem was just to ignore all the extra
information the current or future CPU-s might provide.

An imaginary conversation:

A: BTW, what should we do with the arithmetic overflow? There are
overflow flags and extended precision results in some hardware that we
could use.

B: Because of speed and simplicity, we just ignore it. Each extra opcode
would slow us down!

A: But in case of multiplication, we might lose half of the result!

B. Just ignore it.

A. But what do we write down in the specs? We can't just say we carry out
a hardware operation and then ignore everything except the first result
register.

B. Let's see... Isn't there a nice name for that behavior? Yep, it's
called wrapping! Write this down in the specs!

A. But, but, with signed integers, there are different representations,
so that on some hardware it does not actually wrap!

B. Ouch, my head is hurting. Just declare it undefined then!


David Brown

unread,
Nov 10, 2019, 11:08:29 AM11/10/19
to
On 10/11/2019 14:37, Alf P. Steinbach wrote:
> On 10.11.2019 13:17, David Brown wrote:
>> [snip]
>> Remember, features like gcc's "-ffast-math" do not instruct the
>> compiler to ignore parts of the /C/ standard - they ignore parts of
>> the IEEE floating point standards.  This is a very different matter.
>> As far as I know, "gcc -ffast-math" is no more non-compliant than "gcc".
>
> Possibly you're talking about C only, I'm not sure of the wider thread
> context here.
>
> Sorry for jumping in.

Feel free!

>
> However, with C++ the `-ffast-math` option for g++ is in conflict with
> the standard because `numeric_limits` then still reports IEEE 754
> compliance, while e.g. NaN comparisons are then non-compliant.

Fair enough - that's a good point.

>
> I remember once engaging in a heated (!) debate with Gabriel Dos Reis
> about this, so it's possible to have different opinions about it, even
> among knowledgeable people, for certain.

I could imagine a difference of opinion here - "gcc -ffast-math" has
IEEE standard formats but not IEEE standard operations.

>
> IMO the fast and the compliant floating point types should at least by
> default be differently named types, not the same type with different
> behavior depending on an option.

The compiler does require an explicit option here for non-IEEE
operations - it is (barring bugs or missing features - I don't know the
details) IEEE standard by default.

There is always a balance on these things. "-ffast-math" can be very
much faster than IEEE standard behaviour on many platforms, and is
perfectly acceptable for most people's use of floating point. Why
should the majority have slower code (and perhaps worse accuracy) just
because some few scientific users need identical results consistent
across platforms?

I'd actually say my preferred solution is that strict IEEE types should
be a different, specific types that programmers can pick explicitly,
while "-ffast-math" equivalent could be the default. The
"numeric_limits" could differentiate between the representation and the
operations (the same should apply to signed integers, differentiating
between representation and wrapping overflow behaviour), and
implementations should set the values here according to what they
guarantee regardless of flags.

>
> So I regard it as a quite nasty design bug.
>

"Bug" implies a mistake. This is a design choice, on which people
disagree - not a bug.

Manfred

unread,
Nov 10, 2019, 12:34:22 PM11/10/19
to
On 11/10/19 5:08 PM, David Brown wrote:
> On 10/11/2019 14:37, Alf P. Steinbach wrote:
>> On 10.11.2019 13:17, David Brown wrote:
>>> [snip]

>
>>
>> IMO the fast and the compliant floating point types should at least by
>> default be differently named types, not the same type with different
>> behavior depending on an option.
>
> The compiler does require an explicit option here for non-IEEE
> operations - it is (baring bugs or missing features - I don't know the
> details) IEEE standard by default.
>
> There is always a balance on these things.  "-ffast-math" can be very
> much faster than IEEE standard behaviour on many platforms, and is
> perfectly acceptable for most people's use of floating point.  Why
> should the majority have slower code (and perhaps worse accuracy) just
> because some few scientific users need identical results consistent
> across platforms?

It's not that simple (and -ffast-math does not improve accuracy), see below.

>
> I'd actually say my preferred solution is that strict IEEE types should
> be a different, specific types that programmers can pick explicitly,
> while "-ffast-math" equivalent could be the default.

The fact is that -ffast-math can cause loss of precision, so it can
introduce nasty bugs in existing code that assumes IEEE conformance.
Since managing accuracy errors is among the most common causes of
floating point bugs, the majority of programmers is better off without
-ffast-math.
If you want the extra speed, you need to know what you are doing (and
that it doesn't come for free) so you'd better ask for it explicitly.
By the way, with floating point there are a number of techniques to be
considered that improve efficiency before tweaking compiler flags.

Manfred

unread,
Nov 10, 2019, 12:37:30 PM11/10/19
to
On 11/9/19 3:13 PM, Tim Rentsch wrote:
> Manfred <non...@add.invalid> writes:
>
>> My point is about what to do with the fact that many times
>> overflow is a programming error: even if this is true I think that
>> unsigned overflow should have defined behavior (and wrap) rather
>> than being handled as an error by the compiler.
>
> I absolutely concur. Your other comments in the thread also are
> right on the money.
I know.

> Frankly I am rather baffled by the comments
> of those on the other side of what you've been saying.
>
Not a surprise, though. The internet is like that too.

Manfred

unread,
Nov 10, 2019, 1:02:30 PM11/10/19
to
On 11/10/19 3:31 PM, Paavo Helde wrote:
> On 9.11.2019 16:24, Tim Rentsch wrote:
>> Paavo Helde <myfir...@osa.pri.ee> writes:
>>
>>> On 5.11.2019 17:58, Bonita Montero wrote:
>>>
>>>> With the same reasoning you could say that unsigneds might never
>>>> wrap;  but in fact they're specified to wrap.
>>>
>>> In retrospect, this (wrapping unsigneds) looks like a major design
>>> mistake.
>>>
>>> IMO, wrapping integers (signed or unsigned) are an example of
>>> "optimization which nobody asked for", and they are there basically
>>> only because the hardware happened to support such operations.
>>
>> Apparently you think how the hardware behaves (specifically, for
>> unsigned or two's complement operands) is some sort of accident
>> of history.  It isn't.  It was a deliberate choice, made by smart
>> people.  It isn't just a coincidence that the semantics of
>> wrapping works nicely in a variety of applications.
>
> The hardware which I am familiar with properly sets the carry and
> overflow flags for arithmetic operations. In case of multiplication it
> also produces the result twice larger than the operands, in order to not
> lose information. So at the hardware level the results are always
> well-defined and complete, this is very fine. The calling program can
> study the flags and decide what to do about this.

The main difference between the hardware and the language is that 1) the
former can count on accessory entities that are not part of a variable
(the flags register) and 2) it does not have to fit within the semantics
of a type system, so that it is OK for WORD op WORD to yield DWORD

If you go into the specifics, you may see that:
1) flags information would need to be handled by the compiler, with the
possible result of littering the generated code with branches for every
integer expression.

2) Having an arithmetic operator (multiplication) yield a different type
than the operands would pose significant complications to the language -
if you really want this, try mimicking that with some C++ wrapper
classes and operators thereof, and see where you go.
The solution, when the full range is needed, is to tell the compiler
that this is what you want by casting the /operands/.
In this way the compiler can deliver the result you want (while
optimizing away the extra space for the operands themselves).
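
In concrete terms (a sketch; wide_mul is a made-up name), "casting the
operands" looks like this, and a decent compiler can turn it into a single
widening multiply instruction on common targets:

#include <cstdint>

std::int64_t wide_mul(std::int32_t a, std::int32_t b)
{
    // casting one operand is enough; the other follows via the usual
    // arithmetic conversions, so the multiplication is done in 64 bits
    return static_cast<std::int64_t>(a) * b;
}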

>
> The problem with C is that they decided the best way to deal with this
> is just to ignore the overflow flag and the upper half of the
> multiplication result. Not sure about PDP-11 itself, but they had to
> think about other hardware too, and at some point they decided that the
> best way to deal with the problem was just to ignore all the extra
> information the current or future CPU-s might provide

See above.
Moreover, given the impact of the choice, and its business consequences,
it is hard to believe that this was just sloppiness.

>
> An imaginary conversation:
>
> A: BTW, what should we do with the arithmetic overflow? There are
> overflow flags and extended precision results in some hardware that we
> could use.
>
> B: Because of speed and simplicity, we just ignore it. Each extra opcode
> would slow us down!
>
> A: But in case of multiplication, we might lose half of the result!
>
> B. Just ignore it.
>
> A. But what we write down in the specs? We can't just say we carry out
> an hardware operation and then ignore everything except the first result
> register.
>
> B. Let's see... Isn't there a nice name for that behavior? Yep, it's
> called wrapping! Write this down in the specs!
>
> A. But, but, with signed integers, there are different representations,
> so that on some hardware it does not actually wrap!
>
> B. Ouch, my head is hurting. Just declare it undefined then!
>
>

This is fiction.

Ian Collins

unread,
Nov 10, 2019, 1:48:50 PM11/10/19
to
On 11/11/2019 06:34, Manfred wrote:
>>
>> I'd actually say my preferred solution is that strict IEEE types should
>> be a different, specific types that programmers can pick explicitly,
>> while "-ffast-math" equivalent could be the default.
>
> The fact is that -ffast-math can cause loss of precision, so it can
> introduce nasty bugs in existing code that assumes IEEE conformance.
> Since managing accuracy errors is among the most common causes of
> floating point bugs, the majority of programmers is better off without
> -ffast-math.
> If you want the extra speed, you need to know what you are doing (and
> that it doesn't come for free) so you'd better ask for it explicitly.
> By the way, with floating point there are a number of techniques to be
> considered that improve efficiency before tweaking compiler flags.

With ARM processors which use NEON, -ffast-math is required if you want
to vectorise floating point operations. This is because the hardware
isn't IEEE compliant. This was a bit of a pain for us which resulted in
lots of extra testing....

--
Ian.

David Brown

unread,
Nov 10, 2019, 3:31:39 PM11/10/19
to
On 10/11/2019 18:34, Manfred wrote:
> On 11/10/19 5:08 PM, David Brown wrote:
>> On 10/11/2019 14:37, Alf P. Steinbach wrote:
>>> On 10.11.2019 13:17, David Brown wrote:
>>>> [snip]
>
>>
>>>
>>> IMO the fast and the compliant floating point types should at least
>>> by default be differently named types, not the same type with
>>> different behavior depending on an option.
>>
>> The compiler does require an explicit option here for non-IEEE
>> operations - it is (barring bugs or missing features - I don't know the
>> details) IEEE standard by default.
>>
>> There is always a balance on these things.  "-ffast-math" can be very
>> much faster than IEEE standard behaviour on many platforms, and is
>> perfectly acceptable for most people's use of floating point.  Why
>> should the majority have slower code (and perhaps worse accuracy) just
>> because some few scientific users need identical results consistent
>> across platforms?
>
> It's not that simple (and -ffast-math does not improve accuracy), see
> below.
>
>>
>> I'd actually say my preferred solution is that strict IEEE types
>> should be a different, specific types that programmers can pick
>> explicitly, while "-ffast-math" equivalent could be the default.
>
> The fact is that -ffast-math can cause loss of precision, so it can
> introduce nasty bugs in existing code that assumes IEEE conformance.

-ffast-math /can/ cause loss of precision, but it can also improve
precision. If you write "(x + 1e50) - 1e50", -ffast-math will let the
compiler generate code for "x" while IEEE conformance requires the
addition and subtraction, losing you all your precision of x (assuming x
is small compared to 1e50). The key point is that it doesn't keep the
consistency required by IEEE.
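
As a concrete (hypothetical) illustration of that point:

double f(double x) { return (x + 1e50) - 1e50; }

Under strict IEEE evaluation, for any |x| much smaller than 1e50 the
intermediate sum rounds to 1e50 and the function returns 0.0. With
-ffast-math the compiler is allowed to reassociate and may return x
unchanged - closer to the mathematically exact answer, but not the
IEEE-specified result.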

And of course if you have code that assumes IEEE conformance and needs
it for correctness, and you turn off conformance, then you risk a result
that doesn't work. You can't use -ffast-math unless you know the code
doesn't need IEEE conformance.

So it /is/ that simple, and that obvious.

And while I say this is /my/ preferred solution, I can quite happily
appreciate that other people prefer something else. (And realistically,
I know it takes a huge incentive before the status quo is changed. Even
if it turns out, as I think is the case, that most programs would run
correctly and more efficiently with -ffast-math, that does not mean the
default is likely to change.)

> Since managing accuracy errors is among the most common causes of
> floating point bugs, the majority of programmers is better off without
> -ffast-math.

I don't think most programs care unduly about the accuracy of floating
point calculations. I think most calculations in floating point have
low enough accuracy needs that you can use doubles and accept that the
answer is approximate but good enough. And I think most programmers are
totally unaware of the details of IEEE and how to use the features it
gives in getting slightly more control of error margins. But I freely
admit that is gut feeling, not based on any kind of surveys, statistics,
etc.

> If you want the extra speed, you need to know what you are doing (and
> that it doesn't come for free) so you'd better ask for it explicitly.

It can make a very large difference, depending on the target processor
and the type of code. (Many small processors with floating point
hardware have hardware that supports "ordinary" floats, but does not
handle infinities, NaNs, subnormals, etc. -ffast-math can generate
code that runs in a few hardware instructions while without it you need
library calls and hundreds of instructions. Yes, it can be /that/ big a
difference - and that is personal experience, not gut feeling.)

> By the way, with floating point there are a number of techniques to be
> considered that improve efficiency before tweaking compiler flags.

Of course.

Paavo Helde

unread,
Nov 11, 2019, 5:09:11 AM11/11/19
to
On 10.11.2019 20:02, Manfred wrote:
> The main difference between the hardware and the language is that 1) the
> former can count on accessory entities that are not part of a variable
> (the flags register) and 2) it does not have to fit within the semantics
> of a type system, so that it is OK for WORD op WORD to yield DWORD
>
> If you go into the specifics, you may see that:
> 1) flags information would need to be handled by the compiler, with the
> possible result of littering the generated code with branches for every
> integer expression.

Yes, that's correct.

> 2) Having an arithmetic operator (multiplication) yield a different type
> than the operands would pose significant complications to the language -
> if you really want this, try mimicking that with some C++ wrapper
> classes and operators thereof, and see where you go.
> The solution, when the full range is needed, is to tell the compiler
> that this is what you want by casting the /operands/.
> In this way the compiler can deliver the result you want (while
> optimizing away the extra space for the operands themselves).

Note that when the operands are cast to suitable larger types, then
overflow cannot happen, so for this usage case the overflow behavior
does not need to be defined.

>
>>
>> The problem with C is that they decided the best way to deal with this
>> is just to ignore the overflow flag and the upper half of the
>> multiplication result. Not sure about PDP-11 itself, but they had to
>> think about other hardware too, and at some point they decided that
>> the best way to deal with the problem was just to ignore all the extra
>> information the current or future CPU-s might provide
>
> See above.
> Moreover, given the impact of the choice, and its business consequences,
> it is hard to believe that this was just sloppiness.

Of course not. The decision to produce the fastest code was a deliberate
choice made by smart people, as you said. Any kind of overflow checking
would have slowed the code down by at least a factor of two, which would
of course have been disastrous for a language meant as a "better assembler".

With the specified wrapping behavior for unsigneds the story is
different however. They defined the unsigned overflow as wrapping
because they could, not because of some deeper considerations. Would the
existing hardware have provided some other behavior (like saturating) at
zero cost, I'm sure they would have standardized that behavior instead.

Note that they did not just copy the hardware behavior. The common
hardware itself does not have wrapping behavior, rather it produces the
correct result with carry/overflow flags and extended registers. One
only gets wrapping behavior by deliberately ignoring the correct result.

By the same token, they did not define the behavior for signed overflow
because they couldn't. More specifically, the zero cost behavior was
different on different hardware, and they could not find enough
justification for any kind of non-zero cost overflow behavior.

Scott Lurndal

unread,
Nov 11, 2019, 8:59:16 AM11/11/19
to
Paavo Helde <myfir...@osa.pri.ee> writes:
>On 10.11.2019 20:02, Manfred wrote:

>With the specified wrapping behavior for unsigneds the story is
>different however. They defined the unsigned overflow as wrapping
>because they could, not because of some deeper considerations. Would the
>existing hardware have provided some other behavior (like saturating) at
>zero cost, I'm sure they would have standardized that behavior instead.

You are projecting your preferences. I'm sure they would have made
the choice they made even if saturating arithmetic operations had been available.

Paavo Helde

unread,
Nov 11, 2019, 10:09:52 AM11/11/19
to
I do not have a preference for saturating. Saturating is worse than
wrapping, as it does not make it possible to detect overflow post factum
even in the simplest usage case.

Looks like I should have worded myself better: "Would the
existing hardware have provided *only* some other behavior (like
saturating) at zero cost, ..."

Richard Damon

unread,
Nov 18, 2019, 7:40:41 AM11/18/19
to
My guess at that conversation is that it would go like:

A: BTW, what should we do with the arithmetic overflow? There are
overflow flags and extended precision results in some hardware that we
could use.

B: How does that work on the other machines, can we do this efficiently?

C: Not on my machine, we do it differently than that, we do it by xxx.

A: My machine can't do it that way!

B: Sounds like we can't make it part of the standard, leave it for
implementation extensions.

A: But in case of multiplication, we might lose half of the result!

D: My machine doesn't automatically create a double precision result,
that would slow me down a lot.

B: Sounds like an extension too.


Note, the key in this discussion was focused on defining consistent
behavior in the language where the target machines had behavior that
could be made easily consistent, and leaving as undefined behavior (or
unspecified/implementation defined) what wasn't. These decisions were
initially being made in the 70's and 80's, and at that time there were a
lot more machines with unique operating modes.

Note also, it was EXPECTED that implementations would implement
extensions to let a program access special features of the machine to be
more efficient on that machine. Special symbols/keywords to get access
to things like the carry bit, or double precision multiplication. I
remember using a multiply-divide intrinsic that would compute

a * b / c where the product was done to double precision (long) and the
divide took that to yield an int result.

Tim Rentsch

unread,
Nov 28, 2019, 3:19:57 PM11/28/19
to
This doesn't answer my question. I understand that macros have
shortcomings in some applications. The question is why do C++
fans hate using macros even when, like here, the shortcomings
don't apply?

Tim Rentsch

unread,
Nov 28, 2019, 3:27:28 PM11/28/19
to
I believe your view of the history doesn't match what actually
occurred. I don't propose attempting to resolve the matter; I
just wanted to be clear that I think I see what you're saying
even though I believe some of your premises don't match the
historical facts.

Alf P. Steinbach

unread,
Nov 28, 2019, 6:14:26 PM11/28/19
to
On 28.11.2019 21:19, Tim Rentsch wrote:
> Öö Tiib <oot...@hot.ee> writes:
>
>> On Saturday, 9 November 2019 15:55:38 UTC+2, Tim Rentsch wrote:
>>
>>> "Alf P. Steinbach" <alf.p.stein...@gmail.com> writes:
>>>
>>>> On 05.11.2019 03:55, Siri Cruise wrote:
>>>>
>>>>> In article <qppva0$cbj$1...@news.albasani.net>,
>>>>> Bonita Montero <Bonita....@gmail.com> wrote:
>>>>>
>>>>>> What would realls speak against that numieric_limits<type> would include
>>>>>> a flag whether a type has a two's complement and maybe another flag that
>>>>>> says that a type has a two's complement wrap-around?
>>>>>
>>>>> #define ONESCOMPLEMENT (-1==~1)
>>>>> #define TWOSCOMPLEMENT (-1==~0)
>>>>
>>>> Nice idea.
>>>>
>>>> But please use `constexpr` or `const` declarations, not macros.
>>>>
>>>> Using macros conveys the idea that macros are a reasonable solution to
>>>> the problem of naming a compile time constant.
>>>
>>> Why do C++ fans hate the C preprocessor so much?
>>
>> It is not hate. We prefer to use macros when what these do can not
>> be done as simply by using inline functions, templates or constants.
>>
>> Reasons are that macros are names that ignore scope and work on
>> textual substitutions. Macros are harder to debug and cause
>> unanticipated effects and confusion in clumsy hands. However for
>> teams that manage to hire members with godly hands macros are
>> fine. :D
>
> [snip] here, the shortcomings [of macros] don't apply?

They do.

For example, regarding what was mentioned about scope, you generally
can't have two libraries that both define `ONESCOMPLEMENT`. Perhaps
library A defines it as above but library B defines it to produce a
one's complement. Perhaps both define it as above but using library B
you get a redefinition warning.

And for example, regarding the so far not mentioned visual impact, the
shouting uppercase is an annoying eyesore.


- Alf

David Brown

unread,
Nov 29, 2019, 3:15:51 AM11/29/19
to
To be fair, you don't have to use all-caps for macros. Personally, I
only use all-caps for macros if the fact that it is a macro is
important. So I would write:

#define noOfWotsits 10

just as I would write

enum { noOfWotsits = 10 };
static const int noOfWotsits = 10;
const int noOfWotsits = 10; // C++ only

depending on the language (C, C++, or compatible with both) and what I
want to be able to do with it. Disregarding scoping, these will all
have the same effect in almost all sensible uses - I see no earthly
reason to insist that one version should be written in all-caps and the
others in your favourite multi-word style (snake, camel, etc.).

But if I had written this macro (which I would not do - it's just an
example):

#define SQUARE(x) ((x) * (x))

then I would use all-caps, because it is important to understand that it
is a macro.

Bo Persson

unread,
Nov 29, 2019, 4:58:47 AM11/29/19
to
The problem is still that macros defy the use of scopes and namespaces.

Unless you are one of the "members with godly hands" referred to above,
a slight mistake might break the next version of the standard library.
Note that the appendix "Index of library names" has grown from 35 to 70
pages between C++14 and C++20. None of these may be #define'ed.

Just try

#define general "Patton"

to break floating-point formatting, or

#define Init 0

to get odd compile errors from most I/O, or try

#define a 'a'

to disable <random> numbers.


And the library reserved names use all versions of lower_case,
UPPER_CASE, and Mixed.


Bo Persson








David Brown

unread,
Nov 29, 2019, 6:49:48 AM11/29/19
to
They do - and when considering "what is the clearest, simplest, most
maintainable, most efficient choice" for a fixed value or re-usable
sequence of code or characters, then that counts against macros.

Whether these count as "problems" or not will depend on the code, and
where they are defined. Macros within an implementation file are rarely
an issue for scoping - if they are, your implementation file is probably
too big, or you have too little imagination in naming your identifiers.
But macros in headers can have wider implications.

> Unless you are one of the "members with godly hands" referred to above,
> a slight mistake might break the next version of the standard library.

That's a risk, certainly. I am a big fan of specifying exactly what
standard I want for my code - I don't change versions of the standard
library willy-nilly. There are many things that can be badly affected,
not just because of macros.

> Note that the appendix "Index of library names" has grown from 35 to 70
> pages between C++14 and C++20. None of these may be #define'ed.
>
> Just try
>
> #define general "Patton"
>
> to break floating-point formatting, or
>
> #define Init 0
>
> to get odd compile errors from most I/O, or try
>
> #define a 'a'
>
> to disable <random> numbers.
>
>
> And the library reserved names use all versions of lower_case,
> UPPER_CASE, and Mixed.
>

Perhaps you misunderstood my post. I didn't say "use lots of macros",
or "macros are risk-free", or "macros are better than const". I simply
said that when I choose to use a macro, I don't give it a shouty name
unless I really need it to shout "I am a macro". Your points are all
good reasons to prefer something other than macros (templates, const,
inline functions, etc.) where practical, and I don't disagree with any
of them.

Tim Rentsch

unread,
Dec 7, 2019, 10:53:56 AM12/7/19
to
"Alf P. Steinbach" <alf.p.stein...@gmail.com> writes:

> On 28.11.2019 21:19, Tim Rentsch wrote:
>
A macro may be defined in places other than public libraries. Or
if it is defined in a public library, and a possible ambiguity is
important to remove, it's easy enough to prefix the macro name
with the namespace used for the library. Would you have the same
objection to using a macro if the macro were not defined in a
public library?
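
For instance, a sketch of the prefixing convention I mean (the library
name "acme" here is hypothetical):

// in acme's public header, next to namespace acme:
#define ACME_BUFFER_SIZE 4096   // the prefix plays the role of the namespace
                                // that the preprocessor cannot see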

> Perhaps both define it as above but using library B you get a
> redefinition warning.

I don't know why you think that. Giving several identical
definitions for a macro is allowed in C90, C99, C11, C++98,
C++03, C++11, C++14, and C++17. Testing with gcc, clang, g++,
and clang++ gave no diagnostics even with -Wall and -Wextra.
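
A minimal illustration of the rule (the macro name is made up):

#define MAX_RETRIES 3
#define MAX_RETRIES 3      // allowed: token-for-token identical redefinition
// #define MAX_RETRIES 4   // not identical - an implementation must diagnose this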


> And for example, regarding the so far not mentioned visual
> impact, the shouting uppercase is an annoying eyesore.

But that has nothing to do with the macro processor. You are
perfectly free to choose a name that is not all uppercase, or
even all lowercase, if you'd rather. What you said might be an
argument about how to name a macro but it is not an argument
about whether to use a macro.

Taking all of these together, the objections given sound more
like rationalizations trying to justify an overarching dislike of
macros than objectively reasoned conclusions.

Öö Tiib

unread,
Dec 7, 2019, 11:22:29 AM12/7/19
to
The authors of libraries are often competitors and so do not like it when
you pick and integrate the best of breed from their products. The bigger
ones seemingly even enjoy introducing such conflicts and contradictions,
sometimes even with the standard library. Anyone targeting Windows has
sooner or later had to look up what to do with the min and max macros
in windows.h.
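
The usual workarounds look roughly like this (a Windows-specific sketch;
windows.h is left commented out here so the snippet stays portable):

// #define NOMINMAX        // define before <windows.h> to suppress its min/max macros
// #include <windows.h>
#include <algorithm>

int smaller(int a, int b)
{
    return (std::min)(a, b);   // the extra parentheses prevent function-like
                               // macro expansion, so this compiles even when
                               // a min macro is in force
}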

>
> > Perhaps both define it as above but using library B you get a
> > redefinition warning.
>
> I don't know why you think that. Giving several identical
> definitions for a macro is allowed in C90, C99, C11, C++98,
> C++03, C++11, C++14, and C++17. Testing with gcc, clang, g++,
> and clang++ gave no diagnostics even with -Wall and -Wextra.

Libraries next to never define their macros letter-for-letter identically,
and so tests show the opposite:
<http://coliru.stacked-crooked.com/a/65fc657ef88c80fa>
What do you suggest we do with these annoying warnings? Silence them?
Branch the library code? Get the authors to cooperate?

> > And for example, regarding the so far not mentioned visual
> > impact, the shouting uppercase is an annoying eyesore.
>
> But that has nothing to do with the macro processor. You are
> perfectly free to choose a name that is not all uppercase, or
> even all lowercase, if you'd rather. What you said might be an
> argument about how to name a macro but it is not an argument
> about whether to use a macro.

I agree. We can just configure our tools and text editors to show
macros in a different color if we want them to stand out visually at all.
However, a lot of coding standards have those rules, and so the majority
of macros that we face are in SCREAMING_CAPS.

> Taking all of these together, the objections given sound more
> like rationalizations trying to justify an overarching dislike of
> macros than objectively reasoned conclusions.

Sure, if you dismiss all the problems (which we have grown tired of facing
in practice) with the claim that they do not exist, then answering your
question of why we avoid macros is impossible.

Tim Rentsch

unread,
Jan 10, 2020, 11:44:28 PM1/10/20
to
Your comments are not really about macros but about how various
people misuse macros. Any language feature can be misused.
My question is about macros, not about people who misuse them.

Öö Tiib

unread,
Jan 11, 2020, 2:44:32 AM1/11/20
to
Crickets chirping? No answers?

> >>> And for example, regarding the so far not mentioned visual
> >>> impact, the shouting uppercase is an annoying eyesore.
> >>
> >> But that has nothing to do with the macro processor. You are
> >> perfectly free to choose a name that is not all uppercase, or
> >> even all lowercase, if you'd rather. What you said might be an
> >> argument about how to name a macro but it is not an argument
> >> about whether to use a macro.
> >
> > I agree. We can just configure our tools and text editors to show
> > macros with different color if we for some reason want these to stand
> > out visually at all. However lot of coding standards have those rules
> > in and so majority of macros that we face are in SCREAMING_CAPS.
> >
> >> Taking all of these together, the objections given sound more
> >> like rationalizations trying to justify an overarching dislike of
> >> macros than objectively reasoned conclusions.
> >
> > Sure, if you dismiss all problems (that we have tired of facing in
> > practice) with claim that those do not exist then answering to your
> > question why we avoid macros is impossible.
>
> Your comments are not really about macros but about how various
> people misuse macros. Any language feature can be misused.
> My question is about macros, not about people who misuse them.

A programmer who is ordered to write software for a platform where he
cannot avoid wasting workdays dealing with that misuse of the feature
by the platform vendors still cannot avoid disliking the feature that
allows such annoying misuse. Especially in places where inline
functions, templates, or constants would be just as simple to use but
would not cause the issues whose existence you dismiss.

Tim Rentsch

unread,
Jan 14, 2020, 10:08:45 AM1/14/20
to
Tiib writes:

> On Saturday, 11 January 2020 06:44:28 UTC+2, Tim Rentsch wrote:

[...]

>> Your comments are not really about macros but about how various
>> people misuse macros. Any language feature can be misused.
>> My question is about macros, not about people who misuse them.
>
> A programmer who is ordered to write software for a platform where he
> cannot avoid wasting workdays dealing with that misuse of the feature
> by the platform vendors still cannot avoid disliking the feature that
> allows such annoying misuse. Especially in places where inline
> functions, templates, or constants would be just as simple to use but
> would not cause the issues whose existence you dismiss.

That may be interesting but it doesn't address the question
I was asking.

Öö Tiib

unread,
Jan 14, 2020, 11:42:49 AM1/14/20
to
Your question was "Why do C++ fans hate the C preprocessor so much?".

Tim Rentsch

unread,
Jan 21, 2020, 5:09:42 AM1/21/20
to
Yes, and your responses don't speak to that question. They do
explain why you don't like how some people use macros, but
nothing about the macros themselves. My question is about
reactions to the preprocessor (and macros in particular), not
about reactions to how people might misuse it. Do you not
understand the difference?

James Kuyper

unread,
Jan 21, 2020, 8:14:33 AM1/21/20
to
Do you not understand that he considers the way they invite misuse might
be precisely the thing that he finds objectionable about macros? That is
a problem that involves both the macros and their users,
but you can't change the way most people behave, so it's fair to level
such a criticism against macros themselves. That certainly seems to me
to be the obvious interpretation of his words.
I say this without implying that I share that opinion.

bol...@nowhere.org

unread,
Jan 21, 2020, 11:44:54 AM1/21/20
to
The sort of developers who hate macros are the ones who've never had to do
cross-platform development.

Jorgen Grahn

unread,
Jan 21, 2020, 3:01:43 PM1/21/20
to
On Tue, 2020-01-21, bol...@nowhere.org wrote:
...
> The sort of developers who hate macros are the ones who've never had to do
> cross-platform development.

How about those of us who had to do it, and hated it?

By the way, even for that scenario, the preprocessor is overused.
See the "#ifdefs considered harmful" paper.

/Jorgen

PS. I don't hate macros.

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .