Rotational shift operator

gb2...@gmail.com

Nov 22, 2017, 4:57:54 AM
to ISO C++ Standard - Future Proposals
I have not seen <#, <#=, ># and >#= used thus far, so I want to suggest these as rotation operators in the next set of standards.

Vishal Oza

Nov 22, 2017, 5:45:53 AM
to ISO C++ Standard - Future Proposals
I am not sure these would work, as the # symbol is a preprocessor marker. I think a language lawyer could tell you whether this is possible under the standard. If it is possible, then what would they do? Please give examples.

gb2...@gmail.com

Nov 22, 2017, 6:30:06 AM
to ISO C++ Standard - Future Proposals
On Wednesday, 22 November 2017 10:45:53 UTC, Vishal Oza wrote:
I am not sure these would work, as the # symbol is a preprocessor marker.
If not # then @ could be used; < and > start the tokens off to mirror how the << and >> operators work.
On Wednesday, 22 November 2017 10:45:53 UTC, Vishal Oza wrote:
If it is possible, then what would they do? Please give examples.
#include "bitmath.h"
int main(void) {
  signed char num = 127
  signed char num1 = num;
  signed char num2 = -22;
  size_t char_bit = bitmax( 1 );
  biti_t numb1 = numtobiti( &num1, 1, CHAR_BIT );
  biti_t numb2 = numtobiti( &num2, 1, CHAR_BIT );
#ifdef ROTATION_OPERATORS_DEFINED
  // Faster as compiler can generate asm instruction for it
  unsigned char sig = (1 ># 1);
#else
  // Slower as relies on functions
  unsigned char sig = 1;
  unsigned char one = 1;
  biti_t sigi = num2biti( &sig, 1, char_bit );
  biti_t onei = num2biti( &one, 1, char_bit );
  bitRor( sigi, onei );
#endif
  if ( (unsigned char)num1 >= sig ) {
    if ( (unsigned char)num2 > sig )
      bitRem( numb1, numb2 );
    else
      bitAdd( numb1, numb2 );
  }
  else if ( (unsigned char)num2 >= sig )
    bitAdd( numb1, numb2 );
  else
    bitRem( numb1, numb2 );
  if ( num1 != (num - num2) )
    printf("Failure: num1=%i\n", (int)num1 );
  else
    printf("Success: num1=%i\n", (int)num1 );
  return ( num1 != (num - num2) );
}

Bo Persson

Nov 22, 2017, 6:51:39 AM
to std-pr...@isocpp.org
On 2017-11-22 12:30, gb2...@gmail.com wrote:
> On Wednesday, 22 November 2017 10:45:53 UTC, Vishal Oza wrote:
> #ifdef ROTATION_OPERATORS_DEFINED
>   // Faster as compiler can generate asm instruction for it
>   unsigned char sig = (1 ># 1);
> #else
>   // Slower as relies on functions
>   unsigned char sig = 1;
>   unsigned char one = 1;
>   biti_t sigi = num2biti( &sig, 1, char_bit );
>   biti_t onei = num2biti( &one, 1, char_bit );
>   bitRor( sigi, onei );
> #endif

Nothing stops the compilers from generating inline code for functions.

They already do that for lots of functions:

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html


Bo Persson

gb2...@gmail.com

Nov 22, 2017, 7:06:19 AM
to ISO C++ Standard - Future Proposals, b...@gmb.dk
On Wednesday, 22 November 2017 11:51:39 UTC, Bo Persson wrote:
Nothing stops the compilers from generating inline code for functions.

They already do that for lots of functions:

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
 
That's vendor specific, not a standardized feature that all compilers are required to support, which means cross-compiler code (which most frameworks attempt) would need to be complex. Adding this operator to the standard would allow frameworks to begin deprecating old, complicated code and subsequently removing it once the needed standard becomes commonplace, falling back to the not-so-fast code when standards that are no longer directly supported are used.

Bo Persson

Nov 22, 2017, 7:15:34 AM
to std-pr...@isocpp.org
But as the compilers already have built-in support for the rotate
instructions, all you have to do is standardize the name, not invent new
operators.


Bo Persson


gb2...@gmail.com

Nov 22, 2017, 7:35:18 AM
to ISO C++ Standard - Future Proposals, b...@gmb.dk
On Wednesday, 22 November 2017 12:15:34 UTC, Bo Persson wrote:

But as the compilers already have built-in support for the rotate
instructions, all you have to do is standardize the name, not invent new
operators.

And as you saw with my example, using functions is more complex. Besides, I'm already working on a standardized name as well (if you hadn't noticed the "bitRor"), which I will submit later when I have code for all three bit math types: signed (bit*), unsigned (ubit*) and floating (fbit*). I originally wanted to use all lowercase for those names but found that "bitand" was showing as a keyword of some sort (it is in fact a standard alternative token for &), so as a temporary naming scheme I'm using an uppercase first letter for the operation. I'll give it more thought when I'm done with the fallback code (since I'm going to provide fallback code for short, int, long and a couple of unconfirmed standardized names for larger sizes). I have to prepare for work now, so I will answer further posts another day.

Thiago Macieira

Nov 22, 2017, 11:42:10 AM
to std-pr...@isocpp.org
On quarta-feira, 22 de novembro de 2017 01:57:54 PST gb2...@gmail.com wrote:
> I have not seen <#, <#=, ># and >#= used thus far, so I want to suggest
> these as rotation operators in the next set of standards.

No need. Compilers are smart enough to detect a rotation. So a macro like:

#define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))

is already sufficient. See:

https://godbolt.org/g/Xj7FPs

Note how three of them use a left rotate of 23 and one uses a right rotate of 41.
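For what it's worth, the same idiom also works as a function with masked shift counts, which avoids the undefined shift by 64 when b is 0 (a sketch only; compilers generally recognize this masked pattern as a rotate as well):

#include <cstdint>

// Sketch: masked rotate-left. (b & 63) and (-b & 63) always lie in
// [0, 63], so neither shift is undefined, and b == 0 yields x unchanged.
inline std::uint64_t rotl64(std::uint64_t x, unsigned b)
{
    return (x << (b & 63)) | (x >> (-b & 63));
}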

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Nevin Liber

Nov 22, 2017, 11:59:35 AM
to std-pr...@isocpp.org
You mean like in P0553?

Other than the inline namespace, it is slated for C++20 (I can't remember which header we decided upon, though).
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Arthur O'Dwyer

Nov 22, 2017, 11:15:17 PM
to ISO C++ Standard - Future Proposals
In LEWG, the runoff was between <bit_ops> and <bit>, and LEWG came down on the side of <bit>.
This is the same header which will also include std::bit_value, std::bit_reference, std::bit_pointer (P0237), and std::bit_cast (P0476).

–Arthur

Nicol Bolas

Nov 22, 2017, 11:53:15 PM
to ISO C++ Standard - Future Proposals
`bit_value`, `bit_reference`, and `bit_pointer`, I can understand. `bit_cast` makes no sense, as it doesn't really have anything to do with "bits" in and of themselves. You could easily call it "bytewise_cast" or whatever.

It makes no sense in that header.

Arthur O'Dwyer

Nov 23, 2017, 1:32:46 AM
to ISO C++ Standard - Future Proposals
(Caveat: I might be misinterpreting the notes about 'bit_cast'. I think it's currently slated for <bit> but I'm not 100% sure.)

Nicol's argument is reasonable. But the counterargument is that we're introducing a whole bunch of things with 'bit' in their name, so it is actively helpful to working programmers if we put them all in the <bit> header. Putting one of the new 'bit' features arbitrarily off in a non-<bit> header would be just as harmful to the teachability of the language as, say, putting five of the standard algorithms off in a non-<algorithm> header. (Or putting three others in a different non-<algorithm> header.)

And the counter to the counter: Putting 'std::bit_cast' in <bit> based on superficial name-similarity would be just as stupid and harmful to the long-term maintainability of the library as, say, putting 'std::function' in <functional> or 'std::istream_iterator' in <iterator>.

my $.02,
Arthur

Tony V E

Nov 23, 2017, 1:35:36 AM
to Standard Proposals
The problem is that bit_cast is misnamed.  It has nothing to do with the other bit functions, and more importantly:

bit_cast<Foo>(512);

*Bit* 9 is set in 512 - regardless of byte ordering.  Yet the value of the returned Foo depends on the *bytes* of 512, not the bits.






--
Be seeing you,
Tony

gb2...@gmail.com

Nov 23, 2017, 7:22:07 AM
to ISO C++ Standard - Future Proposals
On Wednesday, 22 November 2017 16:42:10 UTC, Thiago Macieira wrote:
On quarta-feira, 22 de novembro de 2017 01:57:54 PST gb2...@gmail.com wrote:
> I have not seen <#, <#=, ># and >#= used thus far, so I want to suggest
> these as rotation operators in the next set of standards.

No need. Compilers are smart enough to detect a rotation. So a macro like:

        #define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))

Sure, the compiler may be smart enough, but that doesn't mean we should use a macro to do complex math when an operator is faster to compile as well as simpler to use and easier to understand for others looking at the code. I can't remember who said it, but here's a quote (as best I remember it): "always get a lazy man to do a complex job, he'll make it simpler". I'm the lazy man in this instance: I don't want to look at complex code if I can instead look at simple code. I also see no reason why you should insist against a minor & simple addition to the language; if you want to use complex code in your project, fine, but don't force the rest of us to do so if a simpler option can be made available.

Thiago Macieira

Nov 23, 2017, 12:33:41 PM
to std-pr...@isocpp.org
On quinta-feira, 23 de novembro de 2017 04:22:07 PST gb2...@gmail.com wrote:
> On Wednesday, 22 November 2017 16:42:10 UTC, Thiago Macieira wrote:
> > On quarta-feira, 22 de novembro de 2017 01:57:54 PST gb2...@gmail.com wrote:
> > > I have not seen <#, <#=, ># and >#= used thus far, so I want to suggest
> > > these as rotation operators in the next set of standards.
> >
> > No need. Compilers are smart enough to detect a rotation. So a macro like:
> > #define ROTL(x, b) (uint64_t)(((x) << (b)) | ((x) >> (64 - (b))))
>
> Sure the compiler may be smart enough but that doesn't mean that we should
> use a macro to complex math when an operator is faster to compile as well
> as being simpler to use and easy to understand for others looking at the
> code.

You don't need to use a macro. You can write the pair of shifts and OR
directly, or you can make an inline function.

I REALLY don't think we should add an operator to do something we can already
do just as efficiently.

> I can't remember who the guy that said it but here's a quote (as best
> I remember it) "always get a lazy man to do a complex job, he'll make it
> simpler" - I'm the lazy man in this instance, I don't want to look at
> complex code if I can instead look at simple code, I also see no reason why
> you should insist against minor & simple addition to the language, if you
> want to use complex code in your project, fine but don't force the rest of
> us to do so if a simpler option can be made available.

Because changing the language grammar is difficult and requires updates
everywhere, not to mention possible unintended consequences. The choice of
characters is quite difficult, since most of them are used and the others are
unavailable.

All this for no gain in efficiency.

At best, I think we should add the operation to a header as free, inline
functions, which do exactly what I pasted above. So your code would still
read:

return rotl(v, b);
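A minimal sketch of what such a header function might look like (illustrative only; the actual name, constraints and wording would be up to the committee):

#include <climits>
#include <type_traits>

// Illustrative sketch, not committee wording: a width-generic
// rotate-left for any unsigned integer type.
template <typename T>
constexpr T rotl(T v, unsigned b)
{
    static_assert(std::is_unsigned<T>::value, "rotl requires an unsigned type");
    constexpr unsigned w = sizeof(T) * CHAR_BIT;
    b %= w;
    return static_cast<T>((v << b) | (v >> ((w - b) % w)));
}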

gb2...@gmail.com

Nov 24, 2017, 3:57:01 AM
to ISO C++ Standard - Future Proposals
uint8_t x = 1, b = 2;
uint8_t val1 = (((x) << (b)) | ((x) >> ((sizeof(x)*CHAR_BIT) - (b))));
uint8_t val2 = x ># (b * 2);
uint8_t val3 = (uint8_t)rotl( x, b * 3 );

Now tell me: if you were to simply glance at this code after the operator became commonplace, and you had no prior context, which value's result would you understand the quickest?

Jonathan Müller

Nov 24, 2017, 4:21:44 AM
to std-pr...@isocpp.org
On 24.11.2017 09:57, gb2...@gmail.com wrote:
>
> uint8_t x = 1, b = 2;
> uint8_t val1 = (((x) << (b)) | ((x) >> ((sizeof(x)*CHAR_BIT) - (b))));
> uint8_t val2 = x ># (b * 2);
> uint8_t val3 = (uint8_t)rotl( x, b * 3 );
>
> Now tell me: if you were to simply glance at this code after the operator
> became commonplace, and you had no prior context, which value's result
> would you understand the quickest?
>

val3 as it doesn't use a weird operator I've never seen before.

David Brown

Nov 24, 2017, 5:14:07 AM
to std-pr...@isocpp.org
I'd also say val3, as to me val2 looks like it would be a rotate /right/
operator rather than a rotate left - while "rotl" is clear and obvious.



Jens Maurer

Nov 24, 2017, 5:16:38 AM
to std-pr...@isocpp.org
rotl and friends is on track for C++20:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0553r1.html
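For illustration, usage would look roughly like this (assuming the <bit> header choice discussed earlier in the thread holds; exact signatures are subject to the final wording):

#include <bit>       // C++20, per the LEWG decision mentioned above
#include <cstdint>

std::uint8_t x = 0x81;               // 1000'0001
std::uint8_t l = std::rotl(x, 1);    // 0000'0011 - the top bit wraps around
std::uint8_t r = std::rotr(x, 1);    // 1100'0000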

Jens

gb2...@gmail.com

Nov 24, 2017, 5:39:28 AM
to ISO C++ Standard - Future Proposals
Re-read the question; you obviously did not read it thoroughly. Then answer.

gb2...@gmail.com

Nov 24, 2017, 5:43:59 AM
to ISO C++ Standard - Future Proposals, da...@westcontrol.com
I never said they would all be the same direction, so you understood the operator correctly at a simple glance, which is what I wanted out of the operator to begin with. Having said that, which one do you feel took longer to understand (even if only by a fraction of a second)?

Jens Maurer

Nov 24, 2017, 5:50:06 AM
to std-pr...@isocpp.org

This discussion is pointless.

A standardized function works very well for bit rotation,
and once commonplace (and even when not) will be readily
understood when reading code.

Inventing new operators for this fringe use-case of
bit rotation is entirely unnecessary. Should this ever
come before the C++ committee, I'd vote against it,
and, venturing an educated guess, so will a vast majority
of the other participants.

If you want to convince me otherwise, you have to find
good arguments why new operators are superior over named
functions for bit rotation. Simply "because we can" is
insufficient.

https://isocpp.org/std/submit-a-proposal

Jens



gb2...@gmail.com

Nov 24, 2017, 6:16:27 AM
to ISO C++ Standard - Future Proposals
How about readability? For example, if we compare the process APIs of win32 & posix, you'll notice that with posix you rely on directory & file functions to do the bulk of the work. In a small function where you can easily see the initial values of each poorly named variable, that's fine, but when a poster posts complex code for both win32 and posix using quick names like x, i, etc. (which a lot of newbies, of which we were all once part, like to do), it is easier to read & correct the win32 version because of names like OpenProcess or Process32First/Process32Next than opendir and readdir, which posix forces you to rely on. Even if things are named properly, there is still only so much a name can tell you from just a glance.

Jens Maurer

Nov 24, 2017, 6:55:52 AM
to std-pr...@isocpp.org
Which POSIX process API that uses opendir are you talking about?
A specific reference to the POSIX standard
http://pubs.opengroup.org/onlinepubs/9699919799/ is appreciated.

Anyway, back to the topic: I'd much prefer to read "rotl" and "rotr"
in foreign code than an operator. Even seeing << always gives me
pause (does he really want a bit-shift here, or is that his way
of saying "multiplication by 2^n"?).

Beyond that, new functions are pure library extensions (= easy
and fairly cheap), but new operators are language extensions
(= more complicated and costly in terms of committee review and
implementation). Does bit rotation justify these extra costs?
No way.

Jens

David Brown

Nov 24, 2017, 11:01:01 AM
to std-pr...@isocpp.org
You are using the Windows API as a model of readability? What about the
"OpenFile" function - that at one point in Windows history could be used
to open a connection to devices, serial ports, network sockets -
practically everything /except/ a file. Opening an existing file is
done by the "CreateFile" function. How could anyone possibly get confused?

And what this has to do with the readability of <# vs. rotl, I cannot
imagine.





Thiago Macieira

Nov 24, 2017, 12:24:00 PM
to std-pr...@isocpp.org
On sexta-feira, 24 de novembro de 2017 00:57:01 PST gb2...@gmail.com wrote:
> uint8_t x = 1, b = 2;
> uint8_t val1 = (((x) << (b)) | ((x) >> ((sizeof(x)*CHAR_BIT) - (b))));
> uint8_t val2 = x ># (b * 2);
> uint8_t val3 = (uint8_t)rotl( x, b * 3 );

The val3 one.

Nicol Bolas

Nov 24, 2017, 1:46:10 PM
to ISO C++ Standard - Future Proposals, gb2...@gmail.com
On Friday, November 24, 2017 at 3:57:01 AM UTC-5, gb2...@gmail.com wrote:
uint8_t x = 1, b = 2;
uint8_t val1 = (((x) << (b)) | ((x) >> ((sizeof(x)*CHAR_BIT) - (b))));
uint8_t val2 = x ># (b * 2);
uint8_t val3 = (uint8_t)rotl( x, b * 3 );

Now tell me: if you were to simply glance at this code after the operator became commonplace, and you had no prior context, which value's result would you understand the quickest?

I think the question is essentially assuming its own conclusion. Why? Because it presupposes a circumstance that I do not believe can come about: the notion that "the operator became common place".

Bitshifting is already a fairly rare operation in most of the code out there. Oh sure, if you dig down deep, you'll have to bitshift to do some things. But at the higher-level code that most people work in? It's not a common thing. It's not nearly as commonplace as addition, multiplication, or logical operators. It's not even as common as bitwise operators (which are frequently used for flags and the like without needing shifts). Oh yes, I know there are programmers who do a lot of bit-work in their daily lives and need to shift bits around. But for the majority of C++ programmers? Meh.

Bit-rotation is a less common operation than bit shifting. Oh sure, there are times when you need to do it. But it seems apparent that shifting is done more often in general than rotation. And as previously established, shifting isn't exactly common.

Given that, I do not see how a bit rotation operator can ever be called "commonplace". And without that caveat, without the operator being something people see even semi-frequently, it's obvious that #3 will be the more readable option.

That is, #2 is only readable if you're familiar with it. #3 will be readable regardless of familiarity (though I would prefer a longer name).

Thiago Macieira

Nov 24, 2017, 7:09:52 PM
to std-pr...@isocpp.org
On sexta-feira, 24 de novembro de 2017 10:46:09 PST Nicol Bolas wrote:
> Bit-*rotation* is a less common operation than bit shifting.

Proof: the fact that there's no built-in rotation operator in any major
language, not even in C that was supposed to be "portable assembly".
Especially in the olden days when you couldn't count on the compilers
understanding what you meant.

And if you want an even less common operation: rotate with carry.

Andrey Semashev

Nov 25, 2017, 12:14:24 AM
to std-pr...@isocpp.org
On 11/25/17 03:09, Thiago Macieira wrote:
> On sexta-feira, 24 de novembro de 2017 10:46:09 PST Nicol Bolas wrote:
>> Bit-*rotation* is a less common operation than bit shifting.
>
> Proof: the fact that there's no built-in rotation operator in any major
> language, not even in C that was supposed to be "portable assembly".
> Especially in the olden days when you couldn't count on the compilers
> understanding what you meant.
>
> And if you want an even less common operation: rotate with carry.

That's not exactly a proof, more of a conservative approach by language
designers. Every widespread CPU architecture includes rotation
instructions because they are important in some areas.

Thiago Macieira

Nov 25, 2017, 12:37:36 AM
to std-pr...@isocpp.org
No doubt they are, because unlike in a high-level language, a rotation
composed of shifts in assembly couldn't be executed in one cycle due to the
data dependency[*], not to mention the need to use one extra register,
possibly two. A dedicated
instruction for rotation (and one more for rotate-with-carry) can probably
execute in one cycle by just reusing the barrel-shifter.

The high-level language doesn't need that because the compilers are smart
enough to notice the pattern of two-shifts-and-or as a rotation and write
assembly accordingly.

[*] unless the processor is smart enough to do macro-op fusing of three
instructions. The Intel x86 processors' microcode can fuse CMP+Jcc, but I
don't know of three-instruction fusing.

Nicol Bolas

Nov 25, 2017, 1:03:09 AM
to ISO C++ Standard - Future Proposals
Or a simple recognition that syntax is limited and a function will perform just as efficiently with basic inlining support as an operator in the language. So you spend your limited resource based on how common something is, not how "important" it may be "in some areas".

Andrey Semashev

Nov 25, 2017, 7:08:12 AM
to std-pr...@isocpp.org
On 11/25/17 08:37, Thiago Macieira wrote:
> On sexta-feira, 24 de novembro de 2017 21:14:19 PST Andrey Semashev wrote:
>> On 11/25/17 03:09, Thiago Macieira wrote:
>>> On sexta-feira, 24 de novembro de 2017 10:46:09 PST Nicol Bolas wrote:
>>>> Bit-*rotation* is a less common operation than bit shifting.
>>>
>>> Proof: the fact that there's no built-in rotation operator in any major
>>> language, not even in C that was supposed to be "portable assembly".
>>> Especially in the olden days when you coudn't count on the compilers
>>> understanding what you meant.
>>>
>>> And if you want an even less common operation: rotate with carry.
>>
>> That's not exactly a proof, more of a conservative approach by language
>> designers. Every widespread CPU architecture includes rotation
>> instructions because they are important in some areas.
>
> No doubt they are, because unlike a high-level language, executing a rotation
> in assembly couldn't be executed in one cycle due to the data dependency[*],
> not to mention the need to use one extra register, possibly two. A dedicated
> instruction for rotation (and one more for rotate-with-carry) can probably
> execute in one cycle by just reusing the barrel-shifter.
>
> The high-level language doesn't need that because the compilers are smart
> enough to notice the pattern of two-shifts-and-or as a rotation and write
> assembly accordingly.

Well, the compilers were not always that smart, and if we had the
standard rotation functions or operators from the start, people wouldn't
have to write assembler code and compilers wouldn't need to be taught to
recognize certain code patterns. Reliability of this recognition has
always been and will be a question of QoI, i.e. something that cannot be
relied on.

I'm not arguing for operators or functions approach here. I'm saying
that dedicated rotation operations are long overdue in C and C++.

Thiago Macieira

Nov 25, 2017, 3:24:30 PM
to std-pr...@isocpp.org
On sábado, 25 de novembro de 2017 04:08:07 PST Andrey Semashev wrote:
> > The high-level language doesn't need that because the compilers are smart
> > enough to notice the pattern of two-shifts-and-or as a rotation and write
> > assembly accordingly.
>
> Well, the compilers were not always that smart, and if we had the
> standard rotation functions or operators from the start, people wouldn't
> have to write assembler code and compilers wouldn't need to be taught to
> recognize certain code patterns. Reliability of this recognition has
> always been and will be a question of QoI, i.e. something that cannot be
> relied on.

True, chicken-and-the-egg: if the operators had been available, perhaps they
would have been used more often.

But my assertion remains: if they had really been needed, they'd have been
standardised and added to some language by now. For now, they only exist as
intrinsics in <x86intrin.h>.

> I'm not arguing for operators or functions approach here. I'm saying
> that dedicated rotation operations are long overdue in C and C++.

I dispute that. They can only be overdue if you needed them and didn't have
them. Since you can do rotation right now, you have them.

David Brown

Nov 25, 2017, 3:46:36 PM
to std-pr...@isocpp.org
On 24/11/17 11:43, gb2...@gmail.com wrote:
> On Friday, 24 November 2017 10:14:07 UTC, David Brown wrote:
>
> On 24/11/17 10:21, Jonathan Müller wrote:
> > On 24.11.2017 09:57, gb2...@gmail.com wrote:
> >>
> >> uint8_t x = 1, b = 2;
> >> uint8_t val1 = (((x) << (b)) | ((x) >> ((sizeof(x)*CHAR_BIT) - (b))));
> >> uint8_t val2 = x ># (b * 2);
> >> uint8_t val3 = (uint8_t)rotl( x, b * 3 );
> >>
> >> Now tell me: if you were to simply glance at this code after the operator
> >> became commonplace, and you had no prior context, which value's result
> >> would you understand the quickest?
> >>
> >
> > val3 as it doesn't use a weird operator I've never seen before.
> >
>
> I'd also say val3, as to me val2 looks like it would be a rotate
> /right/
> operator rather than a rotate left - while "rotl" is clear and obvious.
>
>
> I never said they would all be the same direction, so you understood the
> operator correctly at a simple glance, which is what I wanted out of the
> operator to begin with. Having said that, which one do you feel took
> longer to understand (even if only by a fraction of a second)?
>

val3 would be as fast as val2 if you hadn't included an unnecessary cast
and (to my eyes) odd spacing.

Even if val2 were faster by a fraction of a second, that would not be
relevant. It would need to be a /lot/ faster to be significant, and it
would also need to be an operation that is used regularly. Rotations
are rare - they only turn up in a few niche types of code, such as
cryptography algorithms. And usually such code is so complicated and
requires such time and thought to understand, that any time spent
understanding a rotation operator or function is negligible.

There seems to be an idea that because most cpus have fast instructions
for rotate operations, they must be very useful and should be common in
code. It is simply not the case. These
instructions are /occasionally/ useful - but on the occasions where they
are used, the code often needs to be fast. They are also very cheap to
implement in the hardware (once you have the shift operations
implemented). In real-life code, they are needed so rarely that
expressions such as val1 above are probably fine (and good compilers
will optimise them nicely) - for cases where they are used a lot, a
simple static inline function will do the job.

There are some types where having additional operators beyond C++'s
current set could be useful. Rotations of integer types is not, IMHO,
one of them.



Nicol Bolas

Nov 25, 2017, 4:17:18 PM
to ISO C++ Standard - Future Proposals


On Saturday, November 25, 2017 at 3:24:30 PM UTC-5, Thiago Macieira wrote:
On sábado, 25 de novembro de 2017 04:08:07 PST Andrey Semashev wrote:
> > The high-level language doesn't need that because the compilers are smart
> > enough to notice the pattern of two-shifts-and-or as a rotation and write
> > assembly accordingly.
>
> Well, the compilers were not always that smart, and if we had the
> standard rotation functions or operators from the start, people wouldn't
> have to write assembler code and compilers wouldn't need to be taught to
> recognize certain code patterns. Reliability of this recognition has
> always been and will be a question of QoI, i.e. something that cannot be
> relied on.

True, chicken-and-the-egg: if the operators had been available, perhaps they
would have been used more often.

But my assertion remains: if they had really been needed, they'd have been
standardised and added to some language by now.

One could make the same case about `optional`, for example. Or `variant`. Or all of `filesystem`. Or a C++ file IO API that isn't stupid. Or any number of other things that the language/library clearly needs but we don't have.

Things which are needed are things people have a need for. And you can demonstrate that need by showing people doing the same thing over and over. We saw people making `variant` and `optional` types, so we made standard ones. We saw people making filesystem libraries, so we standardized one. Etc.

The question is not whether bitwise rotation is a worthy addition to C++. It's how it should be added.

For now, they only exist as
intrinsics in <x86intrin.h>.

> I'm not arguing for operators or functions approach here. I'm saying
> that dedicated rotation operations are long overdue in C and C++.

I dispute that. They can only be overdue if you needed them and didn't have
them. Since you can do rotation right now, you have them.

Like we don't need Qt, right? We can just write cross-platform GUI code ourselves.

The fact that compilers are written to detect manual bitwise rotation and convert them into optimal code makes it clear that it is used enough for them to have special-cases for it. The fact that bitwise rotations are useful is not something that can reasonably be disputed.

gb2...@gmail.com

Nov 25, 2017, 4:18:53 PM
to ISO C++ Standard - Future Proposals
On Saturday, 25 November 2017 20:24:30 UTC, Thiago Macieira wrote:
True, chicken-and-the-egg: if the operators had been available, perhaps they
would have been used more often.

But my assertion remains: if they had really been needed, they'd have been
standardised and added to some language by now. For now, they only exist as
intrinsics in <x86intrin.h>.
1st point: the chicken would have to come 1st via evolution (assuming our universe wasn't actually made by god). 2nd: just because they didn't implement an operator does not mean they didn't try to come up with one and simply fail due to the restraints of the time period; nowadays compilers are smart enough not to confuse <# etc. with preprocessor code, since # would have to be the 1st non-whitespace character on the line.

Andrey Semashev

Nov 25, 2017, 5:33:24 PM
to std-pr...@isocpp.org
We have everything right now because we have assembler. We don't have
them in C++, unless you want to rely on optimizations (I don't). This
state of things doesn't make C++ any better, so I don't think this is an
argument you'd want to make.

Thiago Macieira

Nov 25, 2017, 11:45:14 PM
to std-pr...@isocpp.org
On sábado, 25 de novembro de 2017 13:17:18 PST Nicol Bolas wrote:
> > But my assertion remains: if they had really been needed, they'd have been
> > standardised and added to some language by now.
>
> One could make the same case about `optional`, for example. Or `variant`.
> Or all of `filesystem`. Or a C++ file IO API that isn't stupid. Or any
> number of other things that the language/library clearly needs but we don't
> have.

Sorry, that's not the same. I was setting the bar for adding an operator.

Those have existed for some time as implementation-specific extensions, such as
the Intel-defined intrinsics in <x86intrin.h> (GCC and Clang) and <immintrin.h>
(ICC and MSVC).

So, like most of the standardisation coming from Boost, we have experimented
and found what works.

But no one has created an operator as an extension.

gb2...@gmail.com

Nov 26, 2017, 2:51:50 AM
to ISO C++ Standard - Future Proposals
That'll be because of people like you who won't let the language move forward because you're being stubborn about a standard that was set yonks ago. If people treated human language the same way, then we'd never get words like "lol"/"ikr", or something as simple as the word "you" slipping in; we'd instead be using "thee" or whatever it was.

Adding an operator for something that does get used is not gonna change the way the standard works in other areas; it just provides a way to do basic math as simply as the rest. It also provides a reason to deprecate things like rotl, which should never have needed to be created in the 1st place.

Viacheslav Usov

Nov 26, 2017, 9:01:35 AM
to ISO C++ Standard - Future Proposals
On Sun, Nov 26, 2017 at 5:45 AM, Thiago Macieira <thi...@macieira.org> wrote:

>  But no one has created an operator as an extension.

If somebody wants rotary shifters as operators, the usual shift operators can be overloaded. One just needs a new type for that, e.g.:

x << std::rotary(y)

Or, perhaps,

std::rotary(x) << y
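For example, a rough user-code sketch of the second form (rotary here is a plain user-defined wrapper, not an existing std facility):

#include <climits>
#include <cstdint>

// Hypothetical wrapper: wrapping the left operand in rotary<> turns
// the shift operator into a rotation.
template <typename T>
struct rotary { T value; };

template <typename T>
constexpr T operator<<(rotary<T> x, unsigned b)   // rotate left
{
    constexpr unsigned w = sizeof(T) * CHAR_BIT;
    b %= w;
    return static_cast<T>((x.value << b) | (x.value >> ((w - b) % w)));
}

// usage: std::uint32_t r = rotary<std::uint32_t>{x} << y;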

That is probably in response to the general thread, not specifically to your message.

Cheers,
V.

Nicol Bolas

Nov 26, 2017, 9:58:06 AM
to ISO C++ Standard - Future Proposals
On Saturday, November 25, 2017 at 11:45:14 PM UTC-5, Thiago Macieira wrote:
On sábado, 25 de novembro de 2017 13:17:18 PST Nicol Bolas wrote:
> > But my assertion remains: if they had really been needed, they'd have been
> > standardised and added to some language by now.
>
> One could make the same case about `optional`, for example. Or `variant`.
> Or all of `filesystem`. Or a C++ file IO API that isn't stupid. Or any
> number of other things that the language/library clearly needs but we don't
> have.

Sorry, that's not the same. I was setting the bar for adding an operator.

gb2...@gmail.com

Nov 26, 2017, 1:05:19 PM
to ISO C++ Standard - Future Proposals
On Sunday, 26 November 2017 14:01:35 UTC, Viacheslav Usov wrote:
If somebody wants rotary shifters as operators, the usual shift operators can be overloaded. One just needs a new type for that, e.g.:

x << std::rotary(y)

Or, perhaps,

std::rotary(x) << y

That is probably in response to the general thread, not specifically to your message.

Cheers,
V.

That implies C++, and that one won't want to use the shift operator for its original purpose. I'm suggesting this operator specifically for hard integers like un/signed char, short, int, long & long long.

zxu...@gmail.com

Nov 26, 2017, 1:14:45 PM
to ISO C++ Standard - Future Proposals, gb2...@gmail.com
On Sunday, 26 November 2017 18:05:19 UTC, gb2...@gmail.com wrote:
On Sunday, 26 November 2017 14:01:35 UTC, Viacheslav Usov wrote:
If somebody wants rotary shifters as operators, the usual shift operators can be overloaded. One just needs a new type for that, e.g.:

x << std::rotary(y)

Or, perhaps,

std::rotary(x) << y

That is probably in response to the general thread, not specifically to your message.

Cheers,
V.
My messages seem to be getting garbled in waterfox, trying this from chrome now. That implies C++ and no desire to use the shift operators for their original purpose; my suggestion involves the hard integers themselves: un/signed char, short, int, long & long long.

Viacheslav Usov

Nov 26, 2017, 1:25:34 PM
to ISO C++ Standard - Future Proposals
On Sun, Nov 26, 2017 at 7:05 PM, <gb2...@gmail.com> wrote:

> That implies C++

Is this not natural in a C++ proposals forum?

Besides, that is not true. std::rotary() (with std:: replaced appropriately) can equally well work in C, where this will require, depending on the form chosen, at least one new built-in type and new semantics for the shift operators for the new type(s).

> and that one won't want to use the shift operator for it's original purpose, I'm suggesting this operator specifically for hard integers like un/signed char, short, int, long & long long

I do not think I really understand what you say here, but in the form I mentioned nothing makes the built-in operator unavailable, nor "hard integers" excluded.

Cheers,
V.

zxu...@gmail.com

Nov 26, 2017, 3:42:49 PM
to ISO C++ Standard - Future Proposals
On Sunday, 26 November 2017 18:25:34 UTC, Viacheslav Usov wrote:
Is this not natural in a C++ proposals forum?
 
I followed links saying C/C++ (I forgot the origin as it started as a google search on "c suggest feature"); I assumed that meant both languages were being handled here. Is that not so?

On Sunday, 26 November 2017 18:25:34 UTC, Viacheslav Usov wrote:
Besides, that is not true. std::rotary() (with std:: replaced appropriately) can equally well work in C, where this will require, depending on the form chosen, at least one new built-in type and new semantics for the shift operators for the new type(s).

No, C does not provide overwriting/overloading of operators, and rightly so, since that would make code harder to understand for outside developers. I've always viewed C#/Obj-C/C++ as extensions built specifically for the lazy kind of programmer (one who can't be bothered to check for memory leaks or to type a few extra characters for specific versions of functions); a good programmer will always use C when possible, since it squeezes out the maximum in speed, memory and control.

Jonathan Müller

Nov 26, 2017, 3:44:55 PM
to std-pr...@isocpp.org
On 26.11.2017 21:42, zxu...@gmail.com wrote:
> On Sunday, 26 November 2017 18:25:34 UTC, Viacheslav Usov wrote:
>
> Is this not natural in a C++ proposals forum?
>
> I followed links saying C/C++ (I forgot the origin as it started as a
> google search on "c suggest feature"); I assumed that meant both
> languages were being handled here. Is that not so?

No, this is the discussion for C++ proposals. Not C.

>
> On Sunday, 26 November 2017 18:25:34 UTC, Viacheslav Usov wrote:
>
> Besides, that is not true. std::rotary() (with std:: replaced
> appropriately) can equally well work in C, where this will require,
> depending on the form chosen, at least one new built-in type and new
> semantics for the shift operators for the new type(s).
>
>
> No, C does not provide overwriting/overloading of operators, and rightly
> so, since that would make code harder to understand for outside
> developers. I've always viewed C#/Obj-C/C++ as extensions built
> specifically for the lazy kind of programmer (one who can't be bothered
> to check for memory leaks or to type a few extra characters for specific
> versions of functions); a good programmer will always use C when
> possible, since it squeezes out the maximum in speed, memory and control.
>
Yeah, no.

zxu...@gmail.com

Nov 26, 2017, 4:32:08 PM
to ISO C++ Standard - Future Proposals
Well, anyway, since I clearly have trouble finding it, could you direct me to the C group? I'll re-post my suggestion there with a link to this one; since this would affect C++ as well, it's better that the conversation is continued here.

Andrey Semashev

Nov 26, 2017, 4:38:04 PM
to std-pr...@isocpp.org
With a few exceptions, operators are overloadable and I would expect the
proposed rotation operators to be as well. It would probably make sense
to provide overloads for user-defined types like bigint or std::bitset.
This, actually, is one point in favor of doing operators rather than
functions. (Yes, we can overload functions too, but it may follow the
std::swap peril, where overloads are easily not found or are more
difficult to use.)
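As an illustration, such an overload for std::bitset could be composed from its existing shifts (a sketch; unlike built-in integers, bitset shifts by counts >= N are well-defined, which keeps this simple):

#include <bitset>
#include <cstddef>

// Sketch: rotate-left for std::bitset<N>, N > 0. A shift by N yields
// all zeros, so the s == 0 case needs no special handling.
template <std::size_t N>
std::bitset<N> rotl(const std::bitset<N>& x, std::size_t s)
{
    s %= N;
    return (x << s) | (x >> (N - s));
}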

Ren Industries

Nov 26, 2017, 4:57:27 PM
to std-pr...@isocpp.org
Right on, who cares about elegance or code clarity. Better to mangle the function name manually!


Nicol Bolas

Nov 26, 2017, 5:17:58 PM
to ISO C++ Standard - Future Proposals, zxu...@gmail.com
No, it wouldn't. C++ does not automatically inherit features from C. So even if you could convince them to adopt this operator (and they're a *lot* more change-averse than the C++ committee), that doesn't mean C++ will get it.

Thiago Macieira

Nov 26, 2017, 11:03:18 PM
to std-pr...@isocpp.org
On sábado, 25 de novembro de 2017 23:51:49 PST gb2...@gmail.com wrote:
> That'll be because of people like you who won't let the language move
> forward because you're being stubborn about a standard that was set yonks
> ago, if people treated human language the same way then we'd never get
> words like "lol"/"ikr" or something as simple as the word "you" slipping
> in, we'd instead be using "thee" or whatever it was.

If we treated a computer language like a human language, we'd never write a
good compiler and much less cross-platform code, because everyone would write
code differently, in different dialects.

I have a friend who wrote one language like that just for fun. All functions
were verbs, all variable names were nouns, property getters were adjectives.
When passing a variable as a parameter, you had to spell the variable name in
the accusative case if passed by value, dative case if you wanted to pass by
reference/pointer. To take a function's address, you used its infinitive name,
but to call it you need to use the imperative mood. And of course, all
functions in the language's standard library were irregular verbs, so the way
you needed to call them followed a different formation rule compared to your
own functions.

> Adding an operator for something that does get used is not gonna change the
> way the standard works in other areas, it just provides a way to do basic
> math as simple as the rest, it also provides a reason to depreciate things
> like rotl which should never have needed to be created in the 1st place.

Sorry, but you may be wrong there. The fact that we add an operator changes
the language grammar, which may have unintended consequences. We've already
discussed the fact that there are no good symbol characters available, as # is
used by the preprocessor, $ is used in certain implementations as an
identifier character, and characters outside the basic character set are used
in certain language extensions.

Even if you find one, there's a chance that the sequence of symbols will have
been used somewhere in someone's code, and the new operator would change the
meaning.

It's doable to add new operators, just see the spaceship operator being added.
But there needs to be a really good reason to do so, including an explanation
of why a function isn't sufficient to solve the problem.

Thiago Macieira

Nov 26, 2017, 11:08:20 PM
to std-pr...@isocpp.org
On domingo, 26 de novembro de 2017 13:32:08 PST zxu...@gmail.com wrote:
> Well anyway since I clearly have trouble finding it could you direct me to
> the c group, I'll re-post my suggestion there with a link to this one,
> since this would affect C++ as well it's better that the conversion is
> continued here.

You have trouble finding it because they don't discuss over the Internet.

You should prepare a paper and submit it for the next mailing. You may need to
attend one of their meetings to present and defend your paper.

See http://www.open-std.org/jtc1/sc22/wg14/ for more information.

Viacheslav Usov

Nov 27, 2017, 6:54:27 AM
to ISO C++ Standard - Future Proposals
On Sun, Nov 26, 2017 at 9:42 PM, <zxu...@gmail.com> wrote:

> I followed links saying C/C++

Never trust anything that says "C/C++". It is usually a lie.

> No, C does not provide overwriting/overloading of operators

I said nothing about "overwriting/overloading of operators". I said "new semantics for the shift operators for the new type(s)".

> C when possible since it squeezes out the maximum in speed, memory and control

Having maximum in speed and memory at the same time is an exceptional occurrence, never the norm, no matter what language is used.

Cheers,
V.

zxu...@gmail.com

Nov 27, 2017, 7:55:23 AM
to ISO C++ Standard - Future Proposals
On Monday, 27 November 2017 04:08:20 UTC, Thiago Macieira wrote:

You have trouble finding it because they don't discuss over the Internet.

You should prepare a paper and submit it for the next mailing. You may need to
attend one of their meetings to present and defend your paper.

See http://www.open-std.org/jtc1/sc22/wg14/ for more information.

Ugh, there's a limit to being out-dated. I'll just wait until the stubborn old mules insisting on keeping things in such out-dated ways pop their clogs. If I die before the committee comes to the modern era, then it was never worth my time.

zxu...@gmail.com

Nov 27, 2017, 8:10:54 AM
to ISO C++ Standard - Future Proposals
On Monday, 27 November 2017 11:54:27 UTC, Viacheslav Usov wrote:
Never trust anything that says "C/C++". It is usually a lie.

I agree but I don't plan on trawling through hundreds of links just to avoid that stuff.
 
On Monday, 27 November 2017 11:54:27 UTC, Viacheslav Usov wrote:
I said nothing about "overwriting/overloading of operators". I said "new semantics for the shift operators for the new type(s)".

new "semantics" on operators always means overwriting/overloading original handling
 
On Monday, 27 November 2017 11:54:27 UTC, Viacheslav Usov wrote:
Having maximum in speed and memory at the same time is an exceptional occurrence, never the norm, no matter what language is used.

I never said they were always at maximum simultaneously, just that only C as the base language can achieve it, as the C++ extensions always produce something the developer cannot control and thus cannot squeeze out the maximum of speed, memory &/or control; none of those can be achieved when relying on extensions. Furthermore, extensions make debugging considerably more complicated when you cannot straight away see where your function starts in disassembly, because of those extensions worming their way in before, or even into, the function code itself.

zxu...@gmail.com

Nov 27, 2017, 8:22:09 AM
to ISO C++ Standard - Future Proposals

On Monday, 27 November 2017 04:03:18 UTC, Thiago Macieira wrote:
If we treated a computer language like a human language, we'd never write a
good compiler and much less cross-platform code, because everyone would write
code differently, in different dialects.

You missed my point there. No matter the language, it undergoes a refinement process as long as it is in use; this refinement produces the useful stuff that did not exist at the start. This end point is what I'm getting at here, and what this group is really all about. Without these types of discussions, programming languages would actually undergo that process in the form of new languages (aka perl, rust, java etc.) and we would never settle on a single language as the main one, always being stuck with crappy capabilities. These groups instead do that process via discussions like these, which in turn produce the useful stuff that is eventually added to the language instead of changing the language.

zxu...@gmail.com

Nov 27, 2017, 8:42:16 AM
to ISO C++ Standard - Future Proposals
On Monday, 27 November 2017 04:03:18 UTC, Thiago Macieira wrote:
Sorry, but you may be wrong there. The fact that we add an operator changes
the language grammar, which may have unintended consequences. We've already
discussed the fact that there are no good symbol characters available, as # is
used by the preprocessor, $ is used in certain implementations as identifier,
characters outside the basic character set are used in certain language
extensions.
 
Yes, # is used by the preprocessor, but it's still smart enough not to think '#' or "#" or any string containing # is a preprocessor instruction, so why not an operator?
What you said is basically the same as saying people should not use tone of voice or other similar means to distinguish between words, because every sound of that group is used to represent that one sound; such a statement is ultimately stupid, because people like the Japanese already do precisely that. Also, I'm not hung up on actually using #, but it does seem like the least likely to cause problems.
 
Even if you find one, there's a chance that the sequence of symbols will have
been used somewhere and in someone's code and the new operator changes the
meaning. 

A chance does not mean definitely, and catering to the minority is what caused so many wars in the past, mainly catering to the top 1% of the wealthy or the lusts of a single man, etc. Just because someone does not like a change does not necessarily mean it is not good for them; e.g. I did not like switching away from short names like a, b, c when programming, but found it was a good change when I finally got out of that habit, because when I asked for help people actually understood a lot better what each variable was supposed to do.
 
It's doable to add new operators, just see the spaceship operator being added.
But there needs to be a really good reason to do so, including an explanation
of why a function isn't sufficient to solve the problem.

Well, my reason is I don't want to think more than I need to (the irony that I program as a hobby is not lost on me), but I'm sure someone has a far better reason than that. There is the example I gave earlier about readability, but there could still be even better reasons, such as specialty programs making heavy use of rotation (don't ask me for what, though, since I have no idea).

Andrey Semashev

Nov 27, 2017, 9:30:11 AM
to std-pr...@isocpp.org
On November 27, 2017 4:42:18 PM zxu...@gmail.com wrote:

> On Monday, 27 November 2017 04:03:18 UTC, Thiago Macieira wrote:
>>
>> Sorry, but you may be wrong there. The fact that we add an operator
>> changes the language grammar, which may have unintended consequences.
>> We've already discussed the fact that there are no good symbol characters
>> available, as # is used by the preprocessor, $ is used in certain
>> implementations as an identifier character, and characters outside the
>> basic character set are used in certain language extensions.
>>
>
> Yes, # is used by the preprocessor, but it's still smart enough not to
> think '#' or "#" or any string containing # is a preprocessor instruction,
> so why not an operator?

The # character will interfere regardless of its position in the input,
because inside a function-like macro it is the stringizing operator.

#define ROL(x, y) x <# y

will probably expand to

x <" y"

In any case, interpreting tokens depending on the context is more
difficult, especially if it affects two relatively separate components like
the preprocessor and the parser. Better to use a separate token valid in C++.




Nicol Bolas

Nov 27, 2017, 10:12:48 AM
to ISO C++ Standard - Future Proposals, zxu...@gmail.com
Arbitrarily declaring that people who don't agree with you are a priori wrong, out-dated, or whatever is not a viable approach for convincing anyone of the validity of your position.

Hyman Rosen

Nov 27, 2017, 11:47:14 AM
to std-pr...@isocpp.org, zxu...@gmail.com
On Mon, Nov 27, 2017 at 10:12 AM, Nicol Bolas <jmck...@gmail.com> wrote:
Arbitrarily declaring that people who don't agree with you are a priori wrong, out-dated, or whatever is not a viable approach for convincing anyone of the validity of your position.

Then again, there's that Max Planck paraphrased quote, "Science advances one funeral at a time."

I think that there is a reasonable possibility that the C++ Standard committee is a victim of epistemic
closure - the same people meeting time after time, agreeing with each other on certain priorities that
perhaps the C++ community at large would not agree with.  Meanwhile, even simple fixes to obvious
errors remain unaddressed draft after draft (such as the number of bits required for an enumeration
type in [dcl.enum]), while more ways are found to make programs have undefined behavior so that
optimizationists can point to the beautiful code their compilers generate by ignoring what the
programmers actually wanted their programs to do.

Jonathan Müller

Nov 27, 2017, 11:50:26 AM
to std-pr...@isocpp.org
Yeah, no.

(wow, second time in this thread)

Nicol Bolas

Nov 27, 2017, 12:22:00 PM
to ISO C++ Standard - Future Proposals, zxu...@gmail.com
On Monday, November 27, 2017 at 11:47:14 AM UTC-5, Hyman Rosen wrote:
On Mon, Nov 27, 2017 at 10:12 AM, Nicol Bolas <jmck...@gmail.com> wrote:
Arbitrarily declaring that people who don't agree with you are a priori wrong, out-dated, or whatever is not a viable approach for convincing anyone of the validity of your position.

Then again, there's that Max Planck paraphrased quote, "Science advances one funeral at a time."

I think that there is a reasonable possibility that the C++ Standard committee is a victim of epistemic
closure - the same people meeting time after time, agreeing with each other on certain priorities that
perhaps the C++ community at large would not agree with.

Yeah, the C++ community at large is totally unhappy with prioritizing Concepts, Modules, Reflection, operator<=> and comparison operator generation, structured binding, and so forth. The committee needs to spend more time dealing with making C++ more like C and turning the "object model" into just some memory where stuff maybe kinda exists until it doesn't.

Look, I'll be the first to admit that the committee screws up and gets things wrong (their constant rejection of any terse lambda syntax or strong alias proposals, for example). But those aren't a question of priorities; they have more to do with their process not being rooted into solving problems so much as evaluating proposals. That there's no real recognition that rejecting approach after approach to solving a problem is effectively saying that the problem will never be solved.

Meanwhile, even simple fixes to obvious
errors remain unaddressed draft after draft (such as the number of bits required for an enumeration
type in [dcl.enum]), while more ways are found to make programs have undefined behavior so that
optimizationists can point to the beautiful code their compilers generate by ignoring what the
programmers actually wanted their programs to do.

It's funny; I can't seem to recall a single instance of C++ adding new UB for this purpose. Oh sure, there have been lots of clarifications of rules that were unclear. But there has been no change to the object model since C++98; only more detail in explaining how it works.

All of the problems you cite about C++'s object model either are in C++98 or are due to defect resolutions to wording in C++98. The only difference is that the model has been better specified, so you finally noticed what is and is not UB.

Hyman Rosen

Nov 27, 2017, 1:08:39 PM
to std-pr...@isocpp.org
On Mon, Nov 27, 2017 at 12:21 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Yeah, the C++ community at large is totally unhappy with prioritizing Concepts,
Modules, Reflection, operator<=> and comparison operator generation, structured 
binding, and so forth. The committee needs to spend more time dealing with making 
C++ more like C and turning the "object model" into just some memory where stuff 
maybe kinda exists until it doesn't.

<https://thenextweb.com/dd/2015/11/02/linux-creator-linus-torvalds-had-a-meltdown-over-a-pull-request-and-it-was-awesome/>
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71892>

There are in fact quite a few members of the community who have no patience for the
optimizationists who are breaking the intent of programmers.  When a compiler writer
decides that (a + 1 < a) is unilaterally false, even though a programmer has written
a test for that, that compiler writer has made a grave error, not in interpreting the language
but in the service provided to users of the compiler.

It's funny; I can't seem to recall a single instance of C++ adding new UB for this purpose.
Oh sure, there have been lots of clarifications of rules that were unclear.  But there has
been no change to the object model since C++98; only more detail in explaining
how it works.  All of the problems you cite about C++'s object model either are in C++98
or are due to defect resolutions to wording in C++98. The only
difference is that the model has been better specified, so you finally noticed what is and
is not UB.

It's not that I notice.  It's that compiler vendors are using these "clarifications" to break
programs - they deliberately miscompile what the programmer has written in ways that
are inconsistent with the way they compiled programs before, thereby silently breaking
them and leaving users scrambling to find out why mysterious failures are happening.
Whether or not these possibilities have always been present, it's the clarifications that
make them manifest.

And what exactly is wrong with "some memory where stuff exists"?  Strangely enough,
computer programs run on hardware that contains memory where stuff exists.  It's C++
that's pretending that this isn't the case and twisting itself into knots trying to describe
that, while breaking programs for people who are trying to pick apart floating-point
numbers, stream values in and out of objects, protect themselves against undefined
but common behavior, and don't believe that "copying" a byte means that they are not
allowed to look at its value.

As far as I'm concerned, the point of the C++ object model is to ensure that certain
functions - the *allocators and *structors - get called at certain times.  The whole
notion that "object creation" is some magic fairy dust that needs to be sprinkled over
memory or else its contents are rendered untouchable is pernicious nonsense.

Bo Persson

unread,
Nov 27, 2017, 1:26:41 PM11/27/17
to std-pr...@isocpp.org
An odd thing here is that all major compilers are sponsored (or even
owned) by organizations using them to build their operating systems.
They even send their own representatives to the committee meetings.

One could imagine that, if they were truly unsatisfied customers, this would
have some effect on the committee work - or that new standard revisions would
not be unanimously accepted.


Bo Persson


Thiago Macieira

unread,
Nov 27, 2017, 1:29:29 PM11/27/17
to std-pr...@isocpp.org
On Monday, 27 November 2017 05:22:09 PST zxu...@gmail.com wrote:
> You missed my point there: no matter the language, it undergoes a refinement
> process as long as it is in use, and this refinement produces the useful
> stuff that did not exist at the start. That end point is what I'm getting at
> here, and it's what this group is really all about. Without these types of
> discussions, programming languages would undergo that process in the form of
> new languages (Perl, Rust, Java, etc.), and we would never settle on a
> single main language, always stuck with crappy capabilities. These groups
> instead run that process via discussions like these, which in turn produce
> the useful stuff that is eventually *added* to the language *instead of
> changing* the language.

And we're open to suggestions on improving the language. We all agree we
should have a way to perform rotations.

We just disagree that we need an operator in the core language. A function in
a library header is just as efficient, perhaps easier to read, definitely
won't cause parsing mistakes, and has an existing, proven track record of
working.

Nicol Bolas

unread,
Nov 27, 2017, 1:50:11 PM11/27/17
to ISO C++ Standard - Future Proposals

On Monday, November 27, 2017 at 1:08:39 PM UTC-5, Hyman Rosen wrote:
> On Mon, Nov 27, 2017 at 12:21 PM, Nicol Bolas <jmck...@gmail.com> wrote:
>> Yeah, the C++ community at large is totally unhappy with prioritizing
>> Concepts, Modules, Reflection, operator<=> and comparison operator
>> generation, structured binding, and so forth. The committee needs to spend
>> more time dealing with making C++ more like C and turning the "object
>> model" into just some memory where stuff maybe kinda exists until it
>> doesn't.
>
> <https://thenextweb.com/dd/2015/11/02/linux-creator-linus-torvalds-had-a-meltdown-over-a-pull-request-and-it-was-awesome/>
> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71892>
>
> There are in fact quite a few members of the community who have no patience
> for the optimizationists who are breaking the intent of programmers.

I wasn't aware that Linus Torvalds was a member of the *C++* community. I
thought he had decided that C++ was crap long ago.

And even if he was, this is hardly evidence for the C++ community at large
being upset with the standards committee over this.

Also, you haven't explained what this has to do with *prioritization* from
the committee. I rather suspect that, given the choice between "modules" and
"less UB", most C++ programmers would pick "modules".

Lastly, please explain how those compiler changes from the GCC bug report
are the result of "clarifications" of C++98? Show me the wording in C++98
that made calling member functions with a NULL pointer well-defined
behavior. Because if you can't, then these were not due to "clarifications"
of C++98; they've *always been there*.

> When a compiler writer decides that (a + 1 < a) is unilaterally false, even
> though a programmer has written a test for that, that compiler writer has
> made a grave error, not in interpreting the language but in the service
> provided to users of the compiler.
>
>> It's funny; I can't seem to recall a single instance of C++ adding new UB
>> for this purpose. Oh sure, there have been lots of *clarifications* of
>> rules that were unclear. But there has been no change to the object model
>> since C++98; only having more detail in explaining how it works. All of
>> the problems you cite about C++'s object model either are in C++98 or are
>> due to defect resolutions to wording in C++98. The only difference is that
>> the model has been better specified, so you finally noticed what is and is
>> not UB.
>
> It's not that I notice. It's that compiler vendors are using these
> "clarifications" to break programs - they deliberately miscompile what the
> programmer has written in ways that are inconsistent with the way they
> compiled programs before, thereby silently breaking them and leaving users
> scrambling to find out why mysterious failures are happening. Whether or
> not these possibilities have always been present, it's the clarifications
> that make them manifest.

Are you so sure about that? If so, prove it.

Give an example of code that used to "work" that doesn't now, based *solely*
on wording from C++11 and later. Where a "clarification" is the difference
maker, rather than simply when compiler writers decided to start making
changes.

FYI: your example `if(a + 1 < a)` is not such a case. C89 permitted
compilers to assume that was false for signed integers.

Myriachan

unread,
Nov 27, 2017, 3:59:40 PM11/27/17
to ISO C++ Standard - Future Proposals
On Saturday, November 25, 2017 at 12:46:36 PM UTC-8, David Brown wrote:
Rotations are rare - they only turn up in a few niche types of code, such as
cryptography algorithms.  And usually such code is so complicated and
requires such time and thought to understand, that any time spent
understanding a rotation operator or function is negligible.


I work with cryptography code a lot, and I completely agree here.  I really don't care if I have to call an inline function named rotl() instead of having a custom operator to do it.  It's very little inconvenience, and it ends up as the same machine instruction anyway.  It's better not having to deal with adding a new operator for something rather rare.  (Whereas needing operator <=> is a lot more common than needing bit rotation.)
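
For reference, the usual UB-free formulation is short enough to live in any header (a sketch - the name and the fixed 32-bit width are illustrative), and GCC, Clang, and MSVC all pattern-match it into a single rotate instruction:

#include <cstdint>

// Portable left rotation, well-defined for every count, including 0.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned s) noexcept {
    return (x << (s % 32u)) | (x >> ((32u - s) % 32u));
}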

Melissa

Hyman Rosen

unread,
Nov 27, 2017, 5:16:12 PM11/27/17
to std-pr...@isocpp.org
On Mon, Nov 27, 2017 at 1:50 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Lastly, please explain how those compiler changes from the GCC bug report are the result of "clarifications" of C++98? Show me the wording in C++98 that made calling member functions with a NULL pointer well-defined behavior. Because if you can't, then these were not due to "clarifications" of C++98; they've always been there.
...
Give an example of code that used to "work" that doesn't now, based solely on wording from C++11 and later. Where a "clarification" is the difference maker, rather than simply when compiler writers decided to start making changes.

FYI: your example `if(a + 1 < a)` is not such a case. C89 permitted compilers to assume that was false for signed integers.

Bo Persson said:
> An odd thing here is that all major compilers are sponsored (or even owned) by organizations using them to build their operating systems. They even send their own representatives to the committee meetings.

> One could imagine that them being truly unsatisfied customers would have some effects on the committee work. Or new standard revisions not being unanimously accepted.

Since I consider compiler writers the problem, it's not surprising to me that they're not complaining about the results!

C compilers were intended to be "close to the machine," so their arithmetic was supposed to be "whatever the machine did."  Signed overflow was formally undefined because compilers could run on hardware with 1's-complement arithmetic, or with hardware traps on overflow, or whatever, and defining the effects in the language could badly penalize programs running on hardware that really wanted to do the arithmetic differently.

That's qualitatively different from compilers deliberately adopting a model not represented by the hardware in order to *cause* programs to have undefined behavior.  On a processor where signed overflow naturally wraps around (which is virtually all of them), the compiler should not make signed overflow be undefined behavior and then eliminate code which checks for overflow after the fact.
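
To make the disagreement concrete: the post-hoc test is exactly the form that gets folded away, while a check written before the operation stays well-defined. A minimal sketch (the function name is illustrative):

#include <limits>

// (a + 1 < a) may legally be folded to false because signed overflow is UB;
// comparing against the maximum before incrementing cannot be.
bool increment_would_overflow(int a) {
    return a == std::numeric_limits<int>::max();
}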

Similarly, the fact that calling a member function through a null pointer is undefined is not a reason for compilers to eliminate (this == 0) tests.  Implementations typically call member functions as if they were ordinary functions with a hidden initial parameter, and if a program calls a member function through a null pointer, then the this pointer would indeed be null.  When the programmer has written such a test, the compiler has no business removing it.
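
The pattern at issue looks roughly like this (a sketch for illustration, not code from the GCC report):

struct Widget {
    // a defensive check some older codebases relied on
    bool valid() const { return this != nullptr; }
};

// Calling through a null w is already UB under the standard's rules, so an
// optimizer may fold the test inside valid() to true.
bool check(Widget* w) { return w->valid(); }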

And programmers have from time immemorial wanted to overlay different types on the same area of memory and interpret the object representation of one type as the object representation of another.  In Fortran, they used EQUIVALENCE to do that.  In C and C++, they used unions or just plain old casting from one pointer type to another. (That's how the X Window event system works.)  Taking that away in the formal language definition doesn't change the fact that people want to do this, have written programs that do it, and that those programs break at the whim of compiler writers, who are there in the meetings to make sure that nothing gets into the standard that would "negatively impact optimization."

Tony V E

unread,
Nov 27, 2017, 5:21:26 PM11/27/17
to Standard Proposals
On Mon, Nov 27, 2017 at 11:46 AM, Hyman Rosen <hyman...@gmail.com> wrote:
On Mon, Nov 27, 2017 at 10:12 AM, Nicol Bolas <jmck...@gmail.com> wrote:
Arbitrarily declaring that people who don't agree with you are a priori wrong, out-dated, or whatever is not a viable approach for convincing anyone of the validity of your position.

Then again, there's that Max Planck paraphrased quote, "Science advances one funeral at a time."

I think that there is a reasonable possibility that the C++ Standard committee is a victim of epistemic
closure - the same people meeting time after time,


I'm pretty sure that currently most of the committee members are new to the committee - i.e., fewer than 5 years on it, possibly fewer than 3 even.

 
agreeing with each other on certain priorities that
perhaps the C++ community at large would not agree with.  Meanwhile, even simple fixes to obvious
errors remain unaddressed draft after draft (such as the number of bits required for an enumeration
type in [dcl.enum]), while more ways are found to make programs have undefined behavior so that
optimizationists can point to the beautiful code their compilers generate by ignoring what the
programmers actually wanted their programs to do.




--
Be seeing you,
Tony

Thiago Macieira

unread,
Nov 27, 2017, 6:49:18 PM11/27/17
to std-pr...@isocpp.org
On Monday, 27 November 2017 14:15:48 PST Hyman Rosen wrote:
> C compilers were intended to be "close to the machine," so their arithmetic
> was supposed to be "whatever the machine did."

No, they were not.

C was designed to be "portable assembly", which meant it's not "whatever the
machine did" but instead a known set of rules known to work the same on all
machines. Sometimes that meant reducing to a subset of what the machine could
do, sometimes it meant some machines needed to do more work compared to
others.

> Signed overflow was
> formally undefined because compilers could run on hardware with
> 1's-complement arithmetic, or with hardware traps on overflow, or whatever,
> and defining the effects in the language could badly penalize programs
> running on hardware that really wanted to do the arithmetic differently.

Right.

> That's qualitatively different from compilers deliberately adopting a model
> not represented by the hardware in order to *cause* programs to have
> undefined behavior. On a processor where signed overflow naturally wraps
> around (which is virtually all of them), the compiler should not make
> signed overflow be undefined behavior and then eliminate code which checks
> for overflow after the fact.

I understand you believe this, but that's not a shared belief.

Others believe that, since the language says you cannot do something,
compilers are free to assume you did not do it.

> And programmers have from time immemorial wanted to overlay different types
> on the same area of memory and interpret the object representation of one
> type as the object representation of another. In Fortran, they used
> EQUIVALENCE to do that. In C and C++, they used unions or just plain old
> casting from one pointer type to another. (That's how the X Window event
> system works.) Taking that away in the formal language definition doesn't
> change the fact that people want to do this, have written programs that do
> it, and that those programs break at the whim of compiler writers, who are
> there in the meetings to make sure that nothing gets into the standard that
> would "negatively impact optimization."

I agree we need a way to fix this. That means specifying the behaviour
properly, not getting rid of all UB.

David Brown

unread,
Nov 28, 2017, 4:33:42 AM11/28/17
to std-pr...@isocpp.org
On 27/11/17 21:59, Myriachan wrote:
> On Saturday, November 25, 2017 at 12:46:36 PM UTC-8, David Brown wrote:
>
> Rotations are rare - they only turn up in a few niche types of code,
> such as
> cryptography algorithms. And usually such code is so complicated and
> requires such time and thought to understand, that any time spent
> understanding a rotation operator or function is negligible.
>
>
> I work with cryptography code a lot, and I completely agree here.

Since I don't work with cryptography myself, I am glad to see that
confirmation!

> I
> really don't care if I have to call an inline function named rotl()
> instead of having a custom operator to do it. It's very little
> inconvenience, and it ends up the same machine instruction anyway. It's
> better not having to deal with adding a new operator for something
> rather rare. (Whereas needing operator <=> is a lot more common than
> needing bit rotation.)
>

There is plenty of scope for other operators that would be of greater
potential use in C and C++. The power operator comes to mind
(especially one that handles integer powers well, unlike the pow()
function). gcc used to have "max" and "min" operators as extensions,
but I believe they have been removed from later versions. And of course
for user-defined types, all sorts of operators could make sense - concat
for strings, matrix transpose and inversion, vector dot product and
cross product, and so on. If unicode mathematics operators were
allowed, it would open the possibility of very neat and clear
mathematical code in C++ - and huge confusion between × and x, · and .,
etc., and endless frustration from people who can't type those symbols.
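
For the integer-power case at least, the semantics such an operator could pin down are easy to state - exponentiation by squaring over unsigned arithmetic. A sketch (the name is illustrative):

#include <cstdint>

// Exact integer power, unlike the rounding double-based pow(). Unsigned
// arithmetic, so overflow wraps rather than being undefined.
constexpr std::uint64_t ipow(std::uint64_t base, unsigned exp) noexcept {
    std::uint64_t result = 1;
    while (exp != 0) {
        if (exp & 1u) result *= base;  // fold in the current exponent bit
        base *= base;                  // square for the next bit
        exp >>= 1;
    }
    return result;
}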



David Brown

unread,
Nov 28, 2017, 7:27:19 AM11/28/17
to std-pr...@isocpp.org
On 28/11/17 00:49, Thiago Macieira wrote:

(Thiago, the things I have written below are primarily for other
people's benefit - I know you know this stuff.)

> On Monday, 27 November 2017 14:15:48 PST Hyman Rosen wrote:
>> C compilers were intended to be "close to the machine," so their arithmetic
>> was supposed to be "whatever the machine did."
>
> No, they were not.
>
> C was designed to be "portable assembly",

C was /not/ designed to be a "portable assembly". It was designed to be
a portable low-level language eliminating the need for assembly in most
situations. There is a big difference. (The rest of your explanation
here is, IMHO, correct - precisely because C has never been "portable
assembly". So I fully agree with your comments here - it is just the
name "portable assembler" is what people use to justify claims that C
should work in particular ways on particular systems.)

> which meant it's not "whatever the
> machine did" but instead a known set of rules known to work the same on all
> machines. Sometimes that meant reducing to a subset of what the machine could
> not, sometimes it meant some machines needed to do more work compared to
> others.
>
>> Signed overflow was
>> formally undefined because compilers could run on hardware with
>> 1's-complement arithmetic, or with hardware traps on overflow, or whatever,
>> and defining the effects in the language could badly penalize programs
>> running on hardware that really wanted to do the arithmetic differently.
>
> Right.

There is an additional reason for making signed overflow undefined
behaviour - there is no single sensible behaviour that could be picked.
If we think of a 16-bit machine (because the numbers are easier :-) ),
what do you get if you take 20,000 apples and add another 15,000 apples?
In what world does it make sense for the result to be -30536 apples?
It would be more sensible to throw an error, crash the program, raise a
signal, or post an error message. It could also make sense to use
saturating arithmetic - the "int" can't handle more than 32,767 apples -
so 20,000 + 15,000 is 32,767. And on other occasions, it makes more
sense to just take whatever the hardware gives you.

So since there is no single signed overflow behaviour that makes sense in
all cases, it does not make sense to pick one and define it.

And it is perfectly reasonable for a compiler to note that since there
is no sensible behaviour for signed overflow, the programmer will ensure
that it does not happen - or does not care what the code does if it
/does/ happen. Thus the compiler can optimise on the assumption that
signed overflow does not exist.

The mistake some people make is to think that a C (or C++) "int" is just
a signed type matching the cpu's registers. It is not - it is an
approximate model of an integer number. That model will, of course, be
limited - for efficiency, and because any computer is finite. But
identities like "a + 1 > a" are always true in this model.

(Unsigned integer types, on the other hand, explicitly model modular
arithmetic types.)
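
Concretely, this is the kind of folding that model licenses (a sketch):

// Because signed overflow is assumed not to happen, a compiler may compile
// this entire function down to "return true".
bool identity_holds(int a) { return a + 1 > a; }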

>
>> That's qualitatively different from compilers deliberately adopting a model
>> not represented by the hardware in order to *cause* programs to have
>> undefined behavior.

First, compilers here are deliberately adopting the /C/ model of
arithmetic. If you want the /processor's/ model of arithmetic, use
assembly or a programming language whose arithmetic model matches the
one you want. If you want to program in C or C++, learn to understand
the C model of arithmetic.

Secondly, compilers do /not/ cause programs to have undefined behaviour,
or cause them to have bugs (barring errors in the compiler, of course).
The program already had undefined behaviour and bugs. All that has
changed is that the undefined behaviour coincidentally matched the
programmer's expectations in some cases, and not in other cases.

>> On a processor where signed overflow naturally wraps
>> around (which is virtually all of them), the compiler should not make
>> signed overflow be undefined behavior and then eliminate code which checks
>> for overflow after the fact.
>
> I understand you believe this, but that's not a shared belief.
>

Unfortunately, it /is/ a belief that some people share. Fortunately it
seems to be a small proportion of C (and C++) programmers.

> Others believe that, since the language says you cannot do something,
> compilers are free to assume you did not do it.

Since that is the belief shared by the people that defined the
language, and the people that write the compilers, it is clearly the
important one!

Most people, I think, simply think "don't let your signed integers
overflow, dereference null pointers, etc." - then there is no problem.

>
>> And programmers have from time immemorial wanted to overlay different types
>> on the same area of memory and interpret the object representation of one
>> type as the object representation of another. In Fortran, they used
>> EQUIVALENCE to do that. In C and C++, they used unions or just plain old
>> casting from one pointer type to another. (That's how the X Window event
>> system works.) Taking that away in the formal language definition doesn't
>> change the fact that people want to do this, have written programs that do
>> it, and that those programs break at the whim of compiler writers, who are
>> there in the meetings to make sure that nothing gets into the standard that
>> would "negatively impact optimization."
>
> I agree we need a way to fix this, since we need to do that. That means
> specifying the behaviour properly, not getting rid of all UB.
>

Indeed.

I believe C specifies that union based type punning is defined behaviour
(implementation defined, since it depends on byte orders, alignments,
layouts, etc.) while it is undefined behaviour in C++.
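
The formulation that is well-defined in both languages is to copy the object representation instead. A minimal sketch, assuming a 32-bit float (the function name is illustrative):

#include <cstdint>
#include <cstring>

// Read a float's object representation without union punning. Well-defined
// in both C and C++, and optimizers lower the memcpy to a register move.
std::uint32_t bits_of(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "assumes a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}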



Hyman Rosen

unread,
Nov 28, 2017, 12:32:39 PM11/28/17
to std-pr...@isocpp.org
On Tue, Nov 28, 2017 at 7:26 AM, David Brown <da...@westcontrol.com> wrote:
There is an additional reason for making signed overflow undefined
behaviour - there is no single sensible behaviour that could be picked.

There is no need to pick a single sensible behavior.

The mistake some people make is to think that a C (or C++) "int" is just
a signed type matching the cpu's registers.

The mistake some people make is to think that it's not. 

Secondly, compilers do /not/ cause programs to have undefined behaviour,

Yes, they do.  C leaves certain operations undefined because it could be
problematic to implement them in a single way on hardware that doesn't
support it.  It's the compilers that choose not to provide a definition suitable
to their implementation.  On nearly every common processor, there is no
reason that signed wraparound should not be the result of signed integer
arithmetic.
 
All that has changed is that the undefined behaviour coincidentally matched
the programmer's expectations in some cases, and not in other cases.

The truly pernicious aspect of this is that the compilers have increasingly
and silently stopped matching expectations on pre-existing programs.  They
are doing that for optimizationism - making previously defined (by the compiler)
operations deliberately undefined so that they can show off clever tricks.

> Others believe that, since the language says you cannot do something,
> compilers are free to assume you did not do it.

Since that is the belief shared by the people that defined the
language, and the people that write the compilers, it is clearly the
important one!

As I said, epistemic closure.  There are very important users who are furious
over this sort of thing, but they are not listened to:
<https://lwn.net/Articles/511259/>
<http://yarchive.net/comp/linux/timer_wrapping_c.html>

Nicol Bolas

unread,
Nov 28, 2017, 2:16:27 PM11/28/17
to ISO C++ Standard - Future Proposals
On Tuesday, November 28, 2017 at 12:32:39 PM UTC-5, Hyman Rosen wrote:
On Tue, Nov 28, 2017 at 7:26 AM, David Brown <da...@westcontrol.com> wrote:
There is an additional reason for making signed overflow undefined
behaviour - there is no single sensible behaviour that could be picked.

There is no need to pick a single sensible behavior.

The mistake some people make is to think that a C (or C++) "int" is just
a signed type matching the cpu's registers.

The mistake some people make is to think that it's not. 

Secondly, compilers do /not/ cause programs to have undefined behaviour,

Yes, they do.  C leaves certain operations undefined because it could be
problematic to implement them in a single way on hardware that doesn't
support it.  It's the compilers that choose not to provide a definition suitable
to their implementation.

OK, so we've left the realm of "the standard is broken" and moved into the realm of "the compiler is broken because it doesn't do the thing I expect it to do."

Let us now go back to where this all started:

> I think that there is a reasonable possibility that the C++ Standard committee is a victim of epistemic
closure - the same people meeting time after time, agreeing with each other on certain priorities that perhaps the C++ community at large would not agree with.

Since your problem is, by your own admission, not with the C++ standards committee, would you like to retract this statement?

On nearly every common processor, there is no
reason that signed wraparound should not be the result of signed integer
arithmetic.
 
All that has changed is that the undefined behaviour coincidentally matched
the programmer's expectations in some cases, and not in other cases.

The truly pernicious aspect of this is that the compilers have increasingly
and silently stopped matching expectations on pre-existing programs.  They
are doing that for optimizationism - making previously defined (by the compiler)
operations deliberately undefined so that they can show off clever tricks.

OK, you're effectively saying that the optimizations they're making are not actually "optimizing" anything. That they're not making real code faster. That they're just "showing off clever tricks" that are presumably useless.

What evidence do you have for this? Prove that such optimizations don't improve the performance of real code.

Because if these "clever tricks" actually improve the performance of real code, then they are optimizations, not mere "clever tricks". And since the point of C and C++ is to be fast, making compiled C and C++ fast is not just the right of compiler writers, it's their job.

> Others believe that, since the language says you cannot do something,
> compilers are free to assume you did not do it.

Since that is the belief shared by the people that defined the
language, and the people that write the compilers, it is clearly the
important one!

As I said, epistemic closure.  There are very important users who are furious
over this sort of thing, but they are not listened to:
<https://lwn.net/Articles/511259/>
<http://yarchive.net/comp/linux/timer_wrapping_c.html>

I don't know what citing Torvalds again is intended to prove. Yes, Torvalds agrees with you; we know that already.

However important he may be, his views are clearly not important enough to make compiler writers agree with him. Probably because they have to serve needs that don't align with his.

C and C++ compilers do not belong to Linus Torvalds. Or to any one person.

Hyman Rosen

unread,
Nov 28, 2017, 4:51:25 PM11/28/17
to std-pr...@isocpp.org
On Tue, Nov 28, 2017 at 2:16 PM, Nicol Bolas <jmck...@gmail.com> wrote:
OK, so we've left the realm of "the standard is broken" and moved into the realm of "the compiler is broken because it doesn't do the thing I expect it to do."

The standard should urge implementations to make behavior that is undefined
in the standard implementation-defined.  It should emphasize in more places
than just for pointer-to-integer conversion that behavior should be "unsurprising
to those who know the structure of the underlying machine."

In fact, it already does that in some places.  In [expr.shift], the effect of right-shifting
a negative signed integer is implementation-defined, not undefined.  Signed arithmetic
overflow is undefined behavior because the C standard anticipated implementations on
hardware where such overflow causes a trap.
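
For instance (a sketch; the -4 result assumes the common arithmetic-shift choice, which is exactly what [expr.shift] leaves to the implementation):

int r = -8 >> 1;  // implementation-defined, not undefined: the compiler must
                  // document the result; on the usual two's-complement
                  // targets this is an arithmetic shift, giving -4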

Whether the standard or its interpreters are at fault isn't relevant.  The outcome is bad
when "assume undefined behavior can't happen" becomes the operating paradigm.

Let us now go back to where this all started:

> I think that there is a reasonable possibility that the C++ Standard committee is a victim of epistemic
closure - the same people meeting time after time, agreeing with each other on certain priorities that perhaps the C++ community at large would not agree with.

Since your problem is, by your own admission, not with the C++ standards committee, would you like to retract this statement?

No, not really.  Undefined behavior on signed overflow is only one aspect.
Other aspects, such as being unable to reinterpret_cast memory to an object
type without using placement new, not allowing type punning in unions and
elsewhere, and "vector can't be implemented in C++" are all problems of the
standard.

OK, you're effectively saying that the optimizations they're making are not actually "optimizing" anything. That they're not making real code faster. That they're just "showing off clever tricks" that are presumably useless.

What evidence do you have for this? Prove that such optimizations don't improve the performance of real code.

Because if these "clever tricks" actually improve the performance of real code, then they are optimizations, not mere "clever tricks". And since the point of C and C++ is to be fast, making compiled C and C++ fast is not just the right of compiler writers, it's their job.

The point of a programming language is to communicate the intent of programmers
to the operation of a computer.  Doing the wrong thing faster is not the point.  The notion that
the point of C and C++ is to be fast is the epistemic closure I'm talking about.  The fact
that the manager of one of the most widely used operating systems in the world is
furious about compiler decisions that deliberately break old working code, the fact that
code ubiquitous in standard library implementations uses type punning, as does the
code of the X Window system, none of that makes a dent in optimizationist attitude.

Optimization should be the process of improving code along some axis (size, speed)
without changing its meaning, not deliberately removing meaning in order to enable
more improvements.

C and C++ have already added a plethora of integer variants.  If non-wrapping signed
overflow is important to many users, create integer types that are specifically not permitted
to wrap and have the users use those.  If strict aliasing is important to many users, adopt
the restrict keyword from C and have the users apply it to things they know aren't aliased.
Treat uninitialized data as having arbitrary but stable contents, and if many users believe
it's important to have undefined uninitialized data, invent an initialization syntax that will let
them specify just that on the objects they care about.
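
Either direction can be prototyped as a library type today; the wrapping half takes only a few lines. A sketch (the type is illustrative, not a proposal):

// Do the arithmetic in unsigned, which the language defines to wrap; the
// conversion back to int is implementation-defined (modular in practice on
// two's-complement targets).
struct wrapping_int {
    int v;
    friend wrapping_int operator+(wrapping_int a, wrapping_int b) {
        return { static_cast<int>(static_cast<unsigned>(a.v) +
                                  static_cast<unsigned>(b.v)) };
    }
};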

C and C++ compilers do not belong to Linus Torvalds. Or to any one person.

Yes.  That's why I keep banging the drum - so that optimizationists will at least know
that other voices exist even if they will not heed them.

zxu...@gmail.com

unread,
Nov 28, 2017, 4:54:35 PM11/28/17
to ISO C++ Standard - Future Proposals
On Monday, 27 November 2017 14:30:11 UTC, Andrey Semashev wrote:
The # character will interfere regardless of position in the input because
it is a preprocessor instruction in macros.

  #define ROL(x, y) x <# y

will probably expand to

  x < "y"

I had actually forgotten about that scenario, but the whole point of this thread was to get a discussion going on what is most suitable for an operator. Since no one has seen fit to actually suggest an alternative, I'll just mention another suggestion: <?, <?=, >? & >?=
I haven't seen any other usage for ? besides the "(a < b) ? a : b" scenario, so I can't see it being a problem.
I read further down about how pow() should have its own operator; the immediate thought I had there was that maybe ** & **= would work - old compilers which aren't supposed to support the new standard would just fail to compile it anyway. As for the rotl/rotr thing, I dislike them only because of A: how one needs to cast to be sure of getting the part of the integer one wants, and one cannot be sure the bits rotate the same way they would on a smaller/larger integer (e.g. 0x1 becoming 0x80000000 instead of 0x80); and B: one cannot apply the rotation directly to the integer being worked on (like how "a += 1" is faster than "a = a + 1").

Thiago Macieira

unread,
Nov 28, 2017, 8:27:10 PM11/28/17
to std-pr...@isocpp.org
On Tuesday, 28 November 2017 09:32:14 PST Hyman Rosen wrote:
> Yes, they do. C leaves certain operations undefined because it could be
> problematic to implement them in a single way on hardware that doesn't
> support it. It's the compilers that choose not to provide a definition
> suitable
> to their implementation. On nearly every common processor, there is no
> reason that signed wraparound should not be the result of signed integer
> arithmetic.

Let's take another example: shifting left or right by more than the unsigned
type's number of bits. In C and C++ this is UB.

What should, according to you, the following produce:

1U << functionThatReturns33();

Please answer bearing in mind that the SHL operation in assembly is only
specified to work with values less than 32. Any higher value may shift out
everything, shift nothing, or shift by the count modulo 32.

Thiago Macieira

unread,
Nov 28, 2017, 8:33:26 PM11/28/17
to std-pr...@isocpp.org
On Tuesday, 28 November 2017 17:27:03 PST Thiago Macieira wrote:
> What should, according to you, the following produce:
>
> 1U << functionThatReturns33();
>
> Please answer bearing in mind that the SHL operation in assembly is only
> specified to work with values less than 32. Any higher value may shift out
> everything, shift nothing, or shift by the count modulo 32.

Please answer in two cases:

1) regular context, the compiler could not inline functionThatReturns33()

2) constant expression context, the compiler can inline and did constant
propagation

in other words:

$ cat a.cpp
extern int functionThatReturns33();
unsigned u1 = 1U << functionThatReturns33();

$ cat b.cpp
constexpr int constexpr33() { return 33; }
int functionThatReturns33() { return constexpr33(); }

constexpr unsigned u2 = 1U << constexpr33();

What are the values of u1 and u2, according to you?

Peter Koch Larsen

unread,
Nov 28, 2017, 9:42:54 PM11/28/17
to std-pr...@isocpp.org
I believe there is a tension here between performance and correctness.
In my opinion if there is a choice, correctness should always be
chosen.
If a programmer writes if (i + 1 < i), allowing the compiler to
evaluate that to false is a very bad a priori solution. It is
undefined behaviour, but it clearly was not the intention of the
programmer to utilise that effect.
In my opinion, the proper thing to do with such an expression should be:
a) if the compiler can create a warning that UB is removing the check,
by all means go for it. A responsible programmer (or the build-system)
will see the warning and correct the code.
b) if the compiler is unable to generate a warning, evaluate the
expression and live with it unless the build-system is set to "extreme
optimisation". Perhaps we need an extra flag here.

Legally, the compiler writers are allowed to perform these
optimizations, but morally they should refrain from doing so unless
specifically told to do so.

/Peter

Thiago Macieira

unread,
Nov 28, 2017, 10:24:08 PM11/28/17
to std-pr...@isocpp.org
On Tuesday, 28 November 2017 18:42:51 PST Peter Koch Larsen wrote:
> If a programmer writes if (i + 1 < i), allowing the compiler to
> evaluate that to false is a very bad a priori solution. It is
> undefined behaviour, but it clearly was not the intention of the
> programmer to utilise that effect.
> In my opinion, the proper thing to do with such an expression should be:
> a) if the compiler can create a warning that UB is removing the check,
> by all means go for it. A responsible programmer (or the build-system)
> will see the warning and correct the code.

GCC did that. It was incredibly annoying. It was for many years the *only*
warning we could not fix in Qt, so it got blacklisted with
-Wno-error=strict-overflow. We *couldn't* fix the warning, since the compiler
was telling us that it had performed an optimisation we *wanted*.

The code wasn't an obvious "i + 1 < i". It was a series of inlined functions
and constant propagation. There was nothing wrong with our code, nothing wrong
with the optimisation.

GCC 8 has got rid of the warning.

> b) if the compiler is unable to generate a warning, evaluate the
> expression and live with it unless the build-system is set to "extreme
> optimisation". Perhaps we need an extra flag here.

If you want it, use -ftrapv -fwrapv.

I don't. I want the compiler to optimise my code by way of constant
propagation. Like I said, there was nothing wrong with the code. The
optimisation was correct.

> Legally, the compiler writers are allowed to perform these
> optimizations, but morally they should refrain from doing so unless
> specifically told to do so.

I disagree.

Let's take another typical case: null pointer dereference. A null pointer is
defined by the standard as a special value and MUST be, because in code like

delete[] ptr;

the compiler needs to generate a null pointer check (see [1])

That means a valid array ptr must not be null. Since that is the case, if the
code in question were

ptr[0] = S();
delete[] ptr;

Should the compiler be allowed to delete the null pointer check it had done
because the array had to be valid?

That's the core of the issue Linus was complaining about.

[1] https://godbolt.org/g/4Y3X8g

Thiago Macieira

unread,
Nov 29, 2017, 12:41:38 AM11/29/17
to std-pr...@isocpp.org
On Tuesday, 28 November 2017 13:54:35 PST zxu...@gmail.com wrote:
> I had actually forgotten about that scenario but the whole point of this
> thread was to get the discussion going on what is the most suitable for an
> operator, since no one has seen fit to actually suggest an alternative
> I'll just mention other suggestion: <?, <?=, >? & >?=

https://gcc.gnu.org/onlinedocs/gcc-3.4.2/gcc/Min-and-Max.html

Deprecated, though.

> I haven't seen any other usage for ? beside the "(a < b) ? a : b" scenario
> so I can't see it being a problem.

See above.

> I read further down about how pow() should have its own operator, the
> immediate thought I had there was maybe ** & **= would work, old compilers
> which aren't supposed to support the new standard would just fail to
> compile it anyway. As for the rotl/rotr thing I only don't like them
> because of A: how 1 needs to cast to be sure they get the part of the
> integer they want when they use it, neither can they be sure the bits
> rotate in the same way they would rotate on smaller/larger integer (e.g.
> 0x1 becoming 0x80000000 instead of 0x80) & B: One cannot apply the rotation
> directly to the integer they are working with (like how "a += 1" is faster
> than "a = a + 1")

Faster to type, but not to execute.

But I did not understand your reason for not liking. What cast is necessary?
Can you give an example where using rotl without a cast would result in a
surprise or incorrect result?

Please remember: integer promotion rules still apply for built-in operators.
The following does NOT run into signed integer overflow:
short v = 0x4000;
v *= v;
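
Spelled out, step by step (a sketch of what the abstract machine does with that snippet; the temporary is illustrative):

short v = 0x4000;             // 16384
int tmp = int(v) * int(v);    // both operands promoted to int first:
                              // 268435456, which fits in int - no overflow
v = static_cast<short>(tmp);  // the narrowing back to short is the lossy
                              // step, implementation-defined (modulo 2^16
                              // in practice), so v ends up as 0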

David Brown

unread,
Nov 29, 2017, 4:59:09 AM11/29/17
to std-pr...@isocpp.org
On 29/11/17 04:24, Thiago Macieira wrote:

>
> Let's take another typical case: null pointer dereference. A null pointer is
> defined by the standard as a special value and MUST be, because in code like
>
> delete[] ptr;
>
> the compiler needs to generate a null pointer check (see [1])
>
> That means a valid array ptr must not be null. Since that is the case, if the
> code in question were
>
> ptr[0] = S();
> delete[] ptr;
>
> Should the compiler be allowed to delete the null pointer check it had done
> because the array had be valid?

And the answer is, of course, yes - the compiler /should/ be able to
delete that null pointer check. The compiler can assume the programmer
knows what he/she is doing - it trusts the programmer to write correct
code according to the language definitions, the compiler definitions,
and the target definitions.

However, it would be nice if the gcc folks had included a warning when
an explicit null pointer check was removed by this optimisation. The
challenge is doing this sort of thing without false positives. It would
be good to have a warning on:

*p = 1;
if (!p) foo();

Here the whole "if (!p) foo();" can be removed by the compiler - but
there is almost certainly a mistake somewhere by the programmer.

But if you have:

#define checkfoo(p) if (!(p)) foo()

if (p) {
    checkfoo(p);
}

then the check and the call to foo() can also clearly be removed by the
same optimisation, and you /don't/ want a warning here.

What that means for us programmers is that compilers can't tell us all
our mistakes - we still have to know the language, know the tools, and
write correct code.

>
> That's the core of the issue Linus was complaining about.
>

Yes - and he was blaming the wrong people. The fault lies with the
author of the kernel code and the people who reviewed it.



David Brown

unread,
Nov 29, 2017, 6:57:45 AM11/29/17
to std-pr...@isocpp.org
On 28/11/17 18:32, Hyman Rosen wrote:
> On Tue, Nov 28, 2017 at 7:26 AM, David Brown <da...@westcontrol.com
> <mailto:da...@westcontrol.com>> wrote:
>
> There is an additional reason for making signed overflow undefined
> behaviour - there is no single sensible behaviour that could be picked.
>
>
> There is no need to pick a single sensible behavior.

There are occasions when it is better to pick something rather than
nothing. But there are many advantages in having /no/ defined behaviour
for things like signed overflow in C. Picking a fixed behaviour - such
as two's complement wrap-around for signed overflow - comes at a cost.
Some of these costs include:

1. int arithmetic is no longer associative and distributive.
2. Multiply and divide (when there is no remainder) are no longer inverses.
3. Basic identities like "if you add a positive number to an integer, it
gets bigger" no longer hold.
4. Division by a known constant can't always be done by multiplication.

If you want integer arithmetic with fixed overflow characteristics, C
and C++ provide that - use unsigned types.

If you /really/ want a different dialect of C for which signed integer
arithmetic is two's complement, pick a compiler with options to give you
that.

>
> The mistake some people make is to think that a C (or C++) "int" is just
> a signed type matching the cpu's registers.
>
>
> The mistake some people make is to think that it's not.

You are wrong. Totally wrong. It is a type with certain
characteristics, some of which are implementation-dependent to allow a
compiler to generate more efficient code. But it is /not/ a model of
the cpu's registers. (You are probably reading this on a system with
64-bit registers and 32-bit ints - even the sizes don't match.)

>
> Secondly, compilers do /not/ cause programs to have undefined behaviour,
>
>
> Yes, they do. C leaves certain operations undefined because it could be
> problematic to implement them in a single way on hardware that doesn't
> support it.

Did you bother to read what I wrote? That is sometimes the reason for
undefined behaviour, but certainly not always the case. In particular,
if the language designers thought it would be useful to programmers to
know the behaviour of such operations, then they make it "implementation
defined behaviour", not "undefined behaviour".


Note that there /are/ a few cases of behaviour that is undefined in the
C (and C++) standards where giving a defined behaviour probably would
make more sense. There have been changes like that in revisions of the
standards. (One example is accessing different fields in a union. C
has realised it is useful to be able to do it, practical to implement
it, and there is little cost in making it defined behaviour - C99
onwards made it defined behaviour. But C++ inherited this from the C90
standard, and has not (yet) changed to having defined behaviour here.)

There are also large numbers of undefined behaviours that cannot
possibly be given a definition - such as trying to access data through
an invalid pointer.


> It's the compilers that choose not to provide a definition
> suitable
> to their implementation.

You are confusing implementation defined behaviour (which compilers have
to define and document) and undefined behaviour (which compilers do not
have to define).

Compilers can, of course, choose to give definitions for undefined
behaviour. I think most compilers do that to some extent, and may give
options to allow more definitions. That gives the programmer more
details of defined behaviour - but also limits optimisations in the
compiler.

> On nearly every common processor, there is no
> reason that signed wraparound should not be the result of signed integer
> arithmetic.
>

Yes, there are excellent reasons for it. I am glad that signed integer
overflow is undefined behaviour - it means that the code I write (which
does not overflow) results in smaller and faster object code. I have
programmed in C for over 20 years, mostly low-level code on small
embedded systems - as close to the metal as you can get. And I can't
think of /any/ circumstances in which it would have been useful for
signed integer overflow to wrap around. I can think of cases where
saturation would have been useful, and cases where trapping and calling
an error handler would have been useful. But wrapping? No, it is
pointless.

The /sole/ reasons for having two's complement arithmetic in /hardware/
are that it is simpler and faster to implement than any other method,
and that it makes life easier when you are implementing larger
arithmetic types from smaller ones. Neither of these is relevant to C
programming.

>
> All that has changed is that the undefined behaviour coincidentally
> matched
>
> the programmer's expectations in some cases, and not in other cases.
>
>
> The truly pernicious aspect of this is that the compilers have increasingly
> and silently stopped matching expectations on pre-existing programs.

And how, exactly, are compilers supposed to know what particular
behaviour a programmer thinks is "right" for the undefined behaviour?
The whole point is that there is /no/ defined behaviour - the code does
not make sense. When you program in C, you use the C language - as
defined by the C standard, augmented by the compiler documentation and
any other relevant standards for the target platform (like the ABI).
You use that language to tell the compiler what you want. If you are
using code that has no meaning in the language - no defined behaviour -
then the compiler does not know what you want.

What you are describing is like if you had for many years gone to your
local shop and asked for "3 thingymagigs and 2 whatdyacallthems", then
popped into the pub for "my usual". Then then you have gone to a
different shop and pub, and can't understand when these places have no
idea what you are talking about.

> They
> are doing that for optimizationism - making previously defined (by the
> compiler)
> operations deliberately undefined so that they can show off clever tricks.

No, they are making code that previously had /no/ defined behaviour do
something different than it used to. They make code that was broken by
design but happened to be useful by coincidence, into code that is still
broken by design and is no longer useful at all. This is the price for
making /good/ code run more efficiently.

Frankly, I have little sympathy for people who write bad code and expect
their tools to read their minds. And as someone who writes correct
code, I don't want to pay a price to support such people with their bad
code.

If you want to continue to write meaningless programs, that is your
choice. You have a number of options available to you.

1. You can buy a book or go on a course, and learn to program C and C++
properly.

2. You can use your tools in restricted modes - turn off optimisation,
turn on "wrapping integer support", and hope that you haven't made /too/
many mistakes in your understanding of the language.

3. You can learn to use the diagnostic and debugging tools provided by
compilers, such as the "sanitize" options, to help track down and fix
your mistakes.

4. You can stick your head in the sand and blame other people when your
code makes mistakes.

5. You can moan about how things were better in the good old days while
other people write correct code.


You are not alone in your misconceptions about C and C++. But you are
not right. And if you don't deal with /your/ problem here, it is /you/
who will suffer.

>
> > Others believe that, since the language says you cannot do something,
> > compilers are free to assume you did not do it.
>
> Since that is the believe shared by the people that defined the
> language, and the people that write the compilers, it is clearly the
> important one!
>
>
> As I said, epistemic closure. There are very important users who are
> furious
> over this sort of thing, but they are not listened to:
> <https://lwn.net/Articles/511259/>
> <http://yarchive.net/comp/linux/timer_wrapping_c.html>
>

Linus Torvalds can get furious at the drop of a hat - he seems rather
calm here. As I say, you are not alone in your misunderstandings - but
appeal to authority does not make you right. It merely means that the
fact that people make this kind of mistake is important - it means
compiler writers need to try hard to improve warnings, include
alternative options, and document changes. It does /not/ mean they need
to stop improving their optimisers.



zxu...@gmail.com

unread,
Nov 29, 2017, 7:04:07 AM11/29/17
to ISO C++ Standard - Future Proposals
On Wednesday, 29 November 2017 05:41:38 UTC, Thiago Macieira wrote:
https://gcc.gnu.org/onlinedocs/gcc-3.4.2/gcc/Min-and-Max.html

Deprecated, though.

What about the @ character? Any issues with that?

On Wednesday, 29 November 2017 05:41:38 UTC, Thiago Macieira wrote: 

> I read further down about how pow() should have its own operator, the
> immediate thought I had there was maybe ** & **= would work, old compilers
> which aren't supposed to support the new standard would just fail to
> compile it anyway. As for the rotl/rotr thing I only don't like them
> because of A: how 1 needs to cast to be sure they get the part of the
> integer they want when they use it, neither can they be sure the bits
> rotate in the same way they would rotate on smaller/larger integer (e.g.
> 0x1 becoming 0x80000000 instead of 0x80) & B: One cannot apply the rotation
> directly to the integer they are working with (like how "a += 1" is faster
> than "a = a + 1")

Faster to type, but not to execute.

But I did not understand your reason for not liking. What cast is necessary?
Can you give an example where using rotl without a cast would result in a
surprise or incorrect result?

rotl is not necessarily going to be a macro mapping to an operator; it can be a function taking int/long/long long, which means it falls to us to ensure not only that the result we receive is safe to place in the target variable, but also that the operation was done the way we would expect for the integer size we are working with.
 

David Brown

unread,
Nov 29, 2017, 7:48:47 AM11/29/17
to std-pr...@isocpp.org
On 29/11/17 13:04, zxu...@gmail.com wrote:
> On Wednesday, 29 November 2017 05:41:38 UTC, Thiago Macieira wrote:
>
> https://gcc.gnu.org/onlinedocs/gcc-3.4.2/gcc/Min-and-Max.html
> <https://gcc.gnu.org/onlinedocs/gcc-3.4.2/gcc/Min-and-Max.html>
>
> Deprecated, though.
>
>
> What about the @ character? Any issues with that?

The @ character is not part of the basic character set for C or C++, and
is not in use in the standard language. The only use I have seen for it
is in extensions in compilers for embedded systems as a way of giving
fixed known addresses to variables (typically memory-mapped
peripherals). I haven't noticed it in any current language proposals
either.

The most common suggestions I have seen for rotation operators are <<<
and >>>, which have the advantage of not needing new characters or
interacting badly with existing usage. (I still don't think such
operators make sense - but these are perhaps the least bad choices.)


Thiago Macieira

unread,
Nov 29, 2017, 10:43:50 AM11/29/17
to std-pr...@isocpp.org
On Wednesday, 29 November 2017 04:04:07 PST zxu...@gmail.com wrote:
> > But I did not understand your reason for not liking. What cast is
> > necessary?
> > Can you give an example where using rotl without a cast would result in a
> > surprise or incorrect result?
>
> rotl is not necessarily going to be a macro to operator usage, it can be a
> function using int/long/long long which means it falls to us to ensure the
> result not only that we receive is safe to place in the target variable but
> also the operation was done the way we would expect it to for the integer
> size we are working with.

I did expect these to be functions, as the current experience is with
intrinsic functions.

Please give an example of what you're describing that would result in a
surprising or incorrect result.

Hyman Rosen

unread,
Nov 29, 2017, 12:17:59 PM11/29/17
to std-pr...@isocpp.org
On Tue, Nov 28, 2017 at 8:33 PM, Thiago Macieira <thi...@macieira.org> wrote:
> What should, according to you, the following produce:
>
>         1U << functionThatReturns33();
>
> Please answer bearing in mind that the SHL operation in assembly is only
> specified to work with values less than 32. Any higher value may shift out
> everything, shift nothing, or shift by the count modulo 32.

Please answer in two cases:

 1) regular context, the compiler could not inline functionThatReturns33()

 2) constant expression context, the compiler can inline and did constant
propagation

First of all, there is no reason for the behavior to be undefined rather than
unspecified, unless there are platforms where doing such shifts causes traps
in a way that would be unnatural to prevent.

The value should be whatever the naive assembly language shift instruction
generates with such a count.  On Intel platforms beyond the 8086, for example,
it is specified that the processor masks the shift count with 5 or 6 bits depending
on whether the shift is of a 32-bit or 64-bit value, so the shift is done modulo the
bit size - shifting a 32-bit value by 32 leaves the value unchanged.
<https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf>
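
Under that masking rule, the "naive instruction" semantics would give (a sketch):

unsigned v = 1u << (33u & 31u);  // the hardware masks the count, so a 32-bit
                                 // shift by 33 acts as a shift by 1: v == 2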

If the compiler can detect that the shift size is such a too-large value, it should
issue a warning, because it is unlikely that the programmer intended the behavior,
but the code that it generates should be as-if the naive shift were executed.

In all cases, the result of const expressions, constexpr expressions, and
preprocessor expressions should match the results that would obtain if the
expressions were to be evaluated at runtime.

Andrey Semashev

unread,
Nov 29, 2017, 12:36:21 PM11/29/17
to std-pr...@isocpp.org
This means that the behavior of the code is undefined because its result
is different on every target platform, including different numeric
results and traps.

> In all cases, the result of const expressions, constexpr expressions, and
> preprocessor expressions should match the results that would obtain if the
> expressions were to be evaluated at runtime.

This is not possible, because this means that the program can now not
only run differently, but also be *compiled* differently depending on
the target architecture:

template< unsigned int N >
struct foo;

template< >
struct foo< 0u >
{
    static void bar() { puts("Hello!"); }
};

template< >
struct foo< 1u >
{
    static void bar() { puts("Bye!"); }
};

foo< 1u << functionThatReturns33() >::bar();

And what if the CPU issues a trap in this situation? Do you expect the
compiler to crash or something?

There is no UB allowed in constant expressions, precisely for this
reason. So if you want some particular behavior, you have to define it,
and it should be a fixed single definition, not just "do whatever the
CPU does". This, in turn, means blessing one implementation and
penalizing all others to the point that a particular shift instruction
is no longer possible to use.

Thiago Macieira

unread,
Nov 29, 2017, 12:39:10 PM11/29/17
to std-pr...@isocpp.org
On Wednesday, 29 November 2017 09:17:35 PST Hyman Rosen wrote:
> On Tue, Nov 28, 2017 at 8:33 PM, Thiago Macieira <thi...@macieira.org>
> wrote:
> > What should, according to you, the following produce:
> >
> >         1U << functionThatReturns33();
> >
> > Please answer bearing in mind that the SHL operation in assembly is only
> > specified to work with values less than 32. Any higher value may shift out
> > every bit, shift nothing, or shift by the count modulo 32.
> >
> > Please answer in two cases:
> >
> > 1) regular context, the compiler could not inline functionThatReturns33()
> >
> > 2) constant expression context, the compiler can inline and did constant
> > propagation
>
> First of all, there is no reason for the behavior to be undefined rather
> than
> unspecified, unless there are platforms where doing such shifts causes traps
> in a way that would be unnatural to prevent.

Suppose there are. What now, should it be undefined?

> The value should be whatever the naive assembly language shift instruction
> generates with such a count. On Intel platforms beyond the 8086, for
> example,
> it is specified that the processor masks the shift count with 5 or 6 bits
> depending
> on whether the shift is of a 32-bit or 64-bit value, so the shift is done
> modulo the
> bit size - shifting a 32-bit value by 32 leaves the value unchanged.

In other words, the same assembly code can have different behaviour depending
on the processor used at runtime. How can the compiler implement, in a
constexpr context, behaviour that isn't defined?

> If the compiler can detect that the shift size is such a too-large value,
> it should
> issue a warning, because it is unlikely that the programmer intended the
> behavior,
> but the code that it generates should be as-if the naive shift were
> executed.
>
> In all cases, the result of const expressions, constexpr expressions, and
> preprocessor expressions should match the results that would obtain if the
> expressions were to be evaluated at runtime.

Which runtime? Remember, the same assembly can produce different results
depending on the processor.

Please give me an answer: if I compile for 16-bit real-mode x86, what should
1U << 33
be?

This is a *relevant* case, since early non-UEFI boot code still runs in
16-bit real mode, even on Intel CPUs released in 2017 (the Quark line of
MCUs).

Hyman Rosen

unread,
Nov 29, 2017, 12:43:44 PM11/29/17
to std-pr...@isocpp.org

I disagree, with some nuance.

A.  If the platform always causes a trap when indirecting through a null pointer,
the check can be eliminated.

B.  The compiler should differentiate between things that it believes are true
because they can only be false through executing undefined behavior, and
things that it believes are true because of constant propagation or other logical
deductions.  In the former case, code must not be eliminated, while in the latter
case it can be.  If I write *p = 1; if (!p) foo(); then the compiler should
not eliminate the test.  But if I write if (p) if (!p) foo(); then the compiler
can eliminate the second test and the code it controls.

C.  If the compiler cannot distinguish between the two situations, it should leave
user code alone, and generate whatever tests it would generate absent its
knowledge.
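A sketch of the distinction, restating points A and B above with hypothetical
functions:

void foo();

void case_a(int* p)
{
    *p = 1;         // undefined if p is null; on a platform where this store
                    // always traps, point A says the test below is dead code
    if (!p) foo();  // point B: otherwise, the compiler should keep this test
}

void case_b(int* p)
{
    if (p)
        if (!p) foo();  // dead by ordinary constant propagation, not by
                        // appeal to UB - this one may always be eliminated
}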

Notice your own contradictory claims - you first say that the compiler should
assume that programmers know what they are doing, then you say that compilers
should eliminate code that those programmers have written.

Andrey Semashev

unread,
Nov 29, 2017, 12:53:35 PM11/29/17
to std-pr...@isocpp.org
On 11/29/17 14:57, David Brown wrote:
> On 28/11/17 18:32, Hyman Rosen wrote:
>> On Tue, Nov 28, 2017 at 7:26 AM, David Brown <da...@westcontrol.com
>> <mailto:da...@westcontrol.com>> wrote:
>>
>> There is an additional reason for making signed overflow undefined
>> behaviour - there is no single sensible behaviour that could be picked.
>>
>>
>> There is no need to pick a single sensible behavior.
>
> There are occasions when it is better to pick something rather than
> nothing. But there are many advantages in having /no/ defined behaviour
> for things like signed overflow in C. Picking a fixed behaviour - such
> as two's complement wrap-around for signed overflow - comes at a cost.
> Some of these costs include:
>
> 1. int arithmetic is no longer associative and distributive.

Not sure how this follows from allowing two's complement wrap-around.
Could you give an example?

> 2. Multiply and divide (when there is no remainder) are no longer inverses.

In all cases, where they are currently reversible (i.e. where no
overflow happens), they will continue to be so. OTOH, when overflow does
happen you can now not rely on any particular result or operations
property. A fixed overflow semantic would at least give you the result
and guarantee that the compiler will not cripple your code somehow.

> 3. Basic identities like "if you add a positive number to an integer, it
> gets bigger" no longer hold.

I don't see that as that much of a problem.

This argument is often quoted in relation to pointer arithmetic, so that
the compiler can assume that pointers (possibly, with an offset) never
wrap. I agree that wrapping *pointers* should continue to be undefined,
but that doesn't mean the same should be true for integers.

> 4. Division by a known constant can't always be done by multiplication.

Is that true? Compilers do convert division to multiplication on real
hardware, where signed integers do wrap. I don't see why that will
change if signed wrapping is allowed in the language.
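For what it's worth, the multiplication trick for *exact* division relies
precisely on wrapping arithmetic being defined. A small sketch (hypothetical
helper, unsigned only, assuming 32-bit arithmetic):

#include <cstdint>

// Exact division by 3 done as a multiplication: 0xAAAAAAABu is the inverse
// of 3 modulo 2^32 (3 * 0xAAAAAAABu == 2^33 + 1), so for any x that is an
// exact multiple of 3 the wrapping product recovers x / 3.
constexpr std::uint32_t div3_exact(std::uint32_t x)
{
    return x * 0xAAAAAAABu;  // well-defined: unsigned arithmetic wraps
}

static_assert(div3_exact(300u) == 100u, "");
static_assert(div3_exact(0xFFFFFFFCu) == 0x55555554u, "");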

Hyman Rosen

unread,
Nov 29, 2017, 1:16:30 PM11/29/17
to std-pr...@isocpp.org
On Wed, Nov 29, 2017 at 12:38 PM, Thiago Macieira <thi...@macieira.org> wrote:
> Suppose there are. What now, should it be undefined?

Yes, but with the same proviso I gave before.  The standard should say, in
general, that when certain arithmetic expressions are undefined because of
the possibility of traps, an implementation on a platform without such traps
should specify the behavior to be unsurprising to people with knowledge of
that platform.

> In other words, the same assembly code can have different behaviour depending
> on the processor used at runtime. How can the compiler implement, in a
> constexpr context, behaviour that isn't defined?

Because the compiler is told what platform it's compiling for.

> Which runtime? Remember, the same assembly can produce different results
> depending on the processor.

The compiler has a platform for which it's compiling, and should behave
according to that platform.  If it believes the code it generates will also
run on a different platform, it should generate code that produces the same
result on that other platform as on its notional one.

> Please give me an answer: if I compile for 16-bit real-mode x86, what should
>         1U << 33
> be?
>
> This is a *relevant* case, since early non-UEFI boot code still runs in
> 16-bit real mode, even on Intel CPUs released in 2017 (the Quark line of
> MCUs).

According to the Intel manual,
IA-32 Architecture Compatibility
The 8086 does not mask the shift count. However, all other IA-32 processors (starting with the Intel 286 processor)
do mask the shift count to 5 bits, resulting in a maximum count of 31. This masking is done in all operating modes
(including the virtual-8086 mode) to reduce the maximum execution time of the instructions.
so (1U << 33) is 2 unless the compiler is compiling for a true 8086, in which
case it's 0.

Hyman Rosen

unread,
Nov 29, 2017, 1:43:40 PM11/29/17
to std-pr...@isocpp.org
On Wed, Nov 29, 2017 at 12:36 PM, Andrey Semashev <andrey....@gmail.com> wrote:
> This means that the behavior of the code is effectively undefined, because
> its result differs from one target platform to another - including different
> numeric results and traps.

The standard should suggest that in the absence of traps, the behavior should be
implementation-defined in an unsurprising way.

> This is not possible, because this means that the program can now not only
> run differently, but also be *compiled* differently depending on the target
> architecture.

Of course it's possible.  There is nothing wrong with having compilation
results differ based on target architecture.  If I have, as per your example,

template <unsigned N> bool foo   () { return true;  }
template <>           bool foo<0>() { return false; }

then foo<0x8000u + 0x8000u>() will return true on platforms where unsigned is
32 bits but false on platforms where it is 16 bits.

> And what if the CPU issues a trap in this situation? Do you expect the
> compiler to crash or something?

No, then I expect the compiler to issue an error and not compile the program.

> There is no UB allowed in constant expressions, precisely for this reason.
> So if you want some particular behavior, you have to define it, and it should
> be a fixed single definition, not just "do whatever the CPU does". This, in
> turn, means blessing one implementation and penalizing all others to the
> point that a particular shift instruction is no longer possible to use.

No, I completely disagree.  Constant expressions should produce exactly the
same result as runtime expressions.  If the expression would produce a trap,
the constant expression should not compile, otherwise you get whatever your
target platform would produce.

Thiago Macieira

unread,
Nov 29, 2017, 1:50:48 PM11/29/17
to std-pr...@isocpp.org
On Wednesday, 29 November 2017 09:43:20 PST Hyman Rosen wrote:
> A. If the platform always causes a trap when indirecting through a null
> pointer,
> the check can be eliminated.

That's the case. Therefore, the check can be eliminated.

The special case of the Linux kernel was that on some MCU, under some special
circumstance, the zero page *was* mapped and therefore didn't trap. So the GCC
developers added a special flag to allow that particular case to work.

In other words, this is *exactly* what you're asking for.

> Notice your own contradictory claims - you first say that the compiler
> should
> assume that programmers know what they are doing, then you say that
> compilers
> should eliminate code that those programmers have written.

The compiler should assume programmers know what they are doing, which means
programmers don't write UB code. Since they don't, the compiler is allowed to
discard any UB from the list of possibilities.

The example from Qt was more or less like this:

if (d->size + newitems > d->alloc)
    realloc(d->size + newitems);

However, due to constant propagation through the inlining, it knows that
d->alloc == d->size
and it knows newitems is a constant value (1).

So it reduced the expression to:

if (d->size + 1 > d->size)

Since signed integer overflow is UB, the compiler is allowed to assume I've
written proper code to ensure that d->size + 1 will not overflow (I have), in
which case the comparison is a success.

I agree that if the compiler cannot tell things apart, it should leave the
comparison alone. But it *could* tell things apart. So it should optimise the
comparison away rather than leave a useless compare-and-branch whose outcome
never varies.
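A reduced, self-contained version of that pattern (hypothetical function
names, a sketch rather than the actual Qt code):

// Signed: when alloc == size, the guard becomes size + 1 > size, which can
// only be false via signed overflow - UB - so the compiler may fold it to
// `true` and drop the compare-and-branch entirely.
bool needs_grow(int size, int alloc)
{
    return size + 1 > alloc;
}

// Unsigned: wraparound is defined (size + 1 is 0 at UINT_MAX), so the same
// fold would be invalid and the comparison must actually be performed.
bool needs_grow_u(unsigned size, unsigned alloc)
{
    return size + 1 > alloc;
}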

Thiago Macieira

unread,
Nov 29, 2017, 1:55:43 PM11/29/17
to std-pr...@isocpp.org
On Wednesday, 29 November 2017 10:16:06 PST Hyman Rosen wrote:
> > This is a *relevant* case, since early non-UEFI boot code still runs
> > 16-bit
> > real-mode, with Intel CPUs released in 2017 (the Quark line of MCUs).
>
> According to the Intel manual,
>
> IA-32 Architecture Compatibility
> The 8086 does not mask the shift count. However, all other IA-32
> processors (starting with the Intel 286 processor) do mask the shift count
> to 5 bits, resulting in a maximum count of 31. This masking is done in all
> operating modes (including the virtual-8086 mode) to reduce the maximum
> execution time of the instructions.
>
> so (1U << 33) is 2 unless the compiler is compiling for a true 8086, in
> which case it's 0.

In other words, if I compile for true 8086 and run on a modern Quark, the
value that the compiler created will be different from the value produced at
runtime.

Is that ok by you?

Mind you, GCC can't produce 16-bit real-mode code (the -m16 option produces
32-bit real-mode code), so the compiler I'm using for this kind of early boot
code is very likely an old one that may still assume 8086 semantics because
no one bothered to update that assumption.

Matthew Woehlke

unread,
Nov 29, 2017, 2:00:35 PM11/29/17
to std-pr...@isocpp.org
On 2017-11-29 12:43, Hyman Rosen wrote:
> If I write *p = 1; if (!p) foo(); then the compiler should
> not eliminate the test.

Why not? In no case will `foo()` execute. If `p`, the condition is
false. If `!p`, the condition is irrelevant because the preceding line
will cause a SEGV.

> Notice your own contradictory claims - you first say that the compiler
> should
> assume that programmers know what they are doing, then you say that
> compilers
> should eliminate code that those programmers have written.

Did you read Thiago's example? It's rare that real optimization
encounters cases this simple. Often there is intervening logic or const
propagation such that the code the user wrote is reasonable *in some
cases*, but not in the particular case where optimization happened.

--
Matthew

Thiago Macieira

unread,
Nov 29, 2017, 2:01:42 PM11/29/17
to std-pr...@isocpp.org
On Wednesday, 29 November 2017 09:53:31 PST Andrey Semashev wrote:
> > 3. Basic identities like "if you add a positive number to an integer, it
> > gets bigger" no longer hold.
>
> I don't see that as that much of a problem.
>
> This argument is often quoted in relation to pointer arithmetic, so that
> the compiler can assume that pointers (possibly, with an offset) never
> wrap. I agree that wrapping *pointers* should continue to be undefined,
> but that doesn't mean the same should be true for integers.

See the example I've just posted in response to Hyman Rosen. Pasted here:

The example from Qt was more or less like this:

if (d->size + newitems > d->alloc)
    realloc(d->size + newitems);

However, due to constant propagation through the inlining, it knows that
d->alloc == d->size
and it knows newitems is a constant value (1).

So it reduced the expression to:

if (d->size + 1 > d->size)



Matthew Woehlke

unread,
Nov 29, 2017, 2:08:42 PM11/29/17
to std-pr...@isocpp.org
On 2017-11-29 12:53, Andrey Semashev wrote:
> On 11/29/17 14:57, David Brown wrote:
>> Picking a fixed behaviour - such as two's complement wrap-around
>> for signed overflow - comes at a cost. Some of these costs
>> include:
>>
>> 1. int arithmetic is no longer associative and distributive.
>
> Not sure how this follows from allowing two's complement wrap-around.
> Could you give an example?

Consider two expressions:

a + b < c
a < c - b

Given infinite precision, the values of these expressions must be
identical. However, let:

a = INT_MAX
b = 1
c = INT_MAX / 2

Now, `a + b` is less than 0 (due to wrapping), and therefore less than
`c`. However, `a` is greater than `c - b`.
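The same disagreement, written out so a compiler can check it (the addition
is done in unsigned so the sketch itself stays clear of signed-overflow UB;
it assumes two's-complement conversion back to int):

#include <climits>

constexpr int a = INT_MAX, b = 1, c = INT_MAX / 2;

// a + b wraps to INT_MIN, so the first comparison holds...
static_assert(static_cast<int>(static_cast<unsigned>(a) + b) < c, "");
// ...while the algebraically identical second one does not.
static_assert(!(a < c - b), "");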

>> 2. Multiply and divide (when there is no remainder) are no longer
>> inverses.
>
> In all cases, where they are currently reversible (i.e. where no
> overflow happens), they will continue to be so. OTOH, when overflow does
> happen you can now not rely on any particular result or operations
> property. A fixed overflow semantic would at least give you the result
> and guarantee that the compiler will not cripple your code somehow.

No, what you've done is prevent optimization. Because right now, if I
write something that ultimately turns into `b = (a * 2) / 2`, the
compiler can simplify that to `b = a`, because assuming that UB doesn't
happen allows the compiler to apply such transforms. Once you remove
that restriction, you basically cripple the compiler's ability to optimize.
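Concretely, with hypothetical function names:

// Signed: the compiler may rewrite this as `return a;`, since a * 2 can only
// disagree with that answer by overflowing, which it is allowed to assume away.
int back_and_forth(int a) { return (a * 2) / 2; }

// Unsigned: wrapping is defined, so the simplification would be wrong -
// e.g. a == 0x80000000u gives (a * 2) / 2 == 0, not a.
unsigned back_and_forth_u(unsigned a) { return (a * 2) / 2; }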

I can already hear people screaming that they don't *want* the compiler
to optimize. Well... go write in assembly, if that's what you want. Lots
of real-world code deals with small numbers that won't overflow and that
*should* be optimized in this manner. (I'd venture so far as to claim
that *most* real-world code likewise assumes that numbers don't
overflow, i.e. is written with the assumption that its inputs will be
"small" numbers, and without trying to guard against things going wrong
in a situation where overflow would occur.)

>> 3. Basic identities like "if you add a positive number to an integer, it
>> gets bigger" no longer hold.
>
> I don't see that as that much of a problem.

for (int i = 1; i <= j; ++i)

Can I assume that this loop will terminate? As a programmer, I am almost
certainly making that assumption. Why shouldn't the compiler be able to
do likewise?

Why does it matter? Well, if I can assume that overflow doesn't occur, I
can rewrite the loop as:

for (int i = 0; i < j;)
{
    ++i;
    ...
}

...which *will* terminate. If I can't make that assumption, the compiler
has to emit a potentially infinite loop.

What about this?

constexpr int k = 0; // far away

i = 1;
do
{
    ...
    if (i < k) // never true?
        ...
    i += 1; // or any positive constant value
} while (...);

In both these examples, the compiler can currently assume `i > 0`.

--
Matthew

Thiago Macieira

unread,
Nov 29, 2017, 2:41:10 PM11/29/17
to std-pr...@isocpp.org
On Wednesday, 29 November 2017 11:08:38 PST Matthew Woehlke wrote:
> I can already hear people screaming that they don't *want* the compiler
> to optimize. Well... go write in assembly, if that's what you want.

Just use unsigned if you really meant to have the overflow semantics.

Converting from unsigned to signed is completely defined if the value is
representable in the signed type, and implementation-defined (IB) if it's
outside that range. Since it's IB, you can rely on the result being the same
across compiler versions and optimisation flags.
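A sketch of that approach (hypothetical helper; the result noted below
assumes a two's-complement target):

#include <climits>

// Do the arithmetic in unsigned, where wraparound is fully defined, then
// convert back; the out-of-range conversion is implementation-defined, not
// undefined, so it is stable for a given compiler and set of flags.
int wrapping_increment(int size)
{
    return static_cast<int>(static_cast<unsigned>(size) + 1u);
}

// wrapping_increment(INT_MAX) == INT_MIN on two's-complement targets.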