Pragmas to make certain operations well-defined

demio...@gmail.com

Jul 25, 2016, 4:19:41 AM
to ISO C++ Standard - Future Proposals
This is a suggestion to provide pragmas that would make certain operations well-defined.  The names of the pragmas are not intended to be the actual names, though I did try to make them semi-reasonable.

GCC implements command-line flags that correspond to all of these except (possibly) the second.  Clang implements the first and third.  The first, third, and fourth are used (as command-line options) by at least the Linux kernel, albeit in C and not C++ (though the corresponding flags are also accepted by C++ compilers).  Microsoft has probably screwed up strict aliasing in CoreCLR: https://github.com/dotnet/coreclr/issues/6454.

These pragmas probably make just as much sense in C as in C++, and should probably (if they are a reasonable idea) be proposed there as well, under different names.

This suggestion could be totally off-base.  Nevertheless, I feel like a lot of code in the wild depends on at least some of these behaviors (enforced via non-standard compile-time options).

Specifically:
  • Make signed overflow wrap:
    #pragma STDC++ overflow wrap
    This requires the compiler to ensure that signed integer overflow behaves as unsigned integer overflow – as 2's complement wrapping arithmetic.  I suspect that there will be little performance penalty – every non-obsolete computer I know of already provides 2's complement in hardware.
  • Make signed overflow well-defined:
    #pragma STDC++ no-strict-overflow
    This requires that signed overflow either perform according to 2's complement, or trap with an unspecified exception (that must be fatal if uncaught, but may or may not be a C++ exception).
  • Turn off strict aliasing
    #pragma STDC++ no-strict-aliasing
    This turns off the strict-aliasing rule.  This is very useful for programs such as garbage collectors and memory allocators – indeed, I am not sure if a memory allocator can be written in strictly conforming C++ as it stands today.  It also allows for a C++ implementation of memcpy that is faster than going 1 byte at a time.
  • Make null pointer dereference implementation defined (at worst)
    #pragma STDC++ no-delete-null-pointer-checks
    This makes dereferencing nullptr have implementation-defined behavior that is guaranteed to correspond to the behavior of reading from address zero (on platforms where that makes sense).  This is useful for kernels and other programs that interface directly with hardware, since address 0 may map to meaningful data.
  • Make accessing of out-of-bounds memory allowed, provided that valid memory is guaranteed to exist there.
    #pragma STDC++ no-strict-bounds
    This allows for accessing a multi-dimensional array as a one-dimensional array.
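
For illustration, the wrapping behavior that the first pragma asks for can already be written by hand through unsigned arithmetic. This is only a minimal sketch: wrapping_add is a hypothetical helper name, and the final conversion back to a signed type is assumed to wrap, which is what mainstream two's-complement compilers do.

```cpp
#include <cstdint>

// Hypothetical helper: adds two 32-bit signed integers with two's-complement
// wraparound, without ever executing a signed overflow.
inline std::int32_t wrapping_add(std::int32_t a, std::int32_t b) {
    // Unsigned arithmetic is defined to wrap modulo 2^32.
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    // Assumption: converting an out-of-range value back to a signed type wraps.
    // That is implementation-defined before C++20 and guaranteed from C++20 on.
    return static_cast<std::int32_t>(sum);
}
```

With the proposed pragma in effect, a plain a + b would be expected to give the same result as wrapping_add(a, b).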

Myriachan

Jul 26, 2016, 7:52:55 PM
to ISO C++ Standard - Future Proposals, demio...@gmail.com
On Monday, July 25, 2016 at 1:19:41 AM UTC-7, demio...@gmail.com wrote:
> GCC implements command-line flags that correspond to all of these except (possibly) the second.  Clang implements the first and third.  The first, third, and fourth are used (as command-line options) by at least the Linux kernel, albeit in C and not C++ (though the corresponding flags are also accepted by C++ compilers).  Microsoft has probably screwed up strict aliasing in CoreCLR: https://github.com/dotnet/coreclr/issues/6454.


GCC and clang both have -fwrapv to enable two's-complement wrapping semantics for signed integer types.
 
> These pragmas (with different names) probably make just as much sense in C as in C++, and should probably (if they are a reasonable idea) be proposed there as well (probably renamed).


I pretty much would turn them on all the time.  I strongly dislike many of the object lifetime rules, particularly in C++.
 
> This suggestion could be totally off-base.  Nevertheless, I feel like a lot of code in the wild depends on at least some of these behaviors (enforced via non-standard compile-time options).
>
> Specifically:
>   • Make signed overflow wrap:
>     #pragma STDC++ overflow wrap
>     This requires the compiler to ensure that signed integer overflow behaves as unsigned integer overflow – as 2's complement wrapping arithmetic.  I suspect that there will be little performance penalty – every non-obsolete computer I know of already provides 2's complement in hardware.
>   • Make signed overflow well-defined:
>     #pragma STDC++ no-strict-overflow
>     This requires that signed overflow either perform according to 2's complement, or trap with an unspecified exception (that must be fatal if uncaught, but may or may not be a C++ exception).
In many similar ideas I have, I've called the acceptable behavior set "snap, trap or wrap": that is, saturate to the largest (or smallest) value, trap in an implementation-defined way, or wrap two's complement.  This matches the common hardware behaviors out there, of which wrap is obviously the most common.
 
>   • Turn off strict aliasing
>     #pragma STDC++ no-strict-aliasing
>     This turns off the strict-aliasing rule.  This is very useful for programs such as garbage collectors and memory allocators – indeed, I am not sure if a memory allocator can be written in strictly conforming C++ as it stands today.  It also allows for a C++ implementation of memcpy that is faster than going 1 byte at a time.
I would rather have "restrict" from C99 and a new "may_alias" attribute/keyword to manipulate aliasing.  Permitting type-based alias analysis is an important optimization, but we need a way to suppress or extend this behavior sometimes to accomplish low-level tasks or more strongly optimize.
 
>   • Make null pointer dereference implementation defined (at worst)
>     #pragma STDC++ no-delete-null-pointer-checks
>     This makes dereferencing nullptr have implementation-defined behavior that is guaranteed to correspond to the behavior of reading from address zero (on platforms where that makes sense).  This is useful for kernels and other programs that interface directly with hardware, since address 0 may map to meaningful data.
In Windows NT, dereferences of null pointers can't safely be optimized away, either.  Because of the exception handling mechanism, code might rely upon the dereferencing of null pointers triggering an exception, which can be caught and handled by several means.  This design is critical to the security system in the kernel to protect against invalid input from user mode.

>   • Make accessing of out-of-bounds memory allowed, provided that valid memory is guaranteed to exist there.
>     #pragma STDC++ no-strict-bounds
>     This allows for accessing a multi-dimensional array as a one-dimensional array.
This also would permit the classic trick of having a 1-element array at the end of a struct and referencing past the end when memory follows.  This mechanism is used in the APIs of Windows and UNIX, so we've been relying upon this particular undefined behavior since forever.
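
To make the "snap, trap or wrap" set mentioned earlier in this reply concrete, here is a rough sketch of the three behaviors for 32-bit addition. The names OverflowPolicy and add_with_policy are hypothetical, and a C++ exception merely stands in for whatever implementation-defined trap a real platform would raise.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical policy names, taken from the "snap, trap or wrap" phrasing.
enum class OverflowPolicy { Snap, Trap, Wrap };

inline std::int32_t add_with_policy(std::int32_t a, std::int32_t b,
                                    OverflowPolicy policy) {
    constexpr std::int32_t kMax = std::numeric_limits<std::int32_t>::max();
    constexpr std::int32_t kMin = std::numeric_limits<std::int32_t>::min();

    // Detect overflow before it happens, so no undefined operation is executed.
    const bool too_big = b > 0 && a > kMax - b;
    const bool too_small = b < 0 && a < kMin - b;
    if (!too_big && !too_small) {
        return a + b;
    }

    switch (policy) {
    case OverflowPolicy::Snap:
        return too_big ? kMax : kMin;  // saturate to the nearest limit
    case OverflowPolicy::Trap:
        // Stand-in for an implementation-defined trap.
        throw std::overflow_error("signed overflow");
    case OverflowPolicy::Wrap:
        break;
    }
    // Wrap: unsigned arithmetic wraps modulo 2^32; the conversion back to a
    // signed type is assumed to wrap as well (guaranteed from C++20 on).
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                     static_cast<std::uint32_t>(b));
}
```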

Melissa

Nicol Bolas

Jul 26, 2016, 8:12:05 PM
to ISO C++ Standard - Future Proposals, demio...@gmail.com
This list contains two kinds of things:

1: Definitions for behavior that a large number of implementations already pretty much support.
2: Changing the behavior of existing constructs in ways that implementations don't by default support.

I've always liked the idea of a "regular" profile for C++, one that defines a lot of stuff that the vast majority of systems allow, but we leave un/implementation-defined to allow a small percentage of systems to work. Since it's constexpr, you can `static_assert` if your code relies on it or `if constexpr` around places where you need it to be true.

These would be things like integers being 2's complement (and thus well-defined overflow behavior), 8-bit bytes (and all of the `int*_t` types), perhaps even an endian setting with well-defined behavior for accessing integers via byte arrays, and so forth.

These are things that should be present when the compiler/system can support them. So it wouldn't be something you turn on; it's something you query.

I don't much like items from #2. Things like turning off strict aliasing, null pointer stuff, etc. That stuff ought to remain the realm of compiler switches, rather than standard-supported behavior.

Thiago Macieira

Jul 26, 2016, 9:06:34 PM
to std-pr...@isocpp.org
On Tuesday, 26 July 2016 16:52:55 PDT Myriachan wrote:
> > - Make accessing of out-of-bounds memory allowed, provided that valid
> > memory is guaranteed to exist there.
> > #pragma STDC++ no-strict-bounds
> > This allows for accessing a multi-dimensional array as a
> > one-dimensional array.
> >
> This also would permit the classic trick of having a 1-element array at
> the end of a struct and referencing past the end when memory follows. This
> mechanism is used in the APIs of Windows and UNIX, so we've been relying
> upon this particular undefined behavior since forever.

Which in turn shows this shouldn't be a pragma, but instead should be a per-
array setting. There aren't many arrays that can be accessed out-of-bounds,
but the few that are, are often significant.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Nicol Bolas

Jul 26, 2016, 10:43:46 PM
to ISO C++ Standard - Future Proposals, demio...@gmail.com
On Tuesday, July 26, 2016 at 8:12:05 PM UTC-4, Nicol Bolas wrote:
> I've always liked the idea of a "regular" profile for C++, one that defines a lot of stuff that the vast majority of systems allow, but we leave un/implementation-defined to allow a small percentage of systems to work. Since it's constexpr, you can `static_assert` if your code relies on it or `if constexpr` around places where you need it to be true.

OK, that paragraph didn't work out at all. So let me try again.

The idea is that there's a lot of stuff which most of the major compilers/platforms support. But C++ doesn't specify behavior for them, because there are a relatively small number of compilers/platforms which don't support them. So a lot of code gets written that assumes certain things that aren't guaranteed: two's complement signed integers, 8-bit bytes, integers stored in big|little endian formats, etc.

What I'm thinking is permitting a compiler to advertise the fact that it is "regular", via a `constexpr` variable or some such. Regular implementations define a number of commonly assumed operations, and should match the behavior of the major compilers/platforms in these regards. Since the variable is `constexpr`, you can `static_assert` on it if your code absolutely relies on this behavior, or you can `if constexpr` around it if you have an alternative way of doing what you need to do.
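
A rough sketch of how such a flag might be used follows. No such variable exists in the standard library; implementation_is_regular is a hypothetical stand-in built here from two checks (8-bit bytes and two's-complement integers), and the `if constexpr` branch assumes a C++17 compiler.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical stand-in for the proposed flag, approximated with two checks.
constexpr bool implementation_is_regular =
    CHAR_BIT == 8 &&   // 8-bit bytes
    (-1 & 3) == 3;     // holds only for a two's-complement representation

// Option 1: refuse to compile when the assumption does not hold.
inline std::uint8_t low_byte(std::int32_t x) {
    static_assert(implementation_is_regular,
                  "low_byte assumes a 'regular' implementation");
    return static_cast<std::uint8_t>(x & 0xFF);
}

// Option 2: take the shortcut when available, keep a portable fallback.
inline bool is_odd(std::int32_t x) {
    if constexpr (implementation_is_regular) {
        return (x & 1) != 0;  // relies on the two's-complement layout of negative x
    } else {
        return x % 2 != 0;    // value-based, independent of representation
    }
}
```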

FrankHB1989

Jul 27, 2016, 2:05:21 AM
to ISO C++ Standard - Future Proposals, demio...@gmail.com


On Wednesday, July 27, 2016 at 8:12:05 AM UTC+8, Nicol Bolas wrote:
There is more than one problem domain here. First, the language should be able to express all of these settings (with either defined or deliberately undefined behavior for some operations), probably best through library extensions (as types) with underlying built-in native support (like `int*_t`), and might even allow different settings to coexist in the same program. Then an explicit query is not even needed: if the code can't work, it simply does not type-check; otherwise it is portable. The second problem is which particular set should be mandated.

 
> I don't much like items from #2. Things like turning off strict aliasing, null pointer stuff, etc. That stuff ought to remain the realm of compiler switches, rather than standard-supported behavior.
Agreed.
 

Myriachan

Jul 27, 2016, 3:24:53 PM
to ISO C++ Standard - Future Proposals
On Tuesday, July 26, 2016 at 6:06:34 PM UTC-7, Thiago Macieira wrote:
> On Tuesday, 26 July 2016 16:52:55 PDT Myriachan wrote:
> > This also would permit the classic trick of having a 1-element array at
> > the end of a struct and referencing past the end when memory follows.  This
> > mechanism is used in the APIs of Windows and UNIX, so we've been relying
> > upon this particular undefined behavior since forever.
>
> Which in turn shows this shouldn't be a pragma, but instead should be a per-
> array setting. There aren't many arrays that can be accessed out-of-bounds,
> but the few that are, are often significant.


I just noticed that C99 permits this already as "flexible array members", where you declare the array at the end of a struct without a size.  The struct's size is then computed as if the array member were omitted (possibly with extra trailing padding), so sizeof(structname) is typically equal to offsetof(structname, membername).

This would be convenient in C++, even if only permitted for standard-layout types.  It's required to interface with many existing APIs, at least without bending over backward.

That said, the original suggestion of being able to access multidimensional arrays as if they were single-dimensional arrays would be really nice.  Implementing it would require changes to the aliasing rules and to the definition of "pointer to array" types.

Flat access to multidimensional arrays would be really nice for programs that use vector (SIMD) intrinsics by hand.  It doesn't directly matter for now, because those intrinsics are beyond the Standard, and are implemented in compilers so as to be allowed to alias anything.

Melissa

Demi Obenour

Jul 27, 2016, 9:35:02 PM
to std-pr...@isocpp.org

My personal view is that if a memory access points to memory that has not been deallocated, and to a valid bit-pattern for the type of the access, and there are no data races, the access should be well-defined.

Without this, I cannot see how to reasonably implement a garbage collector that works with heterogeneous objects that it cannot just treat as opaque bytes (because it needs to follow pointers and possibly invoke tracing methods).  In that case, the actual type of the object is often unknown, and may not even exist at compile time: the GC uses metadata to process the object based on its layout at runtime.


--
You received this message because you are subscribed to a topic in the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this topic, visit https://groups.google.com/a/isocpp.org/d/topic/std-proposals/h_UoZuTPhZw/unsubscribe.
To unsubscribe from this group and all its topics, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/42422923-778e-4ea6-9b4c-4fe2d0d447e1%40isocpp.org.

Edward Catmur

Jul 28, 2016, 6:48:08 PM
to ISO C++ Standard - Future Proposals, demio...@gmail.com
On Thursday, 28 July 2016 02:35:02 UTC+1, Demi Obenour wrote:

> My personal view is that if a memory access points to memory that has not been deallocated, and to a valid bit-pattern for the type of the access, and there are no data races, the access should be well-defined.
>
> Without this, I cannot see how to reasonably implement a garbage collector that works with heterogeneous objects that it cannot just treat as opaque bytes (because it needs to follow pointers and possibly invoke tracing methods).  In that case, the actual type of the object is often unknown, and may not even exist at compile time: the GC uses metadata to process the object based on its layout at runtime.


At the risk of sounding like a broken record, why not memcpy? 

Demi Obenour

Jul 29, 2016, 3:44:50 AM
to Edward Catmur, ISO C++ Standard - Future Proposals

Because that would require way too much use of memcpy.  The code would be completely unreadable.

Edward Catmur

Jul 29, 2016, 4:28:03 AM
to std-pr...@isocpp.org
On Fri, Jul 29, 2016 at 8:44 AM, Demi Obenour <demio...@gmail.com> wrote:

> Because that would require way too much use of memcpy.  The code would be completely unreadable.

So write a wrapper function. Which is more readable: read_memory_as<void*>(p), or *reinterpret_cast<void**>(p) with a comment saying "I hope the optimizer doesn't notice that this is illegal"?

And in how many source locations does a mark-and-sweep GC read the memory it is scanning, anyway?
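
A wrapper along those lines could look like the sketch below. The name read_memory_as comes from the message above; the body is only an assumed implementation based on memcpy and is not taken from any existing library.

```cpp
#include <cstring>
#include <type_traits>

// Reads a T out of arbitrary memory by copying its object representation.
// memcpy sidesteps the strict-aliasing rule and has no alignment requirement
// on the source pointer.
template <typename T>
T read_memory_as(const void* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}
```

A scanner would then call read_memory_as<void*>(slot) at the few places where it inspects a word that might hold a pointer.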


Demi Obenour

Jul 29, 2016, 12:35:59 PM
to ISO C++ Standard - Future Proposals

The bigger issue is a copying GC.  You really don't want to call move and/or copy constructors.  Much easier if you can restrict yourself to the types that can be blitted with memcpy.  This almost certainly runs afoul of object lifetime rules, but is required if you want good performance.


Ren Industries

Jul 29, 2016, 12:37:55 PM
to std-pr...@isocpp.org
Isn't that why we have the TriviallyCopyable concept?
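
As an illustration of that point, the std::is_trivially_copyable trait can guard a byte-wise copy at compile time. The sketch below is hypothetical (Cell and blit_objects are made-up names) and only shows the copy step of a copying collector, not pointer forwarding.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Made-up object type for the example.
struct Cell {
    int tag;
    Cell* next;
};

// Copies `count` objects into to-space byte by byte, without running any
// copy or move constructors. Rejected at compile time for other types.
template <typename T>
T* blit_objects(T* to_space, const T* from_space, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types may be copied with memcpy");
    std::memcpy(to_space, from_space, count * sizeof(T));
    return to_space;
}
```

Pointers inside the copied objects still refer to from-space afterwards; a real collector would have to rewrite them in a separate pass.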


Thiago Macieira

Jul 29, 2016, 1:43:50 PM
to std-pr...@isocpp.org
On Friday, 29 July 2016 12:37:53 PDT Ren Industries wrote:
> Isn't that why we have the TriviallyCopyable concept?

No, it's the Relocatable concept and the destructive move, which we've
discussed over and over again but as far as I can tell, there's no actual
proposal of.

Nicol Bolas

Jul 29, 2016, 1:58:23 PM
to ISO C++ Standard - Future Proposals
On Friday, July 29, 2016 at 1:43:50 PM UTC-4, Thiago Macieira wrote:
> On Friday, 29 July 2016 12:37:53 PDT Ren Industries wrote:
> > Isn't that why we have the TriviallyCopyable concept?
>
> No, it's the Relocatable concept and the destructive move, which we've
> discussed over and over again but as far as I can tell, there's no actual
> proposal of.

You mean besides P0023? And N4158 and N4034?

Thiago Macieira

Jul 29, 2016, 2:02:42 PM
to std-pr...@isocpp.org
I stand corrected. Reading now, thanks for the info.

Demi Obenour

Jul 29, 2016, 8:59:47 PM
to ISO C++ Standard - Future Proposals

I think the bigger question is "Why should one need to go through all those hoops to do something as simple as reading/writing memory?".

I think that there should be an easier way.  Nobody, to my knowledge, creates a helper function like the one described.  Much easier to use a (quite common) compiler switch than to change your code.  This proposal provides a standard way to do what many/most C++ compilers already provide as an opt-in extension.



Thiago Macieira

Jul 29, 2016, 10:31:21 PM
to std-pr...@isocpp.org
On Friday, 29 July 2016 20:59:45 PDT Demi Obenour wrote:
> I think the bigger question is "Why should one need to go through all those
> hoops to do something as simple as reading/writing memory?".

DTRT (Do The Right Thing).

> I think that there should be an easier way. Nobody, to my knowledge,
> creates a helper function like the one described.

We do, if we want to write portable code today. For example, the code I
recently wrote for endianness conversion and reading from unaligned memory
(file and socket buffers) is all based around memcpy.

It may look superfluous and ugly to you, but a line like this:

qint64 x = qFromBigEndian<qint64>(buffer);

does the endianness swap and unaligned read in a single assembly instruction
(on Haswell processors). In the end, the code is more readable because I know
that the source was big endian.
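
For readers who have not seen this style, a helper of this kind can be sketched as below. This is not Qt's implementation of qFromBigEndian; from_big_endian_i64 is a hypothetical name, and the sketch only shows the memcpy-based pattern of reading a big-endian value from possibly unaligned memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a helper such as qFromBigEndian<qint64>.
inline std::int64_t from_big_endian_i64(const void* src) {
    unsigned char bytes[8];
    std::memcpy(bytes, src, sizeof bytes);  // safe for unaligned sources

    std::uint64_t value = 0;
    for (std::size_t i = 0; i < sizeof bytes; ++i) {
        value = (value << 8) | bytes[i];    // most significant byte first
    }
    // Assumption: the conversion to a signed type wraps (two's complement).
    return static_cast<std::int64_t>(value);
}
```

The result does not depend on the byte order of the host, because the value is assembled arithmetically rather than reinterpreted in place.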

> Much easier to use a (quite
> common) compiler switch than to change your code. This proposal provides a
> standard way to do what many/most C++ compilers already provide as an
> opt-in extension.

Using a compiler switch today is not an option for header code.

The pragmas are also not an option for macro code. You and I probably think
that they're poor practice and shouldn't be used, but others may disagree.

Finally, there's the issue that pragmas in template code increase the
complexity for the compiler, since it needs to remember the pragma state for
when it instantiates the template.

Demi Obenour

Jul 31, 2016, 2:42:17 AM
to ISO C++ Standard - Future Proposals

Yes, it may be required, but who actually does it?  I suspect that this proposal will mostly turn code that currently relies on non-standard switches into strictly conforming code.



Thiago Macieira

Jul 31, 2016, 3:15:09 AM
to std-pr...@isocpp.org
On Sunday, 31 July 2016 02:42:15 PDT Demi Obenour wrote:
> Yes, it may be required, but who actually does it? I suspect that this
> proposal will mostly make code strictly conforming that currently relies on
> non-standard switches.

Like I said, I do.

Demi Obenour

Jul 31, 2016, 4:07:45 AM
to ISO C++ Standard - Future Proposals

Do you know of anyone else?  Any open source code?



Edward Catmur

Jul 31, 2016, 4:30:22 AM
to std-pr...@isocpp.org

On 31 Jul 2016 9:07 a.m., "Demi Obenour" <demio...@gmail.com> wrote:
>
> Do you know of anyone else?  Any open source code?

Most of the Boost code I've seen is correct. Do you want specific examples?


Thiago Macieira

Jul 31, 2016, 2:30:34 PM
to std-pr...@isocpp.org
On Sunday, 31 July 2016 04:07:42 PDT Demi Obenour wrote:
> Do you know of anyone else? Any open source code?

My code is open source.

http://code.qt.io/cgit/qt/qtbase.git/tree/src/corelib