CALL: Compilers Are Language Lawyers


Michael Witten

Apr 16, 2018, 7:35:55 PM
to std-dis...@isocpp.org
Synopsis: If a human can point out a conformance problem with
          some code, then a compiler SHOULD point out that
          problem; possibly, if a human can answer a question
          about the Standard, then a compiler should also be able
          to answer that question.

          Perhaps, there is a stricter statement: Every undefined
          behavior is run-time behavior; at compile-time, there
          is no such thing as undefined behavior. If this is true
          (or often true), this draws a distinction that may be
          helpful in identifying where future revisions of the
          Standard must define behavior for what may currently
          be considered an invocation of undefined behavior.

          For instance, if you define a variable in namespace
          `std', your compiler should at least warn that
          you're invoking undefined behavior; or, maybe, that
          behavior should be better defined by the Standard.

It is common for a programmer to treat a compiler as being the
ultimate Language Lawyer, someone who will gleefully peruse every
nook and cranny of a program in order to report even the minutest
contravention of the Standard. Alas, in practice, a compiler does
not fulfill this role, potentially misleading the programmer to
confidently write code that is non-portable or that invokes
undefined behavior, and that causes much gnashing of teeth.

Instead, the Standard should impose the following requirement:
Within some practical, standardized limits, if every compiler
necessarily has the answer to a question, then every compiler
must provide a standardized means by which to ask that question
and to receive the definite answer.
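
For perspective, here is a minimal sketch (an illustration only, not
a proposal) of the standardized question-and-answer channel that
already exists for some compile-time facts; byte order is, until
`std::endian', conspicuously absent from the list:

#include <climits>
#include <limits>

// Each of these asks the compiler a question it necessarily knows the
// answer to, and the program receives a definite answer at compile time.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(void*) >= 4, "this code assumes at least 32-bit pointers");
static_assert(std::numeric_limits<int>::is_signed,
              "int is signed on every conforming implementation");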

Howard E. Hinnant emphasized this very principle as far as it
applies to one particular case: What is the endianness of the
execution environment? The "ancient, time-honored tradition" of
asking this question is considered in Howard's proposal for
`std::endian':

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0463r1.html
  https://howardhinnant.github.io/endian.html
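
To make that "time-honored tradition" concrete, here is a minimal
sketch of one such hack (an illustration only, not code taken from the
proposal); note that it yields its answer only at run time, never as a
compile-time constant:

#include <cstdint>
#include <cstring>

// Classic trick: store a known value and peek at its first byte.
bool is_little_endian()
{
  std::uint32_t value = 1;
  unsigned char bytes[sizeof value];
  std::memcpy(bytes, &value, sizeof value); // inspect the object representation
  return bytes[0] == 1;                     // low-order byte stored first?
}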

In that proposal, Howard eloquently identifies the special
position of the compiler in providing the answer:

  There are many hacks that work most of the time. None of them
  seem bullet proof. Few give the answer as a compile-time
  constant. And in every single case:

    THE COMPILER KNOWS THE ANSWER!

I, for one, burst out laughing when I read that insight, because
it struck me that every compiler necessarily knows so many
answers, and yet keeps them quietly hidden away like the guarded
trade secrets of the guilds of yore.

Indeed, Howard's proposal led me to my own question. You see, I
badly wanted to use `std::endian' in a program, but there is a
problem: My code is targeting C++17, and Howard's newfangled
feature is slated for support only in C++20.

No matter, I thought. I'll just "backport" it; I'll just include
in my code an extended version of Howard's sample implementation,
conditionally compiled to avoid a future conflict:

#if __cplusplus >= 202000L
#include <type_traits>
#else
namespace std
{
  enum class endian
  {
#ifdef _WIN32
    little = 0,
    big    = 1,
    native = little
#elif (defined __ORDER_LITTLE_ENDIAN__) && (defined __ORDER_BIG_ENDIAN__) && (defined __BYTE_ORDER__)
    little = __ORDER_LITTLE_ENDIAN__,
    big    = __ORDER_BIG_ENDIAN__,
    native = __BYTE_ORDER__
#else
#error "This platform has no implementation for C++20's `std::endian'."
#endif
  };
}
#endif

If that code exists in the includable source file `compat.h',
then one might write the following program:

#include <iostream>
#include "compat.h"

int main()
{
  if (std::endian::native == std::endian::little)
    std::cout << "The execution environment is little-endian.\n";
}

If that program exists in the source file `endian.cpp', then one
might compile it with GCC's `g++', and run it as follows:

$ g++ -std=c++17 endian.cpp -o endian
$ ./endian
The execution environment is little-endian.

Great! The code works as expected. However... should one actually
expect the code to work? On second thought, it seems like a
suspicious intrusion of the Standard's sacred namespace. Well, by
default, C++ implementations tend to play fast and loose, or at
least they often provide nonstandard extensions; surely, this
question can be resolved by requesting the compiler to be a
little more unforgiving:

$ be_unforgiving='-pedantic-errors -Wall -Wextra -Werror'
$ g++ -std=c++17 $be_unforgiving endian.cpp -o endian

Huh. Hmmm. No problems. How about LLVM's compiler?

$ clang++ -std=c++17 -Weverything -Wno-c++98-compat endian.cpp

Nothing. Sigh... Fine. Let's go where few have gone before:
To consult the Standard. Consider section `[namespace.std]':

  http://eel.is/c++draft/namespace.std (generated on 2018-04-15)
  https://github.com/cplusplus/draft/blob/99325f3d8975075d27e40d8548919f70fc7824b8/source/lib-intro.tex#L2202

According to that, the future C++20 Standard will state something
similar to the following:

  Unless otherwise specified, the behavior of a C++ program is
  undefined if it adds declarations or definitions to namespace
  std or to a namespace within namespace std.

Finally! Clarity. Sort of. I mean, I'm not going to go looking
for where it might be "otherwise specified", so practicality
demands that I assume my usage of namespace `std' exists
[inexplicably] outside the purview of the Standard. That is, the
aforementioned program invokes undefined behavior, and I wish I
had never been able to compile it (without any warning); I feel
so betrayed; I feel as if the world is a house of cards built
atop a foundation of sand; I feel the need to gnash my teeth.

With this new knowledge, there naturally percolates in the mind a
better-defined solution, namely to avoid naming namespace `std':

#if __cplusplus >= 202000L
#include <type_traits>
using std::endian;
#else
enum class endian
{
#ifdef _WIN32
  little = 0,
  big    = 1,
  native = little
#elif (defined __ORDER_LITTLE_ENDIAN__) && (defined __ORDER_BIG_ENDIAN__) && (defined __BYTE_ORDER__)
  little = __ORDER_LITTLE_ENDIAN__,
  big    = __ORDER_BIG_ENDIAN__,
  native = __BYTE_ORDER__
#else
#error "This platform has no implementation for C++20's `std::endian'."
#endif
};
#endif

Now, the program in question can be re-written thusly:

#include <iostream>
#include "compat.h"

int main()
{
  if (endian::native == endian::little)
    std::cout << "The execution environment is little-endian.\n";
}

However, this solution is actually quite unsatisfying, because
I'd rather it be written in terms of namespace `std', and why
shouldn't it be? Why shouldn't I have a well-defined way by which
to backport fully-accepted library additions, even if only
incompletely? After all, that kind of dangerous access is
precisely what draws experts to C++; some low-level, unpalatable
hack can be hammered into place, and then covered with a safe,
compatible, clean abstraction for everyday use.
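
For comparison, here is a rough sketch of that layering (the namespace
name `compat' is hypothetical, and the structure simply mirrors
`compat.h' above): the unpalatable, platform-specific part is confined
to one place, and everyday code only ever sees the clean name:

#if __cplusplus >= 202000L
#include <type_traits>
namespace compat { using endian = std::endian; } // alias the real thing when available
#else
namespace compat
{
  enum class endian
  {
#if (defined __ORDER_LITTLE_ENDIAN__) && (defined __ORDER_BIG_ENDIAN__) && (defined __BYTE_ORDER__)
    little = __ORDER_LITTLE_ENDIAN__,
    big    = __ORDER_BIG_ENDIAN__,
    native = __BYTE_ORDER__
#else
#error "This platform has no implementation for C++20's `std::endian'."
#endif
  };
}
#endif

// Everyday use: if (compat::endian::native == compat::endian::little) ...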

And, you'll note that I employed the term "a well-defined way"
rather than the term "a well-formed way". Consider the Standard's
section `[defns.well.formed]':

  http://eel.is/c++draft/defns.well.formed
  https://github.com/cplusplus/draft/blob/99325f3d8975075d27e40d8548919f70fc7824b8/source/intro.tex#L331

It states that a "well-formed program" is defined as a:

  C++ program constructed according to the syntax rules,
  diagnosable semantic rules, and the one-definition rule

Well, even though the initial version of the program has
undefined behavior, it still manages to tick all those boxes, and
is thus a well-formed program. Gah! I suppose that's the loophole
for the experts, but I must say it looks more like a noose.

Recall to mind the rule that riles:

  Unless otherwise specified, the behavior of a C++ program is
  undefined if it adds declarations or definitions to namespace
  std or to a namespace within namespace std.

By establishing "undefined behavior", that sentence in the
Standard not only allows, but even invites, an implementation to
utterly ignore that entire sentence; effectively, it is as though
that sentence doesn't even exist, because pretending that it
doesn't exist is certainly within the realm of allowed behavior.
Why is that sentence even there? How many more such ghostly
sentences exist within the Standard? Quite a few, I imagine, as
implied by GCC's documentation:

$ info '(gcc)Warning Options'

According to that, the GCC project isn't even much interested in
what the Standard does explicitly prohibit:

  A feature to report any failure to conform to ISO C might be
  useful in some instances, but would require considerable
  additional work and would be quite different from '-Wpedantic'.
  We don't have plans to support such a feature in the near
  future.

Hey. Maybe that's a good thing. If our compilers actually cared,
we'd have nothing to talk about in the forums.

Surely, though, this is not the case here; surely, that which is
a matter of local compile-time analysis could benefit from rules
with well-defined behavior. Were a Language Lawyer to happen upon
the program in question, that lawyer would feel morally compelled
to belittle the blunder; shouldn't the compiler chastise the
same? Let the Standard state instead something similar to this:

  Namespace std shall be distinguished from any other namespace
  in only the following way: If a C++ program adds a declaration
  or definition to namespace std (or to a namespace within
  namespace std), then that declaration or definition shall
  be treated as an implementation-specific extension (4.1) of
  this International Standard. [Note: Such an extension is not
  allowed to alter the behavior of any well-formed program;
  thus, if such an extension does alter the behavior of any
  well-formed program, the extended implementation is essentially
  non-conforming, and the behavior is undefined. ---end note]

As referenced, this rule depends on section `[intro.compliance]':

  http://eel.is/c++draft/intro.compliance
  https://github.com/cplusplus/draft/blob/99325f3d8975075d27e40d8548919f70fc7824b8/source/intro.tex#L441

That section states:

  A conforming implementation may have extensions (including
  additional library functions), provided they do not alter
  the behavior of any well-formed program. Implementations are
  required to diagnose programs that use such extensions that are
  ill-formed according to this document. Having done so, however,
  they can compile and execute such programs.

This also seems to have the benefit of standardizing the existing
behavior of at least 2 major compilers, `g++' and `clang++'.

Where else in the Standard might compile-time behavior be
separated from run-time behavior, and thereby be recast as at
least a diagnosable rule?

Sincerely,
Michael Witten

Kevin Morris

Apr 16, 2018, 7:57:14 PM
to std-dis...@isocpp.org
How can the Standard enforce this rule, when the standard library is not depended on by the compiler? It seems to me that something like this would require that compilers and the standard library be coupled.

Regards,
Kevin Morris



Nicol Bolas

Apr 16, 2018, 8:52:16 PM
to ISO C++ Standard - Discussion
On Monday, April 16, 2018 at 7:35:55 PM UTC-4, Michael Witten wrote:
Synopsis: If a  human can  point out  a conformance  problem with
          some  code,  then  a  compiler SHOULD  point  out  that
          problem;  possibly, if  a human  can answer  a question
          about the Standard, then a compiler should also be able
          to answer that question.
         
          Perhaps, there is a stricter statement: Every undefined
          behavior is  run-time behavior; at  compile-time, there
          is no such thing as undefined behavior. If this is true
          (or often true),  this draws a distinction  that may be
          helpful in  identifying where  future revisions  of the
          Standard  must define  behavior for what  may currently
          be considered an invocation of undefined behavior.

          For  instance, if  you define  a variable  in namespace
          `std',  your   compiler  should  at  least   warn  that
          you're  invoking  undefined  behavior; or,  maybe, that
          behavior should be better defined by the Standard.

It is  common for a programmer  to treat a compiler  as being the
ultimate Language Lawyer, someone who will gleefully peruse every
nook and cranny of a program in order to report even the minutest
contravention of the Standard.

Citation needed.

I've seen many people who treat the compiler as the language. This is the "if my compiler allows it, then my code is fine" approach. But only rarely do such people say that their compiler's behavior genuinely defines the language, such that if the standard says something different, then the standard is wrong.

Alas, in practice, a compiler does
not fulfill  this role, potentially misleading  the programmer to
confidently  write  code that  is  non-portable  or that  invokes
undefined behavior, and that causes much gnashing of teeth.

Instead, the  Standard should  impose the  following requirement:
Within  some practical,  standardized limits,  if every  compiler
necessarily has  the answer  to a  question, then  every compiler
must provide a  standardized means by which to  ask that question
and to receive the definite answer.

No.

Who defines "every compiler?" Who defines what these "questions" are?

C++ is written against an abstract machine. The purpose of that is so that it can remain portable to a variety of systems. What you're trying to do is remove the abstract machine, to say that the language is merely what compilers do.

That's bad. That's not a standard; that's a bunch of arbitrary, incoherent, and contradictory rules.

Howard E.  Hinnant emphasized  this very principle  as far  as it
applies to  one particular  case: What is  the endianness  of the
execution environment?  The "ancient, time-honored  tradition" of
asking  this  question is  considered  in  Howard's proposal  for
`std::endian':

  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0463r1.html
  https://howardhinnant.github.io/endian.html

In  that  proposal,  Howard  eloquently  identifies  the  special
position of the compiler in providing the answer:

  There are many  hacks that work most of the  time. None of them
  seem  bullet  proof. Few  give  the  answer as  a  compile-time
  constant. And in every single case:

    THE COMPILER KNOWS THE ANSWER!

I, for one, burst out laughing  when I read that insight, because
it  struck  me that  every  compiler  necessarily knows  so  many
answers, and yet keeps them  quietly hidden away like the guarded
trade secrets of the guilds of yore.

Yes, and it's a terrible answer, because nowhere in the standard does it say what the question means! The standard gives precisely zero guarantees about the behavior of this code:

std::uint32_t val = 0xFF00FF00;
auto bits = std::bitcast<std::array<std::uint8_t, 4>>(val);

if constexpr(std::endian::native == std::endian::little)
{
  assert(bits[0] == 0x00);
}
else if constexpr(std::endian::native == std::endian::big)
{
  assert(bits[0] == 0xFF);
}

Where in the standard does it say that a compiler that uses `std::endian::little` will issue that assert? And if the standard doesn't say that, then what good is defining the endian? Howard's proposal is completely useless from a pure standards perspective, since the standard never spells out what any of the answers actually change about the behavior of the code.

Oh sure, you know what endian means. I know what endian means. But since the standard does not, it can give no statement about the behavior of the aforementioned code.

As such, what good is asking the question? If the compiler is not bound to actually be little endian if it says that it is little endian, the question is meaningless. Yes, under Howard's proposal it is perfectly legal for an implementation to claim `endian::little` status without actually being little endian.

That's how poor the proposal is.

However, this  solution is  actually quite  unsatisfying, because
I'd rather  it be written  in terms  of namespace `std',  and why
shouldn't it be? Why shouldn't I have a well-defined way by which
to  backport  fully-accepted  library  additions,  even  if  only
incompletely?

Because implementations need to have a playground of their own, one which you can't go stomping around in and breaking things. You can't add stuff to `std` because the standard library and the compiler can add whatever they want to `std`.

A valid C++17 implementation could already define `enum class endian` and use it internally for their own purposes. By defining yours in `std`, you now break their code. Or rather, it breaks yours. Implementations need a place where they can define things without fear that someone else can come along and break them.
 
Recall to mind the rule that riles:

  Unless otherwise  specified, the behavior  of a C++  program is
  undefined if  it adds declarations or  definitions to namespace
  std or to a namespace within namespace std.
 
By  establishing  "undefined  behavior",  that  sentence  in  the
Standard not only allows, but  even invites, an implementation to
utterly ignore that entire sentence; effectively, it is as though
that  sentence doesn't  even  exist, because  pretending that  it
doesn't exist is certainly within  the realm of allowed behavior.

No, it's not the same thing as if it "doesn't even exist".

See, if that sentence weren't there, and both you and your standard library implementation added `enum class endian` to `std`, who would be wrong? Without that statement, it is the standard library implementation that would be wrong. The implementation would have absolutely no right to declare things that aren't part of the standard library in `std`.

What that sentence means is that, if you put that declaration in `std`, and there is a conflict, the person responsible for that conflict is you. Not the compiler or the standard library implementation; you. Your code fails to compile because you broke the rules.

If you break the law, but nobody saw you do it, you still broke the law even though there was no punishment... that time. That's how UB works.

The problem isn't the law; it's the police officers not enforcing it. So go talk to your compiler vendor and leave the standard alone.

Howard Hinnant

Apr 16, 2018, 10:58:19 PM
to std-dis...@isocpp.org
On Apr 16, 2018, at 8:52 PM, Nicol Bolas <jmck...@gmail.com> wrote:
>
> Yes, and it's a terrible answer, because nowhere in the standard does it say what the question means! The standard gives precisely zero guarantees about the behavior of this code:
>
> std::uint32_t val = 0xFF00FF00;
> auto bits = std::bitcast<std::array<std::uint8_t, 4>>(val);
>
> if constexpr(std::endian::native == std::endian::little)
> {
> assert(bits[0] == 0x00);
> }
> else if constexpr(std::endian::native == std::endian::big)
> {
> assert(bits[0] == 0xFF);
> }
>
> Where in the standard does it say that a compiler that uses `std::endian::little` will issue that assert?

Well, we did our best:

23.15.9 Endian [meta.endian]

> 1 Two common methods of byte ordering in multibyte scalar types are big-endian and little-endian in the execution environment. Big-endian is a format for storage of binary data in which the most significant byte is placed first, with the rest in descending order. Little-endian is a format for storage of binary data in which the least significant byte is placed first, with the rest in ascending order. This subclause describes the endianness of the scalar types of the execution environment.

I shouldn’t brag. These words were shamelessly lifted from the POSIX spec.

But sometimes English words are accepted as simply terms of art with accepted meanings. For example, prior to this proposal, and now in the deprecated section (for reasons unassociated with a precise meaning of endian) are specifications that used the term endian since C++11:

D.18.1 Header <codecvt> synopsis [depr.codecvt.syn]

> If (Mode & little_endian), the facet shall generate a multibyte sequence in little-endian order, as opposed to the default big-endian order.

So...

> Howard's proposal is completely useless from a pure standards perspective, since the standard never spells out what any of the answers actually change about the behavior of the code.

So lighten up a little. :-)

And most importantly: Suggest improved wording.

Just bitching about our sad state doesn’t improve things. Making yourself a target for bitching by getting improvements through the standardization process improves things.

Howard


Thiago Macieira

Apr 17, 2018, 1:20:01 AM
to std-dis...@isocpp.org
On Monday, 16 April 2018 19:58:15 PDT Howard Hinnant wrote:
> D.18.1 Header <codecvt> synopsis [depr.codecvt.syn]
>
> > If (Mode & little_endian), the facet shall generate a multibyte sequence
> > in little-endian order, as opposed to the default big-endian order.
> So...

This is slightly different because we're talking about I/O, so this is
observable behaviour. This section is specifying what the facet should produce
as output and standardises it.

The internal ABI of integers is not. You can memcpy them to a buffer, but the
standard does not say what you'll find there, only that you can copy back and
get the same, original value. You may write them to I/O, but the standard
again makes no guarantee about what the observed output will be. Nor does it
even guarantee that you can later read them back -- there are, after all,
multi-endian machines (ARM, MIPS, PPC, IA-64, to name a few) and a carefully
compiled program could run in either endianness.
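
A small sketch of the guarantee described above (an illustration, not
code from the message): the contents of the buffer are unspecified, but
the round trip must reproduce the original value.

#include <cassert>
#include <cstring>

int main()
{
  int original = 0x12345678;
  unsigned char buffer[sizeof original];

  std::memcpy(buffer, &original, sizeof original); // byte order in buffer: unspecified
  int restored = 0;
  std::memcpy(&restored, buffer, sizeof restored); // copying back is guaranteed to work

  assert(restored == original);
}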

The point is that the abstract machine on which standardese C++ runs has no
need for an endianness concept.

However, WE DO. And the compiler knows the answer. So I am in support of
std::endian.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center



Richard Hodges

Apr 17, 2018, 3:15:18 AM
to std-dis...@isocpp.org


On Tue, 17 Apr 2018, 00:57 Kevin Morris, <kevin....@codestruct.net> wrote:
How can the Standard enforce this rule, when the standard library is not depended on by the compiler? It seems to me that something like this would require that compilers and the standard library be coupled.

They are completely coupled. The tuple and typeinfo headers ensure this. 

I agree 100%. The compiler should enforce the law. Undefined behaviour is C++'s biggest flaw and serves no real-life purpose whatsoever.




Regards,
Kevin Morris



Nicol Bolas

Apr 17, 2018, 3:36:30 AM
to ISO C++ Standard - Discussion


On Monday, April 16, 2018 at 10:58:19 PM UTC-4, Howard Hinnant wrote:
On Apr 16, 2018, at 8:52 PM, Nicol Bolas <jmck...@gmail.com> wrote:
>
> Yes, and it's a terrible answer, because nowhere in the standard does it say what the question means! The standard gives precisely zero guarantees about the behavior of this code:
>
> std::uint32_t val = 0xFF00FF00;
> auto bits = std::bitcast<std::array<std::uint8_t, 4>>(val);
>
> if constexpr(std::endian::native == std::endian::little)
> {
>   assert(bits[0] == 0x00);
> }
> else if constexpr(std::endian::native == std::endian::big)
> {
>   assert(bits[0] == 0xFF);
> }
>
> Where in the standard does it say that a compiler that uses `std::endian::little` will issue that assert?

Well, we did our best:

23.15.9 Endian [meta.endian]

> 1 Two common methods of byte ordering in multibyte scalar types are big-endian and little-endian in the execution environment. Big-endian is a format for storage of binary data in which the most significant byte is placed first, with the rest in descending order. Little-endian is a format for storage of binary data in which the least significant byte is placed first, with the rest in ascending order. This subclause describes the endianness of the scalar types of the execution environment.

I shouldn’t brag.  These words were shamelessly lifted from the POSiX spec.

Maybe that's the problem; C++'s object model is a lot more complicated than POSIX.

C++ defines that an unsigned integer is stored as a binary integer, with some specific maximum size, perhaps based on its bit depth. Unsigned integers can still have padding, so they may not have a unique object representation.

So one can presume that the binary integer, when read as a sequence of bytes-

Oops: reading unsigned integers as a sequence of bytes provides values which are (as I understand it) implementation-defined. Regardless of what your paragraph says, you cannot rely on any particular byte in that sequence to have any particular value relative to the value stored in the object. What you're guaranteed is the ability to copy all of them in order from one object of that type to another (possibly through an intermediate buffer), and this operation shall be equivalent to an object copy, and that different values will use different value representations.

On the one hand, we have the part of the standard that declares the object representation to be a complete black box. On the other hand, this one paragraph tries to define what's inside the box, but it never actually rescinds any of the language declaring it to be a black box.

So at best, we have a contradiction. To resolve this, you would need to pin down something more specific about the object representation of many fundamental types. And this is not a simple prospect that one paragraph can resolve.

For example, scalar types are allowed to have padding. If they do, where does that padding appear with regard to the endian-ness of the object? After all, we're talking about the distinction between value representation and object representation; nothing in the standard says where the unused bits are. Maybe big/little endian needs to also say where any possible padding bits are, or a platform that specifies big/little endian must forbid padding bits on all scalar types. Or perhaps there can be some alternate answer. My main point is that the current wording doesn't even ask the question, let alone provide an answer.
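
As a brief sketch of the padding question (an illustration added here,
not part of the original reply): C++17 can at least detect whether a
scalar type has padding bits, via a standard trait, but the endian
designation still says nothing about where any such bits would sit.

#include <type_traits>

// Fails to compile on an implementation where unsigned int has padding
// bits; passing it tells you the whole object representation carries
// value, but still not which byte holds which part of the value.
static_assert(std::has_unique_object_representations_v<unsigned int>,
              "unsigned int has padding bits on this implementation");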

And that's just one example; there could be many others lurking in the specification. I am not nearly familiar enough with value representations vs. object representations, scalar types, and other aspects of the C++ object model to provide a comprehensive list of such circumstances. But I am familiar enough with it to know that the current language seems underspecified.

But sometimes English words are accepted as simply terms of art with accepted meanings.  For example, prior to this proposal, and now in the deprecated section (for reasons unassociated with a precise meaning of endian) are specifications that used the term endian since C++11:

    D.18.1 Header <codecvt> synopsis [depr.codecvt.syn]

> If (Mode & little_endian), the facet shall generate a multibyte sequence in little-endian order, as opposed to the default big-endian order.

The difference here is that codecvt is not dealing with the C++ object model; it is simply outputting a sequence of bytes. It can do so in one order or another; using "endian" as a term of art makes sense in this case.

But when aspects of the C++ object model start getting involved, the technical questions of the prospective "term of art" start becoming too important to ignore.

So...

> Howard's proposal is completely useless from a pure standards perspective, since the standard never spells out what any of the answers actually change about the behavior of the code.

So lighten up a little. :-)

I'd already come to terms with the idea that the standard was not going to specify how endian ordering behaves, that to do endian encoding, you're going to have to dip into implementation/undefined behavior. My principal annoyance here was that the OP seemed to use your proposal as an example of the proper way to write a "specification", that the language should be considered whatever a majority vote of compilers do, that features should be added with focus primarily on compilers (getting them to "answer questions") rather than the sanctity and meaning of the abstract machine (providing some idea of what those questions and their answers actually mean).

I apologize for allowing my annoyance with the OP's ideas to spill over to you.

Jens Maurer

Apr 17, 2018, 6:17:40 AM
to std-dis...@isocpp.org
On 04/17/2018 07:19 AM, Thiago Macieira wrote:
> On Monday, 16 April 2018 19:58:15 PDT Howard Hinnant wrote:
>> D.18.1 Header <codecvt> synopsis [depr.codecvt.syn]
>>
>>> If (Mode & little_endian), the facet shall generate a multibyte sequence
>>> in little-endian order, as opposed to the default big-endian order.
>> So...
>
> This is slightly different because we're talking about I/O, so this is
> observable behaviour. This section is specifying what the facet should produce
> as output and standardises it.
>
> The internal ABI of integers is not. You can memcpy them to a buffer, but the
> standard does not say what you'll find there, only that you can copy back and
> get the same, original value.

Right, we don't say where in the object representation the bits are stored.
(And, in fact, they could be arbitrarily arranged, not just by the simple
octet-chunked concept called endianness.)

However, I'd expect environments that actually define the endian
designation (i.e. are big or little endian) to represent integers
in bits with increasing value (otherwise, the term "endian" makes
little sense to me).

Suggestions for improved wording are welcome.

Jens

Hyman Rosen

Apr 17, 2018, 8:53:57 AM
to std-dis...@isocpp.org
On Tue, Apr 17, 2018 at 3:36 AM, Nicol Bolas <jmck...@gmail.com> wrote:
But when aspects of the C++ object model start getting involved, the technical questions of the prospective "term of art" start becoming too important to ignore.

The C++ object model is ridiculous, and should be replaced with the "bag of bits" model.