https://chromium.googlesource.com/chromium/src/+/HEAD/styleguide/c++/c++11.md#std_byte-tbdAllowing std::byte seems harmless to me. I'm guessing the main concern would be knowing when it's appropriate to use std::byte over something like char or uint8_t. I imagine most of the time it would be used for blob-y data types, though.
--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CACwGi-4-P3YM2MvUsFv3N-yQ79d-TDPxFxW8b_RU4%3D46oBz-Wg%40mail.gmail.com.
https://chromium.googlesource.com/chromium/src/+/HEAD/styleguide/c++/c++11.md#std_byte-tbdAllowing std::byte seems harmless to me. I'm guessing the main concern would be knowing when it's appropriate to use std::byte over something like char or uint8_t. I imagine most of the time it would be used for blob-y data types, though.
I'm reviewing some code at the moment that uses a std::vector<uint8_t> to hold a bag of bytes, and I was thinking, "Hey, we have C++17 now. Could we use a std::byte here?" Turns out the answer is TBD, so here's my proposal. :-)
--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CACwGi-6B2JeOwggqQ-4_5bMPCf9epGv63eOofgGnGLL6oXbu6Q%40mail.gmail.com.
I think it's a good style distinction to use std::byte for cases of "moving opaque data around", and use uint8_t for "arrays of 8-bit integer values". There is no real distinction to the compiler wrt aliasing rules and all that.
The only real distinction is in the front-end experience of requiring `static_cast<some integer type>(byte)` before you can do arithmetic, which IMHO is a good thing in distinguishing the intent of each use. If you find yourself writing those casts, it's probably a case where using uint8_t instead may have been appropriate.--On Thu, Jan 6, 2022 at 10:30 AM K. Moon <km...@chromium.org> wrote:The review in question is https://crrev.com/c/3368635. In this case, we're taking a blob of pixels, which we then transfer using postMessage(), which then gets interpreted on the other side of the IPC boundary. No other operations are performed on the data.--On Thu, Jan 6, 2022 at 10:28 AM Peter Kasting <pkas...@google.com> wrote:On Thu, Jan 6, 2022 at 10:25 AM K. Moon <km...@chromium.org> wrote:I'm reviewing some code at the moment that uses a std::vector<uint8_t> to hold a bag of bytes, and I was thinking, "Hey, we have C++17 now. Could we use a std::byte here?" Turns out the answer is TBD, so here's my proposal. :-)Tentative +1, but it'd be good to see how the code is using uint8_t to see if "byte" would remove any aliasing violations or otherwise add clarity. If the author actually wants numeric values out the other end, std::byte could just add more hassle...PK
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CACwGi-6B2JeOwggqQ-4_5bMPCf9epGv63eOofgGnGLL6oXbu6Q%40mail.gmail.com.
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAB%3D4xhpTSx5yA%3DRpuDPUMW%2ByJNUf2kW9BhAZUkDXUYwQkJrg8Q%40mail.gmail.com.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CACwGi-4ndAauzq83E%3DHEspc8zKqvn2UTLL7oTQ_v3_2Z%3Da_W3Q%40mail.gmail.com.
Sure, but we had no choice prior to C++17 :)I do think that allowing it means making a decision for base::as_bytes().
It also matters a bit for things that take std::vectors, because we can't just cast std::vector<uint8_t> to std::vector<std::byte>.
DanielOn Thu, 6 Jan 2022 at 11:39, K. Moon <km...@chromium.org> wrote:I think converging on std::byte in as many places as makes sense might be useful, but ultimately I think that's a library design question, orthogonal to whether or not std::byte should be allowed as a language feature. There are plenty of cases where we allow a C++ feature, but it wouldn't be a good choice in every design scenario.Where I think this becomes a valid concern is if the uncontrolled proliferation of std::byte creates harm, but I think Peter's earlier point about std::byte following the same aliasing rules as char and unsigned char actually give it an advantage over uint8_t in that respect: casting provides a safety valve for cheaply converting from std::byte* to char* (and from anything else). In that respect, uint8_t was a suboptimal choice in the first place, since there's no such guarantee about aliasing behavior.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAF3XrKqGT%2BWx9TSQAtkqUawnvRZoxA0mnKk_aR_LwJi3MBVeqw%40mail.gmail.com.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CALB5Stb5gntJCCyJm6%3DG4_8HZwLaGKxqC7AGgayc2JeDNPnTHw%40mail.gmail.com.
+1 to allow std::byte.On Thu, Jan 6, 2022 at 9:14 PM Daniel Cheng <dch...@chromium.org> wrote:Sure, but we had no choice prior to C++17 :)I do think that allowing it means making a decision for base::as_bytes().Assuming we eventually want to adopt std::span and its byte conversion utilities I suggest we change base::as_bytes()'s return type sooner rather than later.It also matters a bit for things that take std::vectors, because we can't just cast std::vector<uint8_t> to std::vector<std::byte>.That unfortunately is true (w/o invoking UB). At least the conversion can be done via memcpy, though.DanielOn Thu, 6 Jan 2022 at 11:39, K. Moon <km...@chromium.org> wrote:I think converging on std::byte in as many places as makes sense might be useful, but ultimately I think that's a library design question, orthogonal to whether or not std::byte should be allowed as a language feature. There are plenty of cases where we allow a C++ feature, but it wouldn't be a good choice in every design scenario.Where I think this becomes a valid concern is if the uncontrolled proliferation of std::byte creates harm, but I think Peter's earlier point about std::byte following the same aliasing rules as char and unsigned char actually give it an advantage over uint8_t in that respect: casting provides a safety valve for cheaply converting from std::byte* to char* (and from anything else). In that respect, uint8_t was a suboptimal choice in the first place, since there's no such guarantee about aliasing behavior.While aliasing is not guaranteed, uint8_t should just be an alias for unsigned char on all platforms we care about. So you could statically assert that these are the same types if you rely on uint8_t aliasing.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CALB5Stb5gntJCCyJm6%3DG4_8HZwLaGKxqC7AGgayc2JeDNPnTHw%40mail.gmail.com.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAB%3D4xhoN6k%3DEcwjyVNRKnbuewWwe7nBYJN5wPp5miKb21MfhGA%40mail.gmail.com.
We can probably treat array<uint8_t> and array<byte> as a special case of being wire compatible if needed from the Mojo perspective. My bigger concern is to make sure we have some consistency with uint8_t vs std::byte vs char so we don't just end up with more ways to do things and even more conversions marshalling things from vector<uint8_t> to vector<std::byte> to vector<char>.
What about the reverse issue, with APIs that require std::byte? This feels to me like trying to ban size_t; it's just not tenable to ban such a fundamental type, because other C++ code (including in the standard library) is going to assume it's available (and we'll end up on a tech island).
As previously discussed, I also think the in-memory object representation thing is a red herring, as we've already decided that a byte is always going to be 8 bits. A block of char or uint8_t or std::byte are all going to have the same problem, and none of these are suitable as wire formats.
Many programs require byte-oriented access to memory. Today, such programs must use either the char,
signed char, or unsigned char types for this purpose. However, these types perform a “triple duty”.
Not only are they used for byte addressing, but also as arithmetic types, and as character types. This
multiplicity of roles opens the door for programmer error – such as accidentally performing arithmetic on
memory that should be treated as a byte value – and confusion for both programmers and tools.
Having a distinct byte type improves type-safety, by distinguishing byte-oriented access to memory from
accessing memory as a character or integral value. It improves readability. Having the type would also
make the intent of code clearer to readers (as well as tooling for understanding and transforming
programs). It increases type-safety by removing ambiguities in expression of programmer’s intent,
thereby increasing the accuracy of analysis tools.
On Mon, Jan 10, 2022 at 11:32 AM K. Moon <km...@chromium.org> wrote:What about the reverse issue, with APIs that require std::byte? This feels to me like trying to ban size_t; it's just not tenable to ban such a fundamental type, because other C++ code (including in the standard library) is going to assume it's available (and we'll end up on a tech island).Any API taking a `std::byte*` would be fine to pass &object to, without requiring us to use byte ourselves.
Any API taking vector<byte> is doing it wrong, I think, is the conclusion of what byte represents. And if an API wanted to receive that we'd have to convert from vector<thingswecanconstruct> to vector<byte> at some point anyway, so just do that at the edge like we do when we need to pass a std::function.
As previously discussed, I also think the in-memory object representation thing is a red herring, as we've already decided that a byte is always going to be 8 bits. A block of char or uint8_t or std::byte are all going to have the same problem, and none of these are suitable as wire formats.I don't think the point is about the size of the type. A byte is meant to be an opaque representation of an object, not a bytestream. As uint8_t* is meant to be a byte-stream not text, as char* is meant to be text, not a bytestream. They're all the same size but that doesn't mean they make logical sense in any give place. Our code is a mess in using these wrong, and adding byte won't improve the situation, we'll just have N+1 problems. Folks like David have spent a lot of time trying to reduce the mess between uint8_t* and char*, which is why there are strong feelings I presume about backsliding by introducing yet another type. One that, if used for random "blobs of data" is even harder to reason about when to use vs uint8_t. "Will I read the bytes" is a question of semi-global knowledge and not the type of information that necessarily makes sense in the type system in this way.
If, instead, byte* is not referring to intent to read/write the data, but instead refers to being a pointer to an object (or array of objects) rather than a byte stream, you can have a more clear rule of when to use it or not.And in Chrome, we've already spent person-years trying to move away from char* for this purpose to int8_t*. Is it really worth adding inconsistency and spending many more years trying to change int8_t* to byte*, but only when you're not using the data..? I'd wager it's a poor use of our time.Using a char* as a "pointer to anything" is not great, and using byte* to represent that makes sense there. I expect that's why it's not banned. But using it as a concrete type `byte` vs `byte*` makes about as much sense as using a `void` instead of a `void*`, except that the former compiles now.Of course templates complicate things. span<byte> is in fact a byte* inside, but that's hidden behind an abstraction. It makes it hard to have a rule, but if I could: As a first pass, I would say you can use byte* (only as a pointer) in places where you'd have written void* or char* before. Thus span<byte> may technically be okay but makes little sense because you can't have a vector<byte>.
TL;DR: It seems clear that std::byte is controversial enough that it shouldn't be in the initial set of things that are allowed from C++17. That said, I'm less in love with std:byte, and more against banning things: My ideal end state is that we (eventually) ban as few things from C++17 as possible, and I don't see a good reason to exclude std::byte because we think some APIs would be better using char or uint8_t or whatever. As Peter suggested earlier, I think this could/should be an API by API decision.I think what would be even better is if we clearly articulated somewhere what our end goal is, as I don't think even char vs. uint8_t is written down anywhere. (It might also be a good opportunity to get ahead of the char8_t mess in C++20.)
For reference, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0298r3.pdf is the most recent version of P0298 (the std::byte proposal).The Motivation and Scope section reads (emphasis mine):Many programs require byte-oriented access to memory. Today, such programs must use either the char,
signed char, or unsigned char types for this purpose. However, these types perform a “triple duty”.
Not only are they used for byte addressing, but also as arithmetic types, and as character types. This
multiplicity of roles opens the door for programmer error – such as accidentally performing arithmetic on
memory that should be treated as a byte value – and confusion for both programmers and tools.Having a distinct byte type improves type-safety, by distinguishing byte-oriented access to memory from
accessing memory as a character or integral value. It improves readability. Having the type would also
make the intent of code clearer to readers (as well as tooling for understanding and transforming
programs). It increases type-safety by removing ambiguities in expression of programmer’s intent,
thereby increasing the accuracy of analysis tools.Whether or not one agrees with this analysis, the intent was to increase type safety. For example, uint8_t may be the wrong type if you actually have a buffer int8_t, and vice versa; std::byte doesn't make that assumption, so it can safely contain either (requiring explicit conversion). You also can't assign or compare 'a' to a std::byte.On Tue, Jan 11, 2022 at 10:02 AM <dan...@chromium.org> wrote:On Mon, Jan 10, 2022 at 11:32 AM K. Moon <km...@chromium.org> wrote:What about the reverse issue, with APIs that require std::byte? This feels to me like trying to ban size_t; it's just not tenable to ban such a fundamental type, because other C++ code (including in the standard library) is going to assume it's available (and we'll end up on a tech island).Any API taking a `std::byte*` would be fine to pass &object to, without requiring us to use byte ourselves.Any API taking vector<byte> is doing it wrong, I think, is the conclusion of what byte represents. And if an API wanted to receive that we'd have to convert from vector<thingswecanconstruct> to vector<byte> at some point anyway, so just do that at the edge like we do when we need to pass a std::function.For third-party types like absl::Status, I think it's acceptable to draw that line, but if it's part of the standard library, I think the bar for that is higher.
As you note, there's precedent for banning types like std::function, but even when there's a strong technical reason, it makes interop awkward. We can't predict if the standard committee is going to make some standard type we've banned a key part of some critical future API.
It doesn't seem like we have any consensus here.Dana mentions that "//base has no appetite for std::byte". I'd like to understand whether that means "we are opposed because using it would be detrimental" or "we might support it in theory, but the benefit doesn't seem to justify the cost of trying to update all the APIs" or "we have too much on our plates to really think deeply about this at all". (I know thestig@, for example, is in the third camp.)
On Wed, Jan 19, 2022 at 8:08 PM Peter Kasting <pkas...@google.com> wrote:It doesn't seem like we have any consensus here.Dana mentions that "//base has no appetite for std::byte". I'd like to understand whether that means "we are opposed because using it would be detrimental" or "we might support it in theory, but the benefit doesn't seem to justify the cost of trying to update all the APIs" or "we have too much on our plates to really think deeply about this at all". (I know thestig@, for example, is in the third camp.)I think davidben@ is the domain expert here and I would defer to him about using it instead of string/span<uint8_t>/etc. But given the amount of code using string still, it doesn't seem feasible to try to force the conversion to byte. We already failed to get everything to uint8_t instead of char. So, we end up with 3 options instead of 2.
I tend to feel:* The Google styleguide doesn't ban std::byte, and I haven't heard a compelling reason why Chrome is different.* A byte is not the same as an octet.* Bag-of-binary-data APIs should in principle use std::byte instead of std::uint8_t, because it is semantically more correct and eliminates some types of aliasing problems.* Bag-of-not-known-to-be-text-data APIs should in principle use std::byte instead of char/string, because it is semantically more correct and may reduce casting given the above bullet.* The practical benefit of eliminating aliasing problems is small given that Chrome will never compile with strict aliasing on.* Changes to core APIs are viral and thus harder.* Inconsistencies in APIs are stumbling blocks to readability.Personal conclusion: Allow std::byte; see if someone (e.g. me) has appetite to audit and convert a large fraction of applicable APIs to use it; either convert all those or don't convert anything.PK
--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAHtyhaRQnuJRJ%2BUOU2%2BznbdpoHQO%2BAgQYqNVERaVFEQ02WxHPA%40mail.gmail.com.
On Thu, Jan 20, 2022 at 3:46 PM <dan...@chromium.org> wrote:On Wed, Jan 19, 2022 at 8:08 PM Peter Kasting <pkas...@google.com> wrote:It doesn't seem like we have any consensus here.Dana mentions that "//base has no appetite for std::byte". I'd like to understand whether that means "we are opposed because using it would be detrimental" or "we might support it in theory, but the benefit doesn't seem to justify the cost of trying to update all the APIs" or "we have too much on our plates to really think deeply about this at all". (I know thestig@, for example, is in the third camp.)I think davidben@ is the domain expert here and I would defer to him about using it instead of string/span<uint8_t>/etc. But given the amount of code using string still, it doesn't seem feasible to try to force the conversion to byte. We already failed to get everything to uint8_t instead of char. So, we end up with 3 options instead of 2.3 instead of 2 might actually be helpful, if and when some third party libraries start using std::byte.