This proposal should be generalized to a more generic byte swapping solution. Something like:
This is so easy and trivial to implement. Most compilers even have builtins for all of the byte-swapping routines. I don't know why it's not standardized.
It's not standardized because networking didn't need the generic solution and I don't recall
seeing a proposal for the generic one.
I am strongly against hton and ntoh. There is no single fixed network byte order; the byte order depends on the underlying protocol.
I cannot say whether it is better or worse. What is your opinion about it?
> constexpr uint16_t bswap(uint16_t v) {
> return __builtin_bswap16(v);
> }
I didn't know that __builtin_bswap could be used as a constexpr. Awesome!
> constexpr uint16_t bswap16(uint16_t t) { return bswap(t); }
I would prefer a template for bswap:
template <typename T>
constexpr T bswap(T x); //Only specialized for supported types
This enables the following usages:
uint16_t x = bswap<uint16_t>(5);
uint16_t y = bswap(x);
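A minimal sketch of such a template, for illustration only. It assumes C++14 relaxed constexpr, 8-bit bytes, and unsigned integer types; the name byte_swap is hypothetical and the shift loop stands in for a compiler builtin such as __builtin_bswap16.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical name; the thread calls this bswap.
// Reverses the bytes of an unsigned integer value using shifts only,
// so it does not depend on the byte order of the host.
template <typename T>
constexpr T byte_swap(T x) {
    static_assert(std::is_integral<T>::value && std::is_unsigned<T>::value,
                  "this sketch only handles unsigned integer types");
    static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Move the lowest byte of x into the lowest byte of result.
        result = static_cast<T>((result << 8) | (x & 0xFF));
        x = static_cast<T>(x >> 8);
    }
    return result;
}

// Both call styles from the message work:
constexpr std::uint16_t a = byte_swap<std::uint16_t>(5);  // explicit type
constexpr std::uint16_t b = byte_swap(a);                 // deduced type
```

An implementation would be free to map each instantiation to a builtin; the loop is only there to keep the sketch portable.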
> template <typename T>
> constexpr T cpu_to_le(T t) { return byte_order::value == byte_order::little_endian ? t : bswap(t); }
I personally would prefer the wording 'native' to 'cpu'.
I would move the byte_orders to template arguments (either in addition
to, or as a replacement for, the above):
template <byte_order from, byte_order to, typename T>
constexpr T convertByteOrder(T t);
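One way the declaration above could be filled in, as a sketch under assumptions: the byte_order enum has only two values here, swap_bytes is a hypothetical shift-based helper for unsigned integers, and C++14 constexpr is available. Putting the two orders before T lets T be deduced from the argument.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed two-valued enum; the thread also discusses a native/host value.
enum class byte_order { little, big };

// Hypothetical helper: reverses the bytes of an unsigned integer
// (8-bit bytes assumed).
template <typename T>
constexpr T swap_bytes(T x) {
    T r = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        r = static_cast<T>((r << 8) | (x & 0xFF));
        x = static_cast<T>(x >> 8);
    }
    return r;
}

// Source and destination orders are template arguments, so the
// comparison is a compile-time constant.
template <byte_order From, byte_order To, typename T>
constexpr T convert_byte_order(T t) {
    return From == To ? t : swap_bytes(t);
}
```

Usage would look like convert_byte_order<byte_order::little, byte_order::big>(x), with T deduced from x.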
> //Optional: Should we support signed integers as well?
IMHO yes!
> //Optional: Should we support floating point types? Do binary formats or hardware devices need this?
Yes, I am sure someone will someday create a format that needs this (if
one does not already exist).
Question: Is there any architecture where integers and floating point
numbers are stored with different endiannesses?
> //Optional: Byte Swapping an arbitrary sized buffer? Is this at all useful?
I am not sure about this either. At least all supported integral types
should be supported.
Note: For buffers std::reverse_copy could be used to realize bswap.
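To illustrate the std::reverse_copy remark, here is a sketch that swaps a value by reversing its object representation. The function name is hypothetical, and the approach assumes a trivially copyable type without padding bytes.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: byte-swaps a value by copying its bytes into a
// buffer, reversing them with std::reverse_copy, and copying them back.
template <typename T>
T swap_bytes_via_buffer(T value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "the value must be copyable as raw bytes");
    unsigned char in[sizeof(T)];
    unsigned char out[sizeof(T)];
    std::memcpy(in, &value, sizeof(T));
    std::reverse_copy(in, in + sizeof(T), out);
    T result;
    std::memcpy(&result, out, sizeof(T));
    return result;
}
```

Unlike a shift-based swap, this version is not constexpr, but it works for any size, including odd ones such as 6 bytes.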
> //Optional: Do we need/want a macro interface?
I think we should have it!
What I've been missing is a function to convert the byte_order when the
order is not known at compile time, e.g.:
template<typename T>
constexpr T convertByteOrderRT(T t, byte_order from, byte_order to);
byte_order from = ??; // Is set during runtime
uint16_t x = convertByteOrderRT(v, from, little_endian);
This function could also replace the 'convertByteOrder' function
mentioned above.
Some corner cases we probably should not care about:
- Swapping of types with sizes like 6 bytes
- Behavior when CHAR_BIT != 8
I hope these remarks are helpful for you and welcome any feedback.
regards, Markus
We could also shorten the names even further:
htole()
htobe()
letoh()
betoh()
On 10/05/2013 04:04 PM, Ville Voutilainen wrote:
>
>
>
> On 5 October 2013 16:14, <fmatth...@gmail.com
> <mailto:fmatth...@gmail.com>> wrote:
>
>
> We could also shorten the names even further:
> htole()
> htobe()
> letoh()
> betoh()
>
>
>
>
> Perhaps we should try to make them readable. htons and ntohs are bad
> examples to follow
> in that regard.
>
+1 for readability.
I also suggest renaming the bswap template to 'byte_swap' or even
'swap_bytes'.
The same holds for 'bconvert' ('byte_convert' or 'convert_bytes').
If we are going to be this picky, 'host' is not that good when you run
in a virtual environment. Neither is 'native'. :-)
On 10/05/2013 05:54 AM, fmatth...@gmail.com wrote:
>
> Maybe this?
> enum class endian {
> little,
> big,
> native = little
> };
>
Yes. I like your solution. We should just be consistent about whether we
use the wording 'endian' or 'byte_order' (I prefer the latter, which you
also use in your version on GitHub).
>
> I personally would prefer the wording 'native' to 'cpu'.
> I would move the byte_orders to template arguments (either as addition,
> or as a replacement of the above):
>
>
> host is also another possibility. Naming is hard, let's think more, we
> have time.
>
I will create a list of possible names with some pros and cons to help
the discussion.
>
> Never heard of that, but if it's possible there can be separate int and
> float endian enums like we have above.
> I don't think that adds very much complexity. In the meantime, we should
> have them in.
A quick look at Wikipedia reveals the following:
'However, on modern standard computers (i.e., implementing IEEE 754),
one may in practice safely assume that the endianness is the same for
floating point numbers as for integers, making the conversion
straightforward regardless of data type. (Small embedded systems using
special floating point formats may be another matter however.)'
I am not sure if we should cover that case, but my solution would look
something like:
enum class byte_order {
little,
big,
integral_native = little, // implementation-defined
floating_point_native = little // implementation-defined; without an explicit value this would silently equal big
};
>
>
> > //Optional: Byte Swapping an arbitrary sized buffer? Is this at
> all useful?
>
> I am not sure about this either. At least all supported integral types
> should be supported.
>
After thinking about it further, I can't come up with a use case requiring
a buffer swap routine.
> Note: For buffers std::reverse_copy could be used to realize bswap.
>
> > //Optional: Do we need/want a macro interface?
>
> I think we should have it!
I am no longer sure that we really need a macro interface.
>
> What I've been missing is a function to convert the byte_order when the
> order is not known at compile time, e.g.:
>
> template<typename T>
> constexpr T convertByteOrderRT(T t, byte_order from, byte_order to);
>
> byte_order from = ??; // Is set during runtime
>
> uint16_t x = convertByteOrderRT(v, from, little_endian);
>
> This function could also replace the 'convertByteOrder' function
> mentioned above.
>
>
> runtime control makes sense.
>
>
I think we could unify the 'reorder_bytes' functions as well as the
'host_to' function into something like:
template <typename T>
constexpr T reorder_bytes(T t, byte_order in, byte_order out =
byte_order::host) {
return in == out ? t : swap_bytes(t);
}
On Sunday, 6 October 2013 11:34:17, fmatth...@gmail.com wrote:
> template <typename T>
> struct byte_order {
> static constexpr endian value = endian::little;
> };
>
> And then provide specializations for integral and floating point types as
> needed.
Any chance of modifying numeric_limits for this?
On 10/04/2013 07:05 AM, fmatth...@gmail.com wrote:
It's not standardized because networking didn't need the generic
solution and I don't recall
seeing a proposal for the generic one.
In that case I'd like to work on making one. It's a pretty small thing so
it should not be difficult.
Here is an example of what it might look like:
https://github.com/fmatthew5876/stdcxx/blob/master/byteorder/include/byteorder.hh
Anyone care to comment or add suggestions?
Thanks!
--
---
You received this message because you are subscribed to the Google
Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to std-proposals+unsubscribe@isocpp.org.
On Fri, Oct 4, 2013 at 5:51 AM, Markus Mayer <lotha...@gmx.de> wrote:
On 10/04/2013 07:05 AM, fmatth...@gmail.com wrote:
It's not standardized because networking didn't need the generic
solution and I don't recall
seeing a proposal for the generic one.
In that case I'd like to work on making one. It's a pretty small thing so
it should not be difficult.
Here is an example of what it might look like:
https://github.com/fmatthew5876/stdcxx/blob/master/byteorder/include/byteorder.hh
Anyone care to comment or add suggestions?
Thanks!
low-level an API for general use, because it requires the user to know or condition on the host endianness, which is almost always a mistake (q.v. http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html).
It seems to me that what's really wanted are two symmetric operations, serialization and deserialization, which convert between the (unspecified) native format and an explicitly-specified byte format (e.g. 32-bit little-endian two's complement, or big-endian IEEE 754 double-precision). The non-native format should be handled using char*, the standard-sanctioned way to work with uninterpreted bytes (or perhaps some type that wraps char*).
Straw-man proposal: establish a BinaryFormatPolicy concept, such that if P is a type that models that concept, and c is a char* value,
Policy::native_type Policy::read(c);
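A sketch of what one such policy might look like. Only native_type and read(c) come from the straw-man above; the type names le_uint32 and be_uint32, the write function, and the deserialize helper are hypothetical additions for illustration, and 8-bit bytes are assumed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical policy: 32-bit unsigned integer stored little-endian.
struct le_uint32 {
    using native_type = std::uint32_t;

    // Builds the native value from bytes using shifts, so the code never
    // needs to know the byte order of the host.
    static native_type read(const char* c) {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(c);
        return static_cast<native_type>(p[0])
             | (static_cast<native_type>(p[1]) << 8)
             | (static_cast<native_type>(p[2]) << 16)
             | (static_cast<native_type>(p[3]) << 24);
    }

    // Assumed counterpart of read; not part of the straw-man text.
    static void write(char* c, native_type v) {
        unsigned char* p = reinterpret_cast<unsigned char*>(c);
        for (std::size_t i = 0; i < 4; ++i) {
            p[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
        }
    }
};

// Hypothetical policy: the same value type stored big-endian.
struct be_uint32 {
    using native_type = std::uint32_t;

    static native_type read(const char* c) {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(c);
        return (static_cast<native_type>(p[0]) << 24)
             | (static_cast<native_type>(p[1]) << 16)
             | (static_cast<native_type>(p[2]) << 8)
             | static_cast<native_type>(p[3]);
    }
};

// Generic code names the external format, never the host format.
template <typename Policy>
typename Policy::native_type deserialize(const char* c) {
    return Policy::read(c);
}
```

This follows the argument of the linked article: the external format is fixed by the policy and the native format stays unspecified.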
There are situations in which you want to do more than just host vs. specified. For instance, you may have a machine that translates from one networking protocol to another. On the receiving end, the data is in little-endian format, but on the sending end, the data must be in big-endian format. You don't want to have to go through any intermediate step, you just want to convert little to big, even if you happen to be running on a PDP-10.
For what it's worth, both I and Beman Dawes have worked on two fairly complete endian libraries (that are partially merged). Mine only has the byte conversion functions, and his also has integer types of specified endianness. I will let Beman discuss his version (if he
reads this), but mine can be found at https://bitbucket.org/davidstone/endian so this might be a good reference for discussion.
I believe that it should work on any platform with a standard-compliant implementation, even if CHAR_BIT == 19, sizeof(short) == sizeof(int) == sizeof(long) == (sizeof(long long) / 9).
Based on my experiences now, I would actually argue against the naming decisions that I made in that library. For reference, the functions are named in the style of `T be_to_le(T t)` and `T h_to_pdp(T t)`. My original guide was to follow the lead of htons-style functions. The type is never included in the function name, though, and possible values in either position are be, le, pdp, h, and n. I now believe that it would be better to depart a bit from these somewhat cryptic names. These functions are unlikely to be called very often (usually relegated to one or two locations as part of a larger library function), and will almost never be part of large chained expressions. Typical use would likely look something like
int const value = n_to_h(read_int(socket));
as the most complicated form in most code. I believe that this justifies a more verbose naming convention with less abbreviation, and were I to re-write this library, I would spell things out more. My preference would be a name like host_to_network or little_to_big. host_to_network_byte_order is getting a little too long.
I also don't feel like we get any benefit from specifying the source and destination formats as an enum passed via template parameter.
Whether the "network" version of byte ordering functions is needed at all is something we would have to decide as well. I lean toward leaving it out (and just having people use "big"), but I do not feel strongly about this and would not complain if others preferred to have it in. I do not know how languages other than C name similar functions.
"host" would be my preferred name over "native" or "cpu", but this also isn't an important issue to me.
Theoretically these could be defined as constexpr using the new relaxed constexpr rules (but not as I defined the functions due to the use of reinterpret_cast). However, this would constrain implementations a bit. Based on my testing, the reinterpret_cast version actually ended up being slower on all compilers, but it is also the only simple solution that works for floating point types. Moreover, most uses of an endian library will be for writing to files or network interfaces, which cannot be done in constexpr functions, anyway, so these functions probably should not be declared constexpr.
They also should not be declared noexcept due to the possibility of trap representations.
Given all of this, I don't believe it would be necessary to worry about defining some sort of enum or integer value to specify what the byte order of the host machine is, as the only purpose for such a thing that I can see would be to define these functions.
There are systems where the data segment of memory can be little or big endian, and on such systems, these functions should still work as expected (do a dynamic determination of what type of system it is).
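A sketch of one way to do such a dynamic determination, assuming a 4-byte std::uint32_t with 8-bit bytes. The enum and function names are hypothetical, and the unknown value is an added assumption to cover mixed (PDP-style) orders.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical result type for the run-time check.
enum class byte_order { little, big, unknown };

// Inspects the object representation of a known value at run time.
inline byte_order detect_host_byte_order() {
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    if (bytes[0] == 0x04 && bytes[1] == 0x03 &&
        bytes[2] == 0x02 && bytes[3] == 0x01) {
        return byte_order::little;  // least significant byte first
    }
    if (bytes[0] == 0x01 && bytes[1] == 0x02 &&
        bytes[2] == 0x03 && bytes[3] == 0x04) {
        return byte_order::big;     // most significant byte first
    }
    return byte_order::unknown;     // e.g. a mixed order
}
```

Conversion functions written purely with shifts on byte values would not need this check at all; it only matters for implementations that swap conditionally.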
> low-level an API for general use, because it requires the user to know or condition on the host endianness, which is almost always a mistake (q.v. http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html).
Thank you for this; the point of view in this article was very enlightening. I'll comment more at the end.
> It seems to me that what's really wanted are two symmetric operations, serialization and deserialization, which convert between the (unspecified) native format and an explicitly-specified byte format (e.g. 32-bit little-endian two's complement, or big-endian IEEE 754 double-precision). The non-native format should be handled using char*, the standard-sanctioned way to work with uninterpreted bytes (or perhaps some type that wraps char*).
Yes converting from whatever native is to big and little endian is really the end goal. The byte order of the machine is an implementation detail.
> Straw-man proposal: establish a BinaryFormatPolicy concept, such that if P is a type that models that concept, and c is a char* value,
> Policy::native_type Policy::read(c);
Isn't this pointer requirement a bit restrictive? File I/O in particular does not expose its internal buffer.
float farray[512];
read_from_file(somefile, farray, sizeof(farray));
// Entire loop should be optimized out if host is big endian.
for (auto& f : farray) {
    f = be_to_host(f);
}
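A sketch of how be_to_host could be written for float so that it is a no-op on a big-endian host. It assumes a 4-byte float, 8-bit bytes, and that float and std::uint32_t share the same byte order on the host; none of this is from the thread's library.

```cpp
#include <cstdint>
#include <cstring>

// Interprets the bytes currently stored in f as a big-endian 32-bit
// pattern and returns the float whose native representation has that
// pattern.
inline float be_to_host(float f) {
    static_assert(sizeof(float) == 4, "this sketch assumes a 4-byte float");
    unsigned char b[4];
    std::memcpy(b, &f, sizeof f);
    // Assemble the bit pattern from the bytes, most significant first.
    const std::uint32_t bits = (static_cast<std::uint32_t>(b[0]) << 24)
                             | (static_cast<std::uint32_t>(b[1]) << 16)
                             | (static_cast<std::uint32_t>(b[2]) << 8)
                             |  static_cast<std::uint32_t>(b[3]);
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}
```

On a big-endian host the assembled pattern equals the stored one, so a compiler has the chance to reduce the loop above to nothing.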
On 2 October 2013 06:56, <fmatth...@gmail.com> wrote:
> This proposal should be generalized to a more generic byte swapping solution. Something like:
We know. :)
> This is so easy and trivial to implement. Most compilers even have builtins for all of the byte-swapping routines. I don't know why it's not standardized.
It's not standardized because networking didn't need the generic solution and I don't recall
seeing a proposal for the generic one.