Getting native byte order from encoding/binary

Paul Borman

Dec 14, 2011, 9:18:34 PM
to golang-nuts
Unless there is an existing way to find the native byte order, I would like to propose a new variable to augment LittleEndian and BigEndian, namely NativeEndian.  This would be handy when building up network protocol buffers into byte streams that are used to communicate with the kernel (for example, the netlink layer).

I would suggest that the Makefile be modified to include the Go file:

binary_$(GOARCH).go\

and then three new files:

binary_386.go
binary_amd64.go
binary_arm.go

that would more or less contain the likes of:

package binary
var NativeEndian littleEndian

I have encountered the need for this multiple times.  The only other suggestion has been to use unsafe.Unreflect.

Dave Cheney

Dec 14, 2011, 9:37:00 PM
to Paul Borman, golang-nuts
This sounds pretty reasonable, with an appropriate comment on the exported variable. 

Matt Kane's Brain

Dec 14, 2011, 9:39:05 PM
to Paul Borman, golang-nuts
On Wed, Dec 14, 2011 at 21:18, Paul Borman <bor...@google.com> wrote:
> package binary
> var NativeEndian littleEndian

Yes, I would also like this.

> I have encountered the need for this multiple times.  The only other
> suggestion has been to use unsafe.Unreflect.

I'm curious how that would work. gosndfile has:
func isLittleEndian() bool {
	var i int32 = 0x01020304
	u := unsafe.Pointer(&i)
	pb := (*byte)(u)
	b := *pb
	return b == 0x04
}
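
For readers who want to try the runtime check, here is a minimal, self-contained sketch along the lines of the gosndfile function quoted above. The names hostIsLittleEndian and hostByteOrder are made up for this example and are not part of any library. It relies on unsafe and assumes the host is either little- or big-endian.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// hostIsLittleEndian reports whether the least significant byte of a
// multi-byte integer is stored at the lowest address on this machine.
func hostIsLittleEndian() bool {
	var i uint32 = 0x01020304
	// Reinterpret the address of i as a pointer to its first byte.
	first := *(*byte)(unsafe.Pointer(&i))
	return first == 0x04
}

// hostByteOrder returns the encoding/binary ByteOrder that matches the host.
// Hypothetical helper: the name is an assumption made for this sketch.
func hostByteOrder() binary.ByteOrder {
	if hostIsLittleEndian() {
		return binary.LittleEndian
	}
	return binary.BigEndian
}

func main() {
	order := hostByteOrder()
	buf := make([]byte, 4)
	order.PutUint32(buf, 0x01020304)
	// Prints the detected order and the bytes as laid out in host order.
	fmt.Printf("%v % x\n", order, buf)
}
```

The unsafe use is confined to one function, so the rest of a program can work with an ordinary binary.ByteOrder value.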


--
matt kane's brain
http://hydrogenproject.com

Russ Cox

Dec 15, 2011, 1:53:28 PM
to Paul Borman, golang-nuts
People keep suggesting this, and always for talking
to the local kernel. Isn't that what data structures are for?

Russ

David Anderson

Dec 15, 2011, 2:25:58 PM
to r...@golang.org, Paul Borman, golang-nuts
The netlink API is interesting, in that it was built by network kernel hackers to control network functionality. As a result, the API is actually a network protocol more than a regular syscall API. It uses sendmsg/recvmsg, has packet ids, retransmission semantics, optional ack, unicast and multicast, protocol families... The basic C netlink API tries to shoehorn it back into structs, but the illusion is wafer thin if you go beyond basic querying. More elaborate APIs like libnl don't try to preserve that illusion.

The protocol mandates host-endian encoding, to avoid userspace and kernel pointlessly shuffling bytes. You can accomplish this by unsafely casting structs to bytes, or by using the appropriate encoding/binary ordering. Use of the encoding package seems more in line with Go's style, since netlink protocol encoding is a fairly high level construct, but that's obviously up to personal preference.
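
A minimal sketch of the encoding/binary route described above, assuming a little-endian host. The msgHeader type is defined here purely for illustration and mirrors the field layout of the kernel's struct nlmsghdr; hostOrder and encodeHeader are hypothetical names, and the field values are arbitrary.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// msgHeader mirrors the field layout of the kernel's struct nlmsghdr.
// It is an illustrative type defined here, not one taken from a library.
type msgHeader struct {
	Len   uint32
	Type  uint16
	Flags uint16
	Seq   uint32
	Pid   uint32
}

// hostOrder is an assumption: it is only correct on little-endian hosts.
var hostOrder binary.ByteOrder = binary.LittleEndian

// encodeHeader serializes h field by field in the given byte order.
func encodeHeader(h msgHeader, order binary.ByteOrder) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, order, h); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Arbitrary sample values, chosen only to make the byte layout visible.
	h := msgHeader{Len: 16, Type: 18, Flags: 0x301, Seq: 1}
	b, err := encodeHeader(h, hostOrder)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", b)
}
```

Because every field has a fixed size and the struct has no padding, the encoded length equals the in-memory size of the C structure, which is what makes this approach a drop-in alternative to the unsafe cast.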

Paul, I proposed a similar change (with a different implementation - runtime detection of native ordering) a few months back. After discussion, I was convinced that the need for native-endian ordering is quite specific to netlink, and just built the "figure out native ordering" hack into my netlink library code (which I later gave up on in favor of using libnl with cgo - netlink is a large and complex protocol!).

- Dave

Paul Borman

Dec 15, 2011, 6:11:43 PM
to David Anderson, r...@golang.org, golang-nuts
s/netlink/networking

This comes up when talking with the kernel and also with networking between processes on the same machine (or like machines) that might be using a traditionally defined structure rather than some dynamic structure.

It is fine to say "Everyone should use network byte order" but the reality is there are things in big endian, little endian, and native.  It is sad Go cannot handle the third case without unsafe.

> People keep suggesting this, and always for talking
> to the local kernel.  Isn't that what data structures are for?
 
This is all about getting the array of bytes into that data structure.  You don't always know the structure before you get the bytes.

Certainly the cast is faster, but binary.Read is safer.
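
To make the trade-off concrete, here is a small sketch of the binary.Read side: the same bytes decoded into a struct under both byte orders. The record type, decodeRecord, and the two order variables are hypothetical names invented for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// record is a hypothetical fixed-layout message used only for illustration.
type record struct {
	ID    uint32
	Kind  uint16
	Count uint16
}

// Named here so the byte order is passed around as an ordinary value.
var (
	littleOrder binary.ByteOrder = binary.LittleEndian
	bigOrder    binary.ByteOrder = binary.BigEndian
)

// decodeRecord fills a record from raw bytes, interpreting every
// multi-byte field in the given order. Short input yields an error
// instead of reading past the end of the slice.
func decodeRecord(raw []byte, order binary.ByteOrder) (record, error) {
	var r record
	err := binary.Read(bytes.NewReader(raw), order, &r)
	return r, err
}

func main() {
	raw := []byte{0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x03, 0x00}

	le, err := decodeRecord(raw, littleOrder)
	fmt.Println(le, err)

	be, err := decodeRecord(raw, bigOrder)
	fmt.Println(be, err)

	// Truncated input is reported as an error.
	_, err = decodeRecord(raw[:3], littleOrder)
	fmt.Println(err)
}
```

The bounds checking is the safety being paid for; an unsafe cast of a too-short buffer would have no such check.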

Russ Cox

Dec 16, 2011, 9:44:34 AM
to Paul Borman, David Anderson, golang-nuts
On Thu, Dec 15, 2011 at 18:11, Paul Borman <bor...@google.com> wrote:
> This comes up when talking with the kernel and also with networking between
> processes on the same machine (or like machines) that might be using a
> traditionally defined structure rather than some dynamic structure.
>
> It is fine to say "Everyone should use network byte order" but the reality
> is there are things in big endian, little endian, and native.

The reality is there are good programs and bad programs.
Go doesn't have to make it easy to write bad programs.

> It is sad Go cannot handle the third case without unsafe.

Not true. Nothing is stopping you from writing, in *your* code,

var NativeEndian = binary.LittleEndian

We are only objecting to putting that in the standard library.

Russ

Paul Borman

Dec 16, 2011, 1:49:07 PM
to r...@golang.org, David Anderson, golang-nuts
> The reality is there are good programs and bad programs.
> Go doesn't have to make it easy to write bad programs.

Two programs that must be on the same machine are not bad if they use native byte ordering for communication.  It is called being efficient.
 
>> It is sad Go cannot handle the third case without unsafe.
>
> Not true.  Nothing is stopping you from writing, in *your* code,
>
>    var NativeEndian = binary.LittleEndian
>
> We are only objecting to putting that in the standard library.

So you are suggesting writing non-portable programs?  My proposal was not to add the non-portable statement:

 var NativeEndian = binary.LittleEndian

but to have the standard library set NativeEndian to the *correct* value.  The Go library does not provide an easy way for a program to determine its endianness.  It is not unreasonable or bad for a program to be able to determine this even if not every Go program needs it.

liigo

Dec 16, 2011, 2:08:31 PM
to Paul Borman, golang-nuts, r...@golang.org, David Anderson

I'd like to see the Go standard packages provide, at runtime, the native endianness of the machine the program is currently running on. Programmers should not be required to hack this in an "unsafe" way.

Steve McCoy

Dec 16, 2011, 3:47:09 PM
to golan...@googlegroups.com, David Anderson
Anyone can do exactly what you propose in their own projects. It's quite easy for someone that knows what they need, and those that don't should have to think about it beforehand. There are already enough platform-dependent C programs that have "fwrite(&st, sizeof(st), 1, f)" when they don't need it — I think the Go standard library has done a good job so far of discouraging similar patterns, so I wouldn't want to see NativeEndian included.

Paul Borman

Dec 16, 2011, 4:37:54 PM
to golan...@googlegroups.com, David Anderson
Once we convert all the software in the world to Go you might be right.

In the meantime, it is faulty logic to state that knowing your native byte order implies the stdio example you give.

Not knowing it means writing hacks like Russ suggested.  I think that suggestion is horribly ugly even though I have had to resort to it because Go is lacking in this.

Aram Hăvărneanu

Dec 16, 2011, 4:59:30 PM
to Paul Borman, golan...@googlegroups.com, David Anderson
When I use Go I can pretend the clumsy operating system underneath
doesn't exist. The lack of host specific sophistication means the
noise level is very low.

--
Aram Hăvărneanu

Paul Borman

Dec 16, 2011, 5:11:54 PM
to Aram Hăvărneanu, golan...@googlegroups.com, David Anderson
But I choose to use Go rather than C.  Are you suggesting that my work is different from yours and the OS is important I should not use Go?

Adding the ability to know your native byte order does not mean you must use it.  You might never need it, but it will enable Go to be a better and broader systems programming language without damaging the language or the libraries.

Paul Borman

Dec 16, 2011, 5:13:57 PM
to Aram Hăvărneanu, golan...@googlegroups.com, David Anderson
There is an absent if:  "that if my work is different"

Rob 'Commander' Pike

Dec 16, 2011, 5:13:38 PM
to Paul Borman, Aram Hăvărneanu, golan...@googlegroups.com, David Anderson

On Dec 16, 2011, at 2:11 PM, Paul Borman wrote:

> But I choose to use Go rather than C. Are you suggesting that my work is different from yours and the OS is important I should not use Go?
>
> Adding the ability to know your native byte order does not mean you must use it. You might never need it, but it will enable Go to be a better and broader systems programming language without damaging the language or the libraries.

I'm torn.

On the one hand, I want to see you able to use Go for your work.

On the other hand, almost every time I hear someone asking what the native byte order is, it's a mistake.

-rob


Aram Hăvărneanu

Dec 16, 2011, 5:19:16 PM
to Paul Borman, golan...@googlegroups.com, David Anderson
On Fri, Dec 16, 2011 at 11:11 PM, Paul Borman <bor...@google.com> wrote:
> But I choose to use Go rather than C.  Are you suggesting that my work is
> different from yours and the OS is important I should not use Go?

I am not suggesting anything, I merely enjoy the silence :-).

> Adding the ability to know your native byte order does not mean you must use
> it. You might never need it, but it will enable Go to be a better and
> broader systems programming language without damaging the language or the
> libraries.

But you do, just like in C++ you need to use templates and turing
complete exceptions and every other feature it has. Your code doesn't
live in a vacuum, it interacts with other code. If the code uses
templates and exceptions you have to do the same. You can't stick to
a reasonable C++ subset. If anyone chooses NativeEndian, it forces
any consumer of that code to do the same.

This is exactly what's happening right now, you want NativeEndian for
interacting with some code other people wrote! I believe NativeEndian
is a mistake. For anyone that wants it, using cgo to determine host
endianness is a non-issue.

--
Aram Hăvărneanu

Paul Borman

Dec 16, 2011, 5:21:40 PM
to Rob 'Commander' Pike, Aram Hăvărneanu, golan...@googlegroups.com, David Anderson
I guess the other solution is an explosion in the net/os/syscall packages...  No, I would not like to see that either.

My problem is that I really must interact with the native OS and it is providing me things with native byte order and I am really trying to avoid unsafe as much as possible.

For months now our code has had:

var hbo = binary.LittleEndian  // hack - we want host byte order!

so we can use encoding/binary to read things.

Russ Cox

Dec 16, 2011, 5:23:12 PM
to Paul Borman, Rob 'Commander' Pike, Aram Hăvărneanu, golan...@googlegroups.com, David Anderson
On Fri, Dec 16, 2011 at 17:21, Paul Borman <bor...@google.com> wrote:
> For months now our code has had:
>
> var hbo = binary.LittleEndian  // hack - we want host byte order!
>
> so we can use encoding/binary to read things.

Put that in a file named byteorder_amd64.go and
it stops being a hack. It need not be in the standard
library.

Russ

Glenn Brown

Dec 17, 2011, 11:01:08 PM
to Rob 'Commander' Pike, Paul Borman, Aram Hăvărneanu, golan...@googlegroups.com, David Anderson

> I'm torn.

As a C networking hardware developer in High Performance Computing, I'm torn, too. Go has so brilliantly addressed so many of C/C++'s shortcomings that I face daily that I'm left longing to use it… but I'm stuck with fixed binary interfaces to hardware and the network. (We are fond of OS-bypass networking for the utmost distributed performance, so we see these binary interfaces even in user-space at the application level.)

Based on my experience with binary interfaces, I long for a simple 'Portable Struct' with portable fields and platform-independent memory layout. A 'Portable Struct' would have the following properties:
0) only naturally-aligned fixed-sized integer and Portable Struct fields are permitted,
1) the length of the struct is a multiple of its largest field,
2) implicit padding is prohibited by the compiler,
3) the Endianness is specified per struct or per field,
4) the compiler does implicit Endian conversions on read or write.
Such structs are platform-independent on modern byte-addressed machines. They are a formalization of pragmatic tricks used for C structure portability.
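
As a rough approximation of the idea in today's Go, the layout and per-field endianness can be written out by hand with encoding/binary. This is only a sketch: wireHeader, its field offsets, and the put/parseWireHeader helpers are hypothetical, and unlike the proposed Portable Struct the conversions are explicit and the data is copied.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// wireHeaderSize is the fixed encoded size. Layout (no padding):
//   offset 0: Magic  uint32, big-endian
//   offset 4: Length uint16, little-endian
//   offset 6: Flags  uint16, little-endian
const wireHeaderSize = 8

// wireHeader is a hypothetical header with naturally aligned,
// fixed-size integer fields and a per-field byte order.
type wireHeader struct {
	Magic  uint32
	Length uint16
	Flags  uint16
}

// put writes h into dst using the layout documented above.
func (h wireHeader) put(dst []byte) error {
	if len(dst) < wireHeaderSize {
		return errors.New("wireHeader: buffer too short")
	}
	binary.BigEndian.PutUint32(dst[0:4], h.Magic)
	binary.LittleEndian.PutUint16(dst[4:6], h.Length)
	binary.LittleEndian.PutUint16(dst[6:8], h.Flags)
	return nil
}

// parseWireHeader reads a header back from src.
func parseWireHeader(src []byte) (wireHeader, error) {
	if len(src) < wireHeaderSize {
		return wireHeader{}, errors.New("wireHeader: buffer too short")
	}
	return wireHeader{
		Magic:  binary.BigEndian.Uint32(src[0:4]),
		Length: binary.LittleEndian.Uint16(src[4:6]),
		Flags:  binary.LittleEndian.Uint16(src[6:8]),
	}, nil
}

func main() {
	buf := make([]byte, wireHeaderSize)
	h := wireHeader{Magic: 0xCAFEBABE, Length: 0x0102, Flags: 0x8000}
	if err := h.put(buf); err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", buf)
	back, err := parseWireHeader(buf)
	fmt.Println(back == h, err)
}
```

The encoded bytes are the same on every platform because no step depends on how the compiler lays out wireHeader in memory.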

The other missing piece for me in Go (without 'unsafe') is the ability to recast data into these structs. Since Portable Structs contain no pointers, they can be constructed arbitrarily without circumventing security. So, a language can safely allow recasting arbitrary pointers to pointers to Portable Structs as long as the memory backing the new Portable Struct did not originally contain any pointer or private field. So, Go could conceivably allow received data in slices to be dynamically reinterpreted as a Portable Struct without a copy, and without resorting to 'unsafe'.

I do respect the neatness of serialization for portable communication in Go (and Plan 9), but for our CPU-bound network-intensive applications, only binary zero-copy interfaces are competitive.

For what it's worth,
--Glenn

Aram Hăvărneanu

Dec 18, 2011, 4:47:10 AM
to Glenn Brown, Rob 'Commander' Pike, Paul Borman, golan...@googlegroups.com, David Anderson
On Sun, Dec 18, 2011 at 5:01 AM, Glenn Brown <tornad...@gmail.com> wrote:
> High Performance Computing [...]

I believe converting the endianness was not a performance issue even in
1985, much less today. Unspecified endianness is a mistake. In my
last endeavor I worked on a file system that not only swapped
endianness all the time, but also hashed and compressed all the data,
in real time.
We used to advertise this file system as particularly fast.
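
For reference, this is what the conversion amounts to for a 32-bit value: a byte reversal. The sketch below uses math/bits and the fixed byte orders from encoding/binary; swapOrder32 and decodeBothWays are hypothetical helper names. It illustrates the operation only and does not settle the performance question either way, which would need measuring on the workload at hand.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// swapOrder32 converts a 32-bit value between its little-endian and
// big-endian interpretations by reversing its bytes.
func swapOrder32(v uint32) uint32 {
	return bits.ReverseBytes32(v)
}

// decodeBothWays interprets the first four bytes of b in each byte order.
func decodeBothWays(b []byte) (le, be uint32) {
	return binary.LittleEndian.Uint32(b), binary.BigEndian.Uint32(b)
}

func main() {
	le, be := decodeBothWays([]byte{0x04, 0x03, 0x02, 0x01})
	// The two interpretations differ exactly by a byte swap.
	fmt.Printf("%#x %#x %v\n", le, be, swapOrder32(le) == be)
}
```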

--
Aram Hăvărneanu

Jan Mercl

Dec 18, 2011, 5:21:27 AM
to golan...@googlegroups.com
On Friday, December 16, 2011 7:49:07 PM UTC+1, Paul Borman wrote:
>> Not true.  Nothing is stopping you from writing, in *your* code,
>>
>>    var NativeEndian = binary.LittleEndian
>>
>> We are only objecting to putting that in the standard library.
>
> So you are suggesting writing non-portable programs?  My proposal was not to add the non-portable statement:
>
>  var NativeEndian = binary.LittleEndian

I think Russ meant to include such declaration in one's foo_thisArch.go and

Jan Mercl

Dec 18, 2011, 5:23:18 AM
to golan...@googlegroups.com
Sorry, I clicked post by mistake.

On Sunday, December 18, 2011 11:21:27 AM UTC+1, Jan Mercl wrote:
I think Russ meant to include such declaration in one's foo_thisArch.go and

var NativeEndian = binary.LittleEndian

and "var NativeEndian = binary.BigEndian"

in foo_thatArch.go
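
The per-architecture layout relies on the go tool only compiling a file whose name ends in _GOARCH.go when building for that architecture, so each file can hold a single declaration of NativeEndian. That cannot be shown in one runnable file, so the sketch below approximates the same selection with a lookup keyed on the GOARCH string. It is a stand-in for the technique, not the technique itself; byteOrderFor and byteOrderName are hypothetical names, and the architecture list is deliberately partial.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"runtime"
)

// byteOrderFor returns the byte order for a GOARCH value.
// The second result is false for architectures not listed here.
func byteOrderFor(goarch string) (binary.ByteOrder, bool) {
	switch goarch {
	case "386", "amd64", "arm", "arm64", "riscv64", "ppc64le", "mipsle", "mips64le":
		return binary.LittleEndian, true
	case "ppc64", "s390x", "mips", "mips64":
		return binary.BigEndian, true
	}
	return nil, false
}

// byteOrderName returns the order's name, or "unknown" if unlisted.
func byteOrderName(goarch string) string {
	order, ok := byteOrderFor(goarch)
	if !ok {
		return "unknown"
	}
	return order.String()
}

func main() {
	// Reports the order for the architecture this binary was built for.
	fmt.Println(runtime.GOARCH, byteOrderName(runtime.GOARCH))
}
```

With the file-per-architecture approach, an unlisted architecture fails at compile time because NativeEndian is undefined, which is arguably a better failure mode than the runtime "unknown" here.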

roger peppe

Dec 19, 2011, 7:10:51 AM
to r...@golang.org, Paul Borman, Rob 'Commander' Pike, Aram Hăvărneanu, golan...@googlegroups.com, David Anderson
it seems to me that encoding/binary could use this trick
to avoid the conversion overhead when the endianness is
native (and even the data copy when TotalSize(v) == v.Type().Size())

i think encoding/binary could use some work - the signature
of binary.Read means it can never avoid an allocation, for example.
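
One way to sidestep binary.Read when the data is already in a byte slice is to call the ByteOrder methods directly. A minimal sketch, with a hypothetical sample type and decodeSample helper; assumedOrder stands in for whatever order the caller has determined:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// sample is a hypothetical 12-byte record: a uint32 followed by a uint64.
type sample struct {
	ID    uint32
	Value uint64
}

const sampleSize = 12

// assumedOrder is an assumption for this sketch, not a detected value.
var assumedOrder binary.ByteOrder = binary.LittleEndian

// decodeSample reads the fields straight out of b with the ByteOrder
// methods, so no io.Reader and no reflection are involved.
func decodeSample(b []byte, order binary.ByteOrder) (sample, error) {
	if len(b) < sampleSize {
		return sample{}, errors.New("decodeSample: short buffer")
	}
	return sample{
		ID:    order.Uint32(b[0:4]),
		Value: order.Uint64(b[4:12]),
	}, nil
}

func main() {
	raw := []byte{0x2a, 0, 0, 0, 0x01, 0, 0, 0, 0, 0, 0, 0}
	s, err := decodeSample(raw, assumedOrder)
	fmt.Println(s, err)
}
```

The cost is that the field offsets are maintained by hand instead of being derived from the struct definition.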
