Is this code in package unix assuming machine endianness?

Tom Parkin

Mar 10, 2020, 7:03:48 PM
to golang-nuts
Hi all,

I'm working on adding a new Linux socket type (L2TPIP) to the unix package, and I noticed some code in there that appears, on the face of it, to be assuming the endianness of the host. The #networking channel on the Gophers slack suggested I raise the question here.

The code I am struggling with is in this file:


So far as I can make out, this code implements a variety of system calls for Linux generally, irrespective of GOARCH.

In this file, there is a function anyToSockaddr which serves to convert struct sockaddr from system calls such as getsockname(2) and accept4(2) into Go representations of the various sockaddr types, for example unix.SockaddrUnix and unix.SockaddrInet4.  The function anyToSockaddr switches on the address family in the struct sockaddr, and then converts based on that.

I noticed for the AF_INET case in the switch statement of anyToSockaddr that the struct sockaddr_in sin_port field is being unconditionally byte-swapped during conversion:

            pp := (*RawSockaddrInet4)(unsafe.Pointer(rsa))
            sa := new(SockaddrInet4)
            p := (*[2]byte)(unsafe.Pointer(&pp.Port))
            sa.Port = int(p[0])<<8 + int(p[1])
            for i := 0; i < len(sa.Addr); i++ {
                sa.Addr[i] = pp.Addr[i]
            }   
            return sa, nil

(where 'rsa' is a pointer to a RawSockaddrAny).

Now, ip(7) states of the struct sockaddr_in structure:

  "Note that the address and the port are always stored in network byte order.  In particular, this means that you need to call htons(3) on the number that is assigned to a port."

So the byte swapping that anyToSockaddr is doing makes sense, but only if the host is a little-endian machine.  It seems as though this code would do the wrong thing on a big-endian machine.

Can anyone suggest what I'm missing here?  Is this code really assuming that the host is a little-endian machine?

Thanks!
Tom


Ian Lance Taylor

Mar 10, 2020, 7:15:03 PM
to Tom Parkin, golang-nuts
This does not look like byte swapping to me. Here pp.Port should be
in network byte order. To set sa.Port we read the first byte of
pp.Port, left shift by 8, then or in the second byte of pp.Port. That
is, we interpret pp.Port as a two-byte big-endian number, and compute
the value as a 16-bit integer. That will work regardless of the
endianness of the host.
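
To make that concrete, here is a minimal sketch of the same arithmetic outside the unix package. The function name portFromNetworkOrder and the sample bytes are made up for illustration; the comparison against binary.BigEndian.Uint16 from the standard encoding/binary package is only there to show that both read the bytes as a big-endian number.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// portFromNetworkOrder is a hypothetical helper (not part of package unix).
// It treats b as a two-byte big-endian number: the first byte is the high
// byte, the second is the low byte. Nothing here depends on how the host
// lays out integers in memory.
func portFromNetworkOrder(b [2]byte) int {
	return int(b[0])<<8 + int(b[1])
}

func main() {
	// 8080 is 0x1f90, so in network byte order it is the bytes 0x1f, 0x90.
	raw := [2]byte{0x1f, 0x90}

	fmt.Println(portFromNetworkOrder(raw))       // 8080
	fmt.Println(binary.BigEndian.Uint16(raw[:])) // 8080
}
```

Both lines print 8080 whether the program runs on a little-endian or a big-endian machine, because the code only ever looks at individual bytes by index.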

Ian

Tom Parkin

Mar 11, 2020, 6:42:39 AM
to Ian Lance Taylor, golang-nuts
Thanks Ian for your answer.

It took me a little bit of thinking to get there :-( but I see what you're saying now.

For anyone else playing along at home who may be struggling like I was...

By treating the uint16 pp.Port value as an array of bytes, the code can access the bytes in network byte order, since that's how byte arrays are laid out in memory (e.g. if &p[0] is address A, &p[1] is A+1, etc).

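The left shift and addition then build the numeric value from those bytes. The result ends up in host byte order simply because that's how any uint16 value is stored in memory; the code never has to know which order that is.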
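
A small sketch of the point above, under some assumptions: the uint16 variable raw stands in for the Port field of RawSockaddrInet4, and decodePort and encodePort are hypothetical helpers written for this example rather than functions from package unix. decodePort uses the same pointer cast as the snippet quoted earlier, and encodePort shows the opposite direction.

```go
package main

import (
	"fmt"
	"unsafe"
)

// decodePort views the uint16 as two bytes in memory order and reads them
// as a big-endian (network order) number.
func decodePort(raw *uint16) int {
	p := (*[2]byte)(unsafe.Pointer(raw))
	return int(p[0])<<8 + int(p[1])
}

// encodePort writes port into the uint16's memory in network byte order:
// high byte at the lower address, low byte at the next address.
func encodePort(raw *uint16, port int) {
	p := (*[2]byte)(unsafe.Pointer(raw))
	p[0] = byte(port >> 8)
	p[1] = byte(port)
}

func main() {
	var raw uint16 // stand-in for a sin_port style field
	encodePort(&raw, 8080)

	// Decoding by byte index gives 8080 on any host.
	fmt.Println(decodePort(&raw))

	// Reading the same memory as a plain uint16 is host-dependent:
	// 0x901f on a little-endian machine, 0x1f90 on a big-endian one.
	fmt.Printf("%#x\n", raw)
}
```

Only the last line depends on the host. Code that sticks to indexing the bytes never sees that difference.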


Rob Pike

Mar 11, 2020, 9:34:11 PM
to Tom Parkin, Ian Lance Taylor, golang-nuts
More context, in the form of self-promotion: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

-rob


Tom Parkin

Mar 12, 2020, 6:10:10 AM
to Rob Pike, Ian Lance Taylor, golang-nuts
Thanks Rob, that was an interesting read.

I think your pattern is good, and makes perfect sense when you think about it. It occurred to me when reading that another way of thinking about the pattern is that you're coding to the endianness of the byte stream, and letting the layout of the type in memory be sorted out by the compiler writers, which seems like a reasonable separation of concerns.
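
As a rough Go rendering of that pattern (the function names and sample bytes are invented for this sketch, and the helpers assume the slice holds at least four bytes), decoding a 32-bit value depends only on the byte order of the stream:

```go
package main

import "fmt"

// littleEndianUint32 decodes the first four bytes of data as a
// little-endian value: data[0] is the least significant byte.
func littleEndianUint32(data []byte) uint32 {
	return uint32(data[0]) | uint32(data[1])<<8 |
		uint32(data[2])<<16 | uint32(data[3])<<24
}

// bigEndianUint32 decodes the first four bytes of data as a
// big-endian value: data[0] is the most significant byte.
func bigEndianUint32(data []byte) uint32 {
	return uint32(data[3]) | uint32(data[2])<<8 |
		uint32(data[1])<<16 | uint32(data[0])<<24
}

func main() {
	stream := []byte{0x78, 0x56, 0x34, 0x12}

	fmt.Printf("%#x\n", littleEndianUint32(stream)) // 0x12345678
	fmt.Printf("%#x\n", bigEndianUint32(stream))    // 0x78563412
}
```

Neither function asks what the host's byte order is, and there is no conditional compilation. The only choice made is which order the stream uses.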

What threw me off the scent somewhat, as someone coming from a C background, is that the code I pointed to superficially resembles the sort of thing you mentioned in your article. Perhaps even more so as it's taking a uint16 value, treating it as a byte array, and then reassembling the byte array into a uint16. And once your brain has said "aha, byte swapping", it's difficult to think outside that box.

Ah well, TIL, etc, etc.  Thanks again for the link :-)
--
Tom Parkin

Robert Engels

Mar 13, 2020, 12:37:31 AM
to Tom Parkin, Rob Pike, Ian Lance Taylor, golang-nuts
No disrespect to Rob, but it's a bit more complex than that. Almost certainly the code was older and written in C, and it used fixed-length linked records, so back when machines were a lot slower it was far more efficient to point to the start of a struct in memory and read/write directly; they weren't streams. Software was usually written for Windows only, i.e. Intel (or, in rare cases such as Photoshop, for the Mac and Motorola).

Only later, when seeking greater revenues or interoperability, was the software ported. Doing this at minimum cost required keeping the code nearly the same and certainly the file formats the same, so you ended up with ifdefs.
