Re: [go-nuts] ntohl/htonl equivalent in Go


David Anderson

Oct 5, 2012, 5:14:30 PM
to donb, golan...@googlegroups.com
On Fri, Oct 5, 2012 at 12:24 PM, donb <do...@capitolhillconsultants.com> wrote:
> What's the appropriate way to pack an integer of finite length into a
> []byte? So far, I've just been using byte((val >> 24) & 0xff), etc. But, I
> need a more efficient way. Is there a Go specific interface that handles
> endianness for me?

The encoding/binary package knows how to pack/unpack fixed width integers into []byte, and deals with endianness.

- Dave

> Thanks,
> D

Devon H. O'Dell

Oct 5, 2012, 5:15:44 PM
to donb, golan...@googlegroups.com
2012/10/5 donb <do...@capitolhillconsultants.com>:
> What's the appropriate way to pack an integer of finite length into a
> []byte? So far, I've just been using byte((val >> 24) & 0xff), etc. But, I
> need a more efficient way. Is there a Go specific interface that handles
> endianness for me?

I think the obligatory answer is:
http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html.

However, encoding/binary is probably what you're looking for ;) It
will handle endianness issues as well.

--dho

> Thanks,
> D
>
> --
>
>

Don A. Bailey

Oct 5, 2012, 5:16:47 PM
to David Anderson, golan...@googlegroups.com
Tried that but encoding/binary only deals with int64. I get a type error when I try and use, say, the length of a string, which is an int. 

It'd be great to:
n := len(s)
b = append(b, pack(n)...)

Thanks,
D
--
Don A. Bailey
CEO/Founding Partner
Capitol Hill Consultants LLC


Don A. Bailey

Oct 5, 2012, 5:18:35 PM
to Devon H. O'Dell, golan...@googlegroups.com
Hiya, Devon. 
I remember retweeting Rob's post about byte order back in April. I agree with him. However, the application I'm interacting with over a socket enforces LittleEndian for all inbound data. So, I still need a valid method that accepts int types. 

Unless I just don't know yet how to cast int to int64 correctly? :)

D
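
A minimal sketch of the `pack` helper wished for earlier in the thread. It assumes the peer expects a 32-bit little-endian length; the width is an assumption, since the thread only says the receiving application requires little-endian data. `pack` is the hypothetical name from the message, not a standard library function.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// pack is a hypothetical helper: it converts the platform-sized int to a
// fixed-width uint32 and returns its 4-byte little-endian encoding.
// Values outside the uint32 range would be truncated, so a real
// implementation should check the range first.
func pack(n int) []byte {
	buf := make([]byte, 4)
	binary.LittleEndian.PutUint32(buf, uint32(n))
	return buf
}

func main() {
	s := "hello"
	var b []byte
	b = append(b, pack(len(s))...) // the ... spreads the slice into append
	b = append(b, s...)
	fmt.Println(b) // [5 0 0 0 104 101 108 108 111]
}
```

The conversion `uint32(n)` is what makes the value fixed width; the byte order is chosen by naming `binary.LittleEndian`, not by inspecting the host.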

David Anderson

Oct 5, 2012, 5:21:17 PM
to Don A. Bailey, golan...@googlegroups.com
On Fri, Oct 5, 2012 at 2:16 PM, Don A. Bailey <do...@capitolhillconsultants.com> wrote:
> Tried that but encoding/binary only deals with int64. I get a type error when I try and use, say, the length of a string, which is an int.

len(x) is of type int, which is not fixed width (its size varies by platform). You can, however, convert it to int32 or int64:

n := int64(len(s))

at which point you can use encoding/binary to read/write it.

- Dave
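
To illustrate the type error and the fix described above, here is a small sketch using `binary.Write`. The helper name `writeLen` is made up for this example; the thread does not say which encoding/binary call was actually used.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// writeLen (a hypothetical helper) writes v to a fresh buffer in
// little-endian order and returns the encoded bytes, or the error
// reported by binary.Write.
func writeLen(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	s := "hello"

	// int has a platform-dependent size, so binary.Write rejects it.
	_, err := writeLen(len(s))
	fmt.Println("int rejected:", err != nil) // int rejected: true

	// After an explicit conversion the value has a fixed width of 8 bytes.
	b, err := writeLen(int64(len(s)))
	fmt.Println(b, err) // [5 0 0 0 0 0 0 0] <nil>
}
```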

Don A. Bailey

Oct 5, 2012, 5:22:41 PM
to David Anderson, golan...@googlegroups.com
Ah, thanks, that's easy. I'm still getting used to the new style of type casts. 

D

Phil Pennock

Oct 5, 2012, 5:31:20 PM
to donb, golan...@googlegroups.com
On 2012-10-05 at 12:24 -0700, donb wrote:
> What's the appropriate way to pack an integer of finite length into a
> []byte? So far, I've just been using byte((val >> 24) & 0xff), etc. But, I
> need a more efficient way. Is there a Go specific interface that handles
> endianness for me?

encoding/binary ?

----------------------------8< cut here >8------------------------------
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"strconv"
)

func main() {
	// Reuse one 8-byte buffer for every argument.
	buf := make([]byte, 8)
	for _, arg := range os.Args[1:] {
		num, err := strconv.ParseUint(arg, 10, 64)
		if err != nil {
			fmt.Printf("Error converting \"%s\" to number: %s\n", arg, err)
			continue
		}
		// Store num in buf, most significant byte first.
		binary.BigEndian.PutUint64(buf, num)
		fmt.Printf("Encoded \"%s\" into: %+v\n", arg, buf)
	}
}
----------------------------8< cut here >8------------------------------

% ./encoding_demo 300 257
Encoded "300" into: [0 0 0 0 0 0 1 44]
Encoded "257" into: [0 0 0 0 0 0 1 1]

-Phil
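
The demo above only encodes. As a hedged companion sketch, the reverse direction uses `binary.BigEndian.Uint64`; the helper name `roundTrip` is an assumption made for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// roundTrip encodes num as 8 big-endian bytes and decodes it again.
func roundTrip(num uint64) ([]byte, uint64) {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, num)
	return buf, binary.BigEndian.Uint64(buf)
}

func main() {
	buf, back := roundTrip(300)
	fmt.Println(buf, back) // [0 0 0 0 0 0 1 44] 300
}
```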

Uli Kunitz

Oct 5, 2012, 6:48:01 PM
to golan...@googlegroups.com, Devon H. O'Dell
What is wrong with the following code?

func encode(n int32, a []byte) {
        a[0], a[1], a[2], a[3] = byte(n), byte(n>>8), byte(n>>16), byte(n>>24)
}

func decode(a []byte) int32 {
        return int32(a[0]) | int32(a[1])<<8 | int32(a[2])<<16 | int32(a[3])<<24
}

In my test code both functions were inlined into main, and I could encode and decode all 32-bit integers in 44 seconds. This translates to a decoding speed of roughly 5.8 Gibit/s (2^37 bits in about 22 seconds). That should be sufficient for most use cases.
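
As a quick cross-check (a sketch, not part of the original test code), the two functions above can be compared with `binary.LittleEndian`, which should produce the same bytes on any host:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encode and decode are copied from the message above.
func encode(n int32, a []byte) {
	a[0], a[1], a[2], a[3] = byte(n), byte(n>>8), byte(n>>16), byte(n>>24)
}

func decode(a []byte) int32 {
	return int32(a[0]) | int32(a[1])<<8 | int32(a[2])<<16 | int32(a[3])<<24
}

func main() {
	n := int32(-123456789)

	a := make([]byte, 4)
	encode(n, a)

	// Reference encoding from the standard library.
	b := make([]byte, 4)
	binary.LittleEndian.PutUint32(b, uint32(n))

	fmt.Println(bytes.Equal(a, b), decode(a) == n) // true true
}
```

On the throughput figure: 2^32 values of 32 bits each are 2^37 bits, and if the 44 seconds are split evenly between encoding and decoding, 2^37 / 22 is about 6.2 × 10^9 bit/s, or roughly 5.8 when counted in units of 2^30 bits.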

David Wright

Oct 6, 2012, 3:19:13 AM
to golan...@googlegroups.com, Devon H. O'Dell
It doesn't account for endianness. binary.ByteOrder is what you're after.

David

Rob Pike

Oct 6, 2012, 5:23:35 AM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
On Sat, Oct 6, 2012 at 5:19 PM, David Wright <d.wrig...@gmail.com> wrote:
> It doesn't account for endianness.

Yes it does.

-rob

David Wright

Oct 6, 2012, 7:11:41 AM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
My mistake. Network byte order is always the same.

David

David Wright

Oct 6, 2012, 8:08:02 AM
to golan...@googlegroups.com, Devon H. O'Dell
Sorry Uli.

For the sake of others coming from Java and CLR languages, maybe I should mention something on-list about where I think my misunderstanding comes from. When handling binary formats in Java, your runtime byte ordering doesn't necessarily match your platform byte ordering, so you have to do an endianness test to see which direction you'll have to shift. 

It escaped me that it isn't necessary in compiled languages, and the equivalent Java probably looks sort of awful to people more familiar with the underlying hardware.

David


On Friday, October 5, 2012 6:48:01 PM UTC-4, Uli Kunitz wrote:

Rémy Oudompheng

Oct 6, 2012, 8:10:33 AM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
On 2012/10/6 David Wright <d.wrig...@gmail.com> wrote:
> Sorry Uli.
>
> For the sake of others coming from Java and CLR languages, maybe I should
> mention something on-list about where I think my misunderstanding comes
> from. When handling binary formats in Java, your runtime byte ordering
> doesn't necessarily match your platform byte ordering, so you have to do an
> endianness test to see which direction you'll have to shift.
>
> It escaped me that it isn't necessary in compiled languages, and the
> equivalent Java probably looks sort of awful to people more familiar with
> the underlying hardware.

I don't understand what you mean. I thought Java's arithmetic
operations were portable. Can you give an example?

Rémy.

Rob Pike

Oct 6, 2012, 8:19:30 AM
to Rémy Oudompheng, David Wright, golan...@googlegroups.com, Devon H. O'Dell
Java has nothing to do with it, nor does the portability of arithmetic.

Self-promotion:
http://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html

For all the crap I received about that article, I stand by its central
points, which are that people don't understand byte order and if you
think you need to know the byte order of the machine you're on, you're
wrong.

-rob

David Wright

Oct 6, 2012, 9:02:38 AM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
This may now constitute a threadjacking because Java is way off topic. If so, sorry.

The shift operators are always going to assume big-endian, because they're for dealing with Java types. A byte array index is always big-endian because it's for dealing with Java arrays. So yes portable in the sense that the operators don't do different things on different platforms, and that's kind of a pain because it looks just like C and doesn't act like it.  If you're dealing with data packed little-endian, it would be common to do something like testing for byte order with ByteOrder.nativeOrder() [1], and conditionally swapping [2]. 

And Rob is right, I need to make sure I'm much clearer on how this is optimally done in Go, because I've been sitting up parsing wav files and using binary.LittleEndian to retrieve /platform appropriate/ unsigned integers [3]. The discussion is timely for me too.


Peter S

Oct 6, 2012, 9:16:52 AM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
The point of Rob's article (the way I interpret it) is that there is no "platform appropriate" and "platform inappropriate" unsigned integers (or, more precisely, there shouldn't be). "Platform inappropriate" is not an unsigned integer, just a sequence of bytes, that is inappropriately accessed through a variable of type unsigned integer, when it should be an array of bytes.

I think a large part of the confusion comes from the family of functions mentioned in the thread title: ntohl/htonl. Actually, implementing an "ntohl/htonl equivalent" (converting between uint32 and "potentially platform-inappropriate uint32") does indeed require knowing the machine byte order. But the point is that converting between uint32 and "potentially platform-inappropriate uint32" is the wrong approach in the first place, with the right approach being converting between uint32 and []byte and letting the compiler take care of endianness (as well as alignment and optimization).

Peter
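
A minimal sketch of the approach described above: instead of an ntohl-style conversion from uint32 to uint32, the received bytes stay in a []byte and are decoded once into an ordinary uint32. The function name `parseLength` and the packet layout (a 4-byte big-endian length prefix) are assumptions for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// parseLength reads a 32-bit big-endian ("network order") length prefix
// from the start of a received packet. It plays the role ntohl plays in C,
// but it goes from []byte to uint32 and never asks about the host order.
func parseLength(packet []byte) (uint32, error) {
	if len(packet) < 4 {
		return 0, fmt.Errorf("short packet: %d bytes", len(packet))
	}
	return binary.BigEndian.Uint32(packet[:4]), nil
}

func main() {
	packet := []byte{0x00, 0x00, 0x01, 0x2c, 'h', 'i'}
	n, err := parseLength(packet)
	fmt.Println(n, err) // 300 <nil>
}
```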



David Wright

Oct 6, 2012, 9:28:42 AM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
On Saturday, October 6, 2012 9:17:05 AM UTC-4, speter wrote:
The point of Rob's article (the way I interpret it) is that there is no "platform appropriate" and "platform inappropriate" unsigned integers (or, more precisely, there shouldn't be). "Platform inappropriate" is not an unsigned integer, just a sequence of bytes, that is inappropriately accessed through a variable of type unsigned integer, when it should be an array of bytes.


But that's just silly. How you gonna do math with just a bag of out-of-order bytes?
 
I think a large part of the confusion comes from the family of functions mentioned in the thread title: ntohl/htonl. Actually, implementing an "ntohl/htonl equivalent" (converting between uint32 and "potentially platform-inappropriate uint32") does indeed require knowing the machine byte order. But the point is that converting between uint32 and "potentially platform-inappropriate uint32" is the wrong approach in the first place, with the right approach being converting between uint32 and []byte and letting the compiler take care of endianness (as well as alignment and optimization).

That is the lesson I need to take. Otherwise, I might as well be doing the thing in Java. 

Aram Hăvărneanu

Oct 6, 2012, 9:45:06 AM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
> But that's just silly. How you gonna do math with just a bag of out-of-order
> bytes?

42+42 equals 84 on both little endian and big endian computers. 1 << 8
is 256 on both sparcs and vaxen. CPUs provide you this simple
arithmetic abstraction that allows you to operate with familiar
concepts learned in primary school. Learn to love it, don't fight it.
Whether bits flow one way or another when you do an arithmetic shift
doesn't matter.

--
Aram Hăvărneanu

Peter S

Oct 6, 2012, 9:49:51 AM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
On Sat, Oct 6, 2012 at 10:28 PM, David Wright <d.wrig...@gmail.com> wrote:
On Saturday, October 6, 2012 9:17:05 AM UTC-4, speter wrote:
The point of Rob's article (the way I interpret it) is that there is no "platform appropriate" and "platform inappropriate" unsigned integers (or, more precisely, there shouldn't be). "Platform inappropriate" is not an unsigned integer, just a sequence of bytes, that is inappropriately accessed through a variable of type unsigned integer, when it should be an array of bytes.


But that's just silly. How you gonna do math with just a bag of out-of-order bytes?

Obviously, as it is, you can't and you don't. When you receive a "bag of bytes" through the network, you treat them like a bag of bytes (and not like a bag of uint32's or whatever they represent): you keep them in a []byte. (Actually in Go you could use channel of bytes as well.) When you need to do math with the values, as explained below, convert them from []byte to ("platform appropriate") uint32, either using the snippet above in the thread, or using encoding/binary. The point is that you shouldn't ever have to have (and access data through) "platform-inappropriate" uint32 variables.

Peter

 
 
I think a large part of the confusion comes from the family of functions mentioned in the thread title: ntohl/htonl. Actually, implementing an "ntohl/htonl equivalent" (converting between uint32 and "potentially platform-inappropriate uint32") does indeed require knowing the machine byte order. But the point is that converting between uint32 and "potentially platform-inappropriate uint32" is the wrong approach in the first place, with the right approach being converting between uint32 and []byte and letting the compiler take care of endianness (as well as alignment and optimization).

That is the lesson I need to take. Otherwise, I might as well be doing the thing in Java. 


David Wright

Oct 6, 2012, 10:30:26 AM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
On Saturday, October 6, 2012 9:50:04 AM UTC-4, speter wrote:

Obviously, as it is, you can't and you don't. When you receive a "bag of bytes" through the network, you treat them like a bag of bytes (and not like a bag of uint32's or whatever they represent): you keep them in a []byte. (Actually in Go you could use channel of bytes as well.) When you need to do math with the values, as explained below, convert them from []byte to ("platform appropriate") uint32, either using the snippet above in the thread, or using encoding/binary. The point is that you shouldn't ever have to have (and access data through) "platform-inappropriate" uint32 variables.

Peter

Well this is far afield. I didn't recognize that Uli's code was explicitly little-endian, and this is Rob's point: "If the data stream encodes values with byte order B, then the algorithm to decode the value on computer with byte order C should be about B, not about the relationship between B and C."

Now I'm just curious whether binary.LittleEndian.Uint32() is getting inlined as well.

Uli Kunitz

Oct 6, 2012, 4:41:30 PM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
The important point is that encoding and decoding of byte streams can be done regardless of host byte order. I had trouble understanding that, because I learned about byte order using the C socket interface, which forces the user to use ntohl() and friends. I guess a lot of folks share that experience.

Rob Pike

Oct 6, 2012, 4:52:11 PM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
On Sat, Oct 6, 2012 at 11:02 PM, David Wright <d.wrig...@gmail.com> wrote:
> This may now constitute a threadjacking because Java is way off topic. If
> so, sorry.
>
> The shift operators are always going to assume big-endian, because they're
> for dealing with Java types.

This statement demonstrates a profound misunderstanding. Shift
operators work on integers. They have no byte order and "assume"
nothing about it.

Byte order is defined by whether successive byte addresses in a
computer correspond to higher or lower significant bytes of the
aliased multibyte integer occupying those several addresses. Shift
(and add and subtract and multiply and everything else) do not expose
the address sequence within a word and so are not subject to different
behavior dependent on byte order. As I said in my blog post, it's
extremely hard even to detect what the byte order is in a type-safe
language. *Nothing* exposes the address sequence within a word except
for aliasing operations, which are by definition unsafe.

The Go statement
a[0], a[1], a[2], a[3] = byte(n), byte(n>>8), byte(n>>16), byte(n>>24)
portably unpacks a 32-bit integer into a little-endian byte sequence
on any computer because the 0th byte is the lowest significant and so
on.

-rob
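
The distinction drawn above can be sketched in a few lines. The shift-based function gives the same bytes on every machine, while the only way to observe the host byte order is an aliasing operation through package unsafe. Both function names are made up for this example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// le32 unpacks n into little-endian bytes using only shifts, as in the
// statement quoted above. The result is the same on every host.
func le32(n uint32) [4]byte {
	return [4]byte{byte(n), byte(n >> 8), byte(n >> 16), byte(n >> 24)}
}

// hostIsLittleEndian aliases a uint16 as a byte through unsafe.Pointer.
// This is the kind of aliasing operation that does expose host byte order.
func hostIsLittleEndian() bool {
	x := uint16(1)
	return *(*byte)(unsafe.Pointer(&x)) == 1
}

func main() {
	// Prints [4 3 2 1] on any host.
	fmt.Println(le32(0x01020304))
	// The second result depends on the machine the program runs on.
	fmt.Println("host little-endian:", hostIsLittleEndian())
}
```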

David Wright

Oct 6, 2012, 9:07:41 PM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
On Saturday, October 6, 2012 4:52:19 PM UTC-4, Rob Pike wrote:
On Sat, Oct 6, 2012 at 11:02 PM, David Wright <d.wrig...@gmail.com> wrote:
> This may now constitute a threadjacking because Java is way off topic. If
> so, sorry.
>
> The shift operators are always going to assume big-endian, because they're
> for dealing with Java types.

This statement demonstrates a profound misunderstanding. Shift
operators work on integers. They have no byte order and "assume"
nothing about it.


Well, yes, but, no. But sort of. Java has left and right shift operators, which admittedly isn't actually about byte order, but either can be used to reorder bytes.
 
Byte order is defined by whether successive byte addresses in a
computer correspond to higher or lower significant bytes of the
aliased multibyte integer occupying those several addresses.

But the operation isn't portable in the same way with Java because, as I said, a byte array index is always big-endian because it's for dealing with Java arrays. It's the composition of those two operations, shift and array index, that gives you different results on different endianness in C/C++/Go, and the same results on machines of different endianness in interpreted languages.

That's the difference I'm now trying to amplify for the sake of mailing list posterity. I'm not the only one who's going to be coming here thinking in Java/Python/C#/Ruby. This is about something kind of subtle, not a fundamental misunderstanding.

Rob Pike

Oct 6, 2012, 9:22:42 PM
to David Wright, golan...@googlegroups.com, Devon H. O'Dell
The nature of shifts has nothing to do with
Java/Ruby/Python/Go/whatever, beyond the definition of what happens
with the sign on right shift.

I'm sorry but I sincerely believe you don't understand the issues.

-rob

David Wright

Oct 6, 2012, 9:32:34 PM
to golan...@googlegroups.com, David Wright, Devon H. O'Dell
Okay. It's fine with me if I leave understanding them better than you.

Don A. Bailey

Oct 6, 2012, 9:37:16 PM
to golan...@googlegroups.com
So I see I've found the thing called "the Internet". Thanks for the reception. 

D


Dan Kortschak

Oct 6, 2012, 10:20:24 PM
to David Wright, golan...@googlegroups.com
I'm sorry, maybe I'm a little slow with words, but I fail to see how the
index endian-ness makes any difference since it's only being used as a
byte-level index into the array, and bytes in any architecture that has
atomic bytes have no endian-ness. As far as I can see there is no
composition of index and shift operation that behaves in a manner that
is dependent on index endian-ness.

So if you could, would you provide a couple of snippets (one for java
and the equivalent in Go that behaves differently) of code that
demonstrate the issue you say exists?

thanks
Dan

zubin....@gmail.com

Oct 26, 2013, 4:00:58 AM
to golan...@googlegroups.com, Rémy Oudompheng, David Wright, Devon H. O'Dell
Rob,
The scheme using shifts that are suggested in your blog post and in Uli's post does not perform anywhere near as fast as a direct load of large words, and therein lies the problem.  For example, I am implementing a B-Tree structure in Go, which requires me to read and write keys and values in node structures that are stored on disk; for this purpose, using the shifts method is very inefficient and makes the B-Tree run substantially slower.  The encoding/binary is no better; in fact, it is a bit worse.  Using the unsafe package to do word reads and writes is nearly 10 times faster when data is in cache; here are some results from a benchmark program I wrote (attached):

    Rate with unsafe: 686.10M sets/sec, 461.55M gets/sec
    Rate with shifts: 78.67M sets/sec, 74.75M gets/sec
    Rate with encoding/binary: 65.84M sets/sec, 55.85M gets/sec

Granted, the unsafe method is not endian-safe, but I don't really care about that since the B-Tree will only ever be running on the same architecture it was created on.  Is there any way other than using unsafe to achieve the high performance reading and writing of integers to and from a byte slice?  One that would hopefully also be endian-safe.

-Zubin
try.go
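
The attached try.go is not reproduced in the thread. The following is not that program, only a rough sketch of how such a comparison could be set up with `testing.Benchmark`, assuming 64-bit little-endian reads. The function name `getShift` and the `sink` variable are inventions of this sketch, and the numbers it prints will vary with the Go version and the machine.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// sink keeps the compiler from discarding the benchmarked work.
var sink uint64

// getShift decodes 8 little-endian bytes using shifts only.
func getShift(b []byte) uint64 {
	return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
		uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
}

func main() {
	// Register the testing flags, since this runs outside "go test".
	testing.Init()

	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, 0x0102030405060708)

	shifts := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink += getShift(buf)
		}
	})
	std := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink += binary.LittleEndian.Uint64(buf)
		}
	})

	fmt.Println("shifts:         ", shifts)
	fmt.Println("encoding/binary:", std)
}
```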

Rémy Oudompheng

Oct 26, 2013, 4:16:47 AM
to zubin....@gmail.com, golang-nuts, David Wright, Devon H. O'Dell
On 2013/10/26 <zubin....@gmail.com> wrote:
> Rob,
> The scheme using shifts that are suggested in your blog post and in Uli's
> post does not perform anywhere near as fast as a direct load of large words,
> and therein lies the problem. For example, I am implementing a B-Tree
> structure in Go, which requires me to read and write keys and values in node
> structures that are stored on disk; for this purpose, using the shifts
> method is very inefficient and makes the B-Tree run substantially slower.
> The encoding/binary is no better; in fact, it is a bit worse. Using the
> unsafe package to do word reads and writes is nearly 10 times faster when
> data is in cache; here are some results from a benchmark program I wrote
> (attached):
>
> Rate with unsafe: 686.10M sets/sec, 461.55M gets/sec
> Rate with shifts: 78.67M sets/sec, 74.75M gets/sec
> Rate with encoding/binary: 65.84M sets/sec, 55.85M gets/sec
>
> Granted, the unsafe method is not endian-safe, but I don't really care about
> that since the B-Tree will only ever be running on the same architecture it
> was created on. Is there any way other than using unsafe to achieve the
> high performance reading and writing of integers to and from a byte slice?
> One that would hopefully also be endian-safe.

If you are not using unsafe there is no way you can convert an integer
to 4/8 bytes without using arithmetic operations like shifts.

Rémy.

Ian Lance Taylor

Oct 26, 2013, 1:56:38 PM
to zubin....@gmail.com, golang-nuts, Rémy Oudompheng, David Wright, Devon H. O'Dell
On Sat, Oct 26, 2013 at 1:00 AM, <zubin....@gmail.com> wrote:
>
> Granted, the unsafe method is not endian-safe, but I don't really care about
> that since the B-Tree will only ever be running on the same architecture it
> was created on.

I don't doubt this is true in your case, but I have to say that it
pains me to recall the number of times in my career I have had to
clean up after such assumptions. Systems change, but some code lives
forever.


> Is there any way other than using unsafe to achieve the
> high performance reading and writing of integers to and from a byte slice?

No.

Since you are reading from disk, it seems mildly implausible that the
hot spot in your program is the time it takes to encode an integer
into a byte slice.

If it were the hot spot for me I would try writing a []int and use
unsafe to convert that to a []byte.

Ian
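
A hedged sketch of the last suggestion, viewing a slice of integers as bytes through package unsafe. It uses []uint64 rather than []int so the element width is fixed (an assumption of this sketch), and `unsafe.Slice`, which was added to Go well after this thread (Go 1.17). The helper name `asBytes` is made up.

```go
package main

import (
	"fmt"
	"unsafe"
)

// asBytes reinterprets the memory of vals as a byte slice without copying.
// The byte layout follows the host byte order, so data written this way is
// not portable across architectures. unsafe.Slice requires Go 1.17 or later.
func asBytes(vals []uint64) []byte {
	if len(vals) == 0 {
		return nil
	}
	return unsafe.Slice((*byte)(unsafe.Pointer(&vals[0])), len(vals)*8)
}

func main() {
	vals := []uint64{1, 2}
	b := asBytes(vals)
	fmt.Println(len(b)) // 16
	// The byte layout printed here depends on the host byte order.
	fmt.Println(b)
}
```

The returned slice shares memory with `vals`, so it is only valid while `vals` is alive and unmodified by code that expects the old contents.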

Zubin Dittia

Oct 26, 2013, 4:34:13 PM
to Ian Lance Taylor, golang-nuts, Rémy Oudompheng, David Wright, Devon H. O'Dell
On Sat, Oct 26, 2013 at 10:56 AM, Ian Lance Taylor <ia...@golang.org> wrote:
> On Sat, Oct 26, 2013 at 1:00 AM, <zubin....@gmail.com> wrote:
> >
> > Granted, the unsafe method is not endian-safe, but I don't really care about
> > that since the B-Tree will only ever be running on the same architecture it
> > was created on.
>
> I don't doubt this is true in your case, but I have to say that it
> pains me to recall the number of times in my career I have had to
> clean up after such assumptions. Systems change, but some code lives
> forever.

Which is why what I really want is a high-performance way of storing and retrieving integers that is endian-safe.  In other words, I want to do something like:
*((*uint64)(unsafe.Pointer(&s[i]))) = htonll(val)

This would be both performant and endian-safe. So it's annoying that Go doesn't provide processor-optimized versions of htonl/ntohl for different integer sizes. But at least it would be good if the encoding/binary implementations were high performance: they're not.



> > Is there any way other than using unsafe to achieve the
> > high performance reading and writing of integers to and from a byte slice?
>
> No.
>
> Since you are reading from disk, it seems mildly implausible that the
> hot spot in your program is the time it takes to encode an integer
> into a byte slice.
 
I'm not actually reading from spinning disk, but from a large array of flash disks, and the throughput is high enough that this cost does become an issue.

> If it were the hot spot for me I would try writing a []int and use
> unsafe to convert that to a []byte.

It's what I'm doing, but it bothers me that it's not endian-safe (i.e., if I wrote a B-Tree node on a big-endian arch and then read it on a little-endian one, my code would be broken).

Zubin

> Ian

Kevin Gillette

Oct 26, 2013, 5:13:47 PM
to golan...@googlegroups.com, Ian Lance Taylor, Rémy Oudompheng, David Wright, Devon H. O'Dell, zubin....@gmail.com
If you have full control over the target platform, and you're implementing the persistence system, why not use the native endianness for your current target platform on-disk so that for your "typical" case requires no conversion?

If you have:

// +build 386 amd64
func ToDisk(v int32) int32 { return v }
func FromDisk(v int32) int32 { return v }

Such calls certainly will be inlined (and thus effectively "free"). When you need to support an architecture with a differing endianness, you can have a separate file that builds only for those architectures and does shifts/unsafe/whatever to actually do the conversion.
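
For the "separate file" case, one possible sketch of the converting versions uses `bits.ReverseBytes32` from math/bits, a package added to the standard library after this thread (Go 1.9). The function names are hypothetical counterparts of ToDisk and FromDisk, and in a real package they would sit in their own file behind a build constraint for the other-endian architectures.

```go
package main

import (
	"fmt"
	"math/bits"
)

// toDiskSwapped and fromDiskSwapped are hypothetical counterparts of the
// ToDisk and FromDisk functions above, for a host whose byte order differs
// from the on-disk order. Swapping is its own inverse, so both directions
// do the same work.
func toDiskSwapped(v int32) int32 {
	return int32(bits.ReverseBytes32(uint32(v)))
}

func fromDiskSwapped(v int32) int32 {
	return int32(bits.ReverseBytes32(uint32(v)))
}

func main() {
	v := int32(0x01020304)
	d := toDiskSwapped(v)
	fmt.Printf("%#x %#x\n", d, fromDiskSwapped(d)) // 0x4030201 0x1020304
}
```

Note that this style still passes a byte-swapped value around in an int32, which is the pattern criticized earlier in the thread; decoding straight from a []byte avoids that.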

Zubin Dittia

Oct 26, 2013, 6:07:07 PM
to Kevin Gillette, golang-nuts, Ian Lance Taylor, Rémy Oudompheng, David Wright, Devon H. O'Dell
Thanks, that's a good suggestion. But it still bothers me that there is no efficient way to make use of a processor instruction like Intel's bswap to do the translation for me so one never has to worry about it. Why does encoding/binary not use bswap for a fast implementation? Instead, it seems to be the slowest of the different mechanisms.

Ian Lance Taylor

Oct 26, 2013, 6:15:00 PM
to Zubin Dittia, golang-nuts, Rémy Oudompheng, David Wright, Devon H. O'Dell
On Sat, Oct 26, 2013 at 1:34 PM, Zubin Dittia <zubin....@gmail.com> wrote:
>
> Which is why what I really want is a high-performance way of storing and
> retrieving integers that is endian-safe. In other words, I want to do
> something like:
> *((*uint64)(unsafe.Pointer(&s[i]))) = htonll(val)
>
> This would be both performant and endian-safe. So its annoying that Go
> doesn't provide processor-optimized versions of htonl/ntohl for different
> integer sizes. But at least it would be good if the bytes.encoding
> implementations were high performance: they're not.

I personally think it would be fine to make the encoding/binary
package faster. Although it's worth considering what the package
explicitly states:

// This package favors simplicity over efficiency. Clients that require
// high-performance serialization, especially for large data structures,
// should look at more advanced solutions such as the encoding/gob
// package or protocol buffers.


Using htonll, on the other hand--and I understand that you aren't
actually suggesting that--is broken by design.

Ian
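
To make the "simplicity over efficiency" trade-off concrete, here is a sketch that encodes the same record two ways: with `binary.Write`, which handles a whole struct in one call, and with the `ByteOrder` methods, which spell out the layout by hand. The `node` type and its fields are assumptions for illustration, loosely modelled on a B-Tree entry.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// node is a hypothetical fixed-width record; the field names and sizes
// are assumptions for illustration.
type node struct {
	Key   uint64
	Value uint32
	Flags uint32
}

// encodeSimple uses binary.Write, which accepts any fixed-size value.
func encodeSimple(n node) ([]byte, error) {
	var buf bytes.Buffer
	err := binary.Write(&buf, binary.LittleEndian, n)
	return buf.Bytes(), err
}

// encodeDirect writes each field with the ByteOrder methods, at the cost
// of maintaining the offsets manually.
func encodeDirect(n node) []byte {
	b := make([]byte, 16)
	binary.LittleEndian.PutUint64(b[0:8], n.Key)
	binary.LittleEndian.PutUint32(b[8:12], n.Value)
	binary.LittleEndian.PutUint32(b[12:16], n.Flags)
	return b
}

func main() {
	n := node{Key: 1, Value: 2, Flags: 3}
	a, err := encodeSimple(n)
	fmt.Println(bytes.Equal(a, encodeDirect(n)), err) // true <nil>
}
```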

Kevin Gillette

Oct 26, 2013, 6:17:33 PM
to golan...@googlegroups.com, Kevin Gillette, Ian Lance Taylor, Rémy Oudompheng, David Wright, Devon H. O'Dell, zubin....@gmail.com
On Saturday, October 26, 2013 4:07:07 PM UTC-6, Zubin Dittia wrote:
> But it still bothers me that there is no efficient way to make use of processor instruction like intel bswap to do the translation for me so one never has to worry about it.

If you're not worried about gccgo compatibility (e.g. you can comfortably limit yourself to whatever architectures/OS's the official gc toolchain supports), you may use plan9 style C and plan9 style assembly pretty effortlessly: just put them in .c and .s files within the same package directory and they'll be automatically used; there are plenty of examples in the standard library code.

Uli Kunitz

Oct 27, 2013, 5:45:23 AM
to golan...@googlegroups.com, Zubin Dittia, Rémy Oudompheng, David Wright, Devon H. O'Dell
The reason the Go conversion is slower is that each byte slice access involves a slice bounds check (a compare and a conditional jump). I tried to handle that with a copy into a Go array, but that involves another function call. A highly optimizing compiler or an assembler function could improve the situation.

Nobody will beat mapping a byte array to an int array, but the cost is the dependency on the host endianness. There may be use cases where that performance is required, but I doubt that Go is the right language to use in those cases anyway. The Go designers chose garbage collection and array bounds checking to simplify programming, being aware of the performance costs. From my point of view they got it right: a compiled, statically typed language that feels like programming in a scripting language.