alignment of stack-allocated variables?

TheDiveO

Mar 3, 2023, 3:30:37 PM
to golang-nuts
In dealing with Linux netlink messages I need to decode and encode uint16, uint32, and uint64 numbers that sit at arbitrary positions in an arbitrarily aligned byte buffer. These numbers are always in native endianness, so I would like to avoid having to go through encoding/binary.

buff := bytes.NewBuffer(/* some data */)

// ...

func foo() uint32 {
    var s struct {
        _ [0]uint32
        b [4]byte
    }
    buff.Read(s.b[:])
    return *(*uint32)(unsafe.Pointer(&s.b[0]))
}

Will the Go compiler (1.19+) allocate s on the stack with the correct alignment for its element b, so that the unsafe.Pointer conversion works correctly on different CPU architectures?

Or is this inefficient in some subtle way, so that my attempt to avoid non-stack allocations is moot?
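One way to probe the alignment question empirically is to ask the compiler what it does with the struct. The following is only a sketch: the names aligned4, uint32Align, and fieldAlignment are made up for illustration, and the output shows what one particular toolchain and architecture do rather than proving a guarantee. The Go specification does state that a struct's alignment is the largest alignment of any of its fields, which is the property the zero-length array relies on.

```go
package main

import (
	"fmt"
	"unsafe"
)

// aligned4 mirrors the struct from the question: the zero-length uint32
// array occupies no space but contributes uint32's alignment to the struct.
type aligned4 struct {
	_ [0]uint32
	b [4]byte
}

// uint32Align reports the alignment the compiler uses for uint32.
func uint32Align() uintptr {
	return unsafe.Alignof(uint32(0))
}

// fieldAlignment reports the struct's alignment, the offset of field b,
// and the address of b modulo the uint32 alignment for one local variable.
func fieldAlignment() (align, offset, rem uintptr) {
	var s aligned4
	align = unsafe.Alignof(s)
	offset = unsafe.Offsetof(s.b)
	rem = uintptr(unsafe.Pointer(&s.b[0])) % uint32Align()
	return align, offset, rem
}

func main() {
	align, offset, rem := fieldAlignment()
	fmt.Println("struct alignment:", align, "uint32 alignment:", uint32Align())
	fmt.Println("offset of b:", offset, "address of b mod alignment:", rem)
}
```

If the struct alignment equals the uint32 alignment and both the offset and the remainder are zero, the cast in foo reads from a suitably aligned address on that platform.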

Keith Randall

Mar 3, 2023, 7:20:01 PM
to golang-nuts
If you're using unsafe anyway, I'd go the other direction, casting from the larger alignment to the smaller one. That avoids any alignment concerns.

var x uint32
b := (*[4]byte)(unsafe.Pointer(&x))[:]
r.buff.Read(b)
return x

I would encourage you to use encoding/binary though. It all works out just as well without unsafe, with a bit of trickiness around making sure that calls can be resolved and inlined.

b := make([]byte, 4)
buf.Read(b)
if little { // some global variable (or constant) you set
   return binary.LittleEndian.Uint32(b)
}
return binary.BigEndian.Uint32(b)
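The "little" flag is left open above. One possible way to set it, sketched here as an assumption rather than as the approach intended in the post, is to inspect the first byte of a known uint16 once at start-up. That detection step uses unsafe, but the decoding itself stays within encoding/binary and calls the concrete LittleEndian and BigEndian values directly. The name nativeUint32 is made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// little records whether the host is little-endian. It is computed once at
// package initialization by looking at the lowest-addressed byte of a
// uint16 holding the value 1.
var little = func() bool {
	x := uint16(1)
	return *(*byte)(unsafe.Pointer(&x)) == 1
}()

// nativeUint32 decodes the first four bytes of b in host byte order.
// Both branches call methods on concrete values, not on an interface,
// so the compiler can resolve the calls statically.
func nativeUint32(b []byte) uint32 {
	if little {
		return binary.LittleEndian.Uint32(b)
	}
	return binary.BigEndian.Uint32(b)
}

func main() {
	b := []byte{0x01, 0x02, 0x03, 0x04}
	fmt.Printf("little-endian host: %v, value: %#x\n", little, nativeUint32(b))
}
```

A build-tag-selected constant would avoid unsafe entirely and let the compiler drop the unused branch, at the cost of maintaining a list of architectures.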

TheDiveO

Mar 4, 2023, 9:53:42 AM
to golang-nuts
Keith, thank you very much for your feedback, it is highly appreciated!

With this in mind, it's time for lies, more lies, and statistics, benchmarking the three different implementations below:

func (r *Reader) Uint32() uint32 {
    if r.err != nil {
        return 0
    }
    var s struct {
        _ [0]uint32
        b [4]byte
    }
    _, r.err = r.buff.Read(s.b[:])
    if r.err != nil {
        return 0
    }
    return *(*uint32)(unsafe.Pointer(&s.b[0]))
}

func (r *Reader) Uint32X() uint32 {
    if r.err != nil {
        return 0
    }
    var v uint32
    _, r.err = r.buff.Read((*[4]byte)(unsafe.Pointer(&v))[:])
    if r.err != nil {
        return 0
    }
    return v
}


func (r *Reader) Uint32N() uint32 {
    if r.err != nil {
        return 0
    }
    b := make([]byte, 4)
    _, r.err = r.buff.Read(b)
    if r.err != nil {
        return 0
    }
    return hostnative.Uint32(b)
}


The benchmarking results using "go test -bench=. -benchtime=60s -benchmem .":

cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
BenchmarkReadUint32
BenchmarkReadUint32-8           1000000000               5.974 ns/op           0 B/op          0 allocs/op
BenchmarkReadUint32X
BenchmarkReadUint32X-8          1000000000               5.977 ns/op           0 B/op          0 allocs/op
BenchmarkReadUint32N
BenchmarkReadUint32N-8          1000000000              20.81 ns/op            4 B/op          1 allocs/op

The two "unsafe" contenders are absolutely neck and neck, so in terms of readability and maintainability your proposed variant wins for me. And as I somehow suspected, the encoding/binary version takes about 3.5 times as long as the first two implementations and throws a needless heap allocation into the bargain.

TheDiveO

Mar 6, 2023, 8:53:07 AM
to golang-nuts
Keith made me aware that my benchmark was calling encoding/binary through the binary.ByteOrder interface instead of "unrolling" the interface, that is, picking the concrete byte-order type with a runtime check as in his example. With that changed:

cpu: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
BenchmarkUnsafe-8       1000000000               5.991 ns/op           0 B/op          0 allocs/op
BenchmarkEnc-8          1000000000               6.327 ns/op           0 B/op          0 allocs/op

This now gets within 6% of the unsafe method.
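The difference between the two encoding/binary variants can be reproduced in a small standalone program. This is a sketch under assumptions: order stands in for an interface-typed variable such as hostnative (whose declaration is not shown in the thread), the byte order is hard-coded to little-endian for brevity, and viaInterface and viaConcrete are made-up names. testing.AllocsPerRun is used here only to report the average allocations per call; the numbers depend on the compiler version.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// order is a hypothetical package-level interface value. Calls through it
// are dynamically dispatched, so the compiler cannot see the callee.
var order binary.ByteOrder = binary.LittleEndian

// viaInterface decodes through the interface value.
func viaInterface(src []byte) uint32 {
	b := make([]byte, 4)
	copy(b, src)
	return order.Uint32(b)
}

// viaConcrete decodes through the concrete binary.LittleEndian value,
// which the compiler can resolve statically and may inline.
func viaConcrete(src []byte) uint32 {
	b := make([]byte, 4)
	copy(b, src)
	return binary.LittleEndian.Uint32(b)
}

func main() {
	src := []byte{0x78, 0x56, 0x34, 0x12}
	var sink uint32 // keeps the results live

	ifaceAllocs := testing.AllocsPerRun(1000, func() { sink += viaInterface(src) })
	concAllocs := testing.AllocsPerRun(1000, func() { sink += viaConcrete(src) })

	fmt.Println("allocs per call via interface:", ifaceAllocs)
	fmt.Println("allocs per call via concrete type:", concAllocs)
	fmt.Printf("decoded value: %#x\n", viaConcrete(src))
}
```

When the call goes through the interface, escape analysis typically has to assume the slice escapes, which would account for the 4 B/op and 1 allocs/op seen in the earlier run.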