Unsafe string/slice conversions


Caleb Spare

Feb 21, 2017, 5:53:53 PM
to golang-nuts, Ian Taylor
I have a program that uses unsafe in order to coerce some slices to
strings for use as map keys. (Avoiding these allocations ends up being
an important performance optimization to this program.)

Here's some example code that shows what I'm doing:

https://play.golang.org/p/Yye1Riv0Jj

Does this seem OK? I've tried to make sure I understand how all the
unsafe code fits into the blessed idioms at
https://golang.org/pkg/unsafe/#Pointer. The part I'm most curious
about is the indicated line:

sh.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data // <---

This is a double-application of rule 6: it's a conversion *from* a
reflect.StringHeader's Data field *to* a reflect.SliceHeader's Data
field, through an unsafe.Pointer and uintptr.

This code has been working for a long time and appears to continue to
work, but I've been re-reviewing all my unsafe usage after reading the
conversation at https://github.com/golang/go/issues/19168.

Thanks for any insights.
Caleb

P.S. In this particular case, I'm planning on replacing the map with a
custom hashtable (since it's very specialized I can do better than a
built-in map type) and that will eliminate the unsafe code.
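
The Playground code is not reproduced in this thread. As a rough sketch only, conversion helpers of the kind described might look like the following. The function bodies, the 8-byte element arithmetic, and the map usage in main are assumptions pieced together from the snippets quoted in the thread, not a copy of the original program.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// sliceToStringUnsafe returns a string that shares v's backing array.
// Assumption: the caller never mutates v while the string is in use.
func sliceToStringUnsafe(v []uint64) string {
	var s string
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	sh.Data = (*reflect.SliceHeader)(unsafe.Pointer(&v)).Data
	sh.Len = len(v) * 8 // a uint64 occupies 8 bytes
	return s
}

// stringToSliceUnsafe reinterprets the bytes of s as a []uint64
// without copying. Trailing bytes beyond a multiple of 8 are dropped.
func stringToSliceUnsafe(s string) []uint64 {
	var v []uint64
	sh := (*reflect.SliceHeader)(unsafe.Pointer(&v))
	sh.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data // the line in question
	sh.Len = len(s) >> 3
	sh.Cap = len(s) >> 3
	return v
}

func main() {
	m := map[string]int{}
	m[sliceToStringUnsafe([]uint64{1, 2, 3})] = 1

	// Look up with a different slice that holds the same values.
	fmt.Println(m[sliceToStringUnsafe([]uint64{1, 2, 3})])

	// Recover the original values from the stored key.
	for k := range m {
		fmt.Println(stringToSliceUnsafe(k))
	}
}
```

The key stored in the map aliases the slice's memory, so this only makes sense when the slice contents are treated as immutable afterwards.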

Ian Lance Taylor

Feb 21, 2017, 6:27:49 PM
to Caleb Spare, golang-nuts
On Tue, Feb 21, 2017 at 2:53 PM, Caleb Spare <ces...@gmail.com> wrote:
> I have a program that uses unsafe in order to coerce some slices to
> strings for use as map keys. (Avoiding these allocations ends up being
> an important performance optimization to this program.)
>
> Here's some example code that shows what I'm doing:
>
> https://play.golang.org/p/Yye1Riv0Jj
>
> Does this seem OK? I've tried to make sure I understand how all the
> unsafe code fits into the blessed idioms at
> https://golang.org/pkg/unsafe/#Pointer. The part I'm most curious
> about is the indicated line:
>
> sh.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data // <---
>
> This is a double-application of rule 6: it's a conversion *from* a
> reflect.StringHeader's Data field *to* a reflect.SliceHeader's Data
> field, through an unsafe.Pointer and uintptr.
>
> This code has been working for a long time and appears to continue to
> work, but I've been re-reviewing all my unsafe usage after reading the
> conversation at https://github.com/golang/go/issues/19168.

I can't see anything wrong with this code. Maybe someone else can.

Ian

T L

Feb 23, 2017, 7:13:47 AM
to golang-nuts, ia...@golang.org


On Wednesday, February 22, 2017 at 6:53:53 AM UTC+8, Caleb Spare wrote:
I have a program that uses unsafe in order to coerce some slices to
strings for use as map keys. (Avoiding these allocations ends up being
an important performance optimization to this program.)

Here's some example code that shows what I'm doing:

https://play.golang.org/p/Yye1Riv0Jj

Looks OK, but sliceToStringUnsafe in m[sliceToStringUnsafe(v)] = 1 is not essential, because the gc compiler already makes the same optimization for you.
Just using m[string(v)] = 1 is fine.
 

Caleb Spare

Feb 23, 2017, 10:46:45 AM
to T L, golang-nuts
That only applies to []byte. I have a []uint64.
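
To make the distinction concrete: the optimization T L refers to is understood to cover a []byte converted directly inside a map index expression used for a lookup, and a []uint64 cannot be converted with string(v) at all. A small sketch of the []byte case follows. The lookup helper and the use of testing.AllocsPerRun to observe allocations are illustrative choices, not something taken from the thread, and the reported figure may vary with compiler version and build flags.

```go
package main

import (
	"fmt"
	"testing"
)

var m = map[string]int{"foobarbaz": 1}

// lookup converts b inside the index expression, which is the form the
// gc compiler is understood to special-case so that no string is built.
func lookup(b []byte) int {
	return m[string(b)]
}

func main() {
	b := []byte("foobarbaz")
	fmt.Println(lookup(b))

	// Report the average number of allocations per lookup. No expected
	// value is claimed here; it depends on the toolchain in use.
	allocs := testing.AllocsPerRun(1000, func() { _ = lookup(b) })
	fmt.Println("allocs per lookup:", allocs)
}
```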


Caleb Spare

Mar 23, 2017, 12:16:57 PM
to Ian Lance Taylor, golang-nuts
Thanks Ian, that's very helpful.

Brief follow-up: does the seeming validity of the code rely at all on
the fact that the indicated line is written as a single line? What if,
instead, a *StringHeader var were extracted?

func stringToSliceUnsafe(s string) []uint64 {
	var v []uint64
	h := (*reflect.StringHeader)(unsafe.Pointer(&s)) // <--
	sh := (*reflect.SliceHeader)(unsafe.Pointer(&v))
	sh.Data = h.Data
	sh.Len = h.Len >> 3
	sh.Cap = h.Len >> 3
	return v
}

(Play link: https://play.golang.org/p/BmGtYTsGNY)

Does h keep s alive? A strict reading of rule 6 doesn't seem to say
that keeping a *StringHeader or *SliceHeader around keeps the
underlying string/slice alive (but it's sort of implied by the rule 6
example code, which doesn't refer to s after converting it to a
*StringHeader).

Caleb

Ian Lance Taylor

Mar 23, 2017, 1:26:40 PM
to Caleb Spare, golang-nuts
On Thu, Mar 23, 2017 at 9:16 AM, Caleb Spare <ces...@gmail.com> wrote:
>
> Brief follow-up: does the seeming validity of the code rely at all on
> the fact that the indicated line is written as a single line? What if,
> instead, a *StringHeader var were extracted?
>
> func stringToSliceUnsafe(s string) []uint64 {
> var v []uint64
> h := (*reflect.StringHeader)(unsafe.Pointer(&s)) // <--
> sh := (*reflect.SliceHeader)(unsafe.Pointer(&v))
> sh.Data = h.Data
> sh.Len = h.Len >> 3
> sh.Cap = h.Len >> 3
> return v
> }
>
> (Play link: https://play.golang.org/p/BmGtYTsGNY)
>
> Does h keep s alive? A strict reading of rule 6 doesn't seem to say
> that keeping a *StringHeader or *SliceHeader around keeps the
> underlying string/slice alive (but it's sort of implied by the rule 6
> example code, which doesn't refer to s after converting it to a
> *StringHeader).

That is an interesting point. I don't think there is anything keeping
s alive here. I think this isn't quite the same as the example in the
docs, because that example is assuming that you are going to use s
after setting the fields--why else would you be doing that? In this
case it does seem theoretically possible that s could be freed between
the assignment to h and the use of h.Data. With the current and
foreseeable toolchains it's a purely theoretical problem, since there
is no point there where the goroutine could be preempted and the fact
that s is no longer referenced be detected. But as a theoretical
problem it does seem real. One fix would be something like
	p := &s
	h := (*reflect.StringHeader)(unsafe.Pointer(p))
	sh := (*reflect.SliceHeader)(unsafe.Pointer(&v))
	sh.Data = h.Data
	sh.Len = ...
	sh.Cap = ...
	runtime.KeepAlive(p)

Ian
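
Written out in full, the suggested fix might look like the sketch below. The Len and Cap expressions are taken from Caleb's version of the function earlier in the thread; the name stringToSliceKeepAlive and the small main are hypothetical additions for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
	"unsafe"
)

// stringToSliceKeepAlive reinterprets the bytes of s as a []uint64.
// The pointer p is kept alive until every field of the slice header
// has been filled in, so s cannot be collected in between.
func stringToSliceKeepAlive(s string) []uint64 {
	var v []uint64
	p := &s
	h := (*reflect.StringHeader)(unsafe.Pointer(p))
	sh := (*reflect.SliceHeader)(unsafe.Pointer(&v))
	sh.Data = h.Data
	sh.Len = h.Len >> 3
	sh.Cap = h.Len >> 3
	runtime.KeepAlive(p)
	return v
}

func main() {
	s := string(make([]byte, 16)) // 16 zero bytes, i.e. two zero uint64 values
	fmt.Println(stringToSliceKeepAlive(s))
}
```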

Caleb Spare

Mar 23, 2017, 7:24:40 PM
to Ian Lance Taylor, golang-nuts
That's very good to know. Thanks, Ian.

Unfortunately if I use this KeepAlive-based fix, p escapes and so the
function now allocates. I guess I'll stick with the original version
from my first email.

Does this indicate a shortcoming of either compiler support for
KeepAlive or escape analysis in general?

Caleb

Keith Randall

Mar 23, 2017, 7:41:28 PM
to golang-nuts, ia...@golang.org


On Thursday, March 23, 2017 at 4:24:40 PM UTC-7, Caleb Spare wrote:
That's very good to know. Thanks, Ian.

Unfortunately if I use this KeepAlive-based fix, p escapes and so the
function now allocates. I guess I'll stick with the original version
from my first email.

Does this indicate a shortcoming of either compiler support for
KeepAlive or escape analysis in general?


KeepAlive shouldn't be making things escape.  If that is happening you should file a bug for it.
The definition is:

//go:noinline
func KeepAlive(interface{}) {}

which should be pretty easy to analyze :)
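
One way to check whether the KeepAlive variant allocates is to measure it directly, as in the sketch below. The function name, the sink variable, and the use of testing.AllocsPerRun are assumptions made for illustration; no particular result is claimed, since it depends on the toolchain. Building with go build -gcflags=-m additionally prints the compiler's escape-analysis decisions.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
	"testing"
	"unsafe"
)

// stringToSliceKeepAlive is the KeepAlive-based conversion under test.
func stringToSliceKeepAlive(s string) []uint64 {
	var v []uint64
	p := &s
	h := (*reflect.StringHeader)(unsafe.Pointer(p))
	sh := (*reflect.SliceHeader)(unsafe.Pointer(&v))
	sh.Data = h.Data
	sh.Len = h.Len >> 3
	sh.Cap = h.Len >> 3
	runtime.KeepAlive(p)
	return v
}

// sink is package-level so the compiler cannot discard the call result.
var sink []uint64

func main() {
	s := string(make([]byte, 64))

	// Average heap allocations per call; zero would mean p did not escape.
	allocs := testing.AllocsPerRun(1000, func() {
		sink = stringToSliceKeepAlive(s)
	})
	fmt.Println("allocs per call:", allocs)
}
```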

Caleb Spare

Mar 23, 2017, 9:05:48 PM
to Keith Randall, golang-nuts, Ian Taylor
Filed: https://github.com/golang/go/issues/19687

Jerome Froelich

Mar 24, 2017, 2:58:31 PM
to golang-nuts, k...@google.com, ia...@golang.org
It just occurred to me that there may actually be a problem with the conversion functions in the example. Specifically, since the Go compiler can allocate slices on the stack if they do not escape, the following function may be invalid:

func foo() string {
	v := []uint64{1, 2, 3}
	s := sliceToStringUnsafe(v)
	return s
}

Admittedly, this is a contrived example, but I think it illustrates the issue. As noted on this previous thread, the compiler will allocate slices on the stack if it can prove they do not escape. Consequently, in the function above, the data for v can be allocated on the stack. In such a case, since sliceToStringUnsafe only adjusts pointers, the string it returns will point to this stack-allocated data. However, once foo returns, this data is no longer valid and can be overwritten by subsequent function calls that reuse the same stack space.

Keith Randall

Mar 25, 2017, 12:43:38 PM
to golang-nuts, k...@google.com, ia...@golang.org
That is a worry, but escape analysis can handle this case.  It understands flow of pointers through unsafe.Pointer and uintptr.  As a consequence, in this example it forces the array to be allocated on the heap.

The runtime has on occasion a need to hide values from escape analysis.  It has to do some weird math on the uintptr to hide the escape.  It uses:

// noescape hides a pointer from escape analysis.  noescape is
// the identity function but escape analysis doesn't think the
// output depends on the input.  noescape is inlined and currently
// compiles down to zero instructions.
// USE CAREFULLY!
//go:nosplit
func noescape(p unsafe.Pointer) unsafe.Pointer {
	x := uintptr(p)
	return unsafe.Pointer(x ^ 0)
}

Jerome Froelich

Mar 26, 2017, 9:57:15 PM
to golang-nuts, k...@google.com, ia...@golang.org
Interesting, I actually ran into some odd behavior with this function that I'm not sure how to explain. Consider the following test:

import (
	"reflect"
	"testing"
	"unsafe"

	"github.com/stretchr/testify/require"
)

func sliceToStringUnsafe(v []byte) string {
	var s string
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	sh.Data = uintptr(unsafe.Pointer(&v[0])) // panics if v is empty
	sh.Len = len(v)
	return s
}

func foo() string {
	v := []byte("foobarbaz")
	s := sliceToStringUnsafe(v)
	return s
}

func TestFoo(t *testing.T) {
	foo := foo()
	require.Equal(t, "foobarbaz", foo)
}

If I run this test with `go test` it passes. However, if I run the test with `go test -covermode atomic` it fails and foo contains random data. Perhaps the `covermode atomic` affects how the compiler performs escape analysis?

Jerome Froelich

Mar 27, 2017, 10:10:16 AM
to golang-nuts, k...@google.com, ia...@golang.org
It seems the issue has to do with the compiler inlining `foo`. If I add the compiler directive `//go:noinline` above `foo` and run `go test` the test fails, just like when I run `go test -covermode=atomic`. This makes sense because if the function is inlined and the slice is allocated on the stack the string will still point to valid data. On the other hand, if the function is not inlined, as I expect is the case when generating code coverage, then if the slice is allocated on the stack when `foo` returns the string could point to invalid data. In any case, it seems the compiler is not able to "see" this particular pointer swap and may, in fact, be allocating the slice on the stack.
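
A standalone sketch of the non-inlined variant is shown below. The clobber helper is a hypothetical addition that reuses stack space after foo returns, so that a string pointing into foo's old frame would likely show corrupted bytes. Whether the comparison prints true or false depends on how the toolchain in use handles the escape of v, so no result is claimed here.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// sliceToStringUnsafe returns a string sharing v's backing array.
// It panics if v is empty because of the &v[0] expression.
func sliceToStringUnsafe(v []byte) string {
	var s string
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	sh.Data = uintptr(unsafe.Pointer(&v[0]))
	sh.Len = len(v)
	return s
}

// foo is kept out of line so that its frame is released on return.
//
//go:noinline
func foo() string {
	v := []byte("foobarbaz")
	return sliceToStringUnsafe(v)
}

// clobber fills a stack buffer (hypothetical helper) so that any data
// left behind in a previously released frame is likely overwritten.
//
//go:noinline
func clobber() int {
	var buf [256]byte
	for i := range buf {
		buf[i] = 0xFF
	}
	return int(buf[17])
}

func main() {
	s := foo()
	clobber()
	fmt.Println(s == "foobarbaz")
}
```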