proposal: unsafe.Pointer arithmetic rules

774 views
Skip to first unread message

Matthew Dempsky

unread,
Aug 11, 2014, 3:42:36 PM8/11/14
to golan...@googlegroups.com
In response to http://golang.org/issue/7192, I've written a proposal for unsafe.Pointer arithmetic:

It captures what I think is the simplest / most idiomatic solution to the problem.  However, I'm not really a Go developer and my understanding of the Go compilers/runtime is limited at best, so take it as a best effort attempt to be helpful.

Comments welcome on this thread.  Thanks!

minux

unread,
Aug 11, 2014, 4:40:44 PM8/11/14
to Matthew Dempsky, golang-dev
Go lacked pointer arithmetic for a reason, so it's ok to be very verbose when using them.
(they are highly discouraged anyway.)

for your original issue of the 1.3 GC change docs:
"Programs that use package unsafe to store pointers in integer-typed values are also illegal but more difficult to diagnose during execution."

The emphasis is on "store". Go GC won't collect partial object at this time, so as long as
p is a variable with type unsafe.Pointer, and offset doesn't make the result out of the memory
allocated to hold p, this expression
    unsafe.Pointer(uintptr(p) + offset)
is safe under the Go 1.3 rule (probably not under moving GC, but that the problem issue 7192
is about)

Problem arises when you store p into uintptr and then the original unsafe.Pointer falls out of
scope as p won't be scanned by the GC.

I agree that we need to write down what's expect to always work (future-proof to GC improvements),
but we shouldn't make them easier to write.

Russ Cox

unread,
Aug 11, 2014, 4:52:10 PM8/11/14
to Matthew Dempsky, golang-dev
You should probably look at what go vet already diagnoses for Go 1.3.
That's what I personally believe the rules are.
The only difference I see at a quick glance is that go vet allows
    unsafe.Pointer(uintptr(p)+n)
where you are proposing a different spelling
    unsafe.Add(p, n)
The new spelling is clearer but requiring it invalidates lots of old code. Using the old spelling reduces fragmentation and is otherwise equivalent.

Russ

Matthew Dempsky

unread,
Aug 11, 2014, 5:11:45 PM8/11/14
to minux, golang-dev
Thanks for the feedback!

On Mon, Aug 11, 2014 at 1:40 PM, minux <mi...@golang.org> wrote:
Go lacked pointer arithmetic for a reason, so it's ok to be very verbose when using them. 
(they are highly discouraged anyway.)

Note the core issue I was trying to tackle is that currently pointer arithmetic requires converting an unsafe.Pointer into a uintptr and back, whereas with the changes I'm proposing pointer values would always stay typed as a pointer type (at least within user code).  My rationale is that's easier to specify, for users to reason about, and for implementations to provide runtime instrumentation to catch misuses.

The fact that "unsafe.Add(p, v)" is shorter than "unsafe.Pointer(uintptr(p) + v)" is incidental.

Matthew Dempsky

unread,
Aug 11, 2014, 5:18:42 PM8/11/14
to Russ Cox, golang-dev
On Mon, Aug 11, 2014 at 1:52 PM, Russ Cox <r...@golang.org> wrote:
You should probably look at what go vet already diagnoses for Go 1.3.
That's what I personally believe the rules are.

I looked at it a little bit, but I notice it didn't catch this code from src/pkg/syscall/sockcmsg_unix.go that I would think is erroneous:

        data := uintptr(cmsgData(h))
        for _, fd := range fds {
                *(*int32)(unsafe.Pointer(data)) = int32(fd)
                data += 4
        }

Even if I add a runtime.GC() call within the loop body (which I would think would certainly risk invalidating "data"), "go vet" doesn't complain.

Would you argue that's a bug in "go vet" or that I misunderstand and a Go implementation is required to ensure the above code runs as intended?

The only difference I see at a quick glance is that go vet allows
    unsafe.Pointer(uintptr(p)+n)
where you are proposing a different spelling
    unsafe.Add(p, n)
The new spelling is clearer but requiring it invalidates lots of old code. Using the old spelling reduces fragmentation and is otherwise equivalent.

I'm also proposing stricter semantics that a pointer into a Go variable can only be derived from other pointers into the same Go variable, and that the runtime provide instrumentation to check for that.  Implementing that would be somewhat easier under the new spelling, but it could be made to work under the old spelling as well.

Additionally, I think the new spelling gives some confidence to users that code like

    unsafe.Add(f(), g())

will work as intended, whereas

    unsafe.Pointer(uintptr(f()) + g())

happens to work today because gc rewrites the expression as

    autotmp1 := f()
    autotmp2 := g()
    unsafe.Pointer(uintptr(autotmp1) + autotmp2)

but the Go spec doesn't appear to guarantee that the uintptr conversion can't happen before the call to g().  I think that matters if evaluating g() might cause the GC to run.

Thanks for the feedback!

Russ Cox

unread,
Aug 11, 2014, 5:36:00 PM8/11/14
to Matthew Dempsky, golang-dev
g% go vet syscall
sockcmsg_unix.go:82: possible misuse of unsafe.Pointer
exit status 1
g% 


Russ Cox

unread,
Aug 11, 2014, 5:37:23 PM8/11/14
to Matthew Dempsky, golang-dev
On Mon, Aug 11, 2014 at 5:18 PM, Matthew Dempsky <mdem...@google.com> wrote:
I'm also proposing stricter semantics that a pointer into a Go variable can only be derived from other pointers into the same Go variable, and that the runtime provide instrumentation to check for that.  Implementing that would be somewhat easier under the new spelling, but it could be made to work under the old spelling as well.

That is an interesting idea. Noted.

Russ

Matthew Dempsky

unread,
Aug 11, 2014, 5:38:31 PM8/11/14
to Russ Cox, golang-dev
Interesting, I don't seem to be getting that error message locally.  I'll dig into why.

minux

unread,
Aug 11, 2014, 5:43:07 PM8/11/14
to Matthew Dempsky, Russ Cox, golang-dev
On Mon, Aug 11, 2014 at 5:18 PM, 'Matthew Dempsky' via golang-dev <golan...@googlegroups.com> wrote:
On Mon, Aug 11, 2014 at 1:52 PM, Russ Cox <r...@golang.org> wrote:
You should probably look at what go vet already diagnoses for Go 1.3.
That's what I personally believe the rules are.

I looked at it a little bit, but I notice it didn't catch this code from src/pkg/syscall/sockcmsg_unix.go that I would think is erroneous:

        data := uintptr(cmsgData(h))
        for _, fd := range fds {
                *(*int32)(unsafe.Pointer(data)) = int32(fd)
                data += 4
        }

Even if I add a runtime.GC() call within the loop body (which I would think would certainly risk invalidating "data"), "go vet" doesn't complain.

Would you argue that's a bug in "go vet" or that I misunderstand and a Go implementation is required to ensure the above code runs as intended?
The code is correct, h is still in scope, and the pointer cmsgData returns is still within h.

The only difference I see at a quick glance is that go vet allows
    unsafe.Pointer(uintptr(p)+n)
where you are proposing a different spelling
    unsafe.Add(p, n)
The new spelling is clearer but requiring it invalidates lots of old code. Using the old spelling reduces fragmentation and is otherwise equivalent.

I'm also proposing stricter semantics that a pointer into a Go variable can only be derived from other pointers into the same Go variable,
the wording is difficult to get right. for example, it's ok to use unsafe.Pointer to convert a t of T,
to a []T with len=cap=1?

e.g. is T and [1]T equivalent? what about going from [300]T to [10]T?
and that the runtime provide instrumentation to check for that.  Implementing that would be somewhat easier under the new spelling, but it could be made to work under the old spelling as well.
the trend is to remove type info for every object on the heap, so i don't think we can implement this check at runtime in the future.
(although we probably can check that pointer maps are compatible, and it's probably good enough)

Additionally, I think the new spelling gives some confidence to users that code like

    unsafe.Add(f(), g())

will work as intended, whereas

    unsafe.Pointer(uintptr(f()) + g())
i believe this is not allowed under 1.3 rules. 

minux

unread,
Aug 11, 2014, 5:44:18 PM8/11/14
to Matthew Dempsky, Russ Cox, golang-dev
it's a difference of go vet specifying all package files versus go tool vet pkg/syscall
probably a cmd/go bug?

Matthew Dempsky

unread,
Aug 11, 2014, 5:47:19 PM8/11/14
to minux, Russ Cox, golang-dev
On Mon, Aug 11, 2014 at 2:43 PM, minux <mi...@golang.org> wrote:
it's a difference of go vet specifying all package files versus go tool vet pkg/syscall
probably a cmd/go bug?

Ah, if I run "go tool vet pkg/syscall" from within src then I see the warning Russ quoted. Thanks!

Rob Pike

unread,
Aug 11, 2014, 5:47:52 PM8/11/14
to minux, Matthew Dempsky, Russ Cox, golang-dev
I don't believe so.

% go vet syscall
go/src/pkg/syscall/sockcmsg_unix.go:82: possible misuse of unsafe.Pointer
exit status 1
% go tool vet go/src/pkg/syscall
go/src/pkg/syscall/sockcmsg_unix.go:82: possible misuse of unsafe.Pointer
%
> --
> You received this message because you are subscribed to the Google Groups
> "golang-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-dev+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

minux

unread,
Aug 11, 2014, 5:59:13 PM8/11/14
to Rob Pike, Matthew Dempsky, Russ Cox, golang-dev
On Mon, Aug 11, 2014 at 5:47 PM, Rob Pike <r...@golang.org> wrote:
I don't believe so.

% go vet syscall
go/src/pkg/syscall/sockcmsg_unix.go:82: possible misuse of unsafe.Pointer
exit status 1
% go tool vet go/src/pkg/syscall
go/src/pkg/syscall/sockcmsg_unix.go:82: possible misuse of unsafe.Pointer
%
Interesting.

Something is wrong with my environment.

$ go version
go version devel +0449858880be Mon Aug 11 17:11:31 2014 -0400 linux/amd64
$ go vet syscall
$ go vet -x syscall
../pkg/tool/linux_amd64/vet pkg/syscall/creds_test.go pkg/syscall/env_unix.go pkg/syscall/exec_linux.go pkg/syscall/exec_unix.go pkg/syscall/flock.go pkg/syscall/lsf_linux.go pkg/syscall/mmap_unix_test.go pkg/syscall/netlink_linux.go pkg/syscall/race0.go pkg/syscall/sockcmsg_linux.go pkg/syscall/sockcmsg_unix.go pkg/syscall/str.go pkg/syscall/syscall.go pkg/syscall/syscall_linux.go pkg/syscall/syscall_linux_amd64.go pkg/syscall/syscall_test.go pkg/syscall/syscall_unix.go pkg/syscall/syscall_unix_test.go pkg/syscall/zerrors_linux_amd64.go pkg/syscall/zsyscall_linux_amd64.go pkg/syscall/zsysnum_linux_amd64.go pkg/syscall/ztypes_linux_amd64.go pkg/syscall/asm_linux_amd64.s
$ ../pkg/tool/linux_amd64/vet pkg/syscall/creds_test.go pkg/syscall/env_unix.go pkg/syscall/exec_linux.go pkg/syscall/exec_unix.go pkg/syscall/flock.go pkg/syscall/lsf_linux.go pkg/syscall/mmap_unix_test.go pkg/syscall/netlink_linux.go pkg/syscall/race0.go pkg/syscall/sockcmsg_linux.go pkg/syscall/sockcmsg_unix.go pkg/syscall/str.go pkg/syscall/syscall.go pkg/syscall/syscall_linux.go pkg/syscall/syscall_linux_amd64.go pkg/syscall/syscall_test.go pkg/syscall/syscall_unix.go pkg/syscall/syscall_unix_test.go pkg/syscall/zerrors_linux_amd64.go pkg/syscall/zsyscall_linux_amd64.go pkg/syscall/zsysnum_linux_amd64.go pkg/syscall/ztypes_linux_amd64.go pkg/syscall/asm_linux_amd64.s
$ ../pkg/tool/linux_amd64/vet syscall
$ ../pkg/tool/linux_amd64/vet syscall
vet: error walking tree: stat syscall: no such file or directory
vet: syscall: open syscall: no such file or directory
vet: no files checked
$ ../pkg/tool/linux_amd64/vet pkg/syscall
pkg/syscall/sockcmsg_unix.go:82: possible misuse of unsafe.Pointer
$ (cd ~/src/code.google.com/p/go.tools/; hg sum)
parent: 1035:35238aaf7394 tip
 go.tools/go/ssa: improvements to CreateTestMainPackage.
branch: default
commit: 1 unknown (clean)
update: (current)

I also tried go1.3 with vet from release-branch.go1.3 package.

And with our official 1.3 binary.
$ tar xf go1.3.linux-amd64.tar.gz
$ cd go
$ GOROOT=`pwd` bin/go vet syscall
$ $ GOROOT=`pwd` bin/go tool vet src/pkg/syscall
src/pkg/syscall/sockcmsg_unix.go:82: possible misuse of unsafe.Pointer

It's because there is a difference between vet on Darwin and Linux?

Keith Randall

unread,
Aug 17, 2014, 1:01:13 AM8/17/14
to minux, Rob Pike, Matthew Dempsky, Russ Cox, golang-dev
> The only difference I see at a quick glance is that go vet allows
>     unsafe.Pointer(uintptr(p)+n)
> where you are proposing a different spelling
>     unsafe.Add(p, n)

I'm partial toward adding a method to unsafe.Pointer so you can do
      p.Add(n)

(or p.Offset(n) or p.Adjust(n)?) which does the equivalent thing.  In any case, I agree with Matthew that having a way to move an unsafe.Pointer without converting to uintptr and back would be nice long term.  We couldn't deprecate the old way for a while, of course.


--

minux

unread,
Aug 17, 2014, 2:11:14 AM8/17/14
to Keith Randall, Russ Cox, Matthew Dempsky, Rob Pike, golang-dev


On Aug 17, 2014 1:01 AM, "Keith Randall" <k...@google.com> wrote:
> > The only difference I see at a quick glance is that go vet allows
> >     unsafe.Pointer(uintptr(p)+n)
> > where you are proposing a different spelling
> >     unsafe.Add(p, n)
>
> I'm partial toward adding a method to unsafe.Pointer so you can do
>       p.Add(n)

this had been proposed before, and i think the answer was always we won't add methods to built-in types.

the semantic is also a problem: should p.Add change p?


> (or p.Offset(n) or p.Adjust(n)?) which does the equivalent thing.  In any case, I agree with Matthew that having a way to move an unsafe.Pointer without converting to uintptr and back would be nice long term.  We couldn't deprecate the old way for a while, of course.

i don't understand why the old form isn't ok in the long term: as long as the operation itself is safe (adjusted p still points to the same object), p should keep the object alive and
during pointer manipulation it isn't a safe point for GC and object movement. (the compiler need to do pointer arithmetic under the hood, so it needs a way to control GC involvement when the pointer is only contained in registers anyway)

i think i understand why you want this: the runtime uses a lot of unsafe. but still i don't support any proposal that make pointer arithmetic easier to write. that feature should be strongly discouraged instead.

Carlos Castillo

unread,
Aug 17, 2014, 10:20:05 AM8/17/14
to golan...@googlegroups.com, k...@google.com, r...@golang.org, mdem...@google.com

On Saturday, August 16, 2014 11:11:14 PM UTC-7, minux wrote:


On Aug 17, 2014 1:01 AM, "Keith Randall" <k...@google.com> wrote:
> > The only difference I see at a quick glance is that go vet allows
> >     unsafe.Pointer(uintptr(p)+n)
> > where you are proposing a different spelling
> >     unsafe.Add(p, n)
>
> I'm partial toward adding a method to unsafe.Pointer so you can do
>       p.Add(n)
this had been proposed before, and i think the answer was always we won't add methods to built-in types.

the semantic is also a problem: should p.Add change p? 

Instead of a method on unsafe.Pointer, you could always have a function in unsafe: func Offset( ptr unsafe.Pointer, offset int ) 
 

> (or p.Offset(n) or p.Adjust(n)?) which does the equivalent thing.  In any case, I agree with Matthew that having a way to move an unsafe.Pointer without converting to uintptr and back would be nice long term.  We couldn't deprecate the old way for a while, of course.
i don't understand why the old form isn't ok in the long term: as long as the operation itself is safe (adjusted p still points to the same object), p should keep the object alive and
during pointer manipulation it isn't a safe point for GC and object movement. (the compiler need to do pointer arithmetic under the hood, so it needs a way to control GC involvement when the pointer is only contained in registers anyway)

i think i understand why you want this: the runtime uses a lot of unsafe. but still i don't support any proposal that make pointer arithmetic easier to write. that feature should be strongly discouraged instead.

If the memory for a value is moved while it's address is stored in a uintptr, the uintptr isn't updated. In the future this might happen with heap objects in a compacting GC, in the present it can happen with stack objects due to contiguous stacks. http://play.golang.org/p/v-NlS3AS9B

Currently to be safe with pointers to stack objects, you need to ensure there are no function calls between the conversion to uintptr, the addition of the offset, and the conversion back to unsafe.Pointer/the original pointer type. In the future, to protect heap pointers from the compacting GC, you need to prevent the GC stop the world from happening in between the three stages, ie: no channel operations, no allocations, as well as no function calls.

For me it isn't about making pointer arithmetic easier, it's about preventing races between the code that needs to perform the arithmetic, and the code that can change the location of objects. If it's possible to (somehow) apply offsets directly to unsafe.Pointers, then the detour though uintptr is not necessary, so the object moving code will always update the pointers correctly, and there is never a point in the code where you are depending on the compaction or stack moving to not happen.

minux

unread,
Aug 17, 2014, 4:16:42 PM8/17/14
to Carlos Castillo, golang-dev, Keith Randall, Russ Cox, Matthew Dempsky
On Sun, Aug 17, 2014 at 10:20 AM, Carlos Castillo <cook...@gmail.com> wrote:

i think i understand why you want this: the runtime uses a lot of unsafe. but still i don't support any proposal that make pointer arithmetic easier to write. that feature should be strongly discouraged instead.

If the memory for a value is moved while it's address is stored in a uintptr, the uintptr isn't updated. In the future this might happen with heap objects in a compacting GC, in the present it can happen with stack objects due to contiguous stacks. http://play.golang.org/p/v-NlS3AS9B
This is wrong usage of unsafe.Pointer and uinptr.
You're storing the pointer in an uintptr, and that is disallowed in Go 1.3 release note.

Currently to be safe with pointers to stack objects, you need to ensure there are no function calls between the conversion to uintptr, the addition of the offset, and the conversion back to unsafe.Pointer/the original pointer type. In the future, to protect heap pointers from the compacting GC, you need to prevent the GC stop the world from happening in between the three stages, ie: no channel operations, no allocations, as well as no function calls.
That's what issue 7192 is for: documenting what's allowed and what isn't.
I don't think we need to introduce new method or functions just to do pointer arithmetics,
and for backward compatibility reasons, simple expression forms like p2 = p1 + n should
always work as intended.
For me it isn't about making pointer arithmetic easier, it's about preventing races between the code that needs to perform the arithmetic, and the code that can change the location of objects. If it's possible to (somehow) apply offsets directly to unsafe.Pointers, then the detour though uintptr is not necessary, so the object moving code will always update the pointers correctly, and there is never a point in the code where you are depending on the compaction or stack moving to not happen.
The side effect is that it simplifies pointer arithmetic, and I object to that. Unsafe
and pointer arithmetics is supposed to be hard, subtle and strongly discouraged
(though there must be written rules, like the memory model docs), just like
sync/atomic (and our sync/atomic is still lacking documentations, for example,
are those functions act as memory barriers? Notably, ARM implementations are
not, and although we tried very hard to come up with tests to prove they're wrong,
we failed.).

Let's me re-iterate. gc could and must place GC/moving inhibit markers in the code,
and it's just a matter of disallowing GC/moving when pointer values are in registers.

Also, p2 = p1 + <-ch could be converted to:
t := <-ch
p2 = p1 + t
(side note: this is what happens today)

so it can still be a safe construct. Similarly for p2 = p1 + f().

What we need to define is if the unsafe.Pointer comes from a channel, or func,
is it required to store it to an variable first, or just p2 = F() + n is acceptable?

What's definitely unsafe is your example that stores pointer in uintptrs.

I expect the rule to be very simple: never store unsafe.Pointer in an uintptr variable,
only convert it to uintptr in an expression, and convert it back to unsafe.Pointer in
the same expression.

Also, the document should bear the same wording that is proposed for ref/mem:
Don't be clever. Don't use unsafe. If you need to look at this document to determine
if your program is correct, you're being clever.

Also, please note that during the process of taking unsafe out of encoding/gob,
several bugs were found and fixed. Even the Go authors didn't use unsafe.Pointer
correctly. Yet another data point to discourage its use.
Reply all
Reply to author
Forward
0 new messages