safety of using unsafe.Pointer for slice element alias detection

342 views
Skip to first unread message

Dan Kortschak

unread,
Aug 19, 2015, 8:38:55 PM8/19/15
to golang-nuts, Ian Lance Taylor
I am in the process of cleaning up some of the assumptions we make in
backing slice alias detection in the gonum mat64 package.

My approach to this is to define a subset of types that we can guarantee
we will safely detect data aliasing to avoid operations clobbering input
data in unpredictable ways (important for e.g. Mul when given
a.Mul(a,b)). The subset is types that have a defined memory layout
matching the types in github.com/gonum/blas/blas64. Then I can take
unsafe.Pointers of the relevant matrix elements to do comparisons
(requires getting 4 unsafe.Pointers). Unfortunately I then need to
convert to uintptr to perform some (small number of) math operations to
query for overlap.

I'm wondering how safe this is. Now it should be -maybe- OK because the
addresses are retained in the original value and unless there is a GC
memory move within the sequence of conversions from unsafe.Pointer to
uintptr the relative position relationships should be preserved, which
is all we care about.

Essentially, how safe is

```
a0 := uintptrAddrOf(&a.data[0])
aEnd := uintptrAddrOf(&a.data[len(a.data)-1])
b0 := uintptrAddrOf(&b.data[0])
bEnd := uintptrAddrOf(&b.data[len(b.data)-1])
```

when needing to retain spatial relationships between a and b?

@Ian, since you've expressed interest in clearing up the rules about
Go->C pointer handling (12116 and 8310) in golang-dev and this is
reasonably similar, what is your guess at this stage?

thanks
Dan

Ian Lance Taylor

unread,
Aug 19, 2015, 9:59:25 PM8/19/15
to Dan Kortschak, golang-nuts
Is your goal to be able to reliably detect whether two slices alias
one another? I believe you can do that reliably by writing something
along the lines of
aend := &a.data[:cap(a.data)][cap(a.data)-1]
bend := &b.data[:cap(b.data)][cap(b.data)-1]
if aend == bend {
// a and b alias each other
}

That doesn't work if you need to know whether two slices of a larger
backing array overlap or not. But you could do it in two steps.
First find out whether they are in the same backing array. If they
are, then write something like
astart := uintptr(unsafe.Pointer(&a.data[:cap(a.data)][cap(a.data)-1])) -
uintptr(unsafe.Pointer(&a[0]))
That should safely give you the index in the backing array of where a
starts, and of course you know the length. Do the same for b, and now
you can see whether they cover the same portion of the backing array.

This approach assumes that the GC is not going to move the array while
evaluating a single expression that makes no function calls. I think
that should be safe for some time.

Ian

Dan Kortschak

unread,
Aug 19, 2015, 10:13:47 PM8/19/15
to Ian Lance Taylor, golang-nuts
Thanks Ian,

The problem is harder than just a linear overlap because matrix data
slices are strided with not all elements within a slice necessarily
being part of the query.

The code here shows the proof of concept, but requires
github.com/gonum/matrix/mat64: http://play.golang.org/p/6TCHqncVLC The
algorithm is straight forward though.

The test is not so much whether there is a potential for aliasing as
whether the matrix views address the same elements.

Dan

Nigel Tao

unread,
Aug 19, 2015, 10:15:37 PM8/19/15
to Ian Lance Taylor, Dan Kortschak, golang-nuts
On Thu, Aug 20, 2015 at 11:58 AM, Ian Lance Taylor <ia...@golang.org> wrote:
> That doesn't work if you need to know whether two slices of a larger
> backing array overlap or not. But you could do it in two steps.
> First find out whether they are in the same backing array. If they
> are, then write something like
> astart := uintptr(unsafe.Pointer(&a.data[:cap(a.data)][cap(a.data)-1])) -
> uintptr(unsafe.Pointer(&a[0]))

Does that still work if you can re-cap slices via the s[i:j:k] syntax?

Dan Kortschak

unread,
Aug 19, 2015, 10:18:26 PM8/19/15
to Nigel Tao, Ian Lance Taylor, golang-nuts
On Thu, 2015-08-20 at 12:15 +1000, Nigel Tao wrote:
> Does that still work if you can re-cap slices via the s[i:j:k] syntax?
>
I don't think it will, but as mentioned, this doesn't do what we need
anyway.

Ian Lance Taylor

unread,
Aug 19, 2015, 10:25:25 PM8/19/15
to Dan Kortschak, golang-nuts
On Wed, Aug 19, 2015 at 7:13 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
>
> The problem is harder than just a linear overlap because matrix data
> slices are strided with not all elements within a slice necessarily
> being part of the query.
>
> The code here shows the proof of concept, but requires
> github.com/gonum/matrix/mat64: http://play.golang.org/p/6TCHqncVLC The
> algorithm is straight forward though.
>
> The test is not so much whether there is a potential for aliasing as
> whether the matrix views address the same elements.

I don't know if there is any way to make that fully safe. One thing
I'll say that you should avoid having any function calls involved,
because that is where objects might move. That is, drop your
uintptrAddrof function, and manually inline the operations anywhere
you use it.

Ian

Dan Kortschak

unread,
Aug 19, 2015, 10:36:08 PM8/19/15
to Ian Lance Taylor, golang-nuts
On Wed, 2015-08-19 at 19:25 -0700, Ian Lance Taylor wrote:
> I don't know if there is any way to make that fully safe. One thing
> I'll say that you should avoid having any function calls involved,
> because that is where objects might move. That is, drop your
> uintptrAddrof function, and manually inline the operations anywhere
> you use it.
>
Yeah, in the OP what I was thinking was doing this (though I wasn't
clear):

f0 := uintptr(unsafe.Pointer(&f.data[0]))
other0 := uintptr(unsafe.Pointer(&other.data[0]))
fEnd := uintptr(unsafe.Pointer(&f.data[len(f.data)-1]))
otherEnd := uintptr(unsafe.Pointer(&other.data[len(other.data)-1]))

Then so long as all these four lines are executed without a GC move
interposed we are safe since we are considering relative positions. It
would be really nice to be able to do this - it would address the cgo
pointer movement issue as well.

The safety is pretty important - without it we can't do anything about
aliasing and a significant fraction of our API becomes untenable (matrix
views become unsafe unless clients keep mental records of what they have
done beyond what is required for the actual math). Writing a doc section
on aliasing to the effect that 'you can do it, but we don't guarantee
any defined behaviour' is pretty unsatisfying.

Dan

Ian Lance Taylor

unread,
Aug 20, 2015, 12:13:59 AM8/20/15
to Dan Kortschak, golang-nuts
On Wed, Aug 19, 2015 at 7:35 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
>
> The safety is pretty important - without it we can't do anything about
> aliasing and a significant fraction of our API becomes untenable (matrix
> views become unsafe unless clients keep mental records of what they have
> done beyond what is required for the actual math). Writing a doc section
> on aliasing to the effect that 'you can do it, but we don't guarantee
> any defined behaviour' is pretty unsatisfying.

Assuming you control the API, you could record a bit of extra
information for each data structure, which would permit you to safely
determine overlaps.

Ian

Dan Kortschak

unread,
Aug 20, 2015, 12:24:47 AM8/20/15
to Ian Lance Taylor, golang-nuts
On Wed, 2015-08-19 at 21:13 -0700, Ian Lance Taylor wrote:
> Assuming you control the API, you could record a bit of extra
> information for each data structure, which would permit you to safely
> determine overlaps.

Yes, that's something I have considered. That information needs to then
be threaded through all the relevant operations that update view
relationships, and it breaks cases where users make excursions from the
mat64 API to use pure BLAS calls (we allow users to get types that work
directly with the BLAS routines without mat64 annotation), a case that
we support and would work when using addresses.

Dan

Dan Kortschak

unread,
Aug 20, 2015, 1:24:24 AM8/20/15
to Ian Lance Taylor, golang-nuts
Is an alternative to write an asm function to get the start and end
addresses as uintptrs and then use those in the calculations
subsequently?:

// func endPoints(a, b []float64) (a0, an, b0, bn uintptr)
TEXT ·endPoints(SB),NOSPLIT,$0
MOVQ a+8(FP), DI
MOVQ b+32(FP), SI
MOVQ a+16(FP), DX
MOVQ b+40(FP), AX
MOVQ DI, a0+56(FP)
MOVQ DX, BX
DECQ BX
LEAQ (DI)(BX*8), BP
MOVQ BP, an+64(FP)
MOVQ SI, b0+72(FP)
MOVQ AX, BX
DECQ BX
LEAQ (SI)(BX*8), BP
MOVQ BP, bn+80(FP)
RET


then in the phrasing of the playground example earlier:

func endPoints(a, b []float64) (a0, an, b0, bn uintptr)

func (f *foot) overlaps(other *foot) bool {
if f == nil || other == nil {
// We don't know.
return false
}

f0, fEnd, other0, otherEnd := endPoints(f.data, other.data)
if f0 == other0 {
return true
}
if fEnd < other0 {
// We know f is completely before other.
return false
}
if otherEnd < f0 {
// We know f is completely after other.
return false
}

if f.stride != other.stride {
// Too hard, so assume the worst.
return true
}

if f0 < other0 {
offset := int((other0 - f0) / unsafe.Sizeof(float64(0)))
return (offset/f.stride) < f.rows && ((offset%f.stride) < f.cols || ((offset+f.cols)%f.stride) < f.cols)
} else {
offset := int((f0 - other0) / unsafe.Sizeof(float64(0)))
return (offset/f.stride) < other.rows && ((offset%f.stride) < other.cols || ((offset+other.cols)%f.stride) < other.cols)
}
}




Dan Kortschak

unread,
Aug 20, 2015, 3:09:53 AM8/20/15
to Ian Lance Taylor, golang-nuts
With corrected offsets:

// func endPoints(a, b []float64) (a0, an, b0, bn uintptr)
TEXT ·endPoints(SB),NOSPLIT,$0
MOVQ a+0(FP), DI
MOVQ b+24(FP), SI
MOVQ a+8(FP), DX
MOVQ b+32(FP), AX
MOVQ DI, a0+48(FP)
MOVQ DX, BX
DECQ BX
LEAQ (DI)(BX*8), BP
MOVQ BP, an+56(FP)
MOVQ SI, b0+64(FP)
MOVQ AX, BX
DECQ BX
LEAQ (SI)(BX*8), BP
MOVQ BP, bn+72(FP)
RET


Ian Lance Taylor

unread,
Aug 20, 2015, 10:08:29 AM8/20/15
to Dan Kortschak, golang-nuts
On Wed, Aug 19, 2015 at 10:23 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
>
> Is an alternative to write an asm function to get the start and end
> addresses as uintptrs and then use those in the calculations
> subsequently?:

The problem you are trying to solve is that there are no rules written
down for when a memory block may move. The lack of rules is
complicated by the fact that at present memory blocks never move
anyhow. Writing the code in assembler doesn't help you with the fact
that there are no rules. If the rules are ever written down it's
unlikely that they will say anything like "this does not affect code
written in assembly."

Ian

Dan Kortschak

unread,
Aug 20, 2015, 3:53:46 PM8/20/15
to Ian Lance Taylor, golang-nuts
Sure, but there are reasonable behaviours to expect. A set of rules that did not preclude that kind of behaviour would break a lot of code unless the assembler generated code that allowed pointer values to updated when GC moves happen - anything that iterates over a slice.

I am sure that I'm grasping at straws, but it is not unreasonable when there are outstanding issues with a reasonable chance of impacting on our capacity to maintain a usable package and when there have been comments made in those issues to the effect that "we might introduce bogey men."

Dan

Ian Lance Taylor

unread,
Aug 20, 2015, 8:56:20 PM8/20/15
to Dan Kortschak, golang-nuts
On Thu, Aug 20, 2015 at 12:53 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
> Sure, but there are reasonable behaviours to expect. A set of rules that did not preclude that kind of behaviour would break a lot of code unless the assembler generated code that allowed pointer values to updated when GC moves happen - anything that iterates over a slice.
>
> I am sure that I'm grasping at straws, but it is not unreasonable when there are outstanding issues with a reasonable chance of impacting on our capacity to maintain a usable package and when there have been comments made in those issues to the effect that "we might introduce bogey men."

It's a reasonable question. I just don't think the answer is going to
be "write in assembler." I can't promise that there will be an
answer, but I'm pretty sure that won't be it.

Ian

Dan Kortschak

unread,
Aug 20, 2015, 9:24:15 PM8/20/15
to Ian Lance Taylor, golang-nuts
On Thu, 2015-08-20 at 17:56 -0700, Ian Lance Taylor wrote:
> It's a reasonable question. I just don't think the answer is going to
> be "write in assembler." I can't promise that there will be an
> answer, but I'm pretty sure that won't be it.
>
No worries. I'm happy if writing asm is not part of the solution. The
lack of an answer is making it very difficult to make any kinds of
stability plans for gonum mat64 dependent code in our projects and
weakens the value of the project as a whole. I understand that the issue
is complex, but the lack of certainty here is a problem.

thanks
Dan

Dan Kortschak

unread,
Sep 1, 2015, 7:34:38 AM9/1/15
to Ian Lance Taylor, golang-nuts
With the recent cgo pointer proposal having been submitted, and the similarity between these two problems, is it worth filing an issue for this and linking to that issue?

Dan

Ian Lance Taylor

unread,
Sep 1, 2015, 12:48:47 PM9/1/15
to Dan Kortschak, golang-nuts
On Tue, Sep 1, 2015 at 4:31 AM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
> With the recent cgo pointer proposal having been submitted, and the similarity between these two problems, is it worth filing an issue for this and linking to that issue?

It's certainly worth filing an issue for this. I see it as
significantly different than the cgo pointer issue, though. The cgo
pointer issue is about how we restrict the garbage collector to enable
pointers to escape its purview, and how we attempt to restrict C code
when working with Go pointers. This issue is about the language
itself. The language permits converting an unsafe.Pointer to uintptr,
but what does it permit once you've done that? That has never been
written down. To me this is not so much a matter of restricting the
garbage collector as considering when it can be invoked.

Ian

Dan Kortschak

unread,
Sep 1, 2015, 3:47:41 PM9/1/15
to Ian Lance Taylor, golang-nuts
I'll file later today.

The reason I see them as similar is that what I'm asking for is a temporary capacity to stop GC activity (not necessary now, but will be when GC moves) so that relative positions can be calculated between two values - the final product is not a uintptr and it could be provded in such a way that no uintptr copy of an unsafe.Pointer value is ever exposed for this. If your cgo pointer proposal were accepted then a cgo call would get exactly that for me, but for a cgo price tag. It seems to me that the mechanism could be shared.

Dan

Dan Kortschak

unread,
Sep 1, 2015, 5:40:00 PM9/1/15
to Ian Lance Taylor, golang-nuts
For the issue, would you put it under unsafe, or something else?

Ian Lance Taylor

unread,
Sep 1, 2015, 9:13:41 PM9/1/15
to Dan Kortschak, golang-nuts
On Tue, Sep 1, 2015 at 2:37 PM, Dan Kortschak
<dan.ko...@adelaide.edu.au> wrote:
> For the issue, would you put it under unsafe, or something else?

cmd/compile

Ian
Reply all
Reply to author
Forward
0 new messages