Hi all,
I'm working on a project that uses OpenXLA (a C++ library for fast execution of computation graphs) from Go. While managing its implementation of "tensors" (high-dimensional data) in Go, I bumped into odd behavior with slices that point to data managed by the C++ side.
The full example is unfortunately too large to post (it would include C/C++ code), but hopefully the relevant part copied below is enough for someone to spot what could be happening. I'm hoping I'm simply misunderstanding something:
func (l *Literal) Data() any {
	if l.IsNil() {
		return nil
	}
	// View the C++-owned buffer as a Go slice; no copy is made.
	rawData := unsafe.Pointer(l.cLiteralPtr.data)
	n := int(l.cLiteralPtr.size) // avoid shadowing the builtin len
	return unsafe.Slice((*int)(rawData), n)
}
...
indices = indicesT.Literal().Data().([]int)
fmt.Printf("\tindices=%v, ptr=%0X\n", indices, &indices[0])
runtime.GC() // Apparently changes content of indices
fmt.Printf("\tindices=%v, ptr=%0X\n", indices, &indices[0])
...
The `indicesT` is generated (in C++ land) as a small vector of random indices, and it gets converted to a Go slice in `indices`, using `unsafe.Slice`.
Executing the above I get something like:
indices=[421 5309 10924 8068 2193 4380 3475 8713], ptr=2F66300
indices=[0 5309 10924 8068 2193 4380 3475 8713], ptr=2F66300
Notice the value of `indices[0]` changed (!) to zero. No other goroutine knows about or interacts in any way with `indices` (or the underlying C++ object), at least none that I'm aware of.
Curiously, if I remove the call to `runtime.GC()`, everything works (!). It took me a while to discover and isolate the link to garbage collection (it could be just correlation), since without the explicit call the problem happens only very occasionally: I usually pass these slices to cgo right after I create them.
Also very curious: if I add a second conversion for `indices` after the last `fmt.Printf` (!), the issue goes away. As in:
...
indices = indicesT.Literal().Data().([]int)
fmt.Printf("\tindices=%v, ptr=%0X\n", indices, &indices[0])
runtime.GC()
fmt.Printf("\tindices=%v, ptr=%0X\n", indices, &indices[0])
indices = indicesT.Literal().Data().([]int)
...
A typical output now is (notice the two lines are identical, as expected):
indices=[563 9434 222 8007 9770 11613 9906 11786], ptr=179C300
indices=[563 9434 222 8007 9770 11613 9906 11786], ptr=179C300
I observed the same issue with other larger tensors I was working on (some garbled images stored as tensors).
Any thoughts?
Many thanks!
Jan