Memory corruption with GOGC=on

yang...@gmail.com

Feb 25, 2019, 12:16:16 AM
to golang-nuts
Hi all,

The error "unexpected fault address" can be triggered by a data race or by memory
corruption. After code review, I suspect the reason is memory corruption rather
than a data race.

The following code panics occasionally, and yes, it happens where I initialize the slices.

package controlcan

import "C"

cReceive := make([]C.struct__CAN_OBJ, 2500)

or 

package usbcan

import "controlcan"

pReceive := make([]controlcan.CanObj, 2500)

The error is:

unexpected fault address 0xffffffffffffffff
fatal error: fault
[signal 0xc0000005 code=0x0 addr=0xffffffffffffffff pc=0x41c65d]

goroutine 41 [running]:
runtime.throw(0xcc969a, 0x5)
        /usr/local/go/src/runtime/panic.go:619 +0x88 fp=0xc0428ffb38 sp=0xc0428ffb18 pc=0x42d0b8
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_windows.go:170 +0x13a fp=0xc0428ffb68 sp=0xc0428ffb38 pc=0x43fcca
runtime.gcMarkRootPrepare()
        /usr/local/go/src/runtime/mgcmark.go:72 +0x5d fp=0xc0428ffb70 sp=0xc0428ffb68 pc=0x41c65d
runtime.gcStart(0x0, 0x1, 0x0, 0x0)
        /usr/local/go/src/runtime/mgc.go:1350 +0x30f fp=0xc0428ffba0 sp=0xc0428ffb70 pc=0x419b6f
runtime.mallocgc(0x10000, 0xc54660, 0xc0422ee001, 0xc0423ded60)
        /usr/local/go/src/runtime/malloc.go:803 +0x448 fp=0xc0428ffc40 sp=0xc0428ffba0 pc=0x411c48
runtime.makeslice(0xc54660, 0x9c4, 0x9c4, 0xc04202ce00, 0xc04202c000, 0x411b23)
        /usr/local/go/src/runtime/slice.go:61 +0x7e fp=0xc0428ffc70 sp=0xc0428ffc40 pc=0x43fffe
controlcan.Receive(0x4, 0x0, 0x0, 0xc04267e000, 0x9c4, 0x9c4, 0x64, 0x0, 0x0, 0x0)
        /media/sf_GOPATH0/src/controlcan/controlcan.go:262 +0x75 fp=0xc0428ffd70 sp=0xc0428ffc70 pc=0xa0d795
posam/protocol/usbcan.(*Channel).receive(0xc04229d490)
        /media/sf_GOPATH0/src/posam/protocol/usbcan/usbcan.go:469 +0x536 fp=0xc0428fffd8 sp=0xc0428ffd70 pc=0xa10926
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc0428fffe0 sp=0xc0428fffd8 pc=0x457531
created by posam/protocol/usbcan.(*Channel).Start
        /media/sf_GOPATH0/src/posam/protocol/usbcan/usbcan.go:242 +0x3aa


I was confused for a couple of days, until I ran the app with GOGC=off.
Everything then works fine, except that memory usage keeps growing. From what
Dave Cheney wrote in "cgo is not Go" and what JimB said on Stack Overflow, I realize
that it is possible to trigger memory corruption when using cgo. But in the situation
above, the only thing involved is the type CanObj, not any variables.

So what is the reason behind this error, and how can I play well with GC and cgo?

Here're links about this question:

Thank you!

robert engels

Feb 25, 2019, 1:24:26 AM
to yang...@gmail.com, golang-nuts
With GC off, free will not be called, nor will destructors be executed in response to finalizers, so many memory corruption errors will not be experienced (nor detected).

Almost certainly the C memory is not being managed correctly, possibly by copying a pointer between structs and then freeing the pointer while the other struct still holds a reference.

I am guessing that the C.struct__CAN_OBJ also contains pointers?

Including the definition of that structure MIGHT help, but honestly, when you bind to C, you really need to be a C programmer and a Go programmer to do it correctly IMO.
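
A small sketch of the first point above, in plain Go with no cgo: with the collector disabled, no collection cycles run, so nothing is freed or finalized and nothing gets a chance to trip over a corrupted heap. The helper names (setGCPercent, gcCyclesDuring) and the allocation sizes are illustrative assumptions, not code from this thread; debug.SetGCPercent with a negative value is the in-program counterpart of GOGC=off.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink keeps the compiler from optimizing the allocations away.
var sink []byte

// setGCPercent wraps debug.SetGCPercent. A negative value disables the
// collector, like running with GOGC=off. It returns the previous setting.
func setGCPercent(p int) int { return debug.SetGCPercent(p) }

// gcCyclesDuring allocates n one-MiB buffers, each replacing the last so
// that almost all of it is garbage, and reports how many GC cycles
// completed in the meantime.
func gcCyclesDuring(n int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		sink = make([]byte, 1<<20)
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	old := setGCPercent(-1) // collector off
	fmt.Println("GC cycles with the collector off:", gcCyclesDuring(200))
	setGCPercent(old) // restore the previous setting
	fmt.Println("GC cycles with the collector on: ", gcCyclesDuring(200))
}
```

With default settings the first count should be zero and the second non-zero, which matches the observation that the crash disappears and memory grows when GOGC=off.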

yang...@gmail.com

Feb 25, 2019, 3:10:19 AM
to golang-nuts
Thank you!

There is indeed a pointer passed between Go and cgo: Go creates a slice (pReceive) for the cgo wrapper, which reads data into it. It would be reasonable if the panics occurred there. However, what actually triggers the panic is the make-slice statements, i.e., cReceive := make([]C.struct__CAN_OBJ, 2500) and pReceive := make([]controlcan.CanObj, 2500).

The C definition of C.struct__CAN_OBJ:

typedef struct _VCI_CAN_OBJ {
    UINT ID;
    UINT TimeStamp;
    BYTE TimeFlag;
    BYTE SendType;
    BYTE RemoteFlag;
    BYTE ExternFlag;
    BYTE DataLen;
    BYTE Data[8];
    BYTE Reserved[3];
} VCI_CAN_OBJ, *PVCI_CAN_OBJ;

My Go wrapper controlcan.CanObj:

type CanObj struct {
    ID         int
    TimeStamp  int
    TimeFlag   int
    SendType   byte
    RemoteFlag byte
    ExternFlag byte
    DataLen    byte
    Data       [8]byte
    Reserved   [3]byte
}

My use case is reading data from the wrapper periodically.

package usbcan

import "controlcan"

func (c *Channel) receive() {
    ticker := time.NewTicker(100 * time.Millisecond)
    defer ticker.Stop()
    for range ticker.C {
        pReceive := make([]controlcan.CanObj, 2500) // unexpected fault address
        count, err := controlcan.Receive(
            c.DevType,
            c.DevIndex,
            c.CanIndex,
            pReceive,
            100,
        )
        // ...

The Receive() function in controlcan package:

func Receive(
    devType int,
    devIndex int,
    canIndex int,
    pReceive []CanObj,
    waitTime int,
) (count int, err error) {
    cReceive := make([]C.struct__VCI_CAN_OBJ, 2500) // unexpected fault address
    cCount := C.VCI_Receive(
        C.uint(devType),
        C.uint(devIndex),
        C.uint(canIndex),
        (*C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive)),
        C.uint(FRAME_LENGTH_OF_RECEPTION),
        C.int(waitTime),
    )
    gReceive := (*[1 << 30]C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive))[:count:count]
    // then convert elements in gReceive to pReceive
    // ...

My project is going to send instructions to devices concurrently. The device communication protocol is provided as .so or .dll libraries. I know it's complicated, but that makes it more fun, doesn't it? Fun, at least, in the early period. Recently I have come to hate the runtime and the GC a little bit ;)

Tamás Gulácsi

Feb 25, 2019, 3:46:18 AM
to golang-nuts
It should be

(*C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive[0]))

that is, a pointer to the first element of the slice, not to the slice (struct) itself!
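
A minimal sketch of that difference, in plain Go with no cgo involved. The helper names are hypothetical and not part of the code in this thread, and reading the first word of the slice header relies on the layout used by the standard gc toolchain.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerAndData returns two different addresses for the same slice:
// header is the address of the slice variable itself (the small
// descriptor holding data pointer, len and cap), data is the address
// of element 0 in the backing array.
func headerAndData(s *[]int32) (header, data unsafe.Pointer) {
	header = unsafe.Pointer(s)
	data = unsafe.Pointer(&(*s)[0])
	return header, data
}

// dataPtrFromHeader reads the first word of the slice header. With the
// standard gc toolchain that word is the data pointer (an implementation
// detail, used here only for illustration).
func dataPtrFromHeader(s *[]int32) unsafe.Pointer {
	return *(*unsafe.Pointer)(unsafe.Pointer(s))
}

func main() {
	s := make([]int32, 4)
	header, data := headerAndData(&s)
	fmt.Println("header and data are the same address:", header == data)
	fmt.Println("first header word points at data:    ", dataPtrFromHeader(&s) == data)
}
```

A C function that is handed the header address writes over the descriptor and whatever lies next to it, not into the backing array.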

yang...@gmail.com

Feb 25, 2019, 4:18:42 AM
to golang-nuts
I'm sorry for leaving out the details of the Receive() function. The variable gReceive (sorry for the silly name) is manipulated in Go, because some data conversions happen in the later loop.

Function Receive():

func Receive(
    devType int,
    devIndex int,
    canIndex int,
    pReceive []CanObj,
    waitTime int,
) (count int, err error) {
    cReceive := C.makeCanObjArray()
    defer C.freeCanObjArray(cReceive)
    cCount := C.VCI_Receive(
        C.uint(devType),
        C.uint(devIndex),
        C.uint(canIndex),
        (*C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive)),
        C.uint(FRAME_LENGTH_OF_RECEPTION),
        C.int(waitTime),
    )
    count = int(cCount)
    switch count {
    case -1:
        return count, fmt.Errorf("failed to receive data from invalid device")
    case 0:
        return count, nil
    }
    gReceive := (*[1 << 30]C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive))[:count:count]

    for i := 0; i < count; i++ {
        v := gReceive[i]
        var data [8]byte
        for j, d := range v.Data {
            data[j] = byte(d)
        }
        var reserved [3]byte
        for k, r := range v.Reserved {
            reserved[k] = byte(r)
        }
        pReceive[i] = CanObj{
            ID:         int(v.ID),
            TimeStamp:  int(v.TimeStamp),
            TimeFlag:   int(v.TimeFlag),
            SendType:   byte(v.SendType),
            RemoteFlag: byte(v.RemoteFlag),
            ExternFlag: byte(v.ExternFlag),
            DataLen:    byte(v.DataLen),
            Data:       data,
            Reserved:   reserved,
        }
    }
    return count, err
}

Thank you for the patience!

Tamás Gulácsi

Feb 25, 2019, 5:00:59 AM
to golang-nuts
You have to refer to the first element of the slice when using it as a C pointer-to-an-array!

    cReceive := make([]C.struct__VCI_CAN_OBJ, 2500) // unexpected fault address
    cCount := C.VCI_Receive(
        C.uint(devType),
        C.uint(devIndex),
        C.uint(canIndex),
        (*C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive[0])),
        C.uint(FRAME_LENGTH_OF_RECEPTION),
        C.int(waitTime),
    )
    gReceive := (*[1 << 30]C.struct__VCI_CAN_OBJ)(unsafe.Pointer(&cReceive[0]))[:count:count]
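
As a side note to the last line above: on Go 1.17 or later (newer than the toolchain in the stack trace), unsafe.Slice can replace the large-array cast. The sketch below assumes a hypothetical Go stand-in, canObj, for C.struct__VCI_CAN_OBJ so that it runs without cgo; viewFrames is likewise an illustrative name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// canObj is a hypothetical Go stand-in for C.struct__VCI_CAN_OBJ,
// assuming UINT is 32 bits and BYTE is 8 bits.
type canObj struct {
	ID         uint32
	TimeStamp  uint32
	TimeFlag   byte
	SendType   byte
	RemoteFlag byte
	ExternFlag byte
	DataLen    byte
	Data       [8]byte
	Reserved   [3]byte
}

// viewFrames returns a slice of length and capacity n over the memory
// starting at first. The caller must guarantee that at least n elements
// really exist there.
func viewFrames(first *canObj, n int) []canObj {
	if first == nil || n <= 0 {
		return nil
	}
	return unsafe.Slice(first, n)
}

func main() {
	buf := make([]canObj, 2500) // plays the role of the receive buffer
	buf[1].ID = 0x123
	frames := viewFrames(&buf[0], 2)
	fmt.Println(len(frames), cap(frames), frames[1].ID)
}
```

When the buffer is already a Go slice, as cReceive is in the snippet above, cReceive[:count] gives the same view without any unsafe conversion; the pointer-based form is only needed when the memory was allocated on the C side.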

yang...@gmail.com

Feb 26, 2019, 2:58:05 AM
to golang-nuts
Sorry for taking a while to reply, and thank you very much!

After testing over and over again this morning, the errors are gone, including the bad errors in "Does gc hate me?"

All I got in the stack trace was about the GC and the runtime, and it cost me many days of trying to fix them with mutexes or KeepAlive XD. I would never have figured this out without your help!

Let me take a guess. The pointer I passed to C was the address of the slice header of []C.struct__VCI_CAN_OBJ, and C then wrote the data received from the devices into the contiguous addresses starting at that pointer. If those addresses are not in use yet, e.g., at the beginning of execution, everything works fine. But after a while, when Go declares some variables or grows some slices and happens to hit these mistaken addresses, memory corruption is triggered, because both Go and C treat the addresses as available once the .dll or .so call has completed and C has released them. The probability is high, since the array has 2500 elements.
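
To put rough numbers on this guess, here is a sketch that compares the size of a slice header with the size of a full receive buffer. canObj is a hypothetical Go mirror of VCI_CAN_OBJ, assuming UINT is 32 bits and BYTE is 8 bits; sizes and wordSize are illustrative helper names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// canObj is a hypothetical Go mirror of VCI_CAN_OBJ.
type canObj struct {
	ID         uint32
	TimeStamp  uint32
	TimeFlag   byte
	SendType   byte
	RemoteFlag byte
	ExternFlag byte
	DataLen    byte
	Data       [8]byte
	Reserved   [3]byte
}

// wordSize returns the size of one machine word in bytes.
func wordSize() uintptr { return unsafe.Sizeof(uintptr(0)) }

// sizes returns the size of a slice header and the size of a buffer
// holding the given number of frames, both in bytes.
func sizes(frames int) (header, payload uintptr) {
	var s []canObj
	header = unsafe.Sizeof(s)
	payload = uintptr(frames) * unsafe.Sizeof(canObj{})
	return header, payload
}

func main() {
	header, payload := sizes(2500)
	fmt.Println("slice header bytes:", header)
	fmt.Println("bytes for 2500 frames:", payload)
}
```

The header is only three machine words, so if the library fills all 2500 frames starting at the header's address, nearly the whole write lands in memory that belongs to other Go objects. That would be consistent with the fault surfacing later inside the garbage collector rather than at the call itself.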

Thank you again!