unexpected fault address 0x0
fatal error: fault
[signal 0x7 code=0x80 addr=0x0 pc=0x408e5b]

goroutine 106 [running]:
runtime.throw(0x8bce98, 0x5)
	/home/carl/git/go/src/runtime/panic.go:527 +0x90 fp=0xc82003c9c8 sp=0xc82003c9b0
runtime.sigpanic()
	/home/carl/git/go/src/runtime/sigpanic_unix.go:21 +0x10c fp=0xc82003ca18 sp=0xc82003c9c8
runtime.mapassign1(0x7c4e20, 0xc820418480, 0xc82003cb20, 0xc82003cb60)
	/home/carl/git/go/src/runtime/hashmap.go:446 +0x2eb fp=0xc82003cae0 sp=0xc82003ca18
FuncA(0xc8203bdd50, 0xc8203dc000, 0xb, 0x0, 0x0)
	/path/to/package/under/test/funca.go:110 +0x1f9 fp=0xc82003cb80 sp=0xc82003cae0
GoroutineOfClosureInTestFunc(0xc82013d8c0, 0xc8200b45a0, 0xc82009eaa0, 0x31)
	/path/to/package/under/test/my_test.go:82 +0xdf7 fp=0xc82003cf90 sp=0xc82003cb80
runtime.goexit()
	/home/carl/git/go/src/runtime/asm_amd64.s:1696 +0x1 fp=0xc82003cf98 sp=0xc82003cf90
created by TestFunc
	/path/to/package/under/test/my_test.go:103 +0x174
panic: runtime error: makeslice: len out of range

goroutine 82 [running]:
FuncA(0xc8204fdd70, 0x7fe150388c18, 0xc82000a090, 0x0, 0x0, 0x0, 0x0, 0x0)
	/path/to/package/under/test/subpackage/filea.go:18 +0x92
FuncB(0xc820369960, 0x7fe150388c18, 0xc82000a090, 0x0, 0x0)
	/path/to/package/under/test/fileb.go:92 +0xa3
FuncC(0xc820369960, 0xc82011e020, 0x1d, 0x1d, 0x0, 0x0)
	/path/to/package/under/test/fileb.go:125 +0x14e
FuncD(0xc820369960, 0xc82011e020, 0x1d, 0x1d, 0x0, 0x0)
	<autogenerated>:29 +0x7a
GoroutineServerRecvLoop(0xc8200a29b0, 0x7fe151bc7138, 0xc8201b80c0, 0x7fe151bc7178, 0xc8201ba000)
	/path/to/package/under/test/channel.go:195 +0x2fe
created by /path/to/package/under/test.(*Channel).connLoop
	/path/to/package/under/test/channel.go:110 +0x40b
type Thing struct {
	Length int
}

func (t *Thing) FuncA(reader io.Reader) ([]byte, error) {
	data := make([]byte, t.Length)
	_, err := io.ReadFull(reader, data)
	return data, err
}
channel_test.go:89: len(data)=2 != field.length=859534820080
func (t *Thing) Write(writer io.Writer, data []byte) error {
	if len(data) != t.Length {
		return fmt.Errorf("len(data)=%d != thing.length=%d", len(data), t.Length)
	}
	_, err := writer.Write(data)
	return err
}
The value 859534820080 is 0xC8204482F0 in hex - looks like an address. If t.Length is the first word in the struct (as you indicate), it could be that the memory for t has been reclaimed, leaving the t pointer dangling.

I ran into heap corruption in golang.org/issue/11643 - my understanding is that when memory is reclaimed, the first word of the object is overwritten by a pointer to the next free piece of memory and the second word (if applicable) is overwritten with 0xdeaddeaddeaddead to indicate that the remainder of the object needs to be zeroed before use.

Things that helped us debug were 1) running with GODEBUG=gccheckmark=1 and 2) looking through the code at *everything* that touched the problematic memory location (or held pointers to it). In our application, a pointer to the object was passed on an unbuffered channel via a select statement.

HTH,
Rhys
I've bumped the number of goroutines in the problematic stress test from 50 to 500, and now have a 64-bit Linux test binary that pretty reliably (~90% of runs) triggers one of the three weird behaviors. I've been playing around with GODEBUG flags per your suggestion, and have found the following:
- gcstoptheworld=1 reliably eliminates the errors
- gccheckmark=1 reliably fails with an error like the following:
runtime:greyobject: checkmarks finds unexpected unmarked object obj=0xc820129bd0
runtime: found obj at *(0xc820371d50+0x8)
base=0xc820371d50 k=0x64101b8 s.start*_PageSize=0xc820370000 s.limit=0xc820371ff0 s.sizeclass=8 s.elemsize=112
*(base+0) = 0x7f36649ca9e8
*(base+8) = 0xc820129bd0 <==
*(base+16) = 0xc82005d3b0
*(base+24) = 0x7f366407d000
*(base+32) = 0xabbb80
*(base+40) = 0xc82005d3e0
*(base+48) = 0xc820129be0
*(base+56) = 0x2
*(base+64) = 0x8
*(base+72) = 0xabc558
*(base+80) = 0x0
*(base+88) = 0x0
*(base+96) = 0xc820321a70
*(base+104) = 0x0
obj=0xc820129bd0 k=0x6410094 s.start*_PageSize=0xc820128000 s.limit=0xc82012a000 s.sizeclass=2 s.elemsize=16
*(obj+0) = 0x2
*(obj+8) = 0x0
fatal error: checkmark found unmarked object
I can't reproduce on Darwin at least. How long does it take? Any special config required?
$ go version
go version devel +fced03a Thu Aug 6 02:59:16 2015 +0000 darwin/amd64
$ time GODEBUG=gccheckmark=1 go run -race heavy.go
runtime:greyobject: checkmarks finds unexpected unmarked object obj=0xc82058c550
runtime: found obj at *(0xc82058c560+0x8)
base=0xc82058c560 k=0x64102c6 s.start*_PageSize=0xc82058c000 s.limit=0xc82058e000 s.sizeclass=2 s.elemsize=16
*(base+0) = 0xc82058c540
*(base+8) = 0xc82058c550 <==
obj=0xc82058c550 k=0x64102c6 s.start*_PageSize=0xc82058c000 s.limit=0xc82058e000 s.sizeclass=2 s.elemsize=16
*(obj+0) = 0x0
*(obj+8) = 0x0
fatal error: checkmark found unmarked object
[snip]
exit status 2
real 0m2.712s
user 0m3.360s
sys 0m0.629s
$ time GODEBUG=gccheckmark=1 go run heavy.go
^Cexit status 2
real 1m25.349s
user 3m25.085s
sys 0m4.475s
I tried bisecting it twice, once with a test timeout of 10s and once with a test timeout of 60s. They both pointed me to different commits, and the previous commit (the "last good" commit) still failed when I ran it again.
Damian