Hi all,
We develop an open source program for consumers that is fairly widely used within its niche, on a mix of operating systems and platforms. Recently we enabled crash reporting to get panic traces back from cooperating users. With that we've
discovered a bunch of panics of our own creation, plus a lot of noise in the form of fatal errors outside of our control -- typically users running out of memory or threads.
There remains a lot of "unexplained" oddness however, some of which I'm sure is attributable to hardware errors (bad RAM/CPU/etc). It's hard to be sure either way, but we get a lot of stacks. The list below is a (probably non-exhaustive) selection
of crashes from the last week or so that strike me as odd:
- fatal error: defer on system stack
- fatal error: unexpected signal during runtime execution
- fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?) (this could be ours; we have no cgo, but I'm sure there is unsafe somewhere deep in the dependencies)
- fatal error: gc: unswept span
- fatal error: malloc deadlock
- fatal error: mSpanList.insertBack
- fatal error: non in-use span in unswept list
- fatal error: out of memory allocating heap arena metadata (I guess this is just a niche case of OOM)
- fatal error: runtime: stack split at bad time
- fatal error: runtime.newosproc (out of threads?)
- fatal error: runtime·unlock: lock count
- fatal error: s.allocCount != s.nelems && freeIndex == s.nelems
- fatal error: slice bounds out of range (deep in the malloc code)
- fatal error: stopm holding locks
- fatal error: sweep increased allocation count
- fatal error: sync: inconsistent mutex state
- fatal error: wirep: invalid p state
- panic: sync: inconsistent mutex state
I'm not going to spend any energy hunting these down or pester anyone with bug reports, especially as I have no idea who the originating user is and no way to communicate with them or experiment. :) However, if any of you out there think
"Huh? That GC error should never happen, wonder what's going on?", I would be happy to forward a batch of reports for that particular crash or provide access to the crash database for searching.
(A limitation of our crash reporting is that output prior to the panic/fatal error is trimmed as potentially sensitive user data. This means we miss the description that some fatal-error crashes print before the "fatal error:" line. We might fix
this at some point.)
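
For the curious, the trimming amounts to roughly the following. This is a minimal sketch and not our actual code: the function name trimToCrash and the exact prefixes matched are assumptions made for illustration. It keeps everything from the first "panic: " or "fatal error: " line onwards and drops what came before, which is why any detail the runtime prints ahead of that line is lost.

```go
package main

import (
	"fmt"
	"strings"
)

// trimToCrash is a hypothetical helper (not taken from the real reporter).
// It returns the output starting at the first line that begins with
// "panic: " or "fatal error: ", discarding everything printed before it.
// If no such line is found it returns the empty string.
func trimToCrash(output string) string {
	// SplitAfter keeps the trailing newline on each line, so joining the
	// remaining lines reproduces the original text exactly.
	lines := strings.SplitAfter(output, "\n")
	for i, line := range lines {
		if strings.HasPrefix(line, "panic: ") || strings.HasPrefix(line, "fatal error: ") {
			return strings.Join(lines[i:], "")
		}
	}
	return ""
}

func main() {
	// Made-up sample output: a user log line, a runtime detail line,
	// then the fatal error and the start of the stack dump.
	out := "user log line with a private path\n" +
		"runtime: some detail printed before the crash\n" +
		"fatal error: gc: unswept span\n" +
		"\n" +
		"goroutine 1 [running]:\n"
	fmt.Print(trimToCrash(out))
}
```

In this sample the "runtime: ..." line is discarded along with the user log line, which is the limitation mentioned above.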
//jb