The immediate cause is clearly two threads using the same gp.schedlink (used by gQueue/gList). In the Gdead case, gfget is returning a g from the local p.gFree gList. There is no corresponding gfput, so the most likely cause is a corrupt gp.schedlink on an earlier entry in the list.
Here's a potential scenario where I think this can go wrong:
1. P1: call runqdrain, read head / tail.
2. P1: Push G1 to drainQ, setting g1.schedlink = 0, drainQ.head = g1.
3. P2: Call runqsteal, read head / tail, steal G1, update head.
4. P2: Execute G1, which exits.
5. P2: In goexit0, call p.gFree.push(g1), which sets g1.schedlink = p.gFree.head.
6. P1: Push G2 to drainQ, setting g2.schedlink = 0, drainQ.head.schedlink (== g1.schedlink) = g2.
7. P1: Attempt to CAS update head, which fails due to P2's steal.
8. P1: Retry, this time only draining G2.
runqdrain thinks it was successful, but step 6 is the critical error. After step 5, g1 is the head of P2's p.gFree, so writing g1.schedlink = g2 in step 6 links g2 onto P2's p.gFree list (and drops whatever g1 previously pointed to). From then on, a live goroutine can be erroneously returned from gfget as a "dead" G.
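To make the interleaving concrete, here is a minimal single-threaded replay of steps 4 to 6. It is only a sketch: the `g`, `gQueue`, and `gList` types below are simplified stand-ins written for this example (plain pointers instead of guintptr, no atomics), `g0` is a hypothetical G that was already sitting on P2's free list, and `replay` is a made-up helper, not runtime code.

```go
package main

import "fmt"

// g is a simplified stand-in for the runtime's g; only the intrusive
// schedlink field is modeled.
type g struct {
	name      string
	schedlink *g
}

// gQueue is a simplified FIFO linked through schedlink.
type gQueue struct {
	head, tail *g
}

func (q *gQueue) pushBack(gp *g) {
	gp.schedlink = nil
	if q.tail != nil {
		q.tail.schedlink = gp // writes into the previous tail's schedlink
	} else {
		q.head = gp
	}
	q.tail = gp
}

// gList is a simplified LIFO linked through schedlink.
type gList struct {
	head *g
}

func (l *gList) push(gp *g) {
	gp.schedlink = l.head
	l.head = gp
}

func (l *gList) pop() *g {
	gp := l.head
	if gp != nil {
		l.head = gp.schedlink
	}
	return gp
}

// replay performs steps 4-6 in order and returns the names of the Gs
// that P2's free list hands out afterwards.
func replay() []string {
	g0 := &g{name: "g0"} // hypothetical G already dead on P2's free list
	g1 := &g{name: "g1"}
	g2 := &g{name: "g2"} // still live: it is only sitting in P1's drainQ

	var gFree gList // stands in for P2's p.gFree
	gFree.push(g0)

	var drainQ gQueue   // P1's private queue inside runqdrain
	drainQ.pushBack(g1) // step 4
	gFree.push(g1)      // step 5: g1.schedlink = g0
	drainQ.pushBack(g2) // step 6: g1.schedlink = g2, overwriting g0

	var names []string
	for gp := gFree.pop(); gp != nil; gp = gFree.pop() {
		names = append(names, gp.name)
	}
	return names
}

func main() {
	fmt.Println(replay()) // prints "[g1 g2]"
}
```

In this replay the free list yields g1 and then g2, and g0 is no longer reachable from it. That matches the symptom above: a G that is still queued to run comes back from the free list as if it were dead.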
I'd have to think about it a bit more, but I think the correct summary is that gp.schedlink may only be used by a caller that fully owns the G. runqdrain does not fully own the Gs it puts in the queue until it successfully CAS updates head, so writing to their schedlink before that point is unsafe. Note that runqgrab doesn't have this problem because it places Gs into an array (actually another runq) rather than using a gQueue / gList.
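As an illustration of that ownership rule (not a proposed patch), a drain could copy the candidate Gs into a private slice first and only touch schedlink once the CAS on head has succeeded. Everything below is an assumption made for the sketch: `runq` is a toy ring buffer, `put` and `drain` are hypothetical helpers, and the example is exercised single-threaded.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// g is a simplified stand-in for the runtime's g.
type g struct {
	name      string
	schedlink *g
}

// runq is a toy ring buffer; consumers claim entries by CASing head.
type runq struct {
	head uint32
	tail uint32
	buf  [8]*g
}

// put appends gp; it assumes a single producer and a non-full queue.
func (q *runq) put(gp *g) {
	t := atomic.LoadUint32(&q.tail)
	q.buf[t%uint32(len(q.buf))] = gp
	atomic.StoreUint32(&q.tail, t+1)
}

// drain claims every queued G and returns them linked through
// schedlink. No G is written to until the CAS has transferred ownership.
func (q *runq) drain() *g {
	for {
		h := atomic.LoadUint32(&q.head)
		t := atomic.LoadUint32(&q.tail)
		n := t - h
		if n == 0 {
			return nil
		}
		if n > uint32(len(q.buf)) {
			continue // inconsistent read of head and tail; retry
		}
		// Copy pointers only; schedlink is left untouched here.
		batch := make([]*g, 0, n)
		for i := uint32(0); i < n; i++ {
			batch = append(batch, q.buf[(h+i)%uint32(len(q.buf))])
		}
		if !atomic.CompareAndSwapUint32(&q.head, h, h+n) {
			continue // lost a race with a stealer; no G was modified
		}
		// The CAS succeeded, so this caller now owns every G in batch.
		for i, gp := range batch {
			gp.schedlink = nil
			if i+1 < len(batch) {
				gp.schedlink = batch[i+1]
			}
		}
		return batch[0]
	}
}

func main() {
	q := &runq{}
	q.put(&g{name: "g1"})
	q.put(&g{name: "g2"})
	for gp := q.drain(); gp != nil; gp = gp.schedlink {
		fmt.Println(gp.name)
	}
}
```

If the CAS fails, the loop retries without having written to any G, so a G that was stolen and freed in the meantime keeps whatever schedlink its new owner gave it.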