print* in runtime running into deadlocks

203 views
Skip to first unread message

Xiangdong Ji

unread,
Feb 22, 2023, 9:43:10 PM2/22/23
to golang-nuts
Hi experts,

Looks like the print* helper functions may run into deadlocks if the printer Goroutine is rescheduled to a different M to execute 'printunlock' as the "if mp.printlock==0" checking is likely to fail if M has changed.

Wondering is it intentional behavior? As print* are (almostly if not always) used in runtime during STW or guarded by high level locks.

I didn't figure out how to produce a deadlock by user testcase, but the change below could easily deadlock 'go_bootstrap' when running './src/make.bash'.

 A similar issue was opened in https://github.com/golang/go/issues/54786, but no follow-up for months.

Thanks.

```
diff --git a/src/runtime/mgcsweep.go b/src/runtime/mgcsweep.go
index 6ccf090ac5..3336823dc0 100644
--- a/src/runtime/mgcsweep.go
+++ b/src/runtime/mgcsweep.go
@@ -155,6 +155,7 @@ func (a *activeSweep) begin() sweepLocker {
                        return sweepLocker{mheap_.sweepgen, false}
                }
                if a.state.CompareAndSwap(state, state+1) {
+                       println("activeSweep.begin, old:", state, ",new:", state+1)
                        return sweepLocker{mheap_.sweepgen, true}
                }
        }
@@ -172,6 +173,7 @@ func (a *activeSweep) end(sl sweepLocker) {
                        throw("mismatched begin/end of activeSweep")
                }
                if a.state.CompareAndSwap(state, state-1) {
+                       println("activeSweep.end, old:", state, ",new:", state-1)
                        if state != sweepDrainedMask {
                                return
                        }
```

Ian Lance Taylor

unread,
Feb 23, 2023, 6:47:08 PM2/23/23
to Xiangdong Ji, golang-nuts
On Wed, Feb 22, 2023 at 6:43 PM Xiangdong Ji <xiangd...@gmail.com> wrote:
>
> Looks like the print* helper functions may run into deadlocks if the printer Goroutine is rescheduled to a different M to execute 'printunlock' as the "if mp.printlock==0" checking is likely to fail if M has changed.

Have you seen this happen? It ought to be impossible, because
printlock increments mp.locks. When mp.locks is not zero the M can't
be preempted and the goroutine can't block.

Ian


> Wondering is it intentional behavior? As print* are (almostly if not always) used in runtime during STW or guarded by high level locks.
>
> I didn't figure out how to produce a deadlock by user testcase, but the change below could easily deadlock 'go_bootstrap' when running './src/make.bash'.
>
> A similar issue was opened in https://github.com/golang/go/issues/54786, but no follow-up for months.
>
> Thanks.
>
> ```
> diff --git a/src/runtime/mgcsweep.go b/src/runtime/mgcsweep.go
> index 6ccf090ac5..3336823dc0 100644
> --- a/src/runtime/mgcsweep.go
> +++ b/src/runtime/mgcsweep.go
> @@ -155,6 +155,7 @@ func (a *activeSweep) begin() sweepLocker {
> return sweepLocker{mheap_.sweepgen, false}
> }
> if a.state.CompareAndSwap(state, state+1) {
> + println("activeSweep.begin, old:", state, ",new:", state+1)
> return sweepLocker{mheap_.sweepgen, true}
> }
> }
> @@ -172,6 +173,7 @@ func (a *activeSweep) end(sl sweepLocker) {
> throw("mismatched begin/end of activeSweep")
> }
> if a.state.CompareAndSwap(state, state-1) {
> + println("activeSweep.end, old:", state, ",new:", state-1)
> if state != sweepDrainedMask {
> return
> }
> ```
>
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/e48221e3-0709-49e1-8cbd-314044558c56n%40googlegroups.com.
Message has been deleted

Xiangdong Ji

unread,
Feb 27, 2023, 2:02:48 AM2/27/23
to golang-nuts
>> It ought to be impossible, because printlock increments mp.locks. 

Thanks, realized it's NOT a problem with printlock/printunlock.

 
>> Have you seen this happen?
Not yet with normal program, but a deadlock could be produced with the diagnostic change to mgcsweep.go (say, running ./make.bash will likely deadlock
in go_bootstrap), turns out it involves 'mheap_.lock' and 'debuglock' when print* triggers copystack.

The scenario may look like:
Goroutine-A: printlock -> hold 'debuglock' -> printing -> trigger copystack -> try to allocate -> acquiring 'mheap_.lock'
Goroutine-B: hold 'mheap_.lock' -> allocating -> running into activeSweep's diagnostics -> acquiring 'debuglock'
Reply all
Reply to author
Forward
0 new messages