runtime deadlock with callbacks from Windows thread pool

Aaron Klotz

Aug 5, 2022, 3:49:00 PM
to golang-dev
Hi everybody,

I'm working on adding support for Windows' asynchronous DNS query APIs to the net package in the standard library.

When these APIs complete asynchronously, they post their completion events to the Windows thread pool. Unfortunately, when these events call into Go, they trigger a deadlock if the initial invocation of the API originated from the main package's init() function. A simplified test case for this scenario is available at [0].

Since the callback is arriving on an OS thread that has not previously entered Go, its stub is waiting for main's init to complete before proceeding [1]. However, the main goroutine is blocked because the API is waiting on the callback [2][3].

Your first question is probably, "Why would you ever even try to call something like that from within main's init()?" Actually I wouldn't, but there is a net.Dial that is effectively triggered from within main's init() as part of the Go test suite! [4]

Now for my actual question: Is this deadlock considered to be expected behaviour on the part of runtime, or is this something that needs to be fixed?

Thanks,

Aaron

[0] https://gist.github.com/dblohm7/e644522feb753cf26d19f4ed20e9c320
[1] https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/runtime/cgocall.go;l=293
[2] https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/runtime/proc.go;l=233
[3] https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/runtime/proc.go;l=239
[4] https://cs.opensource.google/go/go/+/refs/tags/go1.19:src/runtime/testdata/testprognet/net.go

Michael Pratt

Aug 5, 2022, 4:09:13 PM
to Aaron Klotz, golang-dev
On Fri, Aug 5, 2022 at 3:48 PM 'Aaron Klotz' via golang-dev <golan...@googlegroups.com> wrote:
> Now for my actual question: Is this deadlock considered to be expected behaviour on the part of runtime, or is this something that needs to be fixed?

We allow concurrent execution with init() functions (i.e., you can start goroutines from init functions and they will run immediately), so (ignoring the implementation complexity of this) it seems reasonable that concurrency via C threads should be allowed as well, provided that the only reachable code is from packages that have completed init.
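
As a minimal illustration of the point about goroutines and init functions, the sketch below starts a goroutine from `init` and blocks until it has run. The variable names `ready` and `order` are made up for this example.

```go
package main

import "fmt"

// ready is closed by the goroutine that init starts.
var ready = make(chan struct{})

// order records the sequence in which the steps ran.
var order []string

func init() {
	go func() {
		order = append(order, "goroutine")
		close(ready)
	}()
	// Blocking here does not hang: the goroutine is scheduled while
	// init is still running, and closing ready synchronizes the two.
	<-ready
	order = append(order, "init done")
}

func main() {
	order = append(order, "main")
	fmt.Println(order) // [goroutine init done main]
}
```

The same wait performed on a callback arriving on a C-created thread is what hangs in the scenario under discussion.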

Note that https://go.dev/issue/15943 discusses removing `main_init_done`, but just by changing the mechanism, not the behavior.

 

Ian Lance Taylor

Aug 5, 2022, 5:01:10 PM
to Aaron Klotz, golang-dev
On Fri, Aug 5, 2022 at 12:48 PM 'Aaron Klotz' via golang-dev
<golan...@googlegroups.com> wrote:
> Now for my actual question: Is this deadlock considered to be expected behaviour on the part of runtime, or is this something that needs to be fixed?

Everything you wrote above sounds like expected behavior. When a C
program is linked against Go code, the Go initialization is run in a
separate thread, so that Go initialization doesn't delay the C program
startup. This means that when C code calls into Go code, we have a
check to make sure that the Go code is fully initialized. The way
that we detect C code calling Go code is that the thread was created
by C.

This is breaking in your case because you have Go code that is run at
init time that causes a thread created by C to call Go code. There
are other ways to trigger this deadlock, like having a Go init
function call a C function that creates a thread that calls a Go
function and then waits for the thread to complete. We decided that
that was bizarre enough not to worry about.

I don't know what the Windows thread pool is. I don't know where the
callback lives. Would it be possible to do something like start a
goroutine that waits for Windows, and have Windows code that passes
information to that goroutine using atomic memory stores rather than
via a function call?

We could also of course add a special purpose hook here: define a
function that can be called by C code that can call into Go code
without waiting for initialization to be complete.

Ian