How to check Cmd.ProcessState without triggering the race checker?

TheDiveO

unread,

Jul 23, 2022, 4:39:17 PM7/23/22

to golang-nuts

In my open source file descriptor leak checker (Linux only) for Gomega, Go's race detector flags a race condition in my code where I need to check whether a Cmd.Process already has terminated as its ProcessState has been set.

if session.Command.ProcessState != nil {

return nil, errors.New("session has already ended")

}

Now, this check might be called be called from a different Goroutine than another Goroutine waiting for the command to exit. In particular, this is the (what I consider to be the relevant) output of the race detector:

WARNING: DATA RACE
Write at 0x00c0002e8208 by goroutine 16:
os/exec.(*Cmd).Wait()
/home/.../sdk/go1.19rc2/src/os/exec/exec.go:598 +0x1eb
github.com/onsi/gomega/gexec.(*Session).monitorForExit()
/home/.../go/pkg/mod/github.com/onsi/gom...@v1.20.0/gexec/session.go:197 +0x44
github.com/onsi/gomega/gexec.Start.func1()
/home/.../go/pkg/mod/github.com/onsi/gom...@v1.20.0/gexec/session.go:96 +0x47

Previous read at 0x00c0002e8208 by goroutine 12:
github.com/thediveo/fdooze/session.FiledescriptorsFor()
/home/.../github/fdooze/session/session_fds.go:21 +0x1b7
github.com/thediveo/fdooze/session.glob..func1.1.1()
/home/.../github/fdooze/session/session_fds_test.go:45 +0x30
runtime.call16()
/home/.../sdk/go1.19rc2/src/runtime/asm_amd64.s:724 +0x48
reflect.Value.Call()
/home/.../sdk/go1.19rc2/src/reflect/value.go:368 +0xc7
github.com/onsi/gomega/internal.NewAsyncAssertion.func1()
/home/.../go/pkg/mod/github.com/onsi/gom...@v1.20.0/internal/async_assertion.go:48 +0x16e
github.com/onsi/gomega/internal.(*AsyncAssertion).pollActual()
/home/.../go/pkg/mod/github.com/onsi/gom...@v1.20.0/internal/async_assertion.go:132 +0x82
github.com/onsi/gomega/internal.(*AsyncAssertion).match()
/home/.../go/pkg/mod/github.com/onsi/gom...@v1.20.0/internal/async_assertion.go:162 +0xe4
github.com/onsi/gomega/internal.(*AsyncAssertion).ShouldNot()
/home/.../go/pkg/mod/github.com/onsi/gom...@v1.20.0/internal/async_assertion.go:112 +0xa9
github.com/thediveo/fdooze/session.glob..func1.1()
/home/.../github/fdooze/session/session_fds_test.go:65 +0xf41
github.com/onsi/ginkgo/v2/internal.(*Suite).runNode.func2()
/home/.../go/pkg/mod/github.com/onsi/ginkgo/v...@v2.1.4/internal/suite.go:596 +0xea

Is there actually any way to check use Cmd.ProcessState at all from a thread different from the one doing a Cmd.Wait? Might this be a design "weakness" of the Cmd type in that it doesn't cover such use cases?

Any idea for a work around to find if the command might already have (prematurely) finished by the time I try to check for its open file descriptors?

Ian Lance Taylor

unread,

Jul 24, 2022, 5:26:57 PM7/24/22

to TheDiveO, golang-nuts

The ProcessState field is documented as only being set when the Wait
method returns. So you should only check it after calling Wait.
Perhaps your program needs some other synchronization mechanism to let
it know when Wait has been called.

Ian

p...@morth.org

unread,

Jul 25, 2022, 3:56:14 AM7/25/22

to golang-nuts

Hi,

I think you need to restructure your code, because what you're actually doing is checking whether the pid is valid or not.

Since the pid is made invalid inside the Wait call (in the kernel), before ProcessState is set, there's no way to do that safely.

Instead, you should avoid calling Wait until you're done using the pid. You'll have to use some other method for detecting if the process has exited. On newer Linux versions, the preferred way would be a pidfd obtained via pidfd_open.

Regards,

Per Johansson

TheDiveO

unread,

Jul 25, 2022, 1:24:14 PM7/25/22

to golang-nuts

Per,

I fully agree for the case where you have full control over the code architecture ... in such a case the only way is to stop digging and getting out of the hole. Or have a party leadership contest instead ... just [j|ch]oking.

In this particular situation I don't have control over the Gomega TDD session package and how it deploys Wait in a separate goroutine. On its own, I consider Gomega's design to be fine here. It is just when bringing file descriptor checks to a session/exec.Cmd where my issue comes up and where I can neither change Gomega's session Wait nor Go's exec.Cmd. The fd leak checks will be called potentially multiple times at seemingly arbitrary times for the session under test, while the session under test might throw its tantrum at any time if it pleases to do so.

After some more pondering, I opted for Python's "try first, beg forgiveness later" strategy by throwing out the problematic check and trying to read the PID's fd directory anyway. If the PID is still valid that will succeed, if its gone that moment, it'll fail. Catching that error and reporting a more test user-friendly error instead makes this more robust, as well as user friendly.

I like your idea of pidfd_open, but after pondering it for a long time I don't think that it does help in this peculiar use case. It even complicates fd checking in several ways, as I will need to sort it out from leak checking and at the same time attempting to not leak it itself. This might quickly turn against be for little gain. On top of this, https://man7.org/linux/man-pages/man2/pidfd_open.2.html points out that when the child gets reaped then the pidfd becomes invalid. Nevertheless, I really appreciate you bringing up pidfd_open!

On a side note: I still cannot get over the fact that the fd returned by pidfd_open call can be used with setns as pointed out by someone else to me.

-- TheDiveO

Steven Hartland

unread,

Jul 25, 2022, 7:28:45 PM7/25/22

to TheDiveO, golang-nuts

Looks like session already has what you need as you can check if session.Exited has been closed something like:

select {

case <- session.Exited:
// Handle session has ended

default:

// Handle session hasn't ended
}

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/8ed84076-b29d-4840-8c15-09621e2fca61n%40googlegroups.com.

TheDiveO

unread,

Jul 26, 2022, 7:56:39 AM7/26/22

to golang-nuts

Hi Steven,

correct, I only noticed this later but in the end I decided to not make use of it. The reason is that there still is no atomicity and no locking, so the Cmd.Process can do whatever it want, especially playing the Blue Norwegian.

Reply all

Reply to author

Forward