signal.Notify + exec.CommandContext + cancel = erratic results

109 views
Skip to first unread message

Art Mellor

unread,
Sep 19, 2017, 10:22:00 AM9/19/17
to golang-nuts

I have a program that is working on my Mac reliably, but gives erratic
results on other  (linux) systems. It maybe something wrong with my code,
but it not obvious to me at  all what I might have done wrong.

You can see (but not run in playground) the code here (copy to your
system to try):

https://play.golang.org/p/w_PG50vz_f

All done with go version go1.9

What it does is launch 3 go routines that exec a sleep call with a
context. Then the cancel routines are called, either by a timer or
SIGINT handler.

The timer fires before the sleep call and cancels all the
execs and prints out the context error values ("context canceled").

***This works everywhere I've tested.***

There is also a SIGINT handler that cancels all the execs and prints
out the context error values.

When hitting ^C (after the 'waiting...' output) I get erratic behavior
where the context errs print out as nil most (but not all) of the
time, but sometimes 1 or more print out as "context canceled".

My Mac (OSX 10.12.6 Sierra) seems to work properly all the time.

Systems I have tested where it has had problems:

"real" Ubuntu 14.04
Docker Ubuntu 14.04 and 16.04 on "real" Ubuntu 14.04
VMWare Ubuntu 16.04, 14.04, 12.04
AWS instance Ubuntu 14.04

Output without SIGINT (all tested platforms - expected output):

% /tmp/exectest
waiting...
cancel 0
cancel 1
cancel 2
ctx.err 2 = "context canceled"
ctx.err 0 = "context canceled"
ctx.err 1 = "context canceled"
...done

Output on Mac with SIGINT (always seems to work):

% ./exectest
waiting...
^C! interrupt received !
cancel 0
cancel 1
cancel 2
ctx.err 0 = "context canceled"
ctx.err 1 = "context canceled"
ctx.err 2 = "context canceled"
...done

Example output on other systems with SIGINT (erratic):

% ./exectest
waiting...
^C! interrupt received !
cancel 0
cancel 1
cancel 2
ctx.err 0 = "context canceled"
ctx.err 1 = "context canceled"
ctx.err 2 = "context canceled"
...done

% ./exectest
waiting...
^C! interrupt received !
cancel 0
cancel 1
cancel 2
ctx.err 2 = %!q(<nil>)
ctx.err 0 = %!q(<nil>)
ctx.err 1 = %!q(<nil>)
...done

% ./exectest
waiting...
^Cctx.err 0 = %!q(<nil>)
! interrupt received !
cancel 0
ctx.err 2 = %!q(<nil>)
cancel 1
cancel 2
ctx.err 1 = %!q(<nil>)
...done

% ./exectest
waiting...
^C! interrupt received !
cancel 0
cancel 1
ctx.err 1 = %!q(<nil>)
ctx.err 2 = %!q(<nil>)
cancel 2
ctx.err 0 = "context canceled"
...done

Ian Lance Taylor

unread,
Sep 19, 2017, 10:50:04 AM9/19/17
to Art Mellor, golang-nuts
On Tue, Sep 19, 2017 at 7:22 AM, Art Mellor
<art-goo...@dontsharemyemail.com> wrote:
>
> I have a program that is working on my Mac reliably, but gives erratic
> results on other (linux) systems. It maybe something wrong with my code,
> but it not obvious to me at all what I might have done wrong.

Have you run your program under the race detector? At a quick glance
it looks like you have races on cmds between the main goroutine and
the signal goroutine.

Ian

Art Mellor

unread,
Sep 19, 2017, 11:11:58 AM9/19/17
to golang-nuts
Sorry - I had done it with the real stuff I'm working on (which is properly mutex'd), but I just added some locks to this one and it still happens.

New code:
https://play.golang.org/p/koUAZOcDjV

Appears to run clean now - still failing on ^C

Art Mellor

unread,
Sep 19, 2017, 2:02:50 PM9/19/17
to golang-nuts
One additional data point. If I add the following line right after 'r.cmd.Run()' in RunMe.Run, it works:

    time.Sleep(1 * time.Nanosecond)

It seems that the context error takes a moment to be filled in properly...

Tamás Gulácsi

unread,
Sep 19, 2017, 2:51:50 PM9/19/17
to golang-nuts
```
<-ctx.Done()
err := ctx.Err()
```
just before printing out the error.

Art Mellor

unread,
Sep 19, 2017, 3:34:39 PM9/19/17
to golang-nuts
[Tamas - re-answering on the list]

This works if the sleepDur is greater than the cancelTimeout value, but if cancelTimeout is greater, it blocks waiting until the timeout fires rather than ending after all go routines are done.

Hmm...

Art Mellor

unread,
Sep 19, 2017, 6:50:54 PM9/19/17
to Tamás Gulácsi, golang-nuts
Spoke too soon - if you make cancelTimeout bigger than sleepDur, it doesn't end after sleepDur, it blocks until the cancelTimeout fires.

I'm going to stew on this some more.


On Tue, Sep 19, 2017 at 3:20 PM, Art Mellor <a...@priestlyemus.com> wrote:
Yes! That was what I needed - thanks!

--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/OzIpdnCi_LU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Art Mellor

unread,
Sep 19, 2017, 6:50:54 PM9/19/17
to Tamás Gulácsi, golang-nuts
Yes! That was what I needed - thanks!
On Tue, Sep 19, 2017 at 2:51 PM, Tamás Gulácsi <tgula...@gmail.com> wrote:
```
Reply all
Reply to author
Forward
0 new messages