SIGTERM not intercepted


Madscientist Microneil

Mar 23, 2021, 11:47:33 PM
to golang-nuts
Hi,

I've got a REST endpoint written in Go, and it communicates with various child processes to get its work done.

I've used 

// Set up to handle signals so we can stop when asked
done := make(chan os.Signal, 3)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)

and 

<-done

to capture SIGTERM and a few others so that, when the server is being shut down for maintenance, it cleanly finishes the requests it is working on before it exits.

Unfortunately, it seems that the child processes receive SIGTERM right away and die, so the requests that are in flight end up broken.

Did I miss a step?

Thanks,
_M

Kurtis Rader

Mar 23, 2021, 11:59:49 PM
to Madscientist Microneil, golang-nuts
It sounds as if you might be running your "REST endpoint" program from an interactive shell (i.e., a terminal) and sending SIGTERM by pressing something like Ctrl-C. Interactive job control, which includes how signals generated by a "terminal" are handled, is a complex topic. So we need to know how you are sending SIGTERM to your process. Is it by something like the `kill` command or syscall to a specific PID or by pressing keys on your terminal to terminate the process?

--
Kurtis Rader
Caretaker of the exceptional canines Junior and Hank

madscientist

Mar 24, 2021, 9:32:52 AM
to golang-nuts
On 3/23/21 11:59 PM, Kurtis Rader wrote:
> It sounds as if you might be running your "REST endpoint" program from an interactive shell (i.e., a terminal) and sending SIGTERM by pressing something like Ctrl-C. Interactive job control, which includes how signals generated by a "terminal" are handled, is a complex topic. So we need to know how you are sending SIGTERM to your process. Is it by something like the `kill` command or syscall to a specific PID or by pressing keys on your terminal to terminate the process?

The application is installed as a service and managed via systemctl start/stop.

There are two types of children. One type simply runs and then exits. It's run like this:

func extractFeatures(s session.Session) (string, error) {
	output, err := exec.Command(
		"./signatureml",
		//"-debug",
		"-model",
		"theModel",
		"--no-csv-header",
		"-csv",
		s.ArtifactsPath(),
		s.MessagePath()).CombinedOutput()
	return string(output), err
}

The other type is long-running and communicates via stdin/stdout. It's run like this:


type LongRunningPredictor struct {
	theCommand      *exec.Cmd
	input           io.WriteCloser
	output          bytes.Buffer
	completionError error
	hasEnded        bool
}

func (p *LongRunningPredictor) waitLoop() {
	p.hasEnded = false
	p.completionError = p.theCommand.Wait()
	p.hasEnded = true
}

func (p *LongRunningPredictor) start() string {
	p.theCommand = exec.Command("./long-running-predictor", "trained-model")
	p.theCommand.Stdout = &p.output
	p.theCommand.Stderr = &p.output
	p.input, _ = p.theCommand.StdinPipe()
	p.theCommand.Start()
	startupText, _ := readLinesUpToPrefixBestEffort(&p.output, "Ready")
	go p.waitLoop()
	return startupText
}

Long-running predictors are cycled through a channel that acts like a pool: when one is needed it's pulled from the channel, and when it's done it's put back to be reused.
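To illustrate the channel-as-pool idea just described, here is a rough sketch. The worker type stands in for a long-running predictor, and the names pool, acquire, release, and drain are hypothetical; the actual pool code was not posted. The drain helper is one possible way to wait, during shutdown, until every borrowed worker has been returned.

```go
package main

import "fmt"

// worker is a hypothetical stand-in for a long-running predictor process.
type worker struct{ id int }

// pool hands out workers through a buffered channel.
type pool struct{ idle chan *worker }

// newPool creates a pool holding size idle workers.
func newPool(size int) *pool {
	p := &pool{idle: make(chan *worker, size)}
	for i := 0; i < size; i++ {
		p.idle <- &worker{id: i}
	}
	return p
}

// acquire blocks until a worker is free.
func (p *pool) acquire() *worker { return <-p.idle }

// release puts a worker back so it can be reused.
func (p *pool) release(w *worker) { p.idle <- w }

// drain collects every worker, blocking until borrowed ones come back.
// Once it returns, no request is still holding a worker.
func (p *pool) drain() []*worker {
	all := make([]*worker, 0, cap(p.idle))
	for len(all) < cap(p.idle) {
		all = append(all, <-p.idle)
	}
	return all
}

func main() {
	p := newPool(2)
	w := p.acquire()
	fmt.Println("using worker", w.id)
	p.release(w)
	fmt.Println("drained", len(p.drain()), "workers")
}
```

Because the channel's capacity equals the number of workers, release never blocks, and acquire blocks only when every worker is in use.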

===

Once the REST endpoint app gets the signal to terminate, these children are killed even though, in theory, the signal was captured.

I know the signal was captured because the logic I have in place for a clean shutdown begins to execute as expected.

Looking forward to a solution or at least an understanding.

Best,

_M


Brian Candler

Mar 24, 2021, 2:15:59 PM
to golang-nuts
That is, you're sending the SIGTERM via systemd?

Is it possible that systemd is sending it to the whole process group, and not just to the parent? Is systemd running the Go program directly, or via a shell?

If you want to be 100% sure, you can start the children in separate process groups.  For Linux it's something like this (not tried for a while):

        p.theCommand.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
        err := p.theCommand.Start()

(BTW, I note you weren't checking the error return from Start, which I do recommend)
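A slightly fuller sketch of that suggestion, with the error from Start checked. The helper names startInOwnSession and processGroupOf are hypothetical, and "sleep" merely stands in for a real child. One caveat worth verifying against the systemd documentation: as far as I know, systemd's default KillMode=control-group signals every process in the unit's control group regardless of session or process group, so this alone may not be enough under systemd, and a setting such as KillMode=mixed may be needed instead.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
	"syscall"
)

// startInOwnSession starts the command in a new session (and therefore a new
// process group), so signals sent to the parent's process group do not reach it.
func startInOwnSession(name string, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	if err := cmd.Start(); err != nil {
		return nil, fmt.Errorf("starting %s: %w", name, err)
	}
	return cmd, nil
}

// processGroupOf reports the process group ID of a started command.
func processGroupOf(cmd *exec.Cmd) (int, error) {
	return syscall.Getpgid(cmd.Process.Pid)
}

func main() {
	cmd, err := startInOwnSession("sleep", "1") // "sleep" stands in for a real child
	if err != nil {
		log.Fatal(err)
	}
	childGroup, err := processGroupOf(cmd)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("parent group %d, child group %d\n", syscall.Getpgrp(), childGroup)
	if err := cmd.Wait(); err != nil {
		log.Fatal(err)
	}
}
```

With Setsid the child leads its own session, so its process group ID equals its own PID and differs from the parent's.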

But I'd also check the systemd documentation.  Also try sending a kill -TERM <pid> directly to the parent, just to compare.