http CloseNotifier with cgi and httputil/proxy handlers

340 views
Skip to first unread message

Peter Waller

unread,
Jun 6, 2014, 7:31:35 AM6/6/14
to golang-dev, Brad Fitzpatrick
Hi,

I've just bumped up against requests CPU spinning which don't write anything until they're done. They continue to consume resources even after the requester gave up on them and closed their end of the connection.

Would a CL be welcome to make the cgi handler and httputil proxy CloseNotifier-aware? It doesn't look like an intrusive change is required.

Thanks,

- Peter

Details:

AFAICT, we *would* tear things down usually so long as the upstream actually did a write on the closed connection, because the code in the middle would see a failed write and everything would tear down.

But since the processes I'm looking at may not write anything until they're done, we need an out-of-band signalling mechanism (CloseNotifier).

(httputil proxy) I assume it's okay to call Transport.CancelRequest from another goroutine while RoundTrip's body is being io.Copy'd.

(cgi handler) I assume it's okay to cmd.Process.Kill() from another goroutine whilst an io.Copy is happening, so long as cmd.Process != nil. Though it feels like I might be racy if it's possible that while I'm reading cmd.Process, it gets written to. I don't know if this can happen. Is there a safe way to abandon the linked io.Copy from a goroutine watching the CloseNotifier?

Brad Fitzpatrick

unread,
Jun 6, 2014, 9:08:17 AM6/6/14
to Peter Waller, golang-dev
First I'd like to see a repro for and fix the spin. Adding CloseNotifier support to things seems fine after that.

Peter Waller

unread,
Jun 6, 2014, 9:14:40 AM6/6/14
to Brad Fitzpatrick, golang-dev
To be clear, the "spin" is "I invoke a process in CGI" which spins. Nothing go side. I'll try and cook up a test to simulate.

There is a bit of speculation based on observed behaviour in my original post, I'll make a simple repro to be sure.

Brad Fitzpatrick

unread,
Jun 6, 2014, 9:20:26 AM6/6/14
to Peter Waller, golang-dev
Well, in that case we should at least close the child CGI process's stdin if we don't when the user goes away, which might fix the spin.

But that's might still be independent of supporting CloseNotifier on that type of ResponseWriter?

I'll be gone from now ~ until Monday night on vacation. But feel free to start sending repros / patches.


Peter Waller

unread,
Jun 9, 2014, 9:59:23 AM6/9/14
to Brad Fitzpatrick, golang-dev, David Jones
Let's tackle one problem at a time. First I'm going to focus on the CGI handler wasting CPU resources, which is a big problem for us.

On 6 June 2014 14:20, Brad Fitzpatrick <brad...@golang.org> wrote: 
Well, in that case we should at least close the child CGI process's stdin if we don't when the user goes away, which might fix the spin.

In our case, stdin is not connected (cmd.Stdin is unassigned), so I don't think that can help [1].
 
But that's might still be independent of supporting CloseNotifier on that type of ResponseWriter?

The problem seems much harder than I anticipated.

Here is a minimal reproduce which emulates the behaviour of the general processes we may be running:

It's reproduced at the bottom of the file. It consists of a CGI script written in bash which invokes another process in the middle of its execution, and a test which requests the file and then calls CloseClientConnections(). I would expect the process to complete in 2 seconds, but instead it takes 10 seconds (or as long as the CPU activity takes).

If I modify net/http/cgi to watch `CloseNotifier` from a new goroutine, then the only thing I can do (I think), is to signal the process with os.Interrupt or os.Kill. This leaves behind the child's child process consuming CPU resources, and doesn't allow the HTTP request to complete as the io.Copy [2] will only terminate when the file descriptor is closed.

I'm stumped as to what the correct general thing to do is. The only thing we can see from here that satisfies our use case is to start the process as a session leader and then kill the process group.

In particular, it's not possible to interrupt the io.Copy [2], right?

Any help or advice appreciated.

Regards,

- Peter


cgi-reproduce.go


package main
 
import (
"io"
"io/ioutil"
"log"
"net/http"
"net/http/httptest"
"net/url"
"time"
 
"net/http/cgi"
)
 
// Expected: go program finishes after 2 seconds.
// Actual: go program finishes after 10 seconds.
// Contents of test.cgi:
/*
#! /bin/bash
 
echo Status: 200 OK
echo
echo Test.
 
# Pick one of the next three lines
sleep 10
# bash -c "for i in {1..5000000}; do true; done;"
# python -c "for i in xrange(100000000): pass"
 
echo Done.
*/
 
func main() {
h := &cgi.Handler{
Path: "test.cgi",
}
 
s := httptest.NewServer(h)
defer s.Close()
 
go func() {
// Request test.cgi
u := s.URL + "/test.cgi"
log.Println("GET", u)
resp, err := http.Get(u)
if err, ok := err.(*url.Error); ok && err.Err == io.EOF {
log.Println("EOF")
return
} else if err != nil {
panic(err)
}
defer resp.Body.Close()
 
result, err := ioutil.ReadAll(resp.Body)
log.Printf("err, result = %v %s", err, string(result))
}()
 
log.Println("Waiting..")
time.Sleep(1 * time.Second)
 
log.Println("Closing client connections..")
s.CloseClientConnections()
time.Sleep(1 * time.Second)
 
log.Println("Exiting..")
}

Brad Fitzpatrick

unread,
Jun 10, 2014, 1:59:40 PM6/10/14
to Peter Waller, golang-dev, David Jones
Please move this to a bug, with the necessary repro conditions.

Peter Waller

unread,
Jul 22, 2014, 10:44:20 AM7/22/14
to Brad Fitzpatrick, golang-dev, David Jones
I've reported the problem with http.ReverseProxy here:


I'd be happy to submit a CL which makes use of CloseNotifier within the ReverseProxy if this issue is accepted. However, it isn't totally clear to me how to abort the io.Copy in ReverseProxy.copyResponse:

There is an opportunity to close the connection if FlushInterval is being used but not otherwise.

This is actually the very same problem which is faced in the CGI handler. Over there for our own infrastructure we ended up using a patched version of cgi.Handler which starts the process in a process group and then kills the whole process group. This is to get around the fact that there isn't any other way that I can see to terminate the io.Copy.

I would submit a CL for the CGI change, except that killing the process group doesn't seem very general nor portable and I'm not sure what other people would expect of such a handler. "Send a SIGTERM, wait a bit, then SIGKILL"? RFC 3875 does suggest that misbehaving or timed out scripts can be terminated, but SIGKILL with no other chance seems harsh.

A process group was used because that seemed to be the only way to correctly tear everything down if the program being run creates a child.

Peter Waller

unread,
Aug 5, 2014, 6:17:23 AM8/5/14
to golang-dev, Brad Fitzpatrick
Any chance to get this fixed for 1.4?
Reply all
Reply to author
Forward
0 new messages