We currently have a pool of Apache servers behind an F5 load balancer. We have long-lived keep-alives set between the F5 and the "backend" Apache app servers.

The problem with this setup is that when we do a graceful restart on an Apache instance, the keep-alive children don't exit until their connection is closed -- which can be a very long time due to how the F5 keeps connections alive.

What we really want for a "graceful" restart is to allow any current request to finish, close the connection, and then kill the process and reload.

We have added code to our app that catches SIGUSR1 and sets a flag, and when this flag is set we force a "Connection: close" header, which allows Apache to kill the worker child process.

Could someone please explain how all this works with Starman and Server::Starter?

Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not clear if that's the same "graceful" Apache uses, where Apache waits for a child's connection to close before killing off that process. Would I still need to catch a SIGHUP and send a "Connection: close" header as I do with Apache?

Thanks,
--
> Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not clear if that's the same "graceful" apache uses where Apache waits for a child's connection to close before killing off that process.

You have to convert the SIGHUP to SIGQUIT as in the Starman documentation so that Starman gracefully shuts down the current workers while Server::Starter launches the new set of workers.
... sending "QUIT" signal to the master process will gracefully shutdown the workers (meaning the currently running requests will shut down once the request is complete).
That said, because of the problem you described, I strongly recommend against enabling keep-alives between your frontend servers and Starman. I would say it's not even supported. (The documentation suggests disabling keep-alives when the frontend proxy has long-open keep-alive connections.)
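
For reference, the SIGHUP-to-SIGQUIT conversion mentioned above might look roughly like the sketch below. It assumes Server::Starter's `--signal-on-hup` option; the port, worker count, PID file path, `app.psgi`, and the two function names are placeholder values for illustration, not anything taken from this thread.

```shell
# Placeholder values (assumptions, not from the thread); adjust for your deployment.
PORT=5000
WORKERS=8
APP=app.psgi
PIDFILE=/tmp/start_server.pid

# Run Starman under Server::Starter. --signal-on-hup=QUIT makes start_server
# send QUIT (instead of its default TERM) to the old Starman master when
# start_server itself receives HUP.
start_app() {
    start_server --port="$PORT" --pid-file="$PIDFILE" --signal-on-hup=QUIT -- \
        starman --workers "$WORKERS" "$APP"
}

# Ask start_server for a graceful restart: it launches a new Starman, then
# sends QUIT to the old one so its workers can finish their current request.
graceful_restart() {
    kill -HUP "$(cat "$PIDFILE")"
}
```

Calling `start_app` runs the server in the foreground; calling `graceful_restart` from another shell with the same variables set sends the HUP to start_server.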
On Fri, Aug 17, 2012 at 9:47 AM, Tatsuhiko Miyagawa <miya...@gmail.com> wrote:

>> Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not clear if that's the same "graceful" apache uses where Apache waits for a child's connection to close before killing off that process.
>
> You have to convert the SIGHUP to SIGQUIT as in the Starman documentation so that Starman gracefully shuts down the current workers while Server::Starter launches the new set of the workers.

Right, because by default Server::Starter sends a TERM.

I intended to ask about the docs. The docs say:

> ... sending "QUIT" signal to the master process will gracefully shutdown the workers (meaning the currently running requests will shut down once the request is complete).

What does "request is complete" mean? Is that like Apache, where it waits for the socket to close on the worker, as in my case where a long-running keep-alive will keep the process busy until the connection is closed?
Now, we are on a WAN with multiple datacenters, so it's possible that the F5 and Apache (or Starman) app servers are not in the same location. I don't know how significant that latency is, but that's why we would prefer to use keep-alives on the backend. At least according to our network engineers.

I'd have to check, but IIRC, the F5s don't pre-connect -- that is, when a client request comes in and needs to be passed to a backend server, the F5 creates a new connection if one does not exist (from a keep-alive). In other words, I don't think the F5 pre-connects before it has a request to handle.

I guess it's time to test, but if I really needed to use the keep-alives, would I then need to do something similar with Starman and catch the SIGQUIT and then add a "Connection: close" header to the response to free up the child to exit after that response? I'm not clear if that's just a "problem" with Apache or if it also would apply to Starman.