keep-alives and graceful restarts
From: Bill Moseley <mose...@hank.org>
Date: Fri, 17 Aug 2012 09:38:06 -0700
Local: Fri, Aug 17 2012 12:38 pm
Subject: keep-alives and graceful restarts

We currently have a pool of Apache servers behind an F5 load balancer.   We
have long-lived keep-alives set between the F5 and the "backend" Apache app
servers.

The problem with this setup is when we do a graceful restart on an Apache
instance the keep-alive children don't exit until their connection is
closed -- which can be a very long time due to how the F5 keeps connections
alive.

What we really want for a "graceful" restart is to allow any current
request to finish, close the connection, and then kill the process and
reload.

We have added code to our app that catches SIGUSR1 and sets a flag, and
when this flag is set we force a Connection: close header, which allows
Apache to kill the worker child process.
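
The flag-and-header pattern described here can be sketched roughly as follows. This is a language-neutral illustration in Python, not the actual mod_perl code from this setup: the names `draining`, `request_drain` and `finalize_headers` are made up for the example, and where the header hook sits in a real application is an assumption.

```python
import signal

draining = False  # flipped by the signal handler below


def request_drain(signum, frame):
    """Signal handler: only record that a drain was requested."""
    global draining
    draining = True


# SIGUSR1 matches the signal mentioned above; it is available on Unix only.
signal.signal(signal.SIGUSR1, request_drain)


def finalize_headers(headers):
    """Return response headers, forcing "Connection: close" while draining."""
    if not draining:
        return list(headers)
    # Drop any existing Connection header so only one value is sent.
    kept = [(name, value) for name, value in headers if name.lower() != "connection"]
    kept.append(("Connection", "close"))
    return kept
```

The handler only sets a flag, so the in-flight request finishes normally; the connection is closed by the header on the next response rather than by interrupting the worker.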

Could someone please explain how all this works with Starman and
Server::Starter?

Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not
clear if that's the same "graceful" that Apache uses, where Apache waits for a
child's connection to close before killing off that process.

Would I still need to catch a SIGHUP and send a Connection: close header as
I do with Apache?

Thanks,

--
Bill Moseley
mose...@hank.org


 
From: Tatsuhiko Miyagawa <miyag...@gmail.com>
Date: Fri, 17 Aug 2012 09:47:27 -0700
Local: Fri, Aug 17 2012 12:47 pm
Subject: Re: keep-alives and graceful restarts

On Aug 17, 2012, at 9:38 AM, Bill Moseley wrote:

> We currently have a pool of Apache servers behind an F5 load balancer.   We have long-lived keep-alives set between the F5 and the "backend" Apache app servers.

> The problem with this setup is when we do a graceful restart on an Apache instance the keep-alive children don't exit until their connection is closed -- which can be a very long time due to how the F5 keeps connections alive.

> What we really want for a "graceful" restart is to allow any current request to finish, close the connection, and then kill the process and reload.

> We have added code to our app that catches SIGUSR1 and sets a flag, and when this flag is set we force a Connection: close header which allows Apache to kill the worker child process.

> Could someone please explain how all this works with Starman and Server::Starter?

> Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not clear if that's the same "graceful" apache uses where Apache waits for a child's connection to close before killing off that process.

You have to convert the SIGHUP to SIGQUIT, as described in the Starman documentation, so that Starman gracefully shuts down the current workers while Server::Starter launches the new set of workers.

The way it works is that Server::Starter listens on the ports and manages two clusters of Starman during the restart: the existing (serving) workers keep running until all their requests are handled and then quit gracefully, while a new set of workers is launched to take new requests.
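
To illustrate why no connections are lost while the workers are swapped, here is a minimal sketch in Python. It assumes the supervisor keeps the listening socket open across the restart and that workers accept on that same socket; it does not reproduce Server::Starter's code, and the helper names (`open_listener`, `serve_one`, `demo`) are made up for the example.

```python
import socket


def open_listener():
    """Bind and listen once; in this sketch the supervisor owns this socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    listener.settimeout(5)
    return listener


def serve_one(listener, generation):
    """A worker of the given generation accepts and answers one connection."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(5)
        payload = conn.recv(1024)
        conn.sendall(f"gen{generation}:".encode() + payload)


def demo():
    listener = open_listener()
    try:
        # The client connects while no worker is accepting (the old generation
        # has quit, the new one is not ready yet); the kernel queues it.
        client = socket.create_connection(listener.getsockname(), timeout=5)
        with client:
            client.sendall(b"ping")
            serve_one(listener, generation=2)  # new generation picks it up
            return client.recv(1024)
    finally:
        listener.close()
```

Calling `demo()` shows a request that arrived between generations being answered by the new worker, because the socket itself was never closed.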

That said, because of the problem you described, I strongly recommend against enabling keep-alives between your frontend servers and Starman. I would say it's not even supported. (The documentation suggests disabling keep-alives when the frontend proxy holds long-open keep-alive connections.)


 
From: Bill Moseley <mose...@hank.org>
Date: Sat, 18 Aug 2012 09:32:43 -0700
Local: Sat, Aug 18 2012 12:32 pm
Subject: Re: keep-alives and graceful restarts

On Fri, Aug 17, 2012 at 9:47 AM, Tatsuhiko Miyagawa <miyag...@gmail.com> wrote:

>> Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not
>> clear if that's the same "graceful" apache uses where Apache waits for a
>> child's connection to close before killing off that process.

> You have to convert the SIGHUP to SIGQUIT as in the Starman documentation
> so that Starman gracefully shuts down the current workers while
> Server::Starter launches the new set of the workers.

Right, because by default Server::Starter sends a TERM.

I intended to ask about the docs.  The docs say:

... sending "QUIT" signal to the master process will gracefully shutdown
the workers (meaning the currently running requests will shut down once
the request is complete).

What does "request is complete" mean?  Is that like Apache where it waits
for the socket to close on the worker?  As in my case where a long running
keep-alive will keep the process busy until the connection is closed?

> That said, because of the problem you described, I strongly do not
> recommend enabling keep-alives between your frontend servers and Starman. I
> would say it's not even supported. (The documentation suggests disabling
> keep-alives when the frontend proxy has a long-open keep-alive connections)

Again, we have client-facing load balancers (F5s) that currently talk to
Apache/mod_perl.   So, to be clear, I'm talking about replacing
Apache/mod_perl with Starman so the F5 talks directly to Starman (or to
Server::Starter, I suppose, when making a connection).

Now, we are on a WAN with multiple datacenters, so it's possible that the
F5 and Apache (or Starman) app servers are not in the same location.   I
don't know how significant that latency is, but that's why we would prefer
to use keep-alives on the backend.   At least according to our network
engineers.

I'd have to check, but IIRC, the F5s don't pre-connect -- that is when a
client request comes in and needs to be passed to a backend server the F5
creates a new connection if one does not exist (from a keep-alive).  In
other words, I don't think the F5 pre-connects before it has a request to
handle.

I guess it's time to test, but if I really needed to use the keep-alives
would I then need to do something similar with Starman and catch the
SIGQUIT and then add a "Connection: close" header to the response to free
up the child to exit after that response?  I'm not clear if that's just a
"problem" with Apache or if it also would apply to Starman.

Thanks!

--
Bill Moseley
mose...@hank.org


 
From: Tatsuhiko Miyagawa <miyag...@gmail.com>
Date: Sat, 18 Aug 2012 12:06:16 -0700
Local: Sat, Aug 18 2012 3:06 pm
Subject: Re: keep-alives and graceful restarts

On Aug 18, 2012, at 9:32 AM, Bill Moseley wrote:

> On Fri, Aug 17, 2012 at 9:47 AM, Tatsuhiko Miyagawa <miyag...@gmail.com> wrote:
>> Now, with Server::Starter a "graceful" restart sends a SIGHUP -- I'm not clear if that's the same "graceful" apache uses where Apache waits for a child's connection to close before killing off that process.

> You have to convert the SIGHUP to SIGQUIT as in the Starman documentation so that Starman gracefully shuts down the current workers while Server::Starter launches the new set of the workers.

> Right, because by default Server::Starter sends a TERM.

> I intended to ask about the docs.  The docs say:

> ... sending "QUIT" signal to the master process will gracefully shutdown the workers (meaning the currently running requests will shut down once the request is complete).

> What does "request is complete" mean?  Is that like Apache where it waits for the socket to close on the worker?

It means when the current request has been completely served and the client has disconnected. Take a look at the code (it's all pure Perl) if you're wondering.

>  As in my case where a long running keep-alive will keep the process busy until the connection is closed?

I suppose so.
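
A toy model of that behaviour, in Python, may make the difference concrete. It is based only on the description in this thread, not on Starman's source; `FakeConnection`, `serve_connection` and the `honor_quit` switch are hypothetical names for the illustration.

```python
from collections import deque


class FakeConnection:
    """Stand-in for a keep-alive client that sends a fixed list of requests."""

    def __init__(self, requests):
        self._pending = deque(requests)

    def next_request(self):
        # Returning None models the client closing the connection.
        return self._pending.popleft() if self._pending else None


def serve_connection(conn, quit_requested, honor_quit):
    """Serve requests on one connection; return (request, connection-mode) pairs.

    With honor_quit=False the worker leaves only when the client disconnects.
    With honor_quit=True it answers the current request with "close" and stops.
    """
    served = []
    while True:
        request = conn.next_request()
        if request is None:
            break  # client disconnected
        close_after = honor_quit and quit_requested()
        served.append((request, "close" if close_after else "keep-alive"))
        if close_after:
            break
    return served
```

With a quit already pending, `serve_connection(FakeConnection(["a", "b", "c"]), lambda: True, honor_quit=False)` still serves all three requests, while `honor_quit=True` stops after the first one.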

> Now, we are on a WAN with multiple datacenters, so it's possible that the F5 and Apache (or Starman) app servers are not in the same location.   I don't know how significant that latency is, but that's why we would prefer to use keep-alives on the backend.   At least according to our network engineers.

> I'd have to check, but IIRC, the F5s don't pre-connect -- that is when a client request comes in and needs to be passed to a backend server the F5 creates a new connection if one does not exist (from a keep-alive).  In other words, I don't think the F5 pre-connects before it has a request to handle.

> I guess it's time to test, but if I really needed to use the keep-alives would I then need to do something similar with Starman and catch the SIGQUIT and then add a "Connection: close" header to the response to free up the child to exit after that response?  I'm not clear if that's just a "problem" with Apache or if it also would apply to Starman.

I suppose so, but as far as I can see there's no way to catch the QUIT signal in your worker to close the connection. In other words, it's not supported.

 