Graceful cluster restart of master and workers with node 0.6, with zero downtime
From: Steve Molitor <stevemoli...@gmail.com>
Date: Tue, 24 Jan 2012 17:22:29 -0600
Local: Tues, Jan 24 2012 6:22 pm
Subject: Graceful cluster restart of master and workers with node 0.6, with zero downtime

Using the node 0.6.x cluster module, what is the best way to gracefully
restart a cluster, including both the workers *and* the master, without any
downtime?  My current naive attempt is as follows:

0. Start a cluster of workers.  Each worker calls 'server.listen(80)'.

When master receives SIGHUP signal:

1. Spawn a new master and set of workers.  Each new worker calls
'server.listen(80)'.  (uh, problem here - same port)
2. Original master sends message to each of its workers telling it to close.
3. Upon receipt of message, each worker calls server.close().  No new
connections are accepted.
4. On server 'close' event, each worker tells its master that it is closed,
and calls process.exit().
5. When the original master has received 'closed' messages back from all
its workers, original master calls process.exit().
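
The worker side of steps 2-4 might look roughly like the sketch below. It is only an illustration: the message names 'close' and 'closed' and the helper name installGracefulClose are made up here, and `proc` stands in for the worker's global `process` so the logic can be exercised without forking.

```javascript
// Worker side of steps 2-4. `proc` is normally the global `process` of a
// cluster worker and `server` is the worker's HTTP server. The message
// names 'close' and 'closed' are assumptions made for this sketch.
function installGracefulClose(proc, server) {
  proc.on('message', (msg) => {
    if (msg !== 'close') return; // ignore unrelated messages

    // Step 3: stop accepting new connections. The callback runs once the
    // connections that were already open have ended.
    server.close(() => {
      // Step 4: tell the master, then exit after the message has been sent.
      proc.send('closed', () => proc.exit(0));
    });
  });
}

// In a real worker this would be wired up as:
//   installGracefulClose(process, server);
```

Note that server.close() only reports completion after existing connections end, so a long-running connection keeps the old worker (and its hold on the port) alive for as long as it lasts.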

The original master and its workers are now dead, and a new master and
worker processes are running.  However, if a long running connection is
running on one of the original workers and the new cluster starts before
that connection finishes, the new cluster doesn't receive any HTTP
requests.  I assume this is because the port is already in use.

If I don't start the new cluster (step 1) until after all the original
workers have exited, just before closing the original master (step 5),
everything works fine.  However, there is an interval of time where I am
not accepting any connections.  If I don't start a new master process but
just close and restart individual workers, everything works fine also.
However, my goal is no downtime, and reloading of all node processes
including the master (to pick up new master code, a new node version, etc).

The node-cluster module from LearnBoost accomplished this, but I'm trying
to use node 0.6 and its built in cluster support.

Thanks,

Steve


 
From: Diogo Resende <drese...@thinkdigital.pt>
Date: Tue, 24 Jan 2012 23:31:28 +0000
Local: Tues, Jan 24 2012 6:31 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime
For really zero downtime, you have to do it this way:

0. Start master and workers

- SIGHUP comes in

1. Send that information to, for example, half of the workers
2. This half should stop accepting connections and, as soon as they
    have served their last request, they should exit
3. The master knows when the workers start exiting gracefully and
    starts new workers
4. When half of your workers have restarted you can do the same to
    the others

Remember that you will possibly have your workers running different
code versions at the same time, so ensure this won't be a problem.

The half first, half later is just a strategy. You can do it one by one
or any other way. Just don't notify all workers at the same time or
you might have them closed too fast for you to start new ones.
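
A rough sketch of that batch-by-batch restart, for illustration only. It assumes a current Node.js release (worker.disconnect() and the worker 'listening' event are not in 0.6), and the helper names batches, replaceWorker and rollingRestart are invented here. The `replace` step is passed in so the sequencing can be tried without forking.

```javascript
const cluster = require('cluster');

// Split the workers into batches (e.g. halves) so some always keep serving.
function batches(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Replace one worker: ask it to finish up, fork a new one when it exits,
// and report back once the new worker is listening.
function replaceWorker(worker, cb) {
  worker.once('exit', () => {
    const fresh = cluster.fork();
    fresh.once('listening', () => cb());
  });
  worker.disconnect(); // stop accepting; exits after in-flight work ends
}

// Restart one batch at a time; the next batch starts only after every
// worker in the current batch has been replaced.
function rollingRestart(workers, batchSize, replace, done) {
  const queue = batches(workers, batchSize);
  (function next() {
    const batch = queue.shift();
    if (!batch) return done();
    let pending = batch.length;
    batch.forEach((w) => replace(w, () => {
      pending -= 1;
      if (pending === 0) next();
    }));
  })();
}

// Possible wiring in the master (not executed here):
//   process.on('SIGHUP', () => {
//     const all = Object.values(cluster.workers);
//     const half = Math.max(1, Math.ceil(all.length / 2));
//     rollingRestart(all, half, replaceWorker, () => console.log('done'));
//   });
```

This restarts workers only; the master process keeps running throughout.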

---
Diogo R.


 
From: Steve Molitor <stevemoli...@gmail.com>
Date: Tue, 24 Jan 2012 20:50:08 -0600
Local: Tues, Jan 24 2012 9:50 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime

Thanks for the suggestion, but I'm trying to kill the original master and
start a new master, in addition to the workers.  However, it seems I can't
have two master processes with workers handling requests on the same port
at the same time.  I know how to get zero downtime, if I don't restart
the master process.  And I know how to kill and start a new master, if I
give up on zero downtime.  But I'm trying to both restart the master, and
have zero downtime.

Steve

On Tue, Jan 24, 2012 at 5:31 PM, Diogo Resende <drese...@thinkdigital.pt> wrote:


 
From: Andrew Chilton <chi...@appsattic.com>
Date: Wed, 25 Jan 2012 16:07:47 +1300
Local: Tues, Jan 24 2012 10:07 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime
On 25 January 2012 15:50, Steve Molitor <stevemoli...@gmail.com> wrote:

> Thanks for the suggestion, but I'm trying to kill the original master and
> start a new master, in addition to the workers.  However, it seems I can't
> have two master processes with workers handling requests on the same port,
> and the same time.   I know how to get zero downtime, if I don't restart the
> master process.  And I know how to kill and start a new master, if I give up
> on zero downtime.  But I'm trying to both restart the master, and have zero
> downtime.

There are other ways of doing it too. For example in a recent project
of mine, I start two node.js servers listening on different local
ports. Both are proxied to by Nginx in the same server{} stanza.
Basically I rarely need to restart Nginx but I _can_ restart one
nodejs server, then the other to get new code running.

I know this isn't using cluster like you asked, but extrapolating can
make this work for your use-case too. I'm afraid I don't know how to
do it with a single master process without downtime, but I'd be keen
on knowing if this is possible. I think Nginx can also be restarted
with zero downtime - so the hard problem is already solved - so that
is also an option (as well as making it serve your static content)
which means you can still get what you want.

In an older project of mine where we had 4 webservers, we load
balanced to each of them, but took one out of the load balancer at a
time to load new code. Again, no downtime but a different solution to
what you asked for. Maybe these have given you some ideas. :)

Let us know when you come to a solution you like!

Cheers,
Andy

--
Andrew Chilton
e: chi...@appsattic.com
w: http://www.appsattic.com/


 
From: Karl Tiedt <kti...@gmail.com>
Date: Tue, 24 Jan 2012 21:08:58 -0600
Local: Tues, Jan 24 2012 10:08 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime
With software only on one system, it's not really feasible. You would
have to have a load balancer in front of two systems, so that you can
shut one down and restart it while the other keeps serving. You can't
really overcome those limitations without downtime unless you add
another layer (hardware or virtual).

-Karl Tiedt


 
From: Karl Tiedt <kti...@gmail.com>
Date: Tue, 24 Jan 2012 21:14:04 -0600
Local: Tues, Jan 24 2012 10:14 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime
Not sure what load balancers you used, but don't most support not
serving to a down server already? Seems like extra work to reconfig
for that purpose ;)

-Karl Tiedt


 
From: Andrew Chilton <chi...@appsattic.com>
Date: Wed, 25 Jan 2012 16:22:48 +1300
Local: Tues, Jan 24 2012 10:22 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime
On 25 January 2012 16:14, Karl Tiedt <kti...@gmail.com> wrote:

> Not sure what load balancers you used, but don't most support not
> serving to a down server already? Seems like extra work to reconfig
> for that purpose ;)

Absolutely, nothing will come through to a dead server (which is the
point of this solution).

However, just blithely stopping a server may cause some requests to
fail (maybe this was an artifact of our system) but we found it useful
to take a server out of the lb, wait until zero requests were being
sent to it and then we knew we could restart it without anyone seeing
any problems. :) This could have been automated quite easily so it's
not really a problem - just an extra safety net. :)
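
One way that "wait until zero requests" check could be automated inside each node server is sketched below. This is an assumption-laden illustration, not what our setup actually ran: createDrainTracker and its methods are invented names, and it presumes the load balancer polls some health check that the app answers from healthy().

```javascript
// Tracks in-flight requests so a server can be drained before a restart.
function createDrainTracker() {
  let inFlight = 0;
  let draining = false;
  let onIdle = null;

  function maybeIdle() {
    if (draining && inFlight === 0 && onIdle) {
      const cb = onIdle;
      onIdle = null; // fire at most once
      cb();
    }
  }

  return {
    // Call when a request starts; call the returned function when it ends.
    begin() {
      inFlight += 1;
      let ended = false;
      return () => {
        if (ended) return; // guard against double counting
        ended = true;
        inFlight -= 1;
        maybeIdle();
      };
    },
    // Answer the balancer's health check with this: false while draining,
    // so the balancer stops sending new requests.
    healthy() { return !draining; },
    // Start draining; `cb` runs once no requests remain in flight.
    drain(cb) {
      draining = true;
      onIdle = cb;
      maybeIdle();
    },
    get inFlight() { return inFlight; },
  };
}

// Possible wiring in an HTTP handler (not executed here):
//   const end = tracker.begin();
//   res.on('close', end);
```

Once drain's callback fires, the server has nothing outstanding and can be restarted.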

You're probably right, but this was what we found in our environment.
Others may differ and playing with your own system will teach you what
you need to do. Hope that makes sense.

Cheers,
Andy

--
Andrew Chilton
e: chi...@appsattic.com
w: http://www.appsattic.com/


 
From: Karl Tiedt <kti...@gmail.com>
Date: Tue, 24 Jan 2012 21:29:05 -0600
Local: Tues, Jan 24 2012 10:29 pm
Subject: Re: [nodejs] Graceful cluster restart of master and workers with node 0.6, with zero downtime
Very true Andrew, I was neglecting to consider open connections - good catch :)

-Karl Tiedt


 
From: knc <kishor...@gmail.com>
Date: Tue, 24 Jan 2012 22:36:35 -0800 (PST)
Local: Wed, Jan 25 2012 1:36 am
Subject: Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime
Any specific reason why you want to restart the master process as
well?

I recently started working on a wrapper/helper for the core cluster
module. Pretty much a work in progress, but a lot of what you have
said would be handy to have in this module.

https://github.com/kishorenc/clusterize

Regards,

Kishore.

On Jan 25, 4:22 am, Steve Molitor <stevemoli...@gmail.com> wrote:


 
From: Alan Hoffmeister <alanhoffmeis...@gmail.com>
Date: Wed, 25 Jan 2012 13:03:04 -0200
Local: Wed, Jan 25 2012 10:03 am
Subject: Re: [nodejs] Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime
Kishore, any chance of having auto reload on file changes and
CoffeeScript support? Also some API for getting worker/master status
would be nice.

--
Att,
Alan Hoffmeister


 
From: Steve Molitor <stevemoli...@gmail.com>
Date: Wed, 25 Jan 2012 09:29:50 -0600
Local: Wed, Jan 25 2012 10:29 am
Subject: Re: [nodejs] Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime

One reason would be to seamlessly deploy new node versions.

Steve


 
From: Steve Molitor <stevemoli...@gmail.com>
Date: Wed, 25 Jan 2012 10:16:10 -0600
Local: Wed, Jan 25 2012 11:16 am
Subject: Re: [nodejs] Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime

If I somehow passed the original socket to the new master and had it use
that, would that work?  Could I have two clusters servicing requests on the
same port (temporarily)?  Is that what LearnBoost's cluster module does?

Steve


 
From: billywhizz <apjohn...@gmail.com>
Date: Wed, 25 Jan 2012 12:02:09 -0800 (PST)
Local: Wed, Jan 25 2012 3:02 pm
Subject: Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime
you won't be able to have two unrelated processes listening on the
same port at the same time so if you want to restart the master
process without downtime you will need to have a load balancer in
front of it. there's a good summary of the options here:
http://www.loadbalancing.org/

if you are on linux/FreeBSD then LVS is probably the best option:
http://www.linuxvirtualserver.org/whatis.html

as far as i know you can only pass a socket to a related (child)
process and not to an unrelated process which would pretty much rule
out what you suggest above. it might be possible to spawn a child
process (which should use the newly installed version of node) and
send the socket to that before killing the parent, as long as there is
nothing that breaks between the two versions of node, but you'll still
have to deal with the issue of not being able to copy the new version
of node over the old one while it's running.
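
a minimal sketch of that child hand-off, assuming the child is started with child_process.fork() (which gives it the IPC channel needed for passing handles). the function names handOff and acceptHandOff and the 'server' message name are made up for illustration; the send(message, handle, callback) call and the (message, handle) arguments of the 'message' event are the documented node API.

```javascript
// Parent side: pass the listening server to the child over IPC. Once the
// send has completed, stop accepting in the parent so that only the child
// takes new connections; connections already open here keep running.
function handOff(child, server) {
  child.send('server', server, (err) => {
    if (err) return; // hand-off failed: keep serving from the parent
    server.close();
  });
}

// Child side: wait for the handle and give it to the caller.
function acceptHandOff(proc, onServer) {
  proc.on('message', (msg, handle) => {
    if (msg === 'server' && handle) onServer(handle);
  });
}

// Possible wiring (not executed here):
//   parent: const child = require('child_process').fork('new-master.js');
//           handOff(child, server);
//   child:  acceptHandOff(process, (handle) => {
//             require('http').createServer(app).listen(handle);
//           });
// 'new-master.js' and `app` are placeholders.
```

the child here is still a descendant of the old process, so this does not cover handing the socket to an unrelated process.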

On Jan 25, 4:16 pm, Steve Molitor <stevemoli...@gmail.com> wrote:


 
From: knc <kishor...@gmail.com>
Date: Wed, 25 Jan 2012 19:48:26 -0800 (PST)
Local: Wed, Jan 25 2012 10:48 pm
Subject: Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime
With respect to two processes using the same port, can someone please
explain what's said on this page:

http://nodejs.org/docs/latest/api/net.html#server.listen

It says "All sockets in Node set SO_REUSEADDR already". In TCP, can't
we have two programs listening on the same socket, if we set the
SO_REUSEADDR on the socket before bind?

On Jan 26, 1:02 am, billywhizz <apjohn...@gmail.com> wrote:


 
From: Matt <hel...@gmail.com>
Date: Wed, 25 Jan 2012 23:53:58 -0500
Local: Wed, Jan 25 2012 11:53 pm
Subject: Re: [nodejs] Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime

No.


 
From: dhruvbird <dhruvb...@gmail.com>
Date: Fri, 27 Jan 2012 04:29:47 -0800 (PST)
Local: Fri, Jan 27 2012 7:29 am
Subject: Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime
IIRC, you can use Unix domain sockets to share stuff across unrelated
processes - I might be wrong on this though.

Regards,
-Dhruv.

On Jan 25, 3:02 pm, billywhizz <apjohn...@gmail.com> wrote:


 
From: Dobes <dob...@gmail.com>
Date: Sat, 28 Jan 2012 05:52:03 -0800 (PST)
Local: Sat, Jan 28 2012 8:52 am
Subject: Re: Graceful cluster restart of master and workers with node 0.6, with zero downtime
If your master restarts quickly enough you can have the client wait
for it to restart without rejecting the connection - it appears as a
couple second pause for users but not an error.

If not, you can put a proxy in front that listens on the one port and
switches automatically.  Presumably this proxy would be shut down less
frequently than your master process, allowing most upgrades to avoid
downtime.  Perhaps use something like nginx for that proxy.

If you can run the new master on a different port and have new clients/
workers use the new port the proxying logic could be coded right into
the master to save running an extra process - perhaps the master is
told there's a new master and it re-wires itself to proxy all requests
to the new master instead of processing them itself.
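
A rough sketch of that re-wiring idea, for illustration only: a forwarder whose target port can be switched while it runs. The name createForwarder and the port numbers are hypothetical, and upstream error handling is kept to a bare minimum.

```javascript
const http = require('http');

// Forwards each request to 127.0.0.1:<port>; switchTo() changes the port
// used for requests that arrive afterwards.
function createForwarder(initialPort) {
  let port = initialPort;

  function handler(req, res) {
    const upstream = http.request(
      {
        host: '127.0.0.1',
        port, // read at request time, so a switch takes effect immediately
        method: req.method,
        path: req.url,
        headers: req.headers,
      },
      (up) => {
        res.writeHead(up.statusCode, up.headers);
        up.pipe(res);
      }
    );
    upstream.on('error', () => {
      if (!res.headersSent) res.statusCode = 502;
      res.end();
    });
    req.pipe(upstream);
  }

  return {
    handler,
    switchTo(newPort) { port = newPort; },
    get port() { return port; },
  };
}

// Possible wiring (not executed here; ports are examples):
//   const fwd = createForwarder(8001);
//   http.createServer(fwd.handler).listen(80);
//   // later, when told that a new master is up on 8002:
//   fwd.switchTo(8002);
```

Requests already being forwarded to the old port finish there; only new requests go to the new one.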

On Jan 25, 10:50 am, Steve Molitor <stevemoli...@gmail.com> wrote:


 