Shutting down VERY busy production node


Tony Mobily

Feb 28, 2017, 12:35:48 PM
to nodejs

I have a very, very busy Node application on a production server. The app handles real-time chat (using websockets) as well as e-commerce payments. While everything is set up so that when the server goes down the clients reconnect their sockets and so on, I still have a problem: whenever the server is stopped with a SIGINT, the event loop is cut off. This means that any pending DB write (possibly for a financial transaction) is simply discarded. There is one especially critical window (after the credit card merchant gives the OK, but before we write the record to the db), and at the moment we only shut down at off-peak times to limit the risk. But this is bad.

I am thinking of this as a solution:

  • I send a custom UNIX signal to the process (SIGUSR2, for example);
  • When server.js gets the signal:
    • It stops listening on port 80
    • It waits for the event loop to dry up
    • If after 10 seconds it's still hanging, it forces the closure

This means that at each reboot the server will be down for at most 10 seconds. Something like the sketch below:
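A rough sketch (the trivial `(req, res) => res.end('ok')` handler stands in for the real app, and the exit codes are placeholders):

const http = require('http');

const server = http.createServer((req, res) => res.end('ok')); // stand-in for the real app
server.listen(8080);

process.on('SIGUSR2', () => {
  // 1. Stop accepting new connections; in-flight requests keep running.
  server.close(() => {
    // 2. Fires once every connection has drained -- safe to go down now.
    process.exit(0);
  });

  // 3. If it's still hanging after 10 seconds, force the closure.
  setTimeout(() => process.exit(1), 10000).unref();
  // .unref() so the timer itself doesn't keep the event loop alive
});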

Is this what people in the real world do? Any gotchas? How do I check that the event loop is empty?


Bernardo Vieira

Feb 28, 2017, 8:41:59 PM
to nod...@googlegroups.com
Tony,

We have pretty much the same solution you described implemented in our app, except for the 10s timeout, and it works very well. 

You don't have to do anything special to check the event loop: if you call the close method on all your active servers, the process will end once all connections have drained.

That said, wouldn't it be wiser to move the credit card handling out of the main app? You could have the main app fill a queue with pending transactions and have the CC app consume the queue; that way the worst that can happen is a delay in processing a transaction rather than it vanishing into the ether.


Tony Mobily

Mar 1, 2017, 10:23:27 PM
to nodejs
Hi Bernardo,

Thanks for your answer.

I was halfway through doing this, when I read this:


How do you deal with keepalive connections? They can potentially stay up forever (if they ping often enough) even after close() has been issued...

Processing payments outside the app is probably a very good idea. The queue could be as simple as a database table... but one question: once the transaction is done, how would the "CC app" let the main app know that the transaction has happened?

Merc.



Amanda Osvaldo

Mar 2, 2017, 10:23:28 AM
to nodejs
Hi Tony.

Have you ever considered a firewall and routing approach?

For example, on the same server you can keep your current Node.js server running and start another one on, say, port 81.

The trick is to send only new user connections to the new Node.js server on port 81. Over time, users will finish their work on the old Node.js server on port 80.

You'd have to evaluate carefully how to do this, to avoid adding latency or even freezing the server; with nginx in front, the switch could look like the sketch below.
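A sketch of the idea (not a drop-in config; after editing it, nginx -s reload starts new workers with the new config while the old workers finish their in-flight connections):

upstream app {
    server 127.0.0.1:8081;          # the new node process takes all new connections
    server 127.0.0.1:8080 down;     # the old one gets nothing new and simply drains
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
    }
}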

-- Amanda Osvaldo

Bernardo Vieira

Mar 2, 2017, 10:58:29 AM
to nod...@googlegroups.com

I was halfway through doing this, when I read this:


How do you deal with keepalive connections? They can potentially stay up forever (if they ping often enough) even after close() has been issued...

That's an interesting issue and definitely a problem. In our particular case, however, it is not relevant, because we have a load balancer and a forward routing proxy between the client and the actual server process we're attempting to restart.

I guess that would be a use case for the timeout.
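One way to implement that timeout, incidentally, is to track the open sockets yourself and destroy whatever is still alive when the grace period expires. A rough sketch (reusing the `server` and the 10s figure from earlier in the thread):

const sockets = new Set();

server.on('connection', (socket) => {
  sockets.add(socket);
  socket.on('close', () => sockets.delete(socket));
});

// Wire this into the SIGUSR2 handler from earlier.
function shutDown() {
  server.close(() => process.exit(0)); // stop accepting, wait for the drain

  setTimeout(() => {
    // Keep-alive connections can idle forever, so cut them off explicitly.
    for (const socket of sockets) socket.destroy();
  }, 10000).unref();
}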

Processing payments outside the app is probably a very good idea. The queue could be as simple as a database table... but, one question: once the transaction is done, how would the "CC app" let the main app know that the transaction has happened?

The main app would have to implement some kind of polling strategy to query the transactions table for transactions completed since the last check.
Similarly, the card processing app would poll for new transactions, and once the CC provider handles a transaction you'd update the transaction record with the status.
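Something along these lines on the main app's side (a sketch; db.query and markOrderPaid are placeholders for your driver and handler, and the interval is arbitrary):

let lastCheck = new Date(0);

// Poll for transactions completed since the last check.
setInterval(async () => {
  const since = lastCheck;
  lastCheck = new Date();
  const rows = await db.query(
    'SELECT * FROM transactions WHERE status = ? AND completed_at > ?',
    ['complete', since]
  ); // db.query / markOrderPaid: placeholders, not a specific driver
  for (const tx of rows) markOrderPaid(tx);
}, 5000);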

If you want to avoid polling, you could use something like redis pub/sub to exchange transactions between the two apps.
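With the node redis client that's only a couple of calls (a sketch; the channel name and payload are made up):

const redis = require('redis');

// CC app: publish once the provider confirms a transaction.
const pub = redis.createClient();
pub.publish('tx:complete', JSON.stringify({ id: 42, status: 'complete' }));

// Main app: subscribe and react instead of polling.
const sub = redis.createClient();
sub.subscribe('tx:complete');
sub.on('message', (channel, message) => {
  const tx = JSON.parse(message);
  // look up the pending order for tx.id and mark it paid
});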

Note that in any case you'll need to give security a hard thought; having CC numbers in the DB has to be a violation of some 300 PCI rules.

Tony Mobily

Mar 2, 2017, 10:38:07 PM
to nod...@googlegroups.com
Hi,


That's an interesting issue and definitely a problem. In our particular case, however, it is not relevant, because we have a load balancer and a forward routing proxy between the client and the actual server process we're attempting to restart.

Well, actually, my bad: we also use a proxy. Specifically, we have the node server listen on port 8080, with nginx acting as a reverse proxy in front. I mainly did it so that I don't run node as root, and so that I can eventually add more servers.

But... how does that mitigate the keep-alive problem? Aren't keep-alive connections proxied, so that they are still there...?

That's one thing I don't quite understand about the use of nginx and node...

Thank you!

Merc.

Bernardo Vieira

Mar 3, 2017, 4:22:37 PM
to nod...@googlegroups.com
Tony,

A proxy, in general terms, consists of a server socket and an http client; its basic operation is to handle incoming requests to its server socket by issuing http requests to a backend using the http client. Keeping that in mind, the keep-alive parameter applies to the http request that exists between a browser and the proxy's server socket. The request to the backend is an entirely separate request and has its own parameters. It can even be, and habitually is, a different protocol (think ssl offloading or a fastcgi backend).

Thus, unless your proxy uses keep-alive on the upstream request, you don't have to worry about that.
That's my case: the proxy we use is a custom-built solution based on node-http-proxy that does not use keep-alive on the backend requests. That's also the default for nginx (http://nginx.org/en/docs/http/ngx_http_upstream_module.html#keepalive).




Tony Mobily

Mar 4, 2017, 1:01:28 AM
to nod...@googlegroups.com
Hi,

A proxy, in general terms, consists of a server socket and an http client; its basic operation is to handle incoming requests to its server socket by issuing http requests to a backend using the http client. Keeping that in mind, the keep-alive parameter applies to the http request that exists between a browser and the proxy's server socket. The request to the backend is an entirely separate request and has its own parameters. It can even be, and habitually is, a different protocol (think ssl offloading or a fastcgi backend).

That's actually precisely what I am doing too.
I am using a rather standard nginx configuration for this:

server {
    listen 66.240.255.3:443 ssl http2;
    server_name www.wonder-app.com;

    client_max_body_size 0;

    ssl_certificate /etc/nginx/ssl/www.xxxx.com.crt;
    ssl_dhparam /etc/nginx/ssl/dhparam.pem;
    ssl_certificate_key /etc/nginx/ssl/www.xxxx.com.key;

    location / {
      proxy_pass http://localhost:8080;

      # Pass the websocket upgrade handshake through to node
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection 'upgrade';
      proxy_set_header Host $host;

      # Don't serve cached responses to upgrade requests
      proxy_cache_bypass $http_upgrade;
    }
}


Thus, unless your proxy uses keep-alive on the upstream request, you don't have to worry about that.

Yep -- and it doesn't.
 
That's my case: the proxy we use is a custom-built solution based on node-http-proxy that does not use keep-alive on the backend requests.

That's interesting. How come you built your own?
Yep. And since websockets are mainly for real-time messaging, and the client has already been programmed so that if the connection drops it simply gets re-established, it won't matter too much if the websockets get killed off once the server is shut down...

I assumed that a keepalive connection from the client to the nginx server would imply a keep-alive connection from nginx to the local node server listening on an unprivileged port... but that's obviously not the case!
(Does that even matter? Would you ever want a keep-alive connection between the proxy and the local node?)

Merc.

Bernardo Vieira

Mar 6, 2017, 11:15:56 AM
to nod...@googlegroups.com
On Fri, Mar 3, 2017 at 8:57 PM, Tony Mobily <me...@mobily1.com> wrote:
 
That's my case: the proxy we use is a custom-built solution based on node-http-proxy that does not use keep-alive on the backend requests.

That's interesting. How come you built your own?
 

We needed business logic to route the incoming requests to different backend service providers. It turned out to be very simple to implement that logic on top of node-http-proxy.
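Not our actual code, but the general shape is something like this (the backend addresses and the routing rule are invented for the example):

const http = require('http');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({});

http.createServer((req, res) => {
  // The business logic: pick a backend per request.
  const target = req.url.startsWith('/payments')
    ? 'http://127.0.0.1:9090'  // payment service provider integration
    : 'http://127.0.0.1:8080'; // everything else

  proxy.web(req, res, { target }, (err) => {
    res.writeHead(502);
    res.end('bad gateway');
  });
}).listen(80);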
 
I assumed that a keepalive connection from the client to the nginx server would imply a keep-alive connection from nginx to the local node server listening on an unprivileged port... but that's obviously not the case!
(Does that even matter? Would you ever want a keep-alive connection between the proxy and the local node?)

It would make sense to use a keep-alive strategy to the backend whenever the cost of establishing the connections outweighs the price (memory footprint, usually) of maintaining idle open connections for the keep-alive period. In cases where the proxy and the upstream have very little latency between them I'd wager that there's very little, if any, performance advantage in using keep-alive.