Frequent disconnections after updating to Node.js v6.9.4


k yosimoto

Jul 30, 2017, 3:57:54 PM
to nodejs

My service was updated to Node.js v6.9.4, and since then Socket.io connections have been dropping frequently.

The problem could not be reproduced in the development environment, so we recovered by rolling back to the state before the update.


Is this problem only with certain versions?

Will updating to the latest Node.js solve the problem?


Thank you for your cooperation.

  • Update versions
    • nod...@6.1.0 → 6.9.4
    • sock...@1.4.6 → 1.7.2
    • exp...@4.13.4 → 4.14.0
    • body-...@1.15.1 → 1.15.2
    • jqu...@2.2.3 → 3.1.1
    • js...@9.0.0 → 9.9.1
  • State at disconnection
    •  The number of connections is about 200.  (The number of connections before updating is about 20,000.)
    • After 1 or 2 hours of reconnecting, the connection was disconnected.
    • There was no abnormality in CPU load and memory usage of Node.js server and load balancer (Pound).
    • About the log of the browser (devtools)

The polling request just before the disconnection took 85 sec to respond, and the next request failed with a "400 Bad request" error.

(The response time in the normal state is 25 sec.)

    •  About server log
      • There was no clue in the log of Node.js.
      • The following errors were written to pound.log, but whether they are related is unknown.

pound: (7f8592ce7700) e501 bad request "HQ" from xxx.xxx.xxx.xxx

pound: (7f84cdbe5700) BackEnd yyy.yyy.yyy.yyy:9443 dead (killed)

"xxx.xxx.xxx.xxx" and "yyy.yyy.yyy.yyy" stand for IP addresses.

  • Infrastructure information

  Browser ─[ HTTPS ]─ Load balancer (Pound) ─[ HTTP ]─┬─ Node.js Server 1
                                                      ├─ Node.js Server 2
                                                      ├─ Node.js Server 3
                                                      └─ Node.js Server 4

    • The connection between the browser and the Node server is maintained by polling (not WebSocket).
    • The Node application sends HTTP requests to another server.
    • Load Balancer

OS: CentOS 6.6
Middleware for load balancing: pound ver.2.6
OS setting changes:
  - Set the maximum number of usable processes to 100,000.
  - Set the file descriptor limit to 100,000.
  - Set the pthread memory map limit to 200,000.
  - Set the maximum number of threads to 100,000.

    • Node.js Server

OS: CentOS 6.6
OS setting changes:
  - Set the file descriptor limit to 100,000.

 


Zlatko

Jul 31, 2017, 2:44:11 PM
to nodejs
A few suggestions from me:


On Sunday, July 30, 2017 at 9:57:54 PM UTC+2, k yosimoto wrote:

  • Update versions
    • nod...@6.1.0 → 6.9.4
    • sock...@1.4.6 → 1.7.2
    • exp...@4.13.4 → 4.14.0
    • body-...@1.15.1 → 1.15.2
    • jqu...@2.2.3 → 3.1.1
    • js...@9.0.0 → 9.9.1

What if you just update the Node version? Try that first. Then try updating one thing at a time and see where it breaks. When you update one of these npm packages, it will likely pull in many others as well, and some of those might be flattened out by the new npm, so your _other_ modules will also get new versions of their dependencies without you realising it.
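
To make the one-at-a-time comparison concrete, here is a minimal sketch that diffs two dependency trees in the shape printed by `npm ls --json`, captured once before and once after an update. The package names and versions in the sample are hypothetical placeholders, not the ones from this thread, and `flatten` and `diffTrees` are illustrative helper names.

```javascript
'use strict';

// Flatten an `npm ls --json`-style tree into { name: Set of installed versions }.
function flatten(tree, out) {
  out = out || {};
  const deps = tree.dependencies || {};
  Object.keys(deps).forEach((name) => {
    const dep = deps[name];
    if (dep.version) {
      (out[name] = out[name] || new Set()).add(dep.version);
    }
    flatten(dep, out); // nested dependencies count too
  });
  return out;
}

// List every package whose set of installed versions differs between two trees.
function diffTrees(before, after) {
  const a = flatten(before);
  const b = flatten(after);
  const names = new Set(Object.keys(a).concat(Object.keys(b)));
  const changes = [];
  Array.from(names).sort().forEach((name) => {
    const from = Array.from(a[name] || []).sort().join(', ') || '(absent)';
    const to = Array.from(b[name] || []).sort().join(', ') || '(absent)';
    if (from !== to) changes.push({ name, from, to });
  });
  return changes;
}

// Hypothetical sample data; real input would be parsed from `npm ls --json`.
const before = { dependencies: {
  'pkg-a': { version: '1.0.0', dependencies: { 'pkg-c': { version: '0.1.0' } } },
  'pkg-b': { version: '2.0.0' },
} };
const after = { dependencies: {
  'pkg-a': { version: '1.1.0' },
  'pkg-b': { version: '2.0.0' },
  'pkg-c': { version: '0.2.0' },
} };

const changes = diffTrees(before, after);
changes.forEach((c) => console.log(`${c.name}: ${c.from} -> ${c.to}`));
```

Transitive packages that changed without being listed in `package.json` show up in the same list, which is the case the flattening remark is about.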


  • State at disconnection

The response time of the request (polling) before disconnection was 85 sec, and "400 Bad request" error was output in the next request. 

(The response time in the normal state is 25 sec)

 


You mean socket polling? Or what url specifically?
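
If it is socket polling, the two durations can be checked against the heartbeat settings. The sketch below assumes the engine.io 1.x defaults (`pingInterval` of 25000 ms and `pingTimeout` of 60000 ms); that is an assumption to verify against the actual server configuration. Under it, 25 sec matches one normal ping cycle and 85 sec matches `pingInterval + pingTimeout`, which would point at a heartbeat timeout rather than a random stall. `classifyPoll` and the 1 sec tolerance are made up for illustration.

```javascript
'use strict';

// ASSUMED heartbeat defaults in milliseconds; replace with the server's real settings.
const ASSUMED = { pingInterval: 25000, pingTimeout: 60000 };

// Label a long-poll duration by which heartbeat boundary it is close to.
function classifyPoll(durationMs, opts, toleranceMs) {
  const o = opts || ASSUMED;
  const tol = toleranceMs === undefined ? 1000 : toleranceMs;
  const near = (target) => Math.abs(durationMs - target) <= tol;
  if (near(o.pingInterval)) return 'normal heartbeat cycle';
  if (near(o.pingInterval + o.pingTimeout)) return 'possible ping timeout';
  return 'unclassified';
}

console.log(classifyPoll(25000)); // normal heartbeat cycle
console.log(classifyPoll(85000)); // possible ping timeout
```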

One more idea. The great thing about Node 6 is that you can easily run a debugger, attach Chrome DevTools to it, and monitor the state. At the very least, get a few dumps and check what's different between the old and new versions.
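
If attaching a debugger in production is not an option, a lighter complement is to log a small state snapshot at intervals on both the old and the new version and compare the two logs afterwards. This is only a sketch: `snapshotState` and `startStateLog` are hypothetical helper names, and the chosen fields are a guess at what is worth comparing.

```javascript
'use strict';

// Collect a few process-level facts that are cheap to read.
function snapshotState() {
  const mem = process.memoryUsage();
  return {
    node: process.version,
    v8: process.versions.v8,
    uptimeSec: Math.round(process.uptime()),
    rssMb: Math.round(mem.rss / 1048576),
    heapUsedMb: Math.round(mem.heapUsed / 1048576),
  };
}

// Emit one JSON line per interval; unref() keeps the timer from holding the process open.
function startStateLog(intervalMs, log) {
  const timer = setInterval(() => log(JSON.stringify(snapshotState())), intervalMs);
  timer.unref();
  return timer;
}

const snap = snapshotState();
console.log(JSON.stringify(snap));
// In a server: startStateLog(60000, console.log);
```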

 

k yosimoto

Aug 4, 2017, 2:13:40 AM
to nodejs

Thanks for your advice.



What if you just update the Node version? Try that first. Then try updating one thing at a time and see where it breaks. When you update one of these npm packages, it will likely pull in many others as well, and some of those might be flattened out by the new npm, so your _other_ modules will also get new versions of their dependencies without you realising it.

This problem cannot be reproduced in the development environment, and the customer impact is too big to test in the production environment.



You mean socket polling? Or what url specifically?


Yes, I mean Socket.io polling.

Below is the log from the Network tab of the Firefox debugger when the problem occurs.

 
GET  https://myapp.com/socket.io/?node...sport=polling..        200 OK  24.99ms
POST https://myapp.com/socket.io/?node...sport=polling..        200 OK  1ms
GET  https://myapp.com/socket.io/?node...sport=polling..        200 OK  25.01ms
POST https://myapp.com/socket.io/?node...sport=polling..        200 OK  0ms
GET  https://myapp.com/socket.io/?node...sport=polling..        200 OK  1m 25s
POST https://myapp.com/socket.io/?node...sport=polling..        400 Bad Request..
------ detail ------
NetworkError : 400 Bad Request - https://myapp.com/socket.io.......
--------------
POST https://myapp.com/socket.io/?node...sport=polling..        400 Bad Request..
------ detail ------
NetworkError : 400 Bad Request - https://myapp.com/socket.io.......
--------------
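
A log like this can be scanned automatically for the slow poll and the failed requests. The following is a minimal sketch that assumes each request sits on one line in the form "METHOD URL STATUS text duration"; `parseDuration`, `parseLine`, `findAnomalies` and the 30 sec threshold are hypothetical choices for illustration, not part of any tool.

```javascript
'use strict';

// Convert "24.99ms", "1ms", "25s" or "1m 25s" to milliseconds; null if no duration is present.
function parseDuration(text) {
  const re = /(\d+(?:\.\d+)?)\s*(ms|m|s)\b/g;
  const factor = { ms: 1, s: 1000, m: 60000 };
  let total = null;
  let m;
  while ((m = re.exec(text)) !== null) {
    total = (total || 0) + parseFloat(m[1]) * factor[m[2]];
  }
  return total;
}

// Parse one request line; detail lines and separators return null.
function parseLine(line) {
  const m = /^(GET|POST)\s+(\S+)\s+(\d{3})\s+(.*)$/.exec(line.trim());
  if (!m) return null;
  return {
    method: m[1],
    url: m[2],
    status: Number(m[3]),
    durationMs: parseDuration(m[4]),
  };
}

// Keep requests that failed (status >= 400) or took longer than slowMs.
function findAnomalies(lines, slowMs) {
  return lines
    .map(parseLine)
    .filter((r) => r !== null)
    .filter((r) => r.status >= 400 || (r.durationMs !== null && r.durationMs > slowMs));
}

// Sample lines copied from the log above (URLs are truncated as in the original).
const sample = [
  'GET  https://myapp.com/socket.io/?node...sport=polling..        200 OK  24.99ms',
  'POST https://myapp.com/socket.io/?node...sport=polling..        200 OK  1ms',
  'GET  https://myapp.com/socket.io/?node...sport=polling..        200 OK  1m 25s',
  'POST https://myapp.com/socket.io/?node...sport=polling..        400 Bad Request..',
  '------ detail ------',
];

const anomalies = findAnomalies(sample, 30000);
anomalies.forEach((r) => console.log(r.method, r.status, r.durationMs));
```

With the sample lines, only the 1m 25s GET and the 400 POST are reported.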




One more idea. The great thing about Node 6 is that you can easily run a debugger, attach Chrome DevTools to it, and monitor the state. At the very least, get a few dumps and check what's different between the old and new versions.

Thank you for your wonderful idea. 
I will try to compare the old and new version using this debugger.

  