Nginx Reload - CPU Spike - Nginx/Redis


Shane Marsh

May 10, 2021, 11:53:57 AM
to ngx-pagespeed-discuss
Hiya, 

Our front-end Nginx servers are loaded to a target of 40% CPU. However, when we reload Nginx (for example, when adding or removing server blocks), we get a high CPU spike that usually lasts about 3-5 minutes. During this time Nginx does not pull from the cache/Redis and serves little optimised content. Eventually (maybe once all the workers have reloaded?), Nginx starts pulling optimised content from the cache/Redis and everything settles. Depending on traffic, we have to plan server reloads, because the CPU spike can be enough to cause very slow performance at best and intermittent 502/gateway-style errors at worst.

My question: Is there a way to minimise the CPU spike on reload, and does Apache/Redis do this too?

Shane :)

Longinos

May 17, 2021, 2:48:23 AM
to ngx-pagespeed-discuss
Hi
Reload or restart?
In a reload, sockets are migrated; in a restart, sockets are destroyed and recreated.
Is the Nginx front end a cache? Maybe it needs to recreate the cache.

Shane Marsh

May 17, 2021, 5:46:27 AM
to ngx-pagespeed-discuss
Hiya, 

On reload. We generally don't restart Nginx unless the instance is offline and not receiving any traffic. I'm really unsure, but I have a feeling that on reload the master process dumps the shared memory, which causes the CPU spike. It just takes a few minutes for it to settle. I could build an Apache version of the server and try it, but due to our setup I'd have trouble testing the server under production loads, and I also don't have a lot of time at the moment, so I wondered if anyone knew whether Apache behaves in the same way. It's no biggie :)

Shane :)

Longinos

May 17, 2021, 9:19:22 AM
to ngx-pagespeed-discuss
If you use the LRU cache, maybe populating it takes some time.
Anyway, the reload process checks the syntax of the new config files and then starts new workers, but the old ones can stay around indefinitely (think of a long keep-alive time) because these old workers still have connections. Since nginx 1.11.11 there is a directive (worker_shutdown_timeout, no default value) to close these old workers.
Maybe you can try this?
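
A minimal sketch of what that could look like in nginx.conf. The directive belongs in the main (top-level) context; the 30s value is only an assumption for illustration and should be tuned to the longest request you expect to serve:

```nginx
# Main context of nginx.conf (not inside http {} or server {}).
# Requires nginx 1.11.11 or later.
# After a reload, old workers get this long to finish gracefully;
# once the timer expires nginx tries to close their remaining connections.
# 30s is an illustrative value, not a recommendation.
worker_shutdown_timeout 30s;
```

Note that this only bounds how long old workers linger. Whether it shortens the CPU spike itself is untested here.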

Shane Marsh

May 18, 2021, 10:51:57 AM
to ngx-pagespeed-discuss
No, we don't use LRU. There's 50 MB of shared memory caching and the rest is in Redis/KVRocks.

I will look into "worker_shutdown_timeout" - if it reduces the time it takes for workers to shut down gracefully, it might lead to less of an impact during a reload. Thanks for your suggestion.

Shane :)
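
To check whether old workers really are lingering after a reload, one option is to count them from the process list. The sketch below is an assumption-laden helper, not part of nginx or PageSpeed: it relies on nginx's usual process titles ("nginx: worker process" and "nginx: worker process is shutting down") and on a `ps` that accepts `-eo args`; the function names are made up for illustration.

```python
import subprocess
import time

# Process titles nginx sets for its workers (old workers after a reload
# carry the "is shutting down" suffix).
WORKER = "nginx: worker process"
SHUTTING_DOWN = "nginx: worker process is shutting down"


def count_workers(ps_output: str) -> dict:
    """Count active and shutting-down nginx workers in `ps -eo args` output."""
    counts = {"active": 0, "shutting_down": 0}
    for line in ps_output.splitlines():
        title = line.strip()
        # Check the longer title first, since it shares a prefix with WORKER.
        if title.startswith(SHUTTING_DOWN):
            counts["shutting_down"] += 1
        elif title.startswith(WORKER):
            counts["active"] += 1
    return counts


def snapshot() -> dict:
    """Run ps once and return the current worker counts."""
    out = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    ).stdout
    return count_workers(out)


def watch(samples: int = 60, interval: float = 5.0) -> None:
    """Print a timestamped count every `interval` seconds, `samples` times."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), snapshot())
        time.sleep(interval)
```

Calling `watch()` right after `nginx -s reload` and comparing the output with and without worker_shutdown_timeout would show how long the old workers take to drain in each case.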
