gom,
Thank you for using WebROaR.
In my response to 'TimeOut settings?', I summarized the mechanism used to identify a worker
that hangs while processing a request.
To summarize that scenario:
- The head waits 60 seconds for the worker to return the processed request.
- After 60 seconds it sends the first PING signal and waits 15 seconds for a reply.
- After those 15 seconds it sends a second PING signal and again waits 15 seconds for a reply.
- If the worker does not respond during this interval, the head assumes the worker is in an unstable state and unable to process further requests.
The values for worker idle time (WR_WKR_IDLE_TIME), ping wait time (WR_PING_WAIT_TIME) and number of ping trials (WR_PING_TRIALS) are defined in the 'wr_config.h' file.
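As a rough illustration of the timing above, here is a minimal sketch in C. The macro names are the ones mentioned in this reply, but the values are simply copied from the description (60 s, 15 s, 2 pings) and have not been checked against 'wr_config.h'. The two helper functions are hypothetical and are not part of WebROaR.

```c
#include <assert.h>

/* Values copied from the description above; assumed, not read from wr_config.h. */
#define WR_WKR_IDLE_TIME  60  /* seconds the head waits before the first PING */
#define WR_PING_WAIT_TIME 15  /* seconds the head waits after each PING */
#define WR_PING_TRIALS    2   /* number of PINGs sent before giving up */

/* Hypothetical helper: seconds of silence after which the worker is treated as hung. */
static int hang_deadline_seconds(void) {
    return WR_WKR_IDLE_TIME + WR_PING_TRIALS * WR_PING_WAIT_TIME;
}

/* Hypothetical helper: how many PINGs have been sent after `silent` seconds without a reply. */
static int pings_sent(int silent) {
    assert(silent >= 0);
    if (silent < WR_WKR_IDLE_TIME) {
        return 0;  /* still inside the normal processing window */
    }
    int n = 1 + (silent - WR_WKR_IDLE_TIME) / WR_PING_WAIT_TIME;
    return n > WR_PING_TRIALS ? WR_PING_TRIALS : n;
}
```

With these assumed values, a worker that stays silent is given 60 + 2 * 15 = 90 seconds in total before the head treats it as hung.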
Let's summarize the scenario for creating a new worker.
- The head creates a new worker and waits for it to contact back.
- If the worker does not contact the head within 25 seconds, the head assumes that the application failed to load and that the worker will never contact it, so it kills the worker.
- The head then creates a new worker and repeats the step above.
If three consecutive workers time out without contacting the head, the head assumes that there is not enough memory or processing power to create a new worker.
In this case WebROaR waits 30 minutes before creating a new worker.
Previously, the values for the worker add timeout (WR_WKR_ADD_TIMEOUT) and the wait time before creating new workers (WR_WKR_ADD_WAIT_TIME) were defined in 'wr_config.h'. In the current code these values are stored in variables in the wr_config_server_init function (wr_config.c).
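The worker creation rule can be sketched the same way. This is only an illustration of the behaviour described in this reply, not the actual WebROaR code. The two macro names are the older ones from 'wr_config.h', with values taken from the description (25 seconds, 30 minutes). MAX_ADD_FAILURES, add_state_t and record_add_attempt are hypothetical names, and resetting the failure counter after the long wait is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the description above; assumed, not read from the source tree. */
#define WR_WKR_ADD_TIMEOUT   25         /* seconds a new worker has to contact the head */
#define WR_WKR_ADD_WAIT_TIME (30 * 60)  /* seconds to wait after repeated failures */
#define MAX_ADD_FAILURES     3          /* hypothetical name for "three consecutive workers" */

typedef struct {
    int consecutive_timeouts;  /* workers in a row that never contacted the head */
} add_state_t;

/*
 * Record the outcome of one worker creation attempt.
 * `contacted_after` is the number of seconds the worker took to contact the
 * head, or -1 if it never did. Returns how many seconds the head should wait
 * before creating the next worker (0 means it may retry immediately).
 */
static int record_add_attempt(add_state_t *st, int contacted_after) {
    assert(st != NULL);
    if (contacted_after >= 0 && contacted_after <= WR_WKR_ADD_TIMEOUT) {
        st->consecutive_timeouts = 0;  /* worker connected in time */
        return 0;
    }
    st->consecutive_timeouts++;        /* worker timed out and is killed */
    if (st->consecutive_timeouts >= MAX_ADD_FAILURES) {
        st->consecutive_timeouts = 0;  /* assumption: counter restarts after the long wait */
        return WR_WKR_ADD_WAIT_TIME;
    }
    return 0;
}
```

In this sketch, three failed attempts in a row return 1800, which corresponds to the 30 minute pause described above, while a successful connection at any point resets the count.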
In your case, due to high load, three consecutive workers timed out, so the server waited 30 minutes before creating new workers. After 30 minutes a worker was created and connected to the server successfully.
I hope this answers your questions.
Thanks,
Nikunj