Selenium hub shuts down when it receives a SIGTERM signal from supervisord

Malarvizhi Ganesan

Feb 16, 2021, 11:14:20 AM

to Selenium Users

Hi All,

We have a Selenium Grid setup running in OpenShift, and we run tests on it daily.

About 4 to 6 times a month (at most 6 out of 30 test cycles), the selenium-hub pod restarts. Sessions are lost and the affected test cases fail.

1. The pod is configured with the restart policy "on-failure", so the hub pod restarts on failure (no issue here).

2. The hub shuts down, which leads to the restart, because it receives a SIGTERM signal.

Log details from hub:

Trapped SIGTERM/SIGINT/x so shutting down supervisord

WARN received SIGTERM indicating exit request

INFO waiting for selenium-hub to die

INFO stopped: selenium-hub (terminated by SIGTERM)

Shutdown complete

Which process sends the SIGTERM signal to the hub, and why?

What could cause the selenium hub to receive a SIGTERM signal?

Diego Molina

Mar 10, 2021, 8:36:17 AM

to Selenium Users

Hi,

Something is killing the pod, which ends up killing the container. I would check the pod logs and overall cluster activity.

Malarvizhi Ganesan

Mar 11, 2021, 12:15:14 AM

to Selenium Users

Hi Diego Molina,

Thanks for replying. We fixed this issue, but it took some time to post an update here.

You are right :), something is killing the pod.

What is killing the pod?

We have readiness and liveness probes (health checks) defined for selenium-hub with a timeout of 5 seconds and a failureThreshold (retries) of 3 (the default). Sometimes the liveness probe did not get a response within the 5 seconds on all 3 retries. That tells OpenShift to handle the unhealthy selenium-hub pod by restarting it (openshift deployment/deployment-config) or moving it to the not-ready state (openshift pod).
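To make the "all 3 retries" condition concrete, here is a minimal Python sketch of how consecutive probe failures add up to a restart. It is only a model of the behaviour described above, not OpenShift code; the function name and the sample response times are made up for illustration.

```python
from typing import Iterable, Optional


def first_restart_index(
    response_times: Iterable[Optional[float]],
    timeout_seconds: float = 5.0,
    failure_threshold: int = 3,
) -> Optional[int]:
    """Return the index of the probe attempt that would trigger a restart.

    Each entry is how long the hub took to answer one probe, in seconds;
    None means it never answered. A probe fails when the answer takes
    longer than timeout_seconds. Returns None if the threshold of
    consecutive failures is never reached.
    """
    consecutive_failures = 0
    for index, elapsed in enumerate(response_times):
        failed = elapsed is None or elapsed > timeout_seconds
        # A single successful probe resets the failure streak.
        consecutive_failures = consecutive_failures + 1 if failed else 0
        if consecutive_failures >= failure_threshold:
            return index
    return None


# Hypothetical response times (seconds), not measured values.
isolated_slow = [0.4, 6.2, 0.5, 7.9, 0.3]   # slow answers never line up
stalled_hub = [0.4, 6.2, 7.9, 12.0, 0.5]    # three slow answers in a row

print(first_restart_index(isolated_slow))  # None: no restart
print(first_restart_index(stalled_hub))    # 3: restart on the fourth probe
```

Isolated slow responses are tolerated; only an unbroken run of failures reaches the threshold, which matches the occasional restarts we saw.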

What is the fix? We increased the probe timeout from 5 to 30 seconds and the failureThreshold from 3 to 5, as recommended in the "Adding a HEALTHCHECK to the Grid" section of

https://github.com/SeleniumHQ/docker-selenium/tree/selenium-3

That fixed the issue.
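For anyone who wants to time the hub's health response by hand, below is a rough Python sketch of a status check with the longer timeout. It assumes the Selenium 3 hub answers at a /wd/hub/status URL with JSON containing value.ready; that endpoint and payload shape are my assumptions and should be verified against your Grid version. The function names are hypothetical.

```python
import http.client
import json
import urllib.request


def hub_is_ready(status_payload: object) -> bool:
    """Return True only when the payload reports value.ready == true.

    The payload shape ({"value": {"ready": true}}) is an assumption.
    """
    if not isinstance(status_payload, dict):
        return False
    value = status_payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def check_hub(url: str, timeout_seconds: float = 30.0) -> bool:
    """Fetch the status URL; any error or timeout counts as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return hub_is_ready(json.load(response))
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection errors, timeouts and malformed JSON.
        return False
```

For example, check_hub("http://localhost:4444/wd/hub/status") would return False if the hub takes longer than 30 seconds to answer; the host and port here are placeholders.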

The fix looks very simple, but analyzing this issue was a great experience. Once again, thank you, Diego Molina.

A few suggestions if you run your tests in a containerized environment (to make analysis easier if you encounter failures):

1. Set the hub and chrome-node log level to "FINE" or "ALL" (default: INFO).

2. Collect and store all pod logs (hub, chrome and test pod) once execution has completed.

3. Collect and store the event logs of all pods once execution has completed.

4. Collect and store the status of all pods once execution has completed.

5. Add readiness and liveness probes for both the hub and the nodes.
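Suggestions 2 to 4 can be scripted. The following is a minimal Python sketch, assuming the oc CLI is installed and already logged in to the cluster; the namespace, pod names, file names and function names are placeholders, not what we actually used.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def build_collection_commands(namespace: str, pods: List[str]) -> Dict[str, List[str]]:
    """Map an output file name to the oc command whose output it should hold."""
    commands = {
        "pod-status.txt": ["oc", "get", "pods", "-n", namespace, "-o", "wide"],
        "events.txt": ["oc", "get", "events", "-n", namespace],
    }
    for pod in pods:
        commands[f"{pod}.log"] = ["oc", "logs", pod, "-n", namespace]
    return commands


def collect(namespace: str, pods: List[str], output_dir: str) -> None:
    """Run each command and save its stdout (or stderr on failure) to a file."""
    target = Path(output_dir)
    target.mkdir(parents=True, exist_ok=True)
    for file_name, command in build_collection_commands(namespace, pods).items():
        # Raises FileNotFoundError if the oc binary is not on PATH.
        result = subprocess.run(command, capture_output=True, text=True)
        (target / file_name).write_text(result.stdout or result.stderr)
```

Calling collect("my-namespace", ["selenium-hub", "chrome-node"], "run-logs") at the end of a test cycle would write one file per pod plus the events and pod status. After a restart, oc logs with --previous returns the log of the earlier container instance, which is the one that contains the shutdown messages.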

⇜Krishnan Mahadevan⇝

Mar 11, 2021, 1:44:08 AM

to Selenium Users

Malarvizhi,

Thank you so much for taking the time to come back and close the thread with a detailed resolution and steps that helped you! Appreciate it.

Thanks & Regards
Krishnan Mahadevan

"All the desirable things in life are either illegal, expensive, fattening or in love with someone else!"
My Scribblings @ http://wakened-cognition.blogspot.com/

My Technical Scribblings @ https://rationaleemotions.com/
